The public API of [`wreck`](https://github.com/pmonks/wreck).
Notes:
* Apart from passing through `nil`, this library does minimal argument
checking, since the rules for regexes vary from platform to platform, and it
is a first class requirement that callers be allowed to construct platform
specific regexes if they wish.
* As a result, all functions have the potential to throw platform-specific
exceptions if the resulting regex is syntactically invalid. On the JVM,
these will typically be instances of the `java.util.regex.PatternSyntaxException`
class. On JavaScript, these will typically be a `js/SyntaxError`.
* Platform-specific behaviour is particularly notable for short / empty
regexes, such as `#"{}"` (an error on the JVM, fine but
nonsensical on JS) and `#"{1}"` (ironically fine but nonsensical on the
JVM, but an error on JS). 🤡
* Furthermore, JavaScript fundamentally doesn't support lossless round-tripping
of `RegExp` objects to `String`s and back, something this library does
extensively. The library makes a best effort to correct JavaScript's
problematic implementation, but because it's fundamentally lossy there are
some cases that (on ClojureScript only) may change your regexes in
unexpected (though _probably_ not semantically significant) ways.
* Regex flags are supported to the best ability of the library, but please
carefully review the [usage notes in README.md](https://github.com/pmonks/wreck?tab=readme-ov-file#regex-flags)
for various caveats when flags are used.
* None of these functions perform `String` escaping or quoting automatically.
You can use [[esc]] or [[qot]] for this.

(=' _)
(=' re1 re2)
(=' re1 re2 & more)

Equality for regexes, defined by having equal string representations and
flags (including flags that cannot be embedded).

Notes:
* Functionally equivalent regexes (e.g. `#"..."` and `#".{3}"`) are _not_
considered `='`.
* Some regexes may not be `='` initially due to differing flag sets, but after
being run through [[embed-flags]] may become `='`, due to non-embeddable
flags being silently dropped (see [[embed-flags]] for details).

(alt & res)

Returns a regex that will match any one of `res`, via alternation:
* `re|re|re|...`

Notes:
* Duplicate elements in `res` will only appear once in the result. This
equality comparison occurs _after_ each `re` is run through [[embed-flags]].
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
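
The precedence caveat for ungrouped alternation is easy to reproduce with plain regexes. The sketch below does not call `wreck`; it uses only `clojure.core` on the JVM, and the hand-written pattern strings are stand-ins for what an ungrouped and a grouped alternation look like once a suffix is appended.

```clojure
;; Illustration only: hand-written patterns, not output captured from wreck.
(def ungrouped (re-pattern (str "cat|dog" "s")))      ; cat|dogs
(def grouped   (re-pattern (str "(?:cat|dog)" "s")))  ; (?:cat|dog)s

;; Alternation binds loosest, so the suffix attaches to "dog" only.
(re-matches ungrouped "cats")  ; => nil
(re-matches ungrouped "cat")   ; => "cat"

;; With the group, the suffix applies to either alternative.
(re-matches grouped "cats")    ; => "cats"
(re-matches grouped "dogs")    ; => "dogs"
```

This is the behaviour the grouping recommendation is meant to avoid when a result is composed into a larger regex.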
(and' a b)
(and' a b s)

Returns an 'and' regex that will match `a` and `b` in any order, and with the
separator regex `s` (if provided) between them:
* `asb|bsa`

Notes:
* `a` and `b` must be distinct (must not match the same text) or else the
resulting regex will be logically inconsistent (will not be an 'and').
* May optimise the expression (via de-duplication in [[alt]]).
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
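
To make the `asb|bsa` shape concrete, here is a minimal sketch in plain Clojure. `and-sketch` is a hypothetical helper written for this illustration; it is not `wreck`'s implementation, it takes pattern strings rather than regexes, and it ignores flags and de-duplication entirely.

```clojure
;; Hypothetical helper for illustration; this is not wreck's implementation.
;; It takes pattern strings (not regexes) and does no flag handling.
(defn and-sketch
  ([a b]   (and-sketch a b ""))
  ([a b s] (re-pattern (str a s b "|" b s a))))

(def foo-and-bar (and-sketch "foo" "bar" "\\s+"))  ; foo\s+bar|bar\s+foo

(re-matches foo-and-bar "foo bar")   ; => "foo bar"
(re-matches foo-and-bar "bar  foo")  ; => "bar  foo"
(re-matches foo-and-bar "foo")       ; => nil, both operands are required
```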
(and-cg a b)
(and-cg a b s)

[[and']] then [[cg]]:
* `(asb|bsa)`

Notes:
* Unlike most other `-cg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(and-grp a b)
(and-grp a b s)

[[and']] then [[grp]]:
* `(?:asb|bsa)`

Notes:
* Unlike most other `-grp` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(and-ncg nm a b)
(and-ncg nm a b s)

[[and']] then [[ncg]]:
* `(?<nm>asb|bsa)`

Notes:
* Unlike most other `-ncg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(cg & res)

As for [[grp]], but uses a capturing group:
* `(res)`
(chcl & res)

As for [[join]], but encloses the joined `res` into a character class:
* `[res]`

Notes:
* ⚠️ On ClojureScript nested character classes don't work as one might expect,
even though they will compile just fine. For example, this code matches as
expected on ClojureJVM, but does not on ClojureScript:
`(re-matches #"[a[b[c]]]+" "abc")`. As a result it's worth being
particularly careful when composing character classes programmatically, to
avoid accidentally nesting them.
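
The nesting caveat can be checked at a JVM REPL with plain regex literals. This sketch does not call `wreck`; the second form is simply the hand-flattened equivalent of the nested class, which does not rely on nesting support.

```clojure
;; JVM only: java.util.regex treats a nested character class as a union.
(re-matches #"[a[b[c]]]+" "abc")  ; => "abc" on the JVM

;; A flat class expresses the same set without relying on nesting.
(re-matches #"[abc]+" "abc")      ; => "abc"
```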
(embed-flags re)

Embeds any programmatic or ungrouped flags found in `re`. It does this by
removing all flags from `re` then wrapping it in a flag group containing those
flags that are embeddable (non-embeddable flags are silently dropped - use
[[has-non-embeddable-flags?]] if you need to check for this). Returns `re` if
`re` contains no flags.

For example on the JVM, both `(Pattern/compile "[abc]+" Pattern/CASE_INSENSITIVE)`
and `#"(?i)[abc]+"` would become `#"(?i:[abc]+)"`.

Similarly, on ClojureScript `(doto (js/RegExp.) (.compile "[abc]+" "i"))`
would become `#"(?i:[abc]+)"`.

Notes:
* **[[flags-grp]] is almost always a better choice than this function!**
`embed-flags` is primarily intended for internal use by `wreck`, but may be
useful in those rare cases where Clojure(Script) code receives a third-party
regex, wishes to use it as part of composing a larger regex, doesn't
know if it contains flags or not, and doesn't care that non-embeddable flags
will be silently dropped.
* ⚠️ On the JVM, ungrouped embedded flags in the middle of `re` will be moved
to the beginning of the regex. This may alter the semantics of the regex -
for example `a(?i)b` will become `(?i:ab)`, which means that `a` will be
matched case-insensitively by the result, which is _not_ the same as the
original (which matches lower-case `a` only). This is an unavoidable
consequence of how the JVM regex engine reports flags. If you really need
to use embedded flag(s) midway through a regex, use [[flags-grp]] to ensure
proper scoping of the flag(s).
* ⚠️ On the JVM, the programmatic flags `LITERAL` and `CANON_EQ` have no
embeddable equivalent, and will be silently dropped by this function.
* ⚠️ On JavaScript, only the flags `i`, `m`, and `s` can be embedded. All
other flags will be silently dropped by this function.
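
The two JVM behaviours described for flag embedding can be seen with `java.util.regex` alone. The sketch below does not call `wreck`; it only shows that programmatic flags live outside the pattern string, and that a mid-pattern `(?i)` and a leading flag group match differently.

```clojure
(import 'java.util.regex.Pattern)

;; Programmatic flags are carried separately from the pattern string.
(def ^Pattern programmatic (Pattern/compile "[abc]+" Pattern/CASE_INSENSITIVE))
(.pattern programmatic)                                    ; => "[abc]+"
(bit-and (.flags programmatic) Pattern/CASE_INSENSITIVE)   ; non-zero

;; A flag in the middle of a regex only affects what follows it.
(re-matches #"a(?i)b" "aB")   ; => "aB"
(re-matches #"a(?i)b" "AB")   ; => nil

;; Once the flag wraps the whole expression, "a" is case-insensitive too.
(re-matches #"(?i:ab)" "AB")  ; => "AB"
```

The last two results are the semantic change that the warning about mid-regex flags refers to.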
(empty?' re)

Is `re` `nil` or `(=' #"")`?

Notes:
* Takes flags (if any) into account.
(esc s)

Escapes `s` (a `String`) for use in a regex, returning a `String`.

Notes:
* Unlike most other fns in this namespace, this one does _not_ support a
regex as an input, nor return a regex as an output.
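
To show why literal text needs escaping or quoting before it is composed into a regex, the sketch below uses the JDK's own `Pattern/quote`. Whether `esc` or `qot` are built on it, and what exact strings they return, is not stated here, so treat this purely as an illustration of the underlying problem on the JVM.

```clojure
(import 'java.util.regex.Pattern)

(def raw "a.b")

;; Unescaped, "." is a wildcard and matches any character.
(re-matches (re-pattern raw) "axb")     ; => "axb"

;; Quoted with the JDK's Pattern/quote, "." is matched literally.
(def quoted (Pattern/quote raw))        ; => "\\Qa.b\\E"
(re-matches (re-pattern quoted) "axb")  ; => nil
(re-matches (re-pattern quoted) "a.b")  ; => "a.b"
```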
(exn n re)

Returns a regex where `re` will match exactly `n` times:
* `re{n}`

(flags-grp flgs & res)

As for [[grp]], but prefixes the group with `flgs` (a `String`):
* `(?flgs:res)`

Returns `nil` if `flgs` is `nil` or empty. Throws if `flgs` contains an
invalid flag character, including those that (ClojureScript only) cannot be
embedded.

Notes:
* If you must use regex flags, **it is STRONGLY RECOMMENDED that you use this
function!** Programmatically set flags and ungrouped embedded flags (e.g.
`(?i)`) have no explicit scope and so cannot be reliably used to compose
larger regexes. `wreck` makes a best effort to always convert such
'unscoped' flags into their embedded (scoped) equivalents (using
[[embed-flags]]) when composing larger regexes, but using `flags-grp`
explicitly in the first place is easier to reason about and avoids potential
footguns.
* Removes any ungrouped embedded flags in `re` (e.g. `(?i)ab`), but does _not_
add them to `flgs` if they aren't already there.
* ⚠️ On the JVM, ungrouped embedded flags _in the middle of `re`_ (e.g.
`a(?i)b`) will also be removed, which may alter the semantics of the regex.
* ⚠️ On JavaScript, only the flags `i`, `m` and `s` can be embedded (this is a
limitation of the JavaScript regex engine). Other flags will result in a
`js/SyntaxError` being thrown.
* For the JVM, see the ['special constructs' section of the `java.util.regex.Pattern` JavaDoc](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/regex/Pattern.html#special)
for the set of valid flag characters.
* For JavaScript, see the [`RegExp` flags reference](https://www.w3schools.com/js/js_regexp_flags.asp)
for the set of valid flag characters (while keeping in mind most of them
can't be embedded).
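
The difference between a scoped flag group and an unscoped flag is visible as soon as more regex is appended. The patterns below are hand-written stand-ins in plain Clojure on the JVM, not output captured from `wreck`.

```clojure
;; Illustration only: hand-written patterns, not output captured from wreck.
(def scoped   (re-pattern (str "(?i:ab)" "c")))  ; (?i:ab)c
(def unscoped (re-pattern (str "(?i)ab" "c")))   ; (?i)abc

;; The scoped flag stops at the closing paren, so "c" stays case-sensitive.
(re-matches scoped "ABc")    ; => "ABc"
(re-matches scoped "ABC")    ; => nil

;; The unscoped flag leaks into everything appended after it.
(re-matches unscoped "ABC")  ; => "ABC"
```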
(grp & res)

As for [[join]], but encloses the joined `res` into a single non-capturing
group:
* `(?:res)`
(has-non-embeddable-flags? re)

Does `re` have non-embeddable flags?

Notes:
* On the JVM, the only non-embeddable flags are the programmatic flags
`LITERAL` and `CANON_EQ`.
* On JavaScript, this is every flag _except_ `i`, `m`, and `s`.
(join & res)

Returns a regex that is all of the `res` joined together. Each element in
`res` can be a regex, a `String` or something that can be turned into a
`String` (including numbers, etc.). Ignores `nil` values in `res`, and
returns `nil` when no `res` are provided or they're all `nil`.

Notes:
* ⚠️ In ClojureScript be cautious about using numbers in these calls, since
JavaScript's number handling is a 🤡show. See the unit tests for examples.
(n2m n m re)

Returns a regex where `re` will match from `n` to `m` times:
* `re{n,m}`

(ncg nm & res)

As for [[grp]], but uses a named capturing group named `nm`:
* `(?<nm>res)`

Returns `nil` if `nm` is `nil` or blank. Throws if `nm` is an invalid name for
a named capturing group (alphanumeric only, must start with an alphabetical
character, must be unique within the regex).
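
For readers new to named capturing groups, the sketch below shows how they are read back on the JVM and how the naming rules surface as exceptions. It uses hand-written patterns and `java.util.regex` only; `year-month` and `valid-pattern?` are hypothetical helpers for this illustration and are not part of `wreck`.

```clojure
(import '(java.util.regex Matcher PatternSyntaxException))

(def ym #"(?<year>\d{4})-(?<month>\d{2})")

;; Hypothetical helper: read named groups back out of a full match.
(defn year-month [s]
  (let [^Matcher m (re-matcher ym s)]
    (when (.matches m)
      {:year (.group m "year") :month (.group m "month")})))

(year-month "2024-05")  ; => {:year "2024", :month "05"}

;; Hypothetical helper: does the JVM accept this pattern string?
(defn valid-pattern? [s]
  (try (re-pattern s) true
       (catch PatternSyntaxException _ false)))

(valid-pattern? "(?<1st>x)")       ; => false, name starts with a digit
(valid-pattern? "(?<n>x)(?<n>y)")  ; => false, name is not unique
```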
(nom n re)

Returns a regex where `re` will match `n` or more times:
* `re{n,}`

(oom re)

Returns a regex where `re` will match one or more times:
* `re+`
(opt re)

Returns a regex where `re` is optional:
* `re?`
(or' a b)
(or' a b s)

Returns an 'inclusive or' regex that will match `a` or `b`, or both, in any
order, and with the separator regex `s` (if provided) between them:
* `asb|bsa|a|b`

Notes:
* `a` and `b` must be distinct (must not match the same text) or else the
resulting regex will be logically inconsistent (will not be an 'or').
* May optimise the expression (via de-duplication in [[alt]]).
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
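
The order of alternatives in `asb|bsa|a|b` matters when searching rather than matching a whole string, because `java.util.regex` tries alternatives left to right and takes the first that succeeds. The sketch below uses hand-written patterns (with `-` standing in for the separator) to show the effect; that this is why the documented order was chosen is an inference, not something the docstring states.

```clojure
;; Longer alternatives first: the combined form wins when it is present.
(re-find #"a-b|b-a|a|b" "a-b")  ; => "a-b"

;; Shorter alternatives first: the search stops at the lone "a".
(re-find #"a|b|a-b|b-a" "a-b")  ; => "a"
```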
(or-cg a b)
(or-cg a b s)

[[or']] then [[cg]]:
* `(asb|bsa|a|b)`

Notes:
* Unlike most other `-cg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(or-grp a b)
(or-grp a b s)

[[or']] then [[grp]]:
* `(?:asb|bsa|a|b)`

Notes:
* Unlike most other `-grp` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(or-ncg nm a b)
(or-ncg nm a b s)

[[or']] then [[ncg]]:
* `(?<nm>asb|bsa|a|b)`

Notes:
* Unlike most other `-ncg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(regex? o)

Is `o` a regex?

Notes:
* ClojureScript already has a `regexp?` predicate in `cljs.core`, but
ClojureJVM doesn't. See [this ask.clojure.org post](https://ask.clojure.org/index.php/1127/add-clojure-core-pattern-predicate).
(str' o)

Returns the `String` representation of `o`, with special handling for
`RegExp` objects on ClojureScript in an attempt to correct JavaScript's
**APPALLING** default stringification.

Notes:
* Embeds flags (as per [[embed-flags]]).
(xor' a b)

Returns an 'exclusive or' regex that will match `a` or `b`, but _not_ both:
* `a|b`

This is identical to [[alt]] called with 2 arguments, but is provided as a
convenience for those who might be building up large logic based regexes and
would prefer to use more easily understood logical operator names throughout.

Notes:
* May optimise the expression (via de-duplication in [[alt]]).
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
(xor-cg a b)

[[xor']] then [[cg]]:
* `(a|b)`

Notes:
* Unlike most other `-cg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(xor-grp a b)

[[xor']] then [[grp]]:
* `(?:a|b)`

Notes:
* Unlike most other `-grp` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(xor-ncg nm a b)

[[xor']] then [[ncg]]:
* `(?<nm>a|b)`

Notes:
* Unlike most other `-ncg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(zom re)

Returns a regex where `re` will match zero or more times:
* `re*`