The public API of [`wreck`](https://github.com/pmonks/wreck).

Notes:

* Apart from passing through `nil`, this library does minimal argument checking, since the rules for regexes vary from platform to platform, and it is a first-class requirement that callers be allowed to construct platform-specific regexes if they wish.
* As a result, all functions have the potential to throw platform-specific exceptions if the resulting regex is syntactically invalid.
  * On the JVM, these will typically be instances of the `java.util.regex.PatternSyntaxException` class.
  * On JavaScript, these will typically be a `js/SyntaxError`.
* Platform-specific behaviour is particularly notable for short / empty regexes, such as `#"{}"` (an error on the JVM, fine but nonsensical on JS) and `#"{1}"` (ironically, fine but nonsensical on the JVM, but an error on JS). 🤡
* Furthermore, JavaScript fundamentally doesn't support lossless round-tripping of `RegExp` objects to `String`s and back, something this library relies upon and does extensively. The library makes a best effort to correct JavaScript's problematic implementation, but because it's fundamentally lossy there are some cases that (on ClojureScript only) may change your regexes in unexpected (though _probably_ not semantically significant) ways.
* Regex flags (which aren't natively supported by Clojure's regex literals, so may be uncommon) are supported to the best ability of the library, but please carefully review the [usage notes in README.md](https://github.com/pmonks/wreck?tab=readme-ov-file#usage) for various caveats, especially on ClojureScript.
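As a rough illustration of the platform-specific exceptions described in the notes above, the following JVM-only sketch uses plain `re-pattern` rather than any `wreck` function. The helper name `try-compile` is hypothetical and not part of this library.

```clojure
(import 'java.util.regex.PatternSyntaxException)

(defn try-compile
  "Hypothetical helper (not part of wreck): returns :valid if s compiles
  as a JVM regex, or :invalid if the JVM regex engine rejects it."
  [s]
  (try
    (re-pattern s)
    :valid
    (catch PatternSyntaxException _
      :invalid)))

(try-compile "{}")   ; rejected on the JVM, per the notes above
(try-compile "{1}")  ; accepted on the JVM, per the notes above
```

Callers that compose regexes from untrusted fragments may want a similar guard around the composing call, since the exception type differs on JavaScript.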
(=' _)
(=' re1 re2)
(=' re1 re2 & more)
Equality for regexes, defined by having equal `String` representations (as per [[str']]) and flags (as per [[flags]]). This means that _equivalent_ regexes (e.g. `#"..."` and `#".{3}"`) will _not_ be considered equal.
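To show why a dedicated equality function is useful, here is a minimal JVM-only sketch of textual regex equality. It is an assumption about the general idea, not this library's implementation: `regex-str=` is a hypothetical name, and it compares the raw pattern text and raw flags directly instead of going through [[str']] and [[flags]].

```clojure
(defn regex-str=
  "Hypothetical sketch (not part of wreck): compares two JVM regexes by
  their pattern text and their raw flags."
  [^java.util.regex.Pattern re1 ^java.util.regex.Pattern re2]
  (and (= (.pattern re1) (.pattern re2))
       (= (.flags re1) (.flags re2))))

(= #"..." #"...")            ; false: JVM Pattern objects compare by identity
(regex-str= #"..." #"...")   ; true: same text, same flags
(regex-str= #"..." #".{3}")  ; false: equivalent, but textually different
```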
(alt & res)
Returns a regex that will match any one of `res`, via alternation.

Notes:

* Duplicate elements in `res` will only appear once in the result.
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
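The precedence risk mentioned in the notes above can be seen with plain Clojure regexes. This sketch does not call any `wreck` function; it only concatenates pattern strings by hand to mimic combining an ungrouped alternation with a further regex.

```clojure
;; Naive concatenation: the trailing "c" binds only to the second alternative.
(def ungrouped (re-pattern (str "a|b" "c")))      ; a|bc

;; Grouping the alternation first keeps it intact when more is appended.
(def grouped (re-pattern (str "(?:a|b)" "c")))    ; (?:a|b)c

(re-matches ungrouped "ac")  ; nil
(re-matches ungrouped "a")   ; "a"
(re-matches grouped "ac")    ; "ac"
```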
(and' a b)
(and' a b s)
Returns an 'and' regex that will match `a` and `b` in any order, and with the `s`eparator regex (if provided) between them. This is implemented as `ASB|BSA`, which means that A and B must be distinct (must not match the same text).

Notes:

* May optimise the expression (via de-duplication in [[alt]]).
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
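The `ASB|BSA` shape described above can be sketched with plain string concatenation. `and-sketch` is a hypothetical helper that takes pattern strings only; it is an illustration of the shape, not this library's implementation, and it skips the de-duplication mentioned in the notes.

```clojure
(defn and-sketch
  "Hypothetical sketch (not part of wreck) of the ASB|BSA shape, built
  from pattern strings a, b and separator s."
  [a b s]
  (re-pattern (str a s b "|" b s a)))

(def foo-and-bar (and-sketch "foo" "bar" ","))  ; foo,bar|bar,foo

(re-matches foo-and-bar "foo,bar")  ; "foo,bar"
(re-matches foo-and-bar "bar,foo")  ; "bar,foo"
(re-matches foo-and-bar "foo,foo")  ; nil
```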
(and-cg a b)
(and-cg a b s)
[[and']] then [[cg]].

Notes:

* Unlike most other `-cg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(and-grp a b)
(and-grp a b s)
[[and']] then [[grp]].

Notes:

* Unlike most other `-grp` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(and-ncg nm a b)
(and-ncg nm a b s)
[[and']] then [[ncg]].

Notes:

* Unlike most other `-ncg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(cg & res)
As for [[grp]], but uses a capturing group.
(embed-flags re)
Embeds any flags found in `re` at the start of `re` in a non-capturing group (to ensure scoping), returning a new regex. Returns `re` if `re` contains no flags or is `nil`.

For example `#"(?i)[abc]+"` would become `#"(?i:[abc]+)"`.

Note:

* This function is only available on the JVM. JavaScript's regex engine does not support embedded flags.
* This function is primarily intended for internal use by `wreck`, but is useful in those rare cases where Clojure code receives a 3rd party regex, wishes to use it as part of composing a larger regex, and doesn't know if it contains flags or not. In all other cases, [[flags-grp]] is a better choice.
* Embedded flags in the middle of `re` will be moved to the beginning of the regex. This may alter the semantics of the regex - for example `a(?i)b` will become `(?i:ab)`, which means that `a` will be matched case-insensitively by the result, which is _not_ the same as the original (which matches lower-case `a` only). This is an unavoidable consequence of how the JVM regex engine reports embedded flags. If you really need to use an embedded flag midway through a regex, use [[flags-grp]].
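The semantic change described in the last note can be checked directly with JVM regex literals. This sketch does not call `embed-flags`; it only compares the two patterns quoted in the note.

```clojure
;; Original: the flag is switched on midway, so only b is case-insensitive.
(re-matches #"a(?i)b" "aB")   ; "aB"
(re-matches #"a(?i)b" "AB")   ; nil

;; After the flag is moved to the front, a is case-insensitive too.
(re-matches #"(?i:ab)" "AB")  ; "AB"
```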
(empty?' re)
Is `re` `nil` or `(=' #"")`?

Notes:

* Takes flags (if any) into account.
(esc s)
Escapes `s` (a `String`) for use in a regex, returning a `String`.

Notes:

* Unlike most other fns in this namespace, this one does _not_ support a regex as an input, nor return a regex as an output.
(exn n re)
Returns a regex where `re` will match exactly `n` times.
(flags re)
Returns the flags for `re` as a set of characters, or `nil` if `re` doesn't have any or is not a regex.

Notes:

* On the JVM, flags that don't have an embedded equivalent (as of JVM 25, `LITERAL` and `CANON_EQ`) will cause an `ex-info` to be thrown. If you specifically need to handle these flags, [[raw-flags]] may be useful but it should only be used as a last resort as its behaviour is platform specific.
* The JVM considers _some but not all_ embedded flags as flags. See the unit tests for details.
(flags-grp flgs & res)
As for [[grp]], but prefixes the group with `flgs` (a set of regex flag characters, such as those returned by [[flags]]). See the ['special constructs' section of the `java.util.regex.Pattern` JavaDoc](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/regex/Pattern.html#special) for the set of valid flag characters.

Notes:

* This function is only available on the JVM. JavaScript's regex engine does not support embedded flags.
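To illustrate why scoping flags to a group matters when composing, the following JVM-only sketch builds two patterns by hand from strings. It does not call `flags-grp`, and the exact text that function produces is not assumed here; the sketch only contrasts a flag scoped to a non-capturing group with an unscoped leading flag.

```clojure
;; Flag scoped to a group: only "abc" is case-insensitive.
(def scoped (re-pattern (str "(?i:abc)" "def")))

;; Unscoped leading flag: it leaks into everything appended afterwards.
(def unscoped (re-pattern (str "(?i)abc" "def")))

(re-matches scoped "ABCdef")    ; "ABCdef"
(re-matches scoped "ABCDEF")    ; nil
(re-matches unscoped "ABCDEF")  ; "ABCDEF"
```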
(grp & res)
As for [[join]], but encloses the joined `res` into a single non-capturing group.
(has-flags? re)
Does `re` have any flags?

Notes:

* Returns `false` if `re` is not a regex.
* The JVM considers _some but not all_ embedded flags as flags. See the unit tests for details.
(join & res)
Returns a regex that is all of the `res` joined together. Each element in `res` can be a regex, a `String` or something that can be turned into a `String` (including numbers, etc.). Returns `nil` when no `res` are provided, or they're all `nil`.

Notes:

* In ClojureScript be cautious about using numbers in these calls, since JavaScript's number handling is a 🤡show. See the unit tests for examples.
(n2m n m re)
Returns a regex where `re` will match from `n` to `m` times.
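As a rough idea of what a bounded quantifier over a whole sub-expression looks like, here is a sketch in plain Clojure. `n2m-sketch` is a hypothetical helper that takes a pattern string; wrapping the input in a non-capturing group is an assumption made for this illustration, and the text that `n2m` actually emits may differ.

```clojure
(defn n2m-sketch
  "Hypothetical sketch (not part of wreck): wraps the pattern string s in
  a non-capturing group and bounds it with {n,m}."
  [n m s]
  (re-pattern (str "(?:" s "){" n "," m "}")))

(def two-to-three-abs (n2m-sketch 2 3 "ab"))  ; (?:ab){2,3}

(re-matches two-to-three-abs "abab")      ; "abab"
(re-matches two-to-three-abs "ab")        ; nil
(re-matches two-to-three-abs "abababab")  ; nil
```

Without the group, the quantifier would apply only to the final `b`, which is the usual reason for grouping before quantifying.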
(ncg nm & res)
As for [[grp]], but uses a named capturing group named `nm`. Returns `nil` if `nm` is `nil` or blank. Throws if `nm` is an invalid name for a named capturing group (alphanumeric only, must start with an alphabetical character, must be unique within the regex).
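For readers unfamiliar with named capturing groups, this JVM-only sketch shows how one is read back using `java.util.regex.Matcher`. The regex is written as a literal instead of being built with `ncg`, and `year-of` is a hypothetical helper.

```clojure
;; A named capturing group called "year", followed by an unnamed remainder.
(def year-re #"(?<year>\d{4})-\d{2}")

(defn year-of
  "Hypothetical helper (not part of wreck): returns the text captured by
  the group named year, or nil if s does not match."
  [s]
  (let [^java.util.regex.Matcher m (re-matcher year-re s)]
    (when (.matches m)
      (.group m "year"))))

(year-of "2024-05")  ; "2024"
(year-of "nope")     ; nil
```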
(nom n re)
Returns a regex where `re` will match `n` or more times.
(oom re)
Returns a regex where `re` will match one or more times.
(opt re)
Returns a regex where `re` is optional.
(or' a b)
(or' a b s)
Returns an 'inclusive or' regex that will match `a` or `b`, or both, in any order, and with the `s`eparator regex (if provided) between them. This is implemented as `ASB|BSA|A|B`, which means that A and B must be distinct (must not match the same text).

Notes:

* May optimise the expression (via de-duplication in [[alt]]).
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
(or-cg a b)
(or-cg a b s)
[[or']] then [[cg]].

Notes:

* Unlike most other `-cg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(or-grp a b)
(or-grp a b s)
[[or']] then [[grp]].

Notes:

* Unlike most other `-grp` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(or-ncg nm a b)
(or-ncg nm a b s)
[[or']] then [[ncg]].

Notes:

* Unlike most other `-ncg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(qot re)
Quotes `re` (anything that can be accepted by [[join]]), returning a regex.
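The effect of quoting can be illustrated on the JVM with `java.util.regex.Pattern/quote`. This is only a sketch of the concept; whether `qot` uses that method internally, and what text it emits, is not assumed here.

```clojure
(import 'java.util.regex.Pattern)

;; Unquoted: the dot is a metacharacter that matches any character.
(def unquoted (re-pattern "1.5"))

;; Quoted: every character is matched literally.
(def quoted (re-pattern (Pattern/quote "1.5")))

(re-matches unquoted "1x5")  ; "1x5"
(re-matches quoted "1x5")    ; nil
(re-matches quoted "1.5")    ; "1.5"
```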
(raw-flags re)
Returns the raw, platform specific flags in `re`. On the JVM this is an `int`, on JavaScript this is a `String`. If `re` has no flags, or `re` is not a regex, returns `nil`.

⚠️ Because this function has platform specific behaviour, it is _strongly_ recommended that callers use [[flags]] instead (that function is _not_ platform specific, at least in its contract). The one reasonable exception to this guideline is on the JVM, in the narrow case where a caller needs to check for a non-embeddable flag (as of JVM 25, `LITERAL` and `CANON_EQ`) - in that case [[flags]] throws, which may be a hindrance.

Notes:

* The JVM considers _some but not all_ embedded flags as flags. See the unit tests for details.
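For the narrow JVM case described above, checking for a non-embeddable flag comes down to a bitmask test on the raw `int`. The sketch below reads the raw flags directly via `java.util.regex.Pattern`'s `flags` method rather than calling `raw-flags`, and `literal-flag?` is a hypothetical helper.

```clojure
(import 'java.util.regex.Pattern)

(defn literal-flag?
  "Hypothetical helper (not part of wreck): true if the raw JVM flags int
  has the LITERAL bit set."
  [raw]
  (not (zero? (bit-and raw Pattern/LITERAL))))

;; A regex compiled programmatically with the LITERAL flag.
(def literal-re (Pattern/compile "a.b" Pattern/LITERAL))

(literal-flag? (.flags ^Pattern literal-re))  ; true
(literal-flag? (.flags #"a.b"))               ; false
```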
(set-flags re flgs)
Sets the flags on `re` to `flgs` (a set of flag characters, such as those returned by [[flags]] - may also be `nil` to strip all flags), returning a new regex. All existing flags in `re` are replaced. Returns `nil` if `re` is `nil`.

⚠️ Because this function has platform specific behaviour, its use is discouraged.

On the JVM, it's _strongly_ recommended to use [[flags-grp]] instead of this function, since that gives explicit control over how multiple regexes with different flag sets compose together.

On JavaScript there's no choice - JavaScript's regex engine doesn't support embedded flags so flags always apply globally. It is therefore recommended to keep flags out of regex fragments used for composition entirely, and only set flags (if needed) globally on the final, fully composed regex, using this function.

Note:

* Throws if `flgs` contains invalid flag characters.
* On the JVM, all programmatic AND embedded flags in the regex will be removed, except embedded flags that appear in a non-capturing group (those will be retained, since the JVM doesn't consider them to be 'flags').
* On the JVM, the flags will be set via a non-capturing group at the start of the regex that encloses the entire thing. This ensures that regexes with flags can be safely combined with other regexes with different flags, with correct scoping of each regex's flags. It also means that flags do _not_ round-trip between [[flags]] and [[set-flags]] (unlike on JavaScript).
* On JavaScript, the flags will be set programmatically (i.e. globally for the entire regex), since JavaScript's regex engine doesn't support embedded flags of any kind (and therefore flags can't be scoped to subsets of a regex). This is obviously a problem if you're trying to compose regexes that have mutually exclusive flags.
(str' o)
Returns the `String` representation of `o`, with special handling for `RegExp` objects on ClojureScript in an attempt to correct JavaScript's **APPALLING** default stringification.

Notes:

* On the JVM will embed all flags (as per [[embed-flags]]).
* On JavaScript this will silently drop flags. You may use [[flags]] and [[set-flags]] in combination to preserve flags if needed, but note that JavaScript only supports global flags - unlike the JVM there is no way to scope flags to subsets of a regex.
(xor' a b)
Returns an 'exclusive or' regex that will match `a` or `b`, but _not_ both. This is identical to [[alt]] called with 2 arguments, and is provided as a convenience for those who might be building up large logic based regexes and would prefer to use more easily understood logical operator names throughout.

Notes:

* May optimise the expression (via de-duplication in [[alt]]).
* Does _not_ wrap the result in a group, which, [because alternation has the lowest precedence in regexes](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_08), runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should _almost always_ be preferred.
(xor-cg a b)
[[xor']] then [[cg]].

Notes:

* Unlike most other `-cg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(xor-grp a b)
[[xor']] then [[grp]].

Notes:

* Unlike most other `-grp` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(xor-ncg nm a b)
[[xor']] then [[ncg]].

Notes:

* Unlike most other `-ncg` fns, this one does _not_ accept any number of `res`.
* May optimise the expression (via de-duplication in [[alt]]).
(zom re)
Returns a regex where `re` will match zero or more times.