The public API of [`clj-wcwidth`](https://github.com/pmonks/clj-wcwidth).
`(code-point->string code-point)`

Returns the `String` representation of any Unicode `code-point`<sup>†</sup>, or `nil` when `code-point` is `nil`.

This is useful because Clojure/Java `String` literals only support escape sequences (i.e. `"\uXXXX"`) for code points in the basic plane; code points in the supplementary planes must be manually converted into their [UTF-16 surrogate pair](https://en.wikipedia.org/wiki/UTF-16#Code_points_from_U+010000_to_U+10FFFF), and then each UTF-16 code unit in the pair must be escaped separately (which is tedious and error-prone).

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
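To make the surrogate-pair point concrete, here is a minimal sketch comparing a hand-escaped literal with the converted code point. It assumes clj-wcwidth is on the classpath and that its API namespace is `wcwidth.api` (the `wcw` alias is only a local choice).

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; U+1F600 (GRINNING FACE) lies in a supplementary plane, so a String literal
;; has to spell out both UTF-16 code units of its surrogate pair by hand:
(def by-hand "\uD83D\uDE00")

;; The same String, built directly from the code point:
(def converted (wcw/code-point->string 0x1F600))

(= by-hand converted)           ;; => true
(wcw/code-point->string 0x41)   ;; => "A"
(wcw/code-point->string nil)    ;; => nil
```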
`(code-points->string code-points)`

Returns a `String` made up of all of the given Unicode `code-points`<sup>†</sup>, or `nil` when `code-points` is `nil`.

<sup>†</sup>a sequence of `char`s or `int`s, but `int`s are usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
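A short usage sketch, under the assumption that the API lives in the `wcwidth.api` namespace (aliased here as `wcw`, a name chosen only for this example):

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; Basic-plane and supplementary-plane code points can be mixed freely.
(def greeting (wcw/code-points->string [0x48 0x69 0x20 0x1F600]))

(= greeting "Hi \uD83D\uDE00")   ;; => true
(wcw/code-points->string nil)    ;; => nil
```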
`(combining? code-point)`

Is `code-point`<sup>†</sup> a [combining character](https://en.wikipedia.org/wiki/Combining_character)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
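As an illustration, the sketch below checks one combining mark and one ordinary letter. The `wcwidth.api` namespace and the `wcw` alias are assumptions; the comments describe truthiness rather than exact return values.

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

(wcw/combining? 0x0301)   ; U+0301 COMBINING ACUTE ACCENT: truthy
(wcw/combining? 0x65)     ; U+0065 LATIN SMALL LETTER E: falsy
```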
`(display-width cs)`

`(display-width cs & {:keys [ignore-ansi?] :or {ignore-ansi? false}})`

Returns the number of columns needed to display `cs` (a `CharSequence`), but deviates from POSIX `wcswidth` behaviour in these ways:

* non-printing characters are considered zero width (instead of causing the entire result to be `-1`)
* ANSI escape sequences are (by default, but configurable) also considered zero width

For most use cases, this function is more useful than `wcswidth`, despite not adhering to POSIX.

Returns `0` when `cs` is `nil`.
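The two deviations can be seen side by side in a small sketch that uses the default options only. It assumes the API namespace is `wcwidth.api` (aliased as `wcw` for this example).

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; "red" wrapped in ANSI colour codes (ESC [ 31 m ... ESC [ 0 m).
(def coloured "\u001b[31mred\u001b[0m")

(wcw/display-width coloured)       ;; => 3  (escape sequences take no columns)
(wcw/wcswidth coloured)            ;; => -1 (ESC is a non-printing code point)

;; A non-printing character (BEL, U+0007) counts as zero width here.
(wcw/display-width "a\u0007b")     ;; => 2
(wcw/display-width nil)            ;; => 0
```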
`(grapheme-clusters cs)`

Returns the [Unicode grapheme clusters](https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries) (what we tend to think of as "characters") in `cs` (a `CharSequence`) as a sequence of `String`s, or `nil` when `cs` is `nil`.

Notes:

* Will use [ICU4J's `BreakIterator`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/BreakIterator.html) class when available on the classpath, falling back on the [JDK's lower quality `BreakIterator`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/text/BreakIterator.html) class otherwise
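A minimal sketch of the difference between Java `char`s and grapheme clusters, using a base letter followed by a combining accent (a case both `BreakIterator` implementations should treat as a single cluster). The `wcwidth.api` namespace and `wcw` alias are assumptions.

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; "café" spelled with a decomposed é: the letter e followed by U+0301.
(def cafe "cafe\u0301")

(count cafe)                           ;; => 5 (UTF-16 code units)
(count (wcw/grapheme-clusters cafe))   ;; => 4 (user-perceived characters)
(last (wcw/grapheme-clusters cafe))    ;; => "e\u0301", i.e. e plus its accent
(wcw/grapheme-clusters nil)            ;; => nil
```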
Which implementation is in use for finding grapheme clusters? A keyword with one of these values:

* `:icu4j`
* `:jdk`
`(non-printing? code-point)`

Is `code-point`<sup>†</sup> a [non-printing character](https://en.wikipedia.org/wiki/Unicode_control_characters)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
`(null? code-point)`

Is `code-point`<sup>†</sup> a [null character](https://en.wikipedia.org/wiki/Null_character)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
A regular expression for matching ANSI escape sequences in a larger text. Adapted from [ECMA-48](https://www.ecma-international.org/publications-and-standards/standards/ecma-48/).
`(remove-ansi cs)`

Strips all ANSI escape sequences from `cs` (a `CharSequence`). Returns `nil` if `cs` is `nil`.
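For example, a string wrapped in bold and reset codes is reduced to its plain text. This sketch assumes the API namespace is `wcwidth.api`, aliased locally as `wcw`.

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; ESC [ 1 m switches bold on, ESC [ 0 m resets all attributes.
(wcw/remove-ansi "\u001b[1mbold\u001b[0m")   ;; => "bold"
(wcw/remove-ansi "plain")                    ;; => "plain"
(wcw/remove-ansi nil)                        ;; => nil
```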
`(string->code-points cs)`

Returns all of the Unicode code points in `cs` (a `CharSequence`), as a sequence of `int`s, or `nil` when `cs` is `nil`.
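The sketch below shows why code points and Java `char`s differ for supplementary-plane text: the surrogate pair comes back as a single `int`. It assumes the `wcwidth.api` namespace (aliased as `wcw`).

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; The letter a followed by U+1F600, written as its UTF-16 surrogate pair.
(def s "a\uD83D\uDE00")

(count s)                      ;; => 3 (UTF-16 code units)
(wcw/string->code-points s)    ;; => (97 128512), i.e. 0x61 and 0x1F600
(wcw/string->code-points nil)  ;; => nil
```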
`(wcswidth cs)`

Returns the number of columns needed to represent `cs` (a `CharSequence`). If a non-printing code point occurs in `cs`, `-1` is returned (as defined in POSIX).

Returns `0` when `cs` is `nil`.
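A few representative inputs, sketched under the assumption that the API namespace is `wcwidth.api` (aliased as `wcw`):

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

(wcw/wcswidth "hello")          ;; => 5
(wcw/wcswidth "\u4E2D\u6587")   ;; => 4  (two East Asian Wide ideographs)
(wcw/wcswidth "a\u0007b")       ;; => -1 (BEL is non-printing)
(wcw/wcswidth nil)              ;; => 0
```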
`(wcwidth code-point)`

Returns the number of columns needed to represent `code-point`<sup>†</sup>, based on these rules:

* Printable: `0`, `1`, or `2`
* Null character, or `nil`: `0`
* Non-printing: `-1`

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
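Each rule above can be exercised with one code point, as in this sketch. It assumes the API namespace is `wcwidth.api` (aliased as `wcw`); the code points are illustrative picks, not taken from the library's documentation.

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

;; Printable code points: 0, 1, or 2 columns.
(wcw/wcwidth 0x0301)   ;; => 0  (COMBINING ACUTE ACCENT)
(wcw/wcwidth 0x61)     ;; => 1  (LATIN SMALL LETTER A)
(wcw/wcwidth 0x4E2D)   ;; => 2  (CJK ideograph, East Asian Wide)

;; Null character, or nil: 0 columns.
(wcw/wcwidth 0x0000)   ;; => 0
(wcw/wcwidth nil)      ;; => 0

;; Non-printing: -1.
(wcw/wcwidth 0x07)     ;; => -1 (BEL)
```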
`(wide? code-point)`

Is `code-point`<sup>†</sup> in the [East Asian Wide (W), East Asian Full-width (F), or other wide character (e.g. emoji) category](https://en.wikipedia.org/wiki/Wide_character)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
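A brief sketch contrasting a wide and a narrow code point, assuming the `wcwidth.api` namespace (aliased as `wcw`). The comments describe truthiness rather than exact return values.

```clojure
;; Assumption: the library's API namespace is wcwidth.api; wcw is a local alias.
(require '[wcwidth.api :as wcw])

(wcw/wide? 0x4E2D)   ; CJK ideograph (East Asian Wide): truthy
(wcw/wide? 0x61)     ; LATIN SMALL LETTER A: falsy
```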