wcwidth.api

Clojure only.

code-point->string
code-points->string
combining?
display-width
grapheme-clusters
grapheme-clusters-impl
non-printing?
null?
re-ansi
remove-ansi
string->code-points
wcswidth
wcwidth
wide?

The public API of [`clj-wcwidth`](https://github.com/pmonks/clj-wcwidth).

code-point->string^clj

(code-point->string code-point)

Returns the `String` representation of any Unicode `code-point`<sup>†</sup>,
or `nil` when `code-point` is `nil`.

This is useful because Clojure/Java `String` literals only support escape
sequences (i.e. `"\uXXXX"`) for code points in the basic plane; code points
in the supplementary planes must be manually converted into their
[UTF-16 surrogate pair](https://en.wikipedia.org/wiki/UTF-16#Code_points_from_U+010000_to_U+10FFFF),
and then each UTF-16 code unit in the pair escaped separately (which is
tedious and error-prone).

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because
of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
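
To make the surrogate-pair point concrete, here is a minimal sketch of the conversion using only JDK interop. The helper name `cp->str` is hypothetical, and this is not necessarily how `code-point->string` is implemented.

```clojure
;; Hypothetical helper, for illustration only.
(defn cp->str
  "Returns the String for Unicode code point `cp` (a char or int), or nil
  when `cp` is nil."
  [cp]
  (when cp
    ;; Character/toChars yields one UTF-16 code unit for basic-plane code
    ;; points and a surrogate pair for supplementary-plane code points.
    (String. (Character/toChars (int cp)))))

;; U+1F600 lies in a supplementary plane, so as a literal it has to be
;; written as two separately escaped UTF-16 code units:
(= (cp->str 0x1F600) "\uD83D\uDE00")  ; true
(count (cp->str 0x1F600))             ; 2 (UTF-16 code units, not characters)
```

Passing the code point as an `int` avoids working out the surrogate pair by hand.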

code-points->string^clj

(code-points->string code-points)

Returns a `String` made up of all of the given Unicode
`code-points`<sup>†</sup>, or `nil` when `code-points` is `nil`.

<sup>†</sup>a sequence of `char`s or `int`s, but `int`s are usually the better
choice, because of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
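
As a rough illustration of the behaviour described above, the following sketch builds a `String` from a sequence of code points with the JDK's `StringBuilder`. The name `cps->str` is hypothetical, and the library's own implementation may differ.

```clojure
;; Hypothetical helper, for illustration only.
(defn cps->str
  "Returns a String built from `cps` (a sequence of chars or ints), or nil
  when `cps` is nil."
  [cps]
  (when cps
    (let [sb (StringBuilder.)]
      (doseq [cp cps]
        ;; appendCodePoint writes a surrogate pair when one is needed
        (.appendCodePoint sb (int cp)))
      (str sb))))

(= (cps->str [0x61 0x1F600]) "a\uD83D\uDE00")  ; true
```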

combining?^clj

(combining? code-point)

Is `code-point`<sup>†</sup> a [combining character](https://en.wikipedia.org/wiki/Combining_character)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because
of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
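
For intuition only, a comparable check can be approximated with the JDK's Unicode general categories. The name `combining-mark?` is hypothetical, and the library's own tables may classify some code points differently from this sketch.

```clojure
;; Hypothetical approximation, for illustration only: treats the three
;; Unicode "Mark" general categories (Mn, Me, Mc) as combining.
(defn combining-mark?
  [cp]
  (let [t (Character/getType (int cp))]
    (or (== t Character/NON_SPACING_MARK)
        (== t Character/ENCLOSING_MARK)
        (== t Character/COMBINING_SPACING_MARK))))

(combining-mark? 0x0301)  ; true  (COMBINING ACUTE ACCENT)
(combining-mark? 0x61)    ; false (LATIN SMALL LETTER A)
```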

display-width^clj

(display-width cs)

(display-width cs & {:keys [ignore-ansi?] :or {ignore-ansi? false}})

Returns the number of columns needed to display `cs` (a `CharSequence`), but
deviates from POSIX [[wcswidth]] behaviour in these ways:

* non-printing characters are considered zero width (instead of causing the
  entire result to be `-1`)
* ANSI escape sequences are (by default, but configurable) also considered
  zero width

For most use cases, this function is more useful than [[wcswidth]], despite
not adhering to POSIX.

Returns `0` when `cs` is `nil`.
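
The difference between the POSIX rule and the lenient rule described above can be sketched with a toy width function. All names below are hypothetical, `toy-width` is a deliberately crude stand-in for a real per-code-point width, and ANSI escape handling is left out.

```clojure
;; Toy per-code-point width, for illustration only: 0 for NUL, -1 for the
;; other C0 control characters, 1 for everything else.
(defn toy-width [cp]
  (cond
    (zero? cp) 0
    (< cp 32)  -1
    :else      1))

(defn code-points [^CharSequence cs]
  (vec (.toArray (.codePoints cs))))

;; POSIX-style rule: any non-printing code point makes the whole result -1.
(defn strict-width [cs]
  (if (nil? cs)
    0
    (let [widths (map toy-width (code-points cs))]
      (if (some neg? widths)
        -1
        (reduce + 0 widths)))))

;; Lenient rule: non-printing code points simply count as zero columns.
(defn lenient-width [cs]
  (if (nil? cs)
    0
    (reduce + 0 (map #(max 0 (toy-width %)) (code-points cs)))))

(strict-width  "a\tb")  ; -1
(lenient-width "a\tb")  ; 2
```

The lenient rule is usually what is wanted when padding or truncating text for a terminal, since one stray control character no longer invalidates the whole measurement.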

grapheme-clusters^clj

(grapheme-clusters cs)

Returns the [Unicode grapheme clusters](https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries)
(what we tend to think of as "characters") in `cs` (a `CharSequence`) as a
sequence of `String`s, or `nil` when `cs` is `nil`.

Notes:

* Uses [ICU4J's `BreakIterator`](https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/BreakIterator.html)
  class when it is available on the classpath, falling back on the
  [JDK's lower-quality `BreakIterator`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/text/BreakIterator.html)
  class otherwise
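
As a sketch of what the JDK fallback path might look like, the following splits a string at the boundaries reported by `java.text.BreakIterator`. The function name `jdk-grapheme-clusters` is hypothetical and the library's actual code may be structured differently.

```clojure
(import 'java.text.BreakIterator)

;; Hypothetical helper, for illustration only.
(defn jdk-grapheme-clusters
  "Returns the grapheme clusters in `cs` as a vector of Strings, using the
  JDK's character BreakIterator, or nil when `cs` is nil."
  [cs]
  (when cs
    (let [s  (str cs)
          bi (doto (BreakIterator/getCharacterInstance) (.setText s))]
      (loop [start (.first bi)
             end   (.next bi)
             acc   []]
        (if (= end BreakIterator/DONE)
          acc
          (recur end (.next bi) (conj acc (subs s start end))))))))

;; "e" followed by U+0301 (combining acute accent) stays together:
(jdk-grapheme-clusters "ae\u0301b")  ; ["a" "é" "b"], where "é" is two code points
```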

grapheme-clusters-impl^clj

Which implementation is in use for finding grapheme clusters?  A keyword
with one of these values:

* `:icu4j`
* `:jdk`
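
One plausible way to derive such a value is to probe the classpath for ICU4J's `BreakIterator` class. This is a sketch under that assumption, with a hypothetical var name `detected-impl`; the source does not say how the library actually makes the decision.

```clojure
;; Hypothetical detection logic, for illustration only.
(def detected-impl
  (try
    (Class/forName "com.ibm.icu.text.BreakIterator")
    :icu4j
    (catch ClassNotFoundException _
      :jdk)))

detected-impl  ; :icu4j when ICU4J is on the classpath, otherwise :jdk
```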

non-printing?^clj

(non-printing? code-point)

Is `code-point`<sup>†</sup> a [non-printing character](https://en.wikipedia.org/wiki/Unicode_control_characters)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because
of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)

null?^clj

(null? code-point)

Is `code-point`<sup>†</sup> a [null character](https://en.wikipedia.org/wiki/Null_character)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because
of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)

re-ansi^clj

A regular expression for matching ANSI escape sequences in a larger text.
Adapted from [ECMA-48](https://www.ecma-international.org/publications-and-standards/standards/ecma-48/).

remove-ansi^clj

(remove-ansi cs)

Strips all ANSI escape sequences from `cs` (a `CharSequence`).  Returns `nil`
if `cs` is `nil`.
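
To show the general idea, here is a sketch that strips one common family of ANSI escapes, the CSI sequences (such as colour codes). The regex `csi-re` and the function `strip-csi` are hypothetical and deliberately simplified; the library's `re-ansi` is assumed to cover more of ECMA-48 than this.

```clojure
(require '[clojure.string :as s])

;; Simplified pattern, for illustration only: ESC, "[", parameter bytes
;; (0x30-0x3F), intermediate bytes (0x20-0x2F), then one final byte
;; (0x40-0x7E).
(def csi-re #"\u001B\[[0-?]*[ -/]*[@-~]")

(defn strip-csi
  "Removes CSI escape sequences from `cs`, or returns nil when `cs` is nil."
  [cs]
  (when cs
    (s/replace (str cs) csi-re "")))

(strip-csi "\u001B[1;31mred\u001B[0m text")  ; "red text"
```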

string->code-points^clj

(string->code-points cs)

Returns all of the Unicode code points in `cs` (a `CharSequence`), as a
sequence of `int`s, or `nil` when `cs` is `nil`.
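
The distinction between code points and UTF-16 code units can be seen in a small sketch built on the JDK's `CharSequence.codePoints`. The name `str->cps` is hypothetical, and the library's implementation may differ.

```clojure
;; Hypothetical helper, for illustration only.
(defn str->cps
  "Returns the code points in `cs` as a vector of ints, or nil when `cs` is
  nil."
  [^CharSequence cs]
  (when cs
    (vec (.toArray (.codePoints cs)))))

(def smiley "\uD83D\uDE00")  ; U+1F600 written as a surrogate pair

(count smiley)               ; 2 (UTF-16 code units)
(count (str->cps smiley))    ; 1 (code point)
(str->cps "a")               ; [97]
```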

wcswidth^clj

(wcswidth cs)

Returns the number of columns needed to represent `cs` (a `CharSequence`). If
a non-printing code point occurs in `cs`, `-1` is returned (as defined in
POSIX).

Returns `0` when `cs` is `nil`.

wcwidth^clj

(wcwidth code-point)

Returns the number of columns needed to represent `code-point`<sup>†</sup>,
based on these rules:

* Printable: `0`, `1`, or `2`
* Null character, or `nil`: `0`
* Non-printing: `-1`

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because
of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)

wide?^clj

(wide? code-point)

Is `code-point`<sup>†</sup> in the [East Asian Wide (W), East Asian Full-width
(F), or other wide character (e.g. emoji) category](https://en.wikipedia.org/wiki/Wide_character)?

<sup>†</sup>a `char` or `int`, but `int` is usually the better choice, because
of [historical limitations with Java's `char` type](https://www.oracle.com/technical-resources/articles/javase/supplementary.html)
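
A wide-character test of this kind typically comes down to range lookups. The sketch below checks only three well-known ranges, so it is far from complete; the names `some-wide-ranges` and `roughly-wide?` are hypothetical and the library's tables are assumed to be much more extensive.

```clojure
;; A small, incomplete sample of wide ranges, for illustration only.
(def some-wide-ranges
  [[0x4E00 0x9FFF]    ; CJK Unified Ideographs (W)
   [0xAC00 0xD7A3]    ; Hangul Syllables (W)
   [0xFF01 0xFF60]])  ; Fullwidth ASCII variants (F)

(defn roughly-wide?
  "Is `cp` (a char or int) inside one of the sample ranges above?"
  [cp]
  (let [cp (int cp)]
    (boolean (some (fn [[lo hi]] (<= lo cp hi)) some-wide-ranges))))

(roughly-wide? 0x4E2D)  ; true  (CJK ideograph)
(roughly-wide? 0x61)    ; false (LATIN SMALL LETTER A)
```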
