(! expr)
Params: (e: Column)
Result: Column
Inversion of a boolean expression, i.e. NOT.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.497Z
(** base exponent)
Params: (l: Column, r: Column)
Result: Column
Returns the value of the first argument raised to the power of the second argument.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.520Z
(->date-col expr)
(->date-col expr date-format)
Params: (e: Column)
Result: Column
Converts the column into DateType by casting rules to DateType.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.616Z
(->timestamp-col expr)
(->timestamp-col expr date-format)
Params: (s: Column)
Result: Column
Converts to a timestamp by casting rules to TimestampType.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A timestamp, or null if the input was a string that could not be cast to a timestamp
2.2.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.623Z
(->utc-timestamp expr)
Params: (ts: Column, tz: String)
Result: Column
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
A timestamp, or null if ts was a string that could not be cast to a timestamp or tz was an invalid value
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.626Z
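A minimal Scala sketch of the underlying to_utc_timestamp call (the DataFrame df is a hypothetical name):
  import org.apache.spark.sql.functions._
  // interpret the literal as GMT+1 local time and render it as a UTC timestamp
  df.select(to_utc_timestamp(lit("2017-07-14 02:40:00.0"), "GMT+1"))  // 2017-07-14 01:40:00.0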
(abs expr)
Params: (e: Column)
Result: Column
Computes the absolute value of a numeric value.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.169Z
(acos expr)
Params: (e: Column)
Result: Column
inverse cosine of e in radians, as if computed by java.lang.Math.acos
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.171Z
(add-months expr months)
Params: (startDate: Column, numMonths: Int)
Result: Column
Returns the date that is numMonths after startDate.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
The number of months to add to startDate, can be negative to subtract months
A date, or null if startDate was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.174Z
(aggregate expr init merge-fn)
(aggregate expr init merge-fn finish-fn)
Params: (expr: Column, initialValue: Column, merge: (Column, Column) ⇒ Column, finish: (Column) ⇒ Column)
Result: Column
Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.
the input array column
the initial value
(combined_value, input_value) => combined_value, the merge function to merge an input value to the combined_value
combined_value => final_value, the lambda function to convert the combined value of all inputs to final result
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.177Z
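A rough Scala illustration of the underlying aggregate function (the DataFrame df and the array column values are made up):
  import org.apache.spark.sql.functions._
  // sum the array elements starting from 0, then scale the total by 10 in the finish step
  df.select(aggregate(col("values"), lit(0), (acc, x) => acc + x, acc => acc * 10))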
(approx-count-distinct expr)
(approx-count-distinct expr rsd)
Params: (e: Column)
Result: Column
Deprecated since 2.1.0; use approx_count_distinct instead.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.742Z
(array & exprs)
Params: (cols: Column*)
Result: Column
Creates a new array column. The input columns must all have the same data type.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.184Z
(array-contains expr value)
Params: (column: Column, value: Any)
Result: Column
Returns null if the array is null, true if the array contains value, and false otherwise.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.185Z
(array-distinct expr)
Params: (e: Column)
Result: Column
Removes duplicate values from the array.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.186Z
(array-except left right)
Params: (col1: Column, col2: Column)
Result: Column
Returns an array of the elements in the first array but not in the second array, without duplicates. The order of the elements in the result is not determined.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.188Z
(array-intersect left right)
Params: (col1: Column, col2: Column)
Result: Column
Returns an array of the elements in the intersection of the given two arrays, without duplicates.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.189Z
(array-join expr delimiter)
(array-join expr delimiter null-replacement)
Params: (column: Column, delimiter: String, nullReplacement: String)
Result: Column
Concatenates the elements of column using the delimiter. Null values are replaced with nullReplacement.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.194Z
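For illustration, a hedged Scala call to the underlying array_join (the array column tags is hypothetical):
  import org.apache.spark.sql.functions._
  // join the array elements with ", ", writing "n/a" in place of null elements
  df.select(array_join(col("tags"), ", ", "n/a"))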
(array-max expr)
Params: (e: Column)
Result: Column
Returns the maximum value in the array.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.195Z
(array-min expr)
Params: (e: Column)
Result: Column
Returns the minimum value in the array.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.197Z
(array-position expr value)
Params: (column: Column, value: Any)
Result: Column
Locates the position of the first occurrence of the value in the given array as a long. Returns null if either of the arguments is null.
2.4.0
The position is 1-based, not 0-based. Returns 0 if the value could not be found in the array.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.198Z
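A small Scala sketch of the underlying array_position showing the 1-based indexing (a literal array is used for illustration):
  import org.apache.spark.sql.functions._
  df.select(array_position(array(lit("a"), lit("b"), lit("c")), "b"))  // 2 (1-based); 0 if not found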
(array-remove expr element)
Params: (column: Column, element: Any)
Result: Column
Removes all elements that are equal to element from the given array.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.199Z
(array-repeat left right)
Params: (left: Column, right: Column)
Result: Column
Creates an array containing the left argument repeated the number of times given by the right argument.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.201Z
(array-sort expr)
Params: (e: Column)
Result: Column
Sorts the input array in ascending order. The elements of the input array must be orderable. Null elements will be placed at the end of the returned array.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.202Z
(array-union left right)
Params: (col1: Column, col2: Column)
Result: Column
Returns an array of the elements in the union of the given two arrays, without duplicates.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.204Z
(arrays-overlap left right)
Params: (a1: Column, a2: Column)
Result: Column
Returns true if a1 and a2 have at least one non-null element in common. If not and both the arrays are non-empty and any of them contains a null, it returns null. It returns false otherwise.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.209Z
(arrays-zip & exprs)
Params: (e: Column*)
Result: Column
Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.211Z
(ascii expr)
Params: (e: Column)
Result: Column
Computes the numeric value of the first character of the string column, and returns the result as an int column.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.216Z
(asin expr)
Params: (e: Column)
Result: Column
inverse sine of e in radians, as if computed by java.lang.Math.asin
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.219Z
(atan expr)
Params: (e: Column)
Result: Column
inverse tangent of e, as if computed by java.lang.Math.atan
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.221Z
(atan-2 expr-x expr-y)
Params: (y: Column, x: Column)
Result: Column
coordinate on y-axis
coordinate on x-axis
the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.233Z
(atan2 expr-x expr-y)
Params: (y: Column, x: Column)
Result: Column
coordinate on y-axis
coordinate on x-axis
the theta component of the point (r, theta) in polar coordinates that corresponds to the point (x, y) in Cartesian coordinates, as if computed by java.lang.Math.atan2
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.233Z
(base-64 expr)
Params: (e: Column)
Result: Column
Computes the BASE64 encoding of a binary column and returns it as a string column. This is the reverse of unbase64.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.236Z
(base64 expr)
Params: (e: Column)
Result: Column
Computes the BASE64 encoding of a binary column and returns it as a string column. This is the reverse of unbase64.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.236Z
(bin expr)
Params: (e: Column)
Result: Column
An expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.238Z
(bitwise-not expr)
Params: (e: Column)
Result: Column
Computes bitwise NOT (~) of a number.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.239Z
(broadcast dataframe)
Params: (df: Dataset[T])
Result: Dataset[T]
Marks a DataFrame as small enough for use in broadcast joins.
The following example marks the right DataFrame for broadcast hash join using joinKey.
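(Sketched here in the underlying Spark Scala API; largeDF, smallDF and joinKey are hypothetical names.)
  import org.apache.spark.sql.functions.broadcast
  // hint that smallDF is small enough to be broadcast to every executor
  largeDF.join(broadcast(smallDF), "joinKey")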
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.240Z
(bround expr)
Params: (e: Column)
Result: Column
Returns the value of the column e rounded to 0 decimal places with HALF_EVEN round mode.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.243Z
(cbrt expr)
Params: (e: Column)
Result: Column
Computes the cube-root of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.253Z
(ceil expr)
Params: (e: Column)
Result: Column
Computes the ceiling of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.255Z
(collect-list expr)
Params: (e: Column)
Result: Column
Aggregate function: returns a list of objects with duplicates.
1.6.0
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.261Z
(collect-set expr)
Params: (e: Column)
Result: Column
Aggregate function: returns a set of objects with duplicate elements eliminated.
1.6.0
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.263Z
(concat & exprs)
Params: (exprs: Column*)
Result: Column
Concatenates multiple input columns together into a single column. The function works with strings, binary and compatible array columns.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.265Z
(concat-ws sep & exprs)
Params: (sep: String, exprs: Column*)
Result: Column
Concatenates multiple input string columns together into a single string column, using the given separator.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.267Z
(conv expr from-base to-base)
Params: (num: Column, fromBase: Int, toBase: Int)
Result: Column
Converts a number in a string column from one base to another.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.268Z
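A hedged Scala example of the underlying conv call (a literal is used for illustration):
  import org.apache.spark.sql.functions._
  df.select(conv(lit("100"), 2, 10))  // "4": binary 100 converted to base 10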
(cos expr)
Params: (e: Column)
Result: Column
angle in radians
cosine of the angle, as if computed by java.lang.Math.cos
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.272Z
(cosh expr)
Params: (e: Column)
Result: Column
hyperbolic angle
hyperbolic cosine of the angle, as if computed by java.lang.Math.cosh
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.275Z
(count-distinct & exprs)
Params: (expr: Column, exprs: Column*)
Result: Column
Aggregate function: returns the number of distinct items in a group.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.279Z
(covar l-expr r-expr)
Params: (column1: Column, column2: Column)
Result: Column
Aggregate function: returns the sample covariance for two columns.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.284Z
(covar-pop l-expr r-expr)
Params: (column1: Column, column2: Column)
Result: Column
Aggregate function: returns the population covariance for two columns.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.282Z
(covar-samp l-expr r-expr)
Params: (column1: Column, column2: Column)
Result: Column
Aggregate function: returns the sample covariance for two columns.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.284Z
(crc-32 expr)
Params: (e: Column)
Result: Column
Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.285Z
(crc32 expr)
Params: (e: Column)
Result: Column
Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.285Z
(cube-root expr)
Params: (e: Column)
Result: Column
Computes the cube-root of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.253Z
(cume-dist)
Params: ()
Result: Column
Window function: returns the cumulative distribution of values within a window partition, i.e. the fraction of rows that are below the current row.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.286Z
(current-date)
Params: ()
Result: Column
Returns the current date as a date column.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.287Z
(current-timestamp)
Params: ()
Result: Column
Returns the current timestamp as a timestamp column.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.288Z
(date-add expr days)
Params: (start: Column, days: Int)
Result: Column
Returns the date that is days days after start
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
The number of days to add to start, can be negative to subtract days
A date, or null if start was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.295Z
(date-diff l-expr r-expr)
Params: (end: Column, start: Column)
Result: Column
Returns the number of days from start to end.
Only considers the date part of the input. For example:
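(Illustrated here with the underlying Spark Scala datediff; the literals are arbitrary.)
  import org.apache.spark.sql.functions._
  datediff(lit("2018-01-10 00:00:01"), lit("2018-01-09 23:59:59"))  // 1, since only the date parts are compared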
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
An integer, or null if either end or start were strings that could not be cast to a date. Negative if end is before start
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.304Z
(date-format expr date-fmt)
Params: (dateExpr: Column, format: String)
Result: Column
Converts a date/timestamp/string to a value of string in the format specified by the date format given by the second argument.
See Datetime Patterns for valid date and time format patterns
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A pattern dd.MM.yyyy would return a string like 18.03.1993
A string, or null if dateExpr was a string that could not be cast to a timestamp
1.5.0
IllegalArgumentException if the format pattern is invalid
Use specialized functions like year whenever possible as they benefit from a specialized implementation.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.297Z
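A minimal Scala sketch of the underlying date_format call (the ts column is hypothetical):
  import org.apache.spark.sql.functions._
  df.select(date_format(col("ts"), "dd.MM.yyyy"))  // e.g. "18.03.1993"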
(date-sub expr days)
Params: (start: Column, days: Int)
Result: Column
Returns the date that is days days before start
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
The number of days to subtract from start, can be negative to add days
A date, or null if start was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.300Z
(date-trunc fmt expr)
Params: (format: String, timestamp: Column)
Result: Column
Returns timestamp truncated to the unit specified by the format.
For example, date_trunc("year", "2018-11-19 12:01:19") returns 2018-01-01 00:00:00
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A timestamp, or null if timestamp was a string that could not be cast to a timestamp or format was an invalid value
2.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.302Z
(datediff l-expr r-expr)
Params: (end: Column, start: Column)
Result: Column
Returns the number of days from start to end.
Only considers the date part of the input; see the example under date-diff above.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
An integer, or null if either end or start were strings that could not be cast to a date. Negative if end is before start
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.304Z
(day-of-month expr)
Params: (e: Column)
Result: Column
Extracts the day of the month as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.305Z
(day-of-week expr)
Params: (e: Column)
Result: Column
Extracts the day of the week as an integer from a given date/timestamp/string. Ranges from 1 for a Sunday through to 7 for a Saturday
An integer, or null if the input was a string that could not be cast to a date
2.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.306Z
(day-of-year expr)
Params: (e: Column)
Result: Column
Extracts the day of the year as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.307Z
(dayofmonth expr)
Params: (e: Column)
Result: Column
Extracts the day of the month as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.305Z
(dayofweek expr)
Params: (e: Column)
Result: Column
Extracts the day of the week as an integer from a given date/timestamp/string. Ranges from 1 for a Sunday through to 7 for a Saturday
An integer, or null if the input was a string that could not be cast to a date
2.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.306Z
(dayofyear expr)
Params: (e: Column)
Result: Column
Extracts the day of the year as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.307Z
(decode expr charset)
Params: (value: Column, charset: String)
Result: Column
Computes the first argument into a string from a binary using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is null, the result will also be null.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.309Z
(degrees expr)
Params: (e: Column)
Result: Column
Converts an angle measured in radians to an approximately equivalent angle measured in degrees.
angle in radians
angle in degrees, as if computed by java.lang.Math.toDegrees
2.1.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.312Z
(dense-rank)
Params: ()
Result: Column
Window function: returns the rank of rows within a window partition, without any gaps.
The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. Rank, by contrast, gives sequential numbers, so the person that came in third place (after the ties) would register as coming in fifth.
This is equivalent to the DENSE_RANK function in SQL.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.313Z
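A hedged Scala sketch of the underlying dense_rank used over a window (the dept and salary columns are made up):
  import org.apache.spark.sql.expressions.Window
  import org.apache.spark.sql.functions._
  // rank rows within each department by salary, with no gaps after ties
  df.withColumn("rank", dense_rank().over(Window.partitionBy("dept").orderBy("salary")))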
(element-at expr value)
Params: (column: Column, value: Any)
Result: Column
Returns the element of the array at the given index if column is an array, or the value for the given key if column is a map.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.318Z
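For illustration, the two behaviours of the underlying element_at in Scala (letters is assumed to be an array column, scores a map column):
  import org.apache.spark.sql.functions._
  df.select(element_at(col("letters"), 1))        // first array element (1-based index)
  df.select(element_at(col("scores"), "alice"))   // value for key "alice" in the map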
(encode expr charset)
Params: (value: Column, charset: String)
Result: Column
Computes the first argument into a binary from a string using the provided character set (one of 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is null, the result will also be null.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.319Z
(exists expr predicate)
Params: (column: Column, f: (Column) ⇒ Column)
Result: Column
Returns whether a predicate holds for one or more elements in the array.
the input array column
col => predicate, the Boolean predicate to check the input column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.322Z
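A minimal Scala sketch of the underlying exists call (xs is a hypothetical array column):
  import org.apache.spark.sql.functions._
  // true if any element of xs is greater than 3
  df.select(exists(col("xs"), x => x > 3))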
(exp expr)
Params: (e: Column)
Result: Column
Computes the exponential of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.324Z
(explode expr)
Params: (e: Column)
Result: Column
Creates a new row for each element in the given array or map column. Uses the default column name col for elements in the array and key and value for elements in the map unless specified otherwise.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.325Z
(explode-outer expr)
Params: (e: Column)
Result: Column
Creates a new row for each element in the given array or map column. Uses the default column name col for elements in the array and key and value for elements in the map unless specified otherwise. Unlike explode, if the array or map is null or empty, a row with null is produced.
2.2.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.325Z
(expm-1 expr)
Params: (e: Column)
Result: Column
Computes the exponential of the given value minus one.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.329Z
(expm1 expr)
Params: (e: Column)
Result: Column
Computes the exponential of the given value minus one.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.329Z
(expr s)
Params: (expr: String)
Result: Column
Parses the expression string into the column that it represents, similar to Dataset#selectExpr.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.330Z
(factorial expr)
Params: (e: Column)
Result: Column
Computes the factorial of the given value.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.331Z
(flatten expr)
Params: (e: Column)
Result: Column
Creates a single array from an array of arrays. If a structure of nested arrays is deeper than two levels, only one level of nesting is removed.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.345Z
(floor expr)
Params: (e: Column)
Result: Column
Computes the floor of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.347Z
(forall expr predicate)
Params: (column: Column, f: (Column) ⇒ Column)
Result: Column
Returns whether a predicate holds for every element in the array.
the input array column
col => predicate, the Boolean predicate to check the input column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.349Z
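A minimal Scala sketch of the underlying Spark function; df, the scores array column, and the threshold are hypothetical.
  import org.apache.spark.sql.functions.{col, forall, lit}
  // True only if every element of the array column satisfies the predicate.
  df.select(forall(col("scores"), x => x >= lit(0)).as("all_non_negative"))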
(format-number expr decimal-places)
Params: (x: Column, d: Int)
Result: Column
Formats numeric column x to a format like '#,###,###.##', rounded to d decimal places with HALF_EVEN round mode, and returns the result as a string column.
If d is 0, the result has no decimal point or fractional part. If d is less than 0, the result will be null.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.350Z
(format-string fmt & exprs)
Params: (format: String, arguments: Column*)
Result: Column
Formats the arguments in printf-style and returns the result as a string column.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.351Z
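A minimal Scala sketch of the underlying Spark function; df and the country/id columns are hypothetical.
  import org.apache.spark.sql.functions.{col, format_string}
  // printf-style formatting; %s and %04d consume the argument columns in order.
  df.select(format_string("%s-%04d", col("country"), col("id")).as("tag"))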
(from-csv expr schema)
(from-csv expr schema options)
Params: (e: Column, schema: StructType, options: Map[String, String])
Result: Column
Parses a column containing a CSV string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
a string column containing CSV data.
the schema to use when parsing the CSV string
options to control how the CSV is parsed. Accepts the same options as the CSV data source.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.354Z
(from-json expr schema)
(from-json expr schema options)
Params: (e: Column, schema: StructType, options: Map[String, String])
Result: Column
(Scala-specific) Parses a column containing a JSON string into a StructType with the specified schema. Returns null, in the case of an unparseable string.
a string column containing JSON data.
the schema to use when parsing the json string
options to control how the json is parsed. Accepts the same options as the json data source.
2.1.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.372Z
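A minimal Scala sketch of the underlying Spark function; df, the payload column, and the schema fields are hypothetical.
  import org.apache.spark.sql.functions.{col, from_json}
  import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
  // Parse a JSON string column into a struct; unparseable rows yield null.
  val schema = StructType(Seq(StructField("name", StringType), StructField("age", IntegerType)))
  df.select(from_json(col("payload"), schema).as("parsed"))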
(from-unixtime expr)
(from-unixtime expr fmt)
Params: (ut: Column)
Result: Column
Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the yyyy-MM-dd HH:mm:ss format.
A number of a type that is castable to a long, such as string or integer. Can be negative for timestamps before the unix epoch
A string, or null if the input was a string that could not be cast to a long
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.375Z
(greatest & exprs)
Params: (exprs: Column*)
Result: Column
Returns the greatest value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.382Z
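A minimal Scala sketch of the underlying Spark function; df and the quarterly columns are hypothetical.
  import org.apache.spark.sql.functions.{col, greatest}
  // Row-wise maximum across the given columns, skipping nulls.
  df.select(greatest(col("q1"), col("q2"), col("q3")).as("best_quarter"))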
(grouping expr)
Params: (e: Column)
Result: Column
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.388Z
(grouping-id & exprs)
Params: (cols: Column*)
Result: Column
Aggregate function: returns the level of grouping, equal to (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn).
2.0.0
The list of columns should match the grouping columns exactly, or be empty (meaning all of the grouping columns).
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.390Z
(hash & exprs)
Params: (cols: Column*)
Result: Column
Calculates the hash code of given columns, and returns the result as an int column.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.391Z
(hex expr)
Params: (column: Column)
Result: Column
Computes hex value of the given column.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.393Z
(hour expr)
Params: (e: Column)
Result: Column
Extracts the hours as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.394Z
(hypot left-expr right-expr)
Params: (l: Column, r: Column)
Result: Column
Computes sqrt(a^2 + b^2) without intermediate overflow or underflow.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.406Z
(initcap expr)
Params: (e: Column)
Result: Column
Returns a new string column by converting the first letter of each word to uppercase. Words are delimited by whitespace.
For example, "hello world" will become "Hello World".
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.407Z
(input-file-name)
Params: ()
Result: Column
Creates a string column for the file name of the current Spark task.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.408Z
(instr expr substr)
Params: (str: Column, substring: String)
Result: Column
Locate the position of the first occurrence of substr column in the given string. Returns null if either of the arguments are null.
1.5.0
The position is not zero based, but 1 based index. Returns 0 if substr could not be found in str.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.409Z
(kurtosis expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the kurtosis of the values in a group.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.416Z
(lag expr offset)
(lag expr offset default)
Params: (e: Column, offset: Int)
Result: Column
Window function: returns the value that is offset rows before the current row, and null if there are fewer than offset rows before the current row. For example, an offset of one will return the previous row at any given point in the window partition.
This is equivalent to the LAG function in SQL.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.421Z
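A minimal Scala sketch of the underlying Spark function used over a window; df, user_id, event_time, and amount are hypothetical.
  import org.apache.spark.sql.expressions.Window
  import org.apache.spark.sql.functions.{col, lag}
  // Previous amount per user in event-time order; 0 when there is no previous row.
  val w = Window.partitionBy("user_id").orderBy("event_time")
  df.withColumn("prev_amount", lag(col("amount"), 1, 0).over(w))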
(last-day expr)
Params: (e: Column)
Result: Column
Returns the last day of the month which the given date belongs to. For example, input "2015-07-27" returns "2015-07-31" since July 31 is the last day of the month in July 2015.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A date, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.431Z
(lead expr offset)
(lead expr offset default)
Params: (columnName: String, offset: Int)
Result: Column
Window function: returns the value that is offset rows after the current row, and null if there are fewer than offset rows after the current row. For example, an offset of one will return the next row at any given point in the window partition.
This is equivalent to the LEAD function in SQL.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.437Z
(least & exprs)
Params: (exprs: Column*)
Result: Column
Returns the least value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.439Z
(length expr)
Params: (e: Column)
Result: Column
Computes the character length of a given string or number of bytes of a binary string. The length of character strings includes the trailing spaces. The length of binary strings includes binary zeros.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.440Z
(levenshtein left-expr right-expr)
Params: (l: Column, r: Column)
Result: Column
Computes the Levenshtein distance of the two given string columns.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.441Z
(locate substr expr)
Params: (substr: String, str: Column)
Result: Column
Locate the position of the first occurrence of substr.
1.5.0
The position is not zero based, but 1 based index. Returns 0 if substr could not be found in str.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.445Z
(log expr)
Params: (e: Column)
Result: Column
Computes the natural logarithm of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.449Z
(log-10 expr)
Params: (e: Column)
Result: Column
Computes the logarithm of the given value in base 10.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.451Z
(log-1p expr)
Params: (e: Column)
Result: Column
Computes the natural logarithm of the given value plus one.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.453Z
(log-2 expr)
Params: (expr: Column)
Result: Column
Computes the logarithm of the given column in base 2.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.455Z
(log10 expr)
Params: (e: Column)
Result: Column
Computes the logarithm of the given value in base 10.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.451Z
(log1p expr)
Params: (e: Column)
Result: Column
Computes the natural logarithm of the given value plus one.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.453Z
(log2 expr)
Params: (expr: Column)
Result: Column
Computes the logarithm of the given column in base 2.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.455Z
(lower expr)
Params: (e: Column)
Result: Column
Converts a string column to lower case.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.457Z
(lpad expr length pad)
Params: (str: Column, len: Int, pad: String)
Result: Column
Left-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.458Z
(ltrim expr)
Params: (e: Column)
Result: Column
Trim the spaces from left end for the specified string value.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.460Z
(map & exprs)
Params: (cols: Column*)
Result: Column
Creates a new map column. The input columns must be grouped as key-value pairs, e.g. (key1, value1, key2, value2, ...). The key columns must all have the same data type, and can't be null. The value columns must all have the same data type.
2.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.461Z
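A minimal Scala sketch of the underlying Spark function; df and the w/h columns are hypothetical.
  import org.apache.spark.sql.functions.{col, lit, map}
  // Alternating key/value columns build a single map column.
  df.select(map(lit("width"), col("w"), lit("height"), col("h")).as("dims"))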
(map-concat & exprs)
Params: (cols: Column*)
Result: Column
Returns the union of all the given maps.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.462Z
(map-entries expr)
Params: (e: Column)
Result: Column
Returns an unordered array of all entries in the given map.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.463Z
(map-filter expr predicate)
Params: (expr: Column, f: (Column, Column) ⇒ Column)
Result: Column
Returns a map whose key-value pairs satisfy a predicate.
the input map column
(key, value) => predicate, the Boolean predicate to filter the input map column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.465Z
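A minimal Scala sketch of the underlying Spark function; df, the scores map column, and the threshold are hypothetical.
  import org.apache.spark.sql.functions.{col, lit, map_filter}
  // Keep only the entries whose value satisfies the predicate.
  df.select(map_filter(col("scores"), (k, v) => v > lit(0.5)).as("high_scores"))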
(map-from-arrays key-expr val-expr)
Params: (keys: Column, values: Column)
Result: Column
Creates a new map column. The array in the first column is used for keys. The array in the second column is used for values. All elements in the array for key should not be null.
2.4
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.470Z
(map-from-entries expr)
Params: (e: Column)
Result: Column
Returns a map created from the given array of entries.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.471Z
(map-keys expr)
Params: (e: Column)
Result: Column
Returns an unordered array containing the keys of the map.
2.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.472Z
(map-values expr)
Params: (e: Column)
Result: Column
Returns an unordered array containing the values of the map.
2.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.473Z
(map-zip-with left right merge-fn)
Params: (left: Column, right: Column, f: (Column, Column, Column) ⇒ Column)
Result: Column
Merge two given maps, key-wise into a single map using a function.
the left input map column
the right input map column
(key, value1, value2) => new_value, the lambda function to merge the map values
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.474Z
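A minimal Scala sketch of the underlying Spark function; df and the base/bonus map columns are hypothetical.
  import org.apache.spark.sql.functions.{col, map_zip_with}
  // Merge two maps key-wise; a key present in only one map sees null on the other side.
  df.select(map_zip_with(col("base"), col("bonus"), (k, v1, v2) => v1 + v2).as("total"))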
(md-5 expr)
Params: (e: Column)
Result: Column
Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.478Z
(md5 expr)
Params: (e: Column)
Result: Column
Calculates the MD5 digest of a binary column and returns the value as a 32 character hex string.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.478Z
(minute expr)
Params: (e: Column)
Result: Column
Extracts the minutes as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.483Z
(monotonically-increasing-id)
Params: ()
Result: Column
A column expression that generates monotonically increasing 64-bit integers.
The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.
As an example, consider a DataFrame with two partitions, each with 3 records. This expression would return the following IDs: 0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
(Since version 2.0.0) Use monotonically_increasing_id()
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.744Z
(month expr)
Params: (e: Column)
Result: Column
Extracts the month as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.486Z
(months-between l-expr r-expr)
Params: (end: Column, start: Column)
Result: Column
Returns number of months between dates start and end.
A whole number is returned if both inputs have the same day of month or both are the last day of their respective months. Otherwise, the difference is calculated assuming 31 days per month.
For example, months_between('2018-03-15', '2018-01-15') returns 2.0, since both dates fall on the same day of the month.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A double, or null if either end or start were strings that could not be cast to a timestamp. Negative if end is before start
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.490Z
(nanvl left-expr right-expr)
Params: (col1: Column, col2: Column)
Result: Column
Returns col1 if it is not NaN, or col2 if col1 is NaN.
Both inputs should be floating point columns (DoubleType or FloatType).
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.492Z
(negate expr)
Params: (e: Column)
Result: Column
Unary minus, i.e. negate the expression.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.494Z
(next-day expr day-of-week)
Params: (date: Column, dayOfWeek: String)
Result: Column
Returns the first date which is later than the value of the date column that is on the specified day of the week.
For example, next_day('2015-07-27', "Sunday") returns 2015-08-02 because that is the first Sunday after 2015-07-27.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a date, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
Case insensitive, and accepts: "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
A date, or null if date was a string that could not be cast to a date or if dayOfWeek was an invalid value
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.495Z
(not expr)
Params: (e: Column)
Result: Column
Inversion of boolean expression, i.e. NOT.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.497Z
(ntile n)
Params: (n: Int)
Result: Column
Window function: returns the ntile group id (from 1 to n inclusive) in an ordered window partition. For example, if n is 4, the first quarter of the rows will get value 1, the second quarter will get 2, the third quarter will get 3, and the last quarter will get 4.
This is equivalent to the NTILE function in SQL.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.500Z
(overlay src rep pos)
(overlay src rep pos len)
Params: (src: Column, replace: Column, pos: Column, len: Column)
Result: Column
Overlay the specified portion of src with replace, starting from byte position pos of src and proceeding for len bytes.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.503Z
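A minimal Scala sketch of the underlying Spark function; df, the sku column, and the replacement text are hypothetical.
  import org.apache.spark.sql.functions.{col, lit, overlay}
  // Replace part of the string starting at (1-based) position 3 with "XX".
  df.select(overlay(col("sku"), lit("XX"), lit(3)).as("masked_sku"))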
(percent-rank)
Params: ()
Result: Column
Window function: returns the relative rank (i.e. percentile) of rows within a window partition.
This is computed by: (rank of row in its partition - 1) / (number of rows in the partition - 1).
This is equivalent to the PERCENT_RANK function in SQL.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.504Z
(pi)
The double value that is closer than any other to pi, the ratio of the circumference of a circle to its diameter.
(pmod left-expr right-expr)
Params: (dividend: Column, divisor: Column)
Result: Column
Returns the positive value of dividend mod divisor.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.505Z
(posexplode expr)
Params: (e: Column)
Result: Column
Creates a new row for each element with position in the given array or map column. Uses the default column name pos for position, and col for elements in the array and key and value for elements in the map unless specified otherwise.
2.1.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.506Z
(posexplode-outer expr)
Params: (e: Column)
Result: Column
Creates a new row for each element with position in the given array or map column. Uses the default column name pos for position, and col for elements in the array and key and value for elements in the map unless specified otherwise. Unlike posexplode, if the array or map is null or empty then the row (null, null) is produced.
2.1.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.506Z
(pow base exponent)
Params: (l: Column, r: Column)
Result: Column
Returns the value of the first argument raised to the power of the second argument.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.520Z
(quarter expr)
Params: (e: Column)
Result: Column
Extracts the quarter as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.521Z
(radians expr)
Params: (e: Column)
Result: Column
Converts an angle measured in degrees to an approximately equivalent angle measured in radians.
angle in degrees
angle in radians, as if computed by java.lang.Math.toRadians
2.1.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.523Z
(rand)
(rand seed)
Params: (seed: Long)
Result: Column
Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
1.4.0
The function is non-deterministic in the general case.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.526Z
(randn)
(randn seed)
Params: (seed: Long)
Result: Column
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
1.4.0
The function is non-deterministic in the general case.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.528Z
(rank)
Params: ()
Result: Column
Window function: returns the rank of rows within a window partition.
The difference between rank and dense_rank is that dense_rank leaves no gaps in the ranking sequence when there are ties. That is, if you were ranking a competition using dense_rank and had three people tie for second place, you would say that all three were in second place and that the next person came in third. Rank, by contrast, gives sequential numbers, so the person that came in after the ties would register as coming in fifth.
This is equivalent to the RANK function in SQL.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.529Z
(regexp-extract expr regex idx)
Params: (e: Column, exp: String, groupIdx: Int)
Result: Column
Extract a specific group matched by a Java regex, from the specified string column. If the regex did not match, or the specified group did not match, an empty string is returned.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.530Z
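A minimal Scala sketch of the underlying Spark function; df, the log_line column, and the pattern are hypothetical.
  import org.apache.spark.sql.functions.{col, regexp_extract}
  // Pull capture group 1 out of each line; non-matching rows yield an empty string.
  df.select(regexp_extract(col("log_line"), "status=(\\d{3})", 1).as("status_code"))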
(regexp-replace expr pattern-expr replacement-expr)
Params: (e: Column, pattern: String, replacement: String)
Result: Column
Replace all substrings of the specified string value that match regexp with rep.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.532Z
(reverse expr)
Params: (e: Column)
Result: Column
Returns a reversed string or an array with reverse order of elements.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.534Z
(rint expr)
Params: (e: Column)
Result: Column
Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.536Z
(round expr)
Params: (e: Column)
Result: Column
Returns the value of the column e rounded to 0 decimal places with HALF_UP round mode.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.539Z
(row-number)
Params: ()
Result: Column
Window function: returns a sequential number starting at 1 within a window partition.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.540Z
(rpad expr length pad)
Params: (str: Column, len: Int, pad: String)
Result: Column
Right-pad the string column with pad to a length of len. If the string column is longer than len, the return value is shortened to len characters.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.541Z
(rtrim expr)
Params: (e: Column)
Result: Column
Trim the spaces from right end for the specified string value.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.543Z
(schema-of-csv expr)
(schema-of-csv expr options)
Params: (csv: String)
Result: Column
Parses a CSV string and infers its schema in DDL format.
a CSV string.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.547Z
(schema-of-json expr)
(schema-of-json expr options)
Params: (json: String)
Result: Column
Parses a JSON string and infers its schema in DDL format.
a JSON string.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.554Z
(second expr)
Params: (e: Column)
Result: Column
Extracts the seconds as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a timestamp
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.555Z
(sequence start stop step)
Params: (start: Column, stop: Column, step: Column)
Result: Column
Generate a sequence of integers from start to stop, incrementing by step.
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.557Z
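A minimal Scala sketch of the underlying Spark function; df and the literal bounds are hypothetical.
  import org.apache.spark.sql.functions.{lit, sequence}
  // Produces the array [1, 3, 5, 7, 9] for every row.
  df.select(sequence(lit(1), lit(9), lit(2)).as("odds"))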
(sha-1 expr)
Params: (e: Column)
Result: Column
Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.558Z
(sha-2 expr n-bits)
Params: (e: Column, numBits: Int)
Result: Column
Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.
column to compute SHA-2 on.
one of 224, 256, 384, or 512.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.559Z
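A minimal Scala sketch of the underlying Spark function; df and the email column are hypothetical.
  import org.apache.spark.sql.functions.{col, sha2}
  // 256-bit SHA-2 digest of the column, returned as a hex string.
  df.select(sha2(col("email"), 256).as("email_hash"))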
(sha1 expr)
Params: (e: Column)
Result: Column
Calculates the SHA-1 digest of a binary column and returns the value as a 40 character hex string.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.558Z
(sha2 expr n-bits)
Params: (e: Column, numBits: Int)
Result: Column
Calculates the SHA-2 family of hash functions of a binary column and returns the value as a hex string.
column to compute SHA-2 on.
one of 224, 256, 384, or 512.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.559Z
(shift-left expr num-bits)
Params: (e: Column, numBits: Int)
Result: Column
Shift the given value numBits left. If the given value is a long value, this function will return a long value else it will return an integer value.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.560Z
(shift-right expr num-bits)
Params: (e: Column, numBits: Int)
Result: Column
(Signed) shift the given value numBits right. If the given value is a long value, it will return a long value; otherwise it will return an integer value.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.562Z
(shift-right-unsigned expr num-bits)
Params: (e: Column, numBits: Int)
Result: Column
Unsigned shift the given value numBits right. If the given value is a long value, it will return a long value; otherwise it will return an integer value.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.563Z
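A hedged Scala sketch contrasting the three shift variants above via the underlying Spark functions; df and the integer column "flags" are hypothetical:
import org.apache.spark.sql.functions.{col, shiftLeft, shiftRight, shiftRightUnsigned}
df.select(
  shiftLeft(col("flags"), 2),            // flags << 2
  shiftRight(col("flags"), 1),           // arithmetic shift, sign bit preserved
  shiftRightUnsigned(col("flags"), 1))   // logical shift, vacated bits filled with zero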
(sign expr)
Params: (e: Column)
Result: Column
Computes the signum of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.566Z
(signum expr)
Params: (e: Column)
Result: Column
Computes the signum of the given value.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.566Z
(sin expr)
Params: (e: Column)
Result: Column
angle in radians
sine of the angle, as if computed by java.lang.Math.sin
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.568Z
(sinh expr)
Params: (e: Column)
Result: Column
hyperbolic angle
hyperbolic sine of the given value, as if computed by java.lang.Math.sinh
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.570Z
(size expr)
Params: (e: Column)
Result: Column
Returns length of array or map.
The function returns null for null input if spark.sql.legacy.sizeOfNull is set to false or spark.sql.ansi.enabled is set to true. Otherwise, the function returns -1 for null input. With the default settings, the function returns -1 for null input.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.571Z
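As a sketch of the underlying size function in Spark's Scala API; df and the array column "tags" are hypothetical:
import org.apache.spark.sql.functions.{col, size}
// with the default settings a null array yields -1 rather than null
df.select(size(col("tags")).as("n_tags"))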
(skewness expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the skewness of the values in a group.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.574Z
(slice expr start length)
Params: (x: Column, start: Int, length: Int)
Result: Column
Returns an array containing all the elements in x from index start (or starting from the end if start is negative) with the specified length.
the array column to be sliced
the starting index
the length of the slice
2.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.575Z
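A minimal Scala sketch of the underlying slice function; df and the array column "xs" are hypothetical:
import org.apache.spark.sql.functions.{col, slice}
df.select(
  slice(col("xs"), 1, 2),    // the first two elements (indices are 1-based)
  slice(col("xs"), -2, 2))   // the last two elements (negative start counts from the end)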
(sort-array expr)
(sort-array expr asc)
Params: (e: Column)
Result: Column
Sorts the input array for the given column in ascending order, according to the natural ordering of the array elements. Null elements will be placed at the beginning of the returned array.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.577Z
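A sketch of the underlying sort_array function; df and the array column "xs" are hypothetical:
import org.apache.spark.sql.functions.{col, sort_array}
df.select(
  sort_array(col("xs")),          // ascending, nulls placed first
  sort_array(col("xs"), false))   // descending, nulls placed last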
(soundex expr)
Params: (e: Column)
Result: Column
Returns the soundex code for the specified expression.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.578Z
(spark-partition-id)
Params: ()
Result: Column
Partition ID.
1.6.0
This is non-deterministic because it depends on data partitioning and task scheduling.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.579Z
(split expr pattern)
Params: (str: Column, pattern: String)
Result: Column
Splits str around matches of the given pattern.
a string expression to split
a string representing a regular expression. The regex string should be a Java regular expression.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.582Z
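For illustration, the underlying split function with a Java regular expression; df and the column "line" are hypothetical:
import org.apache.spark.sql.functions.{col, split}
// split on a comma or a semicolon
df.select(split(col("line"), "[,;]").as("fields"))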
(sqr expr)
Returns the value of the first argument raised to the power of two.
(sqrt expr)
Params: (e: Column)
Result: Column
Computes the square root of the specified float value.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.584Z
(std expr)
Params: (e: Column)
Result: Column
Aggregate function: alias for stddev_samp.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.586Z
(stddev expr)
Params: (e: Column)
Result: Column
Aggregate function: alias for stddev_samp.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.586Z
(stddev-pop expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the population standard deviation of the expression in a group.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.593Z
(stddev-samp expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the sample standard deviation of the expression in a group.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.586Z
(struct & exprs)
Params: (cols: Column*)
Result: Column
Creates a new struct column. If the input column is a column in a DataFrame, or a derived column expression that is named (i.e. aliased), its name would be retained as the StructField's name, otherwise, the newly generated StructField's name would be auto generated as col with a suffix index + 1, i.e. col1, col2, col3, ...
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.597Z
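A hedged Scala sketch of the underlying struct function showing how field names are retained; df and the columns "id" and "amount" are hypothetical:
import org.apache.spark.sql.functions.{col, struct}
// fields keep the names "id" and "total"; an unnamed expression would become col1, col2, ...
df.select(struct(col("id"), col("amount").as("total")).as("order"))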
(substring expr pos len)
Params: (str: Column, pos: Int, len: Int)
Result: Column
Substring starts at pos and is of length len when str is of String type; when str is of Binary type, it returns the slice of the byte array that starts at pos and is of length len bytes.
1.5.0
The position is not zero-based, but a 1-based index.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.599Z
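A small sketch of the underlying substring function, illustrating the 1-based position; df and the column "s" are hypothetical:
import org.apache.spark.sql.functions.{col, substring}
// take 4 characters starting at the first character
df.select(substring(col("s"), 1, 4))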
(substring-index expr delim cnt)
Params: (str: Column, delim: String, count: Int)
Result: Column
Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. substring_index performs a case-sensitive match when searching for delim.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.600Z
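A sketch of the underlying substring_index function; df and the column "host" are hypothetical, and the commented results assume a value like "www.apache.org":
import org.apache.spark.sql.functions.{col, substring_index}
df.select(
  substring_index(col("host"), ".", 2),    // "www.apache"
  substring_index(col("host"), ".", -1))   // "org"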
(sum-distinct expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the sum of distinct values in the expression.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.604Z
(tan expr)
Params: (e: Column)
Result: Column
angle in radians
tangent of the given value, as if computed by java.lang.Math.tan
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.607Z
(tanh expr)
Params: (e: Column)
Result: Column
hyperbolic angle
hyperbolic tangent of the given value, as if computed by java.lang.Math.tanh
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.610Z
(time-window time-expr duration)
(time-window time-expr duration slide)
(time-window time-expr duration slide start)
Params: (timeColumn: Column, windowDuration: String, slideDuration: String, startTime: String)
Result: Column
Bucketize rows into one or more time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. As an example (see the sketch after this entry), taking the average stock price for a one-minute window every 10 seconds, starting 5 seconds after the hour, produces windows such as 09:00:05-09:01:05, 09:00:15-09:01:15, and so on.
For a streaming query, you may use the function current_timestamp to generate windows on processing time.
The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType.
A string specifying the width of the window, e.g. 10 minutes, 1 second. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example, 1 day always means 86,400,000 milliseconds, not a calendar day.
A string specifying the sliding interval of the window, e.g. 1 minute. A new window will be generated every slideDuration. Must be less than or equal to the windowDuration. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. This duration is likewise absolute, and does not vary according to a calendar.
The offset with respect to 1970-01-01 00:00:00 UTC with which to start window intervals. For example, in order to have hourly tumbling windows that start 15 minutes past the hour, e.g. 12:15-13:15, 13:15-14:15... provide startTime as 15 minutes.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.732Z
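The stock-price example referenced above, sketched with the underlying Spark Scala window function; df and the columns "timestamp", "stockId" and "price" are hypothetical:
import org.apache.spark.sql.functions.{avg, col, window}
// one-minute windows, sliding every 10 seconds, with a start offset of 5 seconds
df.groupBy(window(col("timestamp"), "1 minute", "10 seconds", "5 seconds"), col("stockId"))
  .agg(avg(col("price")))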
(to-csv expr)
(to-csv expr options)
Params: (e: Column, options: Map[String, String])
Result: Column
(Java-specific) Converts a column containing a StructType into a CSV string with the specified schema. Throws an exception in the case of an unsupported type.
a column containing a struct.
options to control how the struct column is converted into a CSV string. It accepts the same options as the CSV data source.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.613Z
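A minimal sketch of the underlying to_csv function (no-options overload); df and the columns "id" and "name" are hypothetical:
import org.apache.spark.sql.functions.{col, struct, to_csv}
// render a struct of two fields as a single CSV string column
df.select(to_csv(struct(col("id"), col("name"))))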
(to-date expr)
(to-date expr date-format)
Params: (e: Column)
Result: Column
Converts the column into DateType by casting rules to DateType.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.616Z
(to-timestamp expr)
(to-timestamp expr date-format)
Params: (s: Column)
Result: Column
Converts to a timestamp by casting rules to TimestampType.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A timestamp, or null if the input was a string that could not be cast to a timestamp
2.2.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.623Z
(to-utc-timestamp expr)
Params: (ts: Column, tz: String)
Result: Column
Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.
A date, timestamp or string. If a string, the data must be in a format that can be cast to a timestamp, such as yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.SSSS
A string detailing the time zone ID that the input should be adjusted to. It should be in the format of either region-based zone IDs or zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not recommended to use because they can be ambiguous.
A timestamp, or null if ts was a string that could not be cast to a timestamp or tz was an invalid value
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.626Z
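A sketch of the underlying to_utc_timestamp function; df, the column "ts" and the chosen zone ID are hypothetical:
import org.apache.spark.sql.functions.{col, to_utc_timestamp}
// interpret ts as Asia/Seoul local time and render it as a UTC timestamp
df.select(to_utc_timestamp(col("ts"), "Asia/Seoul"))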
(transform expr xform-fn)
Params: (column: Column, f: (Column) ⇒ Column)
Result: Column
Returns an array of elements after applying a transformation to each element in the input array.
the input array column
col => transformed_col, the lambda function to transform the input column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.629Z
(transform-keys expr key-fn)
Params: (expr: Column, f: (Column, Column) ⇒ Column)
Result: Column
Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new keys for the pairs.
the input map column
(key, value) => new_key, the lambda function to transform the key of input map column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.630Z
(transform-values expr key-fn)
Params: (expr: Column, f: (Column, Column) ⇒ Column)
Result: Column
Applies a function to every key-value pair in a map and returns a map with the results of those applications as the new values for the pairs.
the input map column
(key, value) => new_value, the lambda function to transform the value of input map column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.638Z
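A hedged Scala sketch covering the three higher-order functions above (transform, transform_keys, transform_values); df, the array column "xs" and the map column "m" are hypothetical:
import org.apache.spark.sql.functions.{col, transform, transform_keys, transform_values, upper}
df.select(
  transform(col("xs"), x => x * 2),                // double every array element
  transform_keys(col("m"), (k, v) => upper(k)),    // upper-case every map key
  transform_values(col("m"), (k, v) => v + 1))     // increment every map value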
(translate expr match replacement)
Params: (src: Column, matchingString: String, replaceString: String)
Result: Column
Translates characters in src: each character that appears in matchingString is replaced by the character at the corresponding position in replaceString. The translation happens whenever a character in the string matches a character in matchingString.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.639Z
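For illustration, the underlying translate function; df and the column "s" are hypothetical:
import org.apache.spark.sql.functions.{col, translate}
// '(' becomes '[' and ')' becomes ']' wherever they occur in s
df.select(translate(col("s"), "()", "[]"))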
(trim expr trim-string)
Params: (e: Column)
Result: Column
Trim the spaces from both ends for the specified string column.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.641Z
(unbase-64 expr)
Params: (e: Column)
Result: Column
Decodes a BASE64 encoded string column and returns it as a binary column. This is the reverse of base64.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.702Z
(unbase64 expr)
Params: (e: Column)
Result: Column
Decodes a BASE64 encoded string column and returns it as a binary column. This is the reverse of base64.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.702Z
(unhex expr)
Params: (column: Column)
Result: Column
Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts to the byte representation of number.
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.703Z
(unix-timestamp)
(unix-timestamp expr)
(unix-timestamp expr pattern)
Params: ()
Result: Column
Returns the current Unix timestamp (in seconds) as a long.
1.5.0
All calls of unix_timestamp within the same query return the same value (i.e. the current timestamp is calculated at the start of query evaluation).
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.710Z
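A sketch of the underlying unix_timestamp overloads; df and the string column "ts_string" are hypothetical:
import org.apache.spark.sql.functions.{col, unix_timestamp}
df.select(
  unix_timestamp(),                                         // current epoch seconds, constant within the query
  unix_timestamp(col("ts_string"), "yyyy-MM-dd HH:mm:ss"))  // parse a string with an explicit pattern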
(upper expr)
Params: (e: Column)
Result: Column
Converts a string column to upper case.
1.3.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.712Z
(var-pop expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the population variance of the values in a group.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.714Z
(var-samp expr)
Params: (e: Column)
Result: Column
Aggregate function: returns the unbiased variance of the values in a group.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.718Z
(variance expr)
Params: (e: Column)
Result: Column
Aggregate function: alias for var_samp.
1.6.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.718Z
(week-of-year expr)
Params: (e: Column)
Result: Column
Extracts the week number as an integer from a given date/timestamp/string.
A week is considered to start on a Monday and week 1 is the first week with more than 3 days, as defined by ISO 8601
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.723Z
(weekofyear expr)
Params: (e: Column)
Result: Column
Extracts the week number as an integer from a given date/timestamp/string.
A week is considered to start on a Monday and week 1 is the first week with more than 3 days, as defined by ISO 8601
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.723Z
(when condition if-expr)
(when condition if-expr else-expr)
Params: (condition: Column, value: Any)
Result: Column
Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.
1.4.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.724Z
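A minimal sketch of the underlying when/otherwise chain; df and the column "age" are hypothetical:
import org.apache.spark.sql.functions.{col, when}
// without the otherwise clause, unmatched rows would yield null
df.select(
  when(col("age") < 18, "minor")
    .when(col("age") < 65, "adult")
    .otherwise("senior"))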
(window time-expr duration)
(window time-expr duration slide)
(window time-expr duration slide start)
Params: (timeColumn: Column, windowDuration: String, slideDuration: String, startTime: String)
Result: Column
Bucketize rows into one or more time windows given a timestamp specifying column. Window starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in the order of months are not supported. As an example (see the sketch under the time-window entry above), taking the average stock price for a one-minute window every 10 seconds, starting 5 seconds after the hour, produces windows such as 09:00:05-09:01:05, 09:00:15-09:01:15, and so on.
For a streaming query, you may use the function current_timestamp to generate windows on processing time.
The column or the expression to use as the timestamp for windowing by time. The time column must be of TimestampType.
A string specifying the width of the window, e.g. 10 minutes, 1 second. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example, 1 day always means 86,400,000 milliseconds, not a calendar day.
A string specifying the sliding interval of the window, e.g. 1 minute. A new window will be generated every slideDuration. Must be less than or equal to the windowDuration. Check org.apache.spark.unsafe.types.CalendarInterval for valid duration identifiers. This duration is likewise absolute, and does not vary according to a calendar.
The offset with respect to 1970-01-01 00:00:00 UTC with which to start window intervals. For example, in order to have hourly tumbling windows that start 15 minutes past the hour, e.g. 12:15-13:15, 13:15-14:15... provide startTime as 15 minutes.
2.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.732Z
(xxhash-64 & exprs)
Params: (cols: Column*)
Result: Column
Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.733Z
(xxhash64 & exprs)
Params: (cols: Column*)
Result: Column
Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.733Z
(year expr)
Params: (e: Column)
Result: Column
Extracts the year as an integer from a given date/timestamp/string.
An integer, or null if the input was a string that could not be cast to a date
1.5.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.734Z
(zip-with left right merge-fn)
Params: (left: Column, right: Column, f: (Column, Column) ⇒ Column)
Result: Column
Merge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.
the left input array column
the right input array column
(lCol, rCol) => col, the lambda function to merge two input columns into one column
3.0.0
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/sql/functions$.html
Timestamp: 2020-10-19T01:56:22.737Z
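A sketch of the underlying zip_with function; df and the array columns "xs" and "ys" are hypothetical:
import org.apache.spark.sql.functions.{col, zip_with}
// element-wise sum; the shorter array is padded with nulls before merging
df.select(zip_with(col("xs"), col("ys"), (x, y) => x + y))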