tmducken.duckdb

DuckDB C-level bindings for tech.ml.dataset.

Current datatype support:

* boolean and all numeric types: int8->int64, uint8->uint64, float32, float64.
* string
* LocalDate, Instant column types.

Example:

```clojure
user> (require '[tech.v3.dataset :as ds])
nil
user> (require '[tmducken.duckdb :as duckdb])
nil
user> (duckdb/initialize!)
10:04:14.814 [nREPL-session-635e9bc8-2923-442b-9fad-da547210617b] INFO tmducken.duckdb - Attempting to load duckdb from "/home/chrisn/dev/cnuernber/tmducken/binaries/libduckdb.so"
true
user> (def stocks
        (-> (ds/->dataset "https://github.com/techascent/tech.ml.dataset/raw/master/test/data/stocks.csv" {:key-fn keyword})
            (vary-meta assoc :name :stocks)))
#'user/stocks
user> (def db (duckdb/open-db))
#'user/db
user> (def conn (duckdb/connect db))
#'user/conn
user> (duckdb/create-table! conn stocks)
"stocks"
user> (duckdb/insert-dataset! conn stocks)
nil
user> (ds/head (duckdb/sql->dataset conn "select * from stocks"))
10:05:28.356 [tech.resource.gc ref thread] INFO tech.v3.resource.gc - Reference thread starting
_unnamed [5 3]:

| symbol |       date | price |
|--------|------------|------:|
|   MSFT | 2000-01-01 | 39.81 |
|   MSFT | 2000-02-01 | 36.35 |
|   MSFT | 2000-03-01 | 43.22 |
|   MSFT | 2000-04-01 | 28.37 |
|   MSFT | 2000-05-01 | 25.45 |
```

close-db

(close-db db)

Close the database.


connect

(connect db)

Create a new database connection from an opened database. Users should call disconnect to close this connection.

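
The open, connect, disconnect, close sequence can be wrapped so that cleanup always runs, even when a query throws. The following is a minimal sketch and not part of tmducken: `with-connection` is a hypothetical helper name, and the sketch assumes `initialize!` can locate the DuckDB shared library.

```clojure
(require '[tech.v3.datatype :as dt]
         '[tmducken.duckdb :as duckdb])

(defn with-connection
  "Hypothetical helper, not part of tmducken: opens an in-memory database,
  calls (f conn), and always disconnects and closes the database afterwards."
  [f]
  (let [db (duckdb/open-db)]
    (try
      (let [conn (duckdb/connect db)]
        (try
          (f conn)
          (finally (duckdb/disconnect conn))))
      (finally (duckdb/close-db db)))))

;; Usage sketch. The result is cloned before the connection is closed because
;; sql->dataset may read its data in place from the result set.
(comment
  (duckdb/initialize!)
  (with-connection
    (fn [conn]
      (dt/clone (duckdb/sql->dataset conn "select 42 as answer")))))
```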

create-table!

(create-table! conn dataset)
(create-table! conn dataset options)

Create a SQL table based on the column datatypes of the dataset. Users can also create the table themselves with their own SQL `create table` statement. The fastest way to get data into the system is [[insert-dataset!]].

Options:

* `:table-name` - Name of the table to create. If not supplied, the dataset name is used.
* `:primary-key` - Sequence of column names to use as the primary key.
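
A sketch of passing both options, assuming a small hand-built dataset. The dataset contents, the table name `trades_by_symbol`, and the use of keyword column names in `:primary-key` are illustrative assumptions; the docstring does not say whether column names should be given as keywords or strings.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tmducken.duckdb :as duckdb])

(duckdb/initialize!)
(def db (duckdb/open-db))          ; nil path: in-memory database
(def conn (duckdb/connect db))

;; Illustrative data. The dataset name would be the default table name.
(def trades
  (-> (ds/->dataset {:symbol ["MSFT" "AAPL"]
                     :price  [39.81 25.94]})
      (vary-meta assoc :name :trades)))

;; Override the table name and declare a primary key (assumed keyword form).
(duckdb/create-table! conn trades {:table-name "trades_by_symbol"
                                   :primary-key [:symbol]})

(duckdb/disconnect conn)
(duckdb/close-db db)
```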

disconnect

(disconnect conn)

Disconnect a connection.


drop-table!

(drop-table! conn dataset)

get-config-options

(get-config-options)

Returns a sequence of maps of {:name :desc} describing the valid configuration options for the open-db function.


initialize!

(initialize!)
(initialize! {:keys [duckdb-home]})

Initialize the duckdb FFI system. This must be called first and only needs to be called once. It is safe, however, to call it multiple times.

Options:

* `:duckdb-home` - Directory in which to find the duckdb shared library. If this option is not passed in, the environment variable `DUCKDB_HOME` is checked. If neither is set, the library is searched for in the normal system library paths.
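
A minimal sketch of initializing from an explicit directory. `ensure-initialized!` is a hypothetical helper, the path in the usage comment is a placeholder, and the sketch assumes `initialized?` (which has no docstring) returns a truthy value once the library is loaded. Because `initialize!` is documented as safe to call repeatedly, the guard is optional.

```clojure
(require '[tmducken.duckdb :as duckdb])

(defn ensure-initialized!
  "Hypothetical helper, not part of tmducken: loads the duckdb shared library
  from `dir` unless the FFI layer already reports itself as initialized."
  [dir]
  (or (duckdb/initialized?)
      (duckdb/initialize! {:duckdb-home dir})))

;; Usage sketch; replace the placeholder with the directory holding libduckdb.
(comment
  (ensure-initialized! "/path/to/duckdb/binaries"))
```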

initialized?

(initialized?)

insert-dataset!

(insert-dataset! conn dataset)
(insert-dataset! conn dataset options)

Append this dataset using duckdb's higher-performance append API. This is recommended over inserting data with SQL statements or prepared statements.


open-db

(open-db)
(open-db path)
(open-db path config-options)

Open a database. `path` may be nil, in which case the database is opened in-memory. For valid config options call [[get-config-options]]. Options must be passed as a map of string->string. Because duckdb is dynamically linked, the available configuration options may change; with `linux-amd64-0.3.1` the current options are:

```clojure
tmducken.duckdb> (get-config-options)
[{:name "access_mode",
  :desc "Access mode of the database ([AUTOMATIC], READ_ONLY or READ_WRITE)"}
 {:name "default_order",
  :desc "The order type used when none is specified ([ASC] or DESC)"}
 {:name "default_null_order",
  :desc "Null ordering used when none is specified ([NULLS_FIRST] or NULLS_LAST)"}
 {:name "enable_external_access",
  :desc
  "Allow the database to access external state (through e.g. COPY TO/FROM, CSV readers, pandas replacement scans, etc)"}
 {:name "enable_object_cache",
  :desc "Whether or not object cache is used to cache e.g. Parquet metadata"}
 {:name "max_memory", :desc "The maximum memory of the system (e.g. 1GB)"}
 {:name "threads", :desc "The number of total threads used by the system"}]
```
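
Since options must be a map of string->string, a small conversion step can help when settings start out as keywords and numbers. This is a sketch under stated assumptions: `stringify-config` is a hypothetical helper, and the option names `threads` and `max_memory` are taken from the listing above, so they may differ for other duckdb builds.

```clojure
(require '[tmducken.duckdb :as duckdb])

(defn stringify-config
  "Hypothetical helper, not part of tmducken: converts keys and values to
  strings, e.g. {:threads 4} becomes {\"threads\" \"4\"}."
  [m]
  (into {} (map (fn [[k v]] [(name k) (str v)])) m))

(def config (stringify-config {:threads 4 :max_memory "1GB"}))
;; config is {"threads" "4", "max_memory" "1GB"}

(duckdb/initialize!)
;; A nil path opens an in-memory database with the given options.
(def db (duckdb/open-db nil config))
(duckdb/close-db db)
```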

sql->dataset

(sql->dataset conn sql)
(sql->dataset conn sql options)

Execute a query returning a dataset. Most data will be read in-place from the result set, which is linked via metadata to the returned dataset. If you wish to release the data immediately, wrap the call in `tech.v3.resource/stack-resource-context` and clone the result.

Example:

```clojure
;; !!Recommended!! - Results copied into the JVM and result-set released immediately after the query

tmducken.duckdb> (resource/stack-resource-context
                  (dt/clone (sql->dataset conn "select * from stocks")))
_unnamed [560 3]:

| symbol |       date | price |
|--------|------------|------:|
|   MSFT | 2000-01-01 | 39.81 |
|   MSFT | 2000-02-01 | 36.35 |
|   MSFT | 2000-03-01 | 43.22 |
|   MSFT | 2000-04-01 | 28.37 |
|   MSFT | 2000-05-01 | 25.45 |
|   MSFT | 2000-06-01 | 32.54 |
|   MSFT | 2000-07-01 | 28.40 |

;; Results read in-place; the result-set is released at some point after the
;; dataset falls out of scope.  Be extremely careful with this one.

tmducken.duckdb> (ds/head (sql->dataset conn "select * from stocks"))
_unnamed [5 3]:

| symbol |       date | price |
|--------|------------|------:|
|   MSFT | 2000-01-01 | 39.81 |
|   MSFT | 2000-02-01 | 36.35 |
|   MSFT | 2000-03-01 | 43.22 |
|   MSFT | 2000-04-01 | 28.37 |
|   MSFT | 2000-05-01 | 25.45 |
```
