Spark provides a number of functions that operate on SQL maps, and Geni exposes all of them in the core namespace. On top of that, Geni adds a number of synonyms named after Clojure core's map functions.
The examples assume the following required namespaces:
(require '[zero-one.geni.core :as g])
(require '[zero-one.geni.ml :as ml])
(require '[zero-one.geni.test-resources :refer [melbourne-df]])
We can create map types using map, map-from-entries and map-from-arrays as follows:
Note that melbourne-df is a function, so we need to add () to evaluate it.
(def dataframe
  (-> (melbourne-df)
      (g/limit 2)
      (g/select
        {:location (g/map (g/lit "suburb") :Suburb
                          (g/lit "region") :Regionname
                          (g/lit "council") :CouncilArea
                          (g/lit "address") :Address)
         :market   (g/map-from-entries
                     (g/array (g/struct (g/lit "size") (g/double :Price))
                              (g/struct (g/lit "price") (g/double :Price))))
         :coord    (g/map-from-arrays
                     (g/array (g/lit "lat") (g/lit "long"))
                     (g/array :Lattitude :Longtitude))})))
(g/collect dataframe)
; =>
({:location
  {"suburb" "Abbotsford",
   "region" "Northern Metropolitan",
   "council" "Yarra",
   "address" "85 Turner St"},
  :market {"size" 1480000.0, "price" 1480000.0},
  :coord {"lat" -37.7996, "long" 144.9984}}
 {:location
  {"suburb" "Abbotsford",
   "region" "Northern Metropolitan",
   "council" "Yarra",
   "address" "25 Bloomburg St"},
  :market {"size" 1035000.0, "price" 1035000.0},
  :coord {"lat" -37.8079, "long" 144.9934}})
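The three constructors differ mainly in how keys and values are supplied: map takes alternating keys and values, map-from-entries takes an array of key-value structs, and map-from-arrays takes one array of keys and one array of values. As a rough analogy only, the sketch below builds the same shapes with ordinary Clojure collections instead of Spark columns. The var names are made up for illustration and the values are copied from the first row above.

```clojure
;; Plain Clojure analogy; none of these are Geni calls and no Spark session is needed.

;; Alternating keys and values, similar in shape to map:
(def inline-map
  (hash-map "suburb" "Abbotsford" "council" "Yarra"))

;; A collection of [key value] entries, similar in shape to map-from-entries:
(def entries-map
  (into {} [["size" 1480000.0] ["price" 1480000.0]]))

;; One collection of keys and one of values, similar in shape to map-from-arrays:
(def arrays-map
  (zipmap ["lat" "long"] [-37.7996 144.9984]))
```

The Geni versions operate on columns and are evaluated per row by Spark, so this only mirrors the argument layout, not the execution model.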
Note that map-from-arrays above is actually the same as Clojure's zipmap, and it would be more natural to use the latter in a Clojure codebase. For that reason, Geni includes the following functions:
Geni | Original Spark |
---|---|
assoc | - |
dissoc | - |
keys | map-keys |
merge-with | map-zip-with |
merge | map-concat |
rename-keys | - |
select-keys | - |
update | - |
vals | map-values |
zipmap | map-from-arrays |
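For readers less familiar with the Clojure side of the table, the sketch below shows how some of the Clojure core functions that lend their names to these synonyms behave on an ordinary map. This is plain Clojure rather than Geni, and it assumes (without confirming from the Geni source) that the Geni versions apply analogous semantics to map columns, row by row.

```clojure
;; Plain Clojure core functions on an ordinary map; no Spark session is needed.
(def coord (zipmap ["lat" "long"] [-37.7996 144.9984]))
;; => {"lat" -37.7996, "long" 144.9984}

(keys coord)                          ;; => ("lat" "long")
(vals coord)                          ;; => (-37.7996 144.9984)
(assoc coord "suburb" "Abbotsford")   ;; adds or replaces one key
(dissoc coord "long")                 ;; => {"lat" -37.7996}
(select-keys coord ["lat"])           ;; => {"lat" -37.7996}
(update coord "lat" -)                ;; => {"lat" 37.7996, "long" 144.9984}
(merge coord {"suburb" "Abbotsford"}) ;; later maps win on key clashes
(merge-with + {"rooms" 2} {"rooms" 3}) ;; => {"rooms" 5}
```

Going by the table, the g/map-from-arrays call in the earlier example could presumably be written with g/zipmap instead, taking the same two array columns.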