Provides access to different datasets that are bundled with Incanter.
(get-dataset dataset-key &
             {:keys [incanter-home from-repo]
              :or {incanter-home (or (System/getProperty "incanter.home")
                                     (System/getenv "INCANTER_HOME"))
                   from-repo false}})
Returns the sample dataset associated with the given key. Most datasets are from R's sample data sets, as are the descriptions below.
Options:
:incanter-home -- if neither the incanter.home system property (set at JVM startup with -Dincanter.home) nor the INCANTER_HOME environment variable is set, use the :incanter-home option to provide the parent directory of the sample data directory.
:from-repo (default false) -- if true, retrieves the dataset from the online repository instead of from the local data directory. This also happens by default when incanter-home is not set.
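The way these two options interact can be sketched in plain Clojure. The function below is a hypothetical helper, not part of Incanter: it only mirrors the option defaults from the signature and the rule described above, and the return values (:repo and [:local path]) are invented for illustration.

```clojure
;; Hypothetical helper, not part of Incanter. It mirrors the documented
;; defaults and reports where a dataset would be read from.
(defn dataset-source
  [& {:keys [incanter-home from-repo]
      :or {incanter-home (or (System/getProperty "incanter.home")
                             (System/getenv "INCANTER_HOME"))
           from-repo false}}]
  (if (or from-repo (nil? incanter-home))
    :repo                      ; use the online repository
    [:local incanter-home]))   ; use the local sample data directory

(dataset-source :incanter-home "/usr/local/packages/incanter")
;; => [:local "/usr/local/packages/incanter"]

(dataset-source :incanter-home "/usr/local/packages/incanter" :from-repo true)
;; => :repo
```

An explicit :incanter-home takes precedence over the system property and the environment variable, because the :or default is only evaluated when the key is absent.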
Datasets:
:iris -- Fisher's (or Anderson's) Iris data set, giving the measurements in centimeters of sepal length and width and petal length and width for 50 flowers from each of 3 species of iris.
:cars -- The data give the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s.
:survey -- survey data used in Scott Lynch's 'Introduction to Applied Bayesian Statistics and Estimation for Social Scientists'
:us-arrests -- This data set contains statistics, in arrests per 100,000 residents, for assault, murder, and rape in each of the 50 US states in 1973. Also given is the percent of the population living in urban areas.
:flow-meter -- flow meter data used in the Bland-Altman Lancet paper.
:co2 -- has 84 rows and 5 columns of data from an experiment on the cold tolerance of the grass species Echinochloa crus-galli.
:chick-weight -- has 578 rows and 4 columns from an experiment on the effect of diet on early growth of chicks.
:plant-growth -- Results from an experiment to compare yields (as measured by dried weight of plants) obtained under a control and two different treatment conditions.
:pontius -- These data are from a NIST study involving calibration of load cells. The response variable (y) is the deflection and the predictor variable (x) is load. See
:filip -- NIST data set for linear regression certification, see
:longley -- This classic dataset of labor statistics was one of the first used to test the accuracy of least squares computations. The response variable (y) is the Total Derived Employment and the predictor variables are GNP Implicit Price Deflator with Year 1954 = 100 (x1), Gross National Product (x2), Unemployment (x3), Size of Armed Forces (x4), Non-Institutional Population Age 14 & Over (x5), and Year (x6). See
:Chwirut -- These data are the result of a NIST study involving ultrasonic calibration. The response variable is ultrasonic response, and the predictor variable is metal distance. See
:thurstone -- test data for non-linear least squares.
:austres -- Quarterly Time Series of the Number of Australian Residents
:hair-eye-color -- Hair and eye color of a sample of students
:airline-passengers -- Monthly Airline Passenger Numbers 1949-1960
:math-prog -- Pass/fail results for a high school mathematics assessment test and a freshman college programming course.
:iran-election -- Vote counts for 30 provinces from the 2009 Iranian election.
Examples:
(def data (get-dataset :cars))
(def data2 (get-dataset :cars :incanter-home "/usr/local/packages/incanter"))
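As a follow-up to the examples, a loaded dataset can be inspected with functions from incanter.core. This is a minimal sketch that assumes Incanter is on the classpath and that the sample data is reachable, either locally or from the online repository. The aliases i and d are arbitrary choices.

```clojure
(require '[incanter.core :as i]
         '[incanter.datasets :as d])

;; Load the :cars sample data with the default options.
(def cars (d/get-dataset :cars))

(i/nrow cars)       ; number of observations
(i/col-names cars)  ; column names as stored in the dataset
```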