(decision-tree-classifier params)Decision tree learning algorithm (http://en.wikipedia.org/wiki/Decision_tree_learning) for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features.
Timestamp: 2020-10-19T01:55:55.948Z
Decision tree learning algorithm (http://en.wikipedia.org/wiki/Decision_tree_learning) for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.html Timestamp: 2020-10-19T01:55:55.948Z
(fm-classifier params)Factorization Machines learning algorithm for classification. It supports normal gradient descent and AdamW solver.
The implementation is based upon:
S. Rendle. "Factorization machines" 2010.
FM is able to estimate interactions even in problems with huge sparsity (like advertising and recommendation system). FM formula is:
FM classification model uses logistic loss which can be solved by gradient descent method, and regularization terms like L2 are usually added to the loss function to prevent overfitting.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/FMClassifier.html
Timestamp: 2020-10-19T01:55:56.340Z
Factorization Machines learning algorithm for classification. It supports normal gradient descent and AdamW solver. The implementation is based upon: S. Rendle. "Factorization machines" 2010. FM is able to estimate interactions even in problems with huge sparsity (like advertising and recommendation system). FM formula is: FM classification model uses logistic loss which can be solved by gradient descent method, and regularization terms like L2 are usually added to the loss function to prevent overfitting. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/FMClassifier.html Timestamp: 2020-10-19T01:55:56.340Z
(gbt-classifier params)Gradient-Boosted Trees (GBTs) (http://en.wikipedia.org/wiki/Gradient_boosting) learning algorithm for classification. It supports binary labels, as well as both continuous and categorical features.
The implementation is based upon: J.H. Friedman. "Stochastic Gradient Boosting." 1999.
Notes on Gradient Boosting vs. TreeBoost:
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/GBTClassifier.html
Timestamp: 2020-10-19T01:55:56.899Z
Gradient-Boosted Trees (GBTs) (http://en.wikipedia.org/wiki/Gradient_boosting) learning algorithm for classification. It supports binary labels, as well as both continuous and categorical features. The implementation is based upon: J.H. Friedman. "Stochastic Gradient Boosting." 1999. Notes on Gradient Boosting vs. TreeBoost: Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/GBTClassifier.html Timestamp: 2020-10-19T01:55:56.899Z
(linear-svc params)Linear SVM Classifier
This binary classifier optimizes the Hinge Loss using the OWLQN optimizer. Only supports L2 regularization currently.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/LinearSVC.html
Timestamp: 2020-10-19T01:55:57.279Z
Linear SVM Classifier This binary classifier optimizes the Hinge Loss using the OWLQN optimizer. Only supports L2 regularization currently. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/LinearSVC.html Timestamp: 2020-10-19T01:55:57.279Z
(logistic-regression params)Logistic regression. Supports:
This class supports fitting traditional logistic regression model by LBFGS/OWLQN and bound (box) constrained logistic regression model by LBFGSB.
Timestamp: 2020-10-19T01:55:57.830Z
Logistic regression. Supports: This class supports fitting traditional logistic regression model by LBFGS/OWLQN and bound (box) constrained logistic regression model by LBFGSB. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/LogisticRegression.html Timestamp: 2020-10-19T01:55:57.830Z
(mlp-classifier params)Classifier trainer based on the Multilayer Perceptron. Each layer has sigmoid activation function, output layer has softmax. Number of inputs has to be equal to the size of feature vectors. Number of outputs has to be equal to the total number of labels.
Timestamp: 2020-10-19T01:55:58.225Z
Classifier trainer based on the Multilayer Perceptron. Each layer has sigmoid activation function, output layer has softmax. Number of inputs has to be equal to the size of feature vectors. Number of outputs has to be equal to the total number of labels. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.html Timestamp: 2020-10-19T01:55:58.225Z
(multilayer-perceptron-classifier params)Classifier trainer based on the Multilayer Perceptron. Each layer has sigmoid activation function, output layer has softmax. Number of inputs has to be equal to the size of feature vectors. Number of outputs has to be equal to the total number of labels.
Timestamp: 2020-10-19T01:55:58.225Z
Classifier trainer based on the Multilayer Perceptron. Each layer has sigmoid activation function, output layer has softmax. Number of inputs has to be equal to the size of feature vectors. Number of outputs has to be equal to the total number of labels. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.html Timestamp: 2020-10-19T01:55:58.225Z
(naive-bayes params)Naive Bayes Classifiers. It supports Multinomial NB (see here) which can handle finitely supported discrete data. For example, by converting documents into TF-IDF vectors, it can be used for document classification. By making every vector a binary (0/1) data, it can also be used as Bernoulli NB (see here). The input feature values for Multinomial NB and Bernoulli NB must be nonnegative. Since 3.0.0, it supports Complement NB which is an adaptation of the Multinomial NB. Specifically, Complement NB uses statistics from the complement of each class to compute the model's coefficients The inventors of Complement NB show empirically that the parameter estimates for CNB are more stable than those for Multinomial NB. Like Multinomial NB, the input feature values for Complement NB must be nonnegative. Since 3.0.0, it also supports Gaussian NB (see here) which can handle continuous data.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/NaiveBayes.html
Timestamp: 2020-10-19T01:55:58.596Z
Naive Bayes Classifiers. It supports Multinomial NB (see here) which can handle finitely supported discrete data. For example, by converting documents into TF-IDF vectors, it can be used for document classification. By making every vector a binary (0/1) data, it can also be used as Bernoulli NB (see here). The input feature values for Multinomial NB and Bernoulli NB must be nonnegative. Since 3.0.0, it supports Complement NB which is an adaptation of the Multinomial NB. Specifically, Complement NB uses statistics from the complement of each class to compute the model's coefficients The inventors of Complement NB show empirically that the parameter estimates for CNB are more stable than those for Multinomial NB. Like Multinomial NB, the input feature values for Complement NB must be nonnegative. Since 3.0.0, it also supports Gaussian NB (see here) which can handle continuous data. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/NaiveBayes.html Timestamp: 2020-10-19T01:55:58.596Z
(one-vs-rest params)Reduction of Multiclass Classification to Binary Classification. Performs reduction using one against all strategy. For a multiclass classification with k classes, train k models (one per class). Each example is scored against all k models and the model with highest score is picked to label the example.
Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/OneVsRest.html
Timestamp: 2020-10-19T01:55:58.960Z
Reduction of Multiclass Classification to Binary Classification. Performs reduction using one against all strategy. For a multiclass classification with k classes, train k models (one per class). Each example is scored against all k models and the model with highest score is picked to label the example. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/OneVsRest.html Timestamp: 2020-10-19T01:55:58.960Z
(random-forest-classifier params)Random Forest learning algorithm for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features.
Timestamp: 2020-10-19T01:55:59.351Z
Random Forest learning algorithm for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features. Source: https://spark.apache.org/docs/3.0.1/api/scala/org/apache/spark/ml/classification/RandomForestClassifier.html Timestamp: 2020-10-19T01:55:59.351Z
cljdoc builds & hosts documentation for Clojure/Script libraries
| Ctrl+k | Jump to recent docs | 
| ← | Move to previous article | 
| → | Move to next article | 
| Ctrl+/ | Jump to the search field |