Libraries Summary.md

Linear Regression & Classification

A generic notebook describing the methods can be found here.

| | SparseRegression.jl | MultivariateStats.jl | OnlineStats.jl |
|---|---|---|---|
| Package works | Yes | Yes | Yes |
| Deprecation warnings | No | No | No |
| Compatible with JuliaDB | If transformed into a matrix | If transformed into a matrix | If transformed into a matrix |
| Documentation | Lacking | Very good | Very good |
| Simplicity | Good | Good | High |
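The "if transformed into a matrix" requirement can be illustrated with a minimal sketch that uses only Base Julia. The named tuple `cols` is a hypothetical stand-in for columns already extracted from a table (no JuliaDB call is shown, and none of the packages above is used); the fit is plain least squares via the backslash operator, with made-up data.

```julia
# Hypothetical stand-in for columns extracted from a table.
# Extracting them from a real JuliaDB table is not shown here.
cols = (x1 = [1.0, 2.0, 3.0, 4.0], x2 = [0.0, 1.0, 0.0, 1.0])
y = [3.0, 6.0, 7.0, 10.0]   # illustrative targets: y = 1 + 2*x1 + x2

# Stack the column vectors side by side into an n-by-p matrix.
X = hcat(values(cols)...)

# Prepend an intercept column and solve the least-squares problem.
A = hcat(ones(length(y)), X)
beta = A \ y   # coefficients: intercept, x1, x2
```

Once the data is in matrix form like `X`, it has the shape the packages in the table expect, although each package has its own conventions for orientation and intercept handling.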

SVM

The theory behind support vector machines can be found here.

| | KSVM.jl | LIBSVM.jl | SVM.jl |
|---|---|---|---|
| Package works | No | Yes | No |
| Deprecation warnings | Yes | No | Yes |
| Compatible with JuliaDB | - | Yes, if transformed into a matrix | - |
| Documentation | None | Yes, as inline documentation (type `?Functionname`) | None |
| Simplicity | Good | Medium | Good |

Gradient Boosting

A notebook detailing boosting and gradient boosting is available here.

  • LightGBM.jl, a Julia interface for Microsoft's LightGBM
  • XGBoost.jl, an interface for XGBoost (written in C++)
  • GradientBoost.jl, a Julia implementation of gradient boosting

| | LightGBM.jl | XGBoost.jl | GradientBoost.jl |
|---|---|---|---|
| Package works | No | Yes | No |
| Deprecation warnings | None | None | Several |
| Compatible with JuliaDB | | Yes (transformation of tables to arrays required) | - |
| Documentation | None | Points to XGBoost docs | Few examples |

Decision Trees

A generic notebook discussing decision tree models is available here. Note:

  • DecisionTrees.jl was found to be about one order of magnitude slower than the Python version
  • OnlineStats.jl only implements one type of tree, called FastTree (and its ensemble, FastForest)

| | DecisionTrees.jl | ScikitLearn.jl | OnlineStats.jl |
|---|---|---|---|
| Package works | Yes | Yes | Yes |
| Deprecation warnings | None | Some | None |
| Compatible with JuliaDB | Yes (transformation of tables to arrays required) | Yes (transformation of tables to arrays required) | Yes (transformation of tables to arrays required) |
| Documentation | None, but many examples | Very good | None |
| Simplicity | Good, like sklearn | Good | Quite low |

Utilities

  • MLPreprocessing is used for simple scaling and normalising.
  • MLLabelUtils provides functions that convert labels into whatever form the algorithm requires, for instance into categorical or boolean values, or from text to numbers.
  • MLBase provides functions for label encoding, classification from model scores, performance evaluation (ROC, F1, etc.), cross-validation (K-fold, stratified K-fold, subsampling), and grid-search hyperparameter tuning.
  • MLMetrics provides model evaluation functions for regression, classification, multilabel ranking, and clustering, designed to follow Python's sklearn.metrics closely.

| | MLPreprocessing.jl | MLLabelUtils.jl | MLBase.jl | MLMetrics.jl |
|---|---|---|---|---|
| Package works | Yes | Yes | Yes | Yes |
| Deprecation warnings | None | No | No | Yes |
| Compatible with JuliaDB | If tables are converted to arrays or dataframes | If tables are transformed into arrays | If tables are transformed into arrays | Yes |
| Documentation | None, but sufficient examples | Good | Good | None |
| Simplicity | Fair | Good | Good | Good |
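To make the utility tasks above concrete, here is a minimal sketch of column scaling, label encoding, and K-fold index splitting written with the Julia standard library only. It does not use the API of any package in the table; the function names `standardize`, `encode_labels`, and `kfold_indices` are hypothetical and chosen purely for illustration.

```julia
using Statistics  # standard library: mean, std

# Scale each column to zero mean and unit (sample) standard deviation.
function standardize(X::AbstractMatrix)
    mu = mean(X, dims=1)
    sigma = std(X, dims=1)
    return (X .- mu) ./ sigma
end

# Map arbitrary labels (for example strings) to integers 1..K.
# Also returns the sorted class list so codes can be decoded later.
function encode_labels(labels::AbstractVector)
    classes = sort(unique(labels))
    lookup = Dict(c => i for (i, c) in enumerate(classes))
    return [lookup[l] for l in labels], classes
end

# Split indices 1:n into k interleaved folds (no shuffling);
# each fold serves once as the test set.
function kfold_indices(n::Integer, k::Integer)
    folds = [collect(i:k:n) for i in 1:k]
    return [(train = setdiff(1:n, test), test = test) for test in folds]
end

X = [1.0 10.0; 2.0 20.0; 3.0 30.0; 4.0 40.0]
Z = standardize(X)
codes, classes = encode_labels(["cat", "dog", "cat", "bird"])
splits = kfold_indices(4, 2)
```

The packages listed here cover the same ground with more options (stratified folds, shuffling, more encodings), so this sketch is mainly useful for seeing what each step does to the data.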