11.2 Classification Models
Classification Model Overview
Classification models are those that predict a categorical dependent variable. We can further subdivide these algorithms into "two-class" and "multiclass" versions. Two-class algorithms are used when the dependent variable takes only two values. For example, in our Bike Buyers example, predicting "Yes" or "No" (indicating whether or not a customer purchased a bike) would require a two-class algorithm, whereas in our Lending Tree example, predicting a loan status of "Fully Paid," "Current," "Grace Period," ... "Charged Off" would require a multiclass algorithm. We will review a sample of the following algorithms below:
- Two-class logistic regression: fast to train, assumes a linear model.
- Two-class averaged perceptron: fast to train, assumes a linear model.
- Two-class Bayes point machine: fast to train, assumes a linear model.
- Two-class decision forest: accurate and fast to train.
- Two-class decision jungle: accurate and small memory footprint.
- Two-class boosted decision tree: accurate and fast to train, with a large memory footprint.
- Two-class support vector machine: best with under 100 independent variables, assumes a linear model.
- Two-class locally deep support vector machine: must have fewer than 100 independent variables.
- Two-class neural network: very accurate, long training times.
- Multiclass logistic regression: fast to train, assumes a linear model.
- Multiclass neural network: very accurate, long training times.
- Multiclass decision forest: accurate, fast training times.
- Multiclass decision jungle: accurate, small memory footprint.
- One-v-all multiclass: depends on the two-class classifier chosen.
Two-Class Logistic Regression
Two-Class Averaged Perceptron
Two-Class Bayes Point Machine
Two-Class Decision Forest
Two-Class Decision Jungle
Two-Class Boosted Decision Tree
Two-Class Support Vector Machine
Two-Class Locally Deep Support Vector Machine
Two-Class Neural Network
Multiclass Models
Multiclass Logistic Regression
This works just the same as the two-class version.
Multiclass Neural Network
This works just the same as the two-class version.
Multiclass Decision Forest
This works just the same as the two-class version.
Multiclass Decision Jungle
This works just the same as the two-class version.
One-v-All Multiclass
One-v-all multiclass is an interesting ensemble method that uses a two-class algorithm of your choice to reduce a multiclass problem to a series of binary classifications. For example, if you want to evaluate five levels of education, this technique will compare level 1 against all other levels, level 2 against all other levels, and so forth, training one two-class model per level. Then it will combine all of those results together, predicting the level whose model is most confident.
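The idea above can be sketched with scikit-learn's one-vs-rest wrapper (an illustrative analogue, not the module the text describes), plugging in a linear SVM as the two-class algorithm of choice:

```python
# Hedged sketch: a one-vs-all ensemble built from a two-class learner of
# your choice, here a linear support vector machine.
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Five synthetic "education level" classes; one binary classifier is trained
# per class (that class vs. everything else), and the final prediction comes
# from the classifier with the highest decision score.
X, y = make_classification(n_samples=500, n_classes=5, n_informative=8,
                           random_state=0)
ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)

print(len(ovr.estimators_))  # 5: one two-class model per level
```

Because the ensemble is only as good as its base learner, the strengths and weaknesses listed for each two-class algorithm above carry over directly.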