Comparing, Evaluating, and Maximizing

The question you may be asking at this point is, "How do I compare the accuracy of alternative models?" Microsoft has provided a good summary of model evaluation options for individual models here. However, there are several useful techniques for both evaluating individual models and comparing alternative models. Follow along with each of the examples below:

Compare Models

Okay, so you are already somewhat familiar with the Evaluate Model pill, but have you ever wondered what that second input is for? Let's find out:
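As a spoiler in code form: that second port takes a second scored dataset, so two models can be evaluated against the same data side by side. Azure ML Studio does this with drag-and-drop pills, but if it helps to see the idea spelled out, here is a minimal scikit-learn sketch (purely illustrative, not the Azure module itself) that trains two candidate classifiers and compares their metrics on one shared test set:

```python
# Illustrative only: the drag-and-drop Evaluate Model pill compares two
# scored datasets; this sketch does the equivalent in scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Two candidate models trained on the same training data.
models = {
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Score both on the same test set -- the analog of wiring two scored
# datasets into Evaluate Model's two input ports.
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    probs = model.predict_proba(X_test)[:, 1]
    print(f"{name}: accuracy={accuracy_score(y_test, preds):.3f}, "
          f"AUC={roc_auc_score(y_test, probs):.3f}")
```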

Evaluate Models

There are also useful tools for evaluating how reliable a given model's evaluation metrics are. Follow along below to learn the Cross-Validate Model pill:

How Cross-Validation Works (from Microsoft Documentation)

  • Cross validation randomly divides the training data into a number of partitions, also called folds.

    • The algorithm defaults to 10 folds if you have not previously partitioned the dataset.

    • To divide the dataset into a different number of folds, you can use the Partition and Sample module and indicate how many folds to use.

  • The module sets aside the data in fold 1 to use for validation (this is sometimes called the holdout fold), and uses the remaining folds to train a model.

    For example, if you create five folds, the module would generate five models during cross-validation, each model trained using 4/5 of the data, and tested on the remaining 1/5.

  • During testing of the model for each fold, multiple accuracy statistics are evaluated. Which statistics are used depends on the type of model that you are evaluating. Different statistics are used to evaluate classification models vs. regression models.

  • When the building and evaluation process is complete for all folds, Cross-Validate Model generates a set of performance metrics and scored results for all the data. You should review these metrics to see whether any single fold has particularly high or low accuracy (see the code sketch after this list).
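If you want to see the same idea outside the Studio canvas, here is a hedged scikit-learn sketch (an illustration, not the Cross-Validate Model pill itself): five folds, five models, and a per-fold accuracy you can scan for any fold that looks unusually high or low.

```python
# Sketch of 5-fold cross-validation in scikit-learn (illustration only;
# in Azure ML Studio the Cross-Validate Model pill does this for you).
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Five folds: each of the five models is trained on 4/5 of the data
# and tested on the remaining 1/5.
folds = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=folds)

# Review the per-fold metrics to spot any fold that is unusually high or
# low -- a hint that the data may not be evenly representative.
for i, s in enumerate(scores, start=1):
    print(f"Fold {i}: accuracy = {s:.3f}")
print(f"Mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```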

Advantages of cross-validation

A different and very common way of evaluating a model is to divide the data into a training and test set using Split Data, and then validate the model on the test data. However, cross-validation offers some advantages:

  • Cross-validation uses more test data

    Cross-validation measures the performance of the model with the specified parameters in a bigger data space. That is, cross-validation uses the entire training dataset for both training and evaluation, instead of just a portion. In contrast, if you validate a model by using data generated from a random split, you typically evaluate the model on only 30% or less of the available data (a comparison sketch follows this list).

    However, because cross-validation trains and validates the model multiple times over a larger dataset, it is much more computationally intensive and takes much longer than validating on a random split.

  • Cross-validation evaluates the dataset as well as the model

    Cross-validation does not simply measure the accuracy of a model, but also gives you some idea of how representative the dataset is and how sensitive the model might be to variations in the data.
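To make that first advantage concrete, the sketch below (again scikit-learn as an illustrative stand-in, not the Studio pills) contrasts a single 70/30 hold-out estimate with a cross-validated estimate computed over the whole dataset; the spread of the fold scores also gives a rough sense of how sensitive the model is to variations in the data.

```python
# Illustration (scikit-learn, not Azure ML): a single 70/30 hold-out
# estimate versus a cross-validated estimate over the whole dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Hold-out: the model is evaluated on only 30% of the available data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
holdout = LogisticRegression(max_iter=5000).fit(X_train, y_train).score(X_test, y_test)

# Cross-validation: every row is used for evaluation exactly once.
cv_scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=10)

print(f"Hold-out accuracy (single split): {holdout:.3f}")
print(f"Cross-validated accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```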

Maximizing Model Performance

So far, you've learned how to try out various statistical algorithms to find out which one is best, and how to compare and evaluate model performance. Now let's look at how to squeeze every last drop of performance out of your data. To do this, you are going to learn one more pill. Follow along to see how this works:
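The pill itself isn't named in the text above, but one common way to maximize performance is to tune a model's hyperparameters. As a hedged code analogy (an assumption about the technique, sketched in scikit-learn rather than the Azure module), a grid search tries a set of candidate settings, scores each with cross-validation, and keeps the best:

```python
# Illustrative sketch (scikit-learn): wringing more performance out of a
# model via hyperparameter search. This is an assumed stand-in for the
# tuning pill discussed here, not the Azure ML module itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Candidate settings to sweep; each combination is scored with
# cross-validation and the best one is kept.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```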