LINEAR REGRESSION

Parameters

Values

Definition

Solver

11 - L2-regularized L2-loss SVR primal

There are three linear regression solvers, obtained by combining two loss functions with a primal or dual formulation. All three use L2 regularization. The loss is either the L2 loss for support vector regression (SVR), which is the squared epsilon-insensitive loss, or the L1 loss for SVR, which is the epsilon-insensitive loss. The default value for type is 11.

12 - L2-regularized L2-loss SVR dual

13 - L2-regularized L1-loss SVR dual

Cost (C)

The parameter C trades off training error against the simplicity of the fitted model. A low C gives stronger regularization and a smoother, simpler model, while a high C penalizes training points that fall outside the epsilon-tube more heavily and fits the training data more closely. As C increases, the error on the training data tends to decrease, which may lead to overfitting.

Epsilon_SVR (P)

Epsilon in the epsilon-SVR model. It specifies the epsilon-tube: training points whose predicted value lies within a distance epsilon of the actual value incur no penalty in the training loss function.
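The roles of epsilon and C can be sketched in a few lines of Python. This is a minimal illustration of the standard L2-regularized SVR primal objective, not the solver's actual code; the function names (`epsilon_insensitive_l1`, `epsilon_insensitive_l2`, `primal_objective`) are hypothetical and chosen only for this example.

```python
def epsilon_insensitive_l1(y_true, y_pred, epsilon):
    """L1 loss for SVR: zero inside the epsilon-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def epsilon_insensitive_l2(y_true, y_pred, epsilon):
    """L2 loss for SVR: the square of the epsilon-insensitive L1 loss."""
    return epsilon_insensitive_l1(y_true, y_pred, epsilon) ** 2


def primal_objective(w, X, y, C, epsilon, loss):
    """0.5 * ||w||^2 + C * sum of per-example losses (no bias term, for brevity)."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    predictions = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    total_loss = sum(loss(yt, yp, epsilon) for yt, yp in zip(y, predictions))
    return regularizer + C * total_loss


# A prediction that misses by 0.05 lies inside a tube of width 0.1 and costs nothing.
inside = epsilon_insensitive_l1(3.0, 3.05, 0.1)
# A prediction that misses by 0.5 is penalized only for the part beyond epsilon.
outside = epsilon_insensitive_l1(3.0, 3.5, 0.1)
```

Raising C scales the loss term relative to the regularizer, so points outside the tube weigh more heavily in the objective; widening epsilon leaves more points unpenalized.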

Termination Criterion

Tolerance for the stopping criterion. The stopping tolerance affects the number of iterations used when optimizing the model: a smaller tolerance generally requires more iterations.
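The effect of the tolerance on the iteration count can be seen on a toy problem. The sketch below uses plain gradient descent on a one-dimensional quadratic and stops when the gradient magnitude drops to the tolerance; this is an assumption made for illustration only, since the actual solvers use their own optimization methods and stopping tests. The function name `iterations_until_converged` is hypothetical.

```python
def iterations_until_converged(tolerance, learning_rate=0.1, max_iterations=10000):
    """Minimize f(w) = (w - 3)^2 by gradient descent and count the steps taken."""
    w = 0.0
    for iteration in range(max_iterations):
        gradient = 2.0 * (w - 3.0)
        if abs(gradient) <= tolerance:
            return iteration  # converged: gradient is within the tolerance
        w -= learning_rate * gradient
    return max_iterations  # iteration cap reached before convergence


loose = iterations_until_converged(1e-2)
tight = iterations_until_converged(1e-6)
print(loose, tight)  # the tighter tolerance needs more iterations
```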

Folds

V-fold for cross-validation. In v-fold cross-validation, the training set is first divided into v subsets of equal size. Each subset in turn is tested using the model trained on the remaining v − 1 subsets. Thus, each instance of the whole training set is predicted exactly once. For a regression model, the cross-validation score is computed from these predictions as an error measure, such as the mean squared error, rather than as a percentage of correctly classified data.
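The v-fold procedure can be sketched as follows. This is a simplified illustration under stated assumptions: a one-dimensional least-squares line stands in for the SVR solver, the folds are contiguous rather than shuffled, and the held-out predictions are scored with mean squared error. The helper names (`k_fold_indices`, `fit_line`, `cross_val_mse`) are hypothetical.

```python
def k_fold_indices(n, v):
    """Split indices 0..n-1 into v contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, v)
    folds, start = [], 0
    for i in range(v):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in for the solver)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def cross_val_mse(xs, ys, v):
    """Predict every instance once from a model trained on the other folds."""
    squared_errors = []
    for held_out in k_fold_indices(len(xs), v):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return sum(squared_errors) / len(squared_errors)


xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]  # noise-free line, so held-out error is near zero
mse = cross_val_mse(xs, ys, v=3)
```

In practice the data are usually shuffled before being split, so that each fold is representative of the whole training set.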