To provide a clear motivation for logistic regression, assume we have credit card default data for customers and we want to understand whether a customer's current credit card balance is an indicator of whether or not they'll default on their credit card. To classify a customer as a high- vs. low-risk defaulter based on their balance, we could use linear regression; the left plot in Figure 5.1 illustrates how linear regression would predict the probability of defaulting. Unfortunately, for balances close to zero we predict a negative probability of defaulting, and if we were to predict for very large balances, we would get values bigger than 1. These predictions are not sensible, since the true probability of defaulting, regardless of credit card balance, must fall between 0 and 1. These inconsistencies only increase as our data become more imbalanced and the number of outliers increases.

Figure 5.1: Comparing the predicted probabilities of linear regression (left) to logistic regression (right).

Contrast this with the logistic regression line (right plot), which is nonlinear (sigmoidal-shaped). Predicted probabilities from linear regression can produce this flawed logic, whereas predicted values from logistic regression will always lie between 0 and 1. To avoid the inadequacies of the linear model fit on a binary response, we must model the probability of our response using a function that gives outputs between 0 and 1 for all values of \(X\). In logistic regression, we use the logistic function, which is defined in Equation (5.1) and produces the S-shaped curve in the right plot above.
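The equation itself does not appear in this excerpt; for reference, the standard single-predictor form of the logistic function (presumably what Equation (5.1) defines) is

\[
p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}},
\]

which maps every value of \(X\) to a probability strictly between 0 and 1.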
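As a minimal sketch of this contrast (not from the original post), the snippet below assumes the `Default` data from the ISLR package, which records credit card `balance` and `default` status much like the example above:

```r
library(ISLR)  # provides the Default data set

# Linear regression on the 0/1 response: fitted values near
# zero balance dip below 0, and extrapolating to very large
# balances pushes the prediction above 1
lin_fit <- lm(I(default == "Yes") ~ balance, data = Default)
range(fitted(lin_fit))                                   # lower bound is negative
predict(lin_fit, newdata = data.frame(balance = 10000))  # well above 1

# Logistic regression: fitted probabilities are guaranteed
# to lie between 0 and 1 for every value of balance
log_fit <- glm(default ~ balance, data = Default, family = binomial)
range(fitted(log_fit))
```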