Introduction to Logistic Regression
Logistic regression predicts a discrete outcome from predictor variables that may be discrete, continuous, or a mix of both. It is a commonly used technique when the dependent variable has two or more discrete outcomes. Given a set of independent variables, the outcome could take a form such as Yes/No, 1/0, True/False, or High/Low.
Let’s first look at how logistic regression is used in the business world. It has a wide range of applications; here are a few real-world examples.
Marketing: A marketing consultant wants to predict whether a subsidiary of the company will make a profit, make a loss, or just break even, depending on the characteristics of the subsidiary's operations.
Human Resources: The HR manager of a company wants to predict the absenteeism pattern of its employees based on their individual characteristics.
Finance: A bank wants to predict whether its customers will default based on their previous transactions and history.
Types of logistic regression
If the response variable is dichotomous (two categories), the model is called binary logistic regression. If the response variable has more than two categories, there are two possible logistic regression models:
- If the response variable is nominal, you fit a nominal logistic regression model.
- If the response variable is ordinal, you fit an ordinal logistic regression model.
Logistic regression model
The plot shows a model of the relationship between a continuous predictor X and the probability of an event or outcome. If this is the true relationship between X and the probability, a linear model clearly does not fit: a straight line is unbounded, so it can predict values below 0 or above 1, which are not valid probabilities. To model the relationship directly, you must use a nonlinear function, and the plot displays one such function. Its S-shaped curve is known as a sigmoid.
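As a rough illustration of the S-shaped function described above, here is a minimal Python sketch. The function names and the intercept and slope values are assumptions chosen only for this example; they do not come from any fitted model.

```python
import math

def sigmoid(z):
    """Map a real number z to a value in the range 0 to 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an equivalent form that avoids overflow in exp().
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predicted_probability(x, intercept, slope):
    """Probability of the event for predictor value x.

    intercept and slope are hypothetical coefficients used for illustration.
    """
    return sigmoid(intercept + slope * x)

# Hypothetical coefficients, not estimated from data.
for x in (-4, -2, 0, 2, 4):
    print(x, round(predicted_probability(x, intercept=0.0, slope=1.0), 3))
```

The printed probabilities rise slowly at first, fastest around the middle, and then level off, which is the S-shape, and they never leave the range 0 to 1 no matter how large or small X becomes.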
Logit transformation
A logistic regression model applies a logit transformation to the probabilities. The logit is the natural log of the odds.
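Written out, the odds of an event with probability p are p / (1 - p), so logit(p) = ln(p / (1 - p)). The short Python sketch below computes both quantities for a few probabilities. The function names are assumptions made for this illustration.

```python
import math

def odds(p):
    """Odds of an event: probability it happens divided by probability it does not."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

# Example probabilities chosen only for illustration.
for p in (0.2, 0.5, 0.8):
    print(p, odds(p), logit(p))
```

A probability of 0.5 gives odds of 1 and a logit of 0. Probabilities above 0.5 give positive logits and probabilities below 0.5 give negative ones, so the logit stretches the 0 to 1 probability scale across the whole real line.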