Tuesday, May 20, 2025
HomeData ScienceWant to learn about Logistic Regression? Check it out here!

Want to learn about Logistic Regression? Check it out here!

Table of Content

Applications in data mining and machine learning depend heavily on classification methods. Almost 70% of data science problems are classification problems. Logistic regression is a popular and practical regression technique for resolving the binary classification problem, while there are many other classification problems available. Spam detection is one of the many classification problems that may be solved with logistic regression. Other instances include predicting a consumer’s likelihood of purchasing a specific product, determining whether a customer would churn, determining whether a user will click on an advertisement link, and many more.

One of the simplest and most widely used machine learning techniques for two-class classification is logistic regression. It can serve as the foundation for any binary classification problem and is simple to implement. Deep learning benefits from its fundamental principles as well. The correlation between 1 dependent binary variable and independent variables is described and estimated by logistic regression.

You will learn the definition, operation, and appropriate application of logistic regression in this blog post. We’ll also go over a real-world example and go over the key concepts one by one.

What is Logistic Regression?

A statistical technique called logistic regression uses one or more independent variables to predict the probable outcome of a binary result, such as yes/no, pass/fail, or 0/1. Logistic regression is intended for classification problems with categorical output, as opposed to linear regression, which predicts continuous values.

Example:

To predict will a student pass(1)/fail(0) in an examination? The solution is entirely based on the no.of hours studied is a classic use case for logistic regression.

Logistic Regression Equation

The logistic regression is obtained from the linear regression equation by applying a sigmoid function to it.

The equation of Linear Regression is,

image 1

Where,

        y = dependent variable

        x = independent variable

β₀,β₁,β₂,βₙ       = coefficients

Sigmoid Function:

                                                                  1

                                        P  =           𑁋𑁋𑁋𑁋𑁋𑁋

                                                              1 + e⁽-y⁾

Applying the Sigmoid function on Linear Regression:

                                                                                 1

                                        P  =    𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋𑁋

                 1 + e-⁽β₀ + β₁X₁ + β₂X₂  +.....+  βₙXₙ⁾

                                       =>       p / 1 – p

                     P

ln ( 𑁋𑁋𑁋) =    β₀ + β₁X₁ + β₂X₂  +.....+  βₙXₙ

                   1 – P

  >>> The above equation is obtained after applying the logarithm.

Types of Logistic Regression

types of logistic regression
statology.org*

There are 3 forms of logistic regression determined based on the categories:

  1. Binomial

In binomial logistic regression, only 2 types of dependent variables are possible, like 0/1, pass/fail, true/false, spam/not spam, positive/negative, etc.

  1. Multinomial

In multinomial logistic regression, 3 or more unordered types of dependent variables are possible, such as “cat”, “dogs”, or “sheep”.

  1. Ordinal

In ordinal logistic regression, 3 or more ordered types of dependent variables are possible, such as “low”, “medium”, or “high”.

Example: Predicting whether a tumor is cancerous or not cancerous based on its size

Python Code

import numpy as np
from sklearn import linear_model

# Tumor sizes (in centimeters)

X = np.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)

# Labels: 0 = not cancerous, 1 = cancerous
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# Create and train the logistic regression model
logr = linear_model.LogisticRegression()
logr.fit(X, y)

# Predict if a tumor of size 3.46 cm is cancerous
prediction = logr.predict(np.array([[3.46]]))
print("Prediction (0 = not cancerous, 1 = cancerous):", prediction[0])

# Predict probability
probability = logr.predict_proba(np.array([[3.46]]))
print("Probability [not cancerous, cancerous]:", probability[0])

Output

Prediction (0 = not cancerous, 1 = cancerous): 0
Probability [not cancerous, cancerous]: [0.50241666 0.49758334]

Uses of Logistic Regression

  • The model coefficients are simple to understand and interpretable.
  • Suitable for large data sets, i.e., quick to train.
  • It can handle 0/1, true/false, and yes/no predictions with ease (Ideal for Binary Categorization).
  • Probabilistic Output – it gives the probability of a class, not just the label.

Applications of Logistic Regression

DomainExample Use Case
HealthcarePredicting disease presence or absence
MarketingCustomer conversion prediction
FinanceCredit risk analysis (default or not)
CybersecurityDetecting whether a login attempt is fraudulent
HRPredicting Employee Attrition

Assumptions for Logistic Regression:

  1. The dependent variable must be categorical.
  2. The independent variable should not have multicollinearity.

Conclusion

In machine learning, logistic regression continues to be one of the most used and successful methods, particularly for problems related to binary classification. Its interpretability, adaptability, and mathematical simplicity make it an effective base model for experts and a perfect place to start for newbies.

Leave feedback about this

  • Rating
Choose Image

Latest Posts

List of Categories