
Logistic Regression

Sigmoid function

$$y = \frac{1}{1 + e^{-x}}$$
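A minimal NumPy sketch of the sigmoid (the function name here is my own):

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^{-x}); np.exp broadcasts, so this works on scalars and arrays
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))  # 0.5: the sigmoid crosses the midpoint at x = 0
```

Large positive inputs map toward 1 and large negative inputs toward 0, which is what lets the output be read as a probability.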

Apply the sigmoid function to the output of a linear regression.

$$\begin{aligned}
Y_{\beta}(x) & = \frac{1}{1 + e^{-\beta^T x+\epsilon}}\\
\\
\Pr(Y_{i}=y\mid \mathbf{X}_{i}) & = {p_{i}}^{y}(1-p_{i})^{1-y}\\
& = \left({\frac{e^{\mathbf{\beta}\cdot \mathbf{X}_{i}}}{1+e^{\mathbf{\beta} \cdot \mathbf{X}_{i}}}}\right)^{y}\left(1-{\frac {e^{\mathbf{\beta}\cdot \mathbf{X}_{i}}}{1+e^{\mathbf{\beta}\cdot \mathbf{X}_{i}}}}\right)^{1-y}\\
& = {\frac {e^{\mathbf{\beta}\cdot \mathbf{X}_{i}\cdot y}}{1+e^{\mathbf{\beta}\cdot \mathbf{X}_{i}}}}
\end{aligned}$$

This can be converted into an odds ratio:

$$\frac{P(x)}{1-P(x)} = e^{\beta^T x}$$

and

$$\ln{\frac{P(x)}{1-P(x)}} = \beta^T x$$
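The log-odds (logit) is the inverse of the sigmoid, which a quick numerical check confirms (function names here are my own):

```python
import numpy as np

def log_odds(p):
    # ln(p / (1 - p)), the logit function
    return np.log(p / (1.0 - p))

z = 1.7
p = 1.0 / (1.0 + np.exp(-z))  # sigmoid(z)
print(log_odds(p))            # recovers z = 1.7
```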
```python
from sklearn.linear_model import LogisticRegression

LR = LogisticRegression(penalty='l2', C=1e5)  # large C means weak regularization
LR = LR.fit(X_train, y_train)
y_predict = LR.predict(X_test)
LR.coef_  # fitted coefficients (beta)
```
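Besides hard labels, `predict_proba` returns class probabilities, which is useful when the default 0.5 decision threshold needs adjusting. A self-contained sketch on made-up 1-D data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data for illustration: class flips from 0 to 1 around x = 0
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba([[3.0]])[:, 1]  # P(y=1 | x=3); should exceed 0.5 here
print(proba)
```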

Confusion Matrix

|              | Predicted True | Predicted False |
| ------------ | -------------- | --------------- |
| Actual True  | TP             | FN              |
| Actual False | FP             | TN              |
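Note that `sklearn.metrics.confusion_matrix` also uses rows for actual and columns for predicted, but sorts labels in ascending order, so for 0/1 labels the negative class comes first and the matrix reads `[[TN, FP], [FN, TP]]`:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
# rows = actual, columns = predicted, labels sorted ascending (0, 1):
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))  # [[1 1], [1 2]]
```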

FP is called a type I error, and FN is called a type II error.

Accuracy is the ratio of correct predictions to all predictions.

$$= \frac{TP + TN}{TP + TN + FP + FN}$$

Sensitivity (recall) is how correctly the positive class is predicted: the percentage of actual positives that are captured.

$$= \frac{TP}{TP + FN} = \frac{TP}{\text{Actual True}}$$

Precision is, out of all positive predictions, the fraction that are correct. There is a trade-off between recall and precision.

$$= \frac{TP}{TP + FP} = \frac{TP}{\text{Predicted True}}$$

Specificity is how correctly the negative class is predicted; it is the recall for class 0.

$$= \frac{TN}{TN + FP} = \frac{TN}{\text{Actual False}}$$

F1 is the harmonic mean of precision and recall; it captures the trade-off between them in a single number.

$$= 2\times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
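As a worked example, the four metrics computed from made-up confusion-matrix counts:

```python
# made-up counts for illustration
TP, FN, FP, TN = 40, 10, 5, 45

accuracy  = (TP + TN) / (TP + TN + FP + FN)  # 85/100 = 0.85
recall    = TP / (TP + FN)                   # 40/50  = 0.8
precision = TP / (TP + FP)                   # 40/45  ~ 0.889
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, recall, precision, f1)
```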

Classification Error Metrics

ROC

The Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR, sensitivity) against the False Positive Rate (FPR, 1 − specificity) at varying decision thresholds.
It works better for data with balanced classes.


Precision-Recall Curves

Shows the trade-off between precision and recall at varying thresholds.
Better for data with imbalanced classes.


Multiple Class Error Metrics

Accuracy $= \frac{TP_1+TP_2+TP_3}{\text{Total}}$

Code

```python
from sklearn.metrics import accuracy_score
accuracy_value = accuracy_score(y_test, y_predict)

from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score, confusion_matrix, roc_curve, precision_recall_curve
```
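The per-class scorers all share the same `(y_true, y_pred)` signature. A self-contained example with hypothetical predictions:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# hypothetical labels and predictions for illustration
y_test    = [0, 1, 1, 0, 1, 1]
y_predict = [0, 1, 0, 0, 1, 1]

print(precision_score(y_test, y_predict))  # 1.0: no false positives
print(recall_score(y_test, y_predict))     # 0.75: one positive missed
print(f1_score(y_test, y_predict))         # harmonic mean of the two
```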
