Machine Learning Tutorial

By KnowledgeHut.

The confusion matrix is one of the simplest tools for finding out how correct and how accurate a model is. It is used with classification problems, where the output (the class to which a data item is assigned) can belong to any of the classes: one of two in binary classification, and one of many in multi-class classification.


Confusion matrix

The confusion matrix is not a performance measure on its own, but most performance metrics are derived from the counts it contains.

Terminologies associated with the confusion matrix:

True positives: Consider a binary classification example with two classes, ‘True’ and ‘False’. A true positive is a case in which the predicted class is ‘True’ and the actual class of the data item is also ‘True’. For example, if spam is taken as the ‘True’ class, a spam email was correctly identified as ‘spam’.
True positive rate, also known as sensitivity or recall, is the ratio of true positives to the sum of true positives and false negatives:

TPR = True positives / (True positives + False negatives)

True negatives: Consider again a binary classification example with two classes, ‘True’ and ‘False’. A true negative is a case in which the predicted class is ‘False’ and the actual class of the data item is also ‘False’. For example, if spam is taken as the ‘True’ class, a non-spam email was correctly identified as ‘non-spam’.
True negative rate, also known as specificity or selectivity, is the ratio of true negatives to the sum of true negatives and false positives:

TNR = True negatives / (True negatives + False positives)
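As a rough illustration of the two rates above, the following Python sketch counts the four outcomes from a pair of label lists and then computes TPR and TNR. The labels are made up for the example (True stands for spam, False for non-spam), and `count_outcomes` is a hypothetical helper name, not part of any library.

```python
def count_outcomes(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1          # predicted True, actually True
        elif not p and not a:
            tn += 1          # predicted False, actually False
        elif p and not a:
            fp += 1          # predicted True, actually False
        else:
            fn += 1          # predicted False, actually True
    return tp, tn, fp, fn


# Made-up labels for illustration: True = spam, False = non-spam
actual    = [True, True, True, False, False, False, False, True]
predicted = [True, False, True, False, True, False, False, True]

tp, tn, fp, fn = count_outcomes(actual, predicted)

tpr = tp / (tp + fn)  # sensitivity / recall
tnr = tn / (tn + fp)  # specificity / selectivity

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"TPR={tpr:.2f} TNR={tnr:.2f}")
```

Both rates are undefined when their denominator is zero (for example, a dataset with no actual ‘True’ items), so real code would need to handle that case.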

False positives: Consider a binary classification example with two classes, ‘True’ and ‘False’. False positives are items that were predicted as belonging to the ‘True’ class but actually belong to the ‘False’ class. For example, if spam is taken as the ‘True’ class, a non-spam email was incorrectly identified as a spam email.
False positive rate, also known as fall-out, is the ratio of false positives to the sum of false positives and true negatives:

FPR = False positives / (False positives + True negatives)

False negatives: Consider a binary classification example with two classes, ‘True’ and ‘False’. False negatives are items that were predicted as belonging to the ‘False’ class but actually belong to the ‘True’ class. For example, if spam is taken as the ‘True’ class, a spam email was incorrectly identified as a non-spam email.
False negative rate, also known as miss rate, is the ratio of false negatives to the sum of false negatives and true positives:

FNR = False negatives / (False negatives + True positives)
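The two error rates can be sketched in Python in the same way. The counts below are hypothetical and chosen only to keep the arithmetic easy to follow. The sketch also checks a relationship that follows directly from the formulas: FPR and TNR share a denominator, as do FNR and TPR, so each pair sums to 1.

```python
import math

# Hypothetical counts, chosen only for illustration
tp, fn = 40, 10   # 50 items whose actual class is 'True'
tn, fp = 45, 5    # 50 items whose actual class is 'False'

fpr = fp / (fp + tn)   # fall-out
fnr = fn / (fn + tp)   # miss rate
tpr = tp / (tp + fn)   # sensitivity / recall
tnr = tn / (tn + fp)   # specificity

# Each error rate is the complement of the matching "true" rate
assert math.isclose(fpr, 1 - tnr)
assert math.isclose(fnr, 1 - tpr)

print(f"FPR = {fpr:.2f}, FNR = {fnr:.2f}")
```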

Ideally, the model would produce zero false positives and zero false negatives, but this rarely happens in practice. The confusion matrix also contains enough information to calculate precision and recall.
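A minimal Python sketch of that calculation follows. It assumes the standard definitions: precision is true positives divided by all predicted positives, and recall is the same quantity as the true positive rate. The function names and the counts are illustrative only.

```python
def precision(tp, fp):
    """Fraction of items predicted 'True' that really are 'True'."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual 'True' items that were predicted 'True' (same as TPR)."""
    return tp / (tp + fn)


# Hypothetical counts read off a confusion matrix
tp, fp, fn = 40, 5, 10

p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision = {p:.3f}, recall = {r:.3f}")

# In the ideal case (no false positives, no false negatives) both equal 1
ideal_p = precision(50, 0)
ideal_r = recall(50, 0)
```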

The image below shows what a confusion matrix looks like when classifying an animal as a cat or a dog.
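As a rough sketch of such a matrix, the following Python snippet tabulates made-up cat/dog labels into a 2x2 grid, with rows for the actual class and columns for the predicted class. The data and the row/column layout are assumptions for illustration and are not taken from the image.

```python
from collections import Counter

labels = ["cat", "dog"]

# Made-up labels for illustration only
actual    = ["cat", "cat", "dog", "dog", "cat", "dog", "cat", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat", "cat", "cat", "dog"]

# Count how often each (actual, predicted) pair occurs
pair_counts = Counter(zip(actual, predicted))

# Rows are actual classes, columns are predicted classes
matrix = [[pair_counts[(a, p)] for p in labels] for a in labels]

header = " " * 12 + "".join(("pred " + p).rjust(10) for p in labels)
print(header)
for a, row in zip(labels, matrix):
    print(("actual " + a).rjust(12) + "".join(str(n).rjust(10) for n in row))
```

The diagonal cells hold the correct predictions, and the off-diagonal cells hold the two kinds of mistakes.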

Conclusion

In this post, we looked at the confusion matrix and how it can be used to assess the performance of a model.
