Machine Learning Tutorial

By KnowledgeHut.

Machine learning metrics measure how well a machine learning model performs on the data supplied to it. With this feedback, the performance of the model can be improved by tuning the hyperparameters or adjusting the features of the input dataset. The main goal of a learning model is to generalize well to data it has never seen before, and performance metrics help determine how well the model does so.


Machine learning metrics

Each metric suits particular kinds of learning models (for example, classification or regression), so not every metric can be applied to a given model. A single metric, or a small set of metrics, is usually taken as the point of reference and improved upon.

Accuracy

Accuracy is the fraction of predictions that a classification model got right. When a model classifies the given data items into two classes, it is known as binary classification. In that case, accuracy can be defined as:

Accuracy = (True positives + True negatives) / (Total number of data items)

When a classification model classifies the data items into more than two classes, it is known as multi-class classification. In that case, accuracy can be defined as:

Accuracy = (Number of correctly predicted data items) / (Total number of data items)
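The accuracy definition can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name `accuracy` and the sample labels are made up for the example. The same function covers the binary and the multi-class case, because both reduce to counting matching predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Binary classification (hypothetical labels): 1 = positive, 0 = negative
binary_true = [1, 0, 1, 1, 0, 0, 1, 0]
binary_pred = [1, 0, 0, 1, 0, 1, 1, 0]
binary_acc = accuracy(binary_true, binary_pred)  # 6 of 8 correct -> 0.75

# Multi-class classification (hypothetical labels)
multi_true = ["cat", "dog", "bird", "cat", "dog"]
multi_pred = ["cat", "dog", "cat", "cat", "bird"]
multi_acc = accuracy(multi_true, multi_pred)  # 3 of 5 correct -> 0.6

print(binary_acc, multi_acc)
```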

Precision

Precision is the fraction of data points predicted as positive that are actually positive:

Precision = (True positives) / (True positives + False positives)

Recall

Recall is the fraction of the actually positive (relevant) data points in a dataset that the model correctly predicted as positive:

Recall = (True positives) / (True positives + False negatives)
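As a rough sketch, precision and recall can both be computed from the counts of true positives, false positives, and false negatives. The helper names (`confusion_counts`, `precision`, `recall`) and the sample labels below are hypothetical. Returning 0.0 when a denominator is zero is an assumption made for this example; other conventions exist.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    # Assumption: report 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Assumption: report 0.0 when the data contains no positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)  # 2, 1, 2, 5
prec = precision(tp, fp)  # 2 / (2 + 1) = 2/3
rec = recall(tp, fn)      # 2 / (2 + 2) = 0.5
print(prec, rec)
```

In this example the model is fairly precise (two of its three positive predictions are right) but misses half of the actual positives, which is why the two numbers differ.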

Mean Absolute Error (MAE)

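It is the average of the absolute differences between the predicted values and the actual values in a dataset (for example, the training data). It is used to evaluate models that predict numeric values:

MAE = (Sum of |actual value - predicted value|) / (Total number of examples)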
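In its standard form, MAE averages the absolute difference between each actual value and the corresponding prediction. A minimal sketch, with a made-up function name and made-up sample values:

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all examples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


actual = [3.0, 5.0, 2.5, 7.0]      # hypothetical target values
predicted = [2.5, 5.0, 4.0, 8.0]   # hypothetical model outputs

# Absolute errors: 0.5, 0.0, 1.5, 1.0 -> sum 3.0 -> 3.0 / 4 = 0.75
mae = mean_absolute_error(actual, predicted)
print(mae)
```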

Mean Squared Error (MSE)

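It is the sum of the squared differences between the predicted values and the actual values (the squared losses), divided by the total number of examples in the dataset:

MSE = (Sum of (actual value - predicted value)^2) / (Total number of examples)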
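In its standard form, MSE averages the squared difference between each actual value and the corresponding prediction. The sketch below uses a made-up function name and made-up sample values. Because the errors are squared, a single large error contributes much more to MSE than several small ones.

```python
def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2 over all examples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


actual = [3.0, 5.0, 2.5, 7.0]      # hypothetical target values
predicted = [2.5, 5.0, 4.0, 8.0]   # hypothetical model outputs

# Squared errors: 0.25, 0.0, 2.25, 1.0 -> sum 3.5 -> 3.5 / 4 = 0.875
mse = mean_squared_error(actual, predicted)
print(mse)
```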

F1 Score

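It is the harmonic mean of the precision and recall values. It combines both into a single score, which is high only when precision and recall are both high:

F1 = 2 * (1 / ((1/precision) + (1/recall)))

This is equivalent to:

F1 = (2 * precision * recall) / (precision + recall)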
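A short sketch of the F1 calculation follows. The function name and the sample precision and recall values are made up, and returning 0.0 when both inputs are zero is an assumption of this example. The harmonic-mean form and the simplified product form give the same result.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # Assumption: define F1 as 0 when both inputs are 0.
    return 2 * precision * recall / (precision + recall)


p, r = 0.5, 1.0  # hypothetical precision and recall

f1 = f1_score(p, r)                           # 2 * 0.5 * 1.0 / 1.5 = 2/3
f1_harmonic = 2 * (1 / ((1 / p) + (1 / r)))   # same value, harmonic-mean form
arithmetic_mean = (p + r) / 2                 # 0.75, for comparison
print(f1, f1_harmonic, arithmetic_mean)
```

The harmonic mean (about 0.667) sits below the arithmetic mean (0.75), so a model cannot reach a high F1 score by doing well on only one of the two metrics.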

Area under curve

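It is defined with respect to binary classification problems. The area under curve (AUC) is the area under the ROC curve. It ranges from 0 to 1: a value of 1 means the classifier scores every positive example above every negative example, and a value of 0.5 corresponds to random guessing.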
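One way to build intuition for the area under the ROC curve is its ranking interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example, with ties counted as one half. The sketch below computes it that way by brute force. The function name and sample scores are made up, and the pairwise loop is only practical for small datasets.

```python
def auc_pairwise(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]              # hypothetical labels
scores = [0.1, 0.4, 0.35, 0.8]     # hypothetical predicted scores

# Pairs: (0.35, 0.1) ok, (0.35, 0.4) wrong, (0.8, 0.1) ok, (0.8, 0.4) ok -> 3/4
auc = auc_pairwise(y_true, scores)
print(auc)
```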

ROC curve (Receiver Operating Characteristic curve)

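ROC refers to the Receiver Operating Characteristic curve, which is a visual way of assessing a binary classifier’s performance.

It is a plot of the true positive rate (also known as recall) against the false positive rate, with one point for each classification threshold, rather than a single ratio of the two:

ROC curve = True positive rate (vertical axis) plotted against False positive rate (horizontal axis)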
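The ROC curve is traced by sweeping a decision threshold over the predicted scores and recording the false positive rate and true positive rate at each threshold. The sketch below does this in plain Python and then measures the area under the resulting points with the trapezoidal rule. The function names and sample data are made up, and treating a score greater than or equal to the threshold as a positive prediction is an assumption of this example.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, from (0, 0) to (1, 1)."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        # Assumption: score >= threshold means "predict positive".
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


y_true = [0, 0, 1, 1]              # hypothetical labels
scores = [0.1, 0.4, 0.35, 0.8]     # hypothetical predicted scores

points = roc_points(y_true, scores)
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
area = area_under(points)  # 0.75
print(points, area)
```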

True positive rate, which is also known as sensitivity or recall, can be defined as the ratio of true positives to the sum of true positives and false negatives:

TPR = True positives / (True positives + False negatives)

True negative rate, which is also known as specificity or selectivity, can be defined as the ratio of true negatives to the sum of true negatives and false positives:

TNR = True negatives / (True negatives + False positives)

False positive rate, which is also known as fall-out, can be defined as the ratio of false positives to the sum of false positives and true negatives:

FPR = False positives / (False positives + True negatives)

False negative rate, which is also known as miss rate, can be defined as the ratio of false negatives to the sum of false negatives and true positives:

FNR = False negatives / (False negatives + True positives)
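The four rates can be computed together from the confusion-matrix counts. The sketch below uses a made-up function name and made-up counts, and assumes both classes are present so that no denominator is zero. TPR and FNR share a denominator and sum to 1, as do TNR and FPR.

```python
def classification_rates(tp, fp, fn, tn):
    """Compute TPR, TNR, FPR and FNR from confusion-matrix counts."""
    # Assumption: at least one actual positive and one actual negative exist.
    return {
        "TPR": tp / (tp + fn),  # sensitivity / recall
        "TNR": tn / (tn + fp),  # specificity / selectivity
        "FPR": fp / (fp + tn),  # fall-out
        "FNR": fn / (fn + tp),  # miss rate
    }


# Hypothetical counts: 50 actual positives, 50 actual negatives
rates = classification_rates(tp=40, fp=5, fn=10, tn=45)
# TPR = 40/50 = 0.8, TNR = 45/50 = 0.9, FPR = 5/50 = 0.1, FNR = 10/50 = 0.2
print(rates)
```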

Conclusion

In this post, we looked at the various performance metrics used to evaluate machine learning models. The goal when training and tuning these models is to improve the value of the chosen performance metrics so that the model's predictions are good.


