Softmax Regression using TensorFlow

Softmax regression, also known as multinomial logistic regression, is a generalization of logistic regression. It is used when multiple classes need to be handled, i.e., when data points in the dataset need to be classified into more than two classes.

The softmax function does the following:

  • Converts all the scores into probabilities 
  • Ensures that the sum of the probabilities is 1 
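For instance (illustrative numbers, not from the dataset), raw scores of (2.0, 1.0, 0.1) become probabilities of roughly (0.66, 0.24, 0.10), which sum to 1.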

Feature matrix 

Assume the dataset has ‘m’ features (columns), ‘n’ data points (rows), and ‘k’ classes into which the data points need to be classified. The feature matrix, with a leading column of ones for the bias term, can be represented as:

X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1m} \\ 1 & x_{21} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nm} \end{pmatrix}

Weight matrix 

The weight matrix W holds an entry w_{ij} for the ith feature and jth class, with the row w_{0j} corresponding to the bias term:

W = \begin{pmatrix} w_{01} & w_{02} & \cdots & w_{0k} \\ w_{11} & w_{12} & \cdots & w_{1k} \\ \vdots & \vdots & \ddots & \vdots \\ w_{m1} & w_{m2} & \cdots & w_{mk} \end{pmatrix}

The scores (logits) need to be normalized into probabilities so that the gradient descent algorithm can be used to minimize the cost function. Hence, we use the softmax function, defined below:

P(y \mid X) = S(XW), \qquad S(z)_j = \frac{e^{z_j}}{\sum_{c=1}^{k} e^{z_c}} \quad \text{(vector form)}
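As a quick illustration (not part of the original implementation), the same softmax can be written in a few lines of NumPy; subtracting the maximum score first is a common trick for numerical stability:

import numpy as np

def softmax(scores):
    # subtract the max score for numerical stability; the result is mathematically unchanged
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659 0.242 0.099], sums to 1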

One-hot encoded target matrix: The softmax function gives a vector of probabilities over the class labels for each data point. The target labels need to be converted into the same format so that the cost function can be calculated. Hence, every data point has a target vector of zeroes and ones, where only the position of the correct label is set to 1. This process is known as one-hot encoding.
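One way to build such one-hot target vectors with NumPy (a minimal sketch, assuming integer class labels from 0 to k-1; the labels below are purely illustrative) is:

import numpy as np

labels = np.array([2, 0, 1])   # hypothetical integer class labels
num_classes = 3

one_hot = np.zeros((labels.shape[0], num_classes))
one_hot[np.arange(labels.shape[0]), labels] = 1
print(one_hot)
# [[0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]]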

Let us understand how softmax regression can be implemented using the TensorFlow library:

Import the required libraries to implement softmax regression, and download the MNIST handwritten digit dataset. 

The MNIST data is split into a training, testing, and validation dataset. Next, a computation graph is created. The training data is fed through a placeholder whose value is supplied at run time. The model is trained on mini-batches using gradient descent; this technique is known as stochastic (mini-batch) gradient descent.

The weight matrix described above is initialized with random values drawn from a truncated normal distribution. The bias is initialized to 0.

The input data points are multiplied by the weight matrix, and the bias is added to the result to produce the logits. The softmax is then calculated using TensorFlow.

Next, the cost function is minimized using the gradient descent algorithm. 
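Concretely, the cost function minimized here is the average cross-entropy between the one-hot targets y and the predicted probabilities p (this is what tf.nn.softmax_cross_entropy_with_logits computes in the code below):

J(W, b) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{k} y_{ij} \log(p_{ij})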

Let us look at the code now: 

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

print("Shape of the feature matrix:", mnist.train.images.shape)
print("Shape of the target matrix:", mnist.train.labels.shape)
print("One-hot encoding for the first observation is:\n", mnist.train.labels[0])

# visualizing the data by plotting the images
fig, ax = plt.subplots(10, 10)
k = 0
for i in range(10):
    for j in range(10):
        ax[i][j].imshow(mnist.train.images[k].reshape(28, 28), aspect='auto')
        k += 1
plt.show()

# number of features
num_features = 784
# number of target labels
num_labels = 10
# learning rate (also known as alpha)
learning_rate = 0.05
# batch size
batch_size = 128
# number of training steps
num_steps = 5001

# input dataset
train_dataset = mnist.train.images
train_labels = mnist.train.labels
test_dataset = mnist.test.images
test_labels = mnist.test.labels
valid_dataset = mnist.validation.images
valid_labels = mnist.validation.labels

# initializing a tensorflow graph
graph = tf.Graph()
with graph.as_default():
    # Inputs
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, num_features))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Variables
    weights = tf.Variable(tf.truncated_normal([num_features, num_labels]))
    biases = tf.Variable(tf.zeros([num_labels]))

    # Training computation
    logits = tf.matmul(tf_train_dataset, weights) + biases
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        labels=tf_train_labels, logits=logits))

    # Optimizer
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Predictions for the training, validation, and test datasets
    train_prediction = tf.nn.softmax(logits)
    valid_prediction = tf.nn.softmax(tf.matmul(tf_valid_dataset, weights) + biases)
    test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)

# utility function that calculates accuracy
def accuracy(predictions, labels):
    correctly_predicted = np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
    accu = (100.0 * correctly_predicted) / predictions.shape[0]
    return accu

with tf.Session(graph=graph) as session:
    # initialize the weights and biases
    tf.global_variables_initializer().run()
    print("Initialized")

    for step in range(num_steps):
        # pick a randomized offset into the training data
        offset = np.random.randint(0, train_labels.shape[0] - batch_size - 1)

        # generating a minibatch
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]

        # feed the minibatch through the placeholders
        feed_dict = {tf_train_dataset: batch_data,
                     tf_train_labels: batch_labels}

        # run one optimization step and fetch the loss and predictions
        _, l, predictions = session.run([optimizer, loss, train_prediction],
                                        feed_dict=feed_dict)

        if (step % 500 == 0):
            print("Minibatch loss at step {0}: {1}".format(step, l))
            print("Minibatch accuracy: {:.1f}%".format(
                accuracy(predictions, batch_labels)))
            print("Validation accuracy: {:.1f}%".format(
                accuracy(valid_prediction.eval(), valid_labels)))

    print("\nTest accuracy: {:.1f}%".format(
        accuracy(test_prediction.eval(), test_labels)))

Output: 

Initialized 
Minibatch loss at step 0: 11.68 
Minibatch accuracy: 10.2% 
Validation accuracy: 14.3% 
Minibatch loss at step 500: 2.25 
Minibatch accuracy: 46.9% 
Validation accuracy: 67.6% 
Minibatch loss at step 1000: 1.10 
Minibatch accuracy: 78.1% 
Validation accuracy: 75.0% 
Minibatch loss at step 1500: 0.67 
Minibatch accuracy: 78.9% 
Validation accuracy: 78.6% 
Minibatch loss at step 2000: 0.22 
Minibatch accuracy: 91.4% 
Validation accuracy: 81.0% 
Minibatch loss at step 2500: 0.60 
Minibatch accuracy: 84.4% 
Validation accuracy: 82.5% 
Minibatch loss at step 3000: 0.97 
Minibatch accuracy: 85.2% 
Validation accuracy: 83.9% 
Minibatch loss at step 3500: 0.64 
Minibatch accuracy: 85.2% 
Validation accuracy: 84.4% 
Minibatch loss at step 4000: 0.79 
Minibatch accuracy: 82.8% 
Validation accuracy: 85.0% 
Minibatch loss at step 4500: 0.60 
Minibatch accuracy: 80.5% 
Validation accuracy: 85.6% 
Minibatch loss at step 5000: 0.48 
Minibatch accuracy: 89.1% 
Validation accuracy: 86.2% 
Test accuracy: 86.49% 

Conclusion 

In this post, we understood the meaning of softmax regression, how it can be used, and how to implement it using the TensorFlow library.
