
As the name suggests, linear regression models a linear (straight-line) relationship. In its simplest form it involves two variables, so the relationship can be drawn in two dimensions. These variables take continuous values (in contrast to the 0s and 1s predicted in logistic regression). The word ‘regression’ refers to finding the relationship between two variables, one of which is the dependent variable and the other the independent variable.
Linear Regression is one of the most widely used and best understood algorithms in statistics and Machine Learning.
Linear Regression refers to an approach/algorithm that helps establish a linear relationship between the dependent and the independent variable.
In simple words, it goes like this: suppose we are given a basic linear equation, say y = 3x - 1. Here ‘y’ is the dependent variable (since it depends on the value of x) and ‘x’ is the independent variable. This means that whenever ‘x’ changes, the value of ‘y’ changes according to the equation. Supplying different values for ‘x’ gives the corresponding values for ‘y’. The values for ‘x’ and ‘y’ are shown in the table below:
| X | Y |
|---|---|
| 1 | 2 |
| 2 | 5 |
| 3 | 8 |
| 4 | 11 |
| 5 | 14 |
| 6 | 17 |
| 7 | 20 |
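The table above can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `line` is an assumption made for this example and does not come from the article's code.

```python
# Illustrative sketch: tabulate y = 3x - 1 for x = 1..7.
def line(x):
    """Return y for the example equation y = 3x - 1."""
    return 3 * x - 1

xs = list(range(1, 8))
ys = [line(x) for x in xs]

# Print one "x y" pair per row, matching the rows of the table.
for x, y in zip(xs, ys):
    print(x, y)
```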
These values are plotted on a graph and we try to fit a straight line to all of the points (or as many of them as possible). For each point, the vertical distance between the point and the line measures how far the line’s prediction is from the actual value. Real data rarely lies exactly on a straight line, so some points end up off the line. The idea is to choose the line for which the vertical distances of the points from the line are, taken together, as small as possible. Below is an example illustrating the same:

When many points lie far from the fitted line, the ‘prediction error’ is considered to be high. The ‘error’ for a point refers to the vertical distance between the line and that point.
From the above graph, it can be observed that points 1, 2, 3 and 4, counting from the bottom-left corner, don’t really fit the line; they lie off it and therefore add to the prediction error.
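To make the idea of vertical distances concrete, the sketch below fits a line with the standard closed-form least-squares formulas and then lists each point's vertical distance from that line. It is a minimal illustration under stated assumptions: the noisy `ys_noisy` values are made up for this example, and the function names `fit_line` and `residuals` are hypothetical.

```python
# Sketch: ordinary least squares for one feature, standard library only.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Vertical distance (actual minus predicted) for every point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5, 6, 7]
ys_noisy = [2.5, 4.6, 8.3, 10.8, 14.4, 16.9, 20.1]  # made-up values near y = 3x - 1

slope, intercept = fit_line(xs, ys_noisy)
errors = residuals(xs, ys_noisy, slope, intercept)
print(slope, intercept)
print(errors)
```

Points with a large absolute residual are the ones that sit far from the fitted line.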
When such a linear regression model is trained, a ‘cost function’ is calculated; a common choice is the ‘Root Mean Squared Error’, or RMSE for short. RMSE starts from the differences between the predicted values and the actual (observed) values. Each difference is squared so that negative and positive errors do not cancel out, the squared differences are averaged (i.e. summed and divided by the total number of observations), and the square root of that average is taken.
The result is a single number that shows how well the regression algorithm has predicted the output for the given inputs, i.e. how close the predictions are to the actual outputs. The ‘cost function’ should be as small as possible, which corresponds to a minimal difference between the actual and the predicted values.
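The RMSE calculation described above can be written out step by step. The following is a small sketch; the `predicted` values are hypothetical numbers chosen only to show the arithmetic.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error: square the errors, average them, take the square root."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    mean_squared_error = sum(squared_errors) / len(squared_errors)
    return math.sqrt(mean_squared_error)

actual = [2, 5, 8, 11]
predicted = [3, 5, 6, 11]  # hypothetical predictions, for illustration only

# Errors are -1, 0, 2, 0; squared: 1, 0, 4, 0; mean: 1.25; root: sqrt(1.25).
print(rmse(actual, predicted))
```

A perfect prediction gives an RMSE of 0, and the value grows as predictions move away from the actual values.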
Gradient Descent
Gradient descent is an optimization algorithm used to minimize the cost function by finding suitable values for the parameters of the linear function (the gradient is the derivative of the loss with respect to those parameters). This doesn’t happen in a single step; it takes many small steps to arrive at a minimum, a point from which moving further gives no better value. At each step, every parameter is moved a small amount in the direction opposite to its gradient.
Inferences that can be made from the sign of the gradient:
If the gradient for a parameter is positive, the loss increases when that parameter’s value is increased by a small amount, and the loss decreases when it is decreased by a small amount.
If the gradient for a parameter is negative, the loss decreases when that parameter’s value is increased by a small amount, and the loss increases when it is decreased by a small amount.
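The behaviour described above can be sketched in plain Python for the line y = w*x + b with a mean-squared-error loss. This is an illustration, not the article's implementation: the function name `gradients`, the starting values, the learning rate of 0.01 and the 5000 steps are all assumptions chosen for this example.

```python
# Sketch: batch gradient descent for y = w*x + b with a mean-squared-error loss.
def gradients(w, b, xs, ys):
    """Return d(loss)/dw and d(loss)/db for loss = mean((w*x + b - y)^2)."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b

xs = list(range(9))
ys = [3 * x - 1 for x in xs]

# Sign of the gradient: w = 0 is too small (negative gradient, so increase w),
# w = 5 is too large (positive gradient, so decrease w).
grad_w_low, _ = gradients(0.0, -1.0, xs, ys)
grad_w_high, _ = gradients(5.0, -1.0, xs, ys)
print(grad_w_low, grad_w_high)

# Repeated small steps against the gradient move w and b towards 3 and -1.
w, b = 0.0, 0.0
learning_rate = 0.01  # assumed value for this sketch
for step in range(5000):
    grad_w, grad_b = gradients(w, b, xs, ys)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
print(w, b)
```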
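Stochastic Gradient Descent (SGD) is a variation of Gradient Descent with the same ultimate goal of minimizing the cost function; at each step it estimates the gradient from a randomly chosen sample (or small batch) of the data instead of from the full dataset.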
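A minimal sketch of the stochastic variant follows, updating the parameters after every single data point visited in random order. The function name `sgd_fit`, the seed, the learning rate and the epoch count are assumptions made for this illustration.

```python
import random

def sgd_fit(xs, ys, learning_rate=0.01, epochs=2000, seed=0):
    """Fit y = w*x + b by updating w and b after each randomly ordered sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order
        for i in indices:
            error = w * xs[i] + b - ys[i]
            # Gradient of the squared error of this one sample only.
            w -= learning_rate * 2 * error * xs[i]
            b -= learning_rate * 2 * error
    return w, b

xs = list(range(9))
ys = [3 * x - 1 for x in xs]
w, b = sgd_fit(xs, ys)
print(w, b)
```

Because the example data lies exactly on a line, the single-sample updates still settle on the same slope and intercept; with noisy data the estimates would keep fluctuating slightly around the best fit.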
PyTorch is an open-source machine learning library that was originally developed by Facebook (now Meta), and it is still actively updated and maintained. It is based on the Torch library (an open-source machine learning library and scientific computing framework with a Lua-based scripting interface), which is no longer being actively developed. Hence PyTorch came into existence.
It is widely used for building deep-learning models and for natural language processing (NLP) tasks, since it offers a Python-first interface, an easy-to-use API, and computational graphs that are built on the go (dynamically, as the code runs). It ships with several packages (for example for neural network layers, optimizers and automatic differentiation) that can be used from Python to build interesting applications and solve real-life problems. It also comes with CUDA support, which delivers higher speed by letting it use the GPU and its computing resources. CUDA is optional and can be left out, depending on our requirements.
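As a small illustration of the on-the-go computational graph, the sketch below lets PyTorch record a single operation and then asks it for the derivatives. The values 2.0, 3.0 and -1.0 are chosen only to mirror the y = 3x - 1 example, and the snippet runs on the CPU.

```python
import torch

x = torch.tensor(2.0)
w = torch.tensor(3.0, requires_grad=True)   # parameter whose gradient we want
b = torch.tensor(-1.0, requires_grad=True)  # parameter whose gradient we want

# The graph for y = w * x + b is recorded while this line executes.
y = w * x + b

# Walk the recorded graph backwards to compute dy/dw and dy/db.
y.backward()

print(y.item())       # value of 3 * 2 - 1
print(w.grad.item())  # dy/dw equals x
print(b.grad.item())  # dy/db equals 1
```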
Now, let us dive into implementing Linear Regression using PyTorch.
PyTorch can be used on several platforms (local machine, cloud, and mobile), so Linear Regression can be implemented on any of them. Below is a snip showing the different platform options.

We will see the implementation on a local machine. Before diving into the code, the PyTorch package needs to be installed. This can be done by referring to the below snip. After selecting the preferences, the last row of the below snip gives the command that needs to be executed in your local machine’s command prompt to install the PyTorch package.
Instead of this, one could also install PyTorch locally with the help of the below command on the terminal:
pip install torch
Note: If you wish to implement Linear Regression using PyTorch in a virtual environment, make sure to activate it first and then install the framework in that virtual environment so that it doesn’t conflict with other libraries.
Note: Your environment should have the prerequisite (numpy) and Python 3.x, since PyTorch doesn’t support Python 2.x.

It will take some time for the packages to be downloaded. If your system is not CUDA-capable or you simply don’t require CUDA, select the CPU option in the preferences and execute the command it gives in your command prompt.

During the download, your screen should look like the below image:

To verify that PyTorch has been successfully installed, execute the following lines of code in your IDE:
import torch
x = torch.rand(2, 1)
print(x)
The output should look similar to the lines below (the values are random, so yours will differ):
tensor([[0.7887],
        [0.8678]])
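Beyond printing a random tensor, it can be useful to confirm which version was installed and whether CUDA is usable on the machine. The following is a small sketch using standard PyTorch calls; the variable name `device` is just a convention chosen for this example.

```python
import torch

# Report the installed version and whether a CUDA-capable GPU can be used.
print(torch.__version__)
print(torch.cuda.is_available())

# Fall back to the CPU when CUDA is not available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the test tensor directly on the selected device.
x = torch.rand(2, 1, device=device)
print(x.shape, x.device)
```

If the second line prints False, the rest of the tutorial still works; everything simply runs on the CPU.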
The first step is to generate the data required for Linear Regression. We will use the same linear equation, y = 3x - 1, to generate various ‘x’ and ‘y’ values using the numpy library.
import numpy as np
x_values = [i for i in range(9)]
x_train = np.array(x_values, dtype=np.float32)
x_train = x_train.reshape(-1, 1)
y_values = [3*i - 1 for i in x_values]
y_train = np.array(y_values, dtype=np.float32)
y_train = y_train.reshape(-1, 1)
Defining the model architecture: the below code defines a class named linearRegression, which is a subclass of the parent class torch.nn.Module. This parent class is the base class for neural network modules in PyTorch; it provides the functionality needed to register parameters and run the model, so the subclass only has to define its layers and how data flows through them. In the __init__ method of the linearRegression class, an input size and an output size are defined. Both are set to 1 in this example, which means that one target value is predicted for each single input value.
import torch
from torch.autograd import Variable

class linearRegression(torch.nn.Module):
    def __init__(self, inputSize, outputSize):
        super(linearRegression, self).__init__()
        # A single linear layer computes out = weight * x + bias
        self.linear = torch.nn.Linear(inputSize, outputSize)

    def forward(self, x):
        # Forward pass: apply the linear layer to the input
        out = self.linear(x)
        return out
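To see what such a model contains, the sketch below builds it and inspects its learnable parameters. The class is repeated so that the snippet runs on its own; the seed and the input value 2.0 are assumptions chosen for this illustration.

```python
import torch

class linearRegression(torch.nn.Module):
    def __init__(self, inputSize, outputSize):
        super().__init__()
        self.linear = torch.nn.Linear(inputSize, outputSize)

    def forward(self, x):
        return self.linear(x)

torch.manual_seed(0)  # makes the random initial weight and bias repeatable
model = linearRegression(1, 1)

# With one input and one output there is one weight and one bias.
for name, parameter in model.named_parameters():
    print(name, tuple(parameter.shape))

# The forward pass is just weight * x + bias.
x = torch.tensor([[2.0]])
manual = x * model.linear.weight + model.linear.bias
print(torch.allclose(model(x), manual))
```

Before training, the weight and bias hold random initial values; training is what moves them towards 3 and -1.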
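The model is then instantiated from the linearRegression class. Its parent class (torch.nn.Module) supplies the functionality needed to build a basic neural network, and the torch.nn.Linear layer inside the model implements the linear function itself. The input and output dimensions are both 1; the learning rate and the number of epochs are set here as well because the following steps use them. The model is moved to the GPU only after it has been created, and only if CUDA is available. The model is instantiated with the below lines of code:
input_dimensions = 1
output_dimensions = 1
learningRate = 0.01
epochs = 200
model = linearRegression(input_dimensions, output_dimensions)
if torch.cuda.is_available():
    model.cuda()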
The loss, i.e. Mean Squared Error (MSE), is initialized; it gives the average of the squared differences between the actual and the predicted values. An optimization algorithm (Stochastic Gradient Descent, SGD) that reduces the cost function is initialized as well. These two objects are used to train the model. The parameters of the model are passed to the SGD optimizer.
criterion = torch.nn.MSELoss()  # the mean squared error, which is the loss function
# SGD is used to find the parameter values that minimize the cost function
optimizer = torch.optim.SGD(model.parameters(), lr=learningRate)
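What the optimizer does with the gradients can be checked by hand. The sketch below assumes plain SGD with no momentum or weight decay (the defaults of torch.optim.SGD) and compares one optimizer step with the manual update "parameter minus learning rate times gradient". The three data points and the seed are made up for this example.

```python
import torch

torch.manual_seed(0)
layer = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
learning_rate = 0.01
optimizer = torch.optim.SGD(layer.parameters(), lr=learning_rate)

x = torch.tensor([[0.0], [1.0], [2.0]])
y = 3 * x - 1

optimizer.zero_grad()
loss = criterion(layer(x), y)
loss.backward()

# Manual update rule, computed before the optimizer changes the parameters.
expected_weight = layer.weight.detach() - learning_rate * layer.weight.grad
expected_bias = layer.bias.detach() - learning_rate * layer.bias.grad

optimizer.step()

weight_matches = torch.allclose(layer.weight, expected_weight)
bias_matches = torch.allclose(layer.bias, expected_bias)
print(weight_matches, bias_matches)
```

Note that the optimizer uses whatever gradient it is given; when the whole dataset is passed in at once, as in this tutorial, each step is effectively a full-batch gradient descent step.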
Model training based on the data generated in the previous steps:
Here, the inputs and labels are converted to the type ‘Variable’ (since PyTorch 0.4, tensors support automatic differentiation directly and ‘Variable’ simply returns a tensor, so the wrapper is optional). A gradient buffer stores the gradient calculated at every step. PyTorch adds each newly calculated gradient to this buffer, so the buffer is cleared at the start of every iteration to make sure gradients do not accumulate from one step to the next.
for epoch in range(epochs):
    if torch.cuda.is_available():
        inputs = Variable(torch.from_numpy(x_train).cuda())  # input is converted to type Variable
        labels = Variable(torch.from_numpy(y_train).cuda())  # label is converted to type Variable
    else:
        inputs = Variable(torch.from_numpy(x_train))
        labels = Variable(torch.from_numpy(y_train))
    # The gradient buffer is cleared so that no gradient from the previous iteration is carried ahead.
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)  # for the predicted output, the loss (MSE) is calculated
    print(loss)
    loss.backward()  # the gradients of the loss with respect to the parameters are obtained
    optimizer.step()  # the parameters are updated using those gradients
    print('epoch {}, loss {}'.format(epoch, loss.item()))
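The reason for clearing the gradient buffer in every iteration can be seen in a tiny experiment. This sketch uses a single made-up parameter `w` and a made-up loss, (3w)^2, whose derivative at w = 1 is 18.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

loss = (3 * w) ** 2
loss.backward()
first = w.grad.item()        # gradient after one backward pass

loss = (3 * w) ** 2
loss.backward()              # second pass ADDS to the existing gradient
accumulated = w.grad.item()

w.grad.zero_()               # clear the buffer, as optimizer.zero_grad() does
loss = (3 * w) ** 2
loss.backward()
after_reset = w.grad.item()

print(first, accumulated, after_reset)
```

Without the reset, the second gradient is twice as large as it should be, which would make every parameter update in a training loop too big.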
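Testing:
Once the Linear Regression model’s training is completed, it needs to be tested. Ideally this is done with new data; for simplicity, the code below makes predictions on the same x_train values that were used for training.
with torch.no_grad():  # gradients are not needed for prediction
    if torch.cuda.is_available():
        predicted = model(Variable(torch.from_numpy(x_train).cuda())).cpu().data.numpy()
    else:
        predicted = model(Variable(torch.from_numpy(x_train))).data.numpy()
    print(predicted)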
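Testing on inputs the model has never seen could look like the sketch below. To keep it self-contained it trains a bare torch.nn.Linear layer on the same y = 3x - 1 data; the unseen inputs 10.0 and 12.5, the seed and the 2000 epochs are assumptions made for this example, and it runs on the CPU.

```python
import torch

torch.manual_seed(0)
x_train = torch.arange(9, dtype=torch.float32).reshape(-1, 1)
y_train = 3 * x_train - 1

model = torch.nn.Linear(1, 1)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(2000):
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()

# Inputs outside the training range 0..8.
x_new = torch.tensor([[10.0], [12.5]])
with torch.no_grad():
    predicted = model(x_new)

print(predicted)       # model's predictions
print(3 * x_new - 1)   # values given by the true equation
```

If training has gone well, the two printed tensors should be close to each other.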
The below lines of code plot the training data and the model’s predictions on a graph to demonstrate the line fitting.
import matplotlib.pyplot as plt
plt.clf()
plt.plot(x_train, y_train, 'go', label='True data', alpha=0.5)
plt.plot(x_train, predicted, '--', label='Predictions', alpha=0.5)
plt.legend(loc='best')
plt.show()
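How well the plotted line fits depends on how long the model was trained. The sketch below measures the final loss for a few different epoch counts; the helper name `final_loss`, the seed and the chosen epoch counts are assumptions made for this illustration, and a bare torch.nn.Linear layer stands in for the model class.

```python
import torch

def final_loss(epochs, learning_rate=0.01, seed=0):
    """Train on y = 3x - 1 for the given number of epochs and return the final MSE."""
    torch.manual_seed(seed)  # same starting weights for every run
    x = torch.arange(9, dtype=torch.float32).reshape(-1, 1)
    y = 3 * x - 1
    model = torch.nn.Linear(1, 1)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return criterion(model(x), y).item()

losses = {epochs: final_loss(epochs) for epochs in (10, 100, 1000)}
for epochs, value in losses.items():
    print(epochs, value)
```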
Note: Try changing the ‘epochs’ value to see how the graph changes. With more epochs the model trains for longer, so the loss drops further and the line fits the data better.
Output:
(Only 10 epochs of output are shown here; the training loop runs once for every epoch, whatever value ‘epochs’ is set to.)
tensor(232.5576, grad_fn=<MseLossBackward>)
epoch 0, loss 232.55755615234375
tensor(66.7721, grad_fn=<MseLossBackward>)
epoch 1, loss 66.77213287353516
tensor(19.7759, grad_fn=<MseLossBackward>)
epoch 2, loss 19.77593231201172
tensor(6.4467, grad_fn=<MseLossBackward>)
epoch 3, loss 6.446735858917236
tensor(2.6594, grad_fn=<MseLossBackward>)
epoch 4, loss 2.6594393253326416
tensor(1.5766, grad_fn=<MseLossBackward>)
epoch 5, loss 1.5765886306762695
tensor(1.2603, grad_fn=<MseLossBackward>)
epoch 6, loss 1.260332465171814
tensor(1.1614, grad_fn=<MseLossBackward>)
epoch 7, loss 1.161449670791626
tensor(1.1243, grad_fn=<MseLossBackward>)
epoch 8, loss 1.1242772340774536
tensor(1.1047, grad_fn=<MseLossBackward>)
epoch 9, loss 1.1047004461288452
[[ 0.92321473]
 [ 3.5736573 ]
 [ 6.2241    ]
 [ 8.874542  ]
 [11.524985  ]
 [14.175427  ]
 [16.82587   ]
 [19.476313  ]
 [22.126755  ]]
The output:

Finally, all the code in a single screen for ease of use:
import numpy as np
import torch

# Generate the training data from y = 3x - 1
x_values = [i for i in range(9)]
x_train = np.array(x_values, dtype=np.float32)
x_train = x_train.reshape(-1, 1)
y_values = [3*i - 1 for i in x_values]
y_train = np.array(y_values, dtype=np.float32)
y_train = y_train.reshape(-1, 1)

from torch.autograd import Variable

class linearRegression(torch.nn.Module):
    def __init__(self, inputSize, outputSize):
        super(linearRegression, self).__init__()
        self.linear = torch.nn.Linear(inputSize, outputSize)

    def forward(self, x):
        out = self.linear(x)
        return out

input_dimensions = 1
output_dimensions = 1
learningRate = 0.01
epochs = 200

model = linearRegression(input_dimensions, output_dimensions)
if torch.cuda.is_available():
    model.cuda()

criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learningRate)

# Training loop
for epoch in range(epochs):
    if torch.cuda.is_available():
        inputs = Variable(torch.from_numpy(x_train).cuda())
        labels = Variable(torch.from_numpy(y_train).cuda())
    else:
        inputs = Variable(torch.from_numpy(x_train))
        labels = Variable(torch.from_numpy(y_train))
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    print(loss)
    loss.backward()
    optimizer.step()
    print('epoch {}, loss {}'.format(epoch, loss.item()))

# Prediction
with torch.no_grad():
    if torch.cuda.is_available():
        predicted = model(Variable(torch.from_numpy(x_train).cuda())).cpu().data.numpy()
    else:
        predicted = model(Variable(torch.from_numpy(x_train))).data.numpy()
    print(predicted)

# Plot the true data against the predictions
import matplotlib.pyplot as plt
plt.clf()
plt.plot(x_train, y_train, 'go', label='True data', alpha=0.5)
plt.plot(x_train, predicted, '--', label='Predictions', alpha=0.5)
plt.legend(loc='best')
plt.show()
In this post, we understood what linear regression is, the significance of PyTorch and the implementation of Linear Regression using PyTorch.