Machine Learning Tutorial



Machine Learning Terminologies

  • Batch size: The number of training examples processed together in a single batch (i.e., in one model update).  
  • Categorical data: Features that take their values from a discrete set of possible categories (for example, a "colour" feature with values red, green, and blue). These values typically have no inherent numerical ordering.  
  • Data Analysis: The process of gaining a better understanding of the data by sampling, measuring and visualizing it.  
  • Discrete feature: A feature whose possible values form a finite, fixed set.  
  • Epoch: One full training pass over the entire dataset. The number of batches in one epoch equals the total number of examples in the dataset divided by the batch size.  
  • Feature: An input variable, i.e., one of the attributes of an example that the model uses to make predictions.  
  • Feature engineering: The process of determining which features would be useful and help in training the model better, thereby helping it make better predictions. Once the right/relevant raw attributes are selected, they are converted into features.  
  • Feature extraction: Often used as another name for feature engineering. It can also refer to extracting intermediate features from one model that are passed as inputs to other machine learning models. 
  • Feature set: The set of features on which the machine learning model is trained. 
  • Feature vector: The feature values of a single example, represented as a list, that is passed as an input to the machine learning model. 
  • Fine tuning: The process of continuing to train an already trained model, adjusting its parameters so that it makes better predictions and gives better outputs. 
  • Ground truth: The correct value, or the true answer, for the task or problem at hand. 
  • Heuristics: A quick solution to the problem at hand. It might not be optimal or well planned, but it can be thought of as a practical quick fix. 
  • Hyperparameter: A parameter that is set before training begins and tuned across training runs (for example, the learning rate or batch size), as opposed to the parameters the model learns during training. 
  • Instance: A synonym for ‘example’. It refers to a single row of a dataset. 
  • Label: This term appears in the context of ‘supervised learning’, since models that implement supervised algorithms learn from labelled examples. The label is the answer associated with an example, i.e., the value the model is trained to predict. 
  • Labelled example: An example in the dataset that contains both features and a label. 
  • Loss: A measure of how good or bad the model is. It is computed by a ‘loss’ function that essentially tells how far the predicted values are from the true ones. 
  • Loss curve: A plot of loss over training that helps determine how well the model fits the data, and whether the model overfits, underfits, or converges. 
  • Model: The learned representation produced by a machine learning system, capturing the patterns it has learnt from the input data supplied to it. 
  • Model capacity: A measure of how complicated the problems are that the machine learning system can learn. The more complicated the problems the system can learn, the higher its capacity. The capacity of a model generally increases with the number of its parameters. 
  • Model training: The process of determining the parameter values, and more broadly the best model, that yield good predictions on the data. 
  • Model function: A method of implementing the processes involved in machine learning, including training, evaluation and inference. 
  • Noise: Something that shouldn’t be part of the dataset but is, preventing effective training. It could arise because a device mis-measured something or a human made a mistake while recording values or labelling the data. 
  • Normalization: The process of converting a range of values into a standardized range, such as -1 to +1 or 0 to 1. Data can be normalized with subtraction and division, for example by subtracting the mean and dividing by the standard deviation. 
  • Numerical data: Features in a dataset that are represented as integers or real-valued numbers; they are also known as continuous features. When a feature is represented as a numerical value, it can be inferred that the values have a mathematical relationship with each other. 
  • Objective: The goal the machine learning algorithm is trying to achieve. It can also be defined as the metric being optimized so that the algorithm yields better predictions. 
  • Outliers: Values that don’t fall in range with the other values present in the same dataset. 
  • Parameter: A variable of the model whose value the machine learning system learns during training, such as a weight. 
  • Pipeline: The infrastructure surrounding a machine learning algorithm. This includes gathering data, placing the data into separate files, training on this data, and exporting the obtained models into a production environment. 
  • Prediction: The output of a trained model for a given input. 
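The batch size and epoch entries above can be illustrated with a short sketch. The helper names below (`batches_per_epoch`, `make_batches`) are hypothetical, chosen just for this example:

```python
def batches_per_epoch(num_examples, batch_size):
    """Number of batches needed to cover the dataset once (last batch may be smaller)."""
    return -(-num_examples // batch_size)  # ceiling division

def make_batches(dataset, batch_size):
    """Split a list of examples into consecutive mini-batches."""
    return [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]

dataset = list(range(10))  # 10 toy examples
print(batches_per_epoch(len(dataset), 3))   # → 4   (batches of 3, 3, 3, 1)
print(make_batches(dataset, 3))             # → [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

Processing all four batches once constitutes one epoch.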
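Normalization, as described above, can be sketched in a few lines. Both helpers below are hypothetical illustrations, not functions from any library:

```python
def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

print(min_max_normalize([10, 20, 30]))  # → [0.0, 0.5, 1.0]
```

After standardization the values have zero mean, which keeps features on comparable scales during training.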
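The loss and ground truth entries can be made concrete with mean squared error, one common choice of loss function (the function name here is illustrative):

```python
def mse_loss(predictions, ground_truth):
    """Mean squared error: average squared difference between predicted and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, ground_truth)) / len(predictions)

# A model that matches the ground truth exactly has zero loss;
# larger deviations from the true values give a larger loss.
print(mse_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # → 0.0
print(mse_loss([2.0, 3.0], [1.0, 3.0]))            # → 0.5
```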
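Categorical features usually need to be converted to numbers before training; one-hot encoding is a common way to do this, sketched below with hypothetical names:

```python
def one_hot(value, categories):
    """Encode a categorical value as a list with a 1 at its category's position."""
    return [1 if value == c else 0 for c in categories]

colours = ["red", "green", "blue"]
print(one_hot("green", colours))  # → [0, 1, 0]
```

Each category gets its own position, so no artificial numerical ordering is imposed on the values.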

Conclusion

In this post, we covered the various Machine Learning terminologies used in the world of ML. 
