The most reliable way to master Data Science is to practice extensively on data-related problems, and publicly available data sets are a great way to do so. The sets below are arranged roughly by the knowledge level needed to attempt them:
Iris Data Set: One of the most popular, versatile and beginner-friendly data sets, the Iris Data Set is a standard introduction to pattern recognition. The set has 150 rows (50 for each of three flower classes) and 4 columns: sepal length, sepal width, petal length and petal width. Practice Problem: Predict the class of a flower on the basis of these four measurements.
Loan Prediction Data Set: The banking sector relies heavily on data analytics and data science methodologies. The Loan Prediction data set introduces the concepts used in this industry. The data set has 615 rows and 13 columns. It is a classification problem data set. Practice Problem: Predict if a given loan will be approved by the bank or not.
Bigmart Sales Data Set: The retail sector is another industry that uses data analytics extensively. Data Science makes store management easier and more efficient. The Bigmart Sales Data Set has 8523 rows and 12 variables. Practice Problem: Predict the sales of a retail store.
Black Friday Data Set: This data set consists of sales transactions captured at a retail store. It helps in understanding the shopping behavior of a very large number of customers. The set has 550,069 rows and 12 columns. Practice Problem: Predict the purchase amount.
Human Activity Recognition Data Set: This set is built from recordings of 30 human subjects, captured using smartphones. It has 10,299 rows and 561 columns. Practice Problem: Predict the human activity category.
Text Mining Data Set: This data set consists of aviation safety reports that describe problems encountered on flights. It is a high-dimensional set with 21,519 rows and 30,438 columns. Practice Problem: Classify the documents on the basis of their labels.
Urban Sound Classification: The Urban Sound Classification data set gives practice in applying Machine Learning to audio files. It has 8732 sound clips categorized into 10 classes. Practice Problem: Classify the type of sound heard in a given audio clip.
Identify the Digits Data Set: With 7000 images of 28x28 pixels each, this data set helps in studying, analyzing and recognizing the elements in an image. Practice Problem: Identify the digit present in a given image.
Vox Celebrity Data Set: This is another data set that deals with audio processing. It is meant for speaker identification on a large scale and contains about 100,000 spoken utterances. Practice Problem: Identify the celebrity that a given voice belongs to.
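To make the first practice problem concrete, here is a minimal Python sketch of predicting an Iris flower's class from its four measurements. It is only an illustration under stated assumptions: the nine training rows are hand-typed in the Iris column layout rather than loaded from the full 150-row file, the nearest-centroid method is just one simple choice among many classifiers, and the names TRAIN, fit_centroids and predict are hypothetical helpers invented for this example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Assumption: a small hand-typed sample in the Iris layout
# (sepal length, sepal width, petal length, petal width, all in cm).
# A real attempt would load all 150 rows instead.
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def fit_centroids(rows):
    """Return the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(total / counts[label] for total in acc)
        for label, acc in sums.items()
    }


def predict(centroids, features):
    """Pick the class whose centroid is closest to the given measurements."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAIN)
print(predict(centroids, (5.0, 3.4, 1.5, 0.2)))
```

The same pattern (load rows, fit a model, predict a label) carries over to the other classification problems in the list, although the larger sets would normally call for a dedicated machine learning library and a held-out test split.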