Data Cleaning and Management in Azure ML Studio

This module gives an overview of how to import data, general approaches to cleaning it, how to choose a model, and how to improve predictions.

Lesson 4 starts with a discussion of how to access data and then shows how to import it into Azure ML Studio. The audience becomes familiar with the Import Data module and sees how to get data from the web or from Blob storage. Next, the lesson shows how to join two datasets with the Join Data module, how to use the Apply SQL Transformation module to join more datasets, and, more generally, how to write code in the SQL editor in Azure ML Studio. Finally, it shows how to enter data manually into Azure ML Studio.

Lesson 5 first discusses how much data is needed to avoid overfitting and underfitting, and explains both concepts. It then covers variance and bias. The last part shows some other features of Azure ML Studio.

Lesson 6 demonstrates basic data cleaning techniques: handling missing values, and detecting outliers and removing them by clipping values. It also discusses and demonstrates how to normalize data and why normalization is needed.

Lesson 7 discusses how to choose a machine learning model and presents the related Azure ML cheat sheet. It also covers best practice for applying multiple machine learning models to one dataset, explains the process of feature selection, and shows the different feature selection approaches available in Azure ML Studio. Finally, it introduces how to evaluate a model.

Lesson 8, the last lesson of this module, gives a brief discussion of how to evaluate models in Azure ML Studio, including how to compare more than two models. It also discusses how to interpret evaluation results for classification and regression.
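
As a rough illustration of the cleaning steps covered in Lesson 6, the sketch below fills a missing value, clips an outlier to fixed thresholds, and rescales the result to the range 0 to 1. It is plain Python, not the Azure ML Studio modules themselves. The function names, the sample `ages` column, the clipping thresholds, and the choice of the median as the fill value are all assumptions made for this example.

```python
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)  # assumed strategy; mean or mode are alternatives
    return [fill if v is None else v for v in values]


def clip(values, lower, upper):
    """Limit every value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every entry to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column with one missing entry and one obvious outlier.
ages = [20.0, None, 40.0, 400.0]

filled = fill_missing(ages)         # missing entry replaced by the median
clipped = clip(filled, 0.0, 100.0)  # outlier 400.0 pulled down to 100.0
scaled = min_max_normalize(clipped)

print(filled)
print(clipped)
print(scaled)
```

The order matters in this sketch: clipping before normalization keeps a single extreme value from compressing all the other values toward 0.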
