Week1 - Raymkindo/Weeklyreport GitHub Wiki

Machine learning is the branch of artificial intelligence that gives computers the ability to learn from data. In machine learning we work with data and with the algorithms that help the machine learn from it. There are several types of machine learning algorithms, but the main ones are:

• Supervised Machine Learning
• Unsupervised Machine Learning
• Reinforcement Machine Learning

Supervised and unsupervised machine learning can solve many kinds of social problems in our community, and can often make computation easier, faster, and more accurate than hard-coded programming. Examples of problems that machine learning can solve are:

• Detecting fraud in transactions
• Detecting tumors based on medical data
• Predicting which products on an e-commerce website, or which movies on a movie website, a user may want
• Identifying the topics of blog posts on a website or blog

NOTE:

All of these problems can be solved only with well-prepared data that the computer can understand. Machine learning depends on both data and algorithms, and the two go hand in hand. Poor data preparation or a poorly chosen algorithm can cause the machine's predictions and outcomes to fail. So it is important to understand the data you are going to use, because that understanding helps you choose the right algorithms for your project.

**Supervised Machine Learning.** This deals with pairs of input and output data, which are used to determine or predict future results. As the word "supervised" suggests, it needs something to guide it, like a teacher supervising what it is trained on and how. The other concept I grasped is that supervised machine learning is divided into two types:

• Classification
• Regression

The goal of classification is to predict a class label chosen from a predefined list of possibilities. Binary classification tries to answer yes/no questions. Examples of problems that can be solved with classification are:

• Fraud Detection
• Image Classification
• Customer Retention
• Diagnostics

The goal of regression, by contrast, is to predict a continuous number (a floating-point number, in programming terms). The question is: how do you distinguish between classification and regression problems when you are dealing with supervised machine learning? The idea is to ask yourself whether there is continuity in the output. If there is, it is a regression problem; otherwise it is a classification problem.
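To make the difference concrete, here is a minimal sketch in plain Python. The data, the function names, and the choice of a nearest-neighbor rule and a least-squares line are all my own assumptions for illustration, not something taken from a particular library or course. The first model returns a label from a fixed list (classification); the second returns a continuous number (regression).

```python
# Toy supervised learning: every training example is an (input, output) pair.
# All data and names below are hypothetical, chosen only for illustration.

def nearest_neighbor_predict(train_pairs, x):
    """Return the output of the training example whose input is closest to x."""
    _, closest_output = min(train_pairs, key=lambda pair: abs(pair[0] - x))
    return closest_output

def fit_line(train_pairs):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(train_pairs)
    mean_x = sum(x for x, _ in train_pairs) / n
    mean_y = sum(y for _, y in train_pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train_pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in train_pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Classification: the output is a label from a predefined list ("ok" / "fraud").
# Made-up data: transaction amount -> label.
transactions = [(12.0, "ok"), (30.0, "ok"), (45.0, "ok"),
                (900.0, "fraud"), (1500.0, "fraud")]
label = nearest_neighbor_predict(transactions, 1100.0)

# Regression: the output is a continuous number.
# Made-up data: house size in square meters -> price in thousands.
houses = [(50.0, 100.0), (70.0, 140.0), (100.0, 200.0)]
slope, intercept = fit_line(houses)
price = slope * 80.0 + intercept

print("predicted label:", label)
print("predicted price:", price)
```

The only thing that changes between the two cases is the kind of output: a label in the first, a number in the second.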

Another point to note is a basic understanding of generalization, overfitting, and underfitting.

**Generalization**

This is when the model is able to make accurate predictions on unseen data.
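One common way to check generalization is to hold back part of the data and measure accuracy only on that unseen part. The sketch below assumes a made-up dataset and a deliberately simple threshold "model"; the helper names `train_test_split` and `accuracy` are written here by hand and are not calls into any library.

```python
import random

def train_test_split(pairs, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold out a fraction as unseen test data."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, pairs):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(1 for x, y in pairs if predict(x) == y) / len(pairs)

# Hypothetical labeled data: a number and whether it counts as "high" or "low".
data = [(x, "high" if x >= 50 else "low") for x in range(0, 100, 5)]
train, test = train_test_split(data)

# A very simple model: put a threshold midway between the largest "low"
# input and the smallest "high" input seen in the training data.
max_low = max(x for x, y in train if y == "low")
min_high = min(x for x, y in train if y == "high")
threshold = (max_low + min_high) / 2

def predict(x):
    return "high" if x >= threshold else "low"

print("train accuracy:", accuracy(predict, train))
print("test accuracy: ", accuracy(predict, test))
```

The test accuracy is the number that says something about generalization, because the model never saw those examples while the threshold was being chosen.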

**Overfitting**

This occurs when a statistical model or machine learning algorithm captures the noise in the data. A model or algorithm that shows low bias but high variance is overfitting. This is the result of an excessively complicated model.

Overfitting can be guarded against by fitting multiple models and using _validation or cross-validation_ to compare their predictive accuracy on held-out data.
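As a rough sketch of how cross-validation compares models, the plain-Python example below splits the data into k folds, trains on k-1 of them, and measures the error on the fold left out. The two candidate models (predicting the mean, and a least-squares line), the made-up data, and all function names are assumptions for illustration only.

```python
def k_fold_splits(n_examples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_examples))
    sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        yield indices[:start] + indices[start + size:], indices[start:start + size]
        start += size

def mean_squared_error(predict, pairs):
    return sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs)

def cross_validate(fit, pairs, k=5):
    """Average validation error of a model-fitting function over k folds."""
    errors = []
    for train_idx, val_idx in k_fold_splits(len(pairs), k):
        model = fit([pairs[i] for i in train_idx])
        errors.append(mean_squared_error(model, [pairs[i] for i in val_idx]))
    return sum(errors) / len(errors)

# Two hypothetical candidate models; each fit_* returns a predict function.
def fit_mean(train):
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)

# Made-up data: y = 2x plus a small alternating disturbance.
data = [(float(x), 2.0 * x + (1.0 if x % 2 == 0 else -1.0)) for x in range(20)]

scores = {"mean": cross_validate(fit_mean, data),
          "line": cross_validate(fit_line, data)}
best = min(scores, key=scores.get)
print(scores, "-> best:", best)
```

In practice the examples are usually shuffled before they are split into folds; the folds here are left in order only to keep the sketch short.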

**Underfitting**

This occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data, that is, when the model does not fit the data well enough. It has low variance and high bias.

NOTE:

_Both overfitting and underfitting lead to poor predictions on new datasets._
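The sketch below tries to show both failure modes on the same made-up noisy data. It uses three hand-written models that are my own choices for illustration: a constant "mean" model (too simple, so likely to underfit), a nearest-neighbor model that memorizes every training point including its noise (likely to overfit), and a straight line that matches the true trend. Comparing the training error with the test error is the point of the exercise.

```python
import random

def mse(predict, pairs):
    """Mean squared error of a predict function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs)

def fit_mean(train):
    # High bias, low variance: ignores x entirely.
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_nearest_neighbor(train):
    # Low bias, high variance: memorizes every training point, noise included.
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]

def fit_line(train):
    # Least-squares line, which matches the shape of the true trend here.
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)

# Hypothetical data: y = 2x plus Gaussian noise, with a fixed seed.
rng = random.Random(0)
points = [(i * 0.5, 2.0 * (i * 0.5) + rng.gauss(0.0, 1.0)) for i in range(40)]
train, test = points[0::2], points[1::2]  # every other point is held out

results = {}
for name, fit in [("mean", fit_mean),
                  ("nearest neighbor", fit_nearest_neighbor),
                  ("line", fit_line)]:
    model = fit(train)
    results[name] = (mse(model, train), mse(model, test))
    print(f"{name:17s} train MSE = {results[name][0]:8.3f}   "
          f"test MSE = {results[name][1]:8.3f}")
```

The nearest-neighbor model always has zero error on its own training data, so a gap between its training and test error is the sign of overfitting to look for. The mean model has a large error on both sets, which is the sign of underfitting.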

This is what I have covered so far!