ML questions - SoojungHong/MachineLearning GitHub Wiki

A machine learning interview is not a pop quiz, and you should know what to expect going in. Here are 50 common questions you might be asked in the interview.

What is the difference between inductive machine learning and deductive machine learning?
How will you know which machine learning algorithm to choose for your classification problem?
What is the difference between data mining and machine learning?
What is ‘overfitting’ in machine learning?
Why does overfitting happen?
How can you avoid overfitting?
Is rotation necessary in PCA? If yes, why? What will happen if you don’t rotate the components?
You are given a data set. The data set has missing values that spread within 1 standard deviation of the median. What percentage of the data would remain unaffected? Why?
Why is the Naïve Bayes machine learning algorithm naïve?
How will you explain machine learning to a layperson?
What is inductive machine learning?
What are the different algorithm techniques in machine learning?
List out some important methods of reducing dimensionality.
Explain prior probability, likelihood, and marginal likelihood in the context of the Naïve Bayes algorithm.
What are the three stages of building a hypothesis or model in machine learning?
What is the standard approach to supervised learning?
What are the ‘training set’ and the ‘test set’?
List various approaches to machine learning.
How do you know that your model is suffering from low bias and high variance? Which algorithm should you use to tackle it? Why?
How is kNN different from k-means clustering?
Name some feature extraction techniques used for dimensionality reduction.
List some use cases where classification machine learning algorithms can be used.
What kind of problems does regularization solve?
How much data will you allocate for your training, validation and test sets?
Which one would you prefer to choose – model accuracy or model performance?
What is the most frequently used metric to assess model accuracy for classification problems?
Describe some popular machine learning methods.
What is not Machine Learning?
What is the function of ‘unsupervised learning’?
When will you use classification over regression?
How will you differentiate between supervised and unsupervised learning? Give a few examples of algorithms for supervised learning.
Explain the tradeoff between bias and variance in a regression problem.
What is linear regression? Why is it called linear?
How does the variance of the error term change with the number of predictors, in OLS?
Do we always need the intercept term? When do we need it and when do we not?
How interpretable is the given machine learning model?
What will you do if training results in very low accuracy?
Does the developed machine learning model have convergence problems?
Which tools and environments have you used to train and assess machine learning models?
How will you apply machine learning to images?
What is collinearity, and what can you do about it?
How do you remove multicollinearity?
What does it mean to overfit a regression model? What are ways to avoid it?
What is a loss function in a neural network?
Explain the difference between MLE and MAP inference.
What is boosting?
If gradient descent does not converge, what could be the problem?
How will you check for a valid binary search tree?
How do you check whether a regression model fits the data well?
Describe some of the different splitting rules used by different decision tree algorithms.
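
The binary search tree question in the list above is the one that is most directly a coding exercise, so here is a minimal sketch of one common way to approach it. It assumes a simple `Node` class (hypothetical, defined only for this illustration) and that duplicate values are not allowed. Each node is checked against the lower and upper bounds inherited from its ancestors, rather than only against its immediate children.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type, introduced only for this illustration.
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_valid_bst(node, low=None, high=None):
    """Return True if every value lies strictly between its inherited bounds."""
    if node is None:
        return True  # an empty subtree is trivially valid
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value >= high:
        return False
    # Left subtree must stay below this value, right subtree above it.
    return (is_valid_bst(node.left, low, node.value)
            and is_valid_bst(node.right, node.value, high))


valid = Node(5, Node(3, Node(1), Node(4)), Node(8))
# 6 sits in the left subtree of 5, so this tree is not a valid BST,
# even though every node is correctly ordered relative to its own children.
invalid = Node(5, Node(3, Node(1), Node(6)), Node(8))

print(is_valid_bst(valid))
print(is_valid_bst(invalid))
```

Passing bounds down the recursion is what catches the second tree; comparing each node only with its direct children would wrongly accept it. An in-order traversal that checks for strictly increasing values is an equally reasonable alternative.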