ML questions - SoojungHong/MachineLearning GitHub Wiki

A machine learning interview is not a pop quiz, and you should know what to expect going in. Here are 50 top questions you might be asked in the interview.

  • What is the difference between inductive machine learning and deductive machine learning?
  • How will you know which machine learning algorithm to choose for your classification problem?
  • What is the difference between data mining and machine learning?
  • What is ‘overfitting’ in machine learning?
  • Why does overfitting happen?
  • How can you avoid overfitting?
  • Is rotation necessary in PCA? If yes, Why? What will happen if you don’t rotate the components?
  • You are given a data set. The data set has missing values that are spread within 1 standard deviation of the median. What percentage of the data would remain unaffected? Why?
  • Why is the Naïve Bayes machine learning algorithm naïve?
  • How will you explain machine learning to a layperson?
  • What is inductive machine learning?
  • What are the different algorithm techniques in machine learning?
  • List some important methods for reducing dimensionality.
  • Explain prior probability, likelihood, and marginal likelihood in the context of the Naïve Bayes algorithm.
  • What are the three stages of building a hypothesis or model in machine learning?
  • What is the standard approach to supervised learning?
  • What are the ‘training set’ and the ‘test set’?
  • List various approaches to machine learning.
  • How do you know that your model is suffering from low bias and high variance? Which algorithm should you use to tackle it? Why?
  • How is kNN different from k-means clustering?
  • Name some feature extraction techniques used for dimensionality reduction.
  • List some use cases where classification machine learning algorithms can be used.
  • What kind of problems does regularization solve?
  • How much data will you allocate for your training, validation and test sets?
  • Which would you prefer: model accuracy or model performance?
  • What is the most frequently used metric to assess model accuracy for classification problems?
  • Describe some popular machine learning methods.
  • What is not Machine Learning?
  • What is the function of ‘unsupervised learning’?
  • When will you use classification over regression?
  • How will you differentiate between supervised and unsupervised learning? Give a few examples of supervised learning algorithms.
  • Explain the tradeoff between bias and variance in a regression problem.
  • What is linear regression? Why is it called linear?
  • In OLS, how does the variance of the error term change with the number of predictors?
  • Do we always need the intercept term? When do we need it and when do we not?
  • How interpretable is the given machine learning model?
  • What will you do if training results in very low accuracy?
  • Does the developed machine learning model have convergence problems?
  • Which tools and environments have you used to train and assess machine learning models?
  • How will you apply machine learning to images?
  • What is collinearity, and what can you do about it?
  • How do you remove multicollinearity?
  • What does it mean to overfit a regression model? What are ways to avoid it?
  • What is a loss function in a neural network?
  • Explain the difference between MLE and MAP inference.
  • What is boosting?
  • If gradient descent does not converge, what could be the problem?
  • How will you check for a valid binary search tree?
  • How do you check whether a regression model fits the data well?
  • Describe some of the different splitting rules used by different decision tree algorithms.
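
The list includes one coding question: how to check for a valid binary search tree. A minimal sketch of one common approach follows, assuming keys must be strictly increasing from left to right (no duplicates). The `Node` class and the `is_valid_bst` function are hypothetical names introduced here for illustration; they do not come from this wiki.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type, for illustration only.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_valid_bst(node, low=None, high=None):
    """Return True if every key lies strictly between the bounds
    inherited from its ancestors (assumption: no duplicate keys)."""
    if node is None:
        return True  # an empty subtree is trivially valid
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Left subtree keys must stay below node.key, right subtree keys above it.
    return (is_valid_bst(node.left, low, node.key)
            and is_valid_bst(node.right, node.key, high))
```

Passing the bounds down matters: comparing each node only with its direct children would wrongly accept a tree in which, say, 6 sits somewhere in the left subtree of 5.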