machine learning - bobbae/gcp GitHub Wiki
Machine learning (ML) is the study of computer algorithms that improve automatically through experience and data.
Machine learning is an application of Artificial Intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves.
https://www.youtube.com/watch?v=9MWj__4s9hk&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie
https://cloud.google.com/architecture/guidelines-for-developing-high-quality-ml-solutions
https://github.com/collections/machine-learning
https://www.youtube.com/watch?v=QR_LQQ-vvko
Vertex AI brings AutoML and AI Platform together into a unified API, client library, and user interface.
AI Hub is a platform that lets us centralize our code and knowledge in a way that can step up the pace of deployment and learnings globally.
AI Platform is a development platform to build AI applications that run on GCP and on-premises.
AutoML lets you train high-quality custom machine learning models with minimal effort and machine learning expertise.
BigQuery ML lets you create and execute machine learning models in BigQuery using standard SQL queries.
AutoML can be used to create your own custom machine learning models that are tailored to your business needs, and then integrate those models into your applications.
AI Platform enables many parts of the machine learning (ML) workflow.
https://cloud.google.com/ai-platform/docs/ml-solutions-overview
https://www.youtube.com/watch?v=pm_-pVPvZ-4
https://www.youtube.com/watch?v=m0rqccviLNM
https://www.youtube.com/watch?v=OHIEZ-Scek8
https://www.youtube.com/watch?v=ieaqfU1BwJ8
https://www.youtube.com/watch?v=CReeC8YuEd8
Cloud Vision includes several options that you can use to integrate machine learning vision models into your applications.
https://www.youtube.com/watch?v=kgxfdTh9lz0
https://www.youtube.com/watch?v=BN8aO0LULyw
Video Intelligence includes several options that you can use to integrate machine learning video intelligence models into your applications.
https://www.youtube.com/watch?v=h1zU0Qor9J8
The Cloud Natural Language API provides natural language understanding technologies to developers, including sentiment analysis, entity analysis, entity sentiment analysis, content classification, and syntax analysis.
https://cloud.google.com/natural-language/docs
https://www.youtube.com/watch?v=UFtXy0KRxVI
Qwiklabs GSP097
https://www.qwiklabs.com/focuses/582?parent=catalog
https://www.youtube.com/watch?v=3iOtK0sRNMI
https://www.youtube.com/watch?v=MNvT5JekDpg
Cloud Translation can dynamically translate text between thousands of language pairs.
The Translation API covers a huge number of language pairs and does a great job with general-purpose text. Where AutoML Translation really shines is for the "last mile" between generic translation tasks and specific, niche vocabularies.
https://www.youtube.com/watch?v=YapTts_An9A
Text-to-Speech converts text or Speech Synthesis Markup Language (SSML) input into audio data of natural human speech.
https://www.youtube.com/watch?v=OK1ZmlaFIV8
https://cloud.google.com/speech-to-text/docs
https://www.youtube.com/watch?v=naZ8oEKuR44
https://cloud.google.com/blog/products/ai-machine-learning/top-google-cloud-speech-api-codelabs
Google AutoML Natural Language is more flexible than the Natural Language API because it allows the user to train models that are customized for their specific dataset and domain.
The Google Natural Language API is an easy to use interface to a set of powerful NLP models which have been pre-trained.
The major advantage of the Google Natural Language API is its ease of use. No machine learning skills are required and almost no coding skills.
The Google Natural Language API is a very convenient option for quick, out-of-the-box solutions.
If the Natural Language API is not flexible enough for your business purposes, then AutoML Natural Language might be the right service.
Step 1, read this comicbook.
Step 2, head over to this tutorial.
Step 3, look at these videos: https://www.youtube.com/playlist?list=PLblh5JKOoLUICTaGLRoHQDuF_7q2GfuJF
Step 4, go through ML learning materials
https://hackernoon.com/where-to-learn-machine-and-deep-learning-for-free
https://github.com/eugeneyan/applied-ml
https://github.com/microsoft/ML-For-Beginners
https://youtube.com/channel/UC12LqyqTQYbXatYS9AA7Nuw
A Data Scientist models and analyzes key data and continually improves the way the business utilizes data. Data Scientists aim to make accurate predictions about the future using in-depth data modeling and deep learning.
https://en.wikipedia.org/wiki/Predictive_analytics
Gather data, prepare data, choose the model, train the model, evaluate, tune parameters, review prediction or inference.
https://towardsdatascience.com/the-7-steps-of-machine-learning-2877d7e5548e
https://www.youtube.com/watch?v=nKW8Ndu7Mjw
https://datasetsearch.research.google.com/
A supervised machine learning algorithm (as opposed to an unsupervised machine learning algorithm) is one that relies on labeled input data to learn a function that produces an appropriate output when given new unlabeled data.
https://towardsdatascience.com/supervised-vs-unsupervised-learning-14f68e32ea8d
The most common tasks within unsupervised learning are clustering, representation learning, and density estimation. In all of these cases, we wish to learn the inherent structure of our data without using explicitly-provided labels. Some common algorithms include k-means clustering, principal component analysis, and autoencoders. Since no labels are provided, there is no specific way to compare model performance in most unsupervised learning methods.
Two common use-cases for unsupervised learning are exploratory analysis and dimensionality reduction.
In situations where it is either impossible or impractical for a human to propose trends in the data, unsupervised learning can provide initial insights that can then be used to test individual hypotheses.
Dimensionality reduction, which refers to the methods used to represent data using fewer columns or features, can be accomplished through unsupervised methods. In representation learning, we wish to learn relationships between individual features, allowing us to represent our data using the latent features that interrelate our initial features. This sparse latent structure is often represented using far fewer features than we started with, so it can make further data processing much less intensive, and can eliminate redundant features.
Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data.
https://www.tensorflow.org/tfx/tutorials/tfx/penguin_tft
- https://www.kdnuggets.com/2018/12/feature-engineering-explained.html
- https://towardsdatascience.com/feature-engineering-for-machine-learning-3a5e293a5114
- https://towardsdatascience.com/feature-engineering-in-machine-learning-23b338ea48f4
In digital circuits and machine learning, a one-hot is a group of bits among which the legal combinations of values are only those with a single high (1) bit and all the others low (0).
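As a minimal sketch of one-hot encoding a categorical feature, the snippet below maps each category to a vector with exactly one 1. The helper name `one_hot` and the `colors` categories are made up for illustration and do not come from any particular library.

```python
def one_hot(value, categories):
    """Return a list with a single 1 at the position of `value`, 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if c == value else 0 for c in categories]


# Hypothetical categorical feature with three possible values.
colors = ["red", "green", "blue"]
encoded = [one_hot(c, colors) for c in ["green", "blue", "red"]]
print(encoded)  # [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each encoded row has exactly one high bit, which is the "legal combination" described above.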
https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/
https://hackernoon.com/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f
One-shot learning is a classification task where one, or just a few, examples are used to classify many new examples in the future.
https://en.wikipedia.org/wiki/One-shot_learning
Binning (also called bucketing) is the process of converting a continuous feature into multiple binary features called bins or buckets, typically based on value range.
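A small sketch of binning, assuming the bucket boundaries are chosen by hand. The `age_edges` values and the helper names are illustrative assumptions, not a standard API.

```python
import bisect


def bin_index(value, edges):
    """Return the index of the bucket `value` falls into.

    `edges` is a sorted list of boundaries; a value equal to a boundary
    goes into the bucket that starts at that boundary.
    """
    return bisect.bisect_right(edges, value)


def to_bins(value, edges):
    """Convert a continuous value into one binary feature per bucket."""
    idx = bin_index(value, edges)
    return [1 if i == idx else 0 for i in range(len(edges) + 1)]


# Hypothetical age buckets: <18, 18-34, 35-64, >=65
age_edges = [18, 35, 65]
print(to_bins(42, age_edges))  # [0, 0, 1, 0]
```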
https://towardsdatascience.com/binning-for-feature-engineering-in-machine-learning-d3b3d76f364a
Normalization is the process of converting the actual range of values that a numerical feature can take into a standard range of values, typically the interval [-1, 1] or [0, 1].
By normalizing all of our inputs to a standard scale, we're allowing the network to more quickly learn the optimal parameters for each input node.
Standardization (or z-score normalization) is the procedure during which the feature values are rescaled so that they have the properties of a standard normal distribution.
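The two rescaling approaches can be sketched in a few lines of plain Python. This is an illustration under simple assumptions (a single numeric column held in a list, population standard deviation), and the `heights_cm` data is invented.

```python
import statistics


def min_max_normalize(values, low=0.0, high=1.0):
    """Rescale values linearly into the interval [low, high]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant feature")
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit standard deviation (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mean) / std for v in values]


heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
print(min_max_normalize(heights_cm))  # [0.0, 0.25, 0.5, 0.75, 1.0]
z_scores = standardize(heights_cm)
```

After `standardize`, the values have mean 0 and standard deviation 1, which is what "the properties of a standard normal distribution" refers to.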
In some cases, the data comes to the analyst in the form of a dataset with features already defined. In some examples, values of some features can be missing.
https://towardsdatascience.com/7-ways-to-handle-missing-values-in-machine-learning-1a6326adf79e
One technique consists in replacing the missing value of a feature by an average value of this feature in the dataset.
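A minimal sketch of mean imputation, assuming missing values are represented as `None` in a Python list. The `ages` column is invented for illustration.

```python
import statistics


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in column]


ages = [20.0, None, 40.0, None, 30.0]
print(impute_mean(ages))  # [20.0, 30.0, 40.0, 30.0, 30.0]
```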
Once you have got your annotated dataset, you can split the dataset into three subsets: training, validation, and test.
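One way to sketch the three-way split is to shuffle once and slice. The 70/15/15 proportions and the fixed seed are assumptions chosen for the example, not a recommendation from the sources linked here.

```python
import random


def train_val_test_split(examples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle the examples and cut them into train, validation and test subsets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The subsets are disjoint, so no example used for training leaks into validation or test.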
https://machinelearningmastery.com/overfitting-and-underfitting-with-machine-learning-algorithms/
L1 and L2 regularization methods are also combined in what is called elastic net regularization with L1 and L2 regularizations being special cases.
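Written out, the elastic net objective adds both penalties to the training loss. The symbols below ($L$ for the unregularized loss, $w$ for the weights, $\lambda_1$ and $\lambda_2$ for the penalty strengths) are notation assumed for this sketch.

```latex
\min_{w} \; L(w) + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
```

Setting $\lambda_2 = 0$ recovers L1 (lasso) regularization, and setting $\lambda_1 = 0$ recovers L2 (ridge) regularization.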
https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c
Once you have a model built using the training set, how can you say how good the model is? You use the test set to assess the model.
https://heartbeat.fritz.ai/introduction-to-machine-learning-model-evaluation-fa859e1b2d7f
Accuracy is not necessarily a relevant or good way of evaluating a model. Accuracy is given by the number of correctly classified examples divided by the total number of classified examples.
Accuracy may be useful when errors in predicting all classes are equally important. In the case of spam/not spam this may not hold: you would tolerate false positives less than false negatives. A false positive may mean you miss an important email, whereas a false negative is no big deal, even though receiving spam is annoying.
Accuracy is not useful when the classes are not equally important or are heavily imbalanced. Predicting clicks from a click stream is an example: there are very few real clicks per rendered page, so "no click" is the norm. In that case, a model that is 99.999% accurate can be created by returning "no click" as the answer every time.
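The imbalance problem can be made concrete with a toy click log. The numbers here (1 click in 1000 impressions) are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical click log: 1 click (label 1) among 1000 rendered pages.
y_true = [1] + [0] * 999
always_no_click = [0] * 1000  # a "model" that never predicts a click

print(accuracy(y_true, always_no_click))  # 0.999
```

The score looks excellent even though this model never finds a single click.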
https://en.wikipedia.org/wiki/Accuracy_and_precision
https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff
Confusion Matrix is a table that summarizes how successful the classification model is at predicting examples belonging to various classes.
Confusion Matrices can be used to calculate two important performance metrics: precision and recall.
The two most frequently used metrics to assess the model are precision and recall. Precision is the ratio of correct positive predictions to the overall number of positive predictions. Recall is the ratio of correct positive predictions to the overall number of positive examples in the test set.
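A short sketch of how precision and recall fall out of the four confusion-matrix counts for a binary classifier. The labels and predictions are made up.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Correct positive predictions divided by all positive predictions."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Correct positive predictions divided by all actual positives."""
    return tp / (tp + fn) if tp + fn else 0.0


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)                    # 3 1 1 3
print(precision(tp, fp), recall(tp, fn)) # 0.75 0.75
```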
https://www.youtube.com/watch?v=j-EB6RqqjGI&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=3
https://towardsdatascience.com/beyond-accuracy-precision-and-recall-3da06bea9f6c
Imagine that you are given an image and asked to detect all the cars within it. Which metric do you use? Because the goal is to detect all the cars, use recall. This may misclassify some objects as cars, but it eventually will work towards detecting all the target objects.
Now say you're given a mammography image and asked to detect whether cancer is present. Which metric do you use? In this framing the error of concern is a false positive, that is, incorrectly labeling an image as cancerous. We must therefore be sure before classifying an image as Positive (i.e. has cancer), so precision is the preferred metric.
F-measure is the harmonic mean of Precision and Recall and gives a better measure of the incorrectly classified cases than the Accuracy Metric.
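As a sketch, the F-measure can be computed directly from precision and recall. The `beta` parameter generalizes F1 to the F-beta family; the input values are arbitrary.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# A model with perfect recall but poor precision.
f1 = f_beta(0.5, 1.0)
print(f1)  # about 0.667, lower than the arithmetic mean of 0.75
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot score well on F1 by maximizing only one of precision or recall.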
https://machinelearningmastery.com/precision-recall-and-f-measure-for-imbalanced-classification/
https://towardsdatascience.com/accuracy-precision-recall-or-f1-331fb37c5cb9
Log-loss is indicative of how close the prediction probability is to the corresponding actual/true value (0 or 1 in case of binary classification). The more the predicted probability diverges from the actual value, the higher is the log-loss value.
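A minimal sketch of binary log-loss. The clipping constant `eps` is an assumption added to avoid taking the logarithm of zero, and the example probabilities are made up.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels under the predictions."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


confident_right = log_loss([1, 0], [0.9, 0.1])
confident_wrong = log_loss([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)  # roughly 0.105 versus 2.303
```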
https://towardsdatascience.com/intuition-behind-log-loss-score-4e0c9979680a
https://dzone.com/articles/ml-metrics-sensitivity-vs-specificity-difference
Receiver Operating Characteristic curve and Area Under the Curve use a combination of the true positive rate and false positive rate to build up a summary picture of the model performance.
An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds. This curve plots two parameters: True Positive Rate and False Positive Rate.
AUC provides an aggregate measure of performance across all possible classification thresholds. One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example.
AUC ranges in value from 0 to 1. A model whose predictions are 100% wrong has an AUC of 0.0; one whose predictions are 100% correct has an AUC of 1.0.
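The ranking interpretation of AUC can be sketched by comparing every positive example with every negative example. This quadratic-time version is for illustration only; counting ties as half a win is a common convention assumed here.

```python
def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
print(auc(y_true, [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(auc(y_true, [0.9, 0.8, 0.2, 0.1]))   # 0.0 (every pair ranked the wrong way)
print(auc(y_true, [0.1, 0.2, 0.8, 0.9]))   # 1.0 (every pair ranked correctly)
```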
https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc
https://towardsdatascience.com/intuition-behind-roc-auc-score-1456439d1f30
https://cloud.google.com/vertex-ai/docs/training/evaluating-automl-models
https://cloud.google.com/blog/products/ai-machine-learning/7-tips-for-trouble-free-ml-model-training
Classification is a task that requires the use of machine learning algorithms that learn how to assign a class label to examples from the problem domain. An easy to understand example is classifying emails as “spam” or “not spam.”
Classification algorithms are used when you have a dataset of observations where we'd like to use the features associated with an observation to predict its class.
Bayes' theorem, named after 18th-century British mathematician Thomas Bayes, is a mathematical formula for determining conditional probability. Conditional probability is the likelihood of an outcome occurring, based on a previous outcome occurring.
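A worked example of Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B), for a two-class case. The spam probabilities are invented for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(A|B) from P(A), P(B|A) and P(B|not A), using Bayes' theorem."""
    # Total probability of the evidence B.
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of email is spam; the word "free" appears
# in 60% of spam messages and in 5% of non-spam messages.
p_spam_given_free = posterior(prior=0.2, likelihood=0.6, likelihood_given_not=0.05)
print(p_spam_given_free)  # about 0.75, i.e. 0.12 / (0.12 + 0.04)
```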
https://www.youtube.com/watch?v=HZGCoVF3YvM
https://machinelearningmastery.com/naive-bayes-classifier-scratch-python/
Naive Bayes classification methods are quite simple (in terms of model complexity) and commonly used for tasks such as document classification and spam filtering. This algorithm works well for datasets with a large number of features (e.g. a body of text where every word is treated as a feature), but it is naive in the sense that it treats every feature as independent of the others. This is clearly not the case for language, where word order matters when trying to discern meaning from a statement. Nonetheless, these methods have been used quite successfully for various text classification tasks.
Regression and classification lead to ways of splitting data.
https://en.wikipedia.org/wiki/Regression_analysis
Classification is a problem of automatically assigning a label to an unlabeled example. Spam detection is a famous example of classification.
https://en.wikipedia.org/wiki/Statistical_classification
Linear regression is used to predict an outcome given some input value(s). While machine learning classifiers use features to predict a discrete label for a given instance or example, machine learning regressors use features to predict a continuous outcome for a given instance or example.
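For a single input feature, a least-squares line can be fitted in closed form. The sketch below assumes one feature and noise-free toy data generated from y = 2x + 1.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # toy data on the line y = 2x + 1
slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # continuous outcome for an unseen input
print(slope, intercept, prediction)
```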
https://www.youtube.com/watch?v=K_EH2abOp00&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=2
https://www.youtube.com/watch?v=nk2CQITm_eo
Polynomial regression is very similar to linear regression, with a slight deviation in treatment of the feature-space.
https://www.youtube.com/watch?v=wBVSbVktLIY&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=5
The goal of logistic regression, as with any classifier, is to figure out some way to split the data to allow for an accurate prediction of a given observation's class using the information present in the features.
https://www.youtube.com/watch?v=YMJtsYIp4kg&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=4
https://www.youtube.com/watch?v=yIYKR4sgzI8
Decision trees are one of the oldest and most widely-used machine learning models, due to the fact that they work well with noisy or missing data, can easily be formed as more robust predictors, and are incredibly fast at runtime.
Decision trees are desirable in that they scale well to larger datasets, they are robust against irrelevant features, and it is very easy to visualize the reasoning behind a decision tree's predictions.
Support vector machine classifiers work well in complicated feature domains, albeit requiring clear separation between classes.
SVM is a supervised machine learning model that uses classification algorithms for two-group classification problems.
Compared to newer algorithms like neural networks, they have two main advantages: higher speed and better performance with a limited number of samples.
https://www.youtube.com/watch?v=05VABNfa1ds&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=6
https://monkeylearn.com/blog/introduction-to-support-vector-machines-svm/
SVMs don't work well with noisy data, and the algorithm scales roughly cubically, O(n^3), with the size of the input, depending on your implementation.
Random forests inherit the benefits of a decision tree model whilst improving upon the performance by reducing the variance.
https://www.youtube.com/watch?v=mld0TnA2jEs&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=8
Boosting is an iterative process where models are trained in a sequential order.
Content Classification analyzes a document and returns a list of content categories that apply to the text found in the document.
https://cloud.google.com/natural-language/docs/classifying-text
Using AutoML to classify text.
https://www.youtube.com/watch?v=ieaqfU1BwJ8
The Cloud Data Loss Prevention (DLP) helps you understand, manage, and protect sensitive data. With the Cloud DLP, you can easily classify and redact sensitive data contained in text-based content and images, including content stored in Google Cloud storage repositories.
https://cloud.google.com/dlp/docs/classification-redaction
Clustering is a popular technique to find groups or segments in your data that are similar. This is an unsupervised learning algorithm in the sense that you don't train the algorithm and give it examples for what you'd like it to do, you just let the clustering algorithm explore the data and provide you with new insights.
K-means clustering is a simple method for partitioning n data points in k groups, or clusters.
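A compact sketch of the usual iterative procedure (often called Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its points. Initializing from the first k points and running a fixed number of iterations are simplifying assumptions for this example.

```python
import math


def kmeans(points, k, iterations=10):
    """Partition points into k clusters; returns (centroids, clusters)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

On this toy data the two centroids settle near (1.33, 1.33) and (8.67, 8.67), one for each visible group.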
https://www.youtube.com/watch?v=O6b2L_lYH9k&list=PLTl9hO2Oobd9UuNwS9R5Z6HcTesBMCvie&index=9
https://www.youtube.com/watch?v=4b5d3muPQmA
The k-nearest neighbors (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm that can be used to solve both classification and regression problems.
Dimensionality reduction is used to reduce the dimension of our feature-space while maintaining the maximum amount of information.
Principal components analysis (PCA) allows us to take an n-dimensional feature-space and reduce it to a k-dimensional feature-space while maintaining as much information from the original dataset as possible in the reduced dataset.
Autoencoders are an unsupervised learning technique in which we leverage neural networks for the task of representation learning.
Neural networks are one of the most popular approaches to machine learning today, achieving impressive performance on a large variety of tasks.
https://www.youtube.com/watch?v=fkqZyYo_ebs&list=PLTl9hO2Oobd-GaTYQWIuIs2yyNy7TYbEj&index=1
https://developers.google.com/machine-learning/crash-course/introduction-to-neural-networks/anatomy
Neural networks are biologically inspired algorithms that attempt to mimic the functions of neurons in the brain. Each neuron acts as a computational unit, accepting input from the dendrites and outputting a signal through the axon terminals. Actions are triggered when a specific combination of neurons is activated.
Activation functions are used to determine the firing of neurons in a neural network. Given a linear combination of inputs and weights from the previous layer, the activation function controls how we'll pass that information on to the next layer.
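A small sketch of a single neuron: a weighted sum of inputs plus a bias, passed through an activation function. The weights and inputs are arbitrary values chosen so the weighted sum is -1.

```python
import math


def sigmoid(z):
    """Squash z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Linear combination of inputs and weights, followed by the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


inputs, weights, bias = [1.0, 2.0], [0.5, -1.0], 0.5  # weighted sum is -1.0
print(neuron(inputs, weights, bias, relu))       # 0.0
print(neuron(inputs, weights, bias, sigmoid))    # about 0.269
print(neuron(inputs, weights, bias, math.tanh))  # about -0.762
```

The same weighted sum produces very different outputs depending on the activation, which is what controls how information flows to the next layer.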
Backpropagation computes the gradient of the loss function with respect to the weights of the network for a single input–output example, and does so efficiently, unlike a naive direct computation of the gradient with respect to each weight individually. This efficiency makes it feasible to use gradient methods for training multilayer networks, updating weights to minimize loss; gradient descent, or variants such as stochastic gradient descent, are commonly used.
Gradient descent is an optimization technique commonly used in training machine learning algorithms. Often when we're building a machine learning model, we'll develop a cost function which is capable of measuring how well our model is doing. This function will penalize any error our model makes by assigning a cost with respect to the current parameter values. By minimizing the cost function we can find the optimal parameters that yield the best model performance.
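A minimal sketch of gradient descent on a one-parameter cost function. The cost (w - 3)^2, the starting point, and the step count are assumptions chosen so the optimum (w = 3) is known in advance.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the cost."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Toy cost: cost(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_opt = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(w_opt)  # very close to 3.0, the minimizer of the cost
```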
https://www.youtube.com/watch?v=sDv4f4s2SB8
It is often necessary to tune hyperparameters by experimentally finding the best combination of values, one per hyperparameter.
One of the key hyperparameters to set in order to train a neural network is the learning rate for gradient descent. The learning rate parameter scales the magnitude of our weight updates in order to minimize the network's loss function.
If your learning rate is set too low, training will progress very slowly as you are making very tiny updates to the weights in your network. However, if your learning rate is set too high, it can cause undesirable divergent behavior in your loss function.
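The effect of the learning rate can be seen on a toy quadratic loss. The loss (w - 3)^2 and the three rates below are assumptions picked to show slow progress, good progress, and divergence.

```python
def final_loss(learning_rate, steps=20, start=0.0):
    """Run gradient descent on loss(w) = (w - 3)^2 and return the final loss."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * (w - 3)  # gradient of (w - 3)^2 is 2 * (w - 3)
    return (w - 3) ** 2


initial_loss = (0.0 - 3) ** 2  # 9.0 at the starting point
for lr in (0.001, 0.1, 1.1):
    print(lr, final_loss(lr))
```

With the smallest rate the loss barely moves from its starting value of 9, with 0.1 it drops close to zero, and with 1.1 each update overshoots the minimum so the loss grows instead of shrinking.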
Convolutional neural networks (CNNs) are used heavily in image recognition applications of machine learning. They provide an advantage over feed-forward networks because they are capable of considering locality of features.
https://www.youtube.com/watch?v=m8pOnJxOcqY&list=PLTl9hO2Oobd-GaTYQWIuIs2yyNy7TYbEj&index=2
https://www.youtube.com/watch?v=YRhxdVk_sIs
UNet, which evolved from the traditional convolutional neural network, was first designed and applied in 2015 to process biomedical images. A general convolutional neural network focuses on image classification, where the input is an image and the output is one label. Biomedical cases, however, require us not only to distinguish whether there is a disease, but also to localise the area of abnormality.
https://towardsdatascience.com/unet-line-by-line-explanation-9b191c76baf5
Recurrent neural networks are good for learning from sequential data.
RNNs are often used in text and speech processing because sentences and texts are naturally sequences of either words/punctuation marks or sequences of characters.
https://www.youtube.com/watch?v=yZv_yRgOvMg&list=PLTl9hO2Oobd-GaTYQWIuIs2yyNy7TYbEj&index=3
https://www.youtube.com/watch?v=LHXXI4-IEns
Long short-term memory (LSTM) networks are an extension of recurrent neural networks that extends their memory. They are therefore well suited to learning from important experiences that are separated by very long time lags.
https://www.youtube.com/watch?v=QciIcRxJvsM&list=PLTl9hO2Oobd-GaTYQWIuIs2yyNy7TYbEj&index=4
LSTMs enable RNNs to remember inputs over a long period of time.
https://www.youtube.com/watch?v=xI0HHN5XKDo
https://www.youtube.com/watch?v=C1YUYWP-6rE&list=PLTl9hO2Oobd-GaTYQWIuIs2yyNy7TYbEj&index=5
https://www.youtube.com/watch?v=TQQlZhbC5ps
https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/one-vs-all
https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/softmax
https://developers.google.com/machine-learning/crash-course/training-neural-networks/best-practices
Reinforcement learning is an approach to machine learning where agents are rewarded for accomplishing some task.
The Markov Decision Process is a method for planning in a stochastic environment.
The Monte Carlo approach approximates the value of a state-action pair by calculating the mean return from a collection of episodes.
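As a sketch of the Monte Carlo idea, the function below averages the return observed after each state-action pair across episodes (the every-visit variant, chosen here for brevity). The two toy episodes, the state and action names, and the discount `gamma` are invented.

```python
from collections import defaultdict


def mc_action_values(episodes, gamma=1.0):
    """Estimate Q(state, action) as the mean return that followed each pair."""
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        # Walk backwards so g accumulates the discounted future reward.
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            returns[(state, action)].append(g)
    return {pair: sum(gs) / len(gs) for pair, gs in returns.items()}


# Each episode is a list of (state, action, reward) steps.
episodes = [
    [("s0", "right", 0.0), ("s1", "right", 1.0)],
    [("s0", "right", 0.0), ("s1", "left", 0.0)],
]
q = mc_action_values(episodes)
print(q[("s0", "right")])  # 0.5, the mean of returns 1.0 and 0.0
```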
Most supervised learning algorithms are model-based, e.g. SVM. Model-based learning algorithms use the training data to create a model that has parameters learned from the training data. After the model is built, the training data can be discarded.
Instance-based learning algorithms use the whole dataset as the model. One instance-based algorithm frequently used in practice is k-Nearest Neighbors (kNN). In classification, to predict a label for an input example the kNN algorithm looks at the close neighborhood of the input example in the space of feature vectors and outputs the label that it saw the most often in this close neighborhood.
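A minimal kNN classifier sketch using Euclidean distance and a majority vote. The toy training set and the choice of k = 3 are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda example: math.dist(example[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# The whole training set is kept around: it *is* the model.
train = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
    ((8.0, 8.0), "b"), ((9.0, 8.0), "b"), ((8.0, 9.0), "b"),
]
print(knn_predict(train, (1.5, 1.5)))  # a
print(knn_predict(train, (8.5, 8.5)))  # b
```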
https://www.kaggle.com/getting-started/179177
A shallow learning algorithm learns the parameters of the model directly from the features of the training examples. Most supervised learning algorithms are shallow. The exceptions are neural network learning algorithms, specifically those that build neural networks with more than one layer between input and output. Such neural networks are called deep neural networks. In deep neural network learning (or, deep learning), contrary to shallow learning, most model parameters are learned not directly from the features of the training examples, but from the outputs of the preceding layers.
https://www.mathworks.com/discovery/deep-learning.html
https://www.malicksarr.com/type-of-machine-learning-algorithms-the-complete-overview/
https://serokell.io/blog/machine-learning-algorithm-classification-overview
https://towardsdatascience.com/a-tour-of-machine-learning-algorithms-466b8bf75c0a
Term frequency-inverse document frequency (TF-IDF) vectorization is a mouthful to say, but it's also a simple and convenient way to characterize bodies of text.
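One simple TF-IDF variant can be sketched as follows. Several weighting schemes exist; this one assumes whitespace tokenization, relative term frequency, and an unsmoothed logarithmic IDF, and the three documents are made up.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["the cat sat", "the dog sat", "the cat ran"]
vectors = tf_idf(docs)
print(vectors[1])
```

A word that appears in every document ("the") gets weight 0, while a word unique to one document ("dog") gets the highest weight there.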
https://www.datacamp.com/community/tutorials/text-analytics-beginners-nltk
Bidirectional Encoder Representations from Transformers is described in this paper.
https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270
Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text.
https://en.wikipedia.org/wiki/GPT-3
https://www.theguardian.com/commentisfree/2020/sep/08/robot-wrote-this-article-gpt-3
https://github.com/elyase/awesome-gpt3
https://github.blog/2021-06-29-introducing-github-copilot-ai-pair-programmer/
https://transformer.huggingface.co/doc/gpt
https://www.ibm.com/blogs/watson/2020/12/how-bert-and-gpt-models-change-the-game-for-nlp/
https://360digitmg.com/gpt-vs-bert
https://dl.acm.org/doi/10.1145/3442188.3445922
https://faculty.washington.edu/ebender/stochasticparrots.html
https://bigscience.huggingface.co/blog/bloom
https://openai.com/blog/dall-e/
https://imagen.research.google/
https://blog.paperspace.com/dalle-mini/
https://github.com/salesforce/CodeT5
https://github.com/codota/TabNine
ELMo is a deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).
http://jalammar.github.io/illustrated-bert/
Transfer Learning is the process of training a model on a large-scale dataset and then using that pre-trained model to process learning for another target task.
Transfer Learning became popular in the field of NLP thanks to the state-of-the-art performance of different algorithms like ULMFiT, Skip-Gram, ELMo, BERT etc.
https://towardsdatascience.com/transfer-learning-using-elmo-embedding-c4a7e415103c
https://jalammar.github.io/illustrated-transformer/
https://www.elderresearch.com/blog/trends-in-natural-language-processing/
An embedding is a relatively low-dimensional space into which you can translate high-dimensional vectors to capture some of the semantics of the input by placing semantically similar inputs close together in the embedding space.
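"Close together" in an embedding space is often measured with cosine similarity. The sketch below uses hand-written 3-dimensional vectors purely for illustration; real embeddings are learned and usually have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings: semantically similar words are given similar vectors.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.05, 0.95],
}
royal = cosine_similarity(embeddings["king"], embeddings["queen"])
fruit = cosine_similarity(embeddings["king"], embeddings["banana"])
print(royal, fruit)  # the first value is much larger than the second
```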
https://cloud.google.com/blog/topics/developers-practitioners/meet-ais-multitool-vector-embeddings
https://developers.google.com/machine-learning/crash-course/embeddings/categorical-input-data
https://developers.google.com/machine-learning/crash-course/embeddings/obtaining-embeddings
spaCy supports a number of transfer and multi-task learning workflows that can often help improve your pipeline’s efficiency or accuracy.
https://github.com/src-d/awesome-machine-learning-on-source-code
https://cloud.google.com/vertex-ai/docs/featurestore/overview
https://www.hopsworks.ai/post/feature-store-the-missing-data-layer-in-ml-pipelines
Created by the Google Brain team, TensorFlow is an open source library for numerical computation and large-scale machine learning.
TensorFlow has a pretty large API surface, but the part we are going to focus on is high-level APIs, called Estimators.
https://towardsdatascience.com/plain-and-simple-estimators-d8d3f4c185c1
The Estimators API gives us a nice workflow of getting our raw data, passing it through an input function, setting up our feature columns and model structure, running our training, and running our evaluation.
https://www.youtube.com/watch?v=G7oolm0jU8I
Scikit Learn provides a range of supervised and unsupervised learning algorithms via a consistent interface.
Keras is a neural network library. It wraps the efficient numerical computation libraries Theano and TensorFlow and allows you to define and train neural network models.
PyTorch is an open source machine learning library based on the Torch library.
https://github.com/karpathy/micrograd
https://github.com/geohot/tinygrad
Kubeflow Pipelines is a platform for building, deploying, and managing multi-step ML workflows based on Docker containers. Kubeflow offers several components that you can use to build your ML training, hyperparameter tuning, and serving workloads across multiple platforms.
MLOps is the process of taking an experimental Machine Learning model into a production web system.
https://wikipedia.org/wiki/Fairness_(machine_learning)
https://eugeneyan.com/writing/first-rule-of-ml/
https://en.wikipedia.org/wiki/P-value
https://spectrum.ieee.org/deep-learning-computational-cost
https://developers.google.com/machine-learning/crash-course
https://danfo.jsdata.org/examples/titanic-survival-prediction-using-danfo.js-and-tensorflow.js
https://cloud.google.com/bigquery-ml/docs/tutorials
https://towardsdatascience.com/the-making-of-an-ai-storyteller-c3b8d5a983f5
https://cloud.google.com/ai-platform/docs/getting-started-keras
https://medium.com/spikelab/anomalies-detection-using-river-398544d3536
https://www.youtube.com/watch?v=o6nGn1euRjk&list=PLIivdWyY5sqLsaG5hNms0D9aZRBE7DHBb&index=7
https://developers.google.com/machine-learning/glossary?hl=en
- https://developers.google.com/machine-learning/crash-course/
- https://codelabs.developers.google.com/ml-for-developers
- https://www.mygreatlearning.com/blog/machine-learning-tutorial/
- https://github.com/ujjwalkarn/Machine-Learning-Tutorials
- https://cloud.google.com/blog/topics/developers-practitioners/new-ml-learning-path-vertex-ai
- Machine Learning Tutorial
- https://medium.com/sarus/distributed-ml-with-dask-and-kubernetes-on-gcp-97fdd6533736
- https://towardsdatascience.com/preprocessing-time-series-data-for-supervised-learning-2e27493f44ae
- http://themlbook.com/wiki/doku.php?id=start
- https://d2l.ai/
- https://www.coursera.org/learn/machine-learning
- https://www.javatpoint.com/machine-learning
- self driving car tutorial https://youtu.be/Rs_rAxEsAvI
- https://github.com/ashishpatel26/Real-time-ML-Project
- https://github.com/melvfnz/data_science_portfolio
- StatQuest videos
- Google AI Fun Projects
- AI Platform Training and Prediction sample code repo
- Guide to bring code to ML GCP
- Labs and demos for courses for GCP ML and Bigdata Training
- Official repo for Google AI Platform
- Building Machine Learning and Deep Learning Models on GCP
- Hands-On Machine Learning on GCP
- Machine Learning Mastery
- Awesome Machine Learning
- https://www.qwiklabs.com/quests/50
- https://www.qwiklabs.com/quests/32
- https://www.qwiklabs.com/focuses/3389?parent=catalog
- https://www.qwiklabs.com/focuses/3393?parent=catalog
- https://google.qwiklabs.com/quests/82
- https://www.qwiklabs.com/focuses/3391?parent=catalog
- https://www.qwiklabs.com/focuses/1241?parent=catalog