ref.machine_learning - jgrey4296/jgrey4296.github.io GitHub Wiki
C4.5: constructs a decision tree classifier. Splits on information gain (entropy). Single-pass pruning. Handles both continuous and discrete attributes. Produces human-readable trees.
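As a rough illustration of the information gain criterion only (not the full C4.5 algorithm), a minimal numpy sketch; the labels, feature values and the 2.0 threshold are made-up example data.

import numpy as np

def entropy(labels):
    # Shannon entropy of a 1-d array of class labels.
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()
    return -(probs * np.log2(probs)).sum()

def information_gain(labels, mask):
    # Entropy reduction from splitting labels by the boolean mask.
    left, right = labels[mask], labels[~mask]
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

labels  = np.array([0, 0, 0, 1, 1, 1, 1, 0])
feature = np.array([1.0, 1.2, 0.9, 3.1, 2.8, 3.3, 2.9, 1.1])
print(information_gain(labels, feature < 2.0))   # 1.0: a perfect split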
k-means: operates on continuous data. Weaknesses: sensitivity to outliers and to the initial choice of centroids.
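A minimal sketch using scikit-learn's KMeans on made-up blob data; the cluster count and the n_init restarts (one mitigation for the centroid-initialisation weakness) are illustrative choices.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two made-up 2-d blobs; k-means should recover the two centres.
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)
print(km.cluster_centers_)
print(km.labels_[:10])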
SVM: learns a hyperplane that divides the data into 2 classes. The margin is the distance between the hyperplane and the closest data points from each class; the algorithm attempts to maximize the margin.
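A minimal linear-kernel sketch with scikit-learn's SVC on made-up separable data; the value of C (trading margin width against training errors) is an illustrative choice.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Two made-up, roughly linearly separable classes.
X = np.vstack([rng.normal(-2, 0.5, (30, 2)), rng.normal(2, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.coef_, clf.intercept_)    # the hyperplane w.x + b = 0
print(clf.support_vectors_)         # the points that define the margin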
Apriori: learns association rules from a database of transactions. Works in terms of itemset size (associations of 2, 3, ..., n items), support (number of transactions containing the itemset divided by the total number of transactions), and confidence (the conditional probability of an item given the other items in the itemset); a support/confidence sketch follows below.
Approach: join -> prune -> repeat (a bottom-up approach).
See Asaini's Apriori and Aturhoo's apriori implementations.
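A minimal sketch of the support and confidence measures only, on a made-up transaction database; it does not implement the join/prune loop itself (the linked repositories above do).

transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"milk", "bread", "butter"},
]

def support(itemset):
    # Fraction of transactions containing every item in itemset.
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    # Conditional probability of consequent given antecedent.
    return support(set(antecedent) | set(consequent)) / support(antecedent)

print(support({"milk", "bread"}))       # 3/5
print(confidence({"milk"}, {"bread"}))  # 3/4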
Expectation-Maximization (EM). Process: E-step -> M-step -> repeat. The E-step calculates probabilities for assigning each data point to a cluster; the M-step updates the model parameters based on those cluster assignments. Weaknesses: slows down in later iterations and can get stuck in local optima.
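scikit-learn's GaussianMixture is fitted with exactly this E-step/M-step loop; a minimal sketch on made-up 1-d data, where the n_init restarts are one way to reduce the risk of a poor local optimum.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
# Two made-up 1-d Gaussian clusters.
data = np.concatenate([rng.normal(0, 1, 200), rng.normal(6, 1, 200)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, n_init=5, random_state=0).fit(data)
print(gmm.means_.ravel())           # recovered cluster means
print(gmm.predict_proba(data[:3]))  # E-step style soft assignments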
PageRank: determines the relative importance of an object within a network of objects. Has a networkx implementation.
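A minimal sketch with the networkx implementation mentioned above; the toy graph and the conventional 0.85 damping factor are illustrative.

import networkx as nx

# A tiny made-up directed graph of links between objects.
g = nx.DiGraph([("a", "b"), ("b", "c"), ("c", "a"), ("d", "c")])

ranks = nx.pagerank(g, alpha=0.85)
print(sorted(ranks.items(), key=lambda kv: -kv[1]))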
AdaBoost: multiple rounds of learning with multiple classifiers. Uses folds of the training data on separate classifiers, up-weighting data that was hard in the previous round. Implemented in scikit-learn.
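A minimal sketch of the scikit-learn implementation on made-up data; n_estimators (the number of boosting rounds) is an illustrative choice.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Made-up classification data; each round reweights towards the examples
# the previous round got wrong.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.score(X, y))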
k-Nearest Neighbours (kNN): lazy, only labels new data after training. Uses distance metrics such as Euclidean distance; discrete data must either be transformed into continuous features or given a suitable discrete metric (e.g. Hamming distance on binary features). Weaknesses: expensive on large datasets, weak on noisy data, large-range features can dominate the distance metric, storage requirements, and reliance on a good distance metric.
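A minimal sketch with scikit-learn's KNeighborsClassifier; k, the Euclidean metric and the data are illustrative choices.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
# Two made-up classes in 2-d.
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(3, 1, (40, 2))])
y = np.array([0] * 40 + [1] * 40)

# Lazy learner: fit() only stores the data, distances are computed at predict time.
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean").fit(X, y)
print(knn.predict([[0.5, 0.5], [3.2, 2.8]]))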
Implemented in scikit-learn.
Implemented in scikit-learn.
CART: uses Gini impurity (a measure of how often a randomly chosen element would be incorrectly labelled). Cost-complexity pruning. Decision nodes can only be binary. Uses surrogate splits (substitute test features that send data to the appropriate left or right node when the primary feature is unavailable).
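A minimal sketch of Gini-impurity splitting with cost-complexity pruning via scikit-learn's DecisionTreeClassifier (which uses binary splits but does not implement CART's surrogate splits); the iris data and the ccp_alpha value are illustrative.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# criterion="gini" gives the impurity measure described above;
# ccp_alpha > 0 enables cost-complexity pruning.
tree = DecisionTreeClassifier(criterion="gini", ccp_alpha=0.01, random_state=0).fit(X, y)
print(export_text(tree))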
Implemented in scikit-learn.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

def linear_regression(data):
    """ Ordinary least squares on an ndarray of shape (n, 2): column 0 is x, column 1 is y. """
    means = data.mean(axis=0)
    errors = data - means
    error_sq = pow(errors[:, 0], 2).sum()
    cross = (errors[:, 0] * errors[:, 1]).sum()
    coefficient = cross / error_sq
    y_intercept = means[1] - (coefficient * means[0])
    return (y_intercept, coefficient)

data1 = np.random.random((20, 2))
data2 = np.random.random((20, 2)) * 10
data = np.vstack((data1, data2))

reg1 = linear_regression(data1)
reg2 = linear_regression(data2)
print("Regression 1: {},{}".format(*reg1))
print("Regression 2: {},{}".format(*reg2))

plt.figure()
plt.style.use('classic')
# Plot each fitted line y = intercept + slope * x over x in [0, 10].
plt.plot([0, 10], [reg1[0], reg1[0] + 10 * reg1[1]])
plt.plot([0, 10], [reg2[0], reg2[0] + 10 * reg2[1]])
plt.plot(data[:, 0], data[:, 1], 'ro')
plt.show()
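As a cross-check (an addition to the snippet above, not part of it), np.polyfit with degree 1 fits the same least-squares line and should agree with linear_regression up to floating point:

slope, intercept = np.polyfit(data1[:, 0], data1[:, 1], 1)
assert np.allclose((intercept, slope), reg1)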
import numpy as np

a = np.random.random((5, 2))
variance = a.var()
# var() is the mean squared deviation from the mean; compare with a floating-point tolerance.
assert np.isclose(variance, pow(a - a.mean(), 2).mean())
import numpy as np

a = np.random.random((5, 2))
m = a.mean(axis=0)
# Covariance of the two columns: mean of the product of their deviations from the column means.
covar = ((a[:, 0] - m[0]) * (a[:, 1] - m[1])).mean()
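For comparison (not in the original snippet), np.cov with bias=True uses the same divide-by-n normalisation, so its off-diagonal entry should match covar:

assert np.isclose(covar, np.cov(a[:, 0], a[:, 1], bias=True)[0, 1])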
http://efavdb.com/gaussian-processes/
https://github.com/edublancas/sklearn-evaluation
http://billchambers.me/tutorials/2015/01/14/python-nlp-cheatsheet-nltk-scikit-learn.html
http://scikit-learn.org/stable/auto_examples/applications/plot_stock_market.html#stock-market
http://scikit-learn.org/stable/documentation.html
http://scikit-learn.org/stable/modules/naive_bayes.html
http://scikit-learn.org/stable/modules/preprocessing.html#binarization
http://scikit-learn.org/stable/user_guide.html
https://dashee87.github.io/data%20science/general/Clustering-with-Scikit-with-GIFs/
https://egghead.io/courses/introductory-machine-learning-algorithms-in-python-with-scikit-learn
https://github.com/aigamedev/scikit-neuralnetwork
https://pypi.python.org/pypi/scikit-neuralnetwork/0.3
https://scikit-learn.org/stable/modules/classes.html
http://adataanalyst.com/machine-learning/apriori-algorithm-python-3-0/
http://blog.christianperone.com/2011/09/machine-learning-text-feature-extraction-tf-idf-part-i/
http://blog.webkid.io/datasets-for-machine-learning/
http://blog.yhat.com/posts/harry-potter-classification.html
http://cironline.org/blog/post/using-machine-learning-extract-quotes-text-3687
http://dalelane.co.uk/blog/?p=3381
http://en.wikipedia.org/wiki/Information_extraction
http://en.wikipedia.org/wiki/Reinforcement_learning
http://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
http://humancompatible.ai/bibliography
http://inversed.ru/AIS.htm
http://johanneskopf.de/publications/pixelart/
http://kevintechnology.com/post/71621133663/using-machine-learning-to-recommend-heroes-for
http://machinelearning.wustl.edu/mlpapers/paper_files/LT17.pdf
http://machinelearningmastery.com/
http://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/
http://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
http://matt.eifelle.com/2013/05/02/comparison-of-optimization-algorithms/
http://mccormickml.com/2013/12/13/adaboost-tutorial/
http://michaeljflynn.net/2017/02/06/a-tutorial-on-principal-component-analysis/
http://ml4a.github.io/classes/itp-F18/
http://ml4a.github.io/classes/itp-S19/
http://ml4a.github.io/demos/itpf18_viewer.html
http://pandas.pydata.org/
http://paradise.caltech.edu/~cook/papers/TwoNeurons.pdf
http://pybrain.org/docs/
http://pyml.sourceforge.net/tutorial.html
http://radimrehurek.com/gensim/tutorial.html
http://rare-technologies.com/word2vec-tutorial/
http://rayli.net/blog/data/top-10-data-mining-algorithms-in-plain-english/
http://rhizome.org/editorial/2016/nov/21/simulating-enron/
http://science.sciencemag.org/content/356/6334/183
http://scikit-learn.org/stable/auto_examples/applications/plot_stock_market.html#stock-market
http://scikit-learn.org/stable/documentation.html
http://scikit-learn.org/stable/modules/naive_bayes.html
http://scikit-learn.org/stable/modules/preprocessing.html#binarization
http://scikit-learn.org/stable/user_guide.html
http://seaborn.pydata.org/index.html
http://seat.massey.ac.nz/personal/s.r.marsland/MLbook.html
http://sebastianruder.com/optimizing-gradient-descent/
http://selfdrivingcars.mit.edu/deeptrafficjs/
http://synaptic.juancazala.com/#/
http://timdettmers.com/2015/03/26/convolution-deep-learning/
http://vertex.ai/blog/announcing-plaidml
http://visual.cs.ucl.ac.uk/pubs/handwriting/
http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html
http://www.abigailsee.com/2017/08/30/four-deep-learning-trends-from-acl-2017-part-1.html
http://www.cs.ucf.edu/courses/cap6412/fall2009/papers/Berwick2003.pdf
http://www.datasciencecentral.com/m/blogpost
http://www.datasciencecentral.com/profiles/blogs/17-short-tutorials-all-data-scientists-should-read-and-practice
http://www.deeplearningbook.org/
http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
http://www.joelsimon.net/dimensions-of-dialogue.html
http://www.johndcook.com/blog/2016/07/14/kalman-filters-and-functional-programming/
http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-1/
http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/Luxburg07_tutorial_4488%5B0%5D.pdf
http://www.learndatasci.com/k-means-clustering-algorithms-python-intro/
http://www.marioai.org/LearningTrack/getting-started
http://www.mattkenney.me/
http://www.public.asu.edu/~cbaral/papers/aaai2016-sub.pdf
http://www.reddit.com/r/MachineLearning/comments/3az4qj/large_scale_deep_neural_net_falling_down_the/
http://www.statsblogs.com/2017/03/19/ml-and-metrics-viii-the-new-predictive-econometric-modeling/
http://www.technologyreview.com/view/535451/data-mining-indian-recipes-reveals-new-food-pairing-phenomenon/
http://www.visiondummy.com/2014/04/curse-dimensionality-affect-classification/
http://yerevann.com/a-guide-to-deep-learning/
https://abebabirhane.github.io/
https://ai.stanford.edu/~kdtang/papers/cmj10-jazzgrammar.pdf
https://applyingml.com/
https://applyingml.com/resources/discovery-system-design/
https://applyingml.com/resources/ml-production-guide/
https://arxiv.org/abs/1611.04135
https://arxiv.org/abs/1706.09520
https://arxiv.org/abs/1707.05589
https://arxiv.org/abs/1708.05866
https://arxiv.org/abs/1709.02755
https://arxiv.org/abs/1801.04016
https://arxiv.org/pdf/1706.10199.pdf
https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-science-pdf-f22dc900d2d7
https://blog.acolyer.org/2016/12/16/tensorflow-a-system-for-large-scale-machine-learning/amp/
https://blog.acolyer.org/2017/01/04/learning-to-learn-by-gradient-descent-by-gradient-descent/
https://blog.jle.im/entry/purely-functional-typed-models-1.html
https://blog.openai.com/evolution-strategies/
https://blog.openai.com/science-of-ai/
https://blog.sicara.com/07-2017-best-big-data-new-articles-this-month-acb58d4bb15d
https://blog.slavv.com/the-1700-great-deep-learning-box-assembly-setup-and-benchmarks-148c5ebe6415
https://boringml.com/
https://crfm.stanford.edu/2021/10/18/reflections.html
https://cs.brown.edu/~dabel/blog/posts/misc/nips_2017.pdf
https://dashee87.github.io/data%20science/general/Clustering-with-Scikit-with-GIFs/
https://deepmind.com/blog/population-based-training-neural-networks/
https://developers.google.com/machine-learning/glossary/
https://distill.pub/
https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-choice
https://egghead.io/courses/introductory-machine-learning-algorithms-in-python-with-scikit-learn
https://en.wikipedia.org/wiki/Association_rule_learning
https://en.wikipedia.org/wiki/Confusion_matrix
https://en.wikipedia.org/wiki/Machine_learning
https://eng.uber.com/accelerated-neuroevolution/
https://experiments.withgoogle.com/living-archive-wayne-mcgregor
https://gab41.lab41.org/the-10-algorithms-machine-learning-engineers-need-to-know-f4bb63f5b2fa?gi=4d857d2d5018#.zhgvlskgn
https://github.com/OpenMined/PySyft/tree/master/examples/tutorials
https://github.com/anvaka/sayit
https://github.com/asaini/Apriori
https://github.com/carpedm20
https://github.com/chartbeat-labs/textacy
https://github.com/ctgk/PRML
https://github.com/ethanfetaya/NRI
https://github.com/facebook/MemNN
https://github.com/fchollet/keras-resources
https://github.com/iesl/institution_hierarchies
https://github.com/jakevdp/PythonDataScienceHandbook
https://github.com/jphall663/awesome-machine-learning-interpretability
https://github.com/markriedl/easygen
https://github.com/oliviaguest
https://github.com/pymc-devs/pymc
https://github.com/pytorch/pytorch
https://github.com/taolei87/sru
https://github.com/uber/causalml
https://github.com/vahidk/EffectiveTensorflow
https://github.com/vvanirudh/Pixel-Art
https://goelhardik.github.io/2016/10/04/fishers-lda/
https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/
https://hackernoon.com/finding-magic-the-gathering-archetypes-with-latent-dirichlet-allocation-729112d324a6
https://hbr.org/2016/12/a-guide-to-solving-social-problems-with-machine-learning
https://homes.cs.washington.edu/~msap/atomic/
https://howwegettonext.com/silicon-valley-thinks-everyone-feels-the-same-six-emotions-38354a0ef3d7
https://huggingface.co/
https://idc9.github.io/stor390/notes/clustering/clustering.html
https://jeremykun.com/2017/02/27/the-reasonable-effectiveness-of-the-multiplicative-weights-update-algorithm/
https://karpathy.github.io/2015/05/21/rnn-effectiveness/
https://keras.io/#installation
https://koaning.io/posts/goodheart-bad-metric/
https://lifehacker.com/find-specialty-subreddits-with-this-tool-1831773643
https://lilianweng.github.io/posts/2022-02-20-active-learning/
https://magenta.tensorflow.org/music-transformer
https://magenta.tensorflow.org/studio
https://makingnoiseandhearingthings.com/2018/08/31/what-you-can-cant-and-shouldnt-do-with-social-media-data/
https://medium.com/@Francesco_AI/ai-knowledge-map-how-to-classify-ai-technologies-6c073b969020
https://medium.com/@ageitgey/machine-learning-is-fun-80ea3ec3c471
https://medium.com/@blaisea/physiognomys-new-clothes-f2d4b59fdd6a
https://medium.com/@gk_/text-classification-using-algorithms-e4d50dcba45#.ge2p15jwp
https://medium.com/@james_aka_yale/the-8-neural-network-architectures-machine-learning-researchers-need-to-learn-2f5c4e61aeeb
https://medium.com/@samim/musical-novelty-search-2177c2a249cc
https://medium.com/analytics-vidhya/building-a-powerful-dqn-in-tensorflow-2-0-explanation-tutorial-d48ea8f3177a
https://medium.com/syncedreview/the-staggering-cost-of-training-sota-ai-models-e329e80fa82
https://medium.com/thoughts-and-reflections/racial-bias-and-gender-bias-examples-in-ai-systems-7211e4c166a1
https://medium.freecodecamp.org/explained-simply-how-deepmind-taught-ai-to-play-video-games-9eb5f38c89ee
https://medium.freecodecamp.org/the-hitchhikers-guide-to-machine-learning-algorithms-in-python-bfad66adb378
https://mml-book.github.io/
https://monkeylearn.com/blog/introduction-to-support-vector-machines-svm/
https://mubaris.com/2017-09-28/linear-regression-from-scratch
https://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/Index.ipynb
https://observablehq.com/@tophtucker/inferring-chart-type-from-autocorrelation-and-other-evils
https://openai.com/blog/faulty-reward-functions/
https://pypi.python.org/pypi/scikit-neuralnetwork/0.3
https://quality-diversity.github.io/papers
https://rayli.net/blog/data/top-10-data-mining-algorithms-in-plain-english/
https://rbharath.github.io/what-cant-deep-learning-do/
https://research.googleblog.com/2017/08/transformer-novel-neural-network.html
https://rockt.github.io/pdf/rocktaschel2017end-slides.pdf
https://sadanand-singh.github.io/posts/treebasedmodels/#.WXT8Kli2pUw.hackernews
https://scholarship.law.duke.edu/cgi/viewcontent.cgi?article=1315&context=dltr
https://setosa.io/ev/principal-component-analysis/
https://spinningup.openai.com/en/latest/
https://srconstantin.wordpress.com/2017/01/28/performance-trends-in-ai/
https://stackoverflow.com/questions/10059594/a-simple-explanation-of-naive-bayes-classification/20556654#20556654
https://stackoverflow.com/questions/34518656/how-to-interpret-loss-and-accuracy-for-a-machine-learning-model#34519264
https://techcrunch.com/2012/12/14/ray-kurzweil-joins-google-as-engineering-director-focusing-on-machine-learning-and-language-tech/
https://thestackcanary.com/from-python-pytorch-to-elixir-nx/
https://towardsdatascience.com/ai-architecture-f9d78c6958e0?gi=ba57e7504245
https://towardsdatascience.com/the-advent-of-architectural-ai-706046960140?gi=7ffeaec03907
https://towardsdatascience.com/the-most-underrated-python-packages-e22bf6049b5e
https://towardsdatascience.com/understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e
https://towardsdatascience.com/use-kaggle-to-start-and-guide-your-ml-data-science-journey-f09154baba35?gi=67279a870d21
https://utkuufuk.github.io/2018/05/04/learning-curves/
https://web.archive.org/web/20030903185326/http://www.aisb.org.uk/news/mljresign.html
https://web.archive.org/web/20160729170700/http://numenta.com/
https://www.alexirpan.com/2018/02/14/rl-hard.html
https://www.cc.gatech.edu/~riedl/pubs/aaai-keg17.pdf
https://www.cs.ox.ac.uk/people/yarin.gal/website/blog_3d801aa532c1ce.html
https://www.cs.princeton.edu/news/bias-machine-internet-algorithms-reinforce-harmful-stereotypes
https://www.nature.com/articles/s41598-017-08028-4
https://www.oreilly.com/ideas/visualizing-convolutional-neural-networks
https://www.polygon.com/2018/10/25/18010142/machine-learning-president-2020-election-larp
https://www.sciencenews.org/article/machines-are-getting-schooled-fairness
https://www.techdirt.com/articles/20130110/14542221635/ibm-researcher-feeds-watson-supercomputer-urban-dictionary-very-quickly-regrets-it.shtml
https://www.technologyreview.com/s/608380/machines-are-developing-language-skills-inside-virtual-worlds/
https://www.technologyreview.com/s/613630/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
https://www.tensorflow.org/tutorials/mnist/beginners/
https://www.wired.com/story/machines-shouldnt-have-to-spy-on-us-to-learn/
https://www.wired.com/story/sobering-message-future-ai-party/
https://www.youtube.com/playlist?list=PLqYmG7hTraZDNJre23vqCGIVpfZ_K2RZs
https://yanirseroussi.com/2016/02/14/why-you-should-stop-worrying-about-deep-learning-and-deepen-your-understanding-of-causality-instead/
https://zefsguides.com/