Lecture 11

This week we turn from "data analytics" 1962->2006 to "machine learning" itself [1].

First we'll look at a paper by Herbert Simon [2] evidencing the split in the AI community between "heuristics" and "learning" as two major approaches:

Simon, Herbert A. "Why should machines learn?." In Machine Learning, Volume I, pp. 25-37. 1983.

Next we'll look at a paper emphasizing the split of the ML community from the statistics community, written by Leo Breiman [3], whose peripatetic trajectory [4] contributed to his role as culture broker (and tribal critic).

Breiman, Leo. "Statistical modeling: The two cultures (with comments and a rejoinder by the author)." Statistical science 16, no. 3 (2001): 199-231.

Fast forward to the (near) present day: we'll look at a 2015 review article by two of the most famous living names in machine learning -- Michael Jordan [5] and Tom Mitchell [6]. Mitchell founded the first department of machine learning and has been writing definitive books defining the field since the early 1980s (the Simon lecture appears in print in an edited volume of which Mitchell is an editor). This article lays out the "modern" partitioning of machine learning into

unsupervised (descriptive)
supervised (predictive)
reinforcement (prescriptive) learning.

Jordan, Michael I., and Tom M. Mitchell. "Machine learning: Trends, perspectives, and prospects." Science 349, no. 6245 (2015): 255-260.

Missing from the above documents is the contemporary explosion of interest in "deep learning." An enjoyable nontechnical introduction appeared in The New York Times Magazine last year. The piece is long (though readable and enjoyable!). For this class, please make sure you read

"Prologue", and
"A Deep Explanation of Deep Learning".

Lewis-Kraus, Gideon. "The great AI awakening." The New York Times Magazine (2016): 1-37. available online via http://publicservicesalliance.org/wp-content/uploads/2016/12/The-Great-A.I.-Awakening-The-New-York-Times.pdf

OPTIONAL OTHER READINGS

There's plenty more to be said about machine learning. Those of you who wish to know more are encouraged to read some of the following:

Tom Mitchell's founding document for the department of machine learning at CMU:

Mitchell, Tom Michael. The discipline of machine learning. Vol. 9. Carnegie Mellon University, School of Computer Science, Machine Learning Department, 2006.

https://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf

An essay by your professors on the subject, available via Slack.
Consider reading the rest of "The Great AI Awakening"!
A deeper --- and highly selective --- dive by a practitioner of ML:

Nilsson, Nils J. The quest for artificial intelligence. Cambridge University Press, 2009.

[1] Although usually attributed to IBM researcher Arthur Samuel for a 1959 publication, the phrase also appears that year in two conference proceedings, one of which is from NSA mathematician-cryptanalyst Howard Campaigne. In the 1950s there are several prior references to "learning machines" and "mechanized learning", including one from Claude Shannon.

[2] https://en.wikipedia.org/wiki/Herbert_A._Simon

[3] https://en.wikipedia.org/wiki/Leo_Breiman

[4] https://projecteuclid.org/download/pdf_1/euclid.ss/1009213290

[5] https://en.wikipedia.org/wiki/Michael_I._Jordan

[6] https://en.wikipedia.org/wiki/Tom_M._Mitchell

2017:

focus on some of the professional history of computing literature that's on our syllabus.

Starting with Lecture 11, we'll be experiencing how the development of AI -- among mathematicians, cognitive scientists, and the nascent computer science field -- comes to collide with academic and industrial statistics. Useful to that end will be a dive into two works exploring the intellectual and cultural turmoil of bringing explicitly computational techniques, e.g., data assimilation and simulation, into an existing scientific community.

Paul Edwards, A Vast Machine (MIT Press, 2010), chs 5-7. NOTE THIS IS A DIFFERENT ASSIGNMENT than on syllabus
Peter Galison, ["Computer simulations and the trading zone,"](http://www.medientheorie.com/doc/galison_simulation.pdf) in The disunity of science: Boundaries, contexts, and power, ed. Peter Galison and David J. Stump (Stanford, CA: Stanford University Press, 1996), 118-157.

The Edwards is amazing about data accumulation, modeling, and so forth, for climate data. It's also important and theoretically rich, and will give us more ways to think about data-driven sciences and their infrastructure.

Galison is about Monte Carlo, and is a great follow-up to what we've been doing.

(cf. https://data-ppf.slack.com/archives/C3SJQ5FH9/p1491032123318543 ):

For Tuesday, we're moving into more recent streams in machine learning and statistics.

Nilsson, Nils J. The Quest for Artificial Intelligence: A History of Ideas and Achievements. Cambridge ; New York: Cambridge University Press, 2010, Machine Learning, online version: https://ai.stanford.edu/~nilsson/QAI/qai.pdf ; only read these sections:

29.5 Unsupervised learning pp 513-515
29.6 Reinforcement learning pp 515-524
29.7 Enhancements pp 524-527

Nilsson is an AI researcher who contributed, among other things, the "A* algorithm," which we discussed briefly on Thursday as a heuristic-guided version of shortest-path search.

Breiman, Leo. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (2001): 199–215. http://www.jstor.org/stable/2676681

Breiman is a New Yorker, Columbia alumnus, former merchant marine, pure mathematician, and, later, an evangelist for machine learning among the statisticians.

Cleveland, William S. “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” International Statistical Review / Revue Internationale de Statistique 69, no. 1 (April 2001): 21. http://www.jstor.org/stable/1403527

Cleveland was a statistician at Bell Labs and worked closely with Tukey there.

optional:

Mallows, Colin. "Tukey's paper after 40 years." Technometrics 48, no. 3 (August 2006): 319-325. https://www.jstor.org/stable/pdf/25471200.pdf is an assessment of the history of applied computational statistics, with particular attention to the paper "The future of data analysis", Tukey 1962, which we read earlier.
For more on Breiman, with plenty of history, see Olshen, Richard. A Conversation with Leo Breiman. Statist. Sci. 16 (2001), no. 2, 184--198. doi:10.1214/ss/1009213290 , http://projecteuclid.org/euclid.ss/1009213290