# Neural Network History
## Major Milestones in Neural Network History
### 1943 – First Artificial Neuron Model
Warren McCulloch and Walter Pitts develop the first mathematical model of a neuron, demonstrating how networks of simple binary units could compute logical functions.
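As a rough illustration (not McCulloch and Pitts's original notation), such a unit can be written as a thresholded sum of binary inputs. The weights and thresholds below are hand-picked assumptions chosen to realize AND and OR:

```python
# A minimal sketch of a McCulloch-Pitts unit: binary inputs, fixed weights,
# and a hard threshold. Weights and thresholds are hand-set, not learned.

def mcculloch_pitts(inputs, weights, threshold):
    # Fire (output 1) if the weighted sum of inputs reaches the threshold.
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# AND: both inputs must be active to reach a threshold of 2.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND ->", mcculloch_pitts((a, b), (1, 1), threshold=2))

# OR: a single active input already reaches a threshold of 1.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "OR ->", mcculloch_pitts((a, b), (1, 1), threshold=1))
```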
### 1949 – Hebb's Learning Rule
Donald Hebb proposes that connections between neurons strengthen when the neurons activate together frequently, introducing a key idea behind synaptic plasticity and learning. Hebb's rule was an early account of how neurons might learn; modern multi-layer perceptrons instead rely on error-driven learning (backpropagation), which is a different mechanism.
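A minimal sketch of the Hebbian idea, assuming the textbook update Δw = η · x · y; the learning rate and activity trace below are illustrative, not from Hebb's book:

```python
# A toy Hebbian update: the weight between two units grows in proportion
# to how often their activations coincide. All numbers are illustrative.

learning_rate = 0.1
weight = 0.0

# Pairs of (pre-synaptic, post-synaptic) activations over time.
activity = [(1, 1), (1, 1), (0, 1), (1, 0), (1, 1)]

for pre, post in activity:
    weight += learning_rate * pre * post  # strengthens only on co-activation
    print(f"pre={pre} post={post} -> weight={weight:.2f}")
```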
### 1957 – The Perceptron
Frank Rosenblatt introduces the perceptron, a single-layer neural network that can be trained to recognize patterns using a supervised learning rule.
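A minimal sketch of Rosenblatt's supervised update rule, here training on the linearly separable AND function; the learning rate, epoch count, and zero initialization are illustrative choices:

```python
# Perceptron learning rule: nudge weights toward the correct answer
# whenever the prediction is wrong. AND is linearly separable, so this converges.

def predict(weights, bias, x):
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # AND
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

for epoch in range(20):
    for x, target in data:
        error = target - predict(weights, bias, x)
        weights[0] += lr * error * x[0]
        weights[1] += lr * error * x[1]
        bias += lr * error

print([(x, predict(weights, bias, x)) for x, _ in data])
```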
### 1969 – Perceptron Limitations Identified
Marvin Minsky and Seymour Papert show that single-layer perceptrons cannot solve problems requiring non-linear decision boundaries (e.g., XOR). This result fueled skepticism about neural networks and played a role in the onset of the first AI Winter, a period of reduced funding and interest in neural network research.
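To see the limitation concretely, here is the same single-layer update rule as in the previous sketch (repeated so this demo runs on its own) applied to XOR. Because no single line separates XOR's classes, the weights never settle on a correct solution; this is an illustrative demo, not Minsky and Papert's actual argument:

```python
# Training a single-layer perceptron on XOR: the weights keep oscillating
# because XOR is not linearly separable.

def predict(weights, bias, x):
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

for epoch in range(100):
    for x, target in data:
        error = target - predict(weights, bias, x)
        weights[0] += lr * error * x[0]
        weights[1] += lr * error * x[1]
        bias += lr * error

# No weight setting classifies all four cases, so at least one is always wrong.
print([(x, predict(weights, bias, x), t) for x, t in data])
```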
### 1986 – Backpropagation Revives Neural Networks
David Rumelhart, Geoffrey Hinton, and Ronald Williams demonstrate the effectiveness of backpropagation, enabling multi-layer networks to learn complex patterns. Despite backpropagation's success, progress slowed in the late 1980s and 1990s due to computational limitations and a lack of large datasets. Skepticism about neural networks resurfaced as they struggled to outperform traditional statistical methods, contributing to a second AI Winter.
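A minimal sketch of backpropagation, assuming a tiny fully connected network with sigmoid activations trained on XOR, the very problem a single perceptron cannot solve. The hidden size, learning rate, seed, and epoch count are illustrative choices, and convergence depends on the random initialization:

```python
import math, random

random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

H, lr = 3, 0.5  # hidden units and learning rate (illustrative choices)
w_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b_h = [0.0] * H
w_o = [random.uniform(-1, 1) for _ in range(H)]
b_o = 0.0

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # XOR

def forward(x):
    h = [sigmoid(w_h[j][0] * x[0] + w_h[j][1] * x[1] + b_h[j]) for j in range(H)]
    y = sigmoid(sum(w_o[j] * h[j] for j in range(H)) + b_o)
    return h, y

for epoch in range(20000):
    for x, target in data:
        h, y = forward(x)
        # Backward pass: chain rule through the sigmoid at each layer.
        d_y = (y - target) * y * (1 - y)
        for j in range(H):
            d_hj = d_y * w_o[j] * h[j] * (1 - h[j])  # uses pre-update w_o[j]
            w_o[j] -= lr * d_y * h[j]
            w_h[j][0] -= lr * d_hj * x[0]
            w_h[j][1] -= lr * d_hj * x[1]
            b_h[j] -= lr * d_hj
        b_o -= lr * d_y

# Outputs should approach the targets (an unlucky init can stall in a local minimum).
for x, t in data:
    print(x, t, round(forward(x)[1], 2))
```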
### 1997 – Long Short-Term Memory (LSTM)
Sepp Hochreiter and Jürgen Schmidhuber introduce LSTM, a recurrent neural network variant designed to remember long-term dependencies by using gating mechanisms.
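A minimal sketch of one LSTM step for a single scalar unit, using the now-standard formulation with forget, input, and output gates (the forget gate was a slightly later refinement by Gers and colleagues; all weights here are hand-set illustrative values rather than learned parameters):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    # p holds scalar (w_x, w_h, bias) triples for each gate.
    f = sigmoid(p["f"][0] * x + p["f"][1] * h_prev + p["f"][2])    # forget gate
    i = sigmoid(p["i"][0] * x + p["i"][1] * h_prev + p["i"][2])    # input gate
    g = math.tanh(p["g"][0] * x + p["g"][1] * h_prev + p["g"][2])  # candidate
    o = sigmoid(p["o"][0] * x + p["o"][1] * h_prev + p["o"][2])    # output gate
    c = f * c_prev + i * g  # cell state: gated mix of old memory and new input
    h = o * math.tanh(c)    # hidden state exposed to the next step
    return h, c

# Illustrative fixed parameters; in practice these are learned.
params = {k: (0.5, 0.5, 0.0) for k in ("f", "i", "g", "o")}
h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0, 1.0]:
    h, c = lstm_step(x, h, c, params)
    print(f"x={x} -> h={h:.3f} c={c:.3f}")
```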
### 1998 – Convolutional Neural Networks (LeNet-5)
Yann LeCun and colleagues develop LeNet-5, a convolutional neural network (CNN) for handwritten digit recognition, which influences modern deep learning in vision.
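A minimal sketch of the core CNN operation, 2D convolution: a small kernel slides across an image and produces a feature map. The toy image, edge-detecting kernel, and "valid" (no padding) setup are illustrative assumptions, not LeNet-5 itself:

```python
# Sliding a small kernel over a 2D image ("valid" convolution, stride 1).
# Each output cell is the sum of an image patch weighted by the kernel.

def convolve2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            output[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)
            )
    return output

# A toy 4x4 "image" with a vertical edge, and a vertical-edge kernel:
# the feature map responds strongly where the edge sits.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [
    [-1, 1],
    [-1, 1],
]
for row in convolve2d(image, kernel):
    print(row)
```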
### 2006 – Deep Learning Resurgence
Geoffrey Hinton and Ruslan Salakhutdinov introduce deep belief networks (DBNs), demonstrating that unsupervised pre-training can improve the training of deep networks.
### 2012 – Deep Learning Breakthrough (AlexNet)
Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton introduce AlexNet, a deep CNN that wins the ImageNet competition and kickstarts the modern deep learning boom.
### 2014 – Attention Mechanism Introduced
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio propose attention mechanisms for sequence-to-sequence models, improving neural machine translation.
- Bahdanau et al. (2014) – Neural Machine Translation by Jointly Learning to Align and Translate
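A minimal sketch of the general attention idea: score each encoder state against a decoder query, softmax the scores into weights, and take the weighted sum as a context vector. For brevity this uses dot-product scores, whereas Bahdanau et al. score with a small feed-forward alignment network:

```python
import math

def softmax(scores):
    exps = [math.exp(s - max(scores)) for s in scores]  # numerically stable
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    # Score each encoder state against the query (dot product here;
    # Bahdanau et al. use a learned alignment network instead).
    scores = [sum(q * s for q, s in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    # Context vector: attention-weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Toy 2-D encoder states and a query that "matches" the second state.
states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
weights, context = attend([0.0, 1.0], states)
print("weights:", [round(w, 2) for w in weights])
print("context:", [round(c, 2) for c in context])
```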
### 2017 – Transformer Architecture
Google Brain researchers introduce the Transformer model in the paper "Attention Is All You Need", eliminating recurrence and relying entirely on self-attention, which becomes the foundation for modern NLP models.
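A minimal sketch of the Transformer's core operation, scaled dot-product self-attention, with tiny hand-written Q, K, and V matrices (in a real model these come from learned projections of the token embeddings):

```python
import math

def softmax(scores):
    exps = [math.exp(s - max(scores)) for s in scores]  # numerically stable
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = len(K[0])
    out = []
    for q in Q:  # one output row per query position
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, V))
                    for d in range(len(V[0]))])
    return out

# Three token positions with 2-D queries/keys/values (illustrative numbers).
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
for row in self_attention(Q, K, V):
    print([round(x, 2) for x in row])
```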