GenAI

History of GenAI

  • https://huggingface.co/learn/llm-course/en/chapter1/4#a-bit-of-transformer-history
  • Jun 2017: Transformer Architecture Introduced by Google
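  • Jun 2018: GPT released by OpenAI, first pretrained Transformer model, fine-tuned for various NLP tasks
  • Oct 2018: BERT released by Google, designed to produce better summaries of sentences
  • Feb 2019: GPT-2, a bigger model, not immediately released publicly due to ethical concerns
  • May 2020: GPT-3, an even bigger model, performs well on many tasks without fine-tuning (zero-shot learning)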
  • Jan 2021: DALL-E announced by OpenAI, text-to-image generation; later in 2021, diffusion models were shown to surpass GANs on image synthesis
  • Jul 2022: Midjourney released
  • Aug 2022: Stable Diffusion released
  • Nov 2022: ChatGPT released based on GPT-3.5
  • Feb 2023: Llama by Meta, released as open weights (initially to researchers), generates text in a variety of languages
  • Sep 2023: Mistral 7B outperformed the larger Llama 2 13B on the benchmarks its authors reported
  • Feb 2024: Sora announced by OpenAI, text-to-video generation
  • May 2024: Gemma 2 (2B-27B) by Google, smaller variants trained using knowledge distillation
  • Jan 2025: DeepSeek-R1, released as open weights, trained with SFT + RL, notable for its cost efficiency

Attention Is All You Need (2017)

  • Introduced the Transformer architecture
  • Eight authors, mostly at Google Brain and Google Research
  • Primary goal was to improve machine translation
  • Replaced recurrence (RNNs, including LSTMs, a type of RNN) entirely with attention
  • NLP tasks like translation, next-word prediction and speech recognition were previously dominated by RNNs/LSTMs
  • Training can be parallelized across sequence positions, which was hard with RNNs because each step depends on the previous one
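
A minimal sketch of the scaled dot-product attention at the core of the paper, softmax(QK^T / sqrt(d_k)) V, for a single head with no masking. The function names and the toy input are illustrative assumptions, not the paper's code, and a real implementation would use batched matrix operations rather than Python loops. The point it shows: each output position is computed from the queries, keys and values alone, never from the previous output, which is what allows parallel training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking.

    queries and keys are lists of vectors of length d_k; values is a
    list of vectors (one per key). Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score this query against every key, scaled by sqrt(d_k).
        # Nothing here depends on the output of an earlier position,
        # so the loop iterations are independent of each other.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy self-attention: 3 tokens with 2-dimensional embeddings (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(x, x, x)
```

Here the same vectors serve as queries, keys and values (self-attention); in the actual architecture they are separate learned linear projections of the token embeddings.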

Major Deep Learning Events

  • https://ai-watch.ec.europa.eu/tools/ai-history-timeline_en
  • https://medium.com/@lmpo/a-brief-history-of-ai-with-deep-learning-26f7948bc87b
  • https://www.devopsschool.com/blog/evolution-and-timeline-machine-learning/
  • 1950: Alan Turing proposed the Turing Test
  • 1956: John McCarthy coined the term “Artificial Intelligence”
    • for the Dartmouth Summer Research Project on Artificial Intelligence at Dartmouth College
    • widely regarded as the birthplace of AI as a field
  • 1957: Perceptron created by Frank Rosenblatt at Cornell
  • 1969: Kunihiko Fukushima introduced the ReLU activation function
  • 1986: Backpropagation popularized
    • by David Rumelhart, Geoffrey Hinton and Ronald Williams
  • 1989: Yann LeCun showed that CNNs can recognize handwritten ZIP codes
    • used a multilayer neural network trained with backpropagation
  • 1995: Soft-margin Support Vector Machine introduced by Corinna Cortes and Vladimir Vapnik
    • used for classification of text, handwritten characters and images
  • 1997: Deep Blue by IBM beat world chess champion Garry Kasparov
  • 1998: Yann LeCun introduced LeNet-5, an improved 7-layer CNN
    • extracts features automatically, without needing to handcraft them
    • used by banks to recognize handwritten numbers on checks
  • 2011: Siri released on the iPhone 4S
  • 2011: IBM Watson won the TV quiz show Jeopardy!
  • 2012: AlexNet by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton
    • from University of Toronto
    • based on CNN
    • winner of ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
    • achieved a 15.3% top-5 error rate, versus 26.2% for the runner-up
  • 2013: Word2vec introduced by Tomas Mikolov and colleagues at Google
    • a family of models that produce word embeddings
  • 2014: GAN introduced by Ian Goodfellow
    • can generate data such as images, music and speech
  • 2015: Diffusion models proposed
  • 2015: Google Voice speech recognition switched to LSTM
    • cut transcription errors by 49%
  • 2016: Google Translate switched to an LSTM-based neural translation system
    • reduced translation errors by about 60%
  • 2016: AlphaGo by DeepMind beat Go world champion Lee Sedol
    • used Reinforcement Learning
  • 2019: GAN-based talking-head (deepfake) model demonstrated by Samsung researchers
    • can create video from as little as a single input image
  • 2019: AlphaFold by DeepMind (Google)
    • predicts the 3D structure of a protein from its amino acid sequence

Important Awards and Personalities

  • 2018 Turing Award: Geoffrey Hinton, Yann LeCun, Yoshua Bengio
    • often called the “Godfathers of Deep Learning”
    • kept Deep Learning research alive
  • Ashish Vaswani
    • First-listed author of Attention Is All You Need
  • Ilya Sutskever
    • Co-Author AlexNet
    • Co-Founder OpenAI
  • Andrej Karpathy
    • Popularized Deep Learning Education
    • Primary instructor of Stanford's first Deep Learning course (CS231n)
  • Andrew Ng
    • AI education and industry adoption
    • Co-Founded Coursera
    • Co-Founder of Google Brain
  • John Schulman
    • Co-Founder OpenAI
  • Key OpenAI Founders
    • Sam Altman
    • Elon Musk
    • Greg Brockman
  • Key Anthropic Founders
    • Daniela Amodei (former OpenAI)
    • Dario Amodei (former OpenAI)