Notes on The Turing Lectures: "What is generative AI?"


Generative - create new content (audio, code, images, text, video)

Artificial Intelligence - getting a computer program to do a task automatically. "There is a lot of myth about it, it's just a tool", "we have to spell it out"

Generative AI is not a new concept. Google Translate (about 17 years old) is an example of generative AI.

Another example is Apple's Siri (about 11 years old). Phone autocomplete and Google search autocomplete are also generative AI.

Not that new, so "what is the fuss"? Answer: ChatGPT was launched. ChatGPT is a lot more sophisticated than older AIs: you can have a conversation with it.

The technology is not extremely new. It uses language modeling: given a context, the language model predicts what comes next.

e.g. context: "I want to"

  • prediction: play, eat, shovel


Language models used to predict by counting word frequencies, but the latest ones use neural networks: feed the context (prompt text) into a neural language model, which predicts the answer.
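
As an illustration of the older counting approach, here is a minimal sketch; the toy corpus and context are made up for this note, not from the lecture:

```python
# Count-based next-word prediction: predict the next word by counting how
# often each word follows the context in a (tiny, made-up) corpus.
from collections import Counter, defaultdict

corpus = (
    "i want to play . i want to eat . i want to play . "
    "she wants to eat ."
).split()

# Count which word follows each two-word context.
follow_counts = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    follow_counts[(w1, w2)][w3] += 1

def predict_next(w1, w2):
    counts = follow_counts[(w1, w2)]
    total = sum(counts.values())
    # Turn counts into probabilities, most likely first.
    return [(word, c / total) for word, c in counts.most_common()]

print(predict_next("want", "to"))  # e.g. [('play', 0.67), ('eat', 0.33)]
```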

Basic steps to build a language model:

  1. Step 1. Collect a very large dataset (aka a "corpus"), e.g. the web (Wikipedia, books, GitHub, Stack Overflow)
  2. Step 2. Ask the language model to predict the next word in a sentence
    • randomly truncate the last part of the input sentence, calculate the probability of the missing words, then feed the error back so the model adjusts
    • e.g. "The Trevi Fountain is in ___"  Rome - good, Berlin - bad
  3. Step 3. Repeat over the whole corpus. Keep going for months. (A minimal sketch of this loop follows below.)
    • So it learns over time by practicing and correcting itself.
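
A minimal sketch of this predict-and-correct loop, assuming PyTorch and a toy vocabulary (real models use billions of parameters and a web-scale corpus):

```python
import torch
import torch.nn as nn

vocab = ["<unk>", "the", "trevi", "fountain", "is", "in", "rome", "berlin"]
word_to_id = {w: i for i, w in enumerate(vocab)}

def encode(words):
    return torch.tensor([word_to_id.get(w, 0) for w in words])

# Tiny "language model": embed the context, average it, predict the next word.
class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, context_ids):
        return self.out(self.embed(context_ids).mean(dim=0))

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Step 2: hide the last word and ask the model to predict it ("rome" is good).
context = encode(["the", "trevi", "fountain", "is", "in"])
target = encode(["rome"])

# Step 3: repeat over the corpus; here we just loop over one example.
for step in range(100):
    logits = model(context)
    loss = loss_fn(logits.unsqueeze(0), target)  # how wrong was the guess?
    optimizer.zero_grad()
    loss.backward()          # feed the error back
    optimizer.step()         # adjust the parameters

print(vocab[model(context).argmax().item()])  # should print "rome"
```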


Neural Network Language Models

Parameters = number of input units × number of units they connect to (one weight per connection)
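
A rough worked example of that count, assuming a single fully connected layer with made-up sizes:

```python
# For a fully connected layer, the number of weight parameters is
# (input units) x (output units), plus one bias per output unit.
input_units = 512
output_units = 1024

weights = input_units * output_units   # one weight per connection
biases = output_units                  # one bias per output unit
print(weights + biases)                # 525312 parameters for this one layer
```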


Transformers - "king of AI architectures"

 - built by stacking many smaller ("mini") neural networks
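
A minimal sketch of one transformer block, assuming PyTorch; the point is only that it is composed of smaller networks (an attention layer plus a feed-forward network) that a full model stacks many times over:

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attention = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.feed_forward = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim)
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):
        # Each token looks at the other tokens (attention), then each token
        # is transformed by a small feed-forward network.
        attended, _ = self.attention(x, x, x)
        x = self.norm1(x + attended)
        return self.norm2(x + self.feed_forward(x))

tokens = torch.randn(1, 10, 64)           # batch of 1, 10 tokens, 64 dims each
print(TransformerBlock()(tokens).shape)   # torch.Size([1, 10, 64])
```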


GPT: "Generative Pre-trained Transformer"

Pre-training - how is it done?

  • hire expert trainers who teach the AI (by providing correct answers)
  • in the wild: observe how people respond, and offer multiple replies for users to choose between (a small sketch of this kind of feedback data follows after this list)
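
A minimal sketch of what that feedback data might look like; the record layout and example replies are hypothetical, not from the lecture:

```python
# When users pick between candidate replies, each choice becomes a
# preference record that can later be used as training signal.
from dataclasses import dataclass

@dataclass
class PreferenceExample:
    prompt: str
    chosen: str     # the reply the user picked
    rejected: str   # the reply the user passed over

feedback_log = [
    PreferenceExample(
        prompt="Explain photosynthesis in one sentence.",
        chosen="Plants turn sunlight, water and CO2 into sugar and oxygen.",
        rejected="Photosynthesis is when plants sleep at night.",
    ),
]

for example in feedback_log:
    print(example.prompt, "->", example.chosen)
```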


"Fine tuning" - specializing the network e.g. add medical data to generic pretrained model

GPT is fine-tuned to be general purpose


How can a model become great? "Size matters, bigger is better"

Since 2018 model sizes have grown: GPT-1 was an "ant brain"; GPT-4 now has around 1 trillion parameters, "more than a rat brain"

The more parameters, the more powerful the AI. Once you reach around half a billion parameters, capabilities such as common-sense reasoning, summarization, and arithmetic become possible and powerful

GPT-4 has read almost all human-written text

GPT-4 cost around $100 million to train; OpenAI can do this because Microsoft backs them (not everyone can)


"HHH" framework i.e. Helpful, Honest and Harmless

  • How do you make the model HHH? You fine-tune it by giving it feedback on whether its answers are acceptable or not (a rough sketch of using such feedback follows below)
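
A very rough sketch of one simple way to use such feedback: keep only the replies people marked as acceptable and fine-tune on those. This is a simplified stand-in for the reward-based methods used in practice; the ratings and replies below are made up:

```python
# Human-rated replies: "ok" marks whether a reviewer accepted the answer.
rated_replies = [
    {"prompt": "How do I treat a minor burn?",
     "reply": "Cool it under running water.", "ok": True},
    {"prompt": "How do I treat a minor burn?",
     "reply": "Ignore it, burns heal on their own.", "ok": False},
]

# Only the acceptable replies become fine-tuning examples.
training_examples = [r for r in rated_replies if r["ok"]]
print(len(training_examples), "examples kept for fine-tuning")
```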


LLMs can hallucinate, i.e. fabricate nonexistent or false facts


Llama 2 is a ChatGPT-like model from Meta
