Notes on The Turing Lectures: "What is generative AI?"
Generative - create new content (audio, code, images, text, video)
Artificial Intelligence - doing it automatically with a computer program; "a lot of myth about it, it's just a tool", "we have to spell it out"
Generative AI is not a new concept. Google Translate (~17 years old) is an example of generative AI.
Another example is Apple's Siri (~11 years old). Phone autocomplete and Google search autocomplete are also generative AI.
Not that new, so what's the fuss? Answer: ChatGPT was launched. ChatGPT is a lot more sophisticated than older AIs - you can have a conversation with it.
The technology is not extremely new. It uses language modeling: given a context, the language model predicts the next word.
e.g. context: "I want to"
- prediction: play, eat, shovel
AIs used to predict by counting word frequencies, but the latest ones use neural networks. Feed the context (the prompt text) into a neural language model, which predicts the continuation.
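A minimal sketch of the older counting approach: a toy bigram model that counts which word follows which over a made-up corpus (not anything from the lecture):

```python
from collections import Counter, defaultdict

# Toy corpus; real systems count over billions of words.
corpus = "i want to eat . i want to play . i want to sleep . i want to play".split()

# Count how often each word follows a given context word (bigram counts).
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

# Given the context "to", rank candidate next words by relative frequency.
context = "to"
total = sum(next_word_counts[context].values())
for word, count in next_word_counts[context].most_common():
    print(word, count / total)   # play 0.5, eat 0.25, sleep 0.25
```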
Basic steps to build a language model:
- Step 1. Collect a very large body of text (a "corpus"), e.g. the web (Wikipedia, books, GitHub, Stack Overflow)
- Step 2. Ask language model to predict the next word in a sentence
- randomly truncate the last part of an input sentence, calculate the probability of the missing words, then adjust and feed the correction back to the model
- e.g. "The Trevi Fountain is in ___": Rome - good, Berlin - bad
- Step 3. Repeat over the whole corpus. Keep going over months.
- So it learns over time by practicing and correcting (see the toy training-loop sketch below).
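A toy sketch of that training loop, assuming PyTorch; the tiny model, the three sentences and all the sizes are made up for illustration and are not the lecture's setup:

```python
import torch
import torch.nn as nn

# Toy corpus standing in for "the web"; real training uses billions of sentences.
sentences = [
    "the trevi fountain is in rome",
    "the eiffel tower is in paris",
    "the brandenburg gate is in berlin",
]
vocab = sorted({w for s in sentences for w in s.split()})
word_to_id = {w: i for i, w in enumerate(vocab)}

class TinyLM(nn.Module):
    """Deliberately tiny 'language model': average the context embeddings, score every vocab word."""
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, context_ids):
        return self.out(self.embed(context_ids).mean(dim=0, keepdim=True))

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

# Steps 2-3: hide the last word of each sentence, predict it, adjust the weights, repeat.
for epoch in range(200):
    for s in sentences:
        words = s.split()
        context = torch.tensor([word_to_id[w] for w in words[:-1]])  # truncated sentence
        target = torch.tensor([word_to_id[words[-1]]])               # the missing word
        loss = loss_fn(model(context), target)  # low loss = high probability on the right word
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# After training, the model should put most probability on "rome" for this context.
ctx = torch.tensor([word_to_id[w] for w in "the trevi fountain is in".split()])
print(vocab[int(model(ctx).argmax())])
```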
Neural Network Language Models
Parameters = roughly the number of weights, i.e. # units in one layer * # units in the next layer, summed over the layers (plus biases)
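A back-of-the-envelope example of that count, using made-up layer sizes for a plain fully connected network:

```python
# Hypothetical layer widths for a toy fully connected network.
layer_sizes = [1000, 512, 512, 1000]

# Parameters ~= weights between consecutive layers, plus one bias per output unit.
params = sum(n_in * n_out + n_out
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
print(params)   # 1,288,168 parameters for this toy network
```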
Transformers - "king of AI architectures"
- uses mini neural networks
GPT: "Generative Pretrained Transformer"
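For a feel of "prompt in, continuation out", a minimal sketch assuming the Hugging Face transformers library and the small public gpt2 checkpoint are installed (not the lecture's setup):

```python
from transformers import pipeline

# Load a small pretrained generative transformer and ask it to continue a prompt.
generator = pipeline("text-generation", model="gpt2")

result = generator("I want to", max_new_tokens=5, num_return_sequences=1)
print(result[0]["generated_text"])   # e.g. "I want to play ..." (output varies)
```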
Pre-training - how is it done?
- hire expert trainers who teach the AI what correct answers look like
- in the wild: show users multiple candidate replies and learn from which ones they choose
"Fine tuning" - specializing the network e.g. add medical data to generic pretrained model
gpt is fine tuned to be general purpose
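A minimal sketch of what fine-tuning looks like in practice, assuming PyTorch plus the Hugging Face transformers library; the "medical" sentences are invented placeholders for a real domain corpus:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Start from a generic pretrained model, then keep training it on domain text only.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Hypothetical "medical" sentences standing in for a real domain corpus.
domain_corpus = [
    "the patient presented with acute chest pain",
    "metformin is commonly prescribed for type 2 diabetes",
]

model.train()
for epoch in range(3):
    for text in domain_corpus:
        inputs = tokenizer(text, return_tensors="pt")
        # With labels=input_ids, the model computes its own next-word prediction loss.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```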
How can a model become great? "Size matters, bigger is better"
Since 2018 model sizes have grown hugely: GPT-1 was an "ant brain"; GPT-4 now has ~1 trillion parameters, "more than a rat brain".
The more parameters, the more powerful the AI. Once you reach around half a billion parameters, abilities like "common sense reasoning", "summarization" and "arithmetic" become possible and powerful.
GPT-4 has read almost all human-written text.
GPT-4 cost ~$100 million to train; OpenAI can do it because Microsoft backs them (not everyone can).
"HHH" framework i.e. Helpful, Honest and Harmless
- How do you make the model HHH? You fine-tune it by letting it know whether its answers are correct or not.
LLMs can "hallucinate", fabricating nonexistent or false facts.
Llama 2 is a ChatGPT-like model from Meta.