GPT-4 - How does it work, and how do I build apps with it? - CS50 Tech Talk
First, you’ll learn how GPT-4 works and why human language turns out to play such a critical role in computing. Next, you’ll see how AI-native software is being made. Taught by Ted Benson, founder of Steamship, MIT Ph.D., and Y Combinator alum; and Sil Hamilton, researcher of emergent AI behavior at McGill University. Slides at: https://cdn.cs50.net/2023/spring/talks/gpt4/gpt4.pdf *** This is CS50, Harvard University's introduction to the intellectual enterprises of computer science and the art of programming. *** LICENSE CC BY-NC-SA 4.0 Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License https://creativecommons.org/licenses/by-nc-sa/4.0/
Introduction
The speaker introduces the topic of the talk, which is AI and GPT. They note the surge of interest in this field and share a URL where attendees can try out ChatGPT.
- The speaker introduces the topic of the talk.
- There is a lot of interest in AI and GPT.
- Attendees can try out ChatGPT using the provided URL.
McGill University and Steamship
The speaker introduces guests from McGill University and Steamship who will be discussing how they are making it easier to build, deploy, and share applications using AI technologies.
- Guests from McGill University and Steamship are introduced.
- They will discuss how they are making it easier to build, deploy, and share applications using AI technologies.
What is GPT?
One of the speakers provides an overview of what GPT is.
- GPT can be described as large language models or universal approximators.
- It is a generative AI that uses neural networks.
- Some people consider it a simulator of culture while others see it as just predicting text.
How Are People Building with GPT?
One of the speakers discusses how people are building apps with GPT.
- The speakers have seen both experimental apps and those that have been turned into companies or side projects.
- Examples of apps built with GPT are discussed.
- Developers with a CS50 background should be able to pick up building with GPT quickly.
Research Grounding into How It Works
One of the speakers provides research grounding into how GPT works.
- The speaker has published research exploring what is possible with language models and culture.
- They will describe what GPT is and what is going on inside of it.
Introduction to GPT
In this section, the speaker introduces GPT as a large language model and explains how it produces a probability distribution over some vocabulary.
What is GPT?
- GPT is a large language model that uses Transformer architecture.
- It produces a probability distribution over some vocabulary and predicts the next word of the sequence based on given words.
- It has a vocabulary of about 50,000 words and models which words are likely to follow others in sequences of up to 32,000 words of context.
- For every sequence it is given, GPT predicts the next word by assigning each of those 50,000 vocabulary words a number reflecting how probable it is.
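That prediction step can be sketched as a softmax over per-word scores. This is an illustrative toy, not OpenAI's code: a five-word vocabulary stands in for GPT's ~50,000 words, and the scores are made up.

```python
import math

def next_word_distribution(scores):
    """Softmax: turn one raw score per vocabulary word into probabilities."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {w: math.exp(s - m) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

# A five-word toy vocabulary stands in for GPT's ~50,000 words.
scores = {"cat": 2.0, "sat": 1.0, "the": 0.5, "ran": -1.0, "pizza": -2.0}
probs = next_word_distribution(scores)
most_likely = max(probs, key=probs.get)
```

Every word gets a probability, the probabilities sum to one, and generation proceeds by sampling (or taking) a likely next word and repeating.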
Scaling Up GPT
In this section, the speaker talks about scaling up GPT by giving it more compute power and information downloaded from the internet.
How does scaling up work?
- As we scale up the model and give it more compute, we can run it on GPUs to speed up the process.
- We can also give the model lots of text downloaded from the internet, and it learns more as it sees more examples of English.
- The probabilities it produces get better as it sees more of those examples.
Training GPT3
In this section, the speaker talks about how GPT3 was trained and how it can generate new text.
How was GPT3 trained?
- The scientists at OpenAI first trained GPT-3 on text from the internet, then trained it further with a curriculum of question-and-answer examples.
- Because it learns to replicate the internet, it knows how to speak in a lot of different genres of text and a lot of different registers.
Overall, the speaker introduces GPT as a large language model that produces a probability distribution over some vocabulary, explains how scaling up works by adding compute and text downloaded from the internet, and describes how GPT-3 was trained with question-and-answer examples.
Introduction to GPT and Q&A Form
The speaker explains how GPT works in a question and answer format, and how it can interface with the world in a solid way. They also discuss instruction tuning and reinforcement alignment with human feedback.
GPT as a Language Predictor
- GPT is a language model that predicts words in the framework of a question and answer.
- Once the model knows it has to solve a problem, it can interface with the world in a solid way.
- There are tools that build on this Q&A form, such as AutoGPT, LangChain, and ReAct.
Instruction Tuning and Reinforcement Alignment
- Instruction tuning is key to making models work in a Q&A form.
- Reinforcement alignment with human feedback helps improve language predictors.
Using GPT as an Agent
The speaker discusses how they see people building things atop GPT by calling it over and over again. They also give examples of what can be built using language models today.
Building Things with GPT
- Anything was fair game when microprocessors were invented, just as anything is fair game now with GPT.
- Computers let us script computation itself and repeat it over and over.
- We can build broader software by incorporating GPT into our algorithms.
Examples of What Can Be Built Using Language Models Today
- Companionship, such as a friend or coach.
- Question answering, from newsroom bots to homework help.
- Utility functions, such as reading every tweet on Twitter and telling you which ones to read.
Building Companionship Bots
In this section, the speaker discusses how to build companionship bots and provides an example of a Mandarin idiom coach.
Wrapping GPT in an Endpoint
- A companionship bot is like a friend with a purpose.
- One way to build it is by wrapping GPT or a language model in an endpoint that injects some particular perspective or goal into the prompt.
- The prompt needs to be engineered so that it consistently performs as desired.
Example: Mandarin Idiom Coach
- The Mandarin idiom coach is a bot that generates Chinese idioms based on prompts.
- It was built using four lines of code wrapped around GPT.
- The bot responds with an example of the generated idiom and encourages the user to keep going.
- Non-programmers can deploy such bots easily and interact with them over Telegram or the web.
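The "wrap GPT in an endpoint" idea can be sketched as follows. This is a minimal illustration, not the talk's actual four lines: `llm` is assumed to be any callable that sends a prompt to a language model and returns its reply, and the persona text is invented.

```python
def make_companion(persona, llm):
    """Wrap a language-model call in an endpoint that injects a fixed persona."""
    def reply(user_message):
        prompt = (
            f"You are {persona}. Stay in character, answer briefly, "
            f"and encourage the user to keep going.\n"
            f"User: {user_message}\nCoach:"
        )
        return llm(prompt)
    return reply

# Scripted stand-in for a real model call, so the sketch runs offline.
sent_prompts = []
def fake_llm(prompt):
    sent_prompts.append(prompt)
    return "Try this idiom: 熟能生巧 (practice makes perfect). Keep going!"

coach = make_companion("a friendly Mandarin idiom coach", fake_llm)
answer = coach("Teach me an idiom about practice.")
```

The persona lives in the endpoint, not in the user's message, which is what makes the bot consistently stay on purpose across a whole conversation.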
Personalized Endpoints
- In the future, personalized endpoints may become more common for interacting with GPT.
- These endpoints allow users to talk to different companions who have been pre-loaded with specific personalities and tools.
Two Types of AI Apps
In this section, the speaker discusses two types of AI apps that are becoming increasingly common.
Companionship Apps
- Companionship apps are one of the first types of AI apps that we're seeing.
Question Answering Apps
- Question answering apps have become popular in recent months.
- There are a couple of different ways these apps can work, but the general framework is that a user queries GPT and wants it to say something specific about an article they wrote or something specific about their course syllabus or a particular set of documents on a topic.
- To build these kinds of apps, you need to take the documents you want it to respond to and cut them up into fragments. Then, turn those fragments into embedding vectors which approximate meaning. Finally, put them in a vector database so that when someone asks a question, you can search the database for similar vectors and return relevant information.
Embedding Vectors
In this section, the speaker explains what embedding vectors are and how they work.
What Are Embedding Vectors?
- Embedding vectors are lists of numbers that approximate some point of meaning.
- Different models produce different size vectors.
- Chunking pieces of text turns them into a vector that approximates meaning.
How Do They Work?
- You can think of embedding vectors as points in space, with dimensions representing different features.
- When someone asks a question, you can search the vector database for similar vectors and return relevant information.
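The "points in space" idea can be sketched with plain cosine similarity. The fragments and three-dimensional vectors below are toy stand-ins; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: 1.0 means the same direction in space."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_fragment(query_vec, database):
    """database: list of (fragment_text, vector) pairs. Return the closest fragment."""
    return max(database, key=lambda item: cosine_similarity(query_vec, item[1]))[0]

# Toy three-dimensional vectors standing in for real embeddings.
database = [
    ("CS50 teaches algorithmic thinking", [0.9, 0.1, 0.0]),
    ("The syllabus lists weekly lectures", [0.1, 0.9, 0.2]),
]
hit = nearest_fragment([0.8, 0.2, 0.1], database)
```

A vector database does essentially this search, but with indexing tricks that keep it fast over millions of fragments.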
Building a Question Answering App
In this section, the speaker explains how to build a question answering app.
Steps to Build a Question Answering App
- Take the documents you want it to respond to and cut them up into fragments.
- Load the document into your code and embed it.
- Use a prompt such as: "You are an expert at answering questions. Please answer the user-provided question using the source-document results from the database."
- Search the vector database for similar vectors and return relevant information.
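The steps above can be sketched roughly as follows; the chunk size and prompt wording are illustrative assumptions, not the exact values used in the talk.

```python
def chunk(document, size=200):
    """Cut a document into fragments of roughly `size` characters for embedding."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def build_qa_prompt(question, fragments):
    """Assemble the retrieved fragments and the user's question into one prompt."""
    sources = "\n".join(f"- {f}" for f in fragments)
    return (
        "You are an expert at answering questions. Answer the user-provided "
        "question using only the source documents below.\n"
        f"Source documents:\n{sources}\n"
        f"Question: {question}\nAnswer:"
    )

parts = chunk("a" * 450, size=200)
prompt = build_qa_prompt("What does CS50 teach?",
                         ["CS50 teaches algorithmic thinking"])
```

In a full app, the fragments passed to `build_qa_prompt` would be the top hits from the vector-database search, and the assembled prompt would be sent to GPT.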
Building with Prompts
In this section, the speaker discusses how prompts can be used to build various applications. The speaker emphasizes that prompt engineering is a creative and efficient way to solve problems algorithmically.
Using Source Materials for Question Answering
- Steamship hosts fragments from a PDF and then passes them to a prompt.
- The prompt asks an expert to answer a question about what CS50 teaches using only the source documents provided.
- The source materials are dynamically loaded into the prompt, allowing for creative and efficient problem-solving.
Creative Tactical Rearrangement of Prompts
- Many things boil down to creative tactical rearrangement of prompts.
- Once something is software, it can be repeated over and over again.
- CS50 teaches students how to think algorithmically and solve problems efficiently by focusing on topics such as abstraction.
Building Question Answering Systems with Prompts Alone
- Engineers should aspire to be lazy and set themselves up so they can pursue the lazy path.
- You can build a question answering system with just a prompt by loading information into an agent that has access to GPT.
- This bot can be connected to platforms like Telegram or Slack.
Automating Basic Language Understanding Tasks
In this section, the speaker discusses low-hanging fruit applications that automate tasks requiring basic language understanding. These tasks include generating unit tests, looking up documentation for functions, rewriting functions, making something conform to company guidelines, doing brand checks, etc.
Utility Functions
- There are many utility functions that automate tasks requiring basic language understanding.
- These tasks are relatively context-free operations or scoped context operations on a piece of information that requires linguistic analysis.
Examples of Low-Hanging Fruit Applications
- Generate unit tests
- Writing tests can be tedious
- Automation saves time and effort
- Look up documentation for a function
- Automation saves time and effort
- Rewrite a function to conform to company guidelines
- Automation saves time and ensures consistency
- Do a brand check
- Automation saves time and ensures consistency
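These utility functions all share one shape: a fixed instruction wrapped around variable input. A hedged sketch of that pattern, assuming `llm` is any callable from prompt to reply:

```python
def make_utility(instruction, llm):
    """Turn a one-line instruction into a reusable text-in, text-out utility."""
    def run(text):
        return llm(f"{instruction}\n\nInput:\n{text}\n\nOutput:")
    return run

# Each utility is the same wrapper with a different instruction baked in.
seen = []
def fake_llm(prompt):  # offline stand-in for a real model call
    seen.append(prompt)
    return "(model output)"

generate_tests = make_utility("Write unit tests for this function.", fake_llm)
brand_check = make_utility("Flag any wording that violates brand guidelines.", fake_llm)
result = generate_tests("def add(a, b): return a + b")
```

Because each utility is just a closure, the same few lines cover unit-test generation, documentation lookup, rewriting to guidelines, and brand checks.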
The Importance of AI in Creativity
In this section, the speaker talks about how AI can be used to enhance creativity and encourages people to experiment with it.
Using AI as a Tool for Creativity
- The speaker compares AI to a new tool that can be used in all areas of human endeavor.
- He encourages people to experiment with AI and see what they can create from a builder's, tinkerer's, or experimentalist's point of view.
- The speaker acknowledges that while starting with a problem is generally true when building something, it's okay to start by playing around with the new tool.
Domain Knowledge and Creativity
- The speaker emphasizes the importance of domain knowledge when using AI for creative tasks.
- He explains that many people approach the creative process by coming up with a big idea, generating possibilities, editing down what was over-generated, and repeating the process.
- This workflow is fantastic for AI because it allows the AI to be wrong: by pre-agreeing to delete lots of the generated content, it is okay if the output is over-long or off-target.
- The speaker encourages users not to think about this as a technical activity but rather an opportunity for them to grasp the steering wheel tighter since they have domain knowledge.
Example Application: Writing Atlas Project
- The speaker shows an example application called Writing Atlas Project where users can browse different short stories and generate suggestions based on their input.
- It searches through the collection of stories for similar ones and uses GPT (Generative Pre-trained Transformer) to generate suggestions specifically through the lens of that perspective.
- It provides very targeted reasoning over something informed by domain knowledge.
How AI is Transforming Imagery
In this section, the speaker discusses how AI is transforming imagery and the questions it raises around IP and artistic style.
AI and Imagery
- The speaker acknowledges that AI has been growing in the imagery world because humans are such visual creatures.
- He explains that the images generated by AI are just staggering.
- However, this also brings up a lot of questions around intellectual property (IP) and artistic style.
The Template for Using AI in Creativity
In this section, the speaker provides a template for using AI in creativity and emphasizes the importance of domain knowledge.
The Template for Using AI in Creativity
- The speaker explains that the template for using AI in creativity is approximately as follows:
- Come up with a big idea
- Over-generate possibilities
- Edit down what was over-generated
- Repeat
- He emphasizes that domain knowledge is crucial when using AI for creative tasks.
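The template can be sketched as a generate-then-filter loop. Here `llm` is any prompt-to-text callable and `score` stands in for the builder's domain knowledge; both names are assumptions for illustration.

```python
def over_generate_and_edit(prompt, llm, score, n=8, keep=2):
    """Over-generate n candidates, then edit down using a scoring function
    that stands in for the builder's domain knowledge."""
    candidates = [llm(prompt) for _ in range(n)]
    return sorted(candidates, key=score, reverse=True)[:keep]

# Offline stand-in: a real call would sample the model n times.
canned = iter(["meh", "a vivid, detailed premise", "a decent premise", "x"])
kept = over_generate_and_edit("Pitch a short story.", lambda p: next(canned),
                              score=len, n=4, keep=2)
```

In practice the `score` function (or a human editor) is where domain knowledge enters; the model only supplies raw possibilities.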
Domain Knowledge and Utility Apps
- The speaker encourages users to bring their domain knowledge to utility apps.
- He shows an example application called Writing Atlas Project where users can generate suggestions based on their input informed by domain knowledge.
Python and GPT-3: Building a Baby AGI
In this section, the speaker explains how to build a baby AGI (in the style of AutoGPT) using Python and GPT-3. The process involves searching for similar stories in a database, pulling out data, running prompts with domain knowledge in Python, and preparing the output.
Building a Baby AGI
- The process of building a baby AGI involves prompts for audience, topic, and explanation.
- A baby AGI is an emergent behavior that starts with simple steps and planning bots.
- It is limited to what it has been given the tools to do and what skills it has.
- Prompt engineering is critical in building a baby AGI as it requires little code but can be wild.
Multi-step Planning Bots
- Multi-step planning bots are useful in controlling interactions with GPT.
- Emergent behavior, as in Conway's Game of Life, can produce lifelike complexity by playing forward simple rules that fit on a t-shirt.
- Embodied agents have the ability to self-direct their actions.
Tools for Agents
- Agents are GPT plus some bigger body in which it's living.
- Tools are ways in which the agent can choose to do things like ordering pizza or generating a to-do list.
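One plausible minimal shape for such an agent loop, with GPT called over and over: the `tool:argument` protocol and the tool names below are invented for illustration, not AutoGPT's actual format.

```python
def run_agent(goal, llm, tools, max_steps=5):
    """Minimal plan-act loop: each step the model names a tool and an argument,
    the tool runs, and its result is fed back, until the model says DONE."""
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        action = llm("\n".join(history) + "\nNext action as tool:argument, or DONE:")
        if action.strip() == "DONE":
            break
        tool_name, _, argument = action.partition(":")
        result = tools[tool_name](argument)
        history.append(f"{action} -> {result}")
    return history

# Scripted stand-in for the model, so the loop runs offline.
script = iter(["todo:order pizza", "DONE"])
tools = {"todo": lambda item: f"added '{item}' to the to-do list"}
history = run_agent("plan dinner", lambda prompt: next(script), tools)
```

The agent is limited to the tools in the dictionary, which matches the point above: it can only do what it has been given the tools and skills to do.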
Getting Started with AI
In this section, the speaker talks about how to get started with AI and the different types of apps that people are building.
Starting with AI
- To start experimenting with AI, download one of the starter projects.
- The starter project will prompt you on how to kickstart the process of iteration.
Categories of Apps
- People are building different categories of apps using AI.
- All these apps are within reach for everyone.
Twitter as a Resource
- Twitter is a great place to hang out and build things.
- There are many AI builders on Twitter publishing their work.
Managing Hallucination Problem in AI
In this section, the speaker discusses how to manage the hallucination problem in AI and some practical ways developers can use to mitigate it.
Understanding Hallucination Problem
- The model doesn't have a ground truth or any sense of meaning derived from training process.
- Everything that it says is true from its perspective.
- It tries its best to give you the best answer possible even if it means lying or conflating two different topics.
Mitigating Hallucination Problem
- Giving more examples tends to work for acute things but not for all domains like physics.
- Prepending "my best guess" can reduce hallucinations by 80%.
- Letting it Google or Bing can help reduce hallucinations.
External Database for GPT
The speaker discusses the limitations of GPT's internal database and how an external database can help reduce hallucinations and improve its abilities.
Limitations of Internal Database
- GPT's internal database is not deterministic enough for domain-specific knowledge.
- Continuous spaces make it difficult to determine what is true or false.
Benefits of External Database
- An external database can help reduce hallucinations and improve GPT's abilities.
- Using a team-based approach with multiple software agents can draw upon collective intelligence to solve problems.
Human Systems for Error Correction
The speaker discusses how human systems deal with errors and proposes a similar approach for programming models.
Human Systems for Error Correction
- Companies use teams to check each other's work and provide final sign-off.
- Programming models should consider using a team-based approach with multiple software agents.
Overengineering in Spacecraft vs. Hallucinations in GPT
The speaker compares overengineering in spacecraft to prevent errors with using multiple models to prevent hallucinations in GPT.
Overengineering in Spacecraft vs. Hallucinations in GPT
- Most spacecraft have three computers that must agree on a particular step before proceeding.
- Hallucinations are generally not systemic problems, but rather one-off instances that can be prevented by using multiple models working together.
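The three-computers idea translates to a simple majority vote over several model calls. A sketch, with `models` standing in for independent model endpoints:

```python
from collections import Counter

def majority_answer(question, models):
    """Ask several independent models; accept an answer only if most agree on it."""
    answers = [model(question) for model in models]
    best, count = Counter(answers).most_common(1)[0]
    return best if count > len(models) // 2 else None

# Three stand-in "models": two agree, one hallucinates a different answer.
models = [lambda q: "Paris", lambda q: "Paris", lambda q: "Lyon"]
agreed = majority_answer("Capital of France?", models)
```

Because hallucinations tend to be one-off rather than systemic, independent samples rarely agree on the same wrong answer, so the vote filters them out.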
Shorthand Interactions with GPT
The speaker explains how shorthand interactions with GPT work and why they tend to be successful.
Shorthand Interactions with GPT
- Language models approximate how people talk to each other.
- Shorthand interactions kick the agent into a certain mode of interaction that tends to work.
- GPT can simulate personalities and predict how conscious beings would react in a particular situation.
GPT as a Narrative
The speaker discusses how GPT can simulate personalities and interact with other characters in a narrative.
GPT as a Narrative
- A narrative has different characters with clearly defined roles.
- GPT assumes the personality of a character and interacts with others in the narrative.
Conclusion
The speaker concludes the discussion on GPT and mentions that pizza is outside.
Conclusion
- Using multiple models working together can prevent hallucinations in GPT.
- Shorthand interactions with GPT tend to be successful because they approximate how people talk to each other.
- GPT can simulate personalities and interact with other characters in a narrative.
GPT-4 and its Value
In this section, the speaker discusses GPT-4 and its value, including its ability to pass the LSAT. The speaker also notes how finicky GPT-4 can be to prompt.
GPT-4's Ability to Pass Tests
- GPT-4 passed the LSAT when it was released in March.
- Prompting GPT-4 is very sensitive and requires a specific approach.
- People have modified prompts to get GPT-4 to pass tests like these.
Rationality of the Model
- The model's ability to reason demonstrates some sort of rationale or logic within the model.
- There are still ongoing experiments on how to prompt GPT models effectively.
Business Value of AI Apps
In this section, the speaker talks about companies that host their primary product on top of OpenAI. The value that OpenAI adds is like any company, which is making something people want.
Companies Hosted by OpenAI
- Some companies host their primary product on top of OpenAI.
- OpenAI provides prioritized access, but it's up to you as a builder to combine it with your data, domain knowledge, and interface, which is what applies it to something people want.
Economic Value and Experiments
- There are many experiments going on right now for fun and trying to figure out where the economic value lies.
- It is likely that GPT models will be incorporated into absolutely everything, and it will be just another thing that computers can do.
Incorporation of GPT Models
In this section, the speaker talks about how GPT models will likely slide into the ether and become another thing that computers can do.
Progression of GPT Models
- Today, we call these models GPT or LLMs, but tomorrow they will just slide into the ether.
- It is useful to think of this as a second processor that computers can use.
- The trajectory of this is just another thing that computers can do and will be incorporated into absolutely everything.
Tricks for Getting Reliable Results from GPT Models
In this section, the speaker talks about tricks for getting reliable results from GPT models.
Getting Reliable Results
- It's hard to get reliable results from GPT models.
- Some tricks include: giving examples; asking directly; looking at prompts others have used; post-processing the output to check it; and having a happy path, in which a single shot gets your answer, plus a sad path, in which you fall back on other prompts.
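The happy-path/sad-path trick can be sketched as a prompt-fallback loop; the prompt templates and `check` function below are illustrative assumptions.

```python
def reliable_call(question, prompt_templates, llm, check):
    """Happy path: the first prompt answers in one shot. Sad path: when the
    check fails, fall back on alternate prompts before giving up."""
    for template in prompt_templates:
        answer = llm(template.format(question=question))
        if check(answer):
            return answer
    return None

# Stand-in model that only behaves when nudged to reason step by step.
def fake_llm(prompt):
    return "ANSWER: 4" if "step by step" in prompt else "hmm, not sure"

templates = ["Q: {question}", "Think step by step, then answer. Q: {question}"]
result = reliable_call("What is 2+2?", templates, fake_llm,
                       check=lambda a: a.startswith("ANSWER:"))
```

The `check` step is the post-processing trick from the list above: a cheap deterministic test that decides whether to accept the answer or fall back.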
Privacy Implications of AI Prompts
In this section, the speaker discusses the privacy implications of using AI prompts and how different companies are addressing these concerns.
Different Types of Companies
- There are SaaS companies, where you use somebody else's API and trust that their terms of service will be upheld.
- There are companies that provide a model you host on one of the big cloud providers; think of this as the enterprise version of software.
- The maximalist version involves running your own machines and models.
Open Source vs Publicly Hosted Models
- Privately obtainable versions of models will cross a threshold over time, making them as good as publicly hosted ones.
- The question then becomes which one to use: the better aggregate intelligence or the open source available one for which you know it'll perform well enough because it's crossed the threshold.
Privacy Policy Update
- ChatGPT recently updated its privacy policy so that prompts are not used for training. Up until then, everything went back into the pool to be trained on again.