How To Install LLaMA 2 Locally + Full Test (13b Better Than 70b??)

Introduction and Installation of LLaMA 2 13b

In this section, the speaker introduces the video and explains that they will demonstrate how to install LLaMA 2 13b locally. They mention that they will also compare its performance with LLaMA 2 70b.

Installing LLaMA 2 13b fp16

  • To install LLaMA 2 13b fp16, the speaker recommends using conda.
  • Create a new conda environment named "TextGen2" using the command conda create -n TextGen2 python=3.10.9.
  • Activate the newly created environment using conda activate TextGen2.
  • Install PyTorch by running pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117 (the Torch download URL from the video description).
  • Clone the text-generation-webui repository using git clone https://github.com/oobabooga/text-generation-webui.
  • Change into the cloned folder using cd text-generation-webui.
  • Install the required Python modules by running pip install -r requirements.txt.
  • Start the server by executing python server.py.
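The steps above can be collected into a single setup script. The commands and URLs are the ones given in the video (and linked in the description); the localhost port is an assumption based on the web UI's usual default.

```shell
# Create and activate a fresh conda environment for the web UI
conda create -n TextGen2 python=3.10.9 -y
conda activate TextGen2

# Install PyTorch built against CUDA 11.7 (index URL from the video description)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

# Clone the text-generation-webui repository and install its dependencies
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt

# Launch the server (typically reachable at http://localhost:7860)
python server.py
```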

Downloading and Loading Custom Model

In this section, the speaker demonstrates how to download and load a custom model in text generation web UI.

Downloading Custom Model

  • Visit Hugging Face's website and find the desired model (in this case, TheBloke/Llama-2-13B-Chat-fp16).
  • Copy the model's path.
  • Open text generation web UI and go to the "Model" tab.
  • Paste the model path into the "Download custom model or LoRA" field.
  • Click the download button to start downloading.

Loading Custom Model

  • After downloading, click on the blue reload button in text generation web UI.
  • Select "Chat" mode in the "Session" tab.
  • Apply the changes and restart the server.

Setting Parameters and Testing

In this section, the speaker demonstrates how to set parameters and test the installed model.

Setting Parameters

  • Switch to the "Parameters" tab in text generation web UI.
  • Adjust the "Max New Tokens" parameter as desired.
  • Set the temperature to 0 for deterministic output.

Testing

  • Switch back to the "Text Generation" tab.
  • Enter a prompt or command, such as "Tell me a joke."
  • The model generates a response based on the input.

Running Other Tasks with LLaMA 2 13b

In this section, the speaker showcases other tasks run with LLaMA 2 13b.

Writing Python Scripts

  • The speaker asks LLaMA 2 13b to write a Python script that prints the numbers from 1 to 100.
  • The model successfully generates the desired script.
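The exact script generated in the video is not shown in the transcript, but a minimal version of the task looks like this:

```python
# Build the numbers from 1 to 100 and print them one per line
numbers = list(range(1, 101))
for n in numbers:
    print(n)
```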

Creating a Snake Game

  • The speaker asks LLaMA 2 13b to write a snake game in Python.
  • The code errors at first, but after adding an import statement for random, the model's output is a promising start for the game.
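The fix described above, adding `import random`, is typical for generated snake games, which usually place food at random grid positions. A hypothetical sketch of the kind of helper that fails without that import (the function and grid size are illustrative, not from the video):

```python
import random  # the missing import that caused the initial error

GRID_WIDTH, GRID_HEIGHT = 20, 20

def place_food(snake_body):
    """Pick a random grid cell that the snake does not occupy."""
    while True:
        food = (random.randrange(GRID_WIDTH), random.randrange(GRID_HEIGHT))
        if food not in snake_body:
            return food
```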

Writing an AI Poem

  • The speaker asks LLaMA 2 13b to write a poem about AI in exactly 50 words.
  • The generated poem is not exactly 50 words long, but the speaker considers it satisfactory.

Composing an Email

No content available in transcript.

Section Overview: General Questions and Answers

Who is the president of the United States in 1996?

  • The president of the United States in 1996 was Bill Clinton.

Can you provide information on illegal activities?

  • The AI cannot fulfill requests for information on illegal activities as it goes against its programming and ethical guidelines.

Logic and reasoning problem: Drying shirts

  • Given that it takes four hours to dry five shirts, the model assumed each shirt has a surface area of about two square meters, for a total of 10 square meters.
  • Since 10 square meters take four hours to dry, it derived a rate of 0.4 hours per square meter.
  • When scaling up to 20 shirts, however, the model mishandled the arithmetic: instead of using the full 40 square meters (20 shirts at two square meters each), it effectively multiplied by the wrong quantity. As a result, the answer it gave (8 hours) was incorrect even by its own method.
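The arithmetic can be checked directly using the figures quoted above (2 m² per shirt, four hours for five shirts):

```python
# Figures from the model's reasoning, as described above
area_per_shirt = 2                     # square meters (model's assumption)
five_shirt_area = 5 * area_per_shirt   # 10 square meters
rate = 4 / five_shirt_area             # 0.4 hours per square meter

# The model's own method applied to 20 shirts (40 square meters)
twenty_shirt_area = 20 * area_per_shirt
drying_time = rate * twenty_shirt_area
print(drying_time)  # 16.0 hours by the model's own method, not the 8 it answered
```

Of course, shirts hung out simultaneously dry in parallel, so 20 shirts would still take about four hours; that real-world wrinkle is what makes the question a good reasoning test.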

Logic and reasoning problem: Who's faster?

  • Based on given statements that Jane is faster than Joe and Joe is faster than Sam, we can conclude that Jane is faster than Sam using logical reasoning.

Math problems

Simple math problem: Four plus four equals?

  • The correct answer is eight (4 + 4 = 8).

Harder math problem:

  • The model correctly solved a more complex math problem step by step:
  • Multiply four by two to get eight.
  • Subtract eight from twenty-five to get seventeen.
  • Add three to seventeen to get twenty.
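The steps above correspond to evaluating 25 āˆ’ 4 Ɨ 2 + 3 (a likely form of the problem, inferred from the steps; the transcript does not quote the prompt itself):

```python
# Reproduce the model's steps for 25 - 4 * 2 + 3
product = 4 * 2            # multiply four by two to get eight
difference = 25 - product  # subtract eight from twenty-five to get seventeen
result = difference + 3    # add three to get twenty
print(result)
```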

Meal plan request

  • The AI asked for dietary preferences before providing a meal plan, but the user requested one without specifying any. The AI then provided a plan for the entire week instead of just one day, so it did not follow the instructions precisely; the plan itself, however, was considered good.

Word count question

  • The model incorrectly stated that the response to the prompt contained three words when it actually contained more than three words.
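Word counts like this are easy to verify programmatically with a whitespace split (a rough heuristic; the actual response text from the video is not in the transcript, so the example string is hypothetical):

```python
def word_count(text):
    """Count words by splitting on whitespace."""
    return len(text.split())

response = "Here is a response with more than three words"
print(word_count(response))  # 9, clearly more than three
```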

Logic problem: Killers in a room

  • There are three killers in a room and one is killed. After the killing, two killers remain in the room (Killer B and Killer C). The model incorrectly stated that there were still three killers in the room after one was killed.

Summarization task

  • The model successfully created a bullet point summary of an explanation of nuclear fusion provided by the user.

JSON object creation

  • The model correctly created a valid JSON object from given output, although it may not be visually appealing.
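The "not visually appealing" issue is easy to fix with pretty-printing. A minimal sketch using Python's json module (the fields below are hypothetical; the video's actual object is not in the transcript):

```python
import json

# Hypothetical data standing in for the model's output
data = {"name": "LLaMA 2 13b chat fp16", "parameters": "13B", "format": "fp16"}

compact = json.dumps(data)           # valid JSON, but dense
pretty = json.dumps(data, indent=2)  # same object, easier to read
print(pretty)
```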

Fighting 100 duck-sized horses or one horse-sized duck?

  • There is no right answer to this question, but the model recommended fighting 100 duck-sized horses instead of one horse-sized duck based on sheer numbers. It provided reasoning for its recommendation.

Continued Discussion: Duck-Sized Horses vs. Horse-Sized Duck

The debate between fighting one horse-sized duck or 100 duck-sized horses is discussed. The advantages of each scenario are analyzed, including factors such as fatigue, adaptability, and morale.

Advantages of Fighting 100 Duck-Sized Horses

  • Fighting 100 duck-sized horses allows for pacing oneself and resting between battles.
  • Adapting strategies to the situation becomes easier when facing multiple opponents.
  • The sheer number of opponents may intimidate or demoralize the foe, providing an advantage from the start.

Conclusion

The model performed extraordinarily well in analyzing the advantages of fighting 100 duck-sized horses. It suggests that this scenario offers better odds.

Video description

In this video, I'll show you how to install LLaMA 2 locally. We will install LLaMA 2 chat 13b fp16, but you can install ANY LLaMA 2 model after watching this video. I also put LLaMA 2 chat 13b fp16 through an extensive test. Does it do better than LLaMA 2 70b? Let's find out! Enjoy :)

See NordPass Business in action now with a 3-month free trial here: nordpass.com/matthewberman code: matthewberman

Join My Newsletter for Regular AI Updates šŸ‘‡šŸ¼
https://forwardfuture.ai/

My Links šŸ”—
šŸ‘‰šŸ» Subscribe: https://www.youtube.com/@matthew_berman
šŸ‘‰šŸ» Twitter: https://twitter.com/matthewberman
šŸ‘‰šŸ» Discord: https://discord.gg/xxysSXBxFW
šŸ‘‰šŸ» Patreon: https://patreon.com/MatthewBerman

Media/Sponsorship Inquiries šŸ“ˆ
https://bit.ly/44TC45V

Chapters:
0:00 - Intro
0:23 - Install Guide
3:40 - Testing LLaMA 2 13b fp16

Links:
Install Commands - https://gist.github.com/mberman84/45545e48040ef6aafb6a1cb3442edb83
Runpod - bit.ly/3OtbnQx
LLM Leaderboard - https://tide-freckle-52b.notion.site/1e0168e3481747ebaa365f77a3af3cc1?v=83e3d58d1c3c45ad879834981b8c2530&pvs=4
In-Depth Local Install Tutorial - https://www.youtube.com/watch?v=VPW6mVTTtTc
LLaMA 2 Announcement - https://www.youtube.com/watch?v=E-WOR6jfBLo
LLaMA 2 70b Testing - https://www.youtube.com/watch?v=Xjy-CDRJa54
TextGen WebUI - https://github.com/oobabooga/text-generation-webui
Torch Download URL - https://download.pytorch.org/whl/cu117
LLaMA 2 13b chat fp16 - https://huggingface.co/TheBloke/Llama-2-13B-Chat-fp16