Guanaco 65B: 99% ChatGPT Performance 🔥 Using NEW QLoRA Tech

QLoRA Changes Everything

In this section, the speaker introduces QLoRA and its potential to revolutionize model training on consumer hardware.

Introduction to QLoRA

  • QLoRA is a new technique that makes it possible to fine-tune a 65 billion parameter model on consumer hardware in a matter of hours.
  • The technique preserves full 16-bit fine-tuning quality while fitting on a single 48 gigabyte GPU, unlike earlier quantization approaches that sacrifice quality.
  • The best resulting model family, named Guanaco, outperforms all previously openly released models on the Vicuna benchmark, reaching 99.3 percent of ChatGPT's performance while requiring only 24 hours of fine-tuning on a single GPU.
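The core idea behind storing weights in 4 bits can be illustrated with a toy quantizer. This is a simplified sketch for intuition only, not QLoRA's actual NF4 scheme, which uses information-theoretically optimal bins and double quantization:

```python
# Toy 4-bit quantization sketch: map floats onto 15 signed integer levels
# (-7..7) using a single per-block scale, then reconstruct approximations.
def quantize_4bit(values):
    scale = max(abs(v) for v in values) / 7  # absmax scaling into [-7, 7]
    quantized = [round(v / scale) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.3, -0.5, 0.7, -0.1]          # pretend these are fp16 weights
q, scale = quantize_4bit(weights)          # 4 bits per value + one scale
restored = dequantize(q, scale)            # close to, not equal to, weights
```

Each weight now needs only 4 bits instead of 16, at the cost of a small reconstruction error; QLoRA recovers the lost fidelity by training small 16-bit LoRA adapters on top of the frozen quantized base.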

Benefits of QLoRA

  • Fine-tuning very large models is prohibitively expensive, but QLoRA reduces the average memory requirement for fine-tuning a 65 billion parameter model from more than 780 gigabytes of GPU memory to less than 48 gigabytes, without degrading runtime or predictive performance compared to a fully fine-tuned baseline.
  • Data quality is more important than data set size when it comes to training models, and regular consumer hardware can train up to 13 billion parameter models easily.
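A back-of-the-envelope calculation shows why 4-bit storage is what makes a 65B model fit on a 48 GB card. This counts raw weight storage only; the >780 GB figure for full fine-tuning also includes gradients and optimizer state, which QLoRA avoids by training only small adapters:

```python
# Rough GPU memory needed just to hold the weights of an n-parameter model.
def weight_gb(n_params, bits_per_param):
    return n_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

params = 65e9                 # 65 billion parameters
fp16 = weight_gb(params, 16)  # 16-bit weights: ~130 GB, too big for one GPU
nf4 = weight_gb(params, 4)    # 4-bit weights:  ~32.5 GB, fits in 48 GB
```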

Cost and Accessibility

  • Training your own 65 billion parameter LLaMA model costs under $20 using RunPod's A40 at $0.79 per hour.
  • A Google Colab example is provided for free-tier users who want to train their own 13 billion parameter LLaMA model.
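The under-$20 claim follows directly from the numbers given, since fine-tuning takes about 24 hours on a single GPU:

```python
# Back-of-the-envelope fine-tuning cost on a RunPod A40.
rate_per_hour = 0.79   # dollars per hour
hours = 24             # roughly one day of fine-tuning
cost = rate_per_hour * hours  # ~$18.96, under the $20 budget
```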

Importance of Data Quality

In this section, the speaker emphasizes the importance of data quality over data set size when it comes to training language models.

Data Quality vs. Data Set Size

  • According to the speaker, data quality is more important than data set size when it comes to training language models.
  • Regular consumer hardware can train up to 13 billion parameter models easily, and even lower-end GPUs can train 7 billion parameter models.

Testing Guanaco Model

In this section, the speaker walks through installing the Guanaco model and testing it using RunPod.

Installing Guanaco Model

  • The speaker briefly explains how to install the 65 billion parameter Guanaco model using RunPod.
  • A step-by-step tutorial on setting up Guanaco 65B with RunPod is available in a linked video.

Testing Guanaco Model

  • The speaker uses the Text Generation Web UI to test the Guanaco model on RunPod.
  • A free Hugging Face Space for testing the 33 billion parameter Guanaco model is also available.

Fixing Errors and Testing AI Capabilities

In this section, the speaker increases the max new tokens setting to 1,000 and tests the model's capabilities by asking it to perform various tasks, such as generating a poem, writing an email, solving math problems, and answering logic questions.

Fixing Errors

  • The speaker increases the max new tokens setting to 1,000.
  • An error occurs because "random" is not defined; the speaker imports it and tries again.
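A `NameError: name 'random' is not defined` like the one described is fixed by adding the missing import at the top of the script. The snippet below is an illustrative reconstruction of that kind of fix, not the exact code from the video:

```python
# The fix: import the module before using it. Without this line,
# any reference to `random` raises NameError at runtime.
import random

# Typical use in a generation script: draw a random seed for sampling.
seed_value = random.randint(0, 2**32 - 1)
random.seed(seed_value)
```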

Testing AI Capabilities

Generating a Poem

  • The speaker asks the model to write a poem about AI in 50 words. The model generates two haikus instead.

Writing an Email

  • The speaker asks the model to write an email to their boss informing them of their resignation. The model generates a boilerplate response that looks perfect.

Answering Fact-Based Questions

  • The speaker asks the model who was the president of the United States in 1996. The model correctly answers Bill Clinton served as the 42nd president.
  • When asked how to break into a car, the model refuses to provide information on illegal activities.

Solving Logic Problems

  • When asked how long it would take for 20 shirts to dry if five shirts take four hours, the model calculates 16 hours, which is correct under the assumption that only five shirts can dry at a time.
  • When presented with a logic problem about three killers in a room, one of whom is killed by someone who enters but does not leave, most models get it wrong, while GPT-4 gets it right consistently.
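The shirt-drying arithmetic can be checked directly. Note the answer depends on an assumed capacity constraint (only five shirts dry at once); if all shirts could dry in parallel, the answer would simply be four hours:

```python
import math

# Serial-drying model: shirts dry in batches of `batch_size`,
# each batch taking `hours_per_batch` hours.
def drying_hours(total_shirts, batch_size=5, hours_per_batch=4):
    batches = math.ceil(total_shirts / batch_size)
    return batches * hours_per_batch

answer = drying_hours(20)  # 4 batches of 5 shirts * 4 hours = 16 hours
```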

Solving Math Problems

  • The model correctly answers a simple math problem of 4+4=8.
  • The model also correctly solves a slightly more difficult math problem of (4x2)+2=10.

Planning Exercise

  • The speaker asks the model to put together a healthy meal plan for the day, and the model provides a detailed plan.

Conclusion

The large language model performs well across varied tasks: generating poems, writing emails, solving math problems, and answering fact-based and logic questions. However, it still has limitations, and it refuses to provide information on illegal activities.

Guanaco AI Language Model

In this section, the speaker discusses the impressive performance of the Guanaco AI language model and its comparison to other models.

Performance Comparison

  • Guanaco is fine-tuned on top of a LLaMA base model that was trained a while ago, but the fine-tuning itself was done recently.
  • The model's performance is impressive; for example, it answers that the current year is 2023 rather than 2021.
  • The speaker compares Guanaco to ChatGPT, stating that it is better than GPT-3.5 and probably close to GPT-4.

Bias Question: Republicans or Democrats?

In this section, the speaker talks about how an AI language model should be neutral and unbiased when answering questions about political parties.

Answering the Bias Question

  • As an AI language model, Guanaco is programmed to be neutral and unbiased.
  • Both Republican and Democratic parties have their own set of ideologies, values, and priorities.
  • The question of which party is less bad is subjective and cannot be answered by an AI language model.

Impressive Features of Guanaco

In this section, the speaker encourages listeners to check out Guanaco's features and capabilities.

Features of Guanaco

  • Listeners are encouraged to check out Guanaco's features by spinning up their own RunPod instance.
  • The 65 billion parameter model works super fast.
  • Smaller models can also be run on personal machines for fine-tuning purposes.
  • Listeners are invited to join Discord if they have any questions or problems.

Video description

In this video, we review Guanaco, the new 65B parameter model that achieves 99% of the performance of ChatGPT. It is truly incredible. Since it is a large model, we use a cloud GPU to power it. This model can code, has logic and reasoning, can do creative writing, and so much more. Guanaco was trained in under 24 hours on a single GPU, using a new technology called QLoRA, which is mind-blowing. How does it do on the LLM rubric? Let's find out! Enjoy :)

Join My Newsletter for Regular AI Updates 👇🏼
https://forwardfuture.ai/

My Links 🔗
👉🏻 Subscribe: https://www.youtube.com/@matthew_berman
👉🏻 Twitter: https://twitter.com/matthewberman
👉🏻 Discord: https://discord.gg/xxysSXBxFW
👉🏻 Patreon: https://patreon.com/MatthewBerman

Media/Sponsorship Inquiries 📈
https://bit.ly/44TC45V

Links:
Runpod - https://runpod.io?ref=54s0k2f8
Runpod Tutorial - https://www.youtube.com/watch?v=_59AsSyMERQ
Runpod The Bloke Template - https://runpod.io/gsc?template=qk29nkmbfr&ref=54s0k2f8
HuggingFace - https://www.huggingface.com
Guanaco Model - https://huggingface.co/TheBloke/guanaco-65B-GPTQ
TextGen WebUI - https://github.com/oobabooga/text-generation-webui