How To Install TextGen WebUI - Use ANY MODEL Locally!

Introduction to Text Generation Web UI

In this section, the speaker introduces the Text Generation Web UI, an open-source software for running large language models locally on a computer. The installation process and setup are discussed.

Installing Text Generation Web UI

  • To install the Text Generation Web UI, visit the GitHub repository by oobabooga.
  • One-click installers are available for Windows, Linux, macOS, and WSL, but they may not work as expected.
  • The recommended method is to use conda for installation.
  • Open the terminal and ensure that conda is already installed.
  • Create a new conda environment using the command conda create -n tg (replace tg with any name you like).
  • Activate the environment using conda activate tg.
  • Install the torch libraries required for text generation using pip3 install torch torchvision torchaudio together with the PyTorch index URL that matches your CUDA version.
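
The steps above can be sketched as the following commands. The environment name tg is arbitrary, and the cu118 index URL is an assumption for CUDA 11.8 — pick the URL that matches your CUDA version from pytorch.org:

```shell
# Create and activate a fresh conda environment (the name "tg" is arbitrary)
conda create -n tg python=3.10
conda activate tg

# Install the torch libraries; the cu118 index URL assumes CUDA 11.8 --
# substitute the index URL for your CUDA version from pytorch.org
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```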

Cloning Repository and Installing Python Modules

  • Clone the repository by clicking on the "Code" button and copying the URL.
  • In the terminal, use git clone <URL> to download all files from the repository.
  • Change directory into the cloned folder using cd text-generation-web-ui.
  • Install all required python modules using pip install -r requirements.txt.
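
Put together, the clone-and-install steps look like this (the repository URL is the oobabooga repo linked in the video description):

```shell
# Download the repository and install its Python dependencies
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui
pip install -r requirements.txt
```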

Setting Up GPU Support (Optional)

  • If you want to use GPU support for running models, follow these steps:
  • Uninstall the llama-cpp-python module using pip uninstall -y llama-cpp-python.
  • Set environment variables so the module is rebuilt from source:
  • set CMAKE_ARGS=<value>
  • set FORCE_CMAKE=1
  • Reinstall the llama-cpp-python module with the --no-cache-dir flag.
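
As a sketch, the GPU rebuild looks like the following on Windows cmd. The CMAKE_ARGS value shown assumes an NVIDIA GPU with CUDA (the cuBLAS backend); other backends need a different value:

```shell
# Remove the CPU-only build of llama-cpp-python
pip uninstall -y llama-cpp-python

# Tell the installer to recompile with GPU (cuBLAS) support --
# -DLLAMA_CUBLAS=on assumes an NVIDIA GPU with CUDA installed
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
set FORCE_CMAKE=1

# Reinstall, skipping the cache so the rebuild actually happens
pip install llama-cpp-python --no-cache-dir
```

On Linux or macOS, set the variables inline instead: `CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir`.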

Verifying CUDA Availability

  • Run a small checker script (checker.py) to ensure CUDA is available to torch.
  • Execute the command python checker.py and confirm that it reports CUDA as available.
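
A minimal sketch of what such a checker.py might contain (the exact script from the video is not shown here; this version also handles the case where torch is missing):

```python
# checker.py -- a minimal sketch of a CUDA availability check
import importlib.util


def cuda_status() -> str:
    """Report whether torch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch
    if torch.cuda.is_available():
        return "CUDA is available: " + torch.cuda.get_device_name(0)
    return "CUDA is not available - torch will fall back to CPU"


if __name__ == "__main__":
    print(cuda_status())
```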

Starting the Server

  • Start the server using the command python server.py.
  • If any errors occur, follow specific troubleshooting steps mentioned in the transcript.
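
Launching the server is a single command run from the text-generation-web-ui folder; by default the Gradio interface is served locally (typically at http://127.0.0.1:7860, the Gradio default — an assumption, check the URL printed in your terminal):

```shell
# Start the Text Generation Web UI server, then open the printed URL
python server.py
```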

How to Download and Run Models

In this section, the speaker explains three different methods for downloading and running models using Hugging Face.

Downloading Models

  • The first method is to go to huggingface.co/thebloke and find the desired model. Look for the GPTQ version, as it is usually easier to run.
  • Click on the model and copy the URL.
  • In the Text Generation Web UI's Model tab, paste the URL into the "Download custom model" field and click Download.
  • The second method uses a Python script from the command line. Open a terminal in the Text Generation Web UI folder and run python download-model.py followed by the author/model name.
  • The third method is manual. From huggingface.co/thebloke, navigate to Files and Versions. Download all required files, including the main file which is usually large. Place these files in the models folder of Text Generation Web UI.
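
The script-based method (method two) looks like this; the model name below is only an example — substitute the author/model name of the model you actually want:

```shell
# Download a model by its Hugging Face "author/model" name
# (the name below is an example; replace it with your chosen model)
python download-model.py TheBloke/Llama-2-7B-GPTQ
```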

Exploring Settings and Prompt Templates

This section covers additional settings in Text Generation Web UI and how to use prompt templates.

Settings

  • Generally, there's no need to modify settings, but they offer additional features.
  • Check The Bloke's Hugging Face page for specific instructions on setting up certain models.
  • The --cpu flag forces running on CPU only; the --auto-devices flag allows splitting a model across both GPU and CPU.
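
For example, the two flags described above are passed when launching the server (flag names as documented in the text-generation-webui repository):

```shell
# Force CPU-only inference
python server.py --cpu

# Let the loader split the model across GPU and CPU automatically
python server.py --auto-devices
```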

Prompt Templates

  • Prompt templates are necessary for running models effectively.
  • In Text Gen Web UI's text generation tab, a list of prompt templates can be found.
  • If a specific template isn't available, users can create their own or try similar templates that match what The Bloke recommends on the model card.

Using Chat-based Models and Fine-tuning

This section explains how to use chat-based models and fine-tune the model's responses.

Chat-based Models

  • To run a chat-based model like Samantha, select "chat" in the interface mode settings.
  • Apply and restart the interface; the resulting chat view remembers previous prompts and responses.
  • Additional settings in the chat settings tab allow for uploading chat history and fine-tuning how the model responds.
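
Chat mode can also be enabled directly from the command line with the --chat flag, as it worked at the time of the video (newer versions of the web UI enable the chat interface by default):

```shell
# Launch the web UI straight into chat mode
python server.py --chat
```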

Fine-tuning

  • The training tab is used for fine-tuning models, although it is not covered in this video tutorial.

Conclusion

The speaker concludes by summarizing what was covered in the video and encourages viewers to seek help if needed.

  • Viewers now know how to download Text Generation Web UI, download models, and run them on their local machines.
  • If there are any issues with setup, viewers can reach out for assistance on The Bloke's Discord server.
  • Viewers are encouraged to like, subscribe, and watch for future tutorials.

Video description

In this video, I show you how to install TextGen WebUI on a Windows machine and get models installed and running. TextGen WebUI is like Automatic1111 for LLMs. Easily run any open source model locally on your computer. Enjoy :)

Join My Newsletter for Regular AI Updates 👇🏼
https://forwardfuture.ai/

My Links 🔗
👉🏻 Subscribe: https://www.youtube.com/@matthew_berman
👉🏻 Twitter: https://twitter.com/matthewberman
👉🏻 Discord: https://discord.gg/xxysSXBxFW
👉🏻 Patreon: https://patreon.com/MatthewBerman

Media/Sponsorship Inquiries 📈
https://bit.ly/44TC45V

Links:
Github Repo - https://github.com/oobabooga/text-generation-webui
Install Commands - https://gist.github.com/mberman84/f092a28e4151dd5cecebfc58ac1cbc0e
The Bloke HF - https://huggingface.co/TheBloke