Build an LLM from Scratch 1: Set up your code environment

Introduction to the Coding Along Video Series

Overview of the Video Series

  • Sebastian Raschka introduces himself as the author of "Build a Large Language Model (from Scratch)" and explains that he will create supplementary videos for each chapter.
  • The videos aim to provide a casual yet interesting perspective on code examples, complementing the book's content.

Focus of Chapter One

  • Chapter one lacks code examples; thus, this video focuses on setting up a Python environment for running code from chapters two to seven.
  • Raschka stresses that the setup shown reflects his personal preferences; he aims to keep it simple while following current recommendations.

Setting Up Your Python Environment

GitHub Repository Resources

  • A GitHub repository (rasbt/LLMs-from-scratch) is available with setup recommendations and resources for readers.

Hardware Compatibility

  • The author mentions using an older MacBook Air to show that the code is designed to run on a range of hardware, from older laptops to machines with GPUs.

Alternatives to Local Setup

Cloud Computing Options

  • If local setup fails, Raschka recommends cloud resources like Lightning Studio and Google Colab for executing Jupyter Notebooks easily in a browser.

Personal Preferences in Python Setup

Optional Setup Preferences

  • The video demonstrates how Raschka would set up Python on a fresh macOS environment, highlighting optional preferences that users can choose to follow or not.

Package Management Tools

  • He discusses his previous use of Conda as a package manager but now prefers uv because of its speed and design advantages over traditional tools like pip.

Installing Python

Installation Process Overview

  • Raschka outlines steps for installing Python, starting by checking if it's already installed via terminal commands.

How to Install Python and Set Up a Virtual Environment

Installing Python

  • The recommended method for installing Python is by visiting the official website, python.org, where users can download versions suitable for different operating systems.
  • It is advisable to install a slightly older version of Python (e.g., 3.12 or 3.11), since some packages such as PyTorch may not support the newest release right away.
  • A slightly older version keeps scientific computing packages compatible without giving up essential language features.
  • Raschka's personal preference is Homebrew, a package manager for macOS that simplifies installing various tools, including Python.
  • Users can also install specific versions of Python using Homebrew commands (e.g., brew install python@3.11).
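
The version advice above can be turned into a quick check. The following is a minimal sketch, assuming the two versions named in these notes (3.11 and 3.12) as the recommended set; the `RECOMMENDED` tuple and the `is_recommended` helper are illustrative names, not part of the book's code.

```python
import sys

# Versions named in the notes above. Treat this tuple as an assumption and
# adjust it to whatever your packages currently support.
RECOMMENDED = ((3, 11), (3, 12))


def is_recommended(version=None, recommended=RECOMMENDED):
    """Return True if the (major, minor) pair is in the recommended set."""
    if version is None:
        version = sys.version_info
    return (version[0], version[1]) in recommended


current = f"{sys.version_info.major}.{sys.version_info.minor}"
status = "recommended" if is_recommended() else "not in the recommended list"
print(f"Running Python {current} ({status})")
```

Running it with the interpreter you plan to use shows at a glance whether that interpreter falls inside the suggested range.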

Setting Up Package Managers

  • After installing Python, the next step is setting up uv, a faster alternative to pip; uv itself can be installed with pip install uv.
  • uv is also more convenient than pip for managing packages and is recommended for its better performance.

Creating a Virtual Environment

  • A virtual environment is crucial as it creates an isolated space on your computer where all packages are stored separately from system-wide installations.
  • This isolation allows users to experiment without affecting their main system setup; if issues arise, they can simply delete and recreate the environment.
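
To see the isolation in practice, one can ask the running interpreter whether it belongs to a virtual environment. This is a small sketch using only the standard library; the helper name `in_virtual_environment` is made up for illustration.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a virtual environment, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the Python installation it was
    # created from. Outside an environment the two are identical.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtual_environment())
```

If the second line prints False after you believed the environment was active, the shell is most likely still using the system-wide Python.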

Downloading Project Files

  • Users should download the project files from the GitHub repository and place them somewhere convenient, such as the desktop, for easy access during setup.

Activating the Virtual Environment

  • To create a virtual environment with uv, users run uv venv and specify the desired version of Python (e.g., uv venv --python=python3.11).
  • The resulting folder (.venv by default) contains its own version of Python and can be deleted or recreated if needed without impacting other projects.
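
The point that an environment is just a folder can be illustrated without uv. The sketch below deliberately swaps in Python's standard-library venv module (not uv, which the video uses) and works in a temporary directory, so it only demonstrates the create-and-delete cycle; the paths are throwaway values.

```python
import os
import shutil
import tempfile
import venv
from pathlib import Path

# Work inside a throwaway directory so nothing on the real system is touched.
workdir = Path(tempfile.mkdtemp())
env_dir = workdir / ".venv"

# Create the environment without pip to keep the demo fast and offline.
# Symlinks are used on macOS/Linux, matching the default of `python -m venv`.
venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))

# pyvenv.cfg marks the folder as a virtual environment.
created = (env_dir / "pyvenv.cfg").exists()
print("environment created:", created)

# Deleting the folder removes the environment; nothing else is affected.
shutil.rmtree(workdir)
removed = not env_dir.exists()
print("environment removed:", removed)
```

An environment created by uv can be discarded the same way: remove the folder and create it again.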

Managing Visibility of Folders

  • Note that folders starting with a dot are hidden by default in Finder; enabling visibility settings will allow users to see these folders when necessary.

Setting Up Python Environment for Deep Learning

Virtual Environment Setup

  • The speaker demonstrates running Python from the virtual environment on macOS and verifies that the interpreter (Python 3.11) is the one located in the environment's path.
  • Packages are installed with uv pip install; the examples shown are a package called "packaging" and PyTorch, which is named "torch" in Python.
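
After installing, it can be useful to confirm that a package actually landed in the active environment. A minimal sketch, assuming only the standard library; `installed_version` is an illustrative helper, and the tuple of names can be extended with any other package you installed.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# "torch" is the name PyTorch is installed and imported under.
for name in ("torch",):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

Because the lookup reads package metadata instead of importing the package, it stays fast even for large libraries.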

Managing Dependencies

  • The speaker mentions that all packages needed for the book are listed in a requirements.txt file, so they can be installed at once with uv pip install -r requirements.txt.
  • Installing TensorFlow on Windows can cause problems. TensorFlow is in the requirements because it is needed to load the pretrained GPT weights, even though the book primarily uses PyTorch.

Handling TensorFlow Issues

  • Users may encounter problems installing TensorFlow on Windows; the suggested fixes are to remove TensorFlow from the requirements file or to fall back to plain pip install without uv.
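
Removing TensorFlow from the requirements file can be done by hand in a text editor, or scripted. The following is a rough sketch; the `drop_requirement` helper and the sample file contents (package pins and comment) are hypothetical and only illustrate the filtering.

```python
import re


def drop_requirement(requirements_text, package):
    """Return the requirements text without lines that list `package`."""
    kept = []
    for line in requirements_text.splitlines():
        # The requirement name is everything before a version specifier,
        # extras bracket, environment marker, comment, or whitespace.
        name = re.split(r"[<>=!~;\[#\s]", line.strip(), maxsplit=1)[0]
        if name.lower() != package.lower():
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical file contents, used only to illustrate the filtering.
sample = "torch>=2.0\ntensorflow>=2.15  # GPT weights\njupyterlab\n"
print(drop_requirement(sample, "tensorflow"))
```

The filtered text can be written to a new file and passed to the install command in place of the original requirements file.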
  • An alternative solution involves using Google Colab, which provides a cloud-based environment where notebooks can be executed without local setup complications.

Using Google Colab

  • In Google Colab, users can install the required packages directly by copying the link to requirements.txt and passing it to the install command in a notebook cell.
  • Since a Google Colab session is already an isolated environment, creating an additional virtual environment is unnecessary; users can install packages directly into the system environment.
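
A notebook that should behave sensibly both locally and on Colab can check where it is running. The sketch below relies on a widely used idiom (Colab kernels preload the google.colab module) and is not an official detection API; `running_in_colab` is an illustrative name.

```python
import sys


def running_in_colab():
    """Best-effort check: Colab kernels preload the google.colab module."""
    return "google.colab" in sys.modules


if running_in_colab():
    print("Colab detected: install into the provided runtime, no virtual environment needed.")
else:
    print("Not in Colab: create and activate a local virtual environment first.")
```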

Running Jupyter Lab

  • After installing packages, users can launch Jupyter Lab to work with notebooks. It opens in the default browser and gives access to the notebooks for the different chapters or lets users create new ones.
  • The speaker recommends clearing the outputs of existing notebooks before starting, so that readers work through the code themselves instead of just copying it.
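
Outputs can be cleared from the Jupyter Lab menu; for many notebooks at once, a script may be handier. This is a minimal sketch that assumes the standard notebook JSON layout (nbformat 4); the helper names and the tiny in-memory notebook are made up for illustration.

```python
import json
from pathlib import Path


def clear_outputs(notebook):
    """Strip outputs and execution counts from a notebook dict (nbformat 4)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path):
    """Rewrite the .ipynb file at `path` with all code-cell outputs removed."""
    path = Path(path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = json.dumps(clear_outputs(notebook), indent=1, ensure_ascii=False)
    path.write_text(cleaned + "\n", encoding="utf-8")


# Tiny in-memory notebook to show the effect.
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes"], "metadata": {}},
        {"cell_type": "code", "source": ["1 + 1"], "metadata": {},
         "execution_count": 3,
         "outputs": [{"output_type": "execute_result",
                      "data": {"text/plain": ["2"]},
                      "metadata": {}, "execution_count": 3}]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
}
cleared = clear_outputs(demo)
print(cleared["cells"][1]["outputs"])  # prints []
```

Calling `clear_notebook_file` on a copy of a chapter notebook leaves the code cells in place and removes the results, which makes it easier to re-run each cell yourself.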

Video description

Links to the book:
- https://amzn.to/4fqvn0D (Amazon)
- https://mng.bz/M96o (Manning)

Link to the GitHub repository: https://github.com/rasbt/LLMs-from-scratch

This is a supplementary video explaining how to set up a Python environment using uv.

00:00 Introduction
01:33 Setup info on GitHub
03:30 Optional setup preferences and uv
05:00 uv pip vs uv add syntax
05:35 1) Installing Python
09:05 2) Setting up uv
10:12 3) Creating a virtual environment
14:03 4) Installing packages
16:34 5) pip install fallback
16:57 6) If nothing works: Google Colab :)
19:07 7) uv run to run Jupyter Lab locally

In particular, we are using `uv pip`, which is explained in this document: https://github.com/rasbt/LLMs-from-scratch/blob/main/setup/01_optional-python-setup-preferences/README.md

Alternatively, the native `uv add` syntax (mentioned but not explicitly covered in this video) is described here: https://github.com/rasbt/LLMs-from-scratch/blob/main/setup/01_optional-python-setup-preferences/native-uv.md