Day-26 | Multi Stage Docker Builds | Reduce Image Size by 800 % | Distroless Container Images | #k8s

Uploaded: 2023-02-10T13:30:25.000Z
Duration: 1 h 2 min 31 s

Introduction and Overview

In this video, Abhishek introduces the topic of multi-stage Docker builds and discusses the importance of understanding production issues with Docker. He also mentions that he will cover the concept of "distroless images" and provide practical examples.

Understanding Production Issues with Docker

Abhishek explains that it is important to be prepared to answer interview questions about practical problems faced with Docker or containers.

He mentions that many people have requested interview-based scenario sessions on this topic.

The goal is to learn about multi-stage Docker builds and distroless images both practically and theoretically.

Concept of Multi-stage Docker Builds

Abhishek introduces the concept of multi-stage Docker builds and explains its relation to creating efficient and optimized Docker images.

Usual Process for Writing a Dockerfile

Abhishek uses an example of writing a calculator application as a Python application running in a Docker container.

He outlines the usual steps involved in writing a Dockerfile for this scenario, including using a base image (e.g., Ubuntu), setting the work directory, installing dependencies, building binaries, and executing the application.

However, he points out that there is a problem with this approach as it results in an image with unnecessary overhead.
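
The usual steps above can be sketched as a single-stage Dockerfile. This is a minimal illustration, not the Dockerfile from the video: the file names calculator.py and requirements.txt and the ubuntu:22.04 tag are assumptions.

```dockerfile
# Single-stage build: everything installed here stays in the final image.
FROM ubuntu:22.04

# Set the working directory for the following instructions.
WORKDIR /app

# Install the Python runtime and pip on top of the full Ubuntu base.
RUN apt-get update \
    && apt-get install -y python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Install application dependencies (requirements.txt is a hypothetical file).
COPY requirements.txt .
RUN pip3 install -r requirements.txt

# Copy the application source (calculator.py is a hypothetical file name).
COPY calculator.py .

# Run the application when the container starts.
CMD ["python3", "calculator.py"]
```

The resulting image carries the whole Ubuntu userland and pip in addition to the Python runtime, which is the overhead the speaker refers to.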

Problem with Traditional Approach

The issue lies in using a base image like Ubuntu that comes with additional system dependencies and packages not required for running the application.

Abhishek emphasizes that only the Python runtime is needed to execute the calculator application, not all the extra components included in the base image.

This leads to inefficient use of resources and larger image sizes than necessary.

Introduction to Multi-stage Builds

To address this problem, Docker introduced multi-stage builds as a solution.

With multi-stage builds, you can split your Dockerfile into multiple stages or parts.

Each stage can have its own base image and set of instructions, allowing you to separate the build environment from the runtime environment.

Benefits of Multi-stage Docker Builds

Abhishek explains the benefits of using multi-stage Docker builds, including improved efficiency and reduced image size.

Separating Build and Runtime Environments

Multi-stage builds allow you to separate the build environment, where dependencies are installed and binaries are built, from the runtime environment, where only the necessary components for running the application are included.

This separation results in smaller and more optimized images.

Optimized Image Size

By discarding unnecessary components used during the build stage, such as development libraries or intermediate artifacts, you can significantly reduce the final image size.

The resulting image contains only what is required to run the application efficiently.

Example with Java Application

Abhishek provides an example with a Java application, highlighting that during the build stage, many Java libraries may be needed for compilation.

However, at runtime, only the Java Runtime Environment (JRE) and necessary binary files are required.

Using multi-stage builds allows you to create a minimal runtime image without including all the development dependencies.

Conclusion

Abhishek concludes by summarizing the concept of multi-stage Docker builds and their benefits in creating efficient Docker images.

Recap of Multi-stage Builds

Multi-stage builds involve splitting your Dockerfile into multiple stages or parts.

Each stage has its own base image and set of instructions for building different aspects of your application.

Benefits of Multi-stage Builds

Multi-stage builds help optimize Docker images by separating build and runtime environments.

They result in smaller image sizes by discarding unnecessary components used during the build stage.

Abhishek encourages viewers to watch the entire video for practical examples and provides a link to the GitHub repository where the content can be found.

Docker Multi-Stage Builds

In this section, the speaker explains the concept of multi-stage builds in Docker and how it can be used to optimize image size and reduce complexity.

Using Multi-Stage Builds in Docker

Multi-stage builds involve dividing the build process into multiple stages within a single Dockerfile.

The first stage typically starts from a base image that includes all the dependencies needed to build the application.

The second stage is where the final image is created, containing only the required runtime environment and the built artifact.

By using different base images for each stage, you can choose a minimal image for the final runtime environment while still having access to all necessary dependencies during the build process.

Advantages of Multi-Stage Builds

One advantage is reducing image size significantly. Only the final stage will be included in the resulting image, eliminating unnecessary files from previous stages.

Another advantage is simplifying complex applications. Instead of installing all dependencies in a single base image, you can use separate stages for different components (e.g., frontend, backend, database) and install specific dependencies as needed.

This approach helps avoid bloating your Docker image with unnecessary packages and reduces overall complexity.

Example: Three-Tier Architecture Application

For a more complex application with frontend, backend, and database components, multi-stage builds can be particularly beneficial.

Each component can have its own stage within the Dockerfile, allowing for selective installation of dependencies based on specific requirements.

Because only the final stage ends up in the resulting image, each earlier stage can use a rich base image that already includes all the dependencies it needs. This simplifies the build process without adding to the final image size.
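
A rough sketch of one stage per component follows. It is an assumption-heavy illustration rather than the speaker's example: it assumes a Node frontend in ./frontend (with a package-lock.json and a "build" script that writes to dist), a Go module in ./backend, and the image tags shown. The database tier is left out of the sketch.

```dockerfile
# Stage 1: build the frontend with a full Node image.
# Assumes ./frontend has a package-lock.json and a "build" script writing to dist/.
FROM node:20 AS frontend
WORKDIR /frontend
COPY frontend/ .
RUN npm ci && npm run build

# Stage 2: build the backend with a full Go image.
# Assumes ./backend is a Go module with a main package at its root.
FROM golang:1.22 AS backend
WORKDIR /backend
COPY backend/ .
RUN CGO_ENABLED=0 go build -o /out/server .

# Final stage: a minimal base that receives only the built artifacts.
FROM gcr.io/distroless/static-debian12
COPY --from=frontend /frontend/dist /app/static
COPY --from=backend /out/server /app/server
ENTRYPOINT ["/app/server"]
```

Node, npm, and the Go toolchain exist only in the first two stages, so neither contributes to the size of the final image.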

Benefits of Multi-Stage Builds

In this section, the speaker further emphasizes the benefits of multi-stage builds by comparing them to traditional approaches.

Traditional Approach vs. Multi-Stage Builds

Traditional Approach

In the traditional approach, all dependencies for different components are installed in a single base image.

This can lead to a large and complex Docker image, as each component requires its own set of dependencies.

For example, installing Java, React, MySQL, and other dependencies in a single base image can result in a bloated image size.

Multi-Stage Builds

With multi-stage builds, you can use separate stages for each component and install only the necessary dependencies at each stage.

This allows for a more streamlined and efficient build process, reducing complexity and resulting in smaller final images.

By choosing rich base images that already include common dependencies, you can avoid manually installing them in every stage.

Advantages of Multi-Stage Builds

The primary advantage is reducing image size significantly. Only the final stage is included in the resulting Docker image.

Simplifies the build process by separating components into individual stages with their own specific requirements.

Avoids bloating the Docker image with unnecessary packages and reduces overall complexity.

Conclusion

Multi-stage builds in Docker provide an effective way to optimize image size and simplify complex application builds. By dividing the build process into multiple stages within a single Dockerfile, it becomes easier to manage dependencies and reduce unnecessary bloat. This approach is particularly beneficial for applications with multiple components or tiers.

Building a Multi-Stage Docker Image

In this section, the speaker explains how to build a multi-stage Docker image to reduce the final image size.

Building a Multi-Stage Docker Image

A multi-stage Docker build allows for reducing the final image size by using different stages.

Start the build stage from a full base image, such as OpenJDK, and give the stage an alias with AS.

In the final stage, copy only the specific files the application needs from the aliased stage.

As part of the ENTRYPOINT or CMD, execute the specific application file.

The advantage of multi-stage Docker builds is reducing the image size significantly.
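
The steps above might look like the following for a small Java program. This is a sketch under stated assumptions: Calculator.java is a hypothetical single-file application with a main method, and the eclipse-temurin images (an OpenJDK distribution) stand in for whichever OpenJDK base image the video used.

```dockerfile
# Build stage: a full JDK image, aliased as "build" so later stages can refer to it.
FROM eclipse-temurin:17-jdk AS build
WORKDIR /src
# Calculator.java is a hypothetical source file.
COPY Calculator.java .
RUN javac Calculator.java

# Final stage: a JRE-only image, without the compiler and other JDK tools.
FROM eclipse-temurin:17-jre
WORKDIR /app
# Copy only the compiled classes from the build stage.
COPY --from=build /src/*.class ./
# Run the compiled main class.
CMD ["java", "Calculator"]
```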

Introduction to Minimalistic Images

This section introduces minimalistic or "distroless" images and their advantages in terms of reduced size and increased security.

Minimalistic Images

Minimalistic images are very lightweight Docker images that contain only the runtime environment an application needs.

Choosing minimalistic images, such as the Python distroless image, can greatly reduce container image size.

Distroless images may not even include a language runtime when none is required, which reduces image size further (e.g., for Go applications, which compile to a standalone binary).

The main purpose of minimalistic images is to have a minimal runtime environment for executing specific applications.

Advantages of Distroless Images

This section highlights the advantages of using distroless images, including reduced container size and improved security.

Advantages of Distroless Images

Distroless images significantly reduce container image sizes compared to traditional base images like Ubuntu or the full Python image.

By shipping only the necessary runtime environment, distroless images improve security by reducing the number of potential vulnerabilities exposed to attackers.

Advantages of Distroless Images and Security

In this section, the speaker discusses the security advantages of distroless images. They explain that with distroless images, applications are exposed to far fewer operating system vulnerabilities. The speaker also highlights that languages like Go benefit even more because the compiled binary does not require a separate runtime.

Benefits of Distroless Images for Security

Distroless images provide a high level of security and protect applications from most operating system-related vulnerabilities.

Applications written in languages like Go have an additional advantage: they do not require a separate runtime in the image, which makes them even more secure.

Example Demonstration: Multi-stage Docker Build with Golang

In this section, the speaker introduces an example demonstration related to multi-stage Docker builds using Golang. They provide information about their GitHub repository called "Docker 0 To Hero" where all the Docker classes are available. The speaker explains that they have prepared content on networking and volumes for future classes.

Example Repository and Folder Structure

The example demonstration is available in the "Docker 0 To Hero" repository on GitHub.

The examples folder contains the demonstration for multi-stage Docker build using Golang.

Cloning the Repository and Building the Dockerfile

In this section, the speaker demonstrates how to clone their GitHub repository and build the provided Dockerfile for the example demonstration.

Steps to Clone Repository and Build the Dockerfile

Clone the "Docker 0 To Hero" repository from their GitHub organization.

Navigate to the examples folder and locate the "golang multi-stage Docker build" folder.

Build the Dockerfile to see the difference between using a multi-stage Docker build and not using one.

Calculator Application Demonstration

In this section, the speaker demonstrates a calculator application that will be containerized and used for comparing the results of multi-stage Docker builds.

Calculator Application Functionality

The calculator application allows users to input calculations and provides the corresponding output.

The demonstration showcases basic calculations like multiplication and division.

Dockerfile Comparison: With and Without Multi-stage Build

In this section, the speaker compares two versions of a Dockerfile: one with a multi-stage build and one without.

Steps in Dockerfile Comparison

The first version of the Dockerfile does not use a multi-stage build. It installs the Go toolchain, copies the source code, builds the binary, and runs it as the entry point, all in one image.

The second version of the Dockerfile uses a multi-stage build. The first stage copies the source code and builds the Go binary; the final stage contains and runs only that binary, with no language runtime required. This results in a significantly smaller image than the non-multi-stage version.
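
The first, single-stage version described above might look roughly like this. It is a sketch, not the file from the repository: the source file name calculator.go, the ubuntu:22.04 tag, and the paths are assumptions.

```dockerfile
# Single-stage build: the Go toolchain, the source, and the binary
# all remain in the final image.
FROM ubuntu:22.04

WORKDIR /app

# Install the Go toolchain from the Ubuntu package repository.
RUN apt-get update \
    && apt-get install -y golang-go \
    && rm -rf /var/lib/apt/lists/*

# Copy the source code (calculator.go is a hypothetical file name).
COPY calculator.go .

# Build the binary inside the same image.
RUN go build -o calculator calculator.go

# Run the binary when the container starts.
ENTRYPOINT ["/app/calculator"]
```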

Building the Calculator Application with Docker

In this section, the speaker demonstrates how to build a calculator application using Docker and discusses the size of the resulting Docker image.

Building the Initial Docker Image

Use the docker build -t <tag> . command to build the Docker image for the calculator application (the trailing dot is the build context, here the current directory).

The initial image size is 861 MB, which may be considered too large for a basic calculator application.

Reducing Image Size with Multi-stage Builds

The concept of multi-stage builds can significantly reduce the size of Docker images.

The Dockerfile is split into two stages: a build stage and a final stage.

In stage one, use an Ubuntu base image to install dependencies and compile source code.

In stage two, use "scratch" as the base image. Scratch is Docker's empty base image, so it is even more minimal than a distroless image.

Copy only the necessary binary from stage one to stage two using COPY --from=<stage> syntax.

Execute the binary as part of the entry point in stage two.
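
The two stages above can be sketched as follows. The stage alias build, the file name calculator.go, and the ubuntu:22.04 tag are assumptions for illustration. CGO_ENABLED=0 is added here so the binary is statically linked, since scratch contains no shared libraries.

```dockerfile
# Stage one: full Ubuntu image with the Go toolchain, aliased as "build".
FROM ubuntu:22.04 AS build

WORKDIR /app

RUN apt-get update \
    && apt-get install -y golang-go \
    && rm -rf /var/lib/apt/lists/*

# calculator.go is a hypothetical file name.
COPY calculator.go .

# Disable cgo so the resulting binary does not depend on libc at runtime.
RUN CGO_ENABLED=0 go build -o calculator calculator.go

# Stage two: start from the empty scratch image.
FROM scratch

# Copy only the compiled binary out of stage one.
COPY --from=build /app/calculator /calculator

# scratch has no shell, so ENTRYPOINT must use the exec (JSON array) form.
ENTRYPOINT ["/calculator"]
```

Nothing from stage one other than the copied binary reaches the final image, which is why its size drops so sharply.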

Resulting Image Size

After implementing multi-stage builds, the size of the final Docker image is reduced to 1.83 MB.

The speaker describes this as roughly 800 times smaller than the initial image. Going by the sizes quoted here, 861 MB / 1.83 MB works out to about 470 times smaller.

Using Distroless Images for Python or Java Applications

The speaker explains that while the scratch image works well for Go applications, it may not be suitable for Python or Java applications because they need a runtime. Alternative options are discussed.

Installing Runtimes on Scratch Images

If using scratch for Python or Java applications, additional steps are required to install the respective runtime on top of scratch.

For Python applications, choose instead a Python-based distroless image, as listed in the distroless GitHub repository.

Similarly, for Java applications, choose a Java-based distroless image.
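
As a hedged sketch of the Python case, the following uses the gcr.io/distroless/python3-debian12 image from the distroless project. The file name calculator.py is hypothetical, and the example assumes the script needs only the standard library.

```dockerfile
# Distroless Python image: ships the Python runtime but no shell or package manager.
FROM gcr.io/distroless/python3-debian12

WORKDIR /app

# calculator.py is a hypothetical, standard-library-only script.
COPY calculator.py .

# The image's entrypoint is already the Python interpreter,
# so CMD only names the script to run.
CMD ["calculator.py"]
```

If the application had third-party dependencies, a build stage in front of this one could install them and the final stage could copy them in with COPY --from.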

Advantages of Multi-stage Builds and Distroless Images

The speaker emphasizes the advantages of using multi-stage builds and distroless images in containerization.

Size Reduction and Security

Multi-stage builds significantly reduce the size of Docker images.

Distroless images eliminate unnecessary components, resulting in smaller image sizes.

Smaller image sizes improve efficiency and reduce resource consumption.

Using distroless images also enhances security by reducing vulnerabilities.

Future of Containers

Understanding multi-stage builds and distroless images is crucial for the future of containerization.

Moving towards these concepts ensures secure and efficient container deployment.

Interviewers often look for knowledge in these areas when discussing containerization.

Finding Distroless Images

The speaker provides guidance on finding distroless images for different programming languages.

Searching GitHub Repositories

Search GitHub for "distroless" images.

A GitHub repository dedicated to distroless images (GoogleContainerTools/distroless) can be found, containing folders with language-specific options.

Using Distroless Images with OpenJDK 11 and OpenJDK 17

In this section, the speaker discusses the use of distroless images with OpenJDK 11 and OpenJDK 17. They explain how to replace existing base images with distroless ones and compare the results across programming languages.

Using Distroless Images with OpenJDK 11 and OpenJDK 17

To use distroless images, replace the existing base image in the final stage with a distroless one.

If you want to use OpenJDK 11, use the distroless image built for it.

Similarly, if you want to use OpenJDK 17, there is a separate distroless image available for it.

The speaker mentions that they have personally tried using both versions in a Java application but ultimately chose Golang.
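
A sketch of that replacement for Java 17 follows. Calculator.java is a hypothetical single-file application, and eclipse-temurin:17-jdk is an assumed choice of build image. The final stage uses gcr.io/distroless/java17-debian12; for Java 11 the corresponding distroless image would be used instead.

```dockerfile
# Build stage: full JDK, aliased as "build".
FROM eclipse-temurin:17-jdk AS build
WORKDIR /src
# Calculator.java is a hypothetical source file with a main method.
COPY Calculator.java .
# Compile, then package the classes into a runnable jar.
RUN javac Calculator.java \
    && jar --create --file app.jar --main-class Calculator *.class

# Final stage: distroless Java 17 runtime, with no shell or package manager.
FROM gcr.io/distroless/java17-debian12
COPY --from=build /src/app.jar /app/app.jar
# The distroless Java images use "java -jar" as the entrypoint,
# so CMD only names the jar file.
CMD ["/app/app.jar"]
```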

Advantages of Distroless Images

For a Java application, the image size decreases to around 200 MB when using distroless images.

However, for a Go application with minimal dependencies, the image size drops to only 1.83 MB.

This highlights one of the biggest advantages of using distroless images: reduced image size.

Conclusion

The speaker concludes by stating that they hope the video was informative and that viewers have learned from it. They encourage viewers to like the video if they found it helpful and share any feedback in the comment section.
