Style LoRA Training Guide for Stable Diffusion 1.5 and SDXL: Concepts, Results, and Conclusion
Introduction to Training a Style for Stable Diffusion 1.5 and SDXL with LoRA
This video focuses on training a style LoRA for Stable Diffusion 1.5 and SDXL. It explains the concepts of style training, how it differs from object or character training, the data and training parameters involved, how to tell whether a style has trained well, and what to expect from the result.
Training Parameters and Data Preparation
- To create your own style, start by defining a folder for each type of object you want to train.
- Collect images from Google or various websites related to the specific object type.
- Choose larger image sizes if possible for better results.
- Organize the images in folders based on their category (e.g., animals, buildings, clothing).
- Prepare the image folder by placing the images in subfolders named "Style" and "Regularization."
- Use regularization images that match the desired style, set to one repeat.
- Caption the images with a utility such as WD14 captioning, using a prefix that indicates the style name.
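The folder layout described above can be sketched in a short script. This is a minimal, illustrative helper following the Kohya-style "<repeats>_<name>" folder convention; the names `train_data` and `mystyle` are hypothetical examples, not taken from the video.

```python
from pathlib import Path

# Illustrative helper for a Kohya-style training folder layout.
# The "<repeats>_<name>" prefix tells the trainer how many times each
# image is repeated per epoch; all names here are hypothetical examples.
def make_layout(root, style_name, repeats=2, reg_repeats=1):
    root = Path(root)
    img_dir = root / "images" / f"{repeats}_{style_name}"
    reg_dir = root / "regularization" / f"{reg_repeats}_{style_name}"
    for d in (img_dir, reg_dir):
        d.mkdir(parents=True, exist_ok=True)
    return img_dir, reg_dir

img_dir, reg_dir = make_layout("train_data", "mystyle")
print(img_dir, reg_dir)
```

The training images then go into `img_dir` and the regularization images into `reg_dir`, matching the two-repeats / one-repeat split mentioned above.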
Removing Undesired Tags
- After generating the caption tags, use a tool such as BooruDatasetTagManager to remove tags that conflict with your art style.
- Review each caption individually if possible; at a minimum, remove common undesired tags such as greyscale, monochrome, and line art.
- Save your changes periodically, since the software can be buggy.
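If you prefer to script the cleanup instead of using a GUI tool, the same tag removal can be done in a few lines. A minimal sketch, assuming comma-separated WD14-style caption files; the tag list is only an example:

```python
from pathlib import Path

# Example set of tags to strip; adjust for your own art style.
UNDESIRED = {"greyscale", "grayscale", "monochrome", "lineart", "line art"}

def clean_caption(text: str) -> str:
    # Split on commas, drop undesired tags, and rejoin.
    tags = [t.strip() for t in text.split(",")]
    return ", ".join(t for t in tags if t and t.lower() not in UNDESIRED)

def clean_folder(folder: str) -> None:
    # Rewrite every .txt caption file in place.
    for f in Path(folder).glob("*.txt"):
        f.write_text(clean_caption(f.read_text()))

print(clean_caption("mystyle, monochrome, tree, lineart, sky"))
# mystyle, tree, sky
```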
Conclusion
Training a style for Stable Diffusion 1.5 and SDXL requires careful data preparation and parameter selection. By organizing images into dedicated folders and removing undesired tags from the captions, you can train your own unique art style.
This section covers selecting the input folders and giving the output folder a suitable name.
Selecting Folders and Naming Output Folder
- Choose the "images" folder itself rather than its subfolders.
- Do not select the class (regularization) folder unless regularization is needed.
- Give the output folder a suitable name; it does not have to match the instance prompt.
In this section, we cover naming the LoRA and choosing suitable network rank and alpha values.
Naming the LoRA and Choosing Network Rank and Alpha Values
- The name given to the LoRA does not need to match the instance prompt.
- Consider including a version number or other relevant information in the name.
- Choose a suitable network rank, such as 32.
- Lowering the alpha value can result in smoother training.
- Low values (e.g., 3 or 4) can be tried for the convolutional rank.
This section discusses determining the number of epochs based on the target step count and selecting the standard LoRA type for training.
Determining Number of Epochs and Selecting Model
- The number of epochs depends on the target number of steps.
- For example, if there are 2 repeats with 400 images (800 steps per epoch), targeting 8,000 steps would require 10 epochs.
- The choice of epochs also depends on testing requirements.
- Use the standard LoRA type, which is the one most commonly used.
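The epoch arithmetic above can be written out explicitly. A minimal sketch, assuming a batch size of 1:

```python
# Epoch count needed to reach a target number of training steps.
def epochs_for_target(num_images: int, repeats: int, target_steps: int) -> int:
    steps_per_epoch = num_images * repeats
    # Ceiling division: round up so we reach at least the target step count.
    return -(-target_steps // steps_per_epoch)

# 400 images at 2 repeats -> 800 steps per epoch;
# an 8,000-step target therefore needs 10 epochs.
print(epochs_for_target(400, 2, 8000))  # 10
```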
Here we cover using bf16 mixed precision and caching latents to disk when creating multiple models.
Mixed Precision and Caching Latents
- Use bf16 for mixed precision when training on NVIDIA RTX cards.
- Cache latents to disk to speed up training when creating multiple models.
- Caching creates a small file for each image, which helps when testing new settings.
This section covers the learning rate, schedule, and advanced settings for training.
Learning Rate, Schedule, and Advanced Settings
- Keep the learning rate at its default value.
- Use the default learning rate scheduler.
- Consider other optimizers, such as AdamW 8-bit, for lower VRAM usage.
- Use 768 as the training resolution if your images all share one resolution.
- Enable bucketing if working with mixed resolutions.
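As a rough sketch, the SD 1.5 settings discussed so far might look like the following parameter set. The key names follow Kohya sd-scripts conventions; any value not stated in this summary (e.g., the alpha of 16) is an assumption for illustration only.

```python
# Sketch of the SD 1.5 settings from this section, as a Kohya-style
# argument dict. Values not stated in the text are assumptions.
sd15_args = {
    "optimizer_type": "AdamW8bit",   # lower-VRAM optimizer option
    "resolution": 768,               # when all images share one resolution
    "enable_bucket": False,          # enable only for mixed resolutions
    "network_dim": 32,               # network rank suggested earlier
    "network_alpha": 16,             # assumption: lower than the rank
    "mixed_precision": "bf16",       # suits NVIDIA RTX cards
    "cache_latents_to_disk": True,   # speeds up repeated training runs
}
print(sd15_args["resolution"])  # 768
```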
Here we discuss advanced settings related to samples and noise offset.
Samples and Noise Offset
- Generate one sample per epoch to check whether the style is coming through.
- The number of samples depends on your testing requirements.
- Set the noise offset according to the recommended values.
This section emphasizes that training lower dimension networks is faster than higher dimension networks.
Training Lower Dimension Networks
- Training lower-dimension networks is faster than training higher-dimension ones.
- Example: about 1.8 iterations per second for this dataset on an 8GB RTX 3070 GPU.
- Check results during training to assess progress and whether the style is acceptable.
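The iteration rate quoted above translates into a rough wall-clock estimate:

```python
# Rough training-time estimate from steps and iteration rate.
def training_minutes(total_steps: int, it_per_sec: float) -> float:
    return total_steps / it_per_sec / 60

# 8,000 steps at 1.8 it/s on an 8GB RTX 3070 -> roughly 74 minutes.
print(round(training_minutes(8000, 1.8)))  # 74
```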
In this section, we observe the results of training at different epochs and with different prompts.
Observing Results at Different Epochs
- Test the style by running the Kohya process and generating with various prompts (e.g., young woman, landscape) at each epoch.
- Note that training lower-dimension networks is faster than training higher-dimension ones.
- Assess the quality of the drawing style at different epoch checkpoints.
This section discusses the results obtained from training with regularization.
Results of Training with Regularization
- Trained for 5 epochs and 8,000 steps using a regularization factor of 2.
- The first epoch's results are acceptable but not good.
- The second epoch shows a very good drawing style.
- Changes in style are observed between the third, fourth, and fifth epochs.
- The generated images do not exist in the dataset, indicating the model learned the style rather than memorizing specific images.
Model Selection and Training Options
In this section, the speaker discusses the model selection and training options for the SDXL-based model. They mention that it is possible to train with or without classification images and recommend testing both options with regularization to determine which produces better results.
Model Selection
- The speaker selects the SDXL-based model.
- They choose the "images" folder as input data.
- Classification images are optional for training.
Training Options
- Regularization is recommended for training.
- Two repeats per image are used.
- A batch size of one is used, with bf16 (or fp16) mixed precision, which suits NVIDIA training.
- Default settings are kept for other parameters.
Image Resolution and Network Rank
This section focuses on image resolution and network rank settings for training the SDXL-based model.
Image Resolution
- The recommended image resolution is 1024 by 1024.
- Bucketing can be enabled if the dataset contains mixed resolutions, but in this case all files are 1024.
Network Rank
- Lower network ranks should be used than with the original SD 1.5 model.
- For example, if SD 1.5 used a network rank of 128, a value around 16 or 8 should be used here.
- Experimental values such as network alpha 4 or 1 can be tried.
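For comparison with the SD 1.5 settings, the SDXL-side values from this section might be sketched as follows. The key names again follow Kohya sd-scripts conventions, and the numbers are only the examples given above.

```python
# Illustrative SDXL settings from this section, in Kohya-style
# argument form; rank/alpha are the example values mentioned above.
sdxl_args = {
    "resolution": 1024,
    "enable_bucket": False,  # all files here are already 1024x1024
    "network_dim": 8,        # far lower than the 128 used for SD 1.5
    "network_alpha": 1,      # experimental: try 4 or 1
}
print(sdxl_args["network_dim"])  # 8
```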
Advanced Settings and GPU Considerations
This section covers advanced settings and considerations related to GPU capabilities during training.
Advanced Settings
- Gradient checkpointing can be used to reduce VRAM usage when GPU memory is limited (e.g., using a GPU with only 12GB VRAM).
- xformers can be used, along with the option to train the U-Net only, though it may not produce significant differences.
GPU Considerations
- The speaker mentions having a GPU with 24GB VRAM.
- If the GPU has 12GB VRAM, gradient checkpointing is recommended to reduce VRAM usage.
Noise Value and Instance Prompt
This section discusses setting a noise value and using an instance prompt during training.
Noise Value
- Set the noise offset based on recommended values, such as 0.05, or 0.0357 for SDXL training.
Instance Prompt
- A concept SDXL already knows can be reused as the instance prompt for faster training and potentially better results.
- For example, if training a character similar to another character in SDXL, the name of the character can be put as the instance prompt.
Art Styles and Training Options
This section focuses on art styles and different training options for SDXL.
Art Styles
- SDXL already covers most art styles, so selecting a specific art style may not produce significant improvements.
Training Options
- The speaker starts testing with the "train U-Net only" option.
- Another test will be conducted without the U-Net-only option, training the full network as well.
- The same parameters as before are used for both tests.
Testing and Comparing Results
In this section, the speaker explains how to conduct testing and compare results using different settings.
Testing Process
- Select a checkpoint (e.g., Photon for Stable Diffusion 1.5) for testing.
- Use the X/Y/Z plot script with prompt search/replace to swap in different values.
Comparing Regularization vs. Non-Regularization
- Test with regularization and without regularization.
- Use a random seed to generate multiple images.
Comparing Different Weights
- Test the style using weights from different epochs (e.g., epoch 2 and epoch 4).
- Generate pictures for each epoch and compare the results.
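The comparison grid described here can be sketched programmatically. A minimal example assuming the common `<lora:name:weight>` prompt syntax; the file names and prompt are illustrative, not from the video:

```python
import itertools

# Hypothetical epoch checkpoints and LoRA strengths to compare.
epochs = ["mystyle-000002", "mystyle-000004"]  # epoch-2 and epoch-4 weights
strengths = [0.5, 1.0]

# One prompt per (epoch, strength) cell of the comparison grid.
prompts = [
    f"a young woman, <lora:{name}:{w}>"
    for name, w in itertools.product(epochs, strengths)
]
for p in prompts:
    print(p)
```

Feeding these prompts through the X/Y/Z plot script (or any batch runner) with a fixed seed produces directly comparable images per epoch and weight.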
Results Analysis
This section focuses on analyzing the results obtained from testing different settings.
Style Strength and Regularization
- A full-strength weight of 1 produces the best results.
- Regularization generally produces better results than without regularization.
Regularization and Results Comparison
The speaker discusses the impact of regularization on the results obtained in image sketching. They compare the outcomes with and without regularization, highlighting that regularization tends to produce better results.
Regularization Effects on Sketching Results
- With regularization, the produced sketches appear more refined than those without.
- At a weight of 1, the results are good and acceptable.
- When mixed with another LoRA at half weight, it turns the image black and white while slightly altering its appearance.
- At a weight of 2, although not always perfect, it still produces recognizable objects.
Further Testing and Flexibility of Regularization
The speaker emphasizes conducting additional tests using different checkpoints and weights to explore the effects of regularization further. They highlight that regularization generally provides more flexible results in terms of sketching and style.
Testing Different Checkpoints and Weights
- Tests should be conducted on various checkpoints with different objects to observe how regularization performs.
- At a weight of 1 or 1.5, good results resembling hand sketches are achieved.
- Even at a weight of 2, acceptable outcomes can be obtained for certain objects.
- Regularization proves to be more flexible in producing desired sketching effects compared to other methods.
Style Learning in Stable Diffusion SDXL
The speaker discusses applying style learning techniques in Stable Diffusion SDXL. They compare different epochs and variations in training options to evaluate their impact on image style.
Comparing Different Training Options
- Comparisons are made between different epochs and training options (e.g., U-Net only).
- Results show slight differences between epochs, but no significant improvement is observed with the U-Net-only option.
- The tree image's style changes slightly when trained with U-Net only, and the resulting file is smaller.
- Regularization is suggested to enhance results even further.
Results and Conclusion
The speaker summarizes the findings of the study and draws conclusions based on the experimental results. They highlight the applicability of the principles discussed and emphasize the importance of regularization for achieving desired outcomes.
Findings and Final Conclusion
- The principles discussed in this video can be applied to various styles.
- Regularization consistently improves results compared to not using it.
- Stable Diffusion SDXL demonstrates potential for style learning, building on existing techniques.
- Further experimentation and testing are recommended to explore different strengths and variations in regularization.