
Ways to save space on a drive/NAS: Compression and Deduplication
In this video I go over some of the methods I use to save space on my NAS and file shares: finding files that can be deleted, removing duplicate files, and compressing files to save space.
Some of the programs I used in this video:
- WinDirStat: Disk usage statistics viewer https://windirstat.net/
- Ncdu: Disk usage statistics viewer for the command line https://dev.yorhel.nl/ncdu (also on most package managers)
- Fdupes: Basic duplicate file finder https://github.com/adrianlopezroche/fdupes (also on most package managers)
- Czkawka: Powerful duplicate finder, with fuzzy matching https://qarmin.github.io/czkawka/
- 7-Zip: Compression tool with close to the best compression ratio https://www.7-zip.org/ (on most package managers as p7zip-full)
- JpegXL: A new image compression format that offers some of the best compression ratios and lossless JPEG recompression https://jpegxl.info/ (the cjxl and djxl tools I used are available on most package managers in the libjxl package, or on GitHub at https://github.com/libjxl/libjxl)
- Adobe DNG Converter: Converts RAW files to the DNG format, often with better compression https://helpx.adobe.com/camera-raw/using/adobe-dng-converter.html
- FFmpeg: Powerful video and audio compression tool https://ffmpeg.org/
- Shutter Encoder: GUI compression tool using an FFmpeg backend https://www.shutterencoder.com/en/
- Ghostscript: PDF tool with compression abilities https://ghostscript.com/docs/9.56.1/Install.htm (also on most package managers)
00:00 Intro
00:16 Visual disk usage viewers
01:03 Using find to remove matching files
01:34 Finding duplicate files
02:39 Filesystem deduplication
03:25 Replacing duplicate files with hard links
04:06 Compression overview
05:34 File compression with 7-Zip
07:27 Filesystem compression
08:14 Image compression with JpegXL
09:59 Lossless compression ratio comparison
11:16 Raw image compression with DNG
12:13 Video compression with AV1
14:11 PDF compression
15:07 Audio compression
15:53 Conclusion
Ways to Save Space on Your NAS or Drive
In this video, the speaker discusses various methods to save space on a NAS or drive. They cover topics such as finding files to delete, removing duplicates, and compressing files. The speaker also mentions several software utilities that can be used for these purposes.
Finding Files to Delete
- WinDirStat and ncdu are useful utilities for visually analyzing drives and finding large files and file types.
  - Introduction to WinDirStat and its features.
  - Introduction to the ncdu utility on Linux for visual analysis of drives.
- .DS_Store files left by Mac systems can be safely deleted using the find utility in Linux.
  - Using the find utility with specific file names (e.g., .DS_Store) to remove unwanted files.
  - Using the find utility with specific folder names (e.g., temp) to identify folders that can be removed.
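The find steps above can be sketched as follows. This is a minimal example under stated assumptions: the directory tree is a throwaway one created with mktemp, the file names are made up for illustration, and -delete is assumed to be available (it is in GNU and BSD find, but is not part of strict POSIX).

```shell
# Build a small throwaway tree so the commands are safe to try.
demo=$(mktemp -d)
mkdir -p "$demo/photos/temp" "$demo/docs"
touch "$demo/photos/.DS_Store" "$demo/docs/.DS_Store" "$demo/docs/report.txt"

# Step 1: list the matching files first and review the output.
find "$demo" -type f -name '.DS_Store' -print

# Step 2: once the list looks right, delete the same matches.
find "$demo" -type f -name '.DS_Store' -delete

# List (but do not delete) folders named "temp" for manual review.
temp_dirs=$(find "$demo" -type d -name 'temp')
echo "$temp_dirs"
```

Running the -print form before the -delete form is a deliberate habit: the two commands match exactly the same files, so the first acts as a preview of the second.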
Removing Duplicates
- fdupes is a command-line utility in Linux that helps identify duplicate files.
  - Demonstrating the fdupes command with the recursive option to find identical files.
- GUI utilities like Czkawka can help identify both identical and similar duplicate files across different platforms.
  - Introduction to Czkawka as a GUI utility for identifying duplicate files based on content similarity.
- Other web pages provide additional utilities for removing duplicate photos or videos based on content analysis.
  - Mentioning web pages with utilities for removing duplicate files based on content analysis.
- Some file systems, such as ZFS, BTRFS, and NTFS on Windows Server, support built-in deduplication.
  - Discussing file systems that support automatic duplicate data removal.
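To make the idea of finding identical files concrete, here is a rough stand-in for what a duplicate finder does, built only from find, sha256sum, sort, and uniq. It is a sketch, not how fdupes itself is implemented: it assumes GNU coreutils (uniq -w and -D are GNU options), the file names are invented, and it compares hashes only, whereas fdupes also confirms matches byte by byte.

```shell
# Throwaway tree with two identical files and one different file.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
printf 'same content\n' > "$demo/a/one.txt"
printf 'same content\n' > "$demo/b/copy-of-one.txt"
printf 'different\n'    > "$demo/a/two.txt"

# Hash every file, sort so equal hashes are adjacent, then keep only
# lines whose first 64 characters (the SHA-256 hex digest) repeat.
dupes=$(find "$demo" -type f -exec sha256sum {} + | sort | uniq -w64 -D)
echo "$dupes"

# With fdupes installed, the recursive scan would instead be:
#   fdupes -r "$demo"
```

The output lists each group of files that share a digest, which is the same information a duplicate finder reports before you decide what to delete.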
Using Hard Links to Save Space
- Replacing duplicate files with hard links can save space by linking multiple files to the same data on disk.
  - Explaining the concept of hard links and their advantages in saving space.
- The rdup command in Linux can be used to identify duplicate files and replace them with hard links.
  - Demonstrating the use of rdup command with dry run option to identify files suitable for hard link replacement.
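Independent of any particular deduplication tool, the hard link mechanism itself can be shown with cmp and ln. This is a minimal sketch with invented file names in a temporary directory; it assumes both files live on the same file system, which hard links require.

```shell
# Two separate files with identical contents.
demo=$(mktemp -d)
printf 'identical payload\n' > "$demo/original.bin"
cp "$demo/original.bin" "$demo/duplicate.bin"

# Only replace the copy if the contents really are identical.
if cmp -s "$demo/original.bin" "$demo/duplicate.bin"; then
    # -f replaces the existing duplicate with a hard link to the original.
    ln -f "$demo/original.bin" "$demo/duplicate.bin"
fi

# Both names now show the same inode number and a link count of 2.
ls -li "$demo"
```

One design consequence to keep in mind: after linking, both names refer to the same data, so editing the file through one name changes what the other name sees as well.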
Compression Methods
- Compression can be categorized into lossy compression (some quality is lost) and lossless compression (the original file can be recovered exactly).
  - Introduction to lossy and lossless compression methods.
- Various programs can be used for compressing different file types.
  - Mentioning the availability of different programs for compressing specific file types.
Overall, this video provides insights into efficient ways of managing storage space by finding and deleting unnecessary files, removing duplicates, utilizing hard links, and employing compression techniques.
Compression Methods and File Size Reduction
In this section, the speaker discusses different compression methods and their impact on reducing file size while maintaining quality. The importance of lossless data compression for text files is highlighted, while lossy compression is necessary for large data files like videos.
Lossless Compression
- Lossless compression reduces file size without compromising the data.
- It works by identifying repeated data patterns and storing them more compactly.
- Lossless compression is effective for compressible data like text and program binaries.
- Examples of lossless compression tools include the 7-Zip archiving program and the NTFS file system's built-in compression feature.
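The point about repeated patterns can be checked with a quick experiment. The sketch below uses gzip purely as a widely available stand-in for a lossless compressor (it is not the tool used in the video), and the 100,000-byte input size is an arbitrary choice.

```shell
# 100 kB of highly repetitive text versus 100 kB of random bytes.
repetitive_size=$(yes 'the same line again' | head -c 100000 | gzip -9 | wc -c)
random_size=$(head -c 100000 /dev/urandom | gzip -9 | wc -c)

echo "repetitive input compresses to: $repetitive_size bytes"
echo "random input compresses to:     $random_size bytes"
```

The repetitive input shrinks to a small fraction of its size, while the random input barely changes, because random bytes contain no repeated patterns for the compressor to exploit. Already compressed files behave much like the random case.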
Specific Compression Methods for Images
- Images can be large files due to the amount of pixel data they contain.
- JPEG has been a common format, but newer formats like JPEG XL, AVIF, and HEIF offer better compression with similar image quality.
- JPEG XL allows lossless recompression of JPEG images, resulting in smaller file sizes without losing image quality.
Another Rule of Thumb for Compression
This section explores another rule of thumb when it comes to compression: the more you know about the file, the better you can compress it. The speaker explains how using specific compression methods tailored to the type of file can lead to better results.
Understanding File Characteristics for Better Compression
- The more information you have about a file, the more effectively you can compress it.
- For example, using a lossless file compression algorithm specifically designed for images (like JPEG XL) will yield better results than simply zipping an uncompressed image.
- Tailoring the compression method to match the characteristics of the file improves both CPU time usage and overall compression ratio.
Lossy vs. Lossless Compression
This section delves into the differences between lossy and lossless compression methods. The speaker explains that lossy compression is suitable for large data files like videos, while lossless compression is necessary for text and compressible binary files.
Lossy Compression
- Lossy compression reduces file size by removing non-essential information.
- It is commonly used for video files where the raw sensor data is massive.
- Lossy compression allows significant reduction in file size without noticeable quality degradation.
- Lossless compression, by contrast, gains little on files that are already compressed, such as photos and videos.
Archiving Files with 7-Zip
This section demonstrates how to use the 7-Zip archiving program to compress and archive files. The speaker showcases the command line usage of 7-Zip on a Linux system.
Compressing Files with 7-Zip
- Using the command 7z a -mx=9 followed by the desired output filename and input folder, you can create a highly compressed archive file.
- Adjusting the -mx parameter allows you to balance CPU time usage and compression ratio.
- Combining many files into a single archive can also speed up file operations.
- On Windows systems with 7-Zip installed, right-clicking on a file or folder provides similar options for creating archives.
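Put together, the 7-Zip command described above might be wrapped like this. It is a sketch: the function name make_archive and the example file names are hypothetical, and it assumes the binary is installed under the name 7z (some distributions ship it as 7za or 7zz instead).

```shell
make_archive() {
    # $1 = output archive name, $2 = input folder.
    # a     = add files to an archive
    # -mx=9 = highest compression level (slowest); lower values such
    #         as -mx=1 trade compression ratio for speed
    7z a -mx=9 "$1" "$2"
}

# Example call (requires 7z on PATH):
#   make_archive archive.7z ./project
# List the contents afterwards to confirm what was stored:
#   7z l archive.7z
```

Starting at -mx=9 and stepping down only if the run takes too long is one reasonable way to pick a level for archives that are written once and rarely read.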
File Compression at the File System Level
This section explores file compression at the file system level, specifically focusing on NTFS in Windows. The speaker demonstrates how to enable file compression under NTFS properties.
File Compression in NTFS
- Some file systems, like NTFS in Windows, support transparent file compression.
- By right-clicking on a folder, going to Properties > Advanced, and selecting "Compress contents to save disk space," you can compress all existing files within that folder.
- File compression saves disk space but may increase CPU usage when accessing compressed files.
Compression Methods for Images
This section discusses different compression methods specifically tailored for image files. The speaker highlights the advantages of newer formats like JPEG XL, AVIF, and HEIF over traditional JPEG.
Newer Image Compression Formats
- Traditional image formats like JPEG can be improved upon by using newer formats such as JPEG XL, AVIF, and HEIF.
- These formats offer better compression ratios while maintaining or even enhancing image quality.
- JPEG XL allows lossless recompression of existing JPEG images, resulting in smaller file sizes without sacrificing image quality.
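The lossless JPEG recompression described above might look like the following with the cjxl and djxl tools from libjxl. This is a sketch: the function names and file names are hypothetical, and it relies on cjxl's default behaviour of losslessly transcoding JPEG input rather than re-encoding the pixels.

```shell
jpeg_to_jxl() {
    # $1 = existing .jpg, $2 = new .jxl.
    # For JPEG input, cjxl by default repacks the existing JPEG data
    # losslessly instead of decoding and re-encoding the image.
    cjxl "$1" "$2"
}

jxl_to_jpeg() {
    # $1 = .jxl created from a JPEG, $2 = reconstructed .jpg.
    djxl "$1" "$2"
}

# Example round trip (requires the libjxl tools on PATH):
#   jpeg_to_jxl photo.jpg photo.jxl
#   jxl_to_jpeg photo.jxl restored.jpg
#   cmp photo.jpg restored.jpg && echo "identical"
```

Running the cmp check on a few files before deleting any originals is a cheap way to confirm that the round trip behaves as expected on your own images.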
File Compression Algorithms
In this section, the speaker discusses the effectiveness of different file compression algorithms and their impact on file size.
Specific Compression Algorithms
- The speaker demonstrates that more specific file compression algorithms tend to work better.
- Examples are given using different algorithms such as 7-Zip, PNG, and JPEG XL.
- Newer algorithms like JPEG XL perform significantly better than older ones like PNG or 7-Zip.
- Lossy compression can remove fine noise that is not easily noticeable to humans but takes up a lot of disk space.
Raw Image Files
- Raw files store maximum information and can quickly consume space.
- Converting raw files that use older lossless compressors to DNG with Adobe's DNG Converter can save some space.
- However, converting from lossy raw formats may result in bigger DNG files without any benefit.
Video File Compression
This section focuses on video file compression techniques and codecs.
Lossy Video Compression
- Video files are usually compressed using lossy compression to reduce file size.
- Re-encoding an already lossy video with a more efficient codec or at a lower bitrate can save further space, at some cost in quality.
- AV1 codec is recommended for low bit rate but decent quality videos.
Encoding Videos with ffmpeg
- ffmpeg with the SVT-AV1 encoder at preset 6 provides fast encoding for b-roll videos.
- The Opus codec at lower bitrates is suitable for spoken audio.
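An ffmpeg invocation along these lines might look like the sketch below. Only the SVT-AV1 encoder, preset 6, and the Opus codec come from the summary above; the function name, the CRF value of 35, and the 64k audio bitrate are assumptions chosen for illustration, and the ffmpeg build is assumed to include libsvtav1 and libopus.

```shell
encode_av1() {
    # $1 = input video, $2 = output file (for example an .mkv).
    # -c:v libsvtav1 -preset 6 : SVT-AV1 at the speed preset named above
    # -crf 35                  : assumed quality target; lower values give
    #                            better quality and larger files
    # -c:a libopus -b:a 64k    : Opus audio at an assumed bitrate for speech
    ffmpeg -i "$1" \
        -c:v libsvtav1 -preset 6 -crf 35 \
        -c:a libopus -b:a 64k \
        "$2"
}

# Example call (requires ffmpeg on PATH):
#   encode_av1 broll.mp4 broll-av1.mkv
```

Encoding a short clip first and comparing it against the source is a sensible way to settle on a CRF value before converting a whole library.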
Alternative Tool: Shutter Encoder
- Shutter Encoder is a GUI program that uses FFmpeg as its backend and offers similar features.
- It supports various codecs and settings for video and audio encoding.
PDF Compression
This section discusses techniques for compressing PDF files.
- Adobe Acrobat provides detailed information on space usage in PDFs.
- Ghostscript's gs command-line utility can be used to compress images in PDFs, resulting in smaller file sizes.
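A commonly used Ghostscript command for this kind of PDF shrinking is sketched below. The function name and file names are hypothetical, and the /ebook preset is an assumption; the video may have used different settings.

```shell
compress_pdf() {
    # $1 = input PDF, $2 = output PDF.
    # -sDEVICE=pdfwrite     : rewrite the document as a new PDF
    # -dPDFSETTINGS=/ebook  : medium-quality preset that downsamples images;
    #                         /screen is smaller, /printer keeps more detail
    # -dNOPAUSE -dQUIET -dBATCH : run non-interactively and exit when done
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       -dPDFSETTINGS=/ebook \
       -dNOPAUSE -dQUIET -dBATCH \
       -sOutputFile="$2" "$1"
}

# Example call (requires gs on PATH):
#   compress_pdf scan.pdf scan-small.pdf
```

Because the output is written to a new file, the original stays untouched and the two can be compared for size and legibility before anything is deleted.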
Audio Compression
The speaker discusses compressing audio files using different codecs and the trade-off between file size and quality.
Compressing Audio Files
- Uncompressed audio from a CD can be compressed using codecs like Opus to reduce file size with little audible quality loss.
- With Opus compression, the file size can be reduced to about half without any noticeable loss in quality.
- For music or situations where audio quality is important, it is recommended to stick with lossless compression if you have the original source.
- However, for general audio purposes, especially when listening on lower-quality headphones, compressing the audio to a smaller size can be beneficial.
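The two options above, lossless for archival copies and lossy for everyday listening, could be sketched with ffmpeg as follows. The function names are hypothetical, FLAC is an assumed choice of lossless codec (the summary does not name one), and the 96k Opus bitrate is likewise an assumption.

```shell
to_flac() {
    # Lossless: the decoded audio is identical to the source.
    # $1 = input (for example a .wav ripped from CD), $2 = output .flac
    ffmpeg -i "$1" -c:a flac "$2"
}

to_opus() {
    # Lossy: much smaller, intended for casual listening.
    # $1 = input audio, $2 = output .opus; 96k is an assumed bitrate.
    ffmpeg -i "$1" -c:a libopus -b:a 96k "$2"
}

# Example calls (require ffmpeg on PATH):
#   to_flac track01.wav track01.flac
#   to_opus track01.wav track01.opus
```

Keeping the lossless copy as the master and generating lossy copies from it avoids stacking one lossy encode on top of another.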
Audio Compression for Different Media Types
The speaker continues discussing the benefits of compressing audio files for different media types.
Benefits of Compressing Audio Files
- Compressing audio files to a smaller size can be advantageous for various media types.
- It allows for easier storage and transfer of files.
- When listening on lower-quality headphones or devices, the difference in sound quality may not be noticeable.
- Examples of media types where compressing audio can be beneficial include:
  - Audiobooks
  - Podcasts
  - Voice recordings
  - Background music in videos