What is Hashing? Hashing Algorithm, Hash Collisions & Hash Functions
Understanding Hashing and Its Importance
Introduction to Hashing
- Hashing is a foundational concept in computer science, crucial for various applications.
- The video aims to clarify what hashing is, including hash functions, hash collisions, and the implications of these collisions.
Functions vs. Hash Functions
- A function typically takes inputs to produce outputs; however, not all functions are hash functions.
- A hash function must adhere to three essential rules:
- It must be deterministic (same input yields the same output).
- The output length should be fixed regardless of input size.
- The output must be irreversible (cannot deduce input from output).
Characteristics of a Good Hash Function
- An example illustrates that basic arithmetic functions do not qualify as hash functions due to variable output lengths.
- A valid hash function's output cannot reveal the original inputs; multiple inputs can yield the same result.
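As a rough illustration of why plain arithmetic fails the fixed-length rule, consider the Python sketch below. The function name `add` and the sample numbers are assumptions chosen for illustration; the source only refers to "basic arithmetic functions".

```python
def add(a: int, b: int) -> int:
    """Plain addition: deterministic, but the output grows with the inputs."""
    return a + b


small = add(1, 2)      # 3, one digit
large = add(500, 600)  # 1100, four digits

# The number of digits depends on the inputs, so the output length is not fixed.
print(len(str(small)), len(str(large)))  # prints: 1 4
```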
Creating a Valid Hash Function: hashify
- The hashify function adds two numbers and returns only the last digit, satisfying all three rules of a hash function:
- Deterministic: Same inputs always yield the same single-digit output.
- Fixed Length: Always returns one digit regardless of input size.
- Irreversible: Impossible to determine original inputs from a single digit.
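The hashify function described above might look like this in Python. This is a minimal sketch, not the video's actual code: the integer-only signature and the use of `abs` to handle negative sums are assumptions.

```python
def hashify(a: int, b: int) -> int:
    """Add two integers and keep only the last digit of the sum."""
    # abs() is an assumption so that a negative sum such as -17 still yields 7.
    return abs(a + b) % 10


# Deterministic: the same inputs always give the same digit.
print(hashify(5, 10), hashify(5, 10))  # prints: 5 5

# Fixed length: large inputs still produce a single digit.
print(hashify(1234, 5678))  # 1234 + 5678 = 6912, prints: 2

# Irreversible: many input pairs map to the same digit,
# so the digit alone cannot tell us which pair was used.
print(hashify(12, 3))  # prints: 5, the same as hashify(5, 10)
```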
Terminology and Concepts
- The result of a hash function is called a "hash," while converting an input into a hash is referred to as "hashing."
Exploring Another Example
- Another example involves returning the last character of a string as its hash:
- This method also satisfies all three rules, so it qualifies as a hash function (though, as the next section shows, not a collision-resistant one).
Understanding Hash Collisions
- Multiple different inputs can lead to the same output (e.g., strings ending with 'o'), which results in what is known as a "hash collision."
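- Minimizing collisions is critical. Because the output length is fixed while the set of possible inputs is unbounded, collisions can never be eliminated entirely; an effective hash function makes them rare and hard to find.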
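The last-character hash and its collisions can be sketched as follows. The names `last_char_hash` and `group_by_hash`, and the sample words, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def last_char_hash(text: str) -> str:
    """Return the last character of a non-empty string as its hash."""
    if not text:
        raise ValueError("this toy hash is only defined for non-empty strings")
    return text[-1]


def group_by_hash(inputs):
    """Map each hash value to the list of inputs that produced it."""
    buckets = defaultdict(list)
    for item in inputs:
        buckets[last_char_hash(item)].append(item)
    return dict(buckets)


words = ["hello", "potato", "world", "avocado"]
buckets = group_by_hash(words)

# Any hash shared by more than one input is a collision.
collisions = {h: items for h, items in buckets.items() if len(items) > 1}
print(collisions)  # prints: {'o': ['hello', 'potato', 'avocado']}
```

Three of the four sample words collide on 'o', which shows how quickly a hash with so few possible outputs runs into collisions.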
Understanding Hash Functions and Their Importance
The Basics of Hashing
- When a password is stored as a hash, developers (or anyone else who can read the stored value) cannot reverse engineer the actual password from it. At login, the submitted password is hashed again with the same function; if the result matches the stored hash, the login succeeds.
- A poor hash function can lead to vulnerabilities. For example, under the last-character hash, a password like "123 confringo" hashes to just a lowercase 'o', allowing an impersonator to log in with any string ending in 'o', such as "999 ascendio".
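The login flow and the impersonation problem can be sketched in Python as below. The helper names (`weak_hash`, `sha256_hash`, `make_login`) are hypothetical. Plain SHA-256 is used here only to keep the sketch inside the standard library and to contrast it with the toy hash; it is not meant as a recommendation for storing real passwords.

```python
import hashlib


def weak_hash(password: str) -> str:
    """Toy hash: the last character of the password."""
    return password[-1]


def sha256_hash(password: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def make_login(hash_fn, real_password: str):
    """Store only the hash, then check attempts by hashing them the same way."""
    stored_hash = hash_fn(real_password)

    def login(attempt: str) -> bool:
        return hash_fn(attempt) == stored_hash

    return login


weak_login = make_login(weak_hash, "123 confringo")
print(weak_login("123 confringo"))  # prints: True (the real password)
print(weak_login("999 ascendio"))   # prints: True (impersonator gets in)

strong_login = make_login(sha256_hash, "123 confringo")
print(strong_login("123 confringo"))  # prints: True
print(strong_login("999 ascendio"))   # prints: False
```

Both logins use the same verify-by-rehashing flow; only the hash function changes, which is why its collision behavior decides whether the impersonator succeeds.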
Hash Collisions and Security Risks
- High collision rates in hash functions increase security risks. Many different inputs yielding the same hash heighten the chances of malicious logins, emphasizing the need for effective collision minimization.
- Fortunately, there are established algorithms designed to minimize collisions. Algorithms such as SHA-256, bcrypt, and Argon2 have been developed through extensive research by computer scientists.
Practical Applications of Hashing
- Testing various hashing algorithms shows consistent output: for any given algorithm, identical input strings always produce identical hashes, although different algorithms produce different hashes for the same input. In each case, the output offers no practical way to work back to the input.
- For a given algorithm, the hash length stays constant across different input sizes; for instance, MD5 always yields 32 hex characters, while SHA-1 always produces 40. This predictable size makes hashes easy to store and compare.
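These properties can be checked with Python's standard `hashlib` module. This is a small sketch: the helper name `hex_digest` and the sample inputs are assumptions, and it presumes MD5 and SHA-1 are enabled in the local Python build (some restricted environments disable them).

```python
import hashlib


def hex_digest(algorithm: str, text: str) -> str:
    """Hash a string with the named algorithm and return the hex digest."""
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


# Fixed length: the digest size does not depend on the input size.
for text in ["a", "hello world", "x" * 10_000]:
    lengths = [len(hex_digest(name, text)) for name in ("md5", "sha1", "sha256")]
    print(lengths)  # prints: [32, 40, 64] on every iteration

# Deterministic: the same algorithm and input always give the same hash.
print(hex_digest("sha256", "hello") == hex_digest("sha256", "hello"))  # prints: True

# Different algorithms give different hashes for the same input.
print(hex_digest("md5", "hello") == hex_digest("sha1", "hello"))  # prints: False
```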
Further Exploration and Use Cases
- Hashing has broad applications beyond passwords; it's integral in system design concepts like load balancing and technologies such as blockchain. Future discussions will delve into real-life use cases of hashing technology.