# Hashes

## Data integrity checks

### Hash functions

Hash functions (take input and return fixed length output) (h=hash(m))

#### Data integrity checks

Needs to be very different for small changes. so typo has different hash for example. corrput data needs to be noticed.

#### Checksums

if two files are the same then hashes the same

#### Introduction

Want following properties for a hash function

Deterministic, so the same hash is always created.

Quick to compute hash

Cannot generate input from hash, except for brute forcing inputs

Small changes to document should cause large charges to hash, such that the two hashes appear uncorrelated

Can’t find multiple documents with the same hash, practically.

Can be used to verify files, check passwords.

So possible vulernabilities are:

Given hash, find message (Pre-image resistence)

Given input, find another input with the same hash (second pre-image resistance)

Collission resistance (find two inputs with same hash)

We want to prevent accidental changes to file, and deliberate changes to file. Vulerabilites are more importnat for latter.

### Pre-image attacks

Given hash value h, can we find message m?

### Second pre-image attacks

Given $$m_1$$, can we find $$m_2$$ with same hash?

### Hash collision

Can i find any two matching messages?

#### Hash collision attacks

I can get someone to vouch for one of the messages, and then claim they vouched the other.

### Salting

It is possible to brute force hashes, especially for smaller inputs such as short passwords.

If password hashes for a hashing algorithm were brute forced, then passwords could easily be recovered from another hash table.

To prevent this a salt can be added to the document.

If a password is “apple”, then instead the salt “xyz” could be added to create “applexyz”. This prevents the previous cracking of “apple” to be used.

The salt would then be stored in plaintext alongside the password hash.

## Examples of hash functions

### Data integrity checks

Hash functions (take input and return fixed length output) ($$h=hash(m)$$) Needs to be very different for small changes. so typo has different hash for example. corrput data needs to be noticed. Checksums (if two files are the same then hashes the same)

Brute force attacks

Pre-image attacks (h3 on attack, h3 on defence: Given hash value $$h$$, can we find message $$m$$?) Second pre-image attacks (h3 on attack, h3 on defence: given $$m_1$$, can we find $$m_2$$ with same hash?) Hash collisions (h3 on attack, h3 on defence: can i find any two matching messages? )