Don't like this style? Click here to change it! blue.css

LOGIN:
Welcome .... Click here to logout

Hashes, Passwords, Salts, Stretching

Here are the current cutting edge specs for password storage systems:

The stages of our story will be the following:

  1. Simple password systems
  2. Basic Hashing of strings
  3. Rainbow Tables
  4. Salting your hashes
  5. Stretching your hashes

Hashing

Here is a site where you can get a random dictionary word, hashed with a sophisticated hashing algorithm, and then recreate it in javascript:

See the Pen wMYbyw by Andy Novocin (@AndyNovo) on CodePen.

Play with that a bit, and let's collectively list properties of the hashing process.

Give yourself a challenge: fork and edit that pen. Replace https://websec.prof.ninja/scripts/sha512_with_answer.php with https://websec.prof.ninja/scripts/sha512.php. Now, using the language of your choice, find the word from https://websec.prof.ninja/dictionary.txt that created the hash challenge you came up with. If you're feeling lazy I have a cousin of what you need in python2 just below this.

Time how long it took you to reverse that hash.

If it costs $5/month to run a single core VM on AWS (EC2) which can hash at about the same rate as your timing, how much money would it take to reverse the hash of a password which is one random word? How about two random words? Three? Four? Five?

XKCD for anything

If your password were randomly chosen from a set of N options how large would N need to be to have it cost an opponent 10 million USD to crack your password?

Google the following hash: 4ce18085d50ac35207a3e167aafb7cfb42ada855a377bb3c19e0cc0ac55f0d72c862deb698ce7f68411ffb916489cb4563c1c4fbaab9c5fde46de9475713cb8a did it have any results? What word was hashed to make that digest?

If Google has indexed your hash digest before, even if it came from a set of N so large that it costs $10 Million to crack, does it still take $10 million to hack?

Take a password from the top 100 most common passwords, and take the sha256 hash of it, then google that. Now add some "random" bytes to the back of it, how much do you need to add before google doesn't recognize it?

That task shows you, by way of google, that hashing is not enough. Here I take a weak password and hash it using SHA256. Next I will add a keyboard cat salt to the end and hash that. Now the next task should convince you that salting helps protect your users from themselves (at least a little).

I'd like to take a second and talk algorithmic complexity, specifically the difference between \(\mathcal{O}(n)\) and \(\mathcal{O}(\log{}n)\) in terms of intuition:

Here's a table: http://cs.prof.ninja/slides/class3/#7 and the a Scale of the Universe

Rainbow Tables and Salts

Create a database where you have hashed every word in the dictionary using sha256 and sha512. If you use an index how does the search cost change vs no index in reverse hashing now? This idea is the rainbow table. Tools like John the Ripper use this kind of strategy.

Adapt the login SQL query/ies to now use a hash and a salt. It used to be something like select * from user_auth where userid=:userid and password=:password

Stretching

This is one of my favorite concepts. It is a simple trick that lets you make your hash function take arbitrarily long to compute. That means that you can change the math on your attackers. Instead of 2 million hashes per CPU-second it can be 1 hash per CPU second, or even a fraction of a hash per second. So now you are in control of setting how hard it is for an attacker to break you.

Now we will apply some asymptotic leverage to this situation by "stretching" the passwords. We will cobble together a more super-powered hashing scheme by repeating the hash over and over again.

Here is the pseudo-code:

$$ x_0 := "" $$

for \(r\) iterations: $$ x_i := h(x_{i-1} \parallel p \parallel s) $$

$$ K := x_r $$

Here the \( \parallel \) notation means concatenate as strings.

That notation is particularly mathy, so it might seem too intense. That is a knee-jerk reaction from math-anxiety. It's just different and you can treat it like learning the rules to baseball. Here is my simpler explanation:

\(x_0 := ""\) means set a variable \(x_0\) to the empty string, the \(:=\) notation is assignment, so if I wanted to set \(w\) to 23 I could write \(w := 23\) and you would read it more like a command than an observation.

Then the \(r\) iterations. This is saying that we want to loop over many hashes. Let me show you by example:

If \(x_0 :=\) "" then \(x_1\) will be the hash of "\(pass1234SALTHERE\)". Suppose that the hash of that is "\(abcd\)" (it wouldn't be of course but we're explaining here). Then \(x_2\) would be the hash of
"\(abcdpass1234SALTHERE\)" and so on and so on and so on.

Picking an r: The more often you iterate this process the more expensive it is to do a single hash. So for instance with \(r\) around 1 million we just increased the brute-force reverse cost to 1 core year even with just 2 random words. The cost of doing the hashing should be as large as our user can stand, normally this about 1 second.

Kata: Salt and stretch a passphrase with specified stretch factor

Test this out: Write a hash function that consumes, password, salt, r where r is the number of iterations/stretches we do. Hand compute the result for a specific password and salt when r=3 then make sure your function's output matches (even the screenshot I have above this task).

In Summary

1 Flag

Just login

2 Flag