How to Better Store Passwords in a Database

How Would You Store Your User's Password?

How would you store a password if you were asked to create an authentication system? The easiest solution would be to store the plain text entered by the user. For example, if a user enters codecurated@codecurated.com as their email and weakpassword as their password, we can insert both values into our users table, right?

When the user tries to log in, we can query the table with SELECT * FROM users WHERE email = {email} AND password = {password}. If the query returns a result, then we can authenticate the user. Task done? Well, not quite.

Problem With Storing Plain Text Password

Even though the solution above will work, it is not secure and is prone to many attacks. The most direct attack, and one that might not be obvious, is an internal attack: anyone with access to your database, including employees and yourself, can easily read users' passwords and obtain their credentials.

A data breach is another reason why storing plain-text passwords is a terrible idea. Someone you never intended to give access to might acquire your database; it happens all the time, even to large companies. With plain-text passwords, the attacker can get all of your users' credentials simply by querying your users table.

Better Way of Storing User's Password?

The first step is to hash the user's password with a hashing function before storing it in the database. Unlike encryption, a hashing function only goes one way, and hashing a specific string always produces the same result. These properties make hashing a good fit for password storage.

One of the most popular and secure hashing functions is SHA256. If we try to hash weakpassword with SHA256, the result would be:

9b5705878182ccecf493b6c5ef3d2c723082141d0af33432c997b52dcc9f3e71

A hashing function only goes one way, so we can't convert the hash result back to weakpassword. Also, every time weakpassword is hashed using SHA256, the result will be the same.
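As a minimal sketch of this step in Go, the standard library's crypto/sha256 package can compute the digest. The helper name hashPassword is illustrative and not part of any library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashPassword is a hypothetical helper: it returns the SHA-256 digest of
// the password as a lowercase hex string (64 characters).
func hashPassword(password string) string {
	sum := sha256.Sum256([]byte(password))
	return hex.EncodeToString(sum[:])
}

func main() {
	first := hashPassword("weakpassword")
	second := hashPassword("weakpassword")

	fmt.Println(first)
	// Hashing is deterministic: the same input always yields the same digest.
	fmt.Println("same digest:", first == second)
}
```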

Looks good. Now the attackers wouldn't be able to know the users' passwords even if they can access your database. But, it's not good enough.

A rainbow table attack uses a table of precomputed hashes of common passwords. With it, the attacker can recover the credentials of users with weak passwords in your database. To combat this, a salt is usually used when hashing a password. A salt is a randomly generated value that you combine with the password before hashing it. For example, if we generate jvFJ4 as a salt, combine it with weakpassword, and hash the result (sha256("jvFJ4weakpassword")), it produces:

b104c5bf49e2e4937ac2419e94864f7209014a96cae582302f6e5f891e426e22

This is a completely different result from the hash of plain weakpassword.
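To make the effect of the salt concrete, here is a rough sketch in Go. It uses a plain lookup map as a simplified stand-in for a rainbow table (a real one is stored more compactly), and the helper names and the short password list are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sha256Hex returns the SHA-256 digest of s as a hex string.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// buildLookupTable precomputes hash -> password for a list of common
// passwords. This is a simplified stand-in for a rainbow table: the idea of
// computing the hashes ahead of time is the same.
func buildLookupTable(common []string) map[string]string {
	table := make(map[string]string, len(common))
	for _, p := range common {
		table[sha256Hex(p)] = p
	}
	return table
}

func main() {
	table := buildLookupTable([]string{"123456", "password", "weakpassword"})

	// Unsalted hash: the attacker finds it in the precomputed table.
	unsalted := sha256Hex("weakpassword")
	recovered, found := table[unsalted]
	fmt.Println("unsalted found:", found, recovered)

	// Salted hash: the precomputed table no longer contains it.
	salted := sha256Hex("jvFJ4" + "weakpassword")
	_, found = table[salted]
	fmt.Println("salted found:", found)
}
```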

With a salt, our users table now stores the salt alongside the hashed password for each user.

What do we need to do when the user logs in?

  • The user will submit their email and plain password
  • The system needs to query the users table by email, e.g., SELECT * FROM users WHERE email='codecurated@codecurated.com' to get the hashed password and salt
  • Compute SHA-256(salt + inputted password)
  • Compare the result with the hashed password. If it's the same, it means the user has inputted the correct password, and the system can authenticate the user.
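The steps above can be sketched in Go roughly as follows. This is only an illustration under a few assumptions: an in-memory map stands in for the users table, the type and function names are made up, and the constant-time comparison is an extra precaution added by this sketch rather than something the steps require.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// user mirrors one row of the users table: the salt is stored next to the hash.
type user struct {
	salt string
	hash [32]byte
}

// newSalt returns n random bytes from the OS random source, hex-encoded.
func newSalt(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// register creates the stored record for a new password.
func register(password string) (user, error) {
	salt, err := newSalt(16)
	if err != nil {
		return user{}, err
	}
	return user{salt: salt, hash: sha256.Sum256([]byte(salt + password))}, nil
}

// verify recomputes SHA-256(salt + input) and compares it with the stored
// hash in constant time.
func verify(u user, input string) bool {
	got := sha256.Sum256([]byte(u.salt + input))
	return subtle.ConstantTimeCompare(got[:], u.hash[:]) == 1
}

func main() {
	// The map lookup stands in for querying the users table by email.
	users := map[string]user{}
	u, err := register("weakpassword")
	if err != nil {
		panic(err)
	}
	users["codecurated@codecurated.com"] = u

	stored := users["codecurated@codecurated.com"]
	fmt.Println(verify(stored, "weakpassword"))  // true
	fmt.Println(verify(stored, "wrongpassword")) // false
}
```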

We have mitigated many attacks with this design, but is it enough? Well...

The approach above mitigates a lot of attacks, but not a dictionary attack. What if the attacker gets the salt, combines it with a common password, hashes the combination, and compares it with the hash in the database? Only time separates the attacker from your users' passwords. And time is a variable that we can tune.
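A dictionary attack against a salted SHA-256 hash can be sketched in a few lines of Go. The function names and the tiny candidate list are assumptions for illustration; a real attacker would use a much larger dictionary.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// saltedHash returns hex(SHA-256(salt + password)).
func saltedHash(salt, password string) string {
	sum := sha256.Sum256([]byte(salt + password))
	return hex.EncodeToString(sum[:])
}

// dictionaryAttack tries each candidate with the leaked salt and reports
// the first one whose hash matches the leaked hash.
func dictionaryAttack(salt, leakedHash string, candidates []string) (string, bool) {
	for _, c := range candidates {
		if saltedHash(salt, c) == leakedHash {
			return c, true
		}
	}
	return "", false
}

func main() {
	// Values an attacker would read from a leaked users table.
	salt := "jvFJ4"
	leaked := saltedHash(salt, "weakpassword")

	common := []string{"123456", "password", "qwerty", "weakpassword"}
	password, ok := dictionaryAttack(salt, leaked, common)
	fmt.Println(password, ok) // weakpassword true
}
```

The salt forces the attacker to redo this loop for every user, but each individual guess is still only as expensive as one hash computation.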

SHA256 is unsuitable for password hashing because it is designed to be computed quickly and assumes its input is complex enough, which a password often is not. I tried hashing weakpassword 10 million times on my AMD Ryzen 5 3600 (6 cores, 12 threads @ 3.6 GHz) CPU, and it finished very quickly.

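// Requires the crypto/sha256, log, and time packages.
h := sha256.New()
start := time.Now()
for i := 0; i < 10000000; i++ {
	// Write feeds the bytes into one running hash state; no separate digest
	// is finalized per iteration, so this approximates raw SHA-256 throughput.
	h.Write([]byte("weakpassword"))
}
elapsed := time.Since(start)
log.Printf("SHA256 took %s \n", elapsed)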

The result is:

2022/07/28 16:57:00 SHA256 took 337.2232ms 

With SHA-256, an attacker can hash and compare many candidate passwords very quickly. We need a slower hashing function.

Bcrypt

This is where Bcrypt comes in. Bcrypt is a password hashing function based on the Blowfish cipher that lets you configure how expensive it is to compute. This trait is perfect for password hashing because you can raise the cost as machines get faster, which future-proofs the hashing function.

Let's see Bcrypt in action:

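// Requires fmt, time, and golang.org/x/crypto/bcrypt.
for i := 10; i < 21; i++ {
	start := time.Now()
	// The second argument is the cost; the result and error are ignored
	// here because only the elapsed time matters.
	bcrypt.GenerateFromPassword([]byte("weakpassword"), i)
	elapsed := time.Since(start)
	fmt.Printf("cost: %d Elapsed time: %s\n", i, elapsed)
}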

The result is:

cost: 10 Elapsed time: 56.233ms
cost: 11 Elapsed time: 113.1521ms
cost: 12 Elapsed time: 213.0455ms
cost: 13 Elapsed time: 447.572ms
cost: 14 Elapsed time: 877.3284ms
cost: 15 Elapsed time: 1.8126554s
cost: 16 Elapsed time: 3.375513s
cost: 17 Elapsed time: 6.5935858s
cost: 18 Elapsed time: 13.3655301s
cost: 19 Elapsed time: 27.0033831s
cost: 20 Elapsed time: 53.8954938s

We can see that the time roughly doubles with each increment of the cost, so it grows exponentially. A higher cost means better security but a worse user experience. Just imagine: if you set the cost to 20, the user will need to wait about 54 seconds when logging in. But if you set it too low, it will be easier for the attacker to steal your users' credentials.
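One quick way to check how the time scales is to divide each measured timing by the previous one. The sketch below does that with the figures listed above, converted to milliseconds; the variable and function names are illustrative.

```go
package main

import "fmt"

// measuredMs holds the timings reported above, in milliseconds,
// for cost 10 through 20.
var measuredMs = []float64{
	56.233, 113.1521, 213.0455, 447.572, 877.3284, 1812.6554,
	3375.513, 6593.5858, 13365.5301, 27003.3831, 53895.4938,
}

// ratios returns each timing divided by the previous one.
func ratios(ms []float64) []float64 {
	if len(ms) < 2 {
		return nil
	}
	out := make([]float64, 0, len(ms)-1)
	for i := 1; i < len(ms); i++ {
		out = append(out, ms[i]/ms[i-1])
	}
	return out
}

func main() {
	for i, r := range ratios(measuredMs) {
		fmt.Printf("cost %d -> %d: x%.2f\n", 10+i, 11+i, r)
	}
}
```

Every ratio lands close to 2, which is what you would expect if each cost step doubles the work.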

✍️
Let's do some math. Suppose you have ten million users in your database, and the attacker has a dictionary of 1000 most common passwords. How long would it take for the attacker to calculate the password hash with SHA256 compared to Bcrypt with the cost of 12?

First, we need to calculate how many hash operations the attacker has to perform, which is the number of users multiplied by the number of common passwords the attacker tries: 10,000,000 * 1,000 = 10,000,000,000. Now we calculate the time the attacker needs to compute ten billion hashes.

For SHA256, we did ten million calculations in 337.2232 ms. So the total is 10,000,000,000 / 10,000,000 * 337.2232 ms = 337,223.2 ms, which is under 6 minutes.

Next, let's try Bcrypt with cost 12: 10,000,000,000 * 213.0455 ms = 2.130455e+12 ms, which is about 67.5 years on the same machine. As you can see, using Bcrypt as your password hashing function makes a lot of difference. If a data breach happens to your database, it will buy you a lot of time to notice it and ask your users to change their passwords.
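The estimate can be reproduced with a few lines of arithmetic in Go. This sketch assumes that the SHA256 timing covers the 10,000,000 iterations of the benchmark loop and that all the work runs sequentially on the same machine; the constant and function names are illustrative.

```go
package main

import "fmt"

const (
	users           = 1e7      // users in the database
	commonPasswords = 1e3      // size of the attacker's dictionary
	shaBatch        = 1e7      // iterations in the SHA256 benchmark loop
	shaBatchMs      = 337.2232 // measured time for that whole batch
	bcryptMs        = 213.0455 // measured time for one Bcrypt hash at cost 12
)

// sha256TotalMs estimates the time to try every (user, password) pair with SHA256.
func sha256TotalMs() float64 {
	return users * commonPasswords / shaBatch * shaBatchMs
}

// bcryptTotalMs estimates the same workload with Bcrypt at cost 12.
func bcryptTotalMs() float64 {
	return users * commonPasswords * bcryptMs
}

func main() {
	const msPerMinute = 1000.0 * 60
	const msPerYear = 1000.0 * 60 * 60 * 24 * 365.25

	fmt.Printf("SHA256: %.1f minutes\n", sha256TotalMs()/msPerMinute)
	fmt.Printf("Bcrypt: %.1f years\n", bcryptTotalMs()/msPerYear)
}
```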

Besides letting you set the cost, Bcrypt also uses a salt by default, which means the attacker won't be able to run the rainbow table attack we discussed previously. Let's see what the result is if we hash weakpassword with Bcrypt:

hashedPassword, _ := bcrypt.GenerateFromPassword([]byte("weakpassword"), 10)
fmt.Printf("hashed password: %s", hashedPassword)

The result is:

hashed password: $2a$10$.krQtTcne8xlhG2rJONbKu9KZepUpwl8tyC/fFIB6lRmNufvPfge2

If we break the result down, we will get:

Bcrypt breakdown
  1. alg: The hash algorithm identifier; $2a means Bcrypt
  2. cost: The Bcrypt cost; remember that we set this to 10 in the code
  3. salt (22 characters): Random salt for password hashing generated by the Bcrypt hashing function
  4. hashed password (31 characters): The hashing function result.
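To see the breakdown on the actual string, here is a small Go sketch that splits a Bcrypt hash into those four parts. The splitBcrypt helper and the bcryptParts type are made up for this example; they only look at the layout of the string and do not verify anything.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// bcryptParts holds the fields of a Bcrypt hash string.
type bcryptParts struct {
	alg, cost, salt, hash string
}

// splitBcrypt is a hypothetical helper that splits "$alg$cost$<salt><hash>"
// into its fields. It only inspects the string layout.
func splitBcrypt(s string) (bcryptParts, error) {
	fields := strings.Split(s, "$")
	// The leading "$" yields an empty first field, so 4 fields are expected,
	// and the last one must hold 22 salt + 31 hash characters.
	if len(fields) != 4 || fields[0] != "" || len(fields[3]) != 53 {
		return bcryptParts{}, errors.New("unexpected bcrypt hash format")
	}
	return bcryptParts{
		alg:  fields[1],
		cost: fields[2],
		salt: fields[3][:22],
		hash: fields[3][22:],
	}, nil
}

func main() {
	p, err := splitBcrypt("$2a$10$.krQtTcne8xlhG2rJONbKu9KZepUpwl8tyC/fFIB6lRmNufvPfge2")
	if err != nil {
		panic(err)
	}
	fmt.Println("alg: ", p.alg)
	fmt.Println("cost:", p.cost)
	fmt.Println("salt:", p.salt)
	fmt.Println("hash:", p.hash)
}
```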

Lastly, let's see how to validate the password hashed by Bcrypt:

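// Requires fmt and golang.org/x/crypto/bcrypt.
hashedPassword, _ := bcrypt.GenerateFromPassword([]byte("weakpassword"), 10)
// CompareHashAndPassword reads the cost and salt from hashedPassword
// and returns nil when the password matches.
err := bcrypt.CompareHashAndPassword(hashedPassword, []byte("weakpassword"))

if err != nil {
	fmt.Print(err)
} else {
	fmt.Print("Password true")
}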

As we can see, we don't need to pass the cost, alg, and salt when comparing the hash and password, because every required input is already embedded in the Bcrypt hash result itself.

Let's review. There are two essential traits of Bcrypt that make it suitable for password hashing:

  1. Bcrypt lets us determine the cost of calculating the hash result, which makes it future-proof against faster machines.
  2. Bcrypt computes its hashes using a salt, which makes a rainbow table attack impractical.

Next Step

We've discussed how to store your users' passwords correctly so that attackers can't figure them out quickly. But storing passwords securely doesn't mean an attacker can't get them some other way. For example, the attacker can watch a user type in their password. There is also a chance that the attacker can mount a man-in-the-middle attack if your website doesn't use HTTPS.

If you want to understand more about how to secure the authentication process, I urge you to read about:

  1. MFA
  2. Passwordless Login (Login by OTP, Magic Link, and WebAuthn)
  3. TLS Protocol
