On Password Handling

March 6, 2021

Introduction

Recently, I’ve been thinking through the implications of building an authentication system. The amount of work required to successfully pull off what amounts to a Boolean decision is staggering. One of the more controversial parts of authentication is the proper handling of passwords. At this point it’s common knowledge that passwords should be hashed, but the how is still very much up for debate.

When I work through designing security software, I try to lean on recommendations from the community. I do this because when it comes to security, rising tides raise all boats. If the recommendations are reasonable, it’s a great way to back up your decision. If they aren’t, or are out of date, it’s a chance to have a discussion and hopefully improve things for others who are looking for the same advice.

I started by looking through the OWASP ASVS to see what requirements there were for authentication. Section V2 is dedicated to the topic, and provides a nice foundation. Unfortunately, if you try to follow the advice from section 2.1.2 together with the OWASP Password Storage Cheat Sheet, you will immediately hit a snag. The problem lies in the following language:

Verify that passwords 64 characters or longer are permitted but may be no longer than 128 characters.

If you’ve examined BCrypt’s limitations, you will quickly realize that passwords of this length are silently truncated, since only the first 72 bytes of the input are used. There are of course ways to work around this with pre-hashing, but this is a sign that requirements have matured beyond BCrypt’s designed capabilities. Before I jumped straight to Argon2, I decided to ask some folks I know who regularly contribute to OWASP:

OWASP ASVS folks, how do you properly resolve 2.1.2, 2.1.3, and the OWASP recommendation to use bcrypt? One of these will have to give.

/cc @mkonda @manicode @sempf
— Aaron Bedra (@abedra) February 15, 2021

The resulting conversation was excellent, and I highly recommend reading through the replies. At the end, the initial thought to use Argon2 was confirmed, and IMHO, should become the new OWASP recommendation, with the exception of PBKDF2 for those of you still working with government requirements. With that I set out to capture a generic algebra for password handling and a way to incorporate Argon2.
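For anyone who has to stay on BCrypt for now, the pre-hashing workaround mentioned earlier can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, SHA-256 followed by Base64 is just one common choice of pre-hash, and no BCrypt library is actually called.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

final class PreHashSketch {
    // Hypothetical helper: SHA-256 the UTF-8 bytes of the password, then Base64 encode the digest.
    // The result is always 44 characters, which fits inside a 72-byte input limit.
    static String preHash(char[] password) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(password));
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected to happen.
            throw new IllegalStateException(e);
        } finally {
            // Clear our copy of the password bytes.
            Arrays.fill(bytes, (byte) 0);
        }
    }

    public static void main(String[] args) {
        // Two 100-character passwords that differ only in the last character,
        // well past the point where truncation would make them identical.
        char[] first = new char[100];
        Arrays.fill(first, 'a');
        char[] second = first.clone();
        second[99] = 'b';

        System.out.println(preHash(first).length());
        System.out.println(preHash(first).equals(preHash(second)));
    }
}
```

The Base64 step matters because a raw digest can contain NUL bytes, which some BCrypt implementations treat as the end of the input.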

A Generic Algebra for Passwords

To an authentication system, a password can exist in two forms¹: plaintext and hashed. It’s important to separate these and provide evidence of which one is being passed around your system. This post will demonstrate how to provide type level evidence of this and encode handling semantics. As I’ve mentioned in previous posts, this isn’t the only way to solve this problem. If you’re in a dynamic language that doesn’t make this type of evidence possible, it’s still reasonable to encode the difference to avoid confusion. Since a password can have two forms, we can fit it nicely into a CoProduct:

public abstract class Password implements CoProduct2<Password.Plaintext, Password.Hashed, Password>

This tells a consumer that a password can be either plaintext, or hashed. Everything else we do will be predicated on having this context and seek to make it impossible to mishandle without blatantly disregarding the semantics. To do this we will provide two underlying types:

public static final class Plaintext extends Password {
    private final char[] value;
    // ...
}

public static final class Hashed extends Password {
    private final String value;
    // ...
}

The rest of the implementation will follow shortly, but it’s good to pause here and consider the value type of each. It wouldn’t be a cardinal sin to use String for the Plaintext class, but it’s important to remember that strings in Java are heap allocated and immutable. This means that the underlying value of any Plaintext object will reside in the heap until it can be garbage collected, with no way for us to overwrite it in the meantime. It’s considered a better practice to use a char[], or potentially a byte[], to reduce the proliferation of strings. You could also consider the creation of a Plaintext object a violation of this constraint, but it’s also an essential part of our algebra going forward. There’s nothing that will prevent callers of this class from doing something akin to "password".toCharArray() and defeating the idea, so don’t consider this to be a strong protection, just a nod to what should be considered proper practice.
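The practical advantage of a char[] is that the caller can overwrite it once hashing is done, which a String never allows. A minimal sketch of that idea, using a hypothetical wipe helper that is not part of the Password class in this post:

```java
import java.util.Arrays;

final class WipeSketch {
    // Hypothetical helper: overwrite the characters in place once they are no longer needed.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] secret = {'h', 'u', 'n', 't', 'e', 'r', '2'};
        // ... hand the array to a hashing function here ...
        wipe(secret);
        // Every slot now holds the NUL character instead of the password.
        System.out.println(secret[0] == '\0');
    }
}
```

This only clears the one array you hold. Any copies made along the way (for example by an encoder) are not affected, so treat it as hygiene rather than a guarantee.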

Next, there are the two methods that need to exist to complete our algebra, hash and verify. There’s nothing about the password algebra that requires a specific implementation of these methods, just that they can be performed. Anyone using these classes should be able to provide the “how” according to their system’s design requirements. This raises our algebra to the level of:

Given a Plaintext under an effect M, furnish me a Hashed

This looks an awful lot like ReaderT. Using ReaderT we can encode a generic abstraction for hash:

public <M extends MonadRec<?, M>> MonadRec<Hashed, M> hash(ReaderT<Plaintext, M, Hashed> passwordReader) {
    return passwordReader.runReaderT(this);
}

And one for verify:

public <M extends MonadRec<?, M>> MonadRec<Boolean, M> verify(
        ReaderT<Tuple2<Plaintext, Hashed>, M, Boolean> hashReader,
        Plaintext plaintext) {
    return hashReader.runReaderT(tuple(plaintext, this));
}
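To make the shape concrete without the library, here is a minimal standalone sketch in plain Java. It assumes a hypothetical Reader class that fixes the effect to Supplier (a deferred computation) in place of the generic M. None of these names come from lambda, and the real ReaderT is considerably more general.

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class ReaderSketch {
    // Hypothetical stand-in for ReaderT<R, M, A>, with the effect M fixed to Supplier.
    static final class Reader<R, A> {
        private final Function<R, Supplier<A>> run;

        Reader(Function<R, Supplier<A>> run) {
            this.run = run;
        }

        // Supplying the environment yields a deferred computation of A.
        Supplier<A> runReader(R environment) {
            return run.apply(environment);
        }
    }

    static final class Plaintext {
        private final char[] value;

        Plaintext(char[] value) {
            this.value = value;
        }

        char[] exposeValue() {
            return value;
        }

        // Mirrors hash(...) above: the password supplies itself as the environment.
        <A> Supplier<A> hash(Reader<Plaintext, A> passwordReader) {
            return passwordReader.runReader(this);
        }
    }

    public static void main(String[] args) {
        // A stub "hasher" that only reports the length; a real one would call a hashing library.
        Reader<Plaintext, String> stub =
                new Reader<Plaintext, String>(p -> () -> "stub:" + p.exposeValue().length);
        System.out.println(new Plaintext("password".toCharArray()).hash(stub).get());
    }
}
```

Plaintext never learns how the hashing is done. It only knows that something was handed to it that can produce a result from a password.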

This will allow us to purely test our password algebra in a side effect free fashion, and separate out the integration testing of any specific hashing implementation. A complete example will look something like the following:

public abstract class Password implements CoProduct2<Password.Plaintext, Password.Hashed, Password> {
    @EqualsAndHashCode(callSuper = false)
    @AllArgsConstructor(access = PRIVATE)
    public static final class Plaintext extends Password {
        private final char[] value;

        public static Plaintext plaintext(char[] value) {
            return new Plaintext(value);
        }

        @Override
        public <R> R match(Fn1<? super Plaintext, ? extends R> aFn,
                           Fn1<? super Hashed, ? extends R> bFn) {
            return aFn.apply(this);
        }

        public char[] exposeValue() {
            return value;
        }

        public <M extends MonadRec<?, M>> MonadRec<Hashed, M> hash(ReaderT<Plaintext, M, Hashed> passwordReader) {
            return passwordReader.runReaderT(this);
        }
    }

    @EqualsAndHashCode(callSuper = false)
    @AllArgsConstructor(access = PRIVATE)
    public static final class Hashed extends Password {
        private final String value;

        public static Hashed hashed(String value) {
            return new Hashed(value);
        }

        @Override
        public <R> R match(Fn1<? super Plaintext, ? extends R> aFn,
                           Fn1<? super Hashed, ? extends R> bFn) {
            return bFn.apply(this);
        }

        public String exposeValue() {
            return value;
        }

        public <M extends MonadRec<?, M>> MonadRec<Boolean, M> verify(
                ReaderT<Tuple2<Plaintext, Hashed>, M, Boolean> hashReader,
                Plaintext plaintext) {
            return hashReader.runReaderT(tuple(plaintext, this));
        }
    }
}

The only other thing to note here is the presence of exposeValue(). It is necessary for our implementation to provide access to the underlying value, but when used, it should be obvious what it is doing and easy to follow any invocations. Separating the method that exposes the value also helps us not fall victim to default stringification under exceptional or logging conditions. This helps prevent plaintext passwords from ending up in stack traces and log files, which should never happen.
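The classes above lean on the default Object.toString(), since @EqualsAndHashCode does not generate a toString of its own. If you would rather make the redaction explicit, one option is to override toString directly. The following is a minimal sketch using a hypothetical Secret class, not the Plaintext class from this post:

```java
final class RedactionSketch {
    // Hypothetical wrapper: the value is only reachable through exposeValue().
    static final class Secret {
        private final char[] value;

        Secret(char[] value) {
            this.value = value;
        }

        char[] exposeValue() {
            return value;
        }

        // Accidental string concatenation or logging sees only this placeholder.
        @Override
        public String toString() {
            return "Secret(<redacted>)";
        }
    }

    static String logLine(Secret secret) {
        // Simulates a careless log statement that concatenates the object.
        return "login failed for " + secret;
    }

    public static void main(String[] args) {
        System.out.println(logLine(new Secret("hunter2".toCharArray())));
    }
}
```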

On to our tests. Because of the algebraic requirements, we can substitute Identity for our effect under test. This will let us test side effect free. The nature of the hashing and verification is not relevant to this particular part of our domain, just that it has an interface that provides the correct way to resolve the essential arrows.

public class PasswordTest {
    private static final Plaintext plaintext = plaintext("password".toCharArray());
    private static final Hashed hashed = hashed("$argon2i$v=19$m=65536,t=10,p=1$VGa5ppUavng3sjMgsdVMjA$BAp8s+zLAbF9qUC5AW+8JARMHkOT8Xip7938ZTD8jSg");

    @Test
    public void equality() {
        //noinspection AssertBetweenInconvertibleTypes
        assertNotEquals(plaintext, hashed);
        assertEquals(plaintext, plaintext("password".toCharArray()));
        assertEquals(hashed("abc123"), hashed("abc123"));
    }

    @Test
    public void exposeValue() {
        assertArrayEquals("password".toCharArray(), plaintext.exposeValue());
        assertEquals("$argon2i$v=19$m=65536,t=10,p=1$VGa5ppUavng3sjMgsdVMjA$BAp8s+zLAbF9qUC5AW+8JARMHkOT8Xip7938ZTD8jSg", hashed.exposeValue());
    }

    @Test
    public void hash() {
        ReaderT<Plaintext, Identity<?>, Hashed> passwordReader = readerT(p -> new Identity<>(hashed));
        Identity<Hashed> actual = plaintext.hash(passwordReader).coerce();

        assertEquals(hashed, actual.runIdentity());
    }

    @Test
    public void verify() {
        ReaderT<Tuple2<Plaintext, Hashed>, Identity<?>, Boolean> hashReader = readerT(constantly(new Identity<>(true)));

        assertTrue(hashed
                .verify(hashReader, plaintext)
                .<Identity<Boolean>>coerce()
                .runIdentity());
    }
}

With this in hand, we can now focus on the real hashing and verification mechanism.

Adding Argon2

First we need to select an Argon2 library. There are a few to consider, and I ended up choosing argon2-jvm. It appears to be maintained, and is a binding to native Argon2 libraries, which are typically well vetted. There are two package options, one with and one without the native libraries packaged. You should choose whichever works best for your environment. Let’s add it to our dependencies:

<dependency>
    <groupId>de.mkammerer</groupId>
    <artifactId>argon2-jvm-nolibs</artifactId>
    <version>2.9.1</version>
</dependency>

Next, we need a way to adapt our new library to the algebra we defined above. Before we create our wrapper, we need to consider the configuration requirements for Argon2:

Salt length
Hash length
Number of iterations
Amount of memory
Parallelism

We will need to have all of these available to our adapter to properly encode the semantics but provide something that can be used in any environment. With that in mind, let’s put it all together:

@AllArgsConstructor(access = PRIVATE)
public final class Argon2Wrapper {
    private final Argon2 argon2;
    private final Iterations iterations;
    private final Memory memory;
    private final Parallelism parallelism;

    public static Argon2Wrapper argon2Wrapper(
            SaltLength saltLength,
            HashLength hashLength,
            Iterations iterations,
            Memory memory,
            Parallelism parallelism) {

        Argon2 argon2 = Argon2Factory.create(saltLength.getValue(), hashLength.getValue());
        return new Argon2Wrapper(argon2, iterations, memory, parallelism);
    }

    public ReaderT<Password.Plaintext, IO<?>, Password.Hashed> passwordReader() {
        return readerT(plaintext -> io(() -> hashed(
                argon2.hash(
                        iterations.getValue(),
                        memory.getValue(),
                        parallelism.getValue(),
                        plaintext.exposeValue()))));
    }

    public ReaderT<Tuple2<Password.Plaintext, Password.Hashed>, IO<?>, Boolean> hashReader() {
        return readerT(into((plaintext, hashed) -> io(() ->
                argon2.verify(hashed.exposeValue(), plaintext.exposeValue()))));
    }
}

Ultimately, we need to be able to furnish the two ReaderTs required by our password algebra. Argon2 is necessarily bound to IO, so our ReaderTs must be encoded over IO: ReaderT<Plaintext, IO<?>, Hashed> for hashing and ReaderT<Tuple2<Plaintext, Hashed>, IO<?>, Boolean> for verification. Inside each reader, we execute the underlying Argon2 library calls using the values set up at construction. You will notice that instead of using int for the configuration values, tiny types have been created to make the resolution of these variables more obvious and to make passing int values in incorrect places a compiler error.
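The tiny types themselves are not shown in this post. As a rough idea of what they might look like, here is a sketch of two of them. The factory and getValue() names follow the usage above, but the bodies, the range checks, and the describe helper are assumptions added for illustration.

```java
final class TinyTypeSketch {
    // Wraps the iteration count so it cannot be confused with any other int.
    static final class Iterations {
        private final int value;

        private Iterations(int value) {
            this.value = value;
        }

        static Iterations iterations(int value) {
            // Assumption: rejecting nonsensical values here is an extra, not from the original.
            if (value < 1) {
                throw new IllegalArgumentException("iterations must be positive");
            }
            return new Iterations(value);
        }

        int getValue() {
            return value;
        }
    }

    // Wraps the memory cost in the same way.
    static final class Memory {
        private final int value;

        private Memory(int value) {
            this.value = value;
        }

        static Memory memory(int value) {
            if (value < 1) {
                throw new IllegalArgumentException("memory must be positive");
            }
            return new Memory(value);
        }

        int getValue() {
            return value;
        }
    }

    // Hypothetical consumer: swapping the two arguments at a call site will not compile.
    static String describe(Iterations iterations, Memory memory) {
        return "m=" + memory.getValue() + ",t=" + iterations.getValue();
    }

    public static void main(String[] args) {
        System.out.println(describe(Iterations.iterations(10), Memory.memory(65536)));
    }
}
```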

With our wrapper in hand, let’s write an integration test that ensures we have done everything properly:

public class Argon2WrapperTest {
    private static final Password.Plaintext plaintext = plaintext("password".toCharArray());
    private static final Password.Hashed hashed = hashed("$argon2i$v=19$m=65536,t=10,p=1$VGa5ppUavng3sjMgsdVMjA$BAp8s+zLAbF9qUC5AW+8JARMHkOT8Xip7938ZTD8jSg");

    private Argon2Wrapper argon2Wrapper;

    @Before
    public void before() {
        argon2Wrapper = argon2Wrapper(saltLength(16), hashLength(32), iterations(10), memory(65536), parallelism(1));
    }

    @Test
    public void argonHash() {
        Password.Hashed actual = plaintext
                .hash(argon2Wrapper.passwordReader())
                .<IO<Password.Hashed>>coerce()
                .unsafePerformIO();

        assertThat(actual.exposeValue(), containsString("$argon2i$v=19$m=65536,t=10,p=1$"));
    }

    @Test
    public void argonVerify() {
        assertTrue(hashed
                .verify(argon2Wrapper.hashReader(), plaintext)
                .<IO<Boolean>>coerce()
                .unsafePerformIO());
    }
}
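The prefix asserted in argonHash works because the encoded hash carries its own parameters alongside the salt and the digest. The following sketch pulls the fields apart using only the JDK. The class and method names are hypothetical, and in practice you would leave parsing to the Argon2 library.

```java
import java.util.Arrays;
import java.util.Base64;

final class EncodedHashSketch {
    // Splits "$argon2i$v=19$m=65536,t=10,p=1$<salt>$<hash>" into its five fields.
    static String[] fields(String encoded) {
        String[] parts = encoded.split("\\$");
        // The string starts with '$', so the first element of the split is empty; drop it.
        return Arrays.copyOfRange(parts, 1, parts.length);
    }

    // The salt and hash fields are Base64 without padding; return the decoded byte count.
    static int decodedLength(String unpaddedBase64) {
        return Base64.getDecoder().decode(unpaddedBase64).length;
    }

    public static void main(String[] args) {
        String encoded = "$argon2i$v=19$m=65536,t=10,p=1$VGa5ppUavng3sjMgsdVMjA"
                + "$BAp8s+zLAbF9qUC5AW+8JARMHkOT8Xip7938ZTD8jSg";
        String[] fields = fields(encoded);

        System.out.println("variant:    " + fields[0]);
        System.out.println("version:    " + fields[1]);
        System.out.println("parameters: " + fields[2]);
        System.out.println("salt bytes: " + decodedLength(fields[3]));
        System.out.println("hash bytes: " + decodedLength(fields[4]));
    }
}
```

For the hash used in these tests, the decoded salt and hash come out at 16 and 32 bytes, matching the saltLength(16) and hashLength(32) passed to the wrapper.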

Wrap-Up

Authentication is hard. Password handling is hard. Designing authentication systems is non-trivial and should be done with the utmost care. Ultimately, it’s best to rely on a vendor that spends all of its time on this problem and can provide you with a foundation better than you can build on your own. If your business isn’t authentication, let someone else help. If you need to build your own authentication, take the advice above into account. It’s also good to think about your handling of passwords as a generic concept, as better hashing mechanisms will inevitably become best practice in the future, and you should be able to swap them without disrupting what it means to hash and verify passwords.

We are now left with a way to properly satisfy OWASP ASVS V2.1.2 without having to resort to additional tricks. You might be wondering about password complexity requirements, and you should. I will spend some time exploring the algebra behind password complexity in a future post, as I believe it is an entirely separate concept that is something you compose with your password algebra, not something that belongs inside of it.

¹ A third form should be considered for hashed passwords that need to be upgraded to a new form. This is out of scope for our current discussion.