fillmore-labs/validation-benchmark

Bean Validation Benchmark

Introduction

In a customer project I came across Get Your Hands Dirty on Clean Architecture [gyhd]. Since value objects seem to be my thing and I had seen Dan Bergh Johnsson’s excellent presentation [pov], my interest was of course piqued.

To get that out of the way first: This is not a review of the book, just a few notes on some of the code from chapter 4 Implementing a Use Case.

Motivation

In the subchapter Validating Input [gyhd] uses a SelfValidating class, implemented like this:

public abstract class SelfValidating<T> {
  private Validator validator;

  public SelfValidating() {
    ValidatorFactory factory = Validation.buildDefaultValidatorFactory();
    validator = factory.getValidator();
  }

  protected void validateSelf() {
    Set<ConstraintViolation<T>> violations = validator.validate((T) this);
    if (!violations.isEmpty()) {
      throw new ConstraintViolationException(violations);
    }
  }
}

The first thing you probably notice is the curious use of generics, whose only effect is an unnecessary unchecked cast. Because of type erasure the type parameter is gone at runtime, so its purpose is unclear.
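To see why the type parameter buys nothing, here is a stripped-down sketch. The names SelfTyped, Wrong and self() are hypothetical and not taken from the book; the only point is that the cast to T is erased, so a mismatched type argument compiles fine and fails later, at the call site.

```java
// Hypothetical reduction of SelfValidating<T>: only the cast is kept.
abstract class SelfTyped<T> {
  @SuppressWarnings("unchecked")
  T self() {
    return (T) this; // unchecked: erased at runtime, so nothing is verified here
  }
}

// Nothing forces T to be the subclass itself, so this compiles.
class Wrong extends SelfTyped<String> {}

class UncheckedCastDemo {
  static boolean failsAtCallSite() {
    try {
      String s = new Wrong().self(); // the compiler-inserted cast to String fails here
      return s == null; // not reached
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsAtCallSite()); // prints "true"
  }
}
```

In the Bean Validation case, a non-generic base class calling validator.validate(this) should work just as well, since the exception constructor accepts a set of violations of any type.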

The second thing is the creation of a Jakarta Bean Validation [jbv] ValidatorFactory for every object. The documentation recommends that you “cache ValidatorFactory objects and reuse them for optimal performances [sic]” and remarks that building a ValidatorFactory can be expensive.

So, let’s benchmark this, shall we?

Structure

We implement SelfValidating as presented in the book in package self_validating.

Then we try to fix some problems in validated, see Discussion.

An alternative is implemented in immutables.

For benchmark purposes we also use a pojo and reimplement self_validating and validated with Jakarta Bean Validation 3.0 in jee9.

Discussion

In Validating Input [gyhd] we read “In the Java world, the de facto standard for [validation checks] is the Bean Validation API.” Well, for Java Beans maybe. There are good alternatives, like Guava’s Preconditions, and with them interdependencies between fields are much easier to express.

Nullability in Java is a much-discussed issue and is in most cases best handled via static analysis. There are tools like the Checker Framework and NullAway; IntelliJ IDEA supports real-time checks; even the Spring Framework provides annotations.

None of them is supported by Jakarta Bean Validation, of course. Also, tools like AutoValue and Immutables provide null-safety by default.

Therefore, we don’t use @NotNull for benchmarking, since it would be an unfair comparison.

In this example we model a person as having a given name with at least one character (which might be blank) and an optional surname with at least one character when present. In practice it is debatable how this would improve your business logic and whether just null-safety and input validation at the external API would be better. To quote Oliver Drotbohm: “I think [Bean Validation is] absolutely awesome for input validation. In a domain model you don’t deal with validation. You implement business constraints. That’s a different story and I often see projects where it’s been applied blindly because it looks convenient.”

But we are here to benchmark validation, not write business logic, so this should suffice.

Improving the Code

If we look at the example we see that we should cache the ValidatorFactory somewhere and close it at application shutdown. Usually, a framework does this for us, and we acquire a Validator via dependency injection.

We might be tempted to semi-fix the issue by creating some kind of global static variable, initialized at application start up, but there are additional problems with inheritance.
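For illustration, the static semi-fix could look like the following sketch. ExpensiveFactory is a hypothetical stand-in for a ValidatorFactory (so the example runs without any dependencies), and SharedFactory is an assumed name; the holder idiom merely guarantees that the costly object is built once, on first use.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive-to-build ValidatorFactory.
final class ExpensiveFactory implements AutoCloseable {
  static final AtomicInteger BUILDS = new AtomicInteger();

  ExpensiveFactory() {
    BUILDS.incrementAndGet(); // count how often the costly bootstrap runs
  }

  @Override
  public void close() {
    // a real factory would release its resources at application shutdown
  }
}

final class SharedFactory {
  private SharedFactory() {}

  // Initialization-on-demand holder: built once, lazily and thread-safely.
  private static final class Holder {
    static final ExpensiveFactory INSTANCE = new ExpensiveFactory();
  }

  static ExpensiveFactory get() {
    return Holder.INSTANCE;
  }
}

class SharedFactoryDemo {
  public static void main(String[] args) {
    SharedFactory.get();
    SharedFactory.get();
    SharedFactory.get();
    System.out.println(ExpensiveFactory.BUILDS.get()); // prints "1"
  }
}
```

This removes the per-object cost, but it leaves open who closes the factory and does nothing about the construction-order problem.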

When we define a non-final, self-validated class and derive another self-validated class from it, we are effectively prevented from instantiating the subclass, since the superclass calls validateSelf() on a partially constructed object, as demonstrated in a unit test. While this is by all means bad programming style and our tools warn us about it, it is nothing I haven’t seen in practice, and it is generally very error-prone.
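The failure mode can be reproduced without Bean Validation. In this minimal sketch, SelfChecked, Named and Employee are hypothetical classes, and a hand-written validateSelf() stands in for the annotation-driven validator; the mechanism (an overridable check invoked from a superclass constructor) is the same.

```java
// Hypothetical stand-in for SelfValidating: checks are hand-written overrides.
abstract class SelfChecked {
  protected abstract void validateSelf();
}

class Named extends SelfChecked {
  private final String name;

  Named(String name) {
    this.name = name;
    validateSelf(); // overridable call from a constructor
  }

  @Override
  protected void validateSelf() {
    if (name == null || name.isEmpty()) {
      throw new IllegalStateException("name must not be empty");
    }
  }
}

class Employee extends Named {
  private final String department;

  Employee(String name, String department) {
    super(name); // runs Employee.validateSelf() before department is assigned
    this.department = department;
  }

  @Override
  protected void validateSelf() {
    super.validateSelf();
    if (department == null) {
      throw new IllegalStateException("department must not be null");
    }
  }
}

class PartialConstructionDemo {
  static String tryCreate() {
    try {
      new Employee("Ada", "Engineering");
      return "created";
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(tryCreate()); // prints "department must not be null"
  }
}
```

Even with perfectly valid arguments the subclass can never be created, because its field is still null when the inherited constructor validates it.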

Also, we are forced to write a constructor with a lot of boilerplate, just to call one simple function, which could easily be forgotten.

So, let’s move the validation into a factory and make sure that no one else could instantiate our object.
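The shape of such a factory, as a dependency-free sketch: PersonFactory and its Consumer<Person> parameter are assumptions made for illustration (the Consumer stands in for an injected Validator and is expected to throw on a violation), not the code in validated.

```java
import java.util.Optional;
import java.util.function.Consumer;

final class Person {
  private final String givenName;
  private final String surname; // null when absent

  // Package-private: meant to be called by PersonFactory only.
  Person(String givenName, String surname) {
    this.givenName = givenName;
    this.surname = surname;
  }

  String getGivenName() {
    return givenName;
  }

  Optional<String> getSurname() {
    return Optional.ofNullable(surname);
  }
}

final class PersonFactory {
  // Stand-in for an injected Validator; expected to throw on a violation.
  private final Consumer<Person> validator;

  PersonFactory(Consumer<Person> validator) {
    this.validator = validator;
  }

  Person create(String givenName, String surname) {
    Person person = new Person(givenName, surname); // fully constructed first
    validator.accept(person); // then validated, outside any constructor
    return person;
  }
}

class PersonFactoryDemo {
  // Example rule set mirroring the constraints described in the Discussion.
  static final Consumer<Person> NON_EMPTY_NAMES = p -> {
    boolean surnameEmpty = p.getSurname().map(String::isEmpty).orElse(false);
    if (p.getGivenName().isEmpty() || surnameEmpty) {
      throw new IllegalArgumentException("names must not be empty");
    }
  };

  public static void main(String[] args) {
    PersonFactory factory = new PersonFactory(NON_EMPTY_NAMES);
    System.out.println(factory.create("Ada", null).getGivenName()); // prints "Ada"
  }
}
```

Because validation happens after construction has finished, subclass fields are always initialized by the time they are checked, and the object itself no longer needs to know how a validator is obtained.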

There is an additional problem: The documentation of Hibernate Validator (the Jakarta Bean Validation implementation we are using) requires that “When validating byte code enhanced objects, property level constraints should be used, …”

Since Project Lombok is clearly byte code enhancing, we need to annotate the properties, which forces us to use JavaBeans-style naming conventions and the experimental onMethod_ option (the underscore is important).

We end up with

import static lombok.AccessLevel.PACKAGE;

import jakarta.validation.constraints.NotEmpty;
import jakarta.validation.constraints.Size;
import java.util.Optional;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.Value;
import lombok.experimental.Accessors;

@Value
@AllArgsConstructor(access = PACKAGE)
@Accessors(fluent = false)
public class Person {
  @Getter(onMethod_ = @NotEmpty)
  String givenName;

  String surname;

  public Optional<@Size(min = 1) String> getSurname() {
    return Optional.ofNullable(surname);
  }
}

which is somewhat ugly, but at least mostly correct.

Alternatives

In The Power of Constructors [gyhd] we read “good IDEs help with parameter name hints”.

Yes, if you never did a side-by-side diff or read code outside your IDE.

We follow Effective Java [ej], “Item 2: Consider a builder when faced with many constructor parameters”. Here we have only two, but we already see the improvement when confronted with an optional surname.

import static com.google.common.base.Preconditions.checkState;

import java.util.Optional;
import org.immutables.value.Value;

@Value.Style(optionalAcceptNullable = true)
@Value.Immutable
public abstract class Person {
  public abstract String givenName();

  public abstract Optional<String> surname();

  @Value.Check
  final void check() {
    var givenNameIsEmpty = givenName().isEmpty();
    var surnameIsPresentAndEmpty = surname().map(String::isEmpty).orElse(false);
    checkState(!givenNameIsEmpty, "Given name must not be empty");
    checkState(!surnameIsPresentAndEmpty, "Surname must not be empty");
  }

  public static final class Builder extends ImmutablePerson.Builder {}
}

The code is a little longer, but pretty readable, correct and we don’t need to write a factory.

Also, when API changes are a problem (“So, why not let the compiler guide us?” [gyhd]), we could use a staged builder.
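To show what “staged” means, here is a hand-written sketch in plain Java. StagedPerson and the two stage interfaces are hypothetical names chosen for this example; a generated builder would look different in detail, but the idea is the same: each mandatory attribute gets its own interface, so build() is not reachable until all of them have been set.

```java
import java.util.Optional;

final class StagedPerson {
  private final String givenName;
  private final String surname; // null when absent

  private StagedPerson(String givenName, String surname) {
    this.givenName = givenName;
    this.surname = surname;
  }

  String givenName() {
    return givenName;
  }

  Optional<String> surname() {
    return Optional.ofNullable(surname);
  }

  // Stage 1: the mandatory attribute has to be supplied first.
  interface GivenNameStage {
    FinalStage givenName(String givenName);
  }

  // Stage 2: optional attributes and build() become available.
  interface FinalStage {
    FinalStage surname(String surname);

    StagedPerson build();
  }

  static GivenNameStage builder() {
    return new Builder();
  }

  private static final class Builder implements GivenNameStage, FinalStage {
    private String givenName;
    private String surname;

    @Override
    public FinalStage givenName(String givenName) {
      this.givenName = givenName;
      return this;
    }

    @Override
    public FinalStage surname(String surname) {
      this.surname = surname;
      return this;
    }

    @Override
    public StagedPerson build() {
      return new StagedPerson(givenName, surname);
    }
  }
}
```

With this shape, StagedPerson.builder().build() does not compile, while StagedPerson.builder().givenName("Ada").build() does. Adding a new mandatory attribute later means adding a stage, and every call site that has not been updated fails at compile time.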

Benchmarking

Prerequisites

You need Bazelisk installed, see also Installing Bazel using Bazelisk.

macOS

Using Homebrew, enter

brew install bazelisk

Windows

Using Chocolatey, enter

choco install bazelisk

Enable developer mode:

  1. Open Windows settings

  2. Go to “Update & security”, then “For developers”

  3. Under “Developer Mode” section enable “Install apps from any source, including loose files”.

or run with administrator privileges.

Running

Run the benchmark with

bazelisk run //:benchmark

or

bazelisk run //:jee9-benchmark

for the Jakarta Bean Validation 3.0 variant.

If you have a local JDK ≥ 11 installed you could also use Gradle:

./gradlew :run

Tests

To run all tests, use

bazelisk test //src/test/...

Results

JMH Java microbenchmarks are hard to do correctly, and I don’t claim to be an expert in that craft. Nevertheless, these numbers give us an idea of the performance characteristics.

Sample run on an Intel® Xeon® E3-1245 v2; 32GB RAM; Linux Kernel 5.4.0:

Table 1. JEE8 Benchmark

Benchmark             Mode  Cnt        Score       Error  Units
Bench.immutables      avgt    5        4.639 ±     0.090  ns/op
Bench.pojo            avgt    5        5.505 ±     0.055  ns/op
Bench.selfValidating  avgt    5  1386941.125 ± 10838.886  ns/op
Bench.validated       avgt    5      580.868 ±     8.928  ns/op

Table 2. JEE9 Benchmark

Benchmark             Mode  Cnt        Score       Error  Units
Bench.immutables      avgt    5        4.662 ±     0.120  ns/op
Bench.pojo            avgt    5        5.495 ±     0.028  ns/op
Bench.selfValidating  avgt    5  1385860.145 ± 11741.320  ns/op
Bench.validated       avgt    5     1151.524 ±     5.205  ns/op

Interestingly enough, Immutables is faster than the POJO implementation, but both are more than a factor of 100 faster than Jakarta Bean Validation and more than 250,000 times faster than the approach of Validating Input [gyhd].

While an overhead of roughly 0.6 to 1.2 μs per message in your application might be acceptable, 1.4 ms will be noticeable under load.
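As a back-of-the-envelope check, the average times can be turned into a single-threaded throughput ceiling. The helper below is a hypothetical sketch that simply inverts the scores from Table 1; it ignores everything else an application does per message.

```java
final class ThroughputEstimate {
  private ThroughputEstimate() {}

  // Upper bound on operations per second for one thread, given average ns/op.
  static double opsPerSecond(double nanosPerOp) {
    return 1_000_000_000.0 / nanosPerOp;
  }

  public static void main(String[] args) {
    // Scores taken from Table 1.
    System.out.printf("selfValidating: %.0f ops/s%n", opsPerSecond(1386941.125));
    System.out.printf("validated:      %.0f ops/s%n", opsPerSecond(580.868));
  }
}
```

For selfValidating this comes out at roughly 720 objects per second per thread, compared with about 1.7 million for validated, which makes the difference under load easy to picture.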

References