Java Boxing Benchmark
This article continues the dive into the Java Microbenchmark Harness (JMH) by comparing the performance of boxing in a for loop with doing the same work using primitive types.
Setting Up a Project
Clone the project: https://github.com/kimsaabyepedersen/jmh-boxing.
What Will Be Benchmarked?
The two for loops below will be benchmarked twice. Both benchmarks start with an array of length 128. They then proceed to add consecutive Integers to the array. In one benchmark the values added start from 0 and in the other benchmark the values added start from 200.
The array length is 128 because there are 128 numbers from 0 to 127, both inclusive. The numbers 0 to 127 are interesting because Java caches Integer values between -128 and 127 (both inclusive). The boxing itself calls the valueOf method of the Integer class, and the Javadoc for that method gives the details.
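The effect of the cache can be observed directly by comparing references. The following is a minimal sketch, not part of the benchmark project; the class name IntegerCacheDemo and the helper sameInstance are made up for illustration. It boxes the same int twice through Integer.valueOf and checks whether both calls return the very same object.

```java
public class IntegerCacheDemo {

    // Hypothetical helper: true when boxing the same int twice
    // yields the very same Integer object.
    static boolean sameInstance(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        // Reference comparison on purpose; equals() would always be true here.
        return first == second;
    }

    public static void main(String[] args) {
        // Values inside the guaranteed cache range of -128 to 127
        System.out.println("-128: " + sameInstance(-128));
        System.out.println("0:    " + sameInstance(0));
        System.out.println("127:  " + sameInstance(127));
        // A value outside the guaranteed range; the result depends on the JVM
        // and its configuration
        System.out.println("200:  " + sameInstance(200));
    }
}
```

For values in the range -128 to 127 the comparison is always true. For 200 it is typically false on a default JVM configuration, because a new Integer is allocated on every call, but the Javadoc allows implementations to cache values outside the range, so this should not be relied upon.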
// Initialization code left out, creates these arrays
int[] ints;
Integer[] integers;

@Benchmark
public Integer[] boxing() {
    for (int i = 0; i < integers.length; i++) {
        // boxing happens here as the result of the addition is an int
        // which is boxed to an Integer to match type of integers array
        integers[i] = i + startValue;
    }
    return integers;
}

@Benchmark
public int[] nonBoxing() {
    for (int i = 0; i < ints.length; i++) {
        // no boxing as ints array and result of addition is of the same primitive type
        ints[i] = i + startValue;
    }
    return ints;
}
So the question is: how much faster is an int than an Integer when the values are small (cached), and what if the values are larger (non-cached)? Here, faster is measured as the number of operations per second, where more means faster.
The Output
Human-readable output from running the benchmarks looks like this (the scores are printed with a decimal comma):
Benchmark                  (startValue)   Mode  Cnt         Score        Error  Units
BoxingBenchmark.boxing                0  thrpt    9   2174940,666 ±   7201,642  ops/s
BoxingBenchmark.boxing              200  thrpt    9    689941,327 ± 107816,476  ops/s
BoxingBenchmark.nonBoxing             0  thrpt    9  19186481,525 ± 120700,617  ops/s
BoxingBenchmark.nonBoxing           200  thrpt    9  19207011,149 ±  58883,470  ops/s
Looking at the boxing benchmark first, the cached version performs 2174940,666 operations per second and the non-cached version performs 689941,327. The cached version thus performs roughly 3.2 times as many operations per second as the non-cached version.
The non-boxing benchmarks, which use the primitive type int, show that there is practically no difference between starting from 0 and starting from 200. This is expected, as there is no cache for primitive types.
Comparing the non-boxing and the boxing benchmarks, the non-boxing version is about 8.8 times faster than the boxing version when the Integers are cached and about 28 times faster when they are not.
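The ratios quoted here can be recomputed from the scores in the output. The following is a small sketch of that arithmetic; the class name ThroughputRatios and the method ratio are made up for illustration, and the scores are copied from the table with the decimal comma written as a decimal point.

```java
import java.util.Locale;

public class ThroughputRatios {

    // Hypothetical helper: how many times more operations per second
    // the faster benchmark achieves compared to the slower one.
    static double ratio(double fasterOpsPerSec, double slowerOpsPerSec) {
        return fasterOpsPerSec / slowerOpsPerSec;
    }

    public static void main(String[] args) {
        // Scores from the benchmark output, in ops/s
        double boxingCached = 2_174_940.666;      // boxing, startValue 0
        double boxingNonCached = 689_941.327;     // boxing, startValue 200
        double nonBoxingFrom0 = 19_186_481.525;   // nonBoxing, startValue 0
        double nonBoxingFrom200 = 19_207_011.149; // nonBoxing, startValue 200

        System.out.printf(Locale.ROOT, "cached vs non-cached boxing: %.2f%n",
                ratio(boxingCached, boxingNonCached));
        System.out.printf(Locale.ROOT, "non-boxing vs cached boxing: %.2f%n",
                ratio(nonBoxingFrom0, boxingCached));
        System.out.printf(Locale.ROOT, "non-boxing vs non-cached boxing: %.2f%n",
                ratio(nonBoxingFrom200, boxingNonCached));
    }
}
```

The three ratios come out at about 3.15, 8.82 and 27.84. They are computed from the reported scores only and ignore the error margins, which are comparatively large for the non-cached boxing run, so they are best read as rough magnitudes.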
Note that microbenchmarks are difficult to get right. Just read the samples and their comments in the JMH repository: there are many, many things that can influence the result. So have your benchmarks reviewed by a peer and interpret the results with caution.