Java streams 24. Reduce

A terminal operation either returns a single value or returns nothing (producing only side effects). It closes the stream, so no further operations can be applied to it.

In this post we will cover the terminal operation reduce() that has three overloaded versions:

Optional<T> reduce(BinaryOperator<T> accumulator) – accumulates the stream elements, using the specified function, and returns the resulting value, if any, wrapped inside the Optional object.

T reduce(T identity, BinaryOperator<T> accumulator) – accumulates the stream elements, using the specified identity value and accumulator function, and returns the resulting value, which may be just the specified identity value.

U reduce(U identity, BiFunction<U, T, U> accumulator, BinaryOperator<U> combiner) – accumulates the stream elements, using the specified identity value and accumulator function, and returns the resulting value, which may be just the specified identity value. In the case of a parallel stream, uses the specified combiner function to incorporate the results of all sub-processes into the returned resulting value.
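To see the three versions side by side, here is a minimal sketch that applies each of them to a small stream. The class name ReduceOverloads and its helper methods are hypothetical and serve only as an illustration:

```java
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class; the names are not part of the Stream API.
final class ReduceOverloads {

    // Version 1: no identity, so an empty stream yields an empty Optional.
    static Optional<Integer> sumNoIdentity(Stream<Integer> s) {
        return s.reduce((a, b) -> a + b);
    }

    // Version 2: the identity (0) is the starting value
    // and also the result for an empty stream.
    static int sumWithIdentity(Stream<Integer> s) {
        return s.reduce(0, (a, b) -> a + b);
    }

    // Version 3: the result type (Integer) differs
    // from the stream element type (String).
    static int totalLength(Stream<String> s) {
        return s.reduce(0, (len, str) -> len + str.length(), Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumNoIdentity(Stream.of(1, 2, 3)));        // prints: Optional[6]
        System.out.println(sumWithIdentity(Stream.of(1, 2, 3)));      // prints: 6
        System.out.println(totalLength(Stream.of("a", "bb", "ccc"))); // prints: 6
    }
}
```

Only the third version can return a type other than the element type, which is why it needs a separate combiner for merging partial results.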

First, let us look at what BinaryOperator<T> and BiFunction<U, T, U> are.



BinaryOperator<T> and BiFunction<U, T, U>

BinaryOperator<T> and BiFunction<U, T, U> are functional interfaces. This means that each of them has only one abstract method. They also have non-abstract methods, which we will discuss later (they are not essential for our discussion of the reduce() method), so for now let us concentrate on the fact that they have only one abstract method each.

These two functional interfaces are related: BinaryOperator<T> extends BiFunction<T, T, T>, that is, a BiFunction whose two parameters and result all have the same type. The abstract method of BiFunction is therefore inherited by BinaryOperator, and that is how BinaryOperator<T> acquires its only abstract method.
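The relationship can be checked with a few lines of code. The following is only a sketch; the class OperatorIsFunction and the method applyTo() are hypothetical names introduced for the demonstration:

```java
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical demo class, not part of the JDK.
final class OperatorIsFunction {

    // Accepts any BiFunction whose inputs and result are all Integer.
    static int applyTo(BiFunction<Integer, Integer, Integer> f, int a, int b) {
        return f.apply(a, b);
    }

    public static void main(String[] args) {
        BinaryOperator<Integer> add = (a, b) -> a + b;

        // A BinaryOperator<Integer> is accepted where
        // a BiFunction<Integer, Integer, Integer> is expected.
        System.out.println(applyTo(add, 2, 3));         // prints: 5

        // The same object can be used through either interface.
        BiFunction<Integer, Integer, Integer> sameFunction = add;
        System.out.println(sameFunction.apply(10, 20)); // prints: 30
    }
}
```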

In the BiFunction<U, T, U> interface the abstract method accepts two parameters of type U and T and returns a value of type U.

For example, here are two (of many) possible implementations of the BiFunction<U, T, U> interface: 

  
  BiFunction<String, Integer, String> bf1 = 
        (String s, Integer i) -> {
            String r1 = s == null ? "" : s;
            String r2 = i == null ? "0" : i.toString();
            return r1 + ", " + r2;
        };

  BiFunction<Integer, String, Integer> bf2 =
        (Integer i, String s) -> {
            Integer r1 = i == null ? 0 : i;
            Integer r2 = s == null ? 0 : s.length();
            return r1 + r2;
        };
    

The abstract method is defined in the BiFunction<U, T, U> as follows:    

U apply(U u, T t). Applies this function to the given arguments.

This means we can execute the functions bf1 and bf2 by calling the apply() method as shown below:  

  
  System.out.println(bf1.apply("abc", 42));   //prints: abc, 42
  System.out.println(bf1.apply(null, 42));    //prints: , 42
  System.out.println(bf1.apply("abc", null)); //prints: abc, 0
  
  System.out.println(bf2.apply(42, "abc"));   //prints: 45
  System.out.println(bf2.apply(null, "abc")); //prints: 3
  System.out.println(bf2.apply(42, null));    //prints: 42
    

The BinaryOperator<T> interface is a specialization of BiFunction<U, T, U>. It is defined so that both input parameters of its abstract method have the same type, which is also the type of the returned value. That is why its signature includes only one type parameter. This means the abstract method of the BinaryOperator<T> interface is defined as follows:

T apply(T t1, T t2). Applies this function to the given arguments.

The following are two (of many) possible implementations of the BinaryOperator<T> interface:  

  
  BinaryOperator<String> bo1 = 
        (String s1, String s2) -> s1 + ", " + s2;
  System.out.println(bo1.apply("abc", "42"));//prints: abc, 42
  System.out.println(bo1.apply(null, "42")); //prints: null, 42
  System.out.println(bo1.apply("abc", null));//prints: abc, null

  BinaryOperator<Integer> bo2 =
        (Integer i1, Integer i2) -> {
            Integer r1 = i1 == null ? 0 : i1;
            Integer r2 = i2 == null ? 0 : i2;
            return r1 + r2;
        };
  System.out.println(bo2.apply(42, 42));     //prints: 84
  System.out.println(bo2.apply(null, 42));   //prints: 42
  System.out.println(bo2.apply(42, null));   //prints: 42
    

The default method andThen() is the only non-abstract method of the BiFunction<U, T, U> interface:    

default BiFunction<U, T, V> andThen(Function<U, V> after). Returns a composed function that first applies the BiFunction<U, T, U> to its input, and then applies the after function Function<U, V> to the result. This method is used to construct a new BiFunction from existing functions.
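A short sketch of andThen() in action follows. The class AndThenDemo and the constants JOIN, BRACKET, and JOIN_THEN_BRACKET are hypothetical names chosen for this illustration:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical demo class, not part of the JDK.
final class AndThenDemo {

    // Joins a label and a number into one string.
    static final BiFunction<String, Integer, String> JOIN =
            (s, i) -> s + ", " + i;

    // Wraps a string in square brackets.
    static final Function<String, String> BRACKET =
            s -> "[" + s + "]";

    // First applies JOIN to both arguments, then BRACKET to JOIN's result.
    static final BiFunction<String, Integer, String> JOIN_THEN_BRACKET =
            JOIN.andThen(BRACKET);

    public static void main(String[] args) {
        System.out.println(JOIN_THEN_BRACKET.apply("abc", 42)); // prints: [abc, 42]
    }
}
```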

The BinaryOperator<T> interface extends BiFunction (so it has all its methods) and, in addition, has two static methods, maxBy() and minBy():

static BinaryOperator<T> maxBy(Comparator<T> comparator). Returns a BinaryOperator which returns the greater of two elements according to the specified Comparator.
static BinaryOperator<T> minBy(Comparator<T> comparator). Returns a BinaryOperator which returns the lesser of two elements according to the specified Comparator.
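As an illustration, the two factory methods can be tried on strings compared by length. This is a sketch only; the class MaxByMinByDemo and the constants LONGER and SHORTER are hypothetical:

```java
import java.util.Comparator;
import java.util.function.BinaryOperator;

// Hypothetical demo class, not part of the JDK.
final class MaxByMinByDemo {

    // Returns the longer of two strings.
    static final BinaryOperator<String> LONGER =
            BinaryOperator.maxBy(Comparator.comparingInt(String::length));

    // Returns the shorter of two strings.
    static final BinaryOperator<String> SHORTER =
            BinaryOperator.minBy(Comparator.comparingInt(String::length));

    public static void main(String[] args) {
        System.out.println(LONGER.apply("ab", "abcd"));  // prints: abcd
        System.out.println(SHORTER.apply("ab", "abcd")); // prints: ab
    }
}
```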

With functions out of the way, let us look at the first of the reduce() operation versions. 



Optional<T> reduce(BinaryOperator<T> accumulator)

To demonstrate the reduce() operation, we are going to use the class Box:

  
  class Box {
    int weight;
    String color;

    public Box(int weight, String color) {
        this.weight = weight;
        this.color = color;
    }

    public int getWeight() { return weight; }

    public String getColor() { return color; }

    @Override
    public String toString() {
        return "Box{weight=" + weight +
                    ", color='" + color + "'}";
    }
  }
         

Let us find the “heaviest” Box object using the first version of reduce():

  
  Box theHeaviest = Stream.of(new Box(5, "red"), 
                              new Box(8, "green"), 
                              new Box(3, "blue"))
     .reduce((b1, b2) -> 
               b1.getWeight() > b2.getWeight() ? b1 : b2)
     .orElse(null);
  System.out.print(theHeaviest);
                   //prints: Box{weight=8, color='green'}
   

The implementation does not look intuitive, does it? It seems like the accumulator does not accumulate anything. Its apply() method accepts the first two elements of the stream, compares them, and returns the “heavier” one. Then it accepts that result (as the first parameter) and the third stream element (as the second parameter) and again returns the “heavier” one.

In other words, the accumulator saves the result of the comparison and provides it as the first parameter for the next comparison (with the next element). 
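To make this sequence of calls visible, the sketch below records every invocation of the accumulator. It uses plain integers instead of Box objects to stay self-contained; the class AccumulatorTrace and the method maxWithTrace() are hypothetical, and the side effect inside the lambda is there only for demonstration on a sequential stream:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the JDK.
final class AccumulatorTrace {

    // Finds the maximum weight and records every accumulator call in 'trace'.
    static int maxWithTrace(List<String> trace, Integer... weights) {
        return Stream.of(weights)
                .reduce((a, b) -> {
                    int result = a > b ? a : b;
                    // Side effect for demonstration only (sequential stream).
                    trace.add("apply(" + a + ", " + b + ") -> " + result);
                    return result;
                })
                .orElseThrow(() -> new IllegalArgumentException("no weights given"));
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        int max = maxWithTrace(trace, 5, 8, 3);
        trace.forEach(System.out::println);
        // prints: apply(5, 8) -> 8
        //         apply(8, 3) -> 8
        System.out.println(max); // prints: 8
    }
}
```

For three elements the accumulator is called twice, and the second call receives the result of the first one as its first argument.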

We could implement the same functionality by creating the BinaryOperator<Box> accumulator first as follows:

  
  BinaryOperator<Box> maxByWeight = 
      (b1, b2) -> b1.getWeight() > b2.getWeight() ? b1 : b2;
  Box theHeaviest = Stream.of(new Box(5, "red"), 
                              new Box(8, "green"), 
                              new Box(3, "blue"))
           .reduce(maxByWeight)
          .orElse(null);
  System.out.print(theHeaviest);  
                     //prints: Box{weight=8, color='green'}
      

We also could use the maxBy() method of the BinaryOperator, to create the function we need:

   BinaryOperator<Box> maxByWeight = 
       BinaryOperator.maxBy(Comparator
                                .comparing(Box::getWeight));

The result would be the same.

But it is not the only way to use the accumulator function. In the following example, we actually accumulate (add to the total) the weights of all Box objects: 

  
  int totalWeight = Stream.of(new Box(5, "red"), 
                              new Box(8, "green"), 
                              new Box(3, "blue"))
        .map(b -> b.getWeight())
        .reduce((w1, w2) -> w1 + w2)
        .orElse(0);
  System.out.print(totalWeight);     //prints: 16
   

Or, another example of actual accumulation, we can concatenate all the colors of the boxes in one space-separated String value:

  
  String colors = Stream.of(new Box(5, "red"), 
                            new Box(8, "green"), 
                            new Box(3, "blue"))
        .map(p -> p.getColor())
        .reduce((c1, c2) -> c1 + " " + c2)
        .orElse(null);
  System.out.print(colors);    //prints: red green blue
     

And the following is the same example but with the comma-separated String result:

  
  String colors = Stream.of(new Box(5, "red"), 
                            new Box(8, "green"), 
                            new Box(3, "blue"))
        .map(p -> p.getColor())
        .reduce((c1, c2) -> c1 + ", " + c2)
        .orElse(null);
  System.out.print(colors);    //prints: red, green, blue
   

Every programmer who has tried to generate a String of comma-separated values can appreciate how much easier it is to accomplish with a Stream and reduce() than with a traditional for loop:

  
  List<Box> list = List.of(new Box(5, "red"), 
                           new Box(8, "green"), 
                           new Box(3, "blue"));
  StringBuilder sb = new StringBuilder();
  int count = 1;
  for(Box b: list){
    sb.append(b.getColor());
    if(count < list.size()){
        sb.append(", ");
    }
    count++;
  }
  System.out.print(sb.toString());  //prints: red, green, blue
  

In the following post, we will present the collect() operation and demonstrate an even simpler way to produce the same result.



T reduce(T identity, BinaryOperator<T> accumulator)

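The second version of the reduce() operation takes an identity, which serves as the initial value of the accumulation. If the stream is empty, the identity itself is returned, so there is always a result and it is not wrapped in an Optional object. Strictly speaking, the Stream API documentation requires the identity to be neutral for the accumulator (like 0 for addition); a non-neutral value, such as 10 in the example below, works as a plain starting value only in a sequential stream. For example: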

  
  int totalWeight5 = Stream.of(new Box(5, "red"), 
                               new Box(8, "green"), 
                               new Box(3, "blue"))
        .map(b -> b.getWeight())
        .reduce(10, (w1, w2) -> w1 + w2);
  System.out.print(totalWeight5);      //prints: 26
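
The role of the identity is easiest to see on an empty stream. The following sketch compares the first two versions of reduce(); the class EmptyStreamReduce and the methods sum() and sumFrom() are hypothetical names used only here:

```java
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the JDK.
final class EmptyStreamReduce {

    // Without an identity there is nothing to return for an empty stream.
    static Optional<Integer> sum(Stream<Integer> s) {
        return s.reduce(Integer::sum);
    }

    // With an identity, an empty stream yields the identity itself.
    static int sumFrom(int identity, Stream<Integer> s) {
        return s.reduce(identity, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Stream.empty()).isPresent()); // prints: false
        System.out.println(sumFrom(0, Stream.empty()));      // prints: 0
        System.out.println(sumFrom(10, Stream.empty()));     // prints: 10
        System.out.println(sumFrom(10, Stream.of(5, 8, 3))); // prints: 26
    }
}
```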
    

We can also use the identity value to add a prefix to the resulting string:  

  
  String colors = Stream.of(new Box(5, "red"), 
                            new Box(8, "green"), 
                            new Box(3, "blue"))
        .map(p -> p.getColor())
        .reduce("Colors:", (c1, c2) -> c1 + " " + c2);
  System.out.print(colors); 
                 //prints: Colors: red green blue

Notice the space after “Colors:” in the resulting string. That is because the result of processing the first element is “Colors:” + “ ” + “red”. And, again, the collect() operation (which we will present in the next post) provides a much simpler way to accomplish all that.

Now let us process the same elements using a parallel stream.



U reduce(U identity, BiFunction<U,T,U> accumulator, BinaryOperator<U> combiner)

In the case of a parallel stream, the identity value may bring an unexpected result:

   
  String colors = Stream.of(new Box(5, "red"), 
                            new Box(8, "green"), 
                            new Box(3, "blue"))
        .parallel()
        .map(p -> p.getColor())
        .reduce("Colors:", (c1, c2) -> c1 + " " + c2, 
                           (r1, r2) -> r1 + " " + r2);
  System.out.print(colors); 
        //prints: Colors: red Colors: green Colors: blue

As you can see, the prefix “Colors:” is added to every chunk the stream is split into (here, to every element), so we need to remove it in the combiner (if we want to see only one prefix in the resulting string):

   
  String colors = Stream.of(new Box(5, "red"), 
                           new Box(8, "green"), 
                           new Box(3, "blue"))
        .parallel()
        .map(p -> p.getColor())
        .reduce("Colors:", (c1, c2) -> c1 + " " + c2, 
             (r1, r2) -> r1 + " " + r2.replace("Colors: ", ""));
  System.out.print(colors);     //prints: Colors: red green blue

Notice that we have included the trailing space in the removed substring “Colors: ”.

Let us look at another code example that calculates the sum of the integer values emitted in a parallel stream:

   
   int sum = Stream.of(1, 2, 3)
        .parallel()
        .reduce(0, (i1, i2) -> i1 + i2, 
                   (r1, r2) -> r1 + r2);
   System.out.print(sum);     //prints: 6
   

As you can see, we have learned the lesson of how the identity value is used and set it to 0.
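The same lesson can be verified with numbers. In the sketch below (the class ParallelIdentity and its method sum() are hypothetical), a neutral identity of 0 gives the same sum in both modes, while a non-neutral value of 10 is added once per chunk in a parallel stream, so the parallel result depends on how the stream happens to be split:

```java
import java.util.stream.Stream;

// Hypothetical demo class, not part of the JDK.
final class ParallelIdentity {

    // Sums 1, 2, 3 starting from the given identity,
    // either sequentially or in parallel.
    static int sum(int identity, boolean parallel) {
        Stream<Integer> s = Stream.of(1, 2, 3);
        if (parallel) {
            s = s.parallel();
        }
        return s.reduce(identity, Integer::sum, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(0, false));  // prints: 6
        System.out.println(sum(0, true));   // prints: 6
        System.out.println(sum(10, false)); // prints: 16
        // In parallel, 10 is added once per chunk, so the result
        // depends on the splitting and is not reliably 16.
        System.out.println(sum(10, true));
    }
}
```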

To make the code more compact, we can use a method reference instead of a lambda expression:

     
  int sum = Stream.of(1, 2, 3)
        .parallel()
        .reduce(0, Integer::sum, Integer::sum);
  System.out.print(sum);    //prints: 6
   

Naturally, the question arises: is the usage of an identity value quite limited? If not, what cases justify its presence?

The answer is that there are many cases where an identity value is quite useful. We have already seen how it can be used to add a prefix when concatenating strings emitted by a non-parallel (sequential) stream. We can also imagine that a certain value may be needed for the processing of each element in a parallel stream.

The following is another example of identity value usage. Let us collect all the stream elements in a List object, using the reduce() operation:

   
  BiFunction<List<String>, String, List<String>> accumulator =
        (l, s) -> {
            l.add(s);
            return l;
        };
  BinaryOperator<List<String>> combiner =
        (l1, l2) -> {
            //Does not do anything except printing:
            System.out.println("In combiner!");
            return l1;
        };
  List<String> lst = Stream.of("a", "b", "c", "d", "e")
        .reduce(new ArrayList<>(), accumulator, combiner);
  System.out.print(lst);            //prints: [a, b, c, d, e]
     

In this case, the identity provides the initial List object in which all the stream elements are going to be collected. Since the stream is not parallel (and should not be, because every chunk would then add to the same shared list and the results would be quite unpredictable), the combiner is never used. Why have we chosen this version of the reduce() operation then?

The answer is that the other overloaded reduce() versions return a value of the same type as the stream elements. If that feels a bit awkward, wait for the next post, where we will show how to achieve the same result much more easily with the help of the collect() operation.
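The type-changing ability of the third version does not have to rely on a mutable container. As a sketch (not the approach used above), the following code reduces a stream of boxes directly to an Integer, without a preceding map(). The class TotalWeightDemo, its nested minimal Box stand-in, and the method totalWeight() are hypothetical. Because 0 is a neutral identity and nothing is mutated, the same code also works on a parallel stream:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class TotalWeightDemo {

    // Minimal stand-in for the Box class used in this post.
    static final class Box {
        private final int weight;
        Box(int weight) { this.weight = weight; }
        int getWeight() { return weight; }
    }

    // Reduces a stream of Box (T) to an Integer (U).
    static int totalWeight(List<Box> boxes, boolean parallel) {
        return (parallel ? boxes.parallelStream() : boxes.stream())
                .reduce(0,
                        (sum, box) -> sum + box.getWeight(), // accumulator: (U, T) -> U
                        Integer::sum);                       // combiner: (U, U) -> U
    }

    public static void main(String[] args) {
        List<Box> boxes = Arrays.asList(new Box(5), new Box(8), new Box(3));
        System.out.println(totalWeight(boxes, false)); // prints: 16
        System.out.println(totalWeight(boxes, true));  // prints: 16
    }
}
```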

Chances are you will never use the reduce() operation in real life. Nevertheless, do not discard it as one day you may encounter the need for the stream processing that requires a customized implementation. Then the reduce() operation will shine.

In the next post, we will start presenting the last of the terminal operations called collect() operation. It is a convenient specialization of reduce(). We will describe and demonstrate the two overloaded versions of collect():

--R collect(Collector<T, A, R> collector);

--R collect(Supplier<R> supplier, BiConsumer<R, T> accumulator, BiConsumer<R, R> combiner);

The source code of all the code examples is here in GitHub
