Thursday, August 31, 2017

Java 8 Stream collect()

In Java 8 Streams, collect() is a terminal operation that can be used to gather the elements of a stream into a List, group them into a Map, and so on. The most commonly used form of collect() takes a java.util.stream.Collector as its parameter. While you can create your own collector by implementing the java.util.stream.Collector interface, the more common way to obtain a Collector is through the static factory methods of the Collectors class. The following examples demonstrate a few ways to use collect() with different Collectors.
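For the less common route of writing your own collector, here is a minimal sketch that uses the Collector.of() factory rather than a hand-written implementation class. The class name CustomCollectorSketch, the helper pipeJoined() and the " | " separator are made up for illustration.

```java
import java.util.StringJoiner;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CustomCollectorSketch {

  // Hypothetical helper: a collector that joins strings with " | ".
  static Collector<String, StringJoiner, String> pipeJoined() {
    return Collector.of(
        () -> new StringJoiner(" | "), // supplier: creates the mutable container
        StringJoiner::add,             // accumulator: adds one element to it
        StringJoiner::merge,           // combiner: merges two partial containers
        StringJoiner::toString);       // finisher: turns the container into the result
  }

  public static void main(String[] args) {
    String joined = Stream.of("a", "b", "c").collect(pipeJoined());
    System.out.println(joined); // prints "a | b | c"
  }
}
```

In practice Collectors.joining() already covers this case; the sketch is only meant to show the four pieces a Collector is made of.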

Collecting a Stream into a Set or List

The toList and toSet collectors collect the elements of a Stream into a List and a Set respectively. Neither collector specifies which List or Set implementation is returned, so they should be used without any assumption about the underlying type, its mutability or its thread safety. Here is an example of the toSet collector.
Set<String> result = Files.lines(Paths.get("C:/test/test.txt"))
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.toSet());
result.forEach(System.out::println);
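The same idea can be tried without a file. The following is a small sketch that runs toList and toSet side by side on an in-memory string; the class name, the helper methods and the sample sentence are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class ToListAndToSetSketch {

  // Hypothetical helper: splits on single spaces, keeps duplicates and order.
  static List<String> words(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.toList());
  }

  // Hypothetical helper: same split, but duplicates collapse in the Set.
  static Set<String> distinctWords(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.toSet());
  }

  public static void main(String[] args) {
    String sample = "to be or not to be"; // assumed sample input
    System.out.println(words(sample));         // six words, in encounter order
    System.out.println(distinctWords(sample)); // four words, order not guaranteed
  }
}
```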

Collecting to a specific type of Collection

While toSet and toList do not specify the type of the underlying implementation class, the toCollection collector collects into a specific type of collection, which is created by the Supplier passed to the toCollection() method of the Collectors class. In this example we collect the elements of the Stream of words into a TreeSet.
Collection<String> result = Files.lines(Paths.get("C:/test/test.txt"))
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.toCollection(TreeSet::new));
result.forEach(System.out::println);
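To see what choosing the collection type buys you, here is a minimal in-memory sketch: because the target is a TreeSet, the words come out de-duplicated and in natural (alphabetical) order. The class name, the helper sortedWords() and the sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;
import java.util.stream.Collectors;

class ToCollectionSketch {

  // Hypothetical helper: collects the words into a TreeSet, which sorts them
  // by natural order and drops duplicates.
  static TreeSet<String> sortedWords(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.toCollection(TreeSet::new));
  }

  public static void main(String[] args) {
    // assumed sample input with one repeated word
    System.out.println(sortedWords("pear apple fig apple")); // prints [apple, fig, pear]
  }
}
```

Passing a different constructor reference, for example LinkedHashSet::new, would keep the words in encounter order instead.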

Applying a Function on the result of a collect() with collectingAndThen

The collectingAndThen collector provides a way to apply a final function to the result of another collector. In the following example, we collect the words into a List and then create an unmodifiable List from it.
List<String> result = Files.lines(Paths.get("C:/test/test.txt"))
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
result.forEach(System.out::println);
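A quick way to confirm that the finishing function was applied is to try to modify the result. The sketch below does that on an in-memory string; the class name, the helper frozenWords() and the sample input are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class CollectingAndThenSketch {

  // Hypothetical helper: collects into a List, then wraps it as unmodifiable.
  static List<String> frozenWords(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.collectingAndThen(
            Collectors.toList(), Collections::unmodifiableList));
  }

  public static void main(String[] args) {
    List<String> words = frozenWords("one two three"); // assumed sample input
    System.out.println(words);
    try {
      words.add("four"); // the unmodifiable wrapper rejects any modification
    } catch (UnsupportedOperationException e) {
      System.out.println("List is read-only");
    }
  }
}
```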

Collecting to a Map

The toMap collector can be used to collect the elements of a Stream into a java.util.Map. The following example takes the words from a file and creates a Map with (Word, WordLength) as (Key, Value) pairs. In this example Function.identity() is a simple function that returns the same value that is passed to it.
Map<String, Integer> result = Files.lines(Paths.get("C:/test/test.txt"))
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.toSet()).stream()             //  Remove duplicates
  .collect(Collectors.toMap(Function.identity(), String::length));
result.entrySet().forEach(System.out::println);
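You can see that the toMap() method takes two Function parameters. The first produces the key and the second produces the value. The duplicates are removed first because this two-argument form of toMap() throws an IllegalStateException when two elements produce the same key.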
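An alternative to removing duplicates up front is the three-argument overload of toMap(), which takes a merge function that decides what to do when a key repeats. The following is a minimal sketch on an in-memory string; the class name, the helper wordLengths() and the sample sentence are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeSketch {

  // Hypothetical helper: maps each word to its length. When a word repeats,
  // the merge function keeps the value that is already in the map.
  static Map<String, Integer> wordLengths(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.toMap(
            Function.identity(),
            String::length,
            (existing, duplicate) -> existing));
  }

  public static void main(String[] args) {
    String sample = "to be or not to be"; // assumed sample input with repeats
    wordLengths(sample).entrySet().forEach(System.out::println);

    // Without a merge function, the repeated key "to" is rejected.
    try {
      Arrays.stream(sample.split(" "))
          .collect(Collectors.toMap(Function.identity(), String::length));
    } catch (IllegalStateException e) {
      System.out.println("Duplicate key rejected");
    }
  }
}
```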

Counting the number of occurrences of the words of a file using groupingBy and counting

The groupingBy collector can be used to group the elements of a Stream into a Map. The form used here takes two parameters: the first is a classifier Function that returns the key of the group an element belongs to, and the second is a downstream Collector that is applied to the elements of each group. You will notice that this collector has some similarity with the SQL "GROUP BY" clause. The following example creates a Map with the words in the file as keys and the number of occurrences of each word as values. Notice that we use the counting collector to count the occurrences.
Map<String, Long> result = Files.lines(Paths.get("C:/test/test.txt"))
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
result.entrySet().forEach(System.out::println);
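The classifier does not have to be Function.identity(). As a sketch of the more general case, the code below groups words by their length, once with the default downstream (a List per group) and once with counting. The class name, the helper methods and the sample input are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingBySketch {

  // Hypothetical helper: one-argument groupingBy collects each group into a List.
  static Map<Integer, List<String>> wordsByLength(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.groupingBy(String::length));
  }

  // Hypothetical helper: same classifier, but each group is reduced to a count.
  static Map<Integer, Long> countByLength(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.groupingBy(String::length, Collectors.counting()));
  }

  public static void main(String[] args) {
    String sample = "a bb cc ddd"; // assumed sample input
    wordsByLength(sample).entrySet().forEach(System.out::println);
    countByLength(sample).entrySet().forEach(System.out::println);
  }
}
```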

Partition the words of a file using partitioningBy

The partitioningBy collector can be used to partition the elements of a Stream into two groups, based on a Predicate that is passed to the partitioningBy() method. In the following example, we partition the elements into two groups, based on whether a word starts with the letter "s" or not.
Map<Boolean, List<String>> result = Files.lines(Paths.get("C:/test/test.txt"))
  .flatMap(x -> Arrays.stream(x.split(" ")))
  .collect(Collectors.partitioningBy(x -> x.startsWith("s")));
result.entrySet().forEach(System.out::println);
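Like groupingBy, partitioningBy also has an overload that accepts a downstream collector. Here is a minimal sketch that counts the words in each partition instead of listing them; the class name, the helper countStartingWithS() and the sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningBySketch {

  // Hypothetical helper: counts the words that do and do not start with "s".
  static Map<Boolean, Long> countStartingWithS(String text) {
    return Arrays.stream(text.split(" "))
        .collect(Collectors.partitioningBy(
            word -> word.startsWith("s"), Collectors.counting()));
  }

  public static void main(String[] args) {
    // assumed sample input: three words start with "s", one does not
    Map<Boolean, Long> counts = countStartingWithS("sun moon star sky");
    System.out.println("starts with s: " + counts.get(true));
    System.out.println("other:         " + counts.get(false));
  }
}
```

The resulting Map contains entries for both true and false, even when one of the partitions is empty.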
There are quite a few other collectors provided by the Collectors class, which can be found in the Collectors Javadoc. Following is the full class I used for these examples.
package streams;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamCollectorExample {

  public static void main(String[] args) {

    //  Files.lines() keeps the file open until the stream is closed,
    //  so each example opens it in a try-with-resources block.

    //  toSet
    try (Stream<String> lines = Files.lines(Paths.get("C:/test/test.txt"))) {
      Set<String> result = lines
          .flatMap(x -> Arrays.stream(x.split(" ")))
          .collect(Collectors.toSet());
      result.forEach(System.out::println);
    } catch (IOException e) {
      e.printStackTrace();
    }

    //  toCollection
    try (Stream<String> lines = Files.lines(Paths.get("C:/test/test.txt"))) {
      Collection<String> result = lines
          .flatMap(x -> Arrays.stream(x.split(" ")))
          .collect(Collectors.toCollection(TreeSet::new));
      result.forEach(System.out::println);
    } catch (IOException e) {
      e.printStackTrace();
    }

    //  collectingAndThen
    try (Stream<String> lines = Files.lines(Paths.get("C:/test/test.txt"))) {
      List<String> result = lines
          .flatMap(x -> Arrays.stream(x.split(" ")))
          .collect(Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
      result.forEach(System.out::println);
    } catch (IOException e) {
      e.printStackTrace();
    }

    //  toMap
    try (Stream<String> lines = Files.lines(Paths.get("C:/test/test.txt"))) {
      Map<String, Integer> result = lines
          .flatMap(x -> Arrays.stream(x.split(" ")))
          .collect(Collectors.toSet()).stream()             //  Remove duplicates
          .collect(Collectors.toMap(Function.identity(), String::length));
      result.entrySet().forEach(System.out::println);
    } catch (IOException e) {
      e.printStackTrace();
    }

    //  groupingBy
    try (Stream<String> lines = Files.lines(Paths.get("C:/test/test.txt"))) {
      Map<String, Long> result = lines
          .flatMap(x -> Arrays.stream(x.split(" ")))
          .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
      result.entrySet().forEach(System.out::println);
    } catch (IOException e) {
      e.printStackTrace();
    }

    //  partitioningBy
    try (Stream<String> lines = Files.lines(Paths.get("C:/test/test.txt"))) {
      Map<Boolean, List<String>> result = lines
          .flatMap(x -> Arrays.stream(x.split(" ")))
          .collect(Collectors.partitioningBy(x -> x.startsWith("s")));
      result.entrySet().forEach(System.out::println);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}
