This post shows how you can use the Collectors available in the Streams API to group elements of a stream with groupingBy and partition elements of a stream with partitioningBy.
Consider a stream of Employee objects, each with a name, city and number of sales, as shown in the table below:
+----------+------------+-----------------+
| Name     | City       | Number of Sales |
+----------+------------+-----------------+
| Alice    | London     | 200             |
| Bob      | London     | 150             |
| Charles  | New York   | 160             |
| Dorothy  | Hong Kong  | 190             |
+----------+------------+-----------------+
Grouping
Let's start by grouping employees by city using imperative-style (pre-lambda) Java:
Map<String, List<Employee>> result = new HashMap<>();
for (Employee e : employees) {
    String city = e.getCity();
    List<Employee> empsInCity = result.get(city);
    if (empsInCity == null) {
        empsInCity = new ArrayList<>();
        result.put(city, empsInCity);
    }
    empsInCity.add(e);
}
You're probably familiar with writing code like this, and as you can see, it's a lot of code for such a simple task!
In Java 8, you can do the same thing with a single statement using a groupingBy collector (statically imported from java.util.stream.Collectors), like this:
Map<String, List<Employee>> employeesByCity = employees.stream().collect(groupingBy(Employee::getCity));
This results in the following map:
{New York=[Charles], Hong Kong=[Dorothy], London=[Alice, Bob]}
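For readers who want to run the one-liner, here is a minimal self-contained sketch. The Employee class, its constructor and the sampleEmployees() helper are assumptions made for illustration; the post itself only shows the getCity() and getNumSales() accessors.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;

public class GroupByCityExample {

    // Hypothetical Employee class; the post only names getCity() and getNumSales().
    static final class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getName() { return name; }
        String getCity() { return city; }
        int getNumSales() { return numSales; }

        // Print only the name so the map output stays readable.
        @Override
        public String toString() { return name; }
    }

    // The four employees from the table (helper name is an assumption).
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("Alice", "London", 200),
            new Employee("Bob", "London", 150),
            new Employee("Charles", "New York", 160),
            new Employee("Dorothy", "Hong Kong", 190));
    }

    // Groups employees by city with the single-argument groupingBy collector.
    static Map<String, List<Employee>> byCity(List<Employee> employees) {
        return employees.stream().collect(groupingBy(Employee::getCity));
    }

    public static void main(String[] args) {
        // groupingBy makes no promise about the map type or key order,
        // so the cities may print in a different order than in the post.
        System.out.println(byCity(sampleEmployees()));
    }
}
```

Within each city the employees keep their encounter order, because the stream is sequential and the default downstream collector is toList().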
It's also possible to count the number of employees in each city, by passing a counting collector to the groupingBy collector. The second collector performs a further reduction operation on all the elements in the stream classified into the same group.
Map<String, Long> numEmployeesByCity = employees.stream().collect(groupingBy(Employee::getCity, counting()));
The result is the following map:
{New York=1, Hong Kong=1, London=2}
Just as an aside, this is equivalent to the following SQL statement:
select city, count(*) from Employee group by city
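The counting example can be sketched as a runnable class. As an addition that is not in the post, this version uses the three-argument groupingBy overload with TreeMap::new so that the keys print in sorted order, much like adding "order by city" to the SQL. The Employee class and sampleEmployees() helper are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

public class CountByCityExample {

    // Hypothetical minimal Employee; only the city is read in this example.
    static final class Employee {
        private final String name;
        private final String city;

        Employee(String name, String city) {
            this.name = name;
            this.city = city;
        }

        String getCity() { return city; }
    }

    // The four employees from the table (helper name is an assumption).
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("Alice", "London"),
            new Employee("Bob", "London"),
            new Employee("Charles", "New York"),
            new Employee("Dorothy", "Hong Kong"));
    }

    // Counts employees per city; TreeMap::new keeps the keys sorted alphabetically.
    static Map<String, Long> countByCity(List<Employee> employees) {
        return employees.stream()
                .collect(groupingBy(Employee::getCity, TreeMap::new, counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByCity(sampleEmployees())); // {Hong Kong=1, London=2, New York=1}
    }
}
```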
Another example is calculating the average number of sales in each city, which can be done using the averagingInt collector in conjunction with the groupingBy collector:
Map<String, Double> avgSalesByCity = employees.stream().collect(groupingBy(Employee::getCity, averagingInt(Employee::getNumSales)));
The result is the following map:
{New York=160.0, Hong Kong=190.0, London=175.0}
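Any collector can serve as the downstream collector. The sketch below repeats the averagingInt example and, going beyond what the post shows, adds summingInt and mapping for comparison. The Employee class, the helper methods and their names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.averagingInt;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.summingInt;
import static java.util.stream.Collectors.toList;

public class DownstreamCollectorsExample {

    // Hypothetical Employee class matching the columns of the table.
    static final class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getName() { return name; }
        String getCity() { return city; }
        int getNumSales() { return numSales; }
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("Alice", "London", 200),
            new Employee("Bob", "London", 150),
            new Employee("Charles", "New York", 160),
            new Employee("Dorothy", "Hong Kong", 190));
    }

    // Average sales per city, as in the post.
    static Map<String, Double> avgSalesByCity(List<Employee> employees) {
        return employees.stream()
                .collect(groupingBy(Employee::getCity, averagingInt(Employee::getNumSales)));
    }

    // Total sales per city (not in the post; shown for comparison).
    static Map<String, Integer> totalSalesByCity(List<Employee> employees) {
        return employees.stream()
                .collect(groupingBy(Employee::getCity, summingInt(Employee::getNumSales)));
    }

    // Employee names per city, extracted with the mapping collector.
    static Map<String, List<String>> namesByCity(List<Employee> employees) {
        return employees.stream()
                .collect(groupingBy(Employee::getCity, mapping(Employee::getName, toList())));
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleEmployees();
        System.out.println(avgSalesByCity(employees).get("London"));   // 175.0
        System.out.println(totalSalesByCity(employees).get("London")); // 350
        System.out.println(namesByCity(employees).get("London"));      // [Alice, Bob]
    }
}
```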
Partitioning
Partitioning is a special kind of grouping, in which the resultant map contains at most two different groups: one for true and one for false. For instance, if you want to find out who your best employees are, you can partition them into those who made more than N sales (here, N = 150) and those who didn't, using the partitioningBy collector:
Map<Boolean, List<Employee>> partitioned = employees.stream().collect(partitioningBy(e -> e.getNumSales() > 150));
This will produce the following result:
{false=[Bob], true=[Alice, Charles, Dorothy]}
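Here is a runnable sketch of the same partition with the threshold N passed in as a parameter. It collects names rather than Employee objects to keep the output short. The Employee class and the partitionBySales helper are hypothetical. It also shows a detail the post does not mention: in the JDK implementation, the map returned by partitioningBy holds entries for both false and true, even when one side is empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.partitioningBy;
import static java.util.stream.Collectors.toList;

public class PartitionBySalesExample {

    // Hypothetical Employee class; only name and sales are read here.
    static final class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getName() { return name; }
        int getNumSales() { return numSales; }
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("Alice", "London", 200),
            new Employee("Bob", "London", 150),
            new Employee("Charles", "New York", 160),
            new Employee("Dorothy", "Hong Kong", 190));
    }

    // Splits employee names into those with more than `threshold` sales (true) and the rest (false).
    static Map<Boolean, List<String>> partitionBySales(List<Employee> employees, int threshold) {
        return employees.stream().collect(
                partitioningBy(e -> e.getNumSales() > threshold,
                               mapping(Employee::getName, toList())));
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleEmployees();
        System.out.println(partitionBySales(employees, 150)); // {false=[Bob], true=[Alice, Charles, Dorothy]}
        // Nobody exceeds 500 sales, yet the true key is still present with an empty list.
        System.out.println(partitionBySales(employees, 500)); // {false=[Alice, Bob, Charles, Dorothy], true=[]}
    }
}
```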
You can also combine partitioning and grouping by passing a groupingBy collector to the partitioningBy collector. For example, you could count the number of employees in each city within each partition:
Map<Boolean, Map<String, Long>> result = employees.stream().collect(partitioningBy(e -> e.getNumSales() > 150, groupingBy(Employee::getCity, counting())));
This will produce a two-level Map:
{false={London=1}, true={New York=1, Hong Kong=1, London=1}}
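The nesting also works the other way round. As a sketch that goes beyond the post, the class below groups by city first and partitions within each city. The Employee class and the countByCityAndTarget helper are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.partitioningBy;

public class CityTargetCountExample {

    // Hypothetical Employee class; only city and sales are read here.
    static final class Employee {
        private final String name;
        private final String city;
        private final int numSales;

        Employee(String name, String city, int numSales) {
            this.name = name;
            this.city = city;
            this.numSales = numSales;
        }

        String getCity() { return city; }
        int getNumSales() { return numSales; }
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
            new Employee("Alice", "London", 200),
            new Employee("Bob", "London", 150),
            new Employee("Charles", "New York", 160),
            new Employee("Dorothy", "Hong Kong", 190));
    }

    // For each city, counts employees above (true) and at or below (false) the threshold.
    static Map<String, Map<Boolean, Long>> countByCityAndTarget(List<Employee> employees,
                                                                int threshold) {
        return employees.stream().collect(
                groupingBy(Employee::getCity,
                           partitioningBy(e -> e.getNumSales() > threshold, counting())));
    }

    public static void main(String[] args) {
        Map<String, Map<Boolean, Long>> counts = countByCityAndTarget(sampleEmployees(), 150);
        System.out.println(counts.get("London"));   // {false=1, true=1}
        // An empty partition is reported as 0 rather than being left out.
        System.out.println(counts.get("New York")); // {false=0, true=1}
    }
}
```

Which nesting to prefer depends on how the result is read: partition first when the true/false split is the primary question, group first when the per-city breakdown is.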