The Ultimate 30-Minute Coding Workout - Streams & Lambda Expressions by Examples
BY MARKUS SPRUNCK
Lambda Expressions were released with Java 8 (March 2014). So they are not completely new, but I know many experienced developers who are not familiar with them and/or do not use them where reasonable. The following collection of examples about Streams & Lambda Expressions gives a brief overview of the most important aspects in a simple form. It doesn't provide a complete and exhaustive description of the topic. The purpose is to provide a fast overview with many examples which can be read in less than 30 minutes. You may find the complete source code on GitHub.
Lambdas & Streams Basics
Streams & Lambdas support parallel operations on collections based on the fork/join framework. The new concept of Streams makes it possible to operate on a collection with Lambda Expressions. If you are not familiar with the fork/join framework, you may read the short article How to Implement Fork and Join in Java Concurrency Utilities? to get a first impression.
A stream can be executed serially or in parallel. When you execute a stream in parallel, Java partitions the stream into multiple sub-streams. The partial results are then combined into a single result. All this happens behind the scenes and is managed by the Java run time.
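A minimal sketch of the two execution modes, assuming a plain list of integers (the class name ParallelSumSketch and the sample data are made up for illustration). The same pipeline is run once serially and once in parallel; because addition is associative, both variants should yield the same sum.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelSumSketch {

    // Hypothetical sample data: the numbers 1..1000
    static List<Integer> numbers() {
        return IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
    }

    // stream() processes the elements one after another
    static int sequentialSum(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // parallelStream() lets the run time split the work into sub-streams
    // and combine the partial sums
    static int parallelSum(List<Integer> values) {
        return values.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = numbers();
        System.out.println("sequential=" + sequentialSum(values));
        System.out.println("parallel=" + parallelSum(values));
    }
}
```

Only the call that creates the stream differs; the rest of the pipeline stays unchanged.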
Syntax of Lambda Expressions
Lambda Expressions have three parts, i.e. ArgList, Arrow and Body, like in the C# syntax (see Syntax Decision):
Lambda = ArgList Arrow Body
ArgList = Identifier
| "(" Identifier [ "," Identifier ]* ")"
| "(" Type Identifier [ "," Type Identifier ]* ")"
Body = Expression
| "{" [ Statement ";" ]+ "}"
Examples:
x -> x + 1
(a, b) -> a * b
(int a, int b) -> a * b
() -> System.out.print("Hello World! I'm a Runnable")
(Point p) -> { System.out.print("p="); System.out.println(p);}
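A lambda expression always needs a target type. The following sketch assigns each of the forms above to a functional interface from the JDK; the chosen target types and the class name LambdaSyntaxSketch are assumptions made for illustration, other matching interfaces would work as well.

```java
import java.awt.Point;
import java.util.function.Consumer;
import java.util.function.IntBinaryOperator;
import java.util.function.IntUnaryOperator;

class LambdaSyntaxSketch {

    // single parameter without parentheses, expression body
    static final IntUnaryOperator INCREMENT = x -> x + 1;

    // two parameters with inferred types
    static final IntBinaryOperator MULTIPLY = (a, b) -> a * b;

    // two parameters with explicit types
    static final IntBinaryOperator MULTIPLY_TYPED = (int a, int b) -> a * b;

    // empty argument list
    static final Runnable HELLO = () -> System.out.println("Hello World! I'm a Runnable");

    // explicit parameter type and a block body with several statements
    static final Consumer<Point> PRINT_POINT = (Point p) -> {
        System.out.print("p=");
        System.out.println(p);
    };

    public static void main(String[] args) {
        System.out.println(INCREMENT.applyAsInt(41));
        System.out.println(MULTIPLY.applyAsInt(6, 7));
        System.out.println(MULTIPLY_TYPED.applyAsInt(6, 7));
        HELLO.run();
        PRINT_POINT.accept(new Point(1, 2));
    }
}
```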
Functional Interfaces
You may find in the Java documentation the following description: "Functional interfaces provide target types for lambda expressions and method references. Each functional interface has a single abstract method, called the functional method for that functional interface, to which the lambda expression's parameter and return types are matched or adapted. [...]" (see Package java.util.function)
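Many useful Functional Interfaces are already implemented (see Uses of Class java.lang.FunctionalInterface). A functional interface has exactly one abstract method and is usually annotated with @FunctionalInterface. The annotation is only informative: any interface with a single abstract method can be the target of a lambda expression, and with the annotation the compiler additionally reports an error if the interface does not meet this requirement. It is good practice to use it on your own functional interfaces.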
For example the java.lang.Runnable interface:
package java.lang;
@FunctionalInterface
public interface Runnable {
public abstract void run();
}
So, it is possible to assign and execute a Lambda Expression:
Runnable myRunnable = () -> System.out.println("Hello World!");
myRunnable.run();
Expected output:
Hello World!
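The same works for your own interfaces. A minimal sketch, assuming a hypothetical interface IntCombiner that is not part of the JDK or of the original examples:

```java
class FunctionalInterfaceSketch {

    // Hypothetical interface with exactly one abstract method
    @FunctionalInterface
    interface IntCombiner {
        int combine(int a, int b);

        // default methods do not count as abstract methods
        default IntCombiner thenNegate() {
            return (a, b) -> -combine(a, b);
        }
    }

    // accepts any lambda or method reference that matches combine(int, int)
    static int apply(IntCombiner combiner, int a, int b) {
        return combiner.combine(a, b);
    }

    public static void main(String[] args) {
        IntCombiner add = (a, b) -> a + b;
        System.out.println(apply(add, 2, 3));              // lambda expression
        System.out.println(apply(Math::max, 2, 3));        // method reference
        System.out.println(apply(add.thenNegate(), 2, 3)); // default method
    }
}
```

Adding a second abstract method to IntCombiner would make the compiler reject the annotation, which is the main benefit of using it.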
Streams of Elements
The package java.util.stream contains classes and interfaces to support functional-style operations on streams of elements, such as map-reduce transformations on collections. The central interface of this package is Stream<T>, with the primitive specializations IntStream, LongStream and DoubleStream. Streams are also used in many other packages like java.io, java.nio.file, java.util.jar, java.util.regex and java.util.zip.
"Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. A stream pipeline consists of a source (such as a Collection, an array, a generator function, or an I/O channel); followed by zero or more intermediate operations such as Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or Stream.reduce." (see Package java.util.stream).
In the following you will see many examples how to use intermediate and terminal operations on streams.
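Before that, a small sketch of how a pipeline is put together. It assumes a made-up list of strings and a log list (both hypothetical, as is the class name PipelineSketch) and illustrates that the intermediate operations only describe the work, while the terminal operation actually starts it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PipelineSketch {

    // Builds a pipeline, notes the moment it is built, then runs it.
    // Every element that passes the filter is recorded in the log.
    static List<String> run(List<String> log) {
        List<String> source = Arrays.asList("a", "bb", "ccc");

        Stream<String> pipeline = source.stream()   // source
                .filter(s -> s.length() > 1)        // intermediate operation
                .peek(log::add)                     // intermediate operation
                .map(String::toUpperCase);          // intermediate operation

        log.add("pipeline built");                  // no element has been processed yet

        return pipeline.collect(Collectors.toList()); // terminal operation starts the work
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println(run(log)); // [BB, CCC]
        System.out.println(log);      // [pipeline built, bb, ccc]
    }
}
```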
Output with Lambda Expressions and Method References
For the following collection of code examples we need a filled list, so that we can operate on an existing data set.
private static final List<Point> points = createPoints();
private static List<Point> createPoints() {
List<Point> result = new ArrayList<>();
result.add(new Point(-4, -8));
result.add(new Point(-2, 9));
result.add(new Point(-1, -8));
result.add(new Point(0, -7));
result.add(new Point(1, 1));
result.add(new Point(2, 3));
result.add(new Point(2, 3));
result.add(new Point(2, -2));
result.add(new Point(4, -1));
return result;
}
The class java.awt.Point is just a very simple class that holds two numbers, which is convenient for calculations. With some minor changes all the examples can work with different classes.
Standard in Java to Print all Elements of a List
Before we start with lambdas we have a look at the standard way in Java to print all elements of the list:
for (Point point : points) {
System.out.print(point);
}
Expected output:
java.awt.Point[x=-4,y=-8]java.awt.Point[x=-2,y=9]java.awt.Point[x=-1,y=-8]
java.awt.Point[x=0,y=-7]java.awt.Point[x=1,y=1]java.awt.Point[x=2,y=3]
java.awt.Point[x=2,y=3]java.awt.Point[x=2,y=-2]java.awt.Point[x=4,y=-1]
Here the Iterator of the list is used to get each Point and then do the print operation. We use an External Iterator (aka Active Iterator) - this gives full control over the order of execution, exception handling, etc. This is in principle good, but the code is serial and can only be executed in a parallel way with serious effort.
ForEach Lambda Expression or Method Reference to Print all Elements of a List
The Iterable.forEach() method can take a lambda expression:
points.forEach(p -> System.out.print(p));
or an instance method reference (System.out is an instance of PrintStream):
points.forEach(System.out::print);
Expected output:
java.awt.Point[x=-4,y=-8]java.awt.Point[x=-2,y=9]java.awt.Point[x=-1,y=-8]
java.awt.Point[x=0,y=-7]java.awt.Point[x=1,y=1]java.awt.Point[x=2,y=3]
java.awt.Point[x=2,y=3]java.awt.Point[x=2,y=-2]java.awt.Point[x=4,y=-1]
Here the forEach() method of the list is used to call the print operation. We call this an Internal Iterator (aka Passive Iterator) - which delegates the control to the Java run time. Written as points.stream().forEach(...) or points.parallelStream().forEach(...), the same operation can be executed in a serial or in a parallel way. The disadvantage is that the order is not deterministic when it is executed in parallel.
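The ordering issue can be sketched as follows, assuming a small list of integers instead of the points (the class ForEachOrderSketch is made up for illustration). On a parallel stream forEach() may visit the elements in any order, whereas forEachOrdered() keeps the encounter order of the source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ForEachOrderSketch {

    static final List<Integer> VALUES = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

    // forEach() on a parallel stream: all elements are visited, the order may vary
    static List<Integer> visitUnordered() {
        // synchronized list, because several threads may add concurrently
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        VALUES.parallelStream().forEach(seen::add);
        return seen;
    }

    // forEachOrdered() keeps the encounter order even on a parallel stream
    static List<Integer> visitOrdered() {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        VALUES.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("forEach:        " + visitUnordered());
        System.out.println("forEachOrdered: " + visitOrdered());
    }
}
```

The price of forEachOrdered() is that part of the benefit of the parallel execution may be lost.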
ForEach to Print all Elements of a List with Special Formatting
With the following helper method the output can be printed in a more convenient format:
private static void printFormated(Point p) {
System.out.print("[" + p.x + ", " + p.y + "] ");
}
Then we can use a simple lambda expression:
points.forEach(p -> printFormated(p));
or we can use a static method reference:
points.forEach(LambdaBasics::printFormated);
Expected output:
[-4, -8] [-2, 9] [-1, -8] [0, -7] [1, 1] [2, 3] [2, 3] [2, -2] [4, -1]
Kinds of Method References
There are four kinds of method references that can be used (see also Method References):
static methods (MyClass::staticMethodName),
instance methods of a particular object (myObject::instanceMethodName),
instance methods of an arbitrary object of a particular type (MyType::methodName) and
constructors (MyClass::new).
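The four kinds can be sketched with classes from the JDK. The concrete methods used here (Integer.parseInt, String.concat, String.length and the ArrayList constructor) are just convenient picks for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodReferenceSketch {

    static String demo() {
        // 1. static method: Integer.parseInt(String)
        Function<String, Integer> parse = Integer::parseInt;

        // 2. instance method of a particular object: prefix.concat(String)
        String prefix = "x=";
        Function<String, String> addPrefix = prefix::concat;

        // 3. instance method of an arbitrary object of a particular type: s.length()
        Function<String, Integer> length = String::length;

        // 4. constructor: new ArrayList<>()
        Supplier<List<String>> newList = ArrayList::new;

        List<String> out = newList.get();
        out.add(addPrefix.apply("42"));
        out.add(String.valueOf(parse.apply("42") + 1));
        out.add(String.valueOf(length.apply("hello")));
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // x=42 43 5
    }
}
```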
Calculations with mapToInt(), reduce(), ifPresent() and sum()
Examples that demonstrate the use of some intermediate and terminal operations.
Calculate Sum of all X-Coordinates with mapToInt() and reduce()
The operation mapToInt() creates a new IntStream and fills it with the x-coordinates. Then the operation reduce() calculates the sum of all x-coordinates:
int result = points.stream()
.mapToInt(p -> p.x) // map the x value of the point to IntStream
.peek(x -> System.out.print(x + " ")) // trace the values
.reduce(0, (x1, x2) -> x1 + x2); // initial value is needed
System.out.print("\nsum=" + result);
Expected output:
-4 -2 -1 0 1 2 2 2 4
sum=4
The intermediate operation peek() exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline (see Interface Stream<T>).
Calculate Sum of all X-Coordinates with mapToInt(), reduce() and ifPresent()
The initial value in the reduce() method can be omitted. In this case reduce() returns an OptionalInt, which is empty if the stream has no elements. If the stream could be empty, the method ifPresent() should be used to process the result.
points.stream()
.mapToInt(p -> p.x) // map the x value of the point to IntStream
.peek(x -> System.out.print(x + " ")) // trace the values
.reduce((x1, x2) -> x1 + x2) // no initial value is used
.ifPresent(s -> System.out.print("\nsum=" + s)); // executed only if the stream was not empty
Expected output:
-4 -2 -1 0 1 2 2 2 4
sum=4
You may notice that the terminal operation reduce() is not the last method called. In this case the return value of reduce() is the container object OptionalInt. This class has the method ifPresent(), which is called here. Other useful methods of this container are isPresent(), orElse() and getAsInt().
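A short sketch of these OptionalInt methods, assuming a hypothetical helper sum() that reduces its arguments without an initial value:

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

class OptionalIntSketch {

    // reduce() without an initial value returns an OptionalInt
    static OptionalInt sum(int... values) {
        return IntStream.of(values).reduce((a, b) -> a + b);
    }

    public static void main(String[] args) {
        OptionalInt filled = sum(1, 2, 3);
        OptionalInt empty = sum();

        System.out.println(filled.isPresent()); // true
        System.out.println(filled.getAsInt());  // 6
        System.out.println(empty.isPresent());  // false
        System.out.println(empty.orElse(-1));   // -1, the fallback value
        // empty.getAsInt() would throw a NoSuchElementException
    }
}
```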
Calculate Sum of all X-Coordinates with sum()
Another way to calculate the sum is to use the statistics functions of the IntStream class.
int result = points.stream()
.mapToInt(p -> p.x) // map the x value of the point to IntStream
.peek(x -> System.out.print(x + " ")) // trace the values
.sum(); // standard IntStream method (like min, max, average, count)
System.out.print("\nsum=" + result);
Expected output:
-4 -2 -1 0 1 2 2 2 4
sum=4
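If several of these values are needed at once, IntStream.summaryStatistics() returns all of them in a single pass. A minimal sketch, using the x-coordinates of the points directly as int values (the class name SummaryStatisticsSketch is made up for illustration):

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class SummaryStatisticsSketch {

    // the x-coordinates of the points used in the examples
    static IntSummaryStatistics stats() {
        return IntStream.of(-4, -2, -1, 0, 1, 2, 2, 2, 4).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats();
        System.out.println("count=" + s.getCount());
        System.out.println("sum=" + s.getSum());
        System.out.println("min=" + s.getMin());
        System.out.println("max=" + s.getMax());
        System.out.println("average=" + s.getAverage());
    }
}
```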
Calculate Sum of all X-Coordinates with reduce() on Empty List
Notice that pointsEmpty is an empty list. For empty lists the behavior may be different. As long as the reduce() method has an initial value, the result is correct:
int result = pointsEmpty.stream()
.mapToInt(p -> p.x) // map the x value of the point to IntStream
.peek(x -> System.out.print(x + " ")) // trace the values
.reduce(0, (x1, x2) -> x1 + x2); // initial value is needed
System.out.print("\nsum=" + result);
Expected output:
sum=0
Calculate Sum of all X-Coordinates with reduce() and ifPresent() on Empty List
Notice that the ifPresent() method doesn't execute the lambda for an empty list.
pointsEmpty.stream()
.mapToInt(p -> p.x) // map the x value of the point to IntStream
.peek(x -> System.out.print(x + " ")) // trace the values
.reduce((x1, x2) -> x1 + x2) // no initial value is used
.ifPresent(s -> System.out.print("\nsum=" + s)); // not executed, because the stream is empty
Expected output (is empty):
Calculate Sum of all X-Coordinates with sum() on Empty List
The sum() method also works properly with an empty list.
int result = pointsEmpty.stream()
.mapToInt(p -> p.x) // map the x value of the point to IntStream
.peek(x -> System.out.print(x + " ")) // trace the values
.sum(); // standard IntStream method (like min, max, average, count)
System.out.print("\nsum=" + result);
Expected output:
sum=0
Use of filter() and distinct()
Filter all Points which are Positive in X with Lambda Expression
points.forEach(p -> {
if (p.x > 0) {
printFormated(p);
}
});
Expected output:
[1, 1] [2, 3] [2, 3] [2, -2] [4, -1]
Filter all Points which are Positive in X with filter()
points.stream().filter(p -> p.x > 0).forEach(LambdaBasics::printFormated);
Expected output:
[1, 1] [2, 3] [2, 3] [2, -2] [4, -1]
Filter Distinct Points which are Positive in X with filter()
points.stream().filter(p -> p.x > 0).distinct().forEach(LambdaBasics::printFormated);
Expected output:
[1, 1] [2, 3] [2, -2] [4, -1]
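The operation distinct() compares the elements with equals(). This works for java.awt.Point, because the class implements equals() based on x and y. A minimal sketch of the difference, assuming a hypothetical class RawPoint that does not override equals() and hashCode():

```java
import java.awt.Point;
import java.util.stream.Stream;

class DistinctSketch {

    // Hypothetical class without equals()/hashCode(): only object identity counts
    static final class RawPoint {
        final int x;
        final int y;

        RawPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Point implements equals(), so the duplicate [2, 3] is removed
    static long distinctPoints() {
        return Stream.of(new Point(2, 3), new Point(2, 3), new Point(2, -2))
                .distinct()
                .count();
    }

    // RawPoint inherits equals() from Object, so nothing is removed
    static long distinctRawPoints() {
        return Stream.of(new RawPoint(2, 3), new RawPoint(2, 3), new RawPoint(2, -2))
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(distinctPoints());    // 2
        System.out.println(distinctRawPoints()); // 3
    }
}
```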
Wrong/Correct Way to Add Points to an Existing List
Add Point to Original List in the Case the X-Value is Equal Two
This implementation is wrong!
List<Point> points = createPoints();
try {
points.stream() // modify during iteration
.filter(p -> p.x == 2) // but just some points
.map(p -> new Point(100 * p.x, 10 * p.y)) // create a new point
.forEach(points::add); // add the new point
points.forEach(LambdaBasics::printFormated); // print results
} catch (ConcurrentModificationException e) {
System.out.println(e.toString());
}
Expected output:
java.util.ConcurrentModificationException
Add Point to New List in the Case the X-Value is Equal Two
This implementation is correct!
Stream<Point> pointsResults = Stream.concat( // concatenate two streams
points.stream(), // original stream
points.stream() // add points to new stream
.filter(p -> p.x == 2) // but just some points
.map(p -> new Point(100 * p.x, 10 * p.y))
);
pointsResults.forEach(LambdaBasics::printFormated); // print results
Expected output:
[-4, -8] [-2, 9] [-1, -8] [0, -7] [1, 1] [2, 3] [2, 3] [2, -2] [4, -1] [200, 30] [200, 30] [200, -20]
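If the original list itself has to be extended, one possible alternative is to finish the stream first and to modify the list afterwards. This is only a sketch of that idea, not part of the original examples; the class AddAfterCollectSketch and the method extend() are hypothetical names.

```java
import java.awt.Point;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class AddAfterCollectSketch {

    static List<Point> extend(List<Point> points) {
        // first finish the stream and collect the new points ...
        List<Point> additions = points.stream()
                .filter(p -> p.x == 2)
                .map(p -> new Point(100 * p.x, 10 * p.y))
                .collect(Collectors.toList());
        // ... then modify the list, when no iteration is active any more
        points.addAll(additions);
        return points;
    }

    public static void main(String[] args) {
        List<Point> points = new ArrayList<>();
        points.add(new Point(1, 1));
        points.add(new Point(2, 3));
        points.add(new Point(2, -2));
        extend(points).forEach(p -> System.out.print("[" + p.x + ", " + p.y + "] "));
    }
}
```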
Use collect() and Collectors
The class Collectors is used for accumulating elements into collections and has a large number of useful reduction operations like grouping, partitioning, joining, counting, statistics and mapping (see Class Collectors).
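Two of these reductions, partitioning and grouping with counting, can be sketched as follows. For brevity the sketch works on the x-coordinates as plain Integer values; the class name CollectorsSketch is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsSketch {

    // the x-coordinates of the points used in the examples
    static final List<Integer> X_VALUES = Arrays.asList(-4, -2, -1, 0, 1, 2, 2, 2, 4);

    // partitioningBy: split the values into two lists with a predicate
    static Map<Boolean, List<Integer>> partitionByPositive() {
        return X_VALUES.stream()
                .collect(Collectors.partitioningBy(x -> x > 0));
    }

    // groupingBy with counting: how often each value occurs
    static Map<Integer, Long> countPerValue() {
        return X_VALUES.stream()
                .collect(Collectors.groupingBy(x -> x, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(partitionByPositive().get(true));  // [1, 2, 2, 2, 4]
        System.out.println(partitionByPositive().get(false)); // [-4, -2, -1, 0]
        System.out.println(countPerValue().get(2));           // 3
    }
}
```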
Use Collectors to Store all Points with Positive X into a New List
List<Point> result = points.stream()
.filter(p -> p.getX() > 0)
.collect(Collectors.toCollection(ArrayList::new));
result.forEach(LambdaBasics::printFormated);
Expected output:
[1, 1] [2, 3] [2, 3] [2, -2] [4, -1]
The interesting thing here is the Collectors.toCollection(ArrayList::new), which creates a new ArrayList and fills it.
Use Collectors to Create Comma Separated String
String resultString = points.stream()
.mapToInt(p -> p.x) // extract x value
.mapToObj(Integer::toString) // create String with integer value
.collect(Collectors.joining(", ")); // create string
System.out.println(resultString);
Expected output:
-4, -2, -1, 0, 1, 2, 2, 2, 4
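The performance of the method Collectors.joining() is usually better than repeated String concatenation in a lambda expression, because the result is built in a single internal buffer instead of a new String for every element.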
Performance of Sequential and Parallel Execution
The following tests are implemented with the execute-around pattern using Lambdas. For this we need a functional interface:
package com.sw_engineering_candies.examples;
@FunctionalInterface
public interface PerformanceTestLambda {
void execute();
}
The helper class for the test cases:
package com.sw_engineering_candies.examples;
public class PerformanceTestCase {
private PerformanceTestLambda lambda;
private String comment;
public PerformanceTestCase(
PerformanceTestLambda lambda,
String comment) {
this.lambda = lambda;
this.comment = comment;
}
public PerformanceTestLambda getLambda() {
return lambda;
}
public String getComment() {
return comment;
}
}
And the test runner:
package com.sw_engineering_candies.examples;
import java.util.ArrayList;
import java.util.List;
public class PerformanceTestRunner {
private static final int EXECUTION_CYCLES = 100;
private static final int WARM_UP_CYCLES = 200;
private static final List<PerformanceTestCase> tests = new ArrayList<>();
public static void add(String description, PerformanceTestLambda block) {
tests.add(new PerformanceTestCase(block, description));
}
public static void executeAll() {
System.out.println();
tests.forEach(PerformanceTestRunner::initialize);
System.out.println();
tests.forEach(PerformanceTestRunner::execute);
}
private static void execute(PerformanceTestCase testCase) {
long start = System.currentTimeMillis();
for (int count = 0; count < EXECUTION_CYCLES; count++) {
testCase.getLambda().execute();
}
double duration = ((double) (System.currentTimeMillis() - start)
/ EXECUTION_CYCLES);
System.out.println("test case '" + testCase.getComment()
+ "' elapsed time per execution "
+ duration + " ms");
}
private static void initialize(PerformanceTestCase testcase) {
System.out.print("test case '" + testcase.getComment() + "' warm-up ");
for (int count = 0; count < WARM_UP_CYCLES; count++) {
testcase.getLambda().execute();
if (count % (WARM_UP_CYCLES / Math.min(20, WARM_UP_CYCLES)) == 0) {
System.out.print(".");
}
}
System.out.println(" ready");
}
}
Then the following performance tests can be executed:
package com.sw_engineering_candies.examples;
import java.util.ArrayList;
import java.util.List;
public class LambdaPerformance {
private static final int NUMBER_OF_DATA = 100000;
private static List<Double> values = new ArrayList<>(NUMBER_OF_DATA);
static {
for (int i = 0; i < NUMBER_OF_DATA; i++) {
values.add(Math.random());
}
}
public static void main(String[] args) {
PerformanceTestRunner.add("standard for-loop", () -> {
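// Double.MIN_VALUE is the smallest positive double, so start from negative infinity
double result = Double.NEGATIVE_INFINITY;
double y = Double.NEGATIVE_INFINITY;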
for (double x : values) {
y = Math.cos(x);
if (y > result) {
result = y;
}
}
});
PerformanceTestRunner.add("sequential", () -> {
double result = values.stream()
.mapToDouble(Math::cos)
.reduce(Double.NEGATIVE_INFINITY, (i, j) -> Math.max(i, j)); // identity value for max
});
PerformanceTestRunner.add("parallel", () -> {
double result = values.parallelStream()
.mapToDouble(Math::cos)
.reduce(Double.NEGATIVE_INFINITY, (i, j) -> Math.max(i, j)); // identity value for max
});
PerformanceTestRunner.executeAll();
}
}
Expected output:
test case 'standard for-loop' warm-up .................... ready
test case 'sequential' warm-up .................... ready
test case 'parallel' warm-up .................... ready
test case 'standard for-loop' elapsed time per execution 5.68 ms
test case 'sequential' elapsed time per execution 6.1 ms
test case 'parallel' elapsed time per execution 1.79 ms
These results depend on the hardware and the JVM and are not the same for all types of Collections, but in general two things are quite common for all collections:
the sequential execution is slightly slower than a standard for-loop and
the parallel execution may result in a significant performance improvement.
Further Reading
Lambda Expressions - Java Documentation; http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html
Lambda Expressions and Streams in Java - Tutorial & Reference, by Angelika Langer and Klaus Kreft; http://www.angelikalanger.com/Lambdas/Lambdas.html
Java 8 Friday: 10 Subtle Mistakes When Using the Streams API, by jOOQ Team; http://blog.jooq.org/2014/06/13/java-8-friday-10-subtle-mistakes-when-using-the-streams-api