25. λ Expression Review
Part of
CS:2820 Object Oriented Software Development Notes, Fall 2020
|
Here is a little Java program, something any first-semester programmer should be able to understand early in the semester:
public class Demo { public static void main(String[] args) { int a = 1; for (int i = 1; i <= 10; i++) { a = a + a; } System.out.println( "= " + a ); } }
Each iteration of the loop doubles a, starting with one, so the successive values are 1, 2, 4, 8, 16, 32, through powers of two up to 1024 after the 10th iteration. Hardly exciting!
Here is what might at first seem to be a trivial change to the above program, adding λ expressions.
interface Lazy<T> { T eval(); } public class LazyDemo { public static void main(String[] args) { Lazy<Integer> a = () -> 1; for (int i = 1; i <= 30; i++) { final Lazy<Integer> b = a; a = () -> b.eval() + b.eval(); } System.out.println( "= " + a.eval() ); } }
At first glance, the interface Lazy seems to do nothing useful, and the main program seems to have the exact same logic as the original, except that a bunch of annoying λ expressions have been added, making things more verbose and apparently accomplishing nothing at all.
As we will see, this is not true! This program actually builds a complex data structure, and its execution is recursive.
One of the more annoying additions to the program is the variable b added inside the loop. It would have been nice to write the body of the loop like this:
a = () -> a.eval() + a.eval();
Unfortunately, the above is illegaal because the variable a is not final or effectively final — obviously, since it is assigned many times inside the loop. Therefore, we need to make a final copy of a:
final Lazy<Integer> b = a; a = () -> b.eval() + b.eval();
Before going on, go back and look at LazyDemo and see if you can find the data structures it builds and the recursive code that traverses that data structure.
To see what is going on above, it is worth looking at how Java implements λ expressions. We can rewrite the example removing λ notation as follows:
interface Lazy<T> { T eval(); } public class LazyDemo { public static void main(String[] args) { Lazy<Integer> a = new Lazy<Integer> { // public Integer eval() { //+ return 1; //+ } //+ }; //+ for (int i = 1; i <= 30; i++) { final Lazy<Integer> b = a; a = new Lazy<Integer> { // public Integer eval() { //+ return b.eval() + b.eval(); //+ } //+ }; //+ } System.out.println( "= " + a.eval() ); } }
Lines that are added to the original are marked with //+, lines with small changes from the original are marked with //.
Now, you should see that the program created a total of 11 objects of class Lazy<Integer>. One was the initial value of a, and 10 more were created during the iterations of the loop body.
Some quiz questions:
To see what is going on above, it is worth looking at how Java implements anonymous inner classes:
interface Lazy<T> { T eval(); } public class LazyDemo { public static void main(String[] args) { class InitLazy implements Lazy<Integer> { //+ public Integer eval() { return 1; } } Lazy<Integer> a = new InitLazy(); // for (int i = 1; i <= 30; i++) { final Lazy<Integer> b = a; class UpdateLazy implements Lazy<Integer> { //+ pubic Integer eval() { return b.eval() + b.eval(); } } a = new UpdateLazy(); // } System.out.println( "= " + a.eval() ); } }
Again, lines that are added to the original are marked with //+, and lines with small changes from the original are marked with //. Lines that have merely been moved without changes are unmarked.
Again, it should be obvious that there are 11 new objects created here. The same quiz questions remain appropriate:
To see what is going on above, it is worth looking at how Java implements inner classes:
interface Lazy<T> { T eval(); } class InitLazy implements Lazy<Integer> { public Integer eval() { return 1; } } class UpdateLazy implements Lazy<Integer> { final Lazy<Integer> b; //+ UpdateLazy( Lazy<Integer> i ) { //+ b = i; //+ } //+ public Integer eval() { return b.eval() + b.eval(); } } public class LazyDemo { public static void main(String[] args) { Lazy<Integer> a = new InitLazy(); for (int i = 1; i <= 30; i++) { a = new UpdateLazy( a ); // } System.out.println( "= " + a.eval() ); } }
Again, lines that are added to the original are marked with //+, and lines with small changes from the original are marked with //. Lines that have merely been moved without changes are unmarked.
We made one optimization! We eliminated the final variablbe b from inside the loop, since we were forced to put a new variable inside the UpdateLazy object and the auxiliary final variable in the loop became redundant once we did this.
The same quiz questions remain relevant, but the answers should now be evident:
The only arithmetic done inside the loop involves incrementing the loop control variable i. The job of the loop is to create objects, where each object holds (in its instance variable b) a link to the previous object.
If each object contains a link to another object, we have a linked list!
If none of the arithmetic is done inside the loop, all the arithmetic must be done by a.eval() inside the final print statement.
There are some new quiz questions that become appropriate at this point:
Recursively. The first call to eval() uses the definition from class UpdateLazy:
return b.eval() + b.eval();
There are 10 list elements from class UpdateLazy, so each of the above calls to b.eval() does the same thing until finally the recursion terminates when an element of class InitLazy is found.
The program runs in time proportional to 2n when the loop has n iterations. 10 iterations is not difficult, but try it with 30 iterations, and you will see it slow down considerably.
To change the topic completely, can you find all the autoboxing and unboxing in this program? The code involves many many conversions between the primitive type int and the class Integer. These undoubtedly slow the program considerably, but they only impose a linear time performance penalty, while the traversal of the list imposes an exponential penalty.
What we've done above, originally using λ expressions and then unbundling to explain how it worked, is called lazy computation. The variable a inside our loop does not hold the result of the computation we want, it holds the instructions to do that computation when we need the result.
This particular example doesn't make lazyness look valuable, but there are times when lazyness is extraordinarily valuable. Some problems, for example, are best described by infinite data structures. No particular run of a program will ever need the entire infinite data structure, so what we can do is describe that structure using a lazy algorithm to construct it. Then, when the program runs, the eval methods are called only to generate just enough of the data structure to get the result we need.
The worst feature of the code above is that each call to eval required two recursive calls to eval. Of course, we could fix this by replacing this:
return b.eval() + b.eval();
With this:
return 2 * b.eval();
That is a one-time optimization requiring that you understand the algorithm. There is a trick called memoization (a term coined in 1968 by Donald Michie) that eliminates the exponential cost without requiring that we understand the algorithm being evaluated.
The idea of memoization is to have each object retain a memorandum of its value. The first time you call the eval method on an object, it actually does whatever work is required, but it also keeps a memo of the result it computed. On the second and later calls to eval on the same object, it returns the memo instad of repeating the work it already did.
Class Memo in the following version of our code does this. Here, we return to the use of compact λ notation for the sake of readability.
interface Lazy<T> { T eval(); } class Memo<T> implements Lazy<T> { private Lazy<T> lazy; // the lazy expression, or null after evaluation private T memo = null; // the memorandum of the previous value public Memo( Lazy<T> lazy ) { this.lazy = lazy; } public T eval() { if (lazy == null) return memo; memo = lazy.eval(); lazy = null; return memo; } } public class MemoDemo { public static void main(String[] args) { Lazy<Integer> a = () -> 1; for (int i = 1; i <= 30; i++) { final Memo<Integer> b = new Memo<Integer>( a ); a = () -> b.eval() + b.eval(); } System.out.println( "= " + a.eval() ); } }
Note that the main program has hardly changed, but that, because we have memoized the data structure, execution now takes linear time in the number of iterations of the loop. Setting the number of iterations to 30, the non-memoized lazy version took 4.35 seconds where the memoized version ran in about 0.1 seconds on one server.
There are some little quiz questions appropriate for this code:
Twice as complex. Where there was a list of 10 Lazy objects, there are now 20 objects, aternating Lazy and Memo objects.
Of course there are an infinite number of rewrites. Here is one:
final Lazy<Integer> b = a; a = new Memo<Integer>( () -> b.eval() + b.eval() );
It reverses the order of the 10 pairs of Memo and Lazy items in the list it builds.