7.7 Java Expression Operations

Cascading provides some support for dynamically compiled Java expressions to be used as either Functions or Filters. This functionality is provided by the Janino embedded compiler. Janino and its documentation can be found on its website, http://www.janino.net/. In short, an Expression is a single line of Java, for example a + 3 * 2 or a < 7. The first resolves to a number, the second to a boolean value. Variables such as a and b are field names passed in as Tuple arguments to the Operation. Janino compiles the expression into byte code, so it is evaluated at compiled-code speed.
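To make the two sample expressions concrete, the following plain-Java sketch evaluates them by hand for a given value of a. It uses neither Janino nor Cascading; the class and method names are hypothetical and only illustrate what each expression resolves to.

```java
// Illustration only: ExpressionSketch and its methods are hypothetical names,
// not part of Cascading or Janino.
final class ExpressionSketch {

  // Same result as the expression "a + 3 * 2".
  // Multiplication binds tighter than addition, so this is a + 6.
  static int arithmetic(int a) {
    return a + 3 * 2;
  }

  // Same result as the expression "a < 7".
  static boolean comparison(int a) {
    return a < 7;
  }

  public static void main(String[] args) {
    System.out.println(arithmetic(4)); // prints 10
    System.out.println(comparison(4)); // prints true
  }
}
```

In Cascading itself the value of a would come from the argument Tuple rather than from a method parameter.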

ExpressionFunction

The cascading.operation.expression.ExpressionFunction function dynamically resolves a given expression using argument Tuple values as inputs to the fields specified in the expression.

// incoming -> "ip", "time", "method", "event", "status", "size"

String exp =
  "\"this \" + method + \" request was \" + size + \" bytes\"";
Fields fields = new Fields( "pretty" );
ExpressionFunction function =
  new ExpressionFunction( fields, exp, String.class );

assembly =
  new Each( assembly, new Fields( "method", "size" ), function );

// outgoing -> "pretty" = "this GET request was 1282652 bytes"

Above, we create a new String value from our expression. Note that we must declare the type of every input Tuple value so the expression compiler knows how to treat the variables in the expression.

ExpressionFilter

The cascading.operation.expression.ExpressionFilter filter dynamically resolves a given expression using argument Tuple values as inputs to the fields specified in the expression. Any Tuple for which the expression evaluates to true will be removed from the stream.

// incoming -> "ip", "time", "method", "event", "status", "size"

ExpressionFilter filter =
  new ExpressionFilter( "status != 200", Integer.TYPE );

assembly = new Each( assembly, new Fields( "status" ), filter );

// outgoing -> "ip", "time", "method", "event", "status", "size"

Above, every line in the Apache log that does not have a "200" status will be filtered out. Notice that "status" would be a String in this example if it was emitted from a RegexParser. If so, the ExpressionFilter will coerce the value from a String to an int for the comparison.
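The filter semantics are easy to get backwards, so here is a minimal plain-Java sketch of them: a value is removed when the expression is true, and a String status is converted to an int before the comparison. It does not call Cascading. FilterSketch, isRemoved and keep are hypothetical names, and using Integer.parseInt for the conversion is an assumption made for illustration, not a description of how Cascading performs its coercion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not part of the Cascading API.
final class FilterSketch {

  // Mirrors the expression "status != 200".
  // Returning true means the value is REMOVED from the stream.
  static boolean isRemoved(String status) {
    // Assumption: the String is converted with Integer.parseInt.
    int value = Integer.parseInt(status.trim());
    return value != 200;
  }

  // Returns only the status values that survive the filter.
  static List<String> keep(List<String> statuses) {
    List<String> kept = new ArrayList<>();
    for (String status : statuses) {
      if (!isRemoved(status)) {
        kept.add(status);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> statuses = Arrays.asList("200", "404", "200", "500");
    System.out.println(keep(statuses)); // prints [200, 200]
  }
}
```

To keep the non-200 lines instead, the expression would be inverted to status == 200, since the filter removes whatever matches.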

Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.