Cascading provides some support for dynamically compiled Java
expressions to be used in either Functions or Filters. This capability
is provided by the Janino embedded Java compiler, which compiles the
expressions into byte code for optimal processing speed. Janino is
documented in detail on its website, http://www.janino.net/.
This capability allows an Operation to evaluate a suitable
one-line Java expression, such as a + 3 * 2 or a < 7, where the
variable values (here, a) are passed in as Tuple fields. The result of
the Operation thus depends on the evaluated result of the expression:
in the first example, some number, and in the second, a Boolean value.
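As a rough illustration of what these two expressions evaluate to, the following plain-Java sketch binds a to a sample value and evaluates each expression directly. It does not use Cascading or Janino; the class name ExpressionResults, the method names, and the sample value are assumptions chosen only for this example.

```java
public class ExpressionResults {
    // Mirrors the expression "a + 3 * 2": multiplication binds tighter than addition.
    static int arithmetic(int a) {
        return a + 3 * 2;
    }

    // Mirrors the expression "a < 7": the result is a Boolean value.
    static boolean comparison(int a) {
        return a < 7;
    }

    public static void main(String[] args) {
        int a = 4; // hypothetical value taken from a Tuple field named "a"
        System.out.println(arithmetic(a)); // 4 + (3 * 2) = 10
        System.out.println(comparison(a)); // 4 < 7 is true
    }
}
```

The first method returns a number and the second a boolean, which is the distinction that makes one expression suitable for a Function and the other for a Filter.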
The function cascading.operation.expression.ExpressionFunction
dynamically evaluates an expression, given as a String, when executed,
assigning argument Tuple values to variables in the expression.
// incoming -> "ip", "time", "method", "event", "status", "size"
String exp =
  "\"this \" + method + \" request was \" + size + \" bytes\"";
Fields fields = new Fields( "pretty" );
ExpressionFunction function =
  new ExpressionFunction( fields, exp, String.class );
assembly =
  new Each( assembly, new Fields( "method", "size" ), function );
// outgoing -> "pretty" = "this GET request was 1282652 bytes"
Above, the expression builds a new String value from values in the current Tuple. Note that you must declare the types of the input Tuple fields (here, String.class) so that the expression compiler knows how to treat the variables in the expression.
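To see why the declared type matters, consider how ordinary Java treats the same expression text under different variable types. The sketch below is plain Java, not Cascading or Janino code; the class DeclaredTypes and its method names are hypothetical and exist only for this illustration.

```java
public class DeclaredTypes {
    // Both variables declared as String: "+" concatenates, as in the example expression.
    static String pretty(String method, String size) {
        return "this " + method + " request was " + size + " bytes";
    }

    // "size + 1" with size declared as String: concatenation.
    static String plusOneAsString(String size) {
        return size + 1;
    }

    // "size + 1" with size declared as int: arithmetic.
    static int plusOneAsInt(int size) {
        return size + 1;
    }

    public static void main(String[] args) {
        System.out.println(pretty("GET", "1282652"));   // this GET request was 1282652 bytes
        System.out.println(plusOneAsString("1282652")); // 12826521
        System.out.println(plusOneAsInt(1282652));      // 1282653
    }
}
```

The expression text is identical in the last two methods, yet the results differ, which is why the compiler needs to be told the type of each variable.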
The filter cascading.operation.expression.ExpressionFilter
evaluates a Boolean expression, assigning argument Tuple values
to variables in the expression. If the expression returns true,
the Tuple is removed from the stream.
// incoming -> "ip", "time", "method", "event", "status", "size"
ExpressionFilter filter =
  new ExpressionFilter( "status != 200", Integer.TYPE );
assembly = new Each( assembly, new Fields( "status" ), filter );
// outgoing -> "ip", "time", "method", "event", "status", "size"
In this example, every line in the Apache log that does
not have a status of "200" is filtered out. ExpressionFilter
coerces the value into the specified type if necessary to make
the comparison, in this case coercing the status String into
an int.
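Because a true result removes the Tuple rather than keeping it, the direction of the test is easy to get backwards. The following plain-Java sketch mimics the behavior described above, including the String-to-int coercion. It does not use Cascading; the class StatusFilterSketch, its methods, and the sample status values are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatusFilterSketch {
    // Mirrors the expression "status != 200" with status declared as int.
    static boolean shouldRemove(String rawStatus) {
        int status = Integer.parseInt(rawStatus); // coerce the String field to an int
        return status != 200;
    }

    // Keeps only the values for which the filter expression is false.
    static List<String> keep(List<String> statuses) {
        List<String> kept = new ArrayList<String>();
        for (String s : statuses) {
            if (!shouldRemove(s)) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical status values from four log lines.
        List<String> statuses = Arrays.asList("200", "404", "200", "500");
        System.out.println(keep(statuses)); // [200, 200]
    }
}
```

Only the entries with status 200 survive, since the expression is true (and the entry is dropped) for every other status.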
As of Cascading 2.2, two new operations have been added alongside
cascading.operation.expression.ExpressionFilter and
cascading.operation.expression.ExpressionFunction to support
multi-line Java code:
cascading.operation.expression.ScriptFilter and
cascading.operation.expression.ScriptFunction.
See the relevant Javadoc for details on usage.
Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.