A Function expects a single argument Tuple, and may return zero or more result Tuples. A Function may only be used with an Each pipe, which may follow any other pipe type.
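The "one argument Tuple in, zero or more result Tuples out" contract can be pictured with a small plain-Java model. This is only a sketch under stated assumptions, not the Cascading API: `MiniFunction`, `TokenizeFunction`, and `MiniFunctionDemo` are hypothetical stand-ins, and tuples are modeled as plain lists.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the Function contract; not the Cascading API.
interface MiniFunction {
    // Receives one argument tuple and adds zero or more result tuples to the collector.
    void operate(List<Object> arguments, List<List<Object>> collector);
}

// Emits one single-value result tuple per whitespace-separated token of the first argument.
final class TokenizeFunction implements MiniFunction {
    @Override
    public void operate(List<Object> arguments, List<List<Object>> collector) {
        String text = String.valueOf(arguments.get(0)).trim();
        if (text.isEmpty()) {
            return; // zero results for this argument tuple
        }
        for (String token : text.split("\\s+")) {
            collector.add(Collections.<Object>singletonList(token));
        }
    }
}

final class MiniFunctionDemo {
    // Applies the function to every argument tuple in turn and gathers all results.
    static List<List<Object>> applyToEach(MiniFunction function, List<List<Object>> input) {
        List<List<Object>> output = new ArrayList<>();
        for (List<Object> arguments : input) {
            function.operate(arguments, output);
        }
        return output;
    }
}
```

In this model, an input of two argument tuples holding "a b" and "" yields two result tuples in total: two from the first input and none from the second.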
To create a custom Function, subclass the class cascading.operation.BaseOperation and implement the interface cascading.operation.Function. Because BaseOperation has been subclassed, the operate method, as defined on the Function interface, is the only method that must be implemented.
Example 5.1. Custom Function
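import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Function;
import cascading.operation.FunctionCall;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class SomeFunction extends BaseOperation implements Function
  {
  public void operate( FlowProcess flowProcess, FunctionCall functionCall )
    {
    // get the arguments TupleEntry
    TupleEntry arguments = functionCall.getArguments();

    // create a Tuple to hold our result values
    Tuple result = new Tuple();

    // insert some values into the result Tuple

    // return the result Tuple
    functionCall.getOutputCollector().add( result );
    }
  }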
Functions should declare both the number of argument values they expect, and the field names of the Tuple they will return.
Functions must accept one or more values in a Tuple as arguments; by default they accept any number of values (Operation.ANY). During the planning phase, Cascading verifies that the number of arguments selected matches the number of arguments expected.
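The argument-count rule can be pictured with a small plain-Java model. This is a sketch of the rule only, not Cascading's planner code: the class `ArityCheck`, its `ANY` constant, and the `accepts` method are hypothetical stand-ins.

```java
// Hypothetical stand-in for the planner's argument-count check; not Cascading code.
final class ArityCheck {
    // Assumed sentinel meaning "any number of arguments", playing the role of Operation.ANY.
    static final int ANY = Integer.MAX_VALUE;

    // Returns true when the number of selected argument values is acceptable
    // for a function that declared `expected` arguments.
    static boolean accepts(int expected, int selected) {
        if (selected < 1) {
            return false; // a Function needs at least one argument value
        }
        return expected == ANY || expected == selected;
    }

    public static void main(String[] args) {
        System.out.println(accepts(2, 2));   // prints true
        System.out.println(accepts(2, 3));   // prints false
        System.out.println(accepts(ANY, 5)); // prints true
    }
}
```

A function that declares two arguments is rejected when three values are selected, while one that declares any number accepts whatever is selected, provided at least one value is present.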
Functions may optionally declare the field names they return; by default Functions declare Fields.UNKNOWN.
Both declarations must be made in the constructor, either by passing default values to the super constructor, or by accepting the values from the user through constructor parameters.
Example 5.2. Add Values Function
import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Function;
import cascading.operation.FunctionCall;
import cascading.tuple.Fields;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class AddValuesFunction extends BaseOperation implements Function
  {
  public AddValuesFunction()
    {
    // expects 2 arguments, fail otherwise
    super( 2, new Fields( "sum" ) );
    }

  public AddValuesFunction( Fields fieldDeclaration )
    {
    // expects 2 arguments, fail otherwise
    super( 2, fieldDeclaration );
    }

  public void operate( FlowProcess flowProcess, FunctionCall functionCall )
    {
    // get the arguments TupleEntry
    TupleEntry arguments = functionCall.getArguments();

    // create a Tuple to hold our result values
    Tuple result = new Tuple();

    // sum the two arguments
    int sum = arguments.getInteger( 0 ) + arguments.getInteger( 1 );

    // add the sum value to the result Tuple
    result.add( sum );

    // return the result Tuple
    functionCall.getOutputCollector().add( result );
    }
  }
The example above implements a fully functional Function that accepts two values in the argument Tuple, adds them together, and returns the result in a new Tuple.
The first constructor assumes a default field name this function will return, but it is a best practice to always give the user the option to override the declared field names to prevent any field name collisions that would cause the planner to fail.
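To see why an overridable field declaration helps, consider a plain-Java model of a field name collision. It is a sketch only, not Cascading's planner logic: it assumes the function's declared fields are appended to the fields already present in the stream, and the class `FieldCollision` and its `append` method are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a planner-style field name check; not Cascading code.
final class FieldCollision {
    // Appends the declared result field names to the incoming field names,
    // failing if a declared name is already present.
    static List<String> append(List<String> incoming, List<String> declared) {
        List<String> combined = new ArrayList<>(incoming);
        for (String name : declared) {
            if (combined.contains(name)) {
                throw new IllegalStateException("field name collision: " + name);
            }
            combined.add(name);
        }
        return combined;
    }
}
```

Under that assumption, a stream that already carries a field named "sum" collides with the default declaration, while a caller-supplied name such as "total" does not.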
Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.