5.2 Functions

A Function expects a single argument Tuple, and may return zero or more result Tuples.

A Function may only be used with an Each pipe, which may follow any other pipe type.

To create a custom Function, subclass the class cascading.operation.BaseOperation and implement the interface cascading.operation.Function. Because BaseOperation has been subclassed, the operate method, as defined on the Function interface, is the only method that must be implemented.

Example 5.1. Custom Function

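import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Function;
import cascading.operation.FunctionCall;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class SomeFunction extends BaseOperation implements Function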
  {
  public void operate( FlowProcess flowProcess, FunctionCall functionCall )
    {
    // get the arguments TupleEntry
    TupleEntry arguments = functionCall.getArguments();

    // create a Tuple to hold our result values
    Tuple result = new Tuple();

    // insert some values into the result Tuple

    // return the result Tuple
    functionCall.getOutputCollector().add( result );
    }
  }
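
Example 5.1 adds exactly one result Tuple per call, but a Function may add zero or more. The sketch below models that contract in plain Java so it can run without Cascading on the classpath. TupleFunction, SplitWords, and SplitWordsDemo are hypothetical stand-ins invented for illustration, not Cascading classes; a List<Object> plays the role of a Tuple and a Consumer plays the role of the output collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a Function: one argument tuple in, any number of results out.
interface TupleFunction
  {
  void operate( List<Object> arguments, Consumer<List<Object>> collector );
  }

// Emits one single-value result per whitespace-separated word in the first argument.
class SplitWords implements TupleFunction
  {
  public void operate( List<Object> arguments, Consumer<List<Object>> collector )
    {
    String line = String.valueOf( arguments.get( 0 ) ).trim();

    // a blank line produces zero results for this argument tuple
    if( line.isEmpty() )
      return;

    for( String word : line.split( "\\s+" ) )
      collector.accept( List.<Object>of( word ) );
    }
  }

class SplitWordsDemo
  {
  // Calls the function once with a single-value argument tuple and gathers what it emitted.
  static List<List<Object>> run( TupleFunction function, String line )
    {
    List<List<Object>> results = new ArrayList<>();
    function.operate( List.<Object>of( line ), results::add );
    return results;
    }
  }
```

In this model, a line of four words yields four result tuples and a blank line yields none, which is the same freedom the output collector gives a real Function.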

Functions should declare both the number of argument values they expect, and the field names of the Tuple they will return.

Functions must accept one or more values in a Tuple as arguments; by default they accept any number (Operation.ANY) of values. During the planning phase, Cascading verifies that the number of arguments selected matches the number of arguments expected.
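
The rule in the paragraph above can be modeled as a small check. This is only a plain-Java sketch of the rule, not the planner's actual code: ArgumentCountCheck and its accepts method are hypothetical names, and the ANY constant here is a placeholder whose value is not meant to match Operation.ANY.

```java
class ArgumentCountCheck
  {
  // Placeholder meaning "any number of arguments"; the actual value of
  // Operation.ANY is not assumed here.
  static final int ANY = Integer.MAX_VALUE;

  // Returns true when the selected argument fields satisfy the declared count.
  static boolean accepts( int expectedArguments, String... selectedFields )
    {
    // a Function needs at least one argument value
    if( selectedFields.length < 1 )
      return false;

    return expectedArguments == ANY || expectedArguments == selectedFields.length;
    }
  }
```

Under this model, a Function declared with two arguments accepts a selection of two fields and rejects a selection of one or three, while a Function declared with ANY accepts any non-empty selection.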

Functions may optionally declare the field names they return; by default, Functions declare Fields.UNKNOWN.

Both declarations must be made in the constructor, either by passing default values to the super constructor or by accepting the values from the user through a constructor argument.

Example 5.2. Add Values Function

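import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Function;
import cascading.operation.FunctionCall;
import cascading.tuple.Fields;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class AddValuesFunction extends BaseOperation implements Function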
  {
  public AddValuesFunction()
    {
    // expects 2 arguments, fail otherwise
    super( 2, new Fields( "sum" ) );
    }

  public AddValuesFunction( Fields fieldDeclaration )
    {
    // expects 2 arguments, fail otherwise
    super( 2, fieldDeclaration );
    }

  public void operate( FlowProcess flowProcess, FunctionCall functionCall )
    {
    // get the arguments TupleEntry
    TupleEntry arguments = functionCall.getArguments();

    // create a Tuple to hold our result values
    Tuple result = new Tuple();

    // sum the two arguments
    int sum = arguments.getInteger( 0 ) + arguments.getInteger( 1 );

    // add the sum value to the result Tuple
    result.add( sum );

    // return the result Tuple
    functionCall.getOutputCollector().add( result );
    }
  }

The example above implements a fully functional Function that accepts two values in the argument Tuple, adds them together, and returns the result in a new Tuple.

The first constructor supplies a default name for the field this Function returns. It is best practice, however, to always let the user override the declared field names, as the second constructor does, to prevent field name collisions that would cause the planner to fail.
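
To see why an overridable field name matters, consider a stream that already carries a field called "sum". The plain-Java sketch below models the collision under the assumption that the result fields are appended to the incoming fields; whether that happens in a real flow depends on how the results are selected on the Each pipe. FieldCollisionCheck and its append method are hypothetical names, and the error message is illustrative rather than the planner's own.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FieldCollisionCheck
  {
  // Appends the declared result fields to the incoming fields, failing on a duplicate name.
  static List<String> append( List<String> incoming, List<String> declared )
    {
    Set<String> names = new LinkedHashSet<>( incoming );

    for( String name : declared )
      {
      // Set.add returns false when the name is already present
      if( !names.add( name ) )
        throw new IllegalArgumentException( "duplicate field name: " + name );
      }

    return new ArrayList<>( names );
    }
  }
```

In this model, appending "sum" to the fields "lhs", "rhs", "sum" fails, while appending a user-chosen name such as "total" succeeds. That is the situation the second constructor of AddValuesFunction lets the user avoid.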

Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.