Interface Buffer<C>

All Superinterfaces: Operation<C>

public interface Buffer<C>
extends Operation<C>

A Buffer is similar to an Aggregator in that it operates on unique groups of values. It differs in that an Iterator is provided, and it is the responsibility of the operate(cascading.flow.FlowProcess, BufferCall) method to iterate over all the input arguments returned by this Iterator, if any.

For the case where a Buffer follows a CoGroup, the method operate(cascading.flow.FlowProcess, BufferCall) will be called for every unique group whether or not there are values available to iterate over. This may be counter-intuitive for the case of an 'inner join' where the left or right stream may have a null grouping key value. Regardless, the current grouping value can be retrieved through BufferCall.getGroup().
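The once-per-group behavior described above can be modeled with a minimal sketch in plain Java. Nothing here is a Cascading class: EmptyGroupSketch, summarize, and run are hypothetical stand-ins, and the driver loop only imitates calling operate once per unique group, including groups with no values and a group with a null key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: plain Java only, no Cascading classes.
final class EmptyGroupSketch {

    // Plays the role of operate(): it is handed the group key and an Iterator
    // over the group's values, and must cope with an empty Iterator.
    static String summarize(Object groupKey, Iterator<String> values) {
        int count = 0;
        while (values.hasNext()) {
            values.next();
            count++;
        }
        // String.valueOf(Object) renders a null key as "null".
        return String.valueOf(groupKey) + "=" + count;
    }

    // Plays the role of the framework: one call per unique group,
    // whether or not the group has any values.
    static List<String> run(Map<Object, List<String>> groups) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Object, List<String>> e : groups.entrySet()) {
            out.add(summarize(e.getKey(), e.getValue().iterator()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Object, List<String>> groups = new LinkedHashMap<>();
        groups.put("k1", Arrays.asList("a", "b"));
        groups.put("k2", Collections.<String>emptyList());
        groups.put(null, Collections.<String>emptyList());
        System.out.println(run(groups)); // prints [k1=2, k2=0, null=0]
    }
}
```

In a real Buffer the key would come from BufferCall.getGroup() rather than a method parameter. The point of the sketch is only that the empty-Iterator and null-key cases need an explicit decision.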

Buffer is very useful when header or footer values need to be inserted into a grouping, or if values need to be inserted into the middle of the group values. For example, consider a stream of timestamps. A Buffer could be used to add missing entries, or to calculate running or moving averages over a smaller "window" within the grouping.
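As a rough illustration of the moving-average case, the sketch below consumes one grouping through an Iterator and emits an average over the most recent values. It is plain Java with no Cascading classes: MovingAverageSketch, movingAverages, and the window parameter are hypothetical names, and a real Buffer would read from the arguments Iterator and write to an output collector instead of returning a List.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in: plain Java only, no Cascading classes.
final class MovingAverageSketch {

    // Walks the values of a single grouping once, in order, and emits one
    // average per input value, computed over the last `window` values seen.
    static List<Double> movingAverages(Iterator<Double> groupValues, int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be >= 1");
        }
        Deque<Double> recent = new ArrayDeque<>();
        List<Double> output = new ArrayList<>();
        double sum = 0.0;
        while (groupValues.hasNext()) {
            double value = groupValues.next();
            recent.addLast(value);
            sum += value;
            // Drop the oldest value once the window is exceeded.
            if (recent.size() > window) {
                sum -= recent.removeFirst();
            }
            output.add(sum / recent.size());
        }
        return output;
    }

    public static void main(String[] args) {
        List<Double> group = Arrays.asList(2.0, 4.0, 6.0, 8.0);
        System.out.println(movingAverages(group.iterator(), 2)); // prints [2.0, 3.0, 5.0, 7.0]
    }
}
```

All state lives in local variables, so the same method can be called for many groups, or from several threads, without one group's window leaking into another.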

There may be only one Buffer after a GroupBy or CoGroup, and there may not be any additional Every pipes before or after the Buffer's Every pipe instance. A PlannerException will be thrown if these rules are violated.

Buffer implementations should be re-entrant. There is no guarantee that a Buffer instance will be executed in a unique VM, or by a single thread. Also note that the Iterator will return the same TupleEntry instance on each call, but with new values in its child Tuple.

Field Summary
Fields inherited from interface cascading.operation.Operation
Method Summary
 void operate(FlowProcess flowProcess, BufferCall<C> bufferCall)
          Method operate is called once for each grouping.
Methods inherited from interface cascading.operation.Operation
cleanup, getFieldDeclaration, getNumArgs, isSafe, prepare

Method Detail


void operate(FlowProcess flowProcess,
             BufferCall<C> bufferCall)
Method operate is called once for each grouping. BufferCall passes in an Iterator that returns an argument TupleEntry for each value in the grouping defined by the argument selector on the parent Every pipe instance.

Neither the TupleEntry entry nor entry.getTuple() should be stored directly in a collection or modified. Instead, make a copy of the tuple via the copy constructor, new Tuple( entry.getTuple() ).
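Why the copy matters can be shown with a small sketch in plain Java. The classes below are hypothetical stand-ins, not Cascading types: Entry imitates an entry object that the Iterator reuses, and reusingIterator assumes the entry's values are overwritten in place on every call to next().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins: plain Java only, no Cascading classes.
final class ReusedEntrySketch {

    // Mutable holder standing in for the entry handed out by the Iterator.
    static final class Entry {
        final List<Object> values = new ArrayList<>();
    }

    // Returns an Iterator that hands back the SAME Entry on every next(),
    // overwriting its values in place with the next row.
    static Iterator<Entry> reusingIterator(List<List<Object>> rows) {
        final Iterator<List<Object>> source = rows.iterator();
        final Entry shared = new Entry();
        return new Iterator<Entry>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public Entry next() {
                shared.values.clear();
                shared.values.addAll(source.next());
                return shared;
            }
        };
    }

    // Wrong: stores references to the shared, mutable value list.
    static List<List<Object>> collectWithoutCopy(Iterator<Entry> it) {
        List<List<Object>> kept = new ArrayList<>();
        while (it.hasNext()) {
            kept.add(it.next().values);
        }
        return kept;
    }

    // Right: copies the values before storing them, analogous to
    // new Tuple( entry.getTuple() ).
    static List<List<Object>> collectWithCopy(Iterator<Entry> it) {
        List<List<Object>> kept = new ArrayList<>();
        while (it.hasNext()) {
            kept.add(new ArrayList<>(it.next().values));
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
            Arrays.<Object>asList("a", 1),
            Arrays.<Object>asList("b", 2));
        System.out.println(collectWithoutCopy(reusingIterator(rows))); // prints [[b, 2], [b, 2]]
        System.out.println(collectWithCopy(reusingIterator(rows)));    // prints [[a, 1], [b, 2]]
    }
}
```

Without the copy, every stored element aliases the one shared object, so the collection ends up holding only the last row's values repeated.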

This method is called for every unique group, whether or not there are values in the arguments Iterator.

Parameters:
flowProcess - of type FlowProcess
bufferCall - of type BufferCall

Copyright © 2007-2010 Concurrent, Inc. All Rights Reserved.