As can be seen above, the Each and Every Pipe classes provide a means to merge input Tuple values with Operation result Tuple values to create a final output Tuple, which is used as the input to the next Pipe instance. This merging is performed through a type of "field algebra", and can get rather complicated when factoring in Fields sets, a kind of wildcard for specifying certain field values.
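The merge described above can be modeled without any Cascading classes. The following is a minimal sketch in plain Java, not the Cascading API: the class MergeSketch and its merge method are hypothetical names. It assumes that the result values are appended to the input values and that an explicit list of field names then selects the output.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not part of Cascading:
// a tuple is represented as an ordered map of field name -> value.
final class MergeSketch {

    // Append the result entries to the input entries, then keep only the
    // selected fields, in the order the output selector lists them.
    static Map<String, Object> merge(Map<String, Object> input,
                                     Map<String, Object> result,
                                     List<String> outputSelector) {
        Map<String, Object> appended = new LinkedHashMap<>(input);
        for (Map.Entry<String, Object> entry : result.entrySet()) {
            // Assumption of this model: field names must stay unique.
            if (appended.containsKey(entry.getKey())) {
                throw new IllegalArgumentException("duplicate field name: " + entry.getKey());
            }
            appended.put(entry.getKey(), entry.getValue());
        }

        Map<String, Object> output = new LinkedHashMap<>();
        for (String name : outputSelector) {
            if (!appended.containsKey(name)) {
                throw new IllegalArgumentException("unknown field name: " + name);
            }
            output.put(name, appended.get(name));
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("ip", "10.0.0.1");
        input.put("line", "GET /index.html");

        Map<String, Object> result = new LinkedHashMap<>();
        result.put("method", "GET");
        result.put("path", "/index.html");

        // Keep one input field and both result fields; "line" is dropped.
        System.out.println(merge(input, result, List.of("ip", "method", "path")));
    }
}
```

The ordered map stands in for a Tuple and its Fields together, which keeps the sketch short. In Cascading itself, field names and values are held separately.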
Fields sets are constant values on the Fields class and can be used in many places where the Fields class is expected. They are:
The cascading.tuple.Fields.ALL constant is a "wildcard" that represents all the currently available fields.
The cascading.tuple.Fields.RESULTS constant is used to represent the field names of the current Operation's return values. This Fields set may only be used as an output selector on a Pipe, where it replaces the input Tuple with the Operation result Tuple in the stream.
The cascading.tuple.Fields.REPLACE constant is used as an output selector to inline-replace values in the incoming Tuple with the results of an Operation. This is a convenience Fields set that allows subsequent Operations to 'step' on the value with a given field name. The current Operation must always use the exact same field names as its arguments, or the ARGS Fields set.
The cascading.tuple.Fields.SWAP constant is used as an output selector to swap out Operation arguments with the Operation's results. Neither the field names nor the sizes of the arguments and results need to be the same. This is useful when the Operation arguments are no longer necessary and the result Fields and values should be appended to the remainder of the input field names and Tuple.
The cascading.tuple.Fields.ARGS constant is used to let a given Operation inherit the field names of its argument Tuple. This Fields set is a convenience and is typically used when the Pipe output selector is RESULTS or REPLACE. It is specifically used by the Identity Function when coercing values from Strings to primitive types.
The cascading.tuple.Fields.GROUP constant represents all the fields used as grouping values in a previous Group. If there is no previous Group in the pipe assembly, GROUP represents all the current field names.
The cascading.tuple.Fields.VALUES constant represents all the fields not used as grouping fields in a previous Group.
The cascading.tuple.Fields.UNKNOWN constant is used when Fields must be declared, but their number and names are unknown. This allows for arbitrary-length Tuples from an input source or some Operation. Use this Fields set with caution.
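The effect of the output-selector Fields sets on field names can be illustrated with a small model. This is a sketch in plain Java rather than the Cascading API: the class FieldsSetSketch, its Selector enum, and the outputFields method are hypothetical names. It covers only ALL, RESULTS, REPLACE, and SWAP, and it assumes that ALL, when used as an output selector, yields the input fields followed by the result fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of Cascading: computes the output field names
// produced by four of the Fields sets when they act as an output selector.
final class FieldsSetSketch {

    enum Selector { ALL, RESULTS, REPLACE, SWAP }

    static List<String> outputFields(List<String> input,
                                     List<String> arguments,
                                     List<String> results,
                                     Selector selector) {
        switch (selector) {
            case ALL: {
                // Assumption: input fields followed by the result fields.
                List<String> out = new ArrayList<>(input);
                out.addAll(results);
                return out;
            }
            case RESULTS:
                // The result fields replace the input fields entirely.
                return new ArrayList<>(results);
            case REPLACE:
                // Values are replaced in place, so the field names are unchanged.
                // The result names must match the argument names exactly.
                if (!results.equals(arguments)) {
                    throw new IllegalArgumentException(
                        "REPLACE needs result fields equal to argument fields");
                }
                return new ArrayList<>(input);
            case SWAP: {
                // Drop the argument fields, then append the result fields
                // to the remainder of the input fields.
                List<String> out = new ArrayList<>(input);
                out.removeAll(arguments);
                out.addAll(results);
                return out;
            }
            default:
                throw new AssertionError(selector);
        }
    }

    public static void main(String[] args) {
        List<String> input = List.of("ip", "line");
        List<String> arguments = List.of("line");
        List<String> results = List.of("method", "path");

        for (Selector selector : List.of(Selector.ALL, Selector.RESULTS, Selector.SWAP)) {
            System.out.println(selector + " -> "
                + outputFields(input, arguments, results, selector));
        }
        // For REPLACE the Operation declares the same names as its arguments.
        System.out.println(Selector.REPLACE + " -> "
            + outputFields(input, arguments, arguments, Selector.REPLACE));
    }
}
```

The sketch tracks field names only. ARGS, GROUP, VALUES, and UNKNOWN are left out because they depend on the Operation declaration or on a preceding Group rather than on the output selector alone.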
Below is a reference chart showing common ways to merge input and result fields for the desired output fields. See the section on Each and Every Pipes for details on the different columns and their relationships to the Each and Every Pipes and Functions, Aggregators, and Buffers.
Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.