The cascading.pipe.assembly.Unique SubAssembly is used to remove duplicate values in a Tuple stream. Uniqueness is determined by the values of all fields listed in uniqueFields. Thus, to find all distinct Tuples in a Tuple stream, use Fields.ALL as the uniqueFields argument.
Example 10.10. Using Unique
// incoming -> first, last
assembly = new Unique( assembly, new Fields( "first", "last" ) );
// outgoing -> first, last
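The effect of the uniqueFields argument can be illustrated without Cascading at all. The following is a minimal plain-Java sketch, not the Cascading API: the class UniqueSketch, its unique method, and the use of column indexes in place of field names are all hypothetical and exist only for illustration. It also assumes that the first row seen for each key is the one kept, which is a simplification; in a distributed run, which duplicate survives may not be defined.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not part of Cascading.
class UniqueSketch {
    // Keeps the first row seen for each distinct combination of the values
    // found at keyIndexes. Passing no indexes compares whole rows, which
    // mirrors the role Fields.ALL plays as the uniqueFields argument.
    static List<List<String>> unique(List<List<String>> rows, int... keyIndexes) {
        Map<List<String>, List<String>> firstByKey = new LinkedHashMap<>();
        for (List<String> row : rows) {
            List<String> key;
            if (keyIndexes.length == 0) {
                key = row; // the whole row is the key
            } else {
                key = new ArrayList<>();
                for (int i : keyIndexes) {
                    key.add(row.get(i)); // only the selected columns form the key
                }
            }
            firstByKey.putIfAbsent(key, row); // later duplicates are dropped
        }
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("ann", "lee"),
            Arrays.asList("ann", "lee"),
            Arrays.asList("ann", "kim"));
        System.out.println(unique(rows));    // prints [[ann, lee], [ann, kim]]
        System.out.println(unique(rows, 0)); // prints [[ann, lee]]
    }
}
```

Comparing whole rows removes only the exact duplicate, while keying on the first column alone also collapses rows that differ in the second column. The same distinction applies when choosing between Fields.ALL and a narrower uniqueFields selector.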
As of Cascading 2.2, Unique uses the FirstNBuffer to determine unique values more efficiently.
Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.