10.6 Unique

The cascading.pipe.assembly.Unique SubAssembly removes duplicate Tuples from a Tuple stream. Uniqueness is determined by the values of the fields listed in uniqueFields, so two Tuples count as duplicates when they hold the same values in all of those fields. To find all distinct Tuples in a Tuple stream, pass Fields.ALL as the uniqueFields argument.

Example 10.10. Using Unique

// incoming -> first, last

assembly = new Unique( assembly, new Fields( "first", "last" ) );

// outgoing -> first, last
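
To make the role of uniqueFields concrete, the following plain-Java sketch mimics the uniqueness rule on an in-memory list. It does not use Cascading and is not how Unique is implemented. The class name UniqueSketch, the unique() helper, and the use of column positions in place of field names are all hypothetical, and keeping the first row seen for each key is an assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Cascading API.
class UniqueSketch {

    // Returns one row per distinct combination of values found at
    // uniquePositions. Keeping the first row seen for each key is an
    // assumption of this sketch.
    static List<List<String>> unique(List<List<String>> rows, int... uniquePositions) {
        // LinkedHashMap preserves the order in which keys were first seen.
        Map<List<String>, List<String>> firstByKey = new LinkedHashMap<>();
        for (List<String> row : rows) {
            // Build the comparison key from the selected positions only.
            List<String> key = new ArrayList<>();
            for (int p : uniquePositions) {
                key.add(row.get(p));
            }
            firstByKey.putIfAbsent(key, row);
        }
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        // Rows with two columns: position 0 = first, position 1 = last.
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("john", "doe"),
            Arrays.asList("jane", "doe"),
            Arrays.asList("john", "doe"));

        // Unique on both columns: the repeated (john, doe) row is dropped.
        System.out.println(unique(rows, 0, 1)); // [[john, doe], [jane, doe]]

        // Unique on "last" only: every row shares the key "doe".
        System.out.println(unique(rows, 1));    // [[john, doe]]
    }
}
```

Passing every position to unique() corresponds to the Fields.ALL case described above: only rows that match in every column are treated as duplicates.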

As of Cascading 2.2, Unique uses the FirstNBuffer to determine unique values more efficiently.

