Frequently, objects in one Tuple are
compared to objects in a second
Tuple. This is
especially true during the sort phase of
CoGroup in Cascading Hadoop mode. By
default, Hadoop and Cascading use the native Object methods
equals() and hashCode() to compare two values and to get a
consistent hash code for a given value, respectively.
To override this default behavior, you can create a custom
java.util.Comparator class to perform comparisons
on a given field in a Tuple. For instance, to secondary-sort a
collection of custom
Person objects in a
GroupBy, use the
Fields.setComparator() method to assign the custom
Comparator to the Fields
instance that specifies the sort fields.
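
As a minimal sketch of such a comparator, the following uses only the JDK so that it runs without Cascading on the classpath. The Person class, its lastName and age members, and the field names "person" and "group" are hypothetical, since the guide does not define them. The comparator also implements Serializable, on the assumption that it must be shipped to cluster tasks together with the Fields instance. The trailing comment shows roughly how it might be attached with Fields.setComparator().

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical value type; the guide names Person but does not define it.
class Person implements Serializable {
    final String lastName;
    final int age;

    Person(String lastName, int age) {
        this.lastName = lastName;
        this.age = age;
    }
}

// Orders Person values by last name (ignoring case), then by age.
// Serializable is an assumption: the comparator travels with the Fields instance.
class PersonComparator implements Comparator<Person>, Serializable {
    @Override
    public int compare(Person a, Person b) {
        int byName = a.lastName.compareToIgnoreCase(b.lastName);
        return byName != 0 ? byName : Integer.compare(a.age, b.age);
    }
}

class PersonComparatorDemo {
    public static void main(String[] args) {
        List<Person> people = new ArrayList<>(Arrays.asList(
            new Person("smith", 40),
            new Person("Jones", 31),
            new Person("Smith", 25)));

        // Sort locally with the same comparator a GroupBy would use for secondary sorting.
        people.sort(new PersonComparator());
        for (Person p : people) {
            System.out.println(p.lastName + " " + p.age);
        }

        // With Cascading available, the comparator would be attached roughly like this
        // (field names are illustrative):
        //   Fields sortFields = new Fields("person");
        //   sortFields.setComparator("person", new PersonComparator());
        //   Pipe sorted = new GroupBy(pipe, new Fields("group"), sortFields);
    }
}
```

Keeping the comparison logic in a standalone class makes it easy to unit test outside of a Flow before wiring it into the sort fields.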
Alternatively, you can set a default
Comparator to be used by a
Flow, or used locally on a given
Pipe instance. There are two ways to do this: call
FlowProps.setDefaultTupleElementComparator() on a
Properties instance, or set the equivalent property key directly.
If the hash code must also be customized, the custom Comparator
can implement the interface
cascading.tuple.Hasher. For more information, see the Javadoc for that interface.
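
A sketch of a comparator that also customizes hashing might look like the following. To keep it runnable without Cascading on the classpath, it declares a local stand-in interface named Hasher with a single hashCode(value) method, which is assumed to mirror cascading.tuple.Hasher; in a real project the Cascading interface would be imported instead. The class name CaseInsensitiveComparator is illustrative only.

```java
import java.io.Serializable;
import java.util.Comparator;
import java.util.Locale;

// Local stand-in assumed to mirror cascading.tuple.Hasher, so this sketch
// compiles on its own. Replace with the Cascading interface in real code.
interface Hasher<V> {
    int hashCode(V value);
}

// Treats strings that differ only by case as equal and hashes them the same way,
// so values the comparator considers equal also produce equal hash codes.
class CaseInsensitiveComparator
        implements Comparator<String>, Hasher<String>, Serializable {

    @Override
    public int compare(String a, String b) {
        return a.toLowerCase(Locale.ROOT).compareTo(b.toLowerCase(Locale.ROOT));
    }

    @Override
    public int hashCode(String value) {
        // Hash the same normalized form that compare() uses.
        return value.toLowerCase(Locale.ROOT).hashCode();
    }
}

class HasherDemo {
    public static void main(String[] args) {
        CaseInsensitiveComparator c = new CaseInsensitiveComparator();
        System.out.println(c.compare("Alice", "ALICE"));
        System.out.println(c.hashCode("Alice") == c.hashCode("ALICE"));
    }
}
```

The design point is that comparison and hashing are derived from the same normalized value. If they disagreed, two values the comparator treats as equal could be hashed differently and might not be grouped together.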
Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.