Frequently, objects in one Tuple are compared to objects in a second Tuple. This is especially true during the sort phase of GroupBy and CoGroup in Cascading Hadoop mode. By default, Hadoop and Cascading use the native Object methods equals() and hashCode() to compare two values and to get a consistent hash code for a given value, respectively.

To override this default behavior, you can create a custom java.util.Comparator class to perform comparisons on a given field in a Tuple. For instance, to secondary-sort a collection of custom Person objects in a GroupBy, use the Fields.setComparator() method to assign the custom Comparator to the Fields instance that specifies the sort fields.
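
As a minimal sketch of such a Comparator, the following uses only the Java standard library so that it compiles on its own. The Person class, its lastName and age members, and the field names "person" and "department" in the trailing comment are hypothetical; the guide mentions Person but does not define it. The comparator is marked Serializable on the assumption that it is serialized along with the Fields instance that carries it.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical value class; the guide names Person but does not define it.
class Person implements Serializable {
    final String lastName;
    final int age;

    Person(String lastName, int age) {
        this.lastName = lastName;
        this.age = age;
    }
}

// Orders Person values by last name, then by age for equal names.
// Serializable on the assumption that it travels with the Fields instance.
class PersonComparator implements Comparator<Person>, Serializable {
    @Override
    public int compare(Person a, Person b) {
        int byName = a.lastName.compareTo(b.lastName);
        return byName != 0 ? byName : Integer.compare(a.age, b.age);
    }
}

class PersonSortDemo {
    // Sorts a copy of the input to show the ordering the comparator defines.
    static List<Person> sorted(List<Person> people) {
        List<Person> copy = new ArrayList<>(people);
        copy.sort(new PersonComparator());
        return copy;
    }

    // With Cascading on the classpath, the wiring might look roughly like this
    // (field names are illustrative, not taken from the guide):
    //   Fields sortFields = new Fields("person");
    //   sortFields.setComparator("person", new PersonComparator());
    //   pipe = new GroupBy(pipe, new Fields("department"), sortFields);
}
```

The in-memory sort only demonstrates the ordering; in a Flow, the same comparator would be applied to the sort field within each grouping.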

Alternatively, you can set a default Comparator to be used by a Flow, or used locally on a given Pipe instance. There are two ways to do this: call FlowProps.setDefaultTupleElementComparator() on a Properties instance, or set the property key cascading.flow.tuple.element.comparator.
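
The property-key route can be sketched with plain java.util.Properties. The comparator class below is hypothetical, and the sketch assumes the property value is the fully qualified class name of the Comparator. Because a default comparator may see any element type, the example accepts Object and falls back to natural ordering for non-String values; it does not handle nulls.

```java
import java.io.Serializable;
import java.util.Comparator;
import java.util.Properties;

// Hypothetical default comparator: case-insensitive for Strings,
// natural ordering for any other Comparable value. Nulls are not handled.
class CaseInsensitiveDefaultComparator implements Comparator<Object>, Serializable {
    @Override
    @SuppressWarnings("unchecked")
    public int compare(Object a, Object b) {
        if (a instanceof String && b instanceof String) {
            return ((String) a).compareToIgnoreCase((String) b);
        }
        return ((Comparable<Object>) a).compareTo(b);
    }
}

class DefaultComparatorConfig {
    // Property key quoted in the text above.
    static final String KEY = "cascading.flow.tuple.element.comparator";

    static Properties build() {
        Properties properties = new Properties();
        // Assumption: the value is the comparator's class name.
        properties.setProperty(KEY, CaseInsensitiveDefaultComparator.class.getName());
        // With Cascading on the classpath, the helper named in the text could be
        // used instead (argument list assumed here):
        //   FlowProps.setDefaultTupleElementComparator(
        //       properties, CaseInsensitiveDefaultComparator.class.getName());
        return properties;
    }
}
```

The resulting Properties instance would then be passed to whatever creates the Flow, so the setting applies to every tuple element that has no field-specific Comparator.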

If the hash code must also be customized, the custom Comparator can implement the interface cascading.tuple.Hasher. For more information, see the Javadoc.
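
To illustrate why the hash code may need customizing: if a Comparator treats "Bob" and "bob" as equal, the two values should also hash identically, or they may be routed to different groups. The sketch below declares a local Hasher interface as a stand-in so the example compiles without Cascading; its single hashCode(T) method is an assumption about the shape of cascading.tuple.Hasher, so check the Javadoc before relying on it. The comparator class name is hypothetical.

```java
import java.io.Serializable;
import java.util.Comparator;
import java.util.Locale;

// Local stand-in for cascading.tuple.Hasher (shape assumed);
// in a real project, import the Cascading interface instead.
interface Hasher<T> {
    int hashCode(T value);
}

// Hypothetical comparator that ignores case for both ordering and hashing.
class CaseInsensitiveStringComparator
        implements Comparator<String>, Hasher<String>, Serializable {

    // Single normalization step shared by compare() and hashCode(),
    // so the two can never disagree about which values are equal.
    private static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    @Override
    public int compare(String a, String b) {
        return normalize(a).compareTo(normalize(b));
    }

    // Values that compare as equal produce the same hash code.
    @Override
    public int hashCode(String value) {
        return normalize(value).hashCode();
    }
}
```

Routing both methods through one normalization function is a design choice that keeps the comparison and the hash consistent with each other.
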
Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.