cascading.pipe.assembly
Class Unique.FilterPartialDuplicates
java.lang.Object
cascading.operation.BaseOperation<LinkedHashMap<Tuple,Object>>
cascading.pipe.assembly.Unique.FilterPartialDuplicates
- All Implemented Interfaces:
- Filter<LinkedHashMap<Tuple,Object>>, Operation<LinkedHashMap<Tuple,Object>>, Serializable
- Enclosing class:
- Unique
public static class Unique.FilterPartialDuplicates
- extends BaseOperation<LinkedHashMap<Tuple,Object>>
- implements Filter<LinkedHashMap<Tuple,Object>>
Class FilterPartialDuplicates is a Filter that is used to remove observed duplicates from the tuple stream.
Use this class typically in tandem with a First Aggregator in order to improve de-duping performance by removing as many values as possible before the intermediate GroupBy operator.
The threshold value is used to maintain an LRU cache of a constant size. If more than threshold unique values are seen, the oldest cached values will be removed from the cache. A duplicate of an evicted value can therefore pass through this filter, which is why de-duplication here is only partial.
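The caching behaviour described above can be illustrated with plain java.util classes. The following is a minimal sketch, not the Cascading implementation: the class name PartialDuplicateSketch and the helpers boundedCache, isRemove, and filter are hypothetical, and String values stand in for Tuple instances. It assumes the cache evicts its oldest entry once more than threshold values are held.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialDuplicateSketch {

    // Hypothetical helper: a cache that holds at most `threshold` keys and
    // evicts the oldest inserted key when that limit is exceeded.
    static <K> Map<K, Object> boundedCache(final int threshold) {
        return new LinkedHashMap<K, Object>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Object> eldest) {
                return size() > threshold;
            }
        };
    }

    // Mirrors the Filter contract: true means the value should be removed
    // from the stream because it is still present in the cache.
    static <K> boolean isRemove(Map<K, Object> cache, K value) {
        if (cache.containsKey(value)) {
            return true;
        }
        cache.put(value, null); // remember the value; may evict the oldest one
        return false;
    }

    // Applies the filter to a list of values and returns the survivors.
    static List<String> filter(List<String> input, int threshold) {
        Map<String, Object> cache = boundedCache(threshold);
        List<String> kept = new ArrayList<String>();
        for (String value : input) {
            if (!isRemove(cache, value)) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "c", "a", "b");
        // With threshold 2, "a" is evicted when "c" arrives, so later
        // duplicates pass through.
        System.out.println(filter(input, 2)); // prints [a, b, c, a, b]
        // With threshold 3, every value stays cached and all duplicates are removed.
        System.out.println(filter(input, 3)); // prints [a, b, c]
    }
}
```

The first call shows why a complete de-duplication step is still needed downstream: once the number of distinct values exceeds the threshold, some duplicates survive the filter.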
- See Also:
Unique, Serialized Form
Fields inherited from interface cascading.operation.Operation
- ANY
Unique.FilterPartialDuplicates
public Unique.FilterPartialDuplicates()
- Constructor FilterPartialDuplicates creates a new FilterPartialDuplicates instance.
Unique.FilterPartialDuplicates
@ConstructorProperties(value="threshold")
public Unique.FilterPartialDuplicates(int threshold)
- Constructor FilterPartialDuplicates creates a new FilterPartialDuplicates instance.
- Parameters:
threshold
- of type int; the maximum number of unique values kept in the LRU cache
prepare
public void prepare(FlowProcess flowProcess,
OperationCall<LinkedHashMap<Tuple,Object>> operationCall)
- Description copied from class:
BaseOperation
- Method prepare does nothing, and may safely be overridden.
- Specified by:
prepare
in interface Operation<LinkedHashMap<Tuple,Object>>
- Overrides:
prepare
in class BaseOperation<LinkedHashMap<Tuple,Object>>
isRemove
public boolean isRemove(FlowProcess flowProcess,
FilterCall<LinkedHashMap<Tuple,Object>> filterCall)
- Description copied from interface:
Filter
- Method isRemove returns true if input should be removed from the tuple stream.
- Specified by:
isRemove
in interface Filter<LinkedHashMap<Tuple,Object>>
- Parameters:
flowProcess
- of type FlowProcess
filterCall
- of type FilterCall
- Returns:
- boolean
cleanup
public void cleanup(FlowProcess flowProcess,
OperationCall<LinkedHashMap<Tuple,Object>> operationCall)
- Description copied from class:
BaseOperation
- Method cleanup does nothing, and may safely be overridden.
- Specified by:
cleanup
in interface Operation<LinkedHashMap<Tuple,Object>>
- Overrides:
cleanup
in class BaseOperation<LinkedHashMap<Tuple,Object>>
equals
public boolean equals(Object object)
- Overrides:
equals
in class BaseOperation<LinkedHashMap<Tuple,Object>>
hashCode
public int hashCode()
- Overrides:
hashCode
in class BaseOperation<LinkedHashMap<Tuple,Object>>
Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.