cascading.tuple.collect
Interface TupleMapFactory<Config>
- All Superinterfaces:
- CascadingFactory<Config,Map<Tuple,Collection<Tuple>>>
public interface TupleMapFactory<Config>
- extends CascadingFactory<Config,Map<Tuple,Collection<Tuple>>>
Interface TupleMapFactory allows developers to plug in alternative implementations of a "tuple map"
used to back in-memory "join" and "co-group" operations. Typically these implementations are
"spillable": to avoid exhausting JVM memory, values are persisted to disk once some threshold is met
or an event is triggered.
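The spilling behavior described above can be illustrated with a minimal sketch. It is not Cascading's SpillableTupleList or SpillableTupleMap; the class SpillingListSketch, its methods, and the use of String values in place of Tuples are all hypothetical, and the only trigger it models is a simple element-count threshold.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: holds values in memory until a count
// threshold is reached, then writes them to a temporary file and clears memory.
class SpillingListSketch {
    private final int threshold;
    private final List<String> inMemory = new ArrayList<>();
    private final List<File> spillFiles = new ArrayList<>();

    SpillingListSketch(int threshold) {
        this.threshold = threshold;
    }

    void add(String value) {
        inMemory.add(value);
        if (inMemory.size() >= threshold) {
            spill();
        }
    }

    // Persist the current in-memory values to a new temp file, then clear them.
    private void spill() {
        try {
            File file = File.createTempFile("spill", ".bin");
            file.deleteOnExit();
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(file)))) {
                out.writeInt(inMemory.size());
                for (String value : inMemory) {
                    out.writeUTF(value);
                }
            }
            spillFiles.add(file);
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read back spilled values in insertion order, followed by in-memory values.
    List<String> readAll() {
        List<String> result = new ArrayList<>();
        for (File file : spillFiles) {
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(file)))) {
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    result.add(in.readUTF());
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        result.addAll(inMemory);
        return result;
    }

    int spillCount() {
        return spillFiles.size();
    }

    int inMemoryCount() {
        return inMemory.size();
    }

    public static void main(String[] args) {
        SpillingListSketch list = new SpillingListSketch(2);
        list.add("a");
        list.add("b");
        list.add("c");
        System.out.println(list.spillCount() + " spill(s), values: " + list.readAll());
    }
}
```

A real implementation would likely also react to memory pressure or other events rather than a fixed count alone, as the description above suggests.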
The Map classes returned must take a Tuple as a key and a Collection of Tuples as a value.
Further, Map.get(Object) must never return null; instead, the first call to get() for a given key
must create and store an empty Collection. That is, Map.put(Object, Object) is never called on the
map instance internally, only map.get(groupTuple).add(valuesTuple).
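The get() contract above can be sketched with plain java.util types. This is only an illustration under stated assumptions: the class AutoCreatingMap is hypothetical, it does not spill, and a List<Object> stands in for a Tuple so the example compiles without Cascading on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;

// Hypothetical illustration only: a map whose get() never returns null.
// A real tuple map would use cascading.tuple.Tuple for K and V.
class AutoCreatingMap<K, V> extends HashMap<K, Collection<V>> {
    @Override
    @SuppressWarnings("unchecked")
    public Collection<V> get(Object key) {
        Collection<V> values = super.get(key);
        if (values == null) {
            // First access for this key: create an empty collection and store it,
            // so callers never need to call put() themselves.
            values = new ArrayList<>();
            super.put((K) key, values);
        }
        return values;
    }

    public static void main(String[] args) {
        AutoCreatingMap<List<Object>, List<Object>> map = new AutoCreatingMap<>();
        List<Object> groupTuple = Arrays.asList("user-1");
        List<Object> valuesTuple = Arrays.asList("user-1", 42);

        // Mirrors the internal usage pattern: get(...).add(...), never put(...).
        map.get(groupTuple).add(valuesTuple);
        System.out.println(map.get(groupTuple));
    }
}
```

Because get() stores the collection it creates, repeated calls with an equal key return the same collection, which is what makes the map.get(groupTuple).add(valuesTuple) pattern accumulate values per group.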
Using the TupleCollectionFactory to create the underlying Tuple Collections would allow that aspect
to be pluggable as well.
If the Map implementation implements the Spillable interface, it will receive a
Spillable.SpillListener instance that calls back to the appropriate logging mechanism for the
platform. This instance should be passed down to any child Spillable types, namely an
implementation of SpillableTupleList.
The default implementation for the Hadoop platform is the
cascading.tuple.hadoop.collect.HadoopTupleMapFactory, which creates a
cascading.tuple.hadoop.collect.HadoopSpillableTupleMap instance.
The class SpillableTupleMap may be used as a base class.
- See Also:
- SpillableTupleMap, cascading.tuple.hadoop.collect.HadoopTupleMapFactory, TupleCollectionFactory, cascading.tuple.hadoop.collect.HadoopTupleCollectionFactory
TUPLE_MAP_FACTORY
static final String TUPLE_MAP_FACTORY
- See Also:
- Constant Field Values
Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.