cascading.tuple.hadoop.collect
Class HadoopSpillableTupleList

java.lang.Object
  extended by cascading.tuple.collect.SpillableTupleList
      extended by cascading.tuple.hadoop.collect.HadoopSpillableTupleList
All Implemented Interfaces:
Spillable, Iterable<Tuple>, Collection<Tuple>

public class HadoopSpillableTupleList
extends SpillableTupleList

SpillableTupleList is a simple Iterable object that can store an unlimited number of Tuple instances by spilling excess to a temporary disk file.
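
The spilling behaviour described above can be illustrated with a small, self-contained sketch. It is not the Cascading implementation: the class name SpillSketch, the use of String values instead of Tuple instances, and the plain-text spill file format are all assumptions made for illustration. It only shows the general idea of keeping at most a threshold number of entries in memory and writing each full batch to a temporary file.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Cascading SpillableTupleList code.
class SpillSketch {
    private final int threshold;                             // entries held in memory before a spill
    private final List<String> current = new ArrayList<>();  // in-memory batch
    private final List<File> spills = new ArrayList<>();     // one temporary file per spill
    private int size;                                        // total entries, spilled or not

    SpillSketch(int threshold) {
        this.threshold = threshold;
    }

    // Adds a value; values must not contain line breaks in this simplified format.
    void add(String value) {
        current.add(value);
        size++;
        if (current.size() >= threshold) {
            spill();
        }
    }

    // Writes the in-memory batch to a temporary file and clears it.
    private void spill() {
        try {
            File file = File.createTempFile("spill-sketch", ".txt");
            file.deleteOnExit();
            Files.write(file.toPath(), current, StandardCharsets.UTF_8);
            spills.add(file);
            current.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int size() {
        return size;
    }

    int spillCount() {
        return spills.size();
    }

    // Reads back spilled batches in order, followed by the in-memory batch.
    List<String> readAll() {
        List<String> all = new ArrayList<>();
        try {
            for (File file : spills) {
                all.addAll(Files.readAllLines(file.toPath(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        all.addAll(current);
        return all;
    }
}
```

In this sketch, insertion order is preserved because the spill files are read back in the order they were written, before the entries still held in memory.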

Spills will automatically be compressed using the defaultCodecs values. To disable compression or change the codecs, see SpillableProps.SPILL_COMPRESS and SpillableProps.SPILL_CODECS.

It is recommended to add Lzo to the codec list if it is available, for example: "org.apache.hadoop.io.compress.LzoCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec"
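
As a rough illustration of how a comma-separated codec list might be configured and resolved, the sketch below uses only the JDK. The property keys "cascading.spill.compress" and "cascading.spill.codecs" are assumed values for SpillableProps.SPILL_COMPRESS and SpillableProps.SPILL_CODECS and should be checked against the constant field values. The "first loadable class wins" rule is one plausible reading of "the first available compression codec", not a confirmed description of getCodec. The class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical illustration only; not the Cascading getCodec implementation.
class CodecChoiceSketch {

    // Builds properties that enable spill compression with the given codec list.
    // Both keys are assumptions; prefer the SpillableProps constants in real code.
    static Properties spillProperties(String codecs) {
        Properties properties = new Properties();
        properties.setProperty("cascading.spill.compress", "true"); // assumed key
        properties.setProperty("cascading.spill.codecs", codecs);   // assumed key
        return properties;
    }

    // Returns the first class name in the comma-separated list that can be
    // loaded from the classpath, or null if none can.
    static String firstAvailable(String codecs) {
        if (codecs == null) {
            return null;
        }
        for (String name : codecs.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                // Load without initializing; only availability matters here.
                Class.forName(trimmed, false, CodecChoiceSketch.class.getClassLoader());
                return trimmed;
            } catch (ClassNotFoundException | LinkageError e) {
                // Not usable on this classpath; try the next entry.
            }
        }
        return null;
    }
}
```

Under this reading, listing LzoCodec first means it is used when present, and GzipCodec or DefaultCodec act as fallbacks when the Lzo classes are not installed.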


Nested Class Summary
 
Nested classes/interfaces inherited from interface cascading.tuple.collect.Spillable
Spillable.SpillListener, Spillable.SpillStrategy
 
Field Summary
static String defaultCodecs
           
 
Fields inherited from class cascading.tuple.collect.SpillableTupleList
SPILL_CODECS, SPILL_COMPRESS, SPILL_THRESHOLD
 
Constructor Summary
HadoopSpillableTupleList(int threshold, CompressionCodec codec, JobConf jobConf)
          Constructor HadoopSpillableTupleList creates a new HadoopSpillableTupleList instance using the given threshold value and the given compression codec, if any.
HadoopSpillableTupleList(int threshold, TupleSerialization tupleSerialization, CompressionCodec codec)
           
 
Method Summary
protected  TupleInputStream createTupleInputStream(File file)
           
protected  TupleOutputStream createTupleOutputStream(File file)
           
static CompressionCodec getCodec(FlowProcess flowProcess, String defaultCodecs)
           
 
Methods inherited from class cascading.tuple.collect.SpillableTupleList
add, addAll, clear, contains, containsAll, getCodecClass, getGrouping, getThreshold, isEmpty, iterator, remove, removeAll, retainAll, setGrouping, setSpillListener, setSpillStrategy, size, spillCount, toArray, toArray
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.util.Collection
equals, hashCode
 

Field Detail

defaultCodecs

public static final String defaultCodecs
See Also:
Constant Field Values
Constructor Detail

HadoopSpillableTupleList

public HadoopSpillableTupleList(int threshold,
                                CompressionCodec codec,
                                JobConf jobConf)
Constructor HadoopSpillableTupleList creates a new HadoopSpillableTupleList instance using the given threshold value and the given compression codec, if any.

Parameters:
threshold - of type int
codec - of type CompressionCodec
jobConf - of type JobConf

HadoopSpillableTupleList

public HadoopSpillableTupleList(int threshold,
                                TupleSerialization tupleSerialization,
                                CompressionCodec codec)
Method Detail

getCodec

public static CompressionCodec getCodec(FlowProcess flowProcess,
                                        String defaultCodecs)

createTupleOutputStream

protected TupleOutputStream createTupleOutputStream(File file)
Specified by:
createTupleOutputStream in class SpillableTupleList

createTupleInputStream

protected TupleInputStream createTupleInputStream(File file)
Specified by:
createTupleInputStream in class SpillableTupleList


Copyright © 2007-2014 Concurrent, Inc. All Rights Reserved.