java.lang.Object
  cascading.scheme.Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
    cascading.scheme.hadoop.SequenceFile
public class SequenceFile
A SequenceFile is a type of Scheme that represents a flat file consisting of binary key/value pairs. This is a space- and time-efficient way to store data.
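To make "a flat file consisting of binary key/value pairs" concrete, here is a minimal sketch that writes and reads such records using only the JDK. It does not use Hadoop or Cascading and does not reproduce Hadoop's actual SequenceFile layout; the class name `KeyValueFileSketch` and the (int key, UTF string value) record shape are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: NOT Hadoop's SequenceFile format, only the general
// idea of a flat stream of binary key/value records.
class KeyValueFileSketch {

    // Writes each entry as one binary record: a 4-byte int key followed by a UTF value.
    static byte[] write(Map<Integer, String> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<Integer, String> e : records.entrySet()) {
                out.writeInt(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads records back, in order, until the stream is exhausted.
    static Map<Integer, String> read(byte[] data) {
        Map<Integer, String> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int key = in.readInt();
                records.put(key, in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        Map<Integer, String> records = new LinkedHashMap<>();
        records.put(1, "first");
        records.put(2, "second");
        byte[] data = write(records);
        System.out.println(read(data).equals(records)); // prints true
    }
}
```

Because keys and values are stored in binary form rather than as delimited text, no parsing or escaping is needed when the records are read back.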
Constructor Summary | |
---|---|
protected | SequenceFile() Protected for use by TempDfs and other subclasses. |
| SequenceFile(Fields fields) Creates a new SequenceFile instance that stores the given field names. |
Method Summary | |
---|---|
void | sink(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SinkCall<Void,org.apache.hadoop.mapred.OutputCollector> sinkCall) Method sink writes out the given Tuple found on SinkCall.getOutgoingEntry() to the SinkCall.getOutput(). |
void | sinkConfInit(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, Tap<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector> tap, org.apache.hadoop.mapred.JobConf conf) Method sinkConfInit initializes this instance as a sink. |
boolean | source(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SourceCall<Object[],org.apache.hadoop.mapred.RecordReader> sourceCall) Method source will read a new "record" or value from SourceCall.getInput() and populate the available Tuple via SourceCall.getIncomingEntry() and return true on success or false if no more values are available. |
void | sourceCleanup(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SourceCall<Object[],org.apache.hadoop.mapred.RecordReader> sourceCall) Method sourceCleanup is used to destroy resources created by Scheme.sourcePrepare(cascading.flow.FlowProcess, SourceCall). |
void | sourceConfInit(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, Tap<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector> tap, org.apache.hadoop.mapred.JobConf conf) Method sourceConfInit initializes this instance as a source. |
void | sourcePrepare(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SourceCall<Object[],org.apache.hadoop.mapred.RecordReader> sourceCall) Method sourcePrepare is used to initialize resources needed during each call of Scheme.source(cascading.flow.FlowProcess, SourceCall). |
Methods inherited from class cascading.scheme.Scheme |
---|
equals, getNumSinkParts, getSinkFields, getSourceFields, getTrace, hashCode, isSink, isSource, isSymmetrical, presentSinkFields, presentSinkFieldsInternal, presentSourceFields, presentSourceFieldsInternal, retrieveSinkFields, retrieveSourceFields, setNumSinkParts, setSinkFields, setSourceFields, sinkCleanup, sinkPrepare, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
protected SequenceFile()
@ConstructorProperties(value="fields")
public SequenceFile(Fields fields)
Creates a new SequenceFile instance that stores the given field names.
Parameters:
fields - of type Fields
Method Detail |
---|
public void sourceConfInit(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, Tap<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector> tap, org.apache.hadoop.mapred.JobConf conf)
Description copied from class: Scheme
Method sourceConfInit initializes this instance as a source. See Scheme.sourcePrepare(cascading.flow.FlowProcess, SourceCall) if resources must be initialized before use, and Scheme.sourceCleanup(cascading.flow.FlowProcess, SourceCall) if resources must be destroyed after use.
Specified by: sourceConfInit in class Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
Parameters:
flowProcess - of type FlowProcess
tap - of type Tap
conf - of type Config
public void sinkConfInit(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, Tap<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector> tap, org.apache.hadoop.mapred.JobConf conf)
Description copied from class: Scheme
Method sinkConfInit initializes this instance as a sink. See Scheme.sinkPrepare(cascading.flow.FlowProcess, SinkCall) if resources must be initialized before use, and Scheme.sinkCleanup(cascading.flow.FlowProcess, SinkCall) if resources must be destroyed after use.
Specified by: sinkConfInit in class Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
Parameters:
flowProcess - of type FlowProcess
tap - of type Tap
conf - of type Config
public void sourcePrepare(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SourceCall<Object[],org.apache.hadoop.mapred.RecordReader> sourceCall)
Description copied from class: Scheme
Method sourcePrepare is used to initialize resources needed during each call of Scheme.source(cascading.flow.FlowProcess, SourceCall).
Be sure to place any initialized objects in the SourceContext so each instance will remain threadsafe.
Overrides: sourcePrepare in class Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
Parameters:
flowProcess - of type FlowProcess
sourceCall - of type SourceCall
public boolean source(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SourceCall<Object[],org.apache.hadoop.mapred.RecordReader> sourceCall) throws IOException
Description copied from class: Scheme
Method source will read a new "record" or value from SourceCall.getInput() and populate the available Tuple via SourceCall.getIncomingEntry() and return true on success or false if no more values are available.
It's ok to set a new Tuple instance on the incomingEntry TupleEntry, or to simply re-use the existing instance.
Note this is the only time it is safe to modify a Tuple instance handed over via a method call.
This method may optionally throw a TapException if it cannot process a particular instance of data. If the payload Tuple is set on the TapException, that Tuple will be written to any applicable failure trap Tap.
Specified by: source in class Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
Parameters:
flowProcess - of type FlowProcess
sourceCall - of type SourceCall
Returns: true when a Tuple was successfully read
Throws: IOException
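The true/false contract described for source can be sketched with plain JDK types. This is not the Cascading API: `SourceContractSketch`, `Entry`, and the `Iterator` standing in for the input are hypothetical names chosen for illustration, and the real method takes a FlowProcess and a SourceCall instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins; none of these types are Cascading classes.
class SourceContractSketch {

    // Plays the role of the mutable entry that a source call hands to the scheme.
    static final class Entry {
        Object[] values;
    }

    // Mirrors the documented contract: populate the entry and return true,
    // or return false once the input has no more records.
    static boolean source(Iterator<Object[]> input, Entry incoming) {
        if (!input.hasNext()) {
            return false;
        }
        incoming.values = input.next(); // replacing the entry's contents is allowed here
        return true;
    }

    // Drives the loop a caller would run: keep calling source until it returns false.
    static List<String> readAll(List<Object[]> records) {
        Iterator<Object[]> input = records.iterator();
        Entry entry = new Entry(); // one entry, reused across calls
        List<String> seen = new ArrayList<>();
        while (source(input, entry)) {
            seen.add(Arrays.toString(entry.values));
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Object[]> records = Arrays.asList(new Object[]{1, "a"}, new Object[]{2, "b"});
        System.out.println(readAll(records)); // prints [[1, a], [2, b]]
    }
}
```

The caller owns the loop and the entry object; the source method only fills the entry and reports whether a record was read.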
public void sourceCleanup(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SourceCall<Object[],org.apache.hadoop.mapred.RecordReader> sourceCall)
Description copied from class: Scheme
Method sourceCleanup is used to destroy resources created by Scheme.sourcePrepare(cascading.flow.FlowProcess, SourceCall).
Overrides: sourceCleanup in class Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
Parameters:
flowProcess - of type FlowProcess
sourceCall - of type SourceCall
public void sink(FlowProcess<org.apache.hadoop.mapred.JobConf> flowProcess, SinkCall<Void,org.apache.hadoop.mapred.OutputCollector> sinkCall) throws IOException
Description copied from class: Scheme
Method sink writes out the given Tuple found on SinkCall.getOutgoingEntry() to the SinkCall.getOutput().
This method may optionally throw a TapException if it cannot process a particular instance of data. If the payload Tuple is set on the TapException, that Tuple will be written to any applicable failure trap Tap. If not set, the incoming Tuple will be written instead.
Specified by: sink in class Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector,Object[],Void>
Parameters:
flowProcess - of type FlowProcess
sinkCall - of type SinkCall
Throws: IOException
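The failure-trap rule described for sink (write the exception's payload if one is set, otherwise the incoming record) can be illustrated with a small JDK-only sketch. `TrapSketch` and `RecordException` are hypothetical stand-ins, not Cascading's TapException or trap Tap, and the "?" and "!" prefixes are invented failure triggers used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; none of these types are Cascading classes.
class TrapSketch {

    // Plays the role of an exception that may carry a payload record.
    static final class RecordException extends RuntimeException {
        final String payload; // null when no payload was set

        RecordException(String message, String payload) {
            super(message);
            this.payload = payload;
        }
    }

    // Stand-in sink: "?"-prefixed records fail with a payload,
    // "!"-prefixed records fail without one, everything else is written.
    static void sink(String record, List<String> output) {
        if (record.startsWith("?")) {
            throw new RecordException("unparseable record", record.substring(1));
        }
        if (record.startsWith("!")) {
            throw new RecordException("rejected record", null);
        }
        output.add(record);
    }

    // Routes failures to the trap: the payload if set, else the incoming record.
    static void sinkAll(List<String> records, List<String> output, List<String> trap) {
        for (String record : records) {
            try {
                sink(record, output);
            } catch (RecordException e) {
                trap.add(e.payload != null ? e.payload : record);
            }
        }
    }

    public static void main(String[] args) {
        List<String> output = new ArrayList<>();
        List<String> trap = new ArrayList<>();
        sinkAll(Arrays.asList("ok", "?bad", "!raw"), output, trap);
        System.out.println(output + " " + trap); // prints [ok] [bad, !raw]
    }
}
```

Setting a payload lets the failing code decide what ends up in the trap, for example a partially parsed form of the record rather than the raw input.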