java.lang.Object
  cascading.scheme.Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
    cascading.scheme.local.TextLine

public class TextLine
extends Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
A TextLine is a type of Scheme for plain text files. Files are broken into lines; either a line feed or a carriage return signals the end of a line.
By default, this scheme returns a Tuple with two fields, "num" and "line", where "num" is the line number for "line".
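
The "num" and "line" pairing can be illustrated with the JDK's own `LineNumberReader`, which is the reader type this class exposes through `createInput`. The following is a minimal sketch using only the standard library, not the Cascading implementation; the class name `NumLineSketch` and the `num=... line=...` output format are made up for illustration, and whether TextLine's "num" starts at 0 or 1 is not stated on this page.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class NumLineSketch {

    /** Returns one "num=<n> line=<text>" string per input line. */
    public static List<String> numberLines(String text) {
        List<String> records = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // After readLine() returns, getLineNumber() is the count of lines read so far,
                // so this sketch numbers lines from 1 (an assumption, not TextLine's contract).
                records.add("num=" + reader.getLineNumber() + " line=" + line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // "\n", "\r\n", and a bare "\r" are each treated as a line terminator.
        numberLines("alpha\nbeta\r\ngamma\rdelta").forEach(System.out::println);
    }
}
```

Because the reader treats line feed, carriage return, and the two together as terminators, the sample input yields four records.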
Many of the constructors take both "sourceFields" and "sinkFields". sourceFields denotes the field names to be used instead of the names "num" and "line". sinkFields is a selector and defaults to Fields.ALL. Any available field names can be given if only a subset of the incoming fields should be used.
If a Fields instance with only one field is passed to the constructor as sourceFields, the returned tuples will contain only the "line" value, under the given field name.
Note that TextLine concatenates the Tuple values for the selected fields, separated by a TAB delimiter, before writing out the line.
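
The effect of a sink selector followed by TAB concatenation can be sketched with plain collections. This is an illustration only and does not use Cascading types: the class `TabJoinSketch`, the map-based record, and the field names `id`, `name`, and `price` are all hypothetical, and how TextLine renders null values is not covered on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TabJoinSketch {

    /** Picks the selected fields, in selector order, and joins their values with TAB. */
    public static String toLine(Map<String, Object> entry, List<String> selectedFields) {
        return selectedFields.stream()
                .map(name -> String.valueOf(entry.get(name)))
                .collect(Collectors.joining("\t"));
    }

    /** Renders one hypothetical record with an "all fields" selector and with a subset. */
    public static String demo() {
        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("id", 7);
        entry.put("name", "widget");
        entry.put("price", 2.5);
        String all = toLine(entry, Arrays.asList("id", "name", "price"));
        String subset = toLine(entry, Arrays.asList("name", "price"));
        return all + "\n" + subset;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The first output line carries all three values and the second only the selected two, each separated by a single TAB.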
By default, all text is encoded and decoded as UTF-8. This can be changed via the charsetName constructor argument.
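
Why the charset name matters can be shown with the standard library alone. The sketch below wraps an `InputStream` in a `LineNumberReader` using a named charset and falls back to UTF-8 when none is given; it assumes nothing about how TextLine does this internally. The class `CharsetSketch` and the constant `ASSUMED_DEFAULT_CHARSET` are hypothetical, and the constant is not read from `TextLine.DEFAULT_CHARSET`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class CharsetSketch {

    // Assumption: stands in for the documented UTF-8 default.
    static final String ASSUMED_DEFAULT_CHARSET = "UTF-8";

    /** Opens a line reader that decodes bytes with the named charset, or UTF-8 if null. */
    public static LineNumberReader open(InputStream input, String charsetName) {
        String name = charsetName == null ? ASSUMED_DEFAULT_CHARSET : charsetName;
        return new LineNumberReader(new InputStreamReader(input, Charset.forName(name)));
    }

    /** Decodes the bytes and returns the first line, or null if there is none. */
    public static String firstLine(byte[] bytes, String charsetName) {
        try (LineNumberReader reader = open(new ByteArrayInputStream(bytes), charsetName)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00e9".getBytes(Charset.forName("ISO-8859-1"));
        // Decoded with the matching charset, the text round-trips.
        System.out.println(firstLine(latin1, "ISO-8859-1"));
        // Decoded with the UTF-8 fallback, the last character is not recovered.
        System.out.println(firstLine(latin1, null));
    }
}
```

Reading ISO-8859-1 bytes with the UTF-8 fallback does not reproduce the original text, which is the situation the charsetName argument is meant to avoid.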
Field Summary | |
---|---|
static String | DEFAULT_CHARSET |
Constructor Summary | |
---|---|
TextLine() | Creates a new TextLine instance that sources "num" and "line" fields, and sinks all incoming fields, where "num" is the line number of the line in the input file. |
TextLine(Fields sourceFields) | Creates a new TextLine instance. |
TextLine(Fields sourceFields, Fields sinkFields) | Creates a new TextLine instance. |
TextLine(Fields sourceFields, Fields sinkFields, String charsetName) | Creates a new TextLine instance. |
TextLine(Fields sourceFields, String charsetName) | Creates a new TextLine instance. |
Method Summary | | |
---|---|---|
LineNumberReader | createInput(InputStream inputStream) | |
PrintWriter | createOutput(OutputStream outputStream) | |
void | presentSinkFields(FlowProcess<Properties> process, Tap tap, Fields fields) | Method presentSinkFields is called after the planner is invoked and all fields are resolved. |
void | presentSourceFields(FlowProcess<Properties> process, Tap tap, Fields fields) | Method presentSourceFields is called after the planner is invoked and all fields are resolved. |
void | sink(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall) | Method sink writes out the given Tuple found on SinkCall.getOutgoingEntry() to the SinkCall.getOutput(). |
void | sinkCleanup(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall) | Method sinkCleanup is used to destroy resources created by Scheme.sinkPrepare(cascading.flow.FlowProcess, SinkCall). |
void | sinkConfInit(FlowProcess<Properties> flowProcess, Tap<Properties,InputStream,OutputStream> tap, Properties conf) | Method sinkConfInit initializes this instance as a sink. |
void | sinkPrepare(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall) | Method sinkPrepare is used to initialize resources needed during each call of Scheme.sink(cascading.flow.FlowProcess, SinkCall). |
boolean | source(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall) | Method source will read a new "record" or value from SourceCall.getInput() and populate the available Tuple via SourceCall.getIncomingEntry() and return true on success or false if no more values are available. |
void | sourceCleanup(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall) | Method sourceCleanup is used to destroy resources created by Scheme.sourcePrepare(cascading.flow.FlowProcess, SourceCall). |
void | sourceConfInit(FlowProcess<Properties> flowProcess, Tap<Properties,InputStream,OutputStream> tap, Properties conf) | Method sourceConfInit initializes this instance as a source. |
void | sourcePrepare(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall) | Method sourcePrepare is used to initialize resources needed during each call of Scheme.source(cascading.flow.FlowProcess, SourceCall). |
protected void | verify(Fields sourceFields) | |
Methods inherited from class cascading.scheme.Scheme |
---|
equals, getNumSinkParts, getSinkFields, getSourceFields, getTrace, hashCode, isSink, isSource, isSymmetrical, presentSinkFieldsInternal, presentSourceFieldsInternal, retrieveSinkFields, retrieveSourceFields, setNumSinkParts, setSinkFields, setSourceFields, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final String DEFAULT_CHARSET
Constructor Detail |
---|
public TextLine()

@ConstructorProperties(value="sourceFields")
public TextLine(Fields sourceFields)
Parameters:
sourceFields - of Fields

@ConstructorProperties(value={"sourceFields","charsetName"})
public TextLine(Fields sourceFields, String charsetName)
Parameters:
sourceFields - of Fields
charsetName - of type String

@ConstructorProperties(value={"sourceFields","sinkFields"})
public TextLine(Fields sourceFields, Fields sinkFields)
Parameters:
sourceFields - of Fields
sinkFields - of Fields

@ConstructorProperties(value={"sourceFields","sinkFields","charsetName"})
public TextLine(Fields sourceFields, Fields sinkFields, String charsetName)
Parameters:
sourceFields - of Fields
sinkFields - of Fields
charsetName - of type String

Method Detail |
---|
protected void verify(Fields sourceFields)
public LineNumberReader createInput(InputStream inputStream)
public PrintWriter createOutput(OutputStream outputStream)
public void presentSourceFields(FlowProcess<Properties> process, Tap tap, Fields fields)
Description copied from class: Scheme
Method presentSourceFields is called after the planner is invoked and all fields are resolved. See also Scheme.retrieveSourceFields(cascading.flow.FlowProcess, cascading.tap.Tap).
Overrides:
presentSourceFields in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
process - of type FlowProcess
tap - of type Tap
fields - of type Fields

public void presentSinkFields(FlowProcess<Properties> process, Tap tap, Fields fields)
Description copied from class: Scheme
Method presentSinkFields is called after the planner is invoked and all fields are resolved. See also Scheme.retrieveSinkFields(cascading.flow.FlowProcess, cascading.tap.Tap).
Overrides:
presentSinkFields in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
process - of type FlowProcess
tap - of type Tap
fields - of type Fields

public void sourceConfInit(FlowProcess<Properties> flowProcess, Tap<Properties,InputStream,OutputStream> tap, Properties conf)
Description copied from class: Scheme
Method sourceConfInit initializes this instance as a source. See Scheme.sourcePrepare(cascading.flow.FlowProcess, SourceCall) if resources must be initialized before use, and Scheme.sourceCleanup(cascading.flow.FlowProcess, SourceCall) if resources must be destroyed after use.
Overrides:
sourceConfInit in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
tap - of type Tap
conf - of type Properties

public void sinkConfInit(FlowProcess<Properties> flowProcess, Tap<Properties,InputStream,OutputStream> tap, Properties conf)
Description copied from class: Scheme
Method sinkConfInit initializes this instance as a sink. See Scheme.sinkPrepare(cascading.flow.FlowProcess, SinkCall) if resources must be initialized before use, and Scheme.sinkCleanup(cascading.flow.FlowProcess, SinkCall) if resources must be destroyed after use.
Overrides:
sinkConfInit in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
tap - of type Tap
conf - of type Properties

public void sourcePrepare(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall) throws IOException
Description copied from class: Scheme
Method sourcePrepare is used to initialize resources needed during each call of Scheme.source(cascading.flow.FlowProcess, SourceCall).
Be sure to place any initialized objects in the SourceContext so each instance will remain thread-safe.
Overrides:
sourcePrepare in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
sourceCall - of type SourceCall
Throws:
IOException

public boolean source(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall) throws IOException
Description copied from class: Scheme
Method source will read a new "record" or value from SourceCall.getInput() and populate the available Tuple via SourceCall.getIncomingEntry() and return true on success or false if no more values are available.
It's OK to set a new Tuple instance on the incomingEntry TupleEntry, or to simply re-use the existing instance.
Note that this is the only time it is safe to modify a Tuple instance handed over via a method call.
This method may optionally throw a TapException if it cannot process a particular instance of data. If the payload Tuple is set on the TapException, that Tuple will be written to any applicable failure trap Tap.
Overrides:
source in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
sourceCall - of SourceCall
Returns:
true when a Tuple was successfully read
Throws:
IOException
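
The prepare, source, and cleanup contract described above can be sketched without Cascading on the classpath. In this illustration the nested `Call` class is a hypothetical stand-in for the source call and its context, `incomingEntry` stands in for the reusable tuple entry, and closing the reader in `cleanup` is an assumption rather than documented TextLine behavior.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SourceContractSketch {

    /** Hypothetical stand-in for the source call; not a Cascading type. */
    static final class Call {
        LineNumberReader context;                     // created in prepare, released in cleanup
        final Object[] incomingEntry = new Object[2]; // reused for every record
    }

    /** Keeps per-call state on the call object rather than in shared fields. */
    public static void prepare(Call call, Reader input) {
        call.context = new LineNumberReader(input);
    }

    /** Fills the entry and returns true, or returns false when the input is exhausted. */
    public static boolean source(Call call) {
        try {
            String line = call.context.readLine();
            if (line == null) {
                return false;
            }
            call.incomingEntry[0] = call.context.getLineNumber();
            call.incomingEntry[1] = line;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases what prepare created (closing here is an assumption of this sketch). */
    public static void cleanup(Call call) {
        try {
            call.context.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            call.context = null;
        }
    }

    /** Drives the three steps the way a caller might: prepare once, source until false, clean up. */
    public static int countRecords(String text) {
        Call call = new Call();
        prepare(call, new StringReader(text));
        int count = 0;
        try {
            while (source(call)) {
                count++;
            }
        } finally {
            cleanup(call);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRecords("one\ntwo\nthree"));
    }
}
```

Holding the reader on the call object, instead of in a field of the scheme itself, is what lets one scheme instance serve several concurrent reads.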

public void sourceCleanup(FlowProcess<Properties> flowProcess, SourceCall<LineNumberReader,InputStream> sourceCall) throws IOException
Description copied from class: Scheme
Method sourceCleanup is used to destroy resources created by Scheme.sourcePrepare(cascading.flow.FlowProcess, SourceCall).
Overrides:
sourceCleanup in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
sourceCall - of type SourceCall
Throws:
IOException

public void sinkPrepare(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall) throws IOException
Description copied from class: Scheme
Method sinkPrepare is used to initialize resources needed during each call of Scheme.sink(cascading.flow.FlowProcess, SinkCall).
Be sure to place any initialized objects in the SinkContext so each instance will remain thread-safe.
Overrides:
sinkPrepare in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
sinkCall - of type SinkCall
Throws:
IOException

public void sink(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall) throws IOException
Description copied from class: Scheme
Method sink writes out the given Tuple found on SinkCall.getOutgoingEntry() to the SinkCall.getOutput().
This method may optionally throw a TapException if it cannot process a particular instance of data. If the payload Tuple is set on the TapException, that Tuple will be written to any applicable failure trap Tap. If not set, the incoming Tuple will be written instead.
Overrides:
sink in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
sinkCall - of SinkCall
Throws:
IOException

public void sinkCleanup(FlowProcess<Properties> flowProcess, SinkCall<PrintWriter,OutputStream> sinkCall) throws IOException
Description copied from class: Scheme
Method sinkCleanup is used to destroy resources created by Scheme.sinkPrepare(cascading.flow.FlowProcess, SinkCall).
Overrides:
sinkCleanup in class Scheme<Properties,InputStream,OutputStream,LineNumberReader,PrintWriter>
Parameters:
flowProcess - of type FlowProcess
sinkCall - of type SinkCall
Throws:
IOException