java.lang.Object cascading.scheme.Scheme
public abstract class Scheme
A Scheme defines what is stored in a Tap
instance by declaring the Tuple
field names, and alternately parsing or rendering the incoming or outgoing Tuple
stream, respectively.
The declared source fields only label the values in the Tuples as they are sourced.
They do not necessarily filter the output, since a given implementation may choose to
collapse values and ignore keys depending on the format.
Setting the numSinkParts value to 1 (one) ensures the output resource has only one part.
In the case of MapReduce, it does this by setting the number of reducers to the given value.
This may affect performance, so be cautioned.
Note that setting numSinkParts does not force the planner to insert a final Reduce operation in the job, so
numSinkParts may be ignored entirely if the final job is Map only. To force the Flow to have a final Reduce,
add a GroupBy
to the assembly before sinking.
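
The interaction between numSinkParts and a Map-only job can be modelled in a few lines. The sketch below is not Cascading code: `SinkPartsModel`, `outputParts`, and all of its parameters are hypothetical names. It assumes the usual Hadoop behaviour that a Map-only job writes one part per map task, while a job with a Reduce phase writes one part per reducer, and it treats a numSinkParts of 0 as "not set".

```java
/** Hypothetical model, not part of Cascading: estimates how many part files a sink produces. */
final class SinkPartsModel {

    private SinkPartsModel() {
    }

    /**
     * @param numSinkParts    value set on the Scheme; 0 is treated as "not set" (an assumption)
     * @param hasFinalReduce  whether the final job has a Reduce phase
     * @param mapTasks        number of map tasks in the final job
     * @param defaultReducers reducer count used when numSinkParts is not set
     * @return the estimated number of output parts
     */
    static int outputParts(int numSinkParts, boolean hasFinalReduce, int mapTasks, int defaultReducers) {
        if (!hasFinalReduce) {
            // Map-only job: there are no reducers to configure, so numSinkParts has no effect.
            return mapTasks;
        }
        // With a final Reduce, the reducer count decides the number of parts.
        return numSinkParts > 0 ? numSinkParts : defaultReducers;
    }
}
```

In this model, requesting one part only takes effect when `hasFinalReduce` is true, which is why adding a GroupBy before the sink matters.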
Constructor Summary | |
---|---|
protected | Scheme(): Constructor Scheme creates a new Scheme instance. |
protected | Scheme(Fields sourceFields): Constructor Scheme creates a new Scheme instance. |
protected | Scheme(Fields sourceFields, Fields sinkFields): Constructor Scheme creates a new Scheme instance. |
protected | Scheme(Fields sourceFields, Fields sinkFields, int numSinkParts): Constructor Scheme creates a new Scheme instance. |
protected | Scheme(Fields sourceFields, int numSinkParts): Constructor Scheme creates a new Scheme instance. |
Method Summary | |
---|---|
boolean | equals(Object object) |
int | getNumSinkParts(): Method getNumSinkParts returns the numSinkParts of this Scheme object. |
Fields | getSinkFields(): Method getSinkFields returns the sinkFields of this Scheme object. |
Fields | getSourceFields(): Method getSourceFields returns the sourceFields of this Scheme object. |
String | getTrace(): Method getTrace returns a String that pinpoints where this instance was created, for debugging. |
int | hashCode() |
boolean | isSink(): Method isSink returns true if this Scheme instance can be used as a sink. |
boolean | isSource(): Method isSource returns true if this Scheme instance can be used as a source. |
boolean | isSymmetrical(): Method isSymmetrical returns true if the sink fields equal the source fields. |
boolean | isWriteDirect(): Method isWriteDirect returns true if the parent Tap instance's TupleEntryCollector should be used to sink values. |
void | setNumSinkParts(int numSinkParts): Method setNumSinkParts sets the numSinkParts of this Scheme object. |
void | setSinkFields(Fields sinkFields): Method setSinkFields sets the sinkFields of this Scheme object. |
void | setSourceFields(Fields sourceFields): Method setSourceFields sets the sourceFields of this Scheme object. |
abstract void | sink(TupleEntry tupleEntry, OutputCollector outputCollector): Method sink writes out the given Tuple instance to the outputCollector. |
abstract void | sinkInit(Tap tap, JobConf conf): Method sinkInit initializes this instance as a sink. |
abstract Tuple | source(Object key, Object value): Method source takes the given Hadoop key and value and returns a new Tuple instance. |
abstract void | sourceInit(Tap tap, JobConf conf): Method sourceInit initializes this instance as a source. |
String | toString() |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
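
To make the isSymmetrical contract concrete, here is a minimal stand-in that does not use any Cascading classes. `FieldPair` is a hypothetical name, and field declarations are modelled as plain lists of names, an assumption made only to keep the sketch self-contained; the real Fields class has richer semantics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in that pairs a source field declaration with a sink field declaration. */
final class FieldPair {
    private final List<String> sourceFields;
    private final List<String> sinkFields;

    FieldPair(List<String> sourceFields, List<String> sinkFields) {
        // Defensive copies so later changes to the caller's lists do not affect this instance.
        this.sourceFields = Collections.unmodifiableList(new ArrayList<>(sourceFields));
        this.sinkFields = Collections.unmodifiableList(new ArrayList<>(sinkFields));
    }

    /** Returns true if the sink fields equal the source fields (same names, same order). */
    boolean isSymmetrical() {
        return sinkFields.equals(sourceFields);
    }
}
```

A pair built from the same names in the same order is symmetrical in this model; reordering or dropping a name on either side makes it asymmetrical.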
Constructor Detail |
---|

protected Scheme()

protected Scheme(Fields sourceFields)
Parameters:
sourceFields - of type Fields

protected Scheme(Fields sourceFields, int numSinkParts)
Parameters:
sourceFields - of type Fields
numSinkParts - of type int

protected Scheme(Fields sourceFields, Fields sinkFields)
Parameters:
sourceFields - of type Fields
sinkFields - of type Fields

protected Scheme(Fields sourceFields, Fields sinkFields, int numSinkParts)
Parameters:
sourceFields - of type Fields
sinkFields - of type Fields
numSinkParts - of type int

Method Detail |
---|

public Fields getSinkFields()

public void setSinkFields(Fields sinkFields)
Parameters:
sinkFields - the sinkFields of this Scheme object.

public Fields getSourceFields()

public void setSourceFields(Fields sourceFields)
Parameters:
sourceFields - the sourceFields of this Scheme object.

public int getNumSinkParts()

public void setNumSinkParts(int numSinkParts)
Parameters:
numSinkParts - the numSinkParts of this Scheme object.

public String getTrace()

public boolean isWriteDirect()
Method isWriteDirect returns true if the parent Tap instance's TupleEntryCollector should be used to sink values.

public boolean isSymmetrical()
Method isSymmetrical returns true if the sink fields equal the source fields. That is, this
scheme sources the same fields as it sinks.

public boolean isSource()

public boolean isSink()

public abstract void sourceInit(Tap tap, JobConf conf) throws IOException
Parameters:
tap - of type Tap
conf - of type JobConf
Throws:
IOException - on initialization failure

public abstract void sinkInit(Tap tap, JobConf conf) throws IOException
Parameters:
tap - of type Tap
conf - of type JobConf
Throws:
IOException - on initialization failure

public abstract Tuple source(Object key, Object value)
Method source takes the given Hadoop key and value and returns a new Tuple instance.
Parameters:
key - of type WritableComparable
value - of type Writable

public abstract void sink(TupleEntry tupleEntry, OutputCollector outputCollector) throws IOException
Method sink writes out the given Tuple instance to the outputCollector.
Parameters:
tupleEntry - of type TupleEntry
outputCollector - of type OutputCollector
Throws:
IOException

public boolean equals(Object object)
Overrides:
equals in class Object

public String toString()
Overrides:
toString in class Object

public int hashCode()
Overrides:
hashCode in class Object
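
The division of labour between source and sink described above (parse on the way in, render on the way out) can be illustrated without Cascading or Hadoop on the classpath. The following is a sketch under stated assumptions, not an actual Scheme subclass: `TabDelimitedSketch` is a hypothetical name, a tuple is modelled as a `List<String>`, and a plain `List<String>` stands in for the OutputCollector. It also shows a format that ignores the incoming key, as the class description allows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a tab-delimited text scheme; it does not extend the real Scheme class. */
final class TabDelimitedSketch {
    private final List<String> fieldNames;

    TabDelimitedSketch(String... fieldNames) {
        this.fieldNames = new ArrayList<>(Arrays.asList(fieldNames));
    }

    /** Parses one input record. The key (for example a byte offset) is ignored by this format. */
    List<String> source(Object key, Object value) {
        // A limit of -1 keeps trailing empty columns instead of silently dropping them.
        String[] parts = value.toString().split("\t", -1);
        if (parts.length != fieldNames.size()) {
            throw new IllegalArgumentException(
                "expected " + fieldNames.size() + " values but found " + parts.length);
        }
        return new ArrayList<>(Arrays.asList(parts));
    }

    /** Renders one tuple as a tab-delimited line and hands it to the stand-in collector. */
    void sink(List<String> tuple, List<String> outputCollector) {
        outputCollector.add(String.join("\t", tuple));
    }
}
```

Because this stand-in reads and writes the same columns, it would count as symmetrical in the sense of isSymmetrical; a scheme that sinks only a subset of its source fields would not.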