|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Object cascading.flow.Flow
@Process public class Flow
A Pipe
assembly is connected to the necessary number of Tap
sinks and
sources into a Flow. A Flow is then executed to push the incoming source data through
the assembly into one or more sinks.
Pipe
assemblies can be reused in multiple Flow instances. They maintain
no state regarding the Flow execution. Subsequently, Pipe
assemblies can be given
parameters through its calling Flow so they can be built in a generic fashion.
When a Flow is created, an optimized internal representation is created that is then executed
within the cluster. Thus any overhead inherent to a give Pipe
assembly will be removed
once it's placed in context with the actual execution environment.
Flows are submitted in order of dependency. If two or more steps do not share the same dependencies and all
can be scheduled simultaneously, the getSubmitPriority()
value determines the order in which
all steps will be submitted for execution. The default submit priority is 5.
Properties
FlowConnector
Nested Class Summary | |
---|---|
static class |
Flow.FlowHolder
Class FlowHolder is a helper class for wrapping Flow instances. |
Field Summary | |
---|---|
protected Map<String,Tap> |
sources
Field sources |
protected boolean |
stopJobsOnExit
Field stopJobsOnExit |
Constructor Summary | |
---|---|
protected |
Flow()
Used for testing. |
protected |
Flow(Map<Object,Object> properties,
JobConf jobConf,
String name)
|
protected |
Flow(Map<Object,Object> properties,
JobConf jobConf,
String name,
ElementGraph pipeGraph,
StepGraph stepGraph,
Map<String,Tap> sources,
Map<String,Tap> sinks,
Map<String,Tap> traps)
|
protected |
Flow(Map<Object,Object> properties,
JobConf jobConf,
String name,
StepGraph stepGraph,
Map<String,Tap> sources,
Map<String,Tap> sinks,
Map<String,Tap> traps)
|
Method Summary | |
---|---|
void |
addListener(FlowListener flowListener)
Method addListener registers the given flowListener with this instance. |
boolean |
areSinksStale()
Method areSinksStale returns true if any of the sinks referenced are out of date in relation to the sources. |
boolean |
areSourcesNewer(long sinkModified)
Method areSourcesNewer returns true if any source is newer than the given sinkModified date value. |
void |
cleanup()
|
void |
complete()
Method complete starts the current Flow instance if it has not be previously started, then block until completion. |
void |
deleteSinks()
Method deleteSinks deletes all sinks, whether or not they are configured for SinkMode.UPDATE . |
void |
deleteSinksIfNotAppend()
Deprecated. |
void |
deleteSinksIfNotUpdate()
Method deleteSinksIfNotUpdate deletes all sinks if they are not configured with the SinkMode.UPDATE flag. |
FlowSkipStrategy |
getFlowSkipStrategy()
Method getFlowSkipStrategy returns the current FlowSkipStrategy used by this Flow. |
FlowStats |
getFlowStats()
Method getFlowStats returns the flowStats of this Flow object. |
Flow.FlowHolder |
getHolder()
Used to return a simple wrapper for use as an edge in a graph where there can only be one instance of every edge. |
String |
getID()
Method getID returns the ID of this Flow object. |
JobConf |
getJobConf()
Method getJobConf returns the jobConf of this Flow object. |
static long |
getJobPollingInterval(JobConf jobConf)
|
static long |
getJobPollingInterval(Map<Object,Object> properties)
Returns property jobPollingInterval. |
String |
getName()
Method getName returns the name of this Flow object. |
static boolean |
getPreserveTemporaryFiles(Map<Object,Object> properties)
Returns property preserveTemporaryFiles. |
String |
getProperty(String key)
Method getProperty returns the value associated with the given key from the underlying properties system. |
Tap |
getSink()
Method getSink returns the first sink of this Flow object. |
long |
getSinkModified()
Method getSinkModified returns the youngest modified date of any sink Tap managed by this Flow instance. |
Map<String,Tap> |
getSinks()
Method getSinks returns the sinks of this Flow object. |
Collection<Tap> |
getSinksCollection()
Method getSinksCollection returns a Collection of sink Tap s for this Flow object. |
Map<String,Tap> |
getSources()
Method getSources returns the sources of this Flow object. |
Collection<Tap> |
getSourcesCollection()
Method getSourcesCollection returns a Collection of source Tap s for this Flow object. |
List<FlowStep> |
getSteps()
Method getSteps returns the steps of this Flow object. |
static boolean |
getStopJobsOnExit(Map<Object,Object> properties)
Returns property stopJobsOnExit. |
int |
getSubmitPriority()
Method getSubmitPriority returns the submitPriority of this Flow object. |
Map<String,Tap> |
getTraps()
Method getTraps returns the traps of this Flow object. |
Collection<Tap> |
getTrapsCollection()
Method getTrapsCollection returns a Collection of trap Tap s for this Flow object. |
boolean |
hasListeners()
Method hasListeners returns true if FlowListener instances have been registered. |
boolean |
isPreserveTemporaryFiles()
Method isPreserveTemporaryFiles returns false if temporary files will be cleaned when this Flow completes. |
boolean |
isSkipFlow()
Method isSkipFlow returns true if the parent Cascade should skip this Flow instance. |
boolean |
isStopJobsOnExit()
Method isStopJobsOnExit returns the stopJobsOnExit of this Flow object. |
boolean |
jobsAreLocal()
Method jobsAreLocal returns true if all jobs are executed in-process as a single map and reduce task. |
TupleEntryIterator |
openSink()
Method openSink opens the first sink Tap. |
TupleEntryIterator |
openSink(String name)
Method openSink opens the named sink Tap. |
TupleEntryIterator |
openSource()
Method openSource opens the first source Tap. |
TupleEntryIterator |
openSource(String name)
Method openSource opens the named source Tap. |
TupleEntryIterator |
openTapForRead(Tap tap)
Method openTapForRead return a TupleIterator for the given Tap instance. |
TupleEntryCollector |
openTapForWrite(Tap tap)
Method openTapForWrite returns a (@link TupleCollector} for the given Tap instance. |
TupleEntryIterator |
openTrap()
Method openTrap opens the first trap Tap. |
TupleEntryIterator |
openTrap(String name)
Method openTrap opens the named trap Tap. |
void |
prepare()
Method prepare is used by a Cascade to notify the given Flow it should initialize or clear any resources
necessary for start() to be called successfully. |
boolean |
removeListener(FlowListener flowListener)
Method removeListener removes the given flowListener from this instance. |
void |
run()
Method run implements the Runnable run method and should not be called by users. |
FlowSkipStrategy |
setFlowSkipStrategy(FlowSkipStrategy flowSkipStrategy)
Method setFlowSkipStrategy sets a new FlowSkipStrategy , the current strategy is returned. |
static void |
setJobPollingInterval(Map<Object,Object> properties,
long interval)
Property jobPollingInterval will set the time to wait between polling the remote server for the status of a job. |
protected void |
setName(String name)
|
static void |
setPreserveTemporaryFiles(Map<Object,Object> properties,
boolean preserveTemporaryFiles)
Property preserveTemporaryFiles forces the Flow instance to keep any temporary intermediate data sets. |
void |
setProperty(String key,
String value)
Method setProperty sets the given key and value on the underlying properites system. |
protected void |
setSinks(Map<String,Tap> sinks)
|
protected void |
setSources(Map<String,Tap> sources)
|
protected void |
setStepGraph(StepGraph stepGraph)
|
static void |
setStopJobsOnExit(Map<Object,Object> properties,
boolean stopJobsOnExit)
Property stopJobsOnExit will tell the Flow to add a JVM shutdown hook that will kill all running processes if the underlying computing system supports it. |
void |
setSubmitPriority(int submitPriority)
Method setSubmitPriority sets the submitPriority of this Flow object. |
protected void |
setTraps(Map<String,Tap> traps)
|
void |
start()
Method start begins the execution of this Flow instance. |
void |
stop()
Method stop stops all running jobs, killing any currently executing. |
boolean |
tapPathExists(Tap tap)
Method tapExists returns true if the resource represented by the given Tap instance exists. |
String |
toString()
|
void |
writeDOT(String filename)
Method writeDOT writes this Flow instance to the given filename as a DOT file for import into a graphics package. |
void |
writeStepsDOT(String filename)
Method writeStepsDOT writes this Flow step graph to the given filename as a DOT file for import into a graphics package. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected Map<String,Tap> sources
protected boolean stopJobsOnExit
Constructor Detail |
---|
protected Flow()
protected Flow(Map<Object,Object> properties, JobConf jobConf, String name)
protected Flow(Map<Object,Object> properties, JobConf jobConf, String name, ElementGraph pipeGraph, StepGraph stepGraph, Map<String,Tap> sources, Map<String,Tap> sinks, Map<String,Tap> traps)
protected Flow(Map<Object,Object> properties, JobConf jobConf, String name, StepGraph stepGraph, Map<String,Tap> sources, Map<String,Tap> sinks, Map<String,Tap> traps)
Method Detail |
---|
public static void setPreserveTemporaryFiles(Map<Object,Object> properties, boolean preserveTemporaryFiles)
false
.
properties
- of type MappreserveTemporaryFiles
- of type booleanpublic static boolean getPreserveTemporaryFiles(Map<Object,Object> properties)
properties
- of type Map
public static void setStopJobsOnExit(Map<Object,Object> properties, boolean stopJobsOnExit)
true
.
properties
- of type MapstopJobsOnExit
- of type booleanpublic static boolean getStopJobsOnExit(Map<Object,Object> properties)
properties
- of type Map
public static void setJobPollingInterval(Map<Object,Object> properties, long interval)
properties
- of type Mapinterval
- of type longpublic static long getJobPollingInterval(Map<Object,Object> properties)
properties
- of type Map
public static long getJobPollingInterval(JobConf jobConf)
public String getName()
protected void setName(String name)
public String getID()
public int getSubmitPriority()
public void setSubmitPriority(int submitPriority)
submitPriority
- the submitPriority of this FlowStep object.protected void setSources(Map<String,Tap> sources)
protected void setSinks(Map<String,Tap> sinks)
protected void setTraps(Map<String,Tap> traps)
protected void setStepGraph(StepGraph stepGraph)
public JobConf getJobConf()
public void setProperty(String key, String value)
key
- of type Stringvalue
- of type Stringpublic String getProperty(String key)
key
- of type String
public FlowStats getFlowStats()
public boolean hasListeners()
FlowListener
instances have been registered.
public void addListener(FlowListener flowListener)
flowListener
- of type FlowListenerpublic boolean removeListener(FlowListener flowListener)
flowListener
- of type FlowListener
public Map<String,Tap> getSources()
@DependencyIncoming public Collection<Tap> getSourcesCollection()
Collection
of source Tap
s for this Flow object.
public Map<String,Tap> getSinks()
@DependencyOutgoing public Collection<Tap> getSinksCollection()
Collection
of sink Tap
s for this Flow object.
public Map<String,Tap> getTraps()
public Collection<Tap> getTrapsCollection()
Collection
of trap Tap
s for this Flow object.
public Tap getSink()
public boolean isPreserveTemporaryFiles()
public boolean isStopJobsOnExit()
true
.
public FlowSkipStrategy getFlowSkipStrategy()
FlowSkipStrategy
used by this Flow.
public FlowSkipStrategy setFlowSkipStrategy(FlowSkipStrategy flowSkipStrategy)
FlowSkipStrategy
, the current strategy is returned.
FlowSkipStrategy instances define when a Flow instance should be skipped. The default strategy is FlowSkipIfSinkStale
.
An alternative strategy would be FlowSkipIfSinkExists
.
A FlowSkipStrategy will not be consulted when executing a Flow directly through start()
or complete()
. Only
when the Flow is executed through a Cascade
instance.
flowSkipStrategy
- of type FlowSkipStrategy
public boolean isSkipFlow() throws IOException
Cascade
should skip this Flow instance. True is returned
if the current FlowSkipStrategy
returns true.
IOException
- whenpublic boolean areSinksStale() throws IOException
Tap.isReplace()
returns true.
IOException
- whenpublic boolean areSourcesNewer(long sinkModified) throws IOException
sinkModified
- of type long
IOException
- whenpublic long getSinkModified() throws IOException
Tap
managed by this Flow instance.
If zero (0) is returned, atleast one of the sink resources does not exist. If minus one (-1) is returned,
atleast one of the sinks are marked for delete (returns true
).
IOException
- whenpublic List<FlowStep> getSteps()
@ProcessPrepare public void prepare()
Cascade
to notify the given Flow it should initialize or clear any resources
necessary for start()
to be called successfully.
Specifically, this implementation calls deleteSinksIfNotUpdate()
.
IOException
- when@ProcessStart public void start()
complete()
to block until this Flow completes.
@ProcessStop public void stop()
@ProcessComplete public void complete()
@ProcessCleanup public void cleanup()
public TupleEntryIterator openSource() throws IOException
IOException
- whenpublic TupleEntryIterator openSource(String name) throws IOException
name
- of type String
IOException
- whenpublic TupleEntryIterator openSink() throws IOException
IOException
- whenpublic TupleEntryIterator openSink(String name) throws IOException
name
- of type String
IOException
- whenpublic TupleEntryIterator openTrap() throws IOException
IOException
- whenpublic TupleEntryIterator openTrap(String name) throws IOException
name
- of type String
IOException
- whenpublic void deleteSinks() throws IOException
SinkMode.UPDATE
.
Use with caution.
IOException
- whendeleteSinksIfNotUpdate()
@Deprecated public void deleteSinksIfNotAppend() throws IOException
SinkMode.APPEND
flag.
Typically used by a Cascade
before executing the flow if the sinks are stale.
Use with caution.
IOException
- whenpublic void deleteSinksIfNotUpdate() throws IOException
SinkMode.UPDATE
flag.
Typically used by a Cascade
before executing the flow if the sinks are stale.
Use with caution.
IOException
- whenpublic boolean tapPathExists(Tap tap) throws IOException
tap
- of type Tap
IOException
- whenpublic TupleEntryIterator openTapForRead(Tap tap) throws IOException
TupleIterator
for the given Tap instance.
tap
- of type Tap
IOException
- when there is an error opening the resourcepublic TupleEntryCollector openTapForWrite(Tap tap) throws IOException
tap
- of type Tap
IOException
- when there is an error opening the resourcepublic boolean jobsAreLocal()
public void run()
run
in interface Runnable
public String toString()
toString
in class Object
public void writeDOT(String filename)
filename
- of type Stringpublic void writeStepsDOT(String filename)
filename
- of type Stringpublic Flow.FlowHolder getHolder()
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |