public class MapReduceFlow extends HadoopFlow
HadoopFlow
subclass that supports custom MapReduce jobs
pre-configured via the JobConf
object.
Use this class to allow custom MapReduce jobs to participate in the Cascade
scheduler. If
other Flow instances in the Cascade share resources with this Flow instance, all participants will be scheduled
according to their dependencies (topologically).
Set the parameter deleteSinkOnInit
to true
if the outputPath in the jobConf should be deleted before executing the MapReduce job.
MapReduceFlow assumes the underlying input and output paths are compatible with the Hfs
Tap.
If the configured JobConf instance uses some other identifier instead of Hadoop FS paths, you should override the
createSources(org.apache.hadoop.mapred.JobConf)
, createSinks(org.apache.hadoop.mapred.JobConf)
, and
createTraps(org.apache.hadoop.mapred.JobConf)
methods to properly resolve the configured paths into
usable Tap
instances. By default createTraps returns an empty collection and should probably be left alone.
MapReduceFlow supports both org.apache.hadoop.mapred.* and org.apache.hadoop.mapreduce.* API Jobs.BaseFlow.FlowHolder
Modifier and Type | Field and Description |
---|---|
protected boolean |
deleteSinkOnInit
Field deleteSinkOnInit
|
flowCanonicalHash, flowDescriptor, flowElementGraph, flowStats, flowStepGraph, platformInfo, sinks, sources, steps, stop, stopJobsOnExit, thread
CASCADING_FLOW_ID
NULL
Constructor and Description |
---|
MapReduceFlow(JobConf jobConf)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
MapReduceFlow(JobConf jobConf,
boolean deleteSinkOnInit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
MapReduceFlow(Properties properties,
String name,
JobConf jobConf,
boolean deleteSinkOnInit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
MapReduceFlow(Properties properties,
String name,
JobConf jobConf,
Map<String,String> flowDescriptor,
boolean deleteSinkOnInit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
MapReduceFlow(Properties properties,
String name,
JobConf jobConf,
Map<String,String> flowDescriptor,
boolean deleteSinkOnInit,
boolean stopJobsOnExit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
MapReduceFlow(String name,
JobConf jobConf)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
MapReduceFlow(String name,
JobConf jobConf,
boolean deleteSinkOnInit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.
|
Modifier and Type | Method and Description |
---|---|
protected FlowStep<JobConf> |
createFlowStep(JobConf jobConf,
Tap sink) |
protected Map<String,Tap> |
createSinks(JobConf jobConf) |
protected Map<String,Tap> |
createSources(JobConf jobConf) |
protected Map<String,Tap> |
createTraps(JobConf jobConf) |
protected Map<String,Tap> |
fileInputToTaps(JobConf jobConf) |
protected Map<String,Tap> |
fileOutputToTaps(JobConf jobConf) |
protected void |
initializeFrom(JobConf jobConf) |
protected String |
makeNameFromPath(Map<String,Tap> taps,
Path path) |
protected FlowStepGraph |
makeStepGraph(JobConf jobConf) |
protected Tap |
toSinkTap(Map<String,Tap> taps,
Path path) |
protected Tap |
toSourceTap(Map<String,Tap> taps,
Path path) |
copyToDistributedCache, getConfig, getConfigAsProperties, getConfigCopy, getFlowProcess, getMaxNumParallelSteps, getProperty, getTotalSliceCPUMilliSeconds, initConfig, initFromProperties, internalClean, internalShutdown, internalStart, isPreserveTemporaryFiles, newConfig, registerHadoopShutdownHook, setConfigProperty, stepsAreLocal
addListener, addPlannerProperties, addSessionProperties, addStepListener, areSinksStale, areSourcesNewer, cleanup, complete, createConfig, createFlowCanonicalHash, createFlowStats, createFlowThread, createPrepareFlowStats, deleteCheckpointsIfNotUpdate, deleteCheckpointsIfReplace, deleteSinks, deleteSinksIfNotUpdate, deleteSinksIfReplace, deleteTrapsIfNotUpdate, deleteTrapsIfReplace, fireOnCompleted, fireOnStarting, fireOnStopping, fireOnThrowable, fireOnThrowable, getCascadeID, getCascadingServices, getCheckpointNames, getCheckpoints, getCheckpointsCollection, getClassPath, getClientState, getFieldsFor, getFlowCanonicalHash, getFlowDescriptor, getFlowSession, getFlowSkipStrategy, getFlowStats, getFlowSteps, getFlowStepStrategy, getHolder, getID, getName, getPlannerInfo, getPlatformInfo, getRunID, getSink, getSink, getSinkModified, getSinkNames, getSinks, getSinksCollection, getSource, getSourceNames, getSources, getSourcesCollection, getSpawnStrategy, getStats, getSubmitPriority, getTags, getTrapNames, getTraps, getTrapsCollection, handleExecutorShutdown, hasListeners, hasStepListeners, initialize, initializeChildStats, initializeNewJobsMap, initSteps, internalStopAllJobs, isDebugEnabled, isInfoEnabled, isSkipFlow, isStopJobsOnExit, logDebug, logError, logError, logInfo, logWarn, logWarn, logWarn, openSink, openSink, openSource, openSource, openTapForRead, openTapForWrite, openTrap, openTrap, prepare, presentSinkFields, presentSourceFields, registerShutdownHook, removeListener, removeStepListener, resourceExists, retrieveSinkFields, retrieveSourceFields, setCascade, setCheckpoints, setFlowSkipStrategy, setFlowStepGraph, setFlowStepStrategy, setName, setPlannerInfo, setSinks, setSources, setSpawnStrategy, setSubmitPriority, setTraps, start, stop, toString, updateSchemes, writeDOT, writeStepsDOT
protected boolean deleteSinkOnInit
@ConstructorProperties(value="jobConf") public MapReduceFlow(JobConf jobConf)
jobConf
- of type JobConf@ConstructorProperties(value={"jobConf","deleteSinkOnInit"}) public MapReduceFlow(JobConf jobConf, boolean deleteSinkOnInit)
jobConf
- of type JobConfdeleteSinkOnInit
- of type boolean@ConstructorProperties(value={"name","jobConf"}) public MapReduceFlow(String name, JobConf jobConf)
name
- of type StringjobConf
- of type JobConf@ConstructorProperties(value={"name","jobConf","deleteSinkOnInit"}) public MapReduceFlow(String name, JobConf jobConf, boolean deleteSinkOnInit)
name
- of type StringjobConf
- of type JobConfdeleteSinkOnInit
- of type boolean@ConstructorProperties(value={"properties","name","jobConf","deleteSinkOnInit"}) public MapReduceFlow(Properties properties, String name, JobConf jobConf, boolean deleteSinkOnInit)
properties
- of type Propertiesname
- of type StringjobConf
- of type JobConfdeleteSinkOnInit
- of type boolean@ConstructorProperties(value={"properties","name","jobConf","flowDescriptor","deleteSinkOnInit"}) public MapReduceFlow(Properties properties, String name, JobConf jobConf, Map<String,String> flowDescriptor, boolean deleteSinkOnInit)
properties
- of type Propertiesname
- of type StringjobConf
- of type JobConfflowDescriptor
- of type MapdeleteSinkOnInit
- of type boolean@ConstructorProperties(value={"properties","name","jobConf","flowDescriptor","deleteSinkOnInit","stopJobsOnExit"}) public MapReduceFlow(Properties properties, String name, JobConf jobConf, Map<String,String> flowDescriptor, boolean deleteSinkOnInit, boolean stopJobsOnExit)
properties
- of type Propertiesname
- of type StringjobConf
- of type JobConfflowDescriptor
- of type MapdeleteSinkOnInit
- of type booleanstopJobsOnExit
- of type booleanprotected void initializeFrom(JobConf jobConf)
protected FlowStepGraph makeStepGraph(JobConf jobConf)
protected FlowStep<JobConf> createFlowStep(JobConf jobConf, Tap sink)
protected Map<String,Tap> createSources(JobConf jobConf)
protected Map<String,Tap> fileInputToTaps(JobConf jobConf)
protected Map<String,Tap> createSinks(JobConf jobConf)
protected Map<String,Tap> fileOutputToTaps(JobConf jobConf)
protected Map<String,Tap> createTraps(JobConf jobConf)
Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.