cascading.flow.hadoop
Class MapReduceFlow

java.lang.Object
  extended by cascading.flow.BaseFlow<JobConf>
      extended by cascading.flow.hadoop.HadoopFlow
          extended by cascading.flow.hadoop.MapReduceFlow
All Implemented Interfaces:
cascading.flow.Flow<JobConf>, cascading.management.UnitOfWork<cascading.stats.FlowStats>

public class MapReduceFlow
extends HadoopFlow

Class MapReduceFlow is a HadoopFlow subclass that supports custom MapReduce jobs pre-configured via the JobConf object.

Use this class to allow custom MapReduce jobs to participate in the Cascade scheduler. If other Flow instances in the Cascade share resources with this Flow instance, all participants will be scheduled according to their dependencies (topologically).

Set the parameter deleteSinkOnInit to true if the outputPath in the jobConf should be deleted before executing the MapReduce job.

MapReduceFlow assumes the underlying input and output paths are compatible with the Hfs Tap.

If the configured JobConf instance uses some other identifier instead of Hadoop FS paths, you should override the createSources(org.apache.hadoop.mapred.JobConf), createSinks(org.apache.hadoop.mapred.JobConf), and createTraps(org.apache.hadoop.mapred.JobConf) methods to properly resolve the configured paths into usable Tap instances. By default createTraps returns an empty collection and should probably be left alone.


Nested Class Summary
 
Nested classes/interfaces inherited from class cascading.flow.BaseFlow
cascading.flow.BaseFlow.FlowHolder
 
Field Summary
protected  boolean deleteSinkOnInit
          Field deleteSinkOnInit
 
Fields inherited from class cascading.flow.BaseFlow
flowStats, sinks, sources, stop, stopJobsOnExit, thread
 
Fields inherited from interface cascading.flow.Flow
CASCADING_FLOW_ID
 
Constructor Summary
MapReduceFlow(JobConf jobConf)
          Constructor MapReduceFlow creates a new MapReduceFlow instance.
MapReduceFlow(JobConf jobConf, boolean deleteSinkOnInit)
          Constructor MapReduceFlow creates a new MapReduceFlow instance.
MapReduceFlow(String name, JobConf jobConf)
          Constructor MapReduceFlow creates a new MapReduceFlow instance.
MapReduceFlow(String name, JobConf jobConf, boolean deleteSinkOnInit)
          Constructor MapReduceFlow creates a new MapReduceFlow instance.
MapReduceFlow(String name, JobConf jobConf, boolean deleteSinkOnInit, boolean stopJobsOnExit)
          Constructor MapReduceFlow creates a new MapReduceFlow instance.
 
Method Summary
protected  Map<String,cascading.tap.Tap> createSinks(JobConf jobConf)
           
protected  Map<String,cascading.tap.Tap> createSources(JobConf jobConf)
           
protected  Map<String,cascading.tap.Tap> createTraps(JobConf jobConf)
           
 
Methods inherited from class cascading.flow.hadoop.HadoopFlow
getConfig, getConfigAsProperties, getConfigCopy, getFlowProcess, getMaxNumParallelSteps, getProperty, initConfig, initFromProperties, internalClean, internalShutdown, internalStart, isPreserveTemporaryFiles, newConfig, setConfigProperty, stepsAreLocal
 
Methods inherited from class cascading.flow.BaseFlow
addListener, addStepListener, areSinksStale, areSourcesNewer, cleanup, complete, createConfig, createFlowThread, deleteCheckpointsIfNotUpdate, deleteCheckpointsIfReplace, deleteSinks, deleteSinksIfNotUpdate, deleteSinksIfReplace, deleteTrapsIfNotUpdate, deleteTrapsIfReplace, fireOnCompleted, fireOnStarting, fireOnStopping, fireOnThrowable, getCascadeID, getCascadingServices, getCheckpointNames, getCheckpoints, getCheckpointsCollection, getClassPath, getFieldsFor, getFlowSession, getFlowSkipStrategy, getFlowStats, getFlowSteps, getFlowStepStrategy, getHolder, getID, getName, getPlatformInfo, getRunID, getSink, getSink, getSinkModified, getSinkNames, getSinks, getSinksCollection, getSource, getSourceNames, getSources, getSourcesCollection, getSpawnStrategy, getStats, getSubmitPriority, getTags, getTrapNames, getTraps, getTrapsCollection, handleExecutorShutdown, hasListeners, hasStepListeners, initialize, initializeNewJobsMap, initSteps, internalStopAllJobs, isSkipFlow, isStopJobsOnExit, logInfo, openSink, openSink, openSource, openSource, openTapForRead, openTapForWrite, openTrap, openTrap, prepare, presentSinkFields, presentSourceFields, registerShutdownHook, removeListener, removeStepListener, resourceExists, retrieveSinkFields, retrieveSourceFields, setCascade, setCheckpoints, setFlowSkipStrategy, setFlowStepGraph, setFlowStepStrategy, setName, setSinks, setSources, setSpawnStrategy, setSubmitPriority, setTraps, start, stop, toString, updateSchemes, writeDOT, writeStepsDOT
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

deleteSinkOnInit

protected boolean deleteSinkOnInit
Field deleteSinkOnInit

Constructor Detail

MapReduceFlow

@ConstructorProperties(value="jobConf")
public MapReduceFlow(JobConf jobConf)
Constructor MapReduceFlow creates a new MapReduceFlow instance.

Parameters:
jobConf - of type JobConf

MapReduceFlow

@ConstructorProperties(value={"jobConf","deleteSinkOnInit"})
public MapReduceFlow(JobConf jobConf,
                                                boolean deleteSinkOnInit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.

Parameters:
jobConf - of type JobConf
deleteSinkOnInit - of type boolean

MapReduceFlow

@ConstructorProperties(value={"name","jobConf"})
public MapReduceFlow(String name,
                                                JobConf jobConf)
Constructor MapReduceFlow creates a new MapReduceFlow instance.

Parameters:
name - of type String
jobConf - of type JobConf

MapReduceFlow

@ConstructorProperties(value={"name","jobConf","deleteSinkOnInit"})
public MapReduceFlow(String name,
                                                JobConf jobConf,
                                                boolean deleteSinkOnInit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.

Parameters:
name - of type String
jobConf - of type JobConf
deleteSinkOnInit - of type boolean

MapReduceFlow

@ConstructorProperties(value={"name","jobConf","deleteSinkOnInit","stopJobsOnExit"})
public MapReduceFlow(String name,
                                                JobConf jobConf,
                                                boolean deleteSinkOnInit,
                                                boolean stopJobsOnExit)
Constructor MapReduceFlow creates a new MapReduceFlow instance.

Parameters:
name - of type String
jobConf - of type JobConf
deleteSinkOnInit - of type boolean
stopJobsOnExit - of type boolean
Method Detail

createSources

protected Map<String,cascading.tap.Tap> createSources(JobConf jobConf)

createSinks

protected Map<String,cascading.tap.Tap> createSinks(JobConf jobConf)

createTraps

protected Map<String,cascading.tap.Tap> createTraps(JobConf jobConf)


Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.