cascading.flow.hadoop
Class HadoopFlow

java.lang.Object
  extended by cascading.flow.BaseFlow<JobConf>
      extended by cascading.flow.hadoop.HadoopFlow
All Implemented Interfaces:
cascading.flow.Flow<JobConf>, cascading.management.UnitOfWork<cascading.stats.FlowStats>
Direct Known Subclasses:
MapReduceFlow, ProcessFlow

public class HadoopFlow
extends cascading.flow.BaseFlow<JobConf>

Class HadoopFlow is the Apache Hadoop-specific implementation of a Flow.

HadoopFlow must be created through a HadoopFlowConnector instance.

If classpath paths are provided on the FlowDef, the Hadoop distributed cache mechanism is used to augment the remote classpath.

Any relative path elements are uploaded to HDFS, and the resulting HDFS URI is set on the JobConf. Note that all paths are added to the JobConf as "files", not archives, so they are not needlessly uncompressed on the cluster side.
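
The path handling described above can be pictured with a small, self-contained sketch. It is not Cascading's implementation: the class ClasspathSketch, the staging location REMOTE_BASE, and the rule "an element without a URI scheme counts as relative" are all assumptions made for illustration. No file is actually uploaded; the sketch only computes the URI that would end up on the configuration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical model of the classpath handling, NOT Cascading code.
final class ClasspathSketch {

    // Hypothetical staging directory on the cluster file system (an assumption).
    static final URI REMOTE_BASE =
        URI.create("hdfs://namenode:8020/tmp/flow-classpath/");

    // Returns the URI that would be registered for one classpath element.
    static URI resolve(String element) {
        URI uri = URI.create(element);

        // Assumption: an element that already carries a scheme is used unchanged.
        if (uri.getScheme() != null) {
            return uri;
        }

        // Assumption: anything else is treated as a relative, local element.
        // Only its file name is kept, placed under the staging directory.
        String fileName = element.substring(element.lastIndexOf('/') + 1);
        return REMOTE_BASE.resolve(fileName);
    }

    // Resolves every element in order, preserving the classpath ordering.
    static List<URI> resolveAll(List<String> elements) {
        List<URI> resolved = new ArrayList<>();
        for (String element : elements) {
            resolved.add(resolve(element));
        }
        return resolved;
    }
}
```

Under these assumptions, ClasspathSketch.resolve("lib/udfs.jar") maps the element into the staging directory, while an element that is already an hdfs:// URI is returned as given. The jar stays a single file in both cases, which matches the "files, not archives" note above.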

See Also:
HadoopFlowConnector

Nested Class Summary
 
Nested classes/interfaces inherited from class cascading.flow.BaseFlow
cascading.flow.BaseFlow.FlowHolder
 
Field Summary
 
Fields inherited from class cascading.flow.BaseFlow
flowStats, sinks, sources, stop, stopJobsOnExit, thread
 
Fields inherited from interface cascading.flow.Flow
CASCADING_FLOW_ID
 
Constructor Summary
protected HadoopFlow()
           
  HadoopFlow(cascading.flow.planner.PlatformInfo platformInfo, Map<Object,Object> properties, JobConf jobConf, cascading.flow.FlowDef flowDef)
           
protected HadoopFlow(cascading.flow.planner.PlatformInfo platformInfo, Map<Object,Object> properties, JobConf jobConf, String name)
           
 
Method Summary
 JobConf getConfig()
           
 Map<Object,Object> getConfigAsProperties()
           
 JobConf getConfigCopy()
           
 cascading.flow.FlowProcess<JobConf> getFlowProcess()
           
protected  int getMaxNumParallelSteps()
           
 String getProperty(String key)
          Method getProperty returns the value associated with the given key from the underlying properties system.
protected  void initConfig(Map<Object,Object> properties, JobConf parentConfig)
           
protected  void initFromProperties(Map<Object,Object> properties)
           
protected  void internalClean(boolean stop)
           
protected  void internalShutdown()
           
protected  void internalStart()
           
 boolean isPreserveTemporaryFiles()
          Method isPreserveTemporaryFiles returns false if temporary files will be cleaned when this Flow completes.
protected  JobConf newConfig(JobConf defaultConfig)
           
protected  void setConfigProperty(JobConf config, Object key, Object value)
           
 boolean stepsAreLocal()
           
 
Methods inherited from class cascading.flow.BaseFlow
addListener, addStepListener, areSinksStale, areSourcesNewer, cleanup, complete, createConfig, createFlowThread, deleteCheckpointsIfNotUpdate, deleteCheckpointsIfReplace, deleteSinks, deleteSinksIfNotUpdate, deleteSinksIfReplace, deleteTrapsIfNotUpdate, deleteTrapsIfReplace, fireOnCompleted, fireOnStarting, fireOnStopping, fireOnThrowable, getCascadeID, getCascadingServices, getCheckpointNames, getCheckpoints, getCheckpointsCollection, getClassPath, getFieldsFor, getFlowSession, getFlowSkipStrategy, getFlowStats, getFlowSteps, getFlowStepStrategy, getHolder, getID, getName, getPlatformInfo, getRunID, getSink, getSink, getSinkModified, getSinkNames, getSinks, getSinksCollection, getSource, getSourceNames, getSources, getSourcesCollection, getSpawnStrategy, getStats, getSubmitPriority, getTags, getTrapNames, getTraps, getTrapsCollection, handleExecutorShutdown, hasListeners, hasStepListeners, initialize, initializeNewJobsMap, initSteps, internalStopAllJobs, isSkipFlow, isStopJobsOnExit, logInfo, openSink, openSink, openSource, openSource, openTapForRead, openTapForWrite, openTrap, openTrap, prepare, presentSinkFields, presentSourceFields, registerShutdownHook, removeListener, removeStepListener, resourceExists, retrieveSinkFields, retrieveSourceFields, setCascade, setCheckpoints, setFlowSkipStrategy, setFlowStepGraph, setFlowStepStrategy, setName, setSinks, setSources, setSpawnStrategy, setSubmitPriority, setTraps, start, stop, toString, updateSchemes, writeDOT, writeStepsDOT
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

HadoopFlow

protected HadoopFlow()

HadoopFlow

protected HadoopFlow(cascading.flow.planner.PlatformInfo platformInfo,
                     Map<Object,Object> properties,
                     JobConf jobConf,
                     String name)

HadoopFlow

public HadoopFlow(cascading.flow.planner.PlatformInfo platformInfo,
                  Map<Object,Object> properties,
                  JobConf jobConf,
                  cascading.flow.FlowDef flowDef)

Method Detail

initFromProperties

protected void initFromProperties(Map<Object,Object> properties)
Overrides:
initFromProperties in class cascading.flow.BaseFlow<JobConf>

initConfig

protected void initConfig(Map<Object,Object> properties,
                          JobConf parentConfig)
Specified by:
initConfig in class cascading.flow.BaseFlow<JobConf>

setConfigProperty

protected void setConfigProperty(JobConf config,
                                 Object key,
                                 Object value)
Specified by:
setConfigProperty in class cascading.flow.BaseFlow<JobConf>

newConfig

protected JobConf newConfig(JobConf defaultConfig)
Specified by:
newConfig in class cascading.flow.BaseFlow<JobConf>

getConfig

public JobConf getConfig()

getConfigCopy

public JobConf getConfigCopy()

getConfigAsProperties

public Map<Object,Object> getConfigAsProperties()

getProperty

public String getProperty(String key)
Method getProperty returns the value associated with the given key from the underlying properties system.

Parameters:
key - of type String
Returns:
String

getFlowProcess

public cascading.flow.FlowProcess<JobConf> getFlowProcess()

isPreserveTemporaryFiles

public boolean isPreserveTemporaryFiles()
Method isPreserveTemporaryFiles returns false if temporary files will be cleaned when this Flow completes.

Returns:
true if temporary files are preserved, false if they will be cleaned when this Flow completes.

internalStart

protected void internalStart()
Specified by:
internalStart in class cascading.flow.BaseFlow<JobConf>

stepsAreLocal

public boolean stepsAreLocal()

internalClean

protected void internalClean(boolean stop)
Specified by:
internalClean in class cascading.flow.BaseFlow<JobConf>

internalShutdown

protected void internalShutdown()
Specified by:
internalShutdown in class cascading.flow.BaseFlow<JobConf>

getMaxNumParallelSteps

protected int getMaxNumParallelSteps()
Specified by:
getMaxNumParallelSteps in class cascading.flow.BaseFlow<JobConf>


Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.