cascading.flow
Class MultiMapReducePlanner

java.lang.Object
  extended by cascading.flow.FlowPlanner
      extended by cascading.flow.MultiMapReducePlanner

public class MultiMapReducePlanner
extends FlowPlanner

Class MultiMapReducePlanner is the core Hadoop MapReduce planner.

Notes:

Custom JobConf properties
To pass a custom JobConf instance to this planner, call setJobConf(java.util.Map, org.apache.hadoop.mapred.JobConf) with the properties map before constructing a new FlowConnector from that map.

A better practice is to set Hadoop properties directly on the properties map handed to the FlowConnector. All values in the map are passed to a new default JobConf instance and used as defaults for all resulting Flow instances.

For example, properties.put("mapred.child.java.opts","-Xmx512m"); tells Hadoop to spawn all child JVMs with a maximum heap of 512 MB.
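The recommended practice can be sketched as follows. This is a minimal illustration that uses only java.util types. The class name HadoopDefaultsSketch and the helper buildProperties are hypothetical, and the final step of handing the map to a FlowConnector appears only as a comment because it assumes Cascading is on the classpath.

```java
import java.util.Map;
import java.util.Properties;

class HadoopDefaultsSketch {
    // Builds the properties map that would later be handed to a FlowConnector.
    static Map<Object, Object> buildProperties() {
        // java.util.Properties implements Map<Object,Object>, so it matches the planner's signatures.
        Properties properties = new Properties();
        // A Map exposes put(); on a Properties object, setProperty() would work as well.
        properties.put("mapred.child.java.opts", "-Xmx512m");
        return properties;
    }

    public static void main(String[] args) {
        Map<Object, Object> properties = buildProperties();
        System.out.println(properties.get("mapred.child.java.opts"));
        // Assumed next step, not executed in this sketch:
        // FlowConnector connector = new FlowConnector(properties);
    }
}
```

Any other Hadoop property can be added to the same map in the same way before the connector is created.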

Properties


Field Summary
 
Fields inherited from class cascading.flow.FlowPlanner
assertionLevel, debugLevel, properties
 
Constructor Summary
protected MultiMapReducePlanner(Map<Object,Object> properties)
          Constructor MultiMapReducePlanner creates a new MultiMapReducePlanner instance.
 
Method Summary
 Flow buildFlow(String flowName, Pipe[] pipes, Map<String,Tap> sources, Map<String,Tap> sinks, Map<String,Tap> traps)
          Method buildFlow renders the actual Flow instance.
static JobConf getJobConf(Map<Object,Object> properties)
          Method getJobConf returns a stored JobConf instance, if any.
static boolean getNormalizeHeterogeneousSources(Map<Object,Object> properties)
          Method getNormalizeHeterogeneousSources returns whether this planner will normalize heterogeneous input sources.
static void setJobConf(Map<Object,Object> properties, JobConf jobConf)
          Method setJobConf adds the given JobConf object to the given properties object.
static void setNormalizeHeterogeneousSources(Map<Object,Object> properties, boolean doNormalize)
          Method setNormalizeHeterogeneousSources adds the given doNormalize boolean to the given properties object.
 
Methods inherited from class cascading.flow.FlowPlanner
createElementGraph, failOnGroupEverySplit, failOnLoneGroupAssertion, failOnMissingGroup, failOnMisusedBuffer, verifyAssembly, verifyPipeAssemblyEndPoints, verifySourceNotSinks, verifyTaps, verifyTraps
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MultiMapReducePlanner

protected MultiMapReducePlanner(Map<Object,Object> properties)
Constructor MultiMapReducePlanner creates a new MultiMapReducePlanner instance.

Parameters:
properties - of type Map
Method Detail

setJobConf

public static void setJobConf(Map<Object,Object> properties,
                              JobConf jobConf)
Method setJobConf adds the given JobConf object to the given properties object. Use this method to pass custom default Hadoop JobConf properties to Hadoop.

Parameters:
properties - of type Map
jobConf - of type JobConf

getJobConf

public static JobConf getJobConf(Map<Object,Object> properties)
Method getJobConf returns a stored JobConf instance, if any.

Parameters:
properties - of type Map
Returns:
a JobConf instance

setNormalizeHeterogeneousSources

public static void setNormalizeHeterogeneousSources(Map<Object,Object> properties,
                                                    boolean doNormalize)
Method setNormalizeHeterogeneousSources adds the given doNormalize boolean to the given properties object. Use this method if additional jobs should be planned to handle incompatible InputFormat classes.

Normalization is off by default and should only be enabled by advanced users. Enabling it typically decreases application performance.

Parameters:
properties - of type Map
doNormalize - of type boolean

getNormalizeHeterogeneousSources

public static boolean getNormalizeHeterogeneousSources(Map<Object,Object> properties)
Method getNormalizeHeterogeneousSources returns whether this planner will normalize heterogeneous input sources.

Parameters:
properties - of type Map
Returns:
a boolean
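
The static setter and getter pair above follows a common pattern: a flag is stored in the shared properties map and read back with a default. The sketch below illustrates that pattern only and is not the planner's implementation. The class NormalizeFlagSketch, its method names, and the key "example.planner.normalizesources" are all hypothetical, since the real property key is not stated here.

```java
import java.util.HashMap;
import java.util.Map;

class NormalizeFlagSketch {
    // Hypothetical key used only for this sketch; the planner's real key is not given in this page.
    static final String KEY = "example.planner.normalizesources";

    // Stores the flag in the properties map as a string value.
    static void setNormalize(Map<Object, Object> properties, boolean doNormalize) {
        properties.put(KEY, Boolean.toString(doNormalize));
    }

    // Reads the flag back; an absent entry reads as false, mirroring "off by default".
    static boolean getNormalize(Map<Object, Object> properties) {
        Object value = properties.get(KEY);
        return value != null && Boolean.parseBoolean(value.toString());
    }

    public static void main(String[] args) {
        Map<Object, Object> properties = new HashMap<Object, Object>();
        System.out.println(getNormalize(properties)); // prints false
        setNormalize(properties, true);
        System.out.println(getNormalize(properties)); // prints true
    }
}
```

With the real class, the equivalent calls would be setNormalizeHeterogeneousSources(properties, true) followed by getNormalizeHeterogeneousSources(properties), made before the properties map is handed to a FlowConnector.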

buildFlow

public Flow buildFlow(String flowName,
                      Pipe[] pipes,
                      Map<String,Tap> sources,
                      Map<String,Tap> sinks,
                      Map<String,Tap> traps)
Method buildFlow renders the actual Flow instance.

Parameters:
flowName - of type String
pipes - of type Pipe[]
sources - of type Map
sinks - of type Map
traps - of type Map
Returns:
Flow


Copyright © 2007-2010 Concurrent, Inc. All Rights Reserved.