public abstract class FlowConnector extends Object
FlowDef
class for a fluent way to define a new Flow.
Use the FlowConnector to link source and sink Tap
instances with an assembly of Pipe
instances into
an executable Flow
.
FlowConnector invokes a planner for the target execution environment.
For executing Flows in local memory against local files, see LocalFlowConnector
.
For Apache Hadoop, see the HadoopFlowConnector
.
Or if you have a pre-existing custom Hadoop job to execute, see MapReduceFlow
, which
doesn't require a planner.
Note that all connect
methods take a single tail
or an array of tail
Pipe instances. "tail"
refers to the last connected Pipe instances in a pipe-assembly. Pipe-assemblies are graphs of object with "heads"
and "tails". From a given "tail", all connected heads can be found, but not the reverse. So "tails" must be
supplied by the user.
The FlowConnector and the underlying execution framework (Hadoop or local mode) can be configured via a
Map
or Properties
instance given to the constructor.
This properties map must be populated before constructing a FlowConnector instance. Many planner specific
properties can be set through the FlowConnectorProps
fluent interface.
Some planners have required properties. Hadoop expects AppProps.setApplicationJarPath(java.util.Map, String)
or
AppProps.setApplicationJarClass(java.util.Map, Class)
to be set.
Any properties set and passed through the FlowConnector constructor will be global to all Flow instances created through
the that FlowConnector instance. Some properties are on the FlowDef
and would only be applicable to the
resulting Flow instance.
These properties are used to influence the current planner and are also passed down to the
execution framework to override any default values. For example when using the Hadoop planner, the number of reducers
or mappers can be set by using platform specific properties.
Custom operations (Functions, Filter, etc) may also retrieve these property values at runtime through calls to
FlowProcess.getProperty(String)
or FlowProcess.getStringProperty(String)
.
Most applications will need to call AppProps.setApplicationJarClass(java.util.Map, Class)
or
AppProps.setApplicationJarPath(java.util.Map, String)
so that
the correct application jar file is passed through to all child processes. The Class or path must reference
the custom application jar, not a Cascading library class or jar. The easiest thing to do is give setApplicationJarClass
the Class with your static main function and let Cascading figure out which jar to use.
Note that MapLocalFlowConnector
,
HadoopFlowConnector
,
cascading.flow.hadoop2.Hadoop2MR1FlowConnector
Modifier and Type | Field and Description |
---|---|
protected Map<Object,Object> |
properties
Field properties
|
Modifier | Constructor and Description |
---|---|
protected |
FlowConnector() |
protected |
FlowConnector(Map<Object,Object> properties) |
protected |
FlowConnector(Map<Object,Object> properties,
RuleRegistrySet ruleRegistrySet) |
protected |
FlowConnector(RuleRegistrySet ruleRegistrySet) |
Modifier and Type | Method and Description |
---|---|
Flow |
connect(FlowDef flowDef) |
Flow |
connect(Map<String,Tap> sources,
Map<String,Tap> sinks,
Pipe... tails)
Method connect links the named sources and sinks to the given pipe assembly.
|
Flow |
connect(Map<String,Tap> sources,
Tap sink,
Pipe tail)
Method connect links the named source Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(String name,
Map<String,Tap> sources,
Map<String,Tap> sinks,
Map<String,Tap> traps,
Pipe... tails)
Method connect links the named sources, sinks and traps to the given pipe assembly.
|
Flow |
connect(String name,
Map<String,Tap> sources,
Map<String,Tap> sinks,
Pipe... tails)
Method connect links the named sources and sinks to the given pipe assembly.
|
Flow |
connect(String name,
Map<String,Tap> sources,
Tap sink,
Map<String,Tap> traps,
Pipe tail)
Method connect links the named source and trap Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(String name,
Map<String,Tap> sources,
Tap sink,
Pipe tail)
Method connect links the named source Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(String name,
Tap source,
Map<String,Tap> sinks,
Collection<Pipe> tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(String name,
Tap source,
Map<String,Tap> sinks,
Pipe... tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(String name,
Tap source,
Tap sink,
Map<String,Tap> traps,
Pipe tail)
Method connect links the named trap Taps, source and sink Tap to the given pipe assembly.
|
Flow |
connect(String name,
Tap source,
Tap sink,
Pipe tail)
Method connect links the given source and sink Taps to the given pipe assembly.
|
Flow |
connect(String name,
Tap source,
Tap sink,
Tap trap,
Pipe tail)
Method connect links the given source, sink, and trap Taps to the given pipe assembly.
|
Flow |
connect(Tap source,
Map<String,Tap> sinks,
Collection<Pipe> tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(Tap source,
Map<String,Tap> sinks,
Pipe... tails)
Method connect links the named source Taps and sink Tap to the given pipe assembly.
|
Flow |
connect(Tap source,
Tap sink,
Pipe tail)
Method connect links the given source and sink Taps to the given pipe assembly.
|
protected abstract RuleRegistrySet |
createDefaultRuleRegistrySet() |
protected abstract FlowPlanner |
createFlowPlanner() |
protected abstract Class<? extends Scheme> |
getDefaultIntermediateSchemeClass() |
Class |
getIntermediateSchemeClass(Map<Object,Object> properties)
Method getIntermediateSchemeClass is used for debugging.
|
PlatformInfo |
getPlatformInfo()
Method getPlatformInfo returns an instance of
PlatformInfo for the underlying platform. |
Map<Object,Object> |
getProperties()
Method getProperties returns the properties of this FlowConnector object.
|
RuleRegistrySet |
getRuleRegistrySet()
Returns the configured RuleRegistry, or the default for this platform.
|
protected Map<Object,Object> properties
protected FlowConnector()
protected FlowConnector(RuleRegistrySet ruleRegistrySet)
protected FlowConnector(Map<Object,Object> properties)
protected FlowConnector(Map<Object,Object> properties, RuleRegistrySet ruleRegistrySet)
public Class getIntermediateSchemeClass(Map<Object,Object> properties)
properties
- of type Mapprotected abstract Class<? extends Scheme> getDefaultIntermediateSchemeClass()
public Map<Object,Object> getProperties()
Properties
instance was passed to the constructor, the returned object will be a flattened
Map
instance.public Flow connect(Tap source, Tap sink, Pipe tail)
source
- source Tap to bind to the head of the given tail Pipesink
- sink Tap to bind to the given tail Pipetail
- tail end of a pipe assemblypublic Flow connect(String name, Tap source, Tap sink, Pipe tail)
name
- name to give the resulting Flowsource
- source Tap to bind to the head of the given tail Pipesink
- sink Tap to bind to the given tail Pipetail
- tail end of a pipe assemblypublic Flow connect(String name, Tap source, Tap sink, Tap trap, Pipe tail)
name
- name to give the resulting Flowsource
- source Tap to bind to the head of the given tail Pipesink
- sink Tap to bind to the given tail Pipetrap
- trap Tap to sink all failed Tuples intotail
- tail end of a pipe assemblypublic Flow connect(Map<String,Tap> sources, Tap sink, Pipe tail)
sources
- all head names and source Taps to bind to the heads of the given tail Pipesink
- sink Tap to bind to the given tail Pipetail
- tail end of a pipe assemblypublic Flow connect(String name, Map<String,Tap> sources, Tap sink, Pipe tail)
name
- name to give the resulting Flowsources
- all head names and source Taps to bind to the heads of the given tail Pipesink
- sink Tap to bind to the given tail Pipetail
- tail end of a pipe assemblypublic Flow connect(String name, Map<String,Tap> sources, Tap sink, Map<String,Tap> traps, Pipe tail)
name
- name to give the resulting Flowsources
- all head names and source Taps to bind to the heads of the given tail Pipesink
- sink Tap to bind to the given tail Pipetraps
- all pipe names and trap Taps to sink all failed Tuples intotail
- tail end of a pipe assemblypublic Flow connect(String name, Tap source, Tap sink, Map<String,Tap> traps, Pipe tail)
name
- name to give the resulting Flowsource
- source Tap to bind to the head of the given tail Pipesink
- sink Tap to bind to the given tail Pipetraps
- all pipe names and trap Taps to sink all failed Tuples intotail
- tail end of a pipe assemblypublic Flow connect(Tap source, Map<String,Tap> sinks, Collection<Pipe> tails)
source
- source Tap to bind to the head of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestails
- all tail ends of a pipe assemblypublic Flow connect(String name, Tap source, Map<String,Tap> sinks, Collection<Pipe> tails)
name
- name to give the resulting Flowsource
- source Tap to bind to the head of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestails
- all tail ends of a pipe assemblypublic Flow connect(Tap source, Map<String,Tap> sinks, Pipe... tails)
source
- source Tap to bind to the head of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestails
- all tail ends of a pipe assemblypublic Flow connect(String name, Tap source, Map<String,Tap> sinks, Pipe... tails)
name
- name to give the resulting Flowsource
- source Tap to bind to the head of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestails
- all tail ends of a pipe assemblypublic Flow connect(Map<String,Tap> sources, Map<String,Tap> sinks, Pipe... tails)
sources
- all head names and source Taps to bind to the heads of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestails
- all tail ends of a pipe assemblypublic Flow connect(String name, Map<String,Tap> sources, Map<String,Tap> sinks, Pipe... tails)
name
- name to give the resulting Flowsources
- all head names and source Taps to bind to the heads of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestails
- all tail ends of a pipe assemblypublic Flow connect(String name, Map<String,Tap> sources, Map<String,Tap> sinks, Map<String,Tap> traps, Pipe... tails)
name
- name to give the resulting Flowsources
- all head names and source Taps to bind to the heads of the given tail Pipessinks
- all tail names and sink Taps to bind to the given tail Pipestraps
- all pipe names and trap Taps to sink all failed Tuples intotails
- all tail ends of a pipe assemblyprotected abstract FlowPlanner createFlowPlanner()
public RuleRegistrySet getRuleRegistrySet()
connect(FlowDef)
.protected abstract RuleRegistrySet createDefaultRuleRegistrySet()
public PlatformInfo getPlatformInfo()
PlatformInfo
for the underlying platform.Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.