cascading.tap
Class MultiSourceTap<Child extends Tap,Config,Input>

java.lang.Object
  extended by cascading.tap.Tap<Config,Input,Void>
      extended by cascading.tap.SourceTap<Config,Input>
          extended by cascading.tap.MultiSourceTap<Child,Config,Input>
All Implemented Interfaces:
FlowElement, CompositeTap<Child>, cascading.util.Traceable, Serializable
Direct Known Subclasses:
GlobHfs

public class MultiSourceTap<Child extends Tap,Config,Input>
extends SourceTap<Config,Input>
implements CompositeTap<Child>

Class MultiSourceTap is used to tie multiple Tap instances into a single resource. Effectively this will allow multiple files to be concatenated into the requesting pipe assembly, if they all share the same Scheme instance.

Note that order is not maintained by virtue of the underlying model. If order is necessary, use a unique sequence key to span the resources, like a line number.

Note that if multiple input files have the same Scheme (like TextLine), they may not contain the same semi-structure internally. For example, one file might be an Apache log file, and another might be a Log4J log file. If each one should be parsed differently, then they must be handled by different pipe assembly branches.

See Also:
Serialized Form

Field Summary
protected  Child[] taps
           
 
Constructor Summary
  MultiSourceTap(Child... taps)
          Constructor MultiSourceTap creates a new MultiSourceTap instance.
protected MultiSourceTap(Scheme<Config,Input,?,?,?> scheme)
           
 
Method Summary
 boolean equals(Object object)
           
 Iterator<Child> getChildTaps()
           
 String getIdentifier()
          Method getIdentifier returns a String representing the resource this Tap instance represents.
 long getModifiedTime(Config conf)
          Returns the most current modified time.
 long getNumChildTaps()
           
 Scheme getScheme()
          Method getScheme returns the scheme of this Tap object.
protected  Child[] getTaps()
          Method getTaps returns the taps of this MultiTap object.
 int hashCode()
           
 boolean isReplace()
          Method isReplace indicates whether the resource represented by this instance should be deleted if it already exists when the Flow is started.
 TupleEntryIterator openForRead(FlowProcess<Config> flowProcess, Input input)
          Method openForRead opens the resource represented by this Tap instance for reading.
 boolean resourceExists(Config conf)
          Method resourceExists returns true if the path represented by this instance exists.
 void sourceConfInit(FlowProcess<Config> process, Config conf)
          Method sourceConfInit initializes this instance as a source.
 String toString()
           
 
Methods inherited from class cascading.tap.SourceTap
commitResource, createResource, deleteResource, getSinkFields, isSink, openForWrite, rollbackResource, sinkConfInit
 
Methods inherited from class cascading.tap.Tap
createResource, deleteResource, flowConfInit, getConfigDef, getFullIdentifier, getFullIdentifier, getModifiedTime, getSinkMode, getSourceFields, getStepConfigDef, getTrace, hasConfigDef, hasStepConfigDef, id, isEquivalentTo, isKeep, isSource, isTemporary, isUpdate, openForRead, openForWrite, outgoingScopeFor, presentSinkFields, presentSourceFields, resolveIncomingOperationArgumentFields, resolveIncomingOperationPassThroughFields, resourceExists, retrieveSinkFields, retrieveSourceFields, setScheme, taps
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

taps

protected Child extends Tap[] taps
Constructor Detail

MultiSourceTap

protected MultiSourceTap(Scheme<Config,Input,?,?,?> scheme)

MultiSourceTap

@ConstructorProperties(value="taps")
public MultiSourceTap(Child... taps)
Constructor MultiSourceTap creates a new MultiSourceTap instance.

Parameters:
taps - of type Tap...
Method Detail

getTaps

protected Child[] getTaps()
Method getTaps returns the taps of this MultiTap object.

Returns:
the taps (type Tap[]) of this MultiTap object.

getChildTaps

public Iterator<Child> getChildTaps()
Specified by:
getChildTaps in interface CompositeTap<Child extends Tap>

getNumChildTaps

public long getNumChildTaps()
Specified by:
getNumChildTaps in interface CompositeTap<Child extends Tap>

getIdentifier

public String getIdentifier()
Description copied from class: Tap
Method getIdentifier returns a String representing the resource this Tap instance represents.

Often, if the tap accesses a filesystem, the identifier is nothing more than the path to the file or directory. In other cases it may be a an URL or URI representing a connection string or remote resource.

Any two Tap instances having the same value for the identifier are considered equal.

Specified by:
getIdentifier in class Tap<Config,Input,Void>
Returns:
String

getScheme

public Scheme getScheme()
Description copied from class: Tap
Method getScheme returns the scheme of this Tap object.

Overrides:
getScheme in class Tap<Config,Input,Void>
Returns:
the scheme (type Scheme) of this Tap object.

isReplace

public boolean isReplace()
Description copied from class: Tap
Method isReplace indicates whether the resource represented by this instance should be deleted if it already exists when the Flow is started.

Overrides:
isReplace in class Tap<Config,Input,Void>
Returns:
boolean

sourceConfInit

public void sourceConfInit(FlowProcess<Config> process,
                           Config conf)
Description copied from class: Tap
Method sourceConfInit initializes this instance as a source.

This method maybe called more than once if this Tap instance is used outside the scope of a Flow instance or if it participates in multiple times in a given Flow or across different Flows in a Cascade.

In the context of a Flow, it will be called after FlowListener.onStarting(cascading.flow.Flow)

Note that no resources or services should be modified by this method.

Overrides:
sourceConfInit in class Tap<Config,Input,Void>
Parameters:
process - of type FlowProcess
conf - of type Config

resourceExists

public boolean resourceExists(Config conf)
                       throws IOException
Description copied from class: Tap
Method resourceExists returns true if the path represented by this instance exists.

Specified by:
resourceExists in class Tap<Config,Input,Void>
Parameters:
conf - of type Config
Returns:
true if the underlying resource already exists
Throws:
IOException - when the status cannot be determined

getModifiedTime

public long getModifiedTime(Config conf)
                     throws IOException
Returns the most current modified time.

Specified by:
getModifiedTime in class Tap<Config,Input,Void>
Parameters:
conf - of type Config
Returns:
The date this resource was last modified.
Throws:
IOException

openForRead

public TupleEntryIterator openForRead(FlowProcess<Config> flowProcess,
                                      Input input)
                               throws IOException
Description copied from class: Tap
Method openForRead opens the resource represented by this Tap instance for reading.

input value may be null, if so, sub-classes must inquire with the underlying Scheme via Scheme.sourceConfInit(cascading.flow.FlowProcess, Tap, Object) to get the proper input type and instantiate it before calling super.openForRead().

Note the returned iterator will return the same instance of TupleEntry on every call, thus a copy must be made of either the TupleEntry or the underlying Tuple instance if they are to be stored in a Collection.

Specified by:
openForRead in class Tap<Config,Input,Void>
Parameters:
flowProcess - of type FlowProcess
input - of type Input
Returns:
TupleEntryIterator
Throws:
IOException - when the resource cannot be opened

equals

public boolean equals(Object object)
Overrides:
equals in class Tap<Config,Input,Void>

hashCode

public int hashCode()
Overrides:
hashCode in class Tap<Config,Input,Void>

toString

public String toString()
Overrides:
toString in class Tap<Config,Input,Void>


Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.