Class GlobHfs

  extended by cascading.tap.Tap<Config,Input,Void>
      extended by cascading.tap.SourceTap<Config,Input>
          extended by cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>
              extended by cascading.tap.hadoop.GlobHfs
All Implemented Interfaces:
cascading.flow.FlowElement, cascading.tap.CompositeTap<Hfs>, Serializable

public class GlobHfs
extends cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>

Class GlobHfs is a type of MultiSourceTap that accepts Hadoop-style 'file globbing' expressions, so that multiple files matching the given pattern may be used as the input sources for a given Flow.

See FileSystem.globStatus(org.apache.hadoop.fs.Path) for details on the globbing syntax. In short, it is similar to standard regular expressions, except that alternation is written {foo,bar} instead of (foo|bar).
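To make the brace alternation concrete, here is a minimal sketch that needs no Hadoop installation. It uses the JDK's own java.nio glob matcher as a stand-in, which also writes alternation as {foo,bar}. It is not Hadoop's globStatus implementation, and the two may differ in details, so treat it only as an illustration of the syntax. The class name GlobSyntaxDemo and the logs/... paths are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobSyntaxDemo {

    /** Returns true if the path matches the glob under java.nio's glob rules (not Hadoop's). */
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // {01,02} plays the role that (01|02) would play in a regular expression.
        String glob = "logs/2013-{01,02}/*.log";

        System.out.println(matches(glob, "logs/2013-01/part-00000.log")); // true
        System.out.println(matches(glob, "logs/2013-02/part-00001.log")); // true
        System.out.println(matches(glob, "logs/2013-03/part-00000.log")); // false
    }
}
```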

Note that a Flow sourcing from GlobHfs is not currently compatible with the Cascade scheduler. The scheduler can only order the Flows properly once the wildcards have been resolved into concrete values, and GlobHfs can only resolve them if the matching files and paths already exist.

Note that globbing can match files or directories. It may consume fewer resources to match directories and let Hadoop include all files immediately contained in each directory, instead of enumerating every individual file. Ending the glob path with a / should match only directories.
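As a rough illustration of the two styles described above, the following sketch builds one GlobHfs that matches directories and one that matches individual files through an extra PathFilter. It assumes Cascading 2.x and Hadoop are on the classpath. The class name GlobHfsSketch, the logs/... paths, the field names and the VisibleFilesOnly filter are hypothetical and not part of the library.

```java
import java.io.Serializable;

import cascading.scheme.hadoop.TextLine;
import cascading.tap.hadoop.GlobHfs;
import cascading.tuple.Fields;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class GlobHfsSketch {

    /**
     * Hypothetical filter that skips names starting with "_" or ".".
     * Assumption: it is made Serializable because GlobHfs itself is
     * Serializable and may hold a reference to the filter.
     */
    static class VisibleFilesOnly implements PathFilter, Serializable {
        @Override
        public boolean accept(Path path) {
            String name = path.getName();
            return !name.startsWith("_") && !name.startsWith(".");
        }
    }

    /** Directory-level glob: the trailing "/" is meant to match only directories. */
    static GlobHfs monthDirectories() {
        TextLine scheme = new TextLine(new Fields("offset", "line"));
        return new GlobHfs(scheme, "logs/2013-{01,02}/");
    }

    /** File-level glob, narrowed further by the hypothetical filter above. */
    static GlobHfs monthFiles() {
        TextLine scheme = new TextLine(new Fields("offset", "line"));
        return new GlobHfs(scheme, "logs/2013-{01,02}/*.log", new VisibleFilesOnly());
    }
}
```

Either tap can then be passed as a source wherever a Flow expects one. As noted above, the matching paths should exist before the pattern is resolved.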

See Also:
Hfs, MultiSourceTap, FileSystem, Serialized Form

Field Summary
Fields inherited from class cascading.tap.MultiSourceTap
Constructor Summary
GlobHfs(cascading.scheme.Scheme<JobConf,RecordReader,?,?,?> scheme, String pathPattern)
          Constructor GlobHfs creates a new GlobHfs instance.
GlobHfs(cascading.scheme.Scheme<JobConf,RecordReader,?,?,?> scheme, String pathPattern, PathFilter pathFilter)
          Constructor GlobHfs creates a new GlobHfs instance.
Method Summary
 boolean equals(Object object)
 String getIdentifier()
protected  Hfs[] getTaps()
 int hashCode()
 void sourceConfInit(cascading.flow.FlowProcess<JobConf> process, JobConf conf)
 String toString()
Methods inherited from class cascading.tap.MultiSourceTap
getChildTaps, getModifiedTime, getNumChildTaps, getScheme, isReplace, openForRead, resourceExists
Methods inherited from class cascading.tap.SourceTap
commitResource, createResource, deleteResource, getSinkFields, isSink, openForWrite, rollbackResource, sinkConfInit
Methods inherited from class cascading.tap.Tap
createResource, deleteResource, flowConfInit, getConfigDef, getFullIdentifier, getFullIdentifier, getModifiedTime, getSinkMode, getSourceFields, getStepConfigDef, getTrace, hasConfigDef, hasStepConfigDef, id, isEquivalentTo, isKeep, isSource, isTemporary, isUpdate, openForRead, openForWrite, outgoingScopeFor, presentSinkFields, presentSourceFields, resolveIncomingOperationArgumentFields, resolveIncomingOperationPassThroughFields, resourceExists, retrieveSinkFields, retrieveSourceFields, setScheme, taps
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Constructor Detail


public GlobHfs(cascading.scheme.Scheme<JobConf,RecordReader,?,?,?> scheme,
               String pathPattern)
Constructor GlobHfs creates a new GlobHfs instance.

Parameters:
scheme - of type Scheme
pathPattern - of type String; the glob expression to resolve


public GlobHfs(cascading.scheme.Scheme<JobConf,RecordReader,?,?,?> scheme,
               String pathPattern,
               PathFilter pathFilter)
Constructor GlobHfs creates a new GlobHfs instance.

Parameters:
scheme - of type Scheme
pathPattern - of type String; the glob expression to resolve
pathFilter - of type PathFilter
Method Detail


public String getIdentifier()
getIdentifier in class cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>


protected Hfs[] getTaps()
getTaps in class cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>


public void sourceConfInit(cascading.flow.FlowProcess<JobConf> process,
                           JobConf conf)
sourceConfInit in class cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>


public boolean equals(Object object)
equals in class cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>


public int hashCode()
hashCode in class cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>


public String toString()
toString in class cascading.tap.MultiSourceTap<Hfs,JobConf,RecordReader>

Copyright © 2007-2013 Concurrent, Inc. All Rights Reserved.