java.lang.Object
  cascading.tap.Tap<Config,Input,Void>
    cascading.tap.SourceTap<Config,Input>
      cascading.tap.MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>
        cascading.tap.hadoop.GlobHfs
public class GlobHfs
Class GlobHfs is a type of MultiSourceTap that accepts Hadoop-style 'file globbing' expressions, so
that multiple files matching the given pattern may be used as the input sources for a given Flow.
See FileSystem.globStatus(org.apache.hadoop.fs.Path) for details on the globbing syntax. In short,
it is similar to standard regular expressions, except that alternation is written {foo,bar} instead of (foo|bar).
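The difference in alternation syntax can be illustrated with a minimal sketch. It assumes the JDK's own java.nio glob matcher as a stand-in, since it accepts the same {foo,bar} form; it is not Hadoop's FileSystem.globStatus implementation, and other parts of the two glob dialects may differ. The class name GlobAlternationDemo and the file names are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

public class GlobAlternationDemo {

    // Matches a single file name against a glob using the JDK matcher
    // (an assumption for illustration; Hadoop resolves globs itself).
    public static boolean globMatches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    // Matches the whole file name against a regular expression.
    public static boolean regexMatches(String regex, String fileName) {
        return Pattern.matches(regex, fileName);
    }

    public static void main(String[] args) {
        String glob = "part-{foo,bar}.txt";      // glob alternation
        String regex = "part-(foo|bar)\\.txt";   // equivalent regex alternation

        for (String name : new String[] {"part-foo.txt", "part-bar.txt", "part-baz.txt"}) {
            System.out.println(name
                + " glob=" + globMatches(glob, name)
                + " regex=" + regexMatches(regex, name));
        }
    }
}
```

Both forms accept part-foo.txt and part-bar.txt and reject part-baz.txt. Only the glob form is suitable for the pathPattern argument of GlobHfs.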
Note that a Flow sourcing from GlobHfs is not currently compatible with the Cascade scheduler.
GlobHfs expects the files and paths to already exist, because the wildcards must be resolved into
concrete values before the scheduler can order the Flows properly.
Note that globbing can match files or directories. It may consume fewer resources to match directories and let
Hadoop include all sub-files immediately contained in each directory, instead of enumerating every individual file.
Ending the glob path with a / should match only directories.
See Also: Hfs, MultiSourceTap, FileSystem, Serialized Form
Field Summary |
---|
Fields inherited from class cascading.tap.MultiSourceTap |
---|
taps |
Constructor Summary | |
---|---|
GlobHfs(Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,?,?,?> scheme, String pathPattern) | Constructor GlobHfs creates a new GlobHfs instance. |
GlobHfs(Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,?,?,?> scheme, String pathPattern, org.apache.hadoop.fs.PathFilter pathFilter) | Constructor GlobHfs creates a new GlobHfs instance. |
Method Summary | |
---|---|
boolean | equals(Object object) |
String | getIdentifier() Method getIdentifier returns a String representing the resource this Tap instance represents. |
protected Hfs[] | getTaps() Method getTaps returns the taps of this MultiTap object. |
int | hashCode() |
void | sourceConfInit(FlowProcess<org.apache.hadoop.mapred.JobConf> process, org.apache.hadoop.mapred.JobConf conf) Method sourceConfInit initializes this instance as a source. |
String | toString() |
Methods inherited from class cascading.tap.MultiSourceTap |
---|
getChildTaps, getModifiedTime, getNumChildTaps, getScheme, isReplace, openForRead, resourceExists |
Methods inherited from class cascading.tap.SourceTap |
---|
commitResource, createResource, deleteResource, getSinkFields, isSink, openForWrite, rollbackResource, sinkConfInit |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
@ConstructorProperties(value={"scheme","pathPattern"})
public GlobHfs(Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,?,?,?> scheme, String pathPattern)
Parameters:
scheme - of type Scheme
pathPattern - of type String
@ConstructorProperties(value={"scheme","pathPattern","pathFilter"})
public GlobHfs(Scheme<org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader,?,?,?> scheme, String pathPattern, org.apache.hadoop.fs.PathFilter pathFilter)
Parameters:
scheme - of type Scheme
pathPattern - of type String
pathFilter - of type PathFilter

Method Detail |
---|
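public String getIdentifier()
Description copied from class: Tap
Method getIdentifier returns a String representing the resource this Tap instance represents.
Overrides: getIdentifier in class MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>
protected Hfs[] getTaps()
Description copied from class: MultiSourceTap
Method getTaps returns the taps of this MultiTap object.
Overrides: getTaps in class MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>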
public void sourceConfInit(FlowProcess<org.apache.hadoop.mapred.JobConf> process, org.apache.hadoop.mapred.JobConf conf)
Description copied from class: Tap
Method sourceConfInit initializes this instance as a source.
This method may be called more than once if this Tap instance is used outside the scope of a Flow
instance, or if it participates multiple times in a given Flow or across different Flows in
a Cascade.
In the context of a Flow, it will be called after FlowListener.onStarting(cascading.flow.Flow).
Note that no resources or services should be modified by this method.
Overrides: sourceConfInit in class MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>
Parameters:
process - of type FlowProcess
conf - of type Config
public boolean equals(Object object)
Overrides: equals in class MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>
public int hashCode()
Overrides: hashCode in class MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>
public String toString()
Overrides: toString in class MultiSourceTap<Hfs,org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.RecordReader>