cascading.tap
Class TemplateTap

java.lang.Object
  extended by cascading.tap.Tap
      extended by cascading.tap.SinkTap
          extended by cascading.tap.TemplateTap
All Implemented Interfaces:
FlowElement, Serializable

public class TemplateTap
extends SinkTap

Class TemplateTap can be used to write tuple streams out to subdirectories based on the values in the Tuple instance.

The constructor takes an Hfs Tap and a Formatter format syntax String. This allows Tuple values at given positions to be used as directory names. Note that Hadoop can only sink to directories, and all files in those directories are "part-xxxxx" files.
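As a rough illustration of how a Formatter format String can turn Tuple values into a directory name, the sketch below uses only the JDK. It does not call Cascading; the class PathTemplateSketch, the method resolve, and the sample paths and values are hypothetical names chosen for this example, and the way TemplateTap joins the parent path and the formatted name internally is an assumption.

```java
final class PathTemplateSketch {
    // Hypothetical helper: formats tuple values with java.util.Formatter syntax
    // (String.format uses a Formatter internally) and appends the result to the
    // parent path. Joining with "/" is an assumption made for this sketch.
    static String resolve(String parentPath, String pathTemplate, Object... tupleValues) {
        String subDirectory = String.format(pathTemplate, tupleValues);
        return parentPath + "/" + subDirectory;
    }

    public static void main(String[] args) {
        // Values at positions 0 and 1 of a tuple become one directory name...
        System.out.println(resolve("output/sales", "%s-%s", "2010", "us"));
        // ...or nested directories when the template contains a separator.
        System.out.println(resolve("output/sales", "%s/%s", "2010", "us"));
    }
}
```

In a real job, each resolved directory would then contain the usual "part-xxxxx" files written by Hadoop.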

openTapsThreshold limits the number of files that may be open for output at once. This value defaults to 300 files. Each time the threshold is exceeded, the least recently used 10% of the open files are closed.
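The eviction policy described above can be sketched with an access-ordered LinkedHashMap. This is only a model of the stated behaviour, not TemplateTap's actual implementation: the class OpenTapsSketch, its methods, the use of StringBuilder as a stand-in for an open file, and the reading of "10%" as 10% of the threshold are all assumptions.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OpenTapsSketch {
    private final int openTapsThreshold;
    // accessOrder = true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<String, StringBuilder> open =
        new LinkedHashMap<>(16, 0.75f, true);
    // Records which paths were closed, in the order they were evicted.
    final List<String> closed = new ArrayList<>();

    OpenTapsSketch(int openTapsThreshold) {
        this.openTapsThreshold = openTapsThreshold;
    }

    void write(String path, String line) {
        StringBuilder out = open.get(path); // a hit marks the path as recently used
        if (out == null) {
            out = new StringBuilder();      // stand-in for opening a new file
            open.put(path, out);
            if (open.size() > openTapsThreshold) {
                purge();
            }
        }
        out.append(line).append('\n');
    }

    // Closes 10% of the threshold (at least one), starting with the least recently used.
    private void purge() {
        int toClose = Math.max(1, openTapsThreshold / 10);
        Iterator<Map.Entry<String, StringBuilder>> it = open.entrySet().iterator();
        for (int i = 0; i < toClose && it.hasNext(); i++) {
            closed.add(it.next().getKey());
            it.remove();
        }
    }

    int openCount() {
        return open.size();
    }
}
```

With the default threshold of 300, this model would close 30 files each time a 301st path is opened.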

See Also:
Serialized Form

Nested Class Summary
static class TemplateTap.TemplateScheme
           
 
Constructor Summary
TemplateTap(Hfs parent, String pathTemplate)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, Fields pathFields)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, Fields pathFields, int openTapsThreshold)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, Fields pathFields, SinkMode sinkMode)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, Fields pathFields, SinkMode sinkMode, boolean keepParentOnDelete)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, Fields pathFields, SinkMode sinkMode, boolean keepParentOnDelete, int openTapsThreshold)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, int openTapsThreshold)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, SinkMode sinkMode)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, SinkMode sinkMode, boolean keepParentOnDelete)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
TemplateTap(Hfs parent, String pathTemplate, SinkMode sinkMode, boolean keepParentOnDelete, int openTapsThreshold)
          Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
 
Method Summary
 boolean deletePath(JobConf conf)
          Method deletePath deletes the resource represented by this instance.
 boolean equals(Object object)
           
 int getOpenTapsThreshold()
          Method getOpenTapsThreshold returns the openTapsThreshold of this TemplateTap object.
 Tap getParent()
          Method getParent returns the parent Tap of this TemplateTap object.
 Path getPath()
          Method getPath returns the Hadoop path to the resource represented by this Tap instance.
 long getPathModified(JobConf conf)
          Method getPathModified returns the date this resource was last modified.
 String getPathTemplate()
          Method getPathTemplate returns the pathTemplate Formatter format String of this TemplateTap object.
 int hashCode()
           
 boolean isWriteDirect()
          Method isWriteDirect returns true if this instance's TupleEntryCollector should be used to sink values.
 boolean makeDirs(JobConf conf)
          Method makeDirs makes all the directories this Tap instance represents.
 TupleEntryCollector openForWrite(JobConf conf)
          Method openForWrite opens the resource represented by this Tap instance.
 boolean pathExists(JobConf conf)
          Method pathExists returns true if the path represented by this instance exists.
 String toString()
           
 
Methods inherited from class cascading.tap.SinkTap
getSourceFields, isSource, openForRead, source, sourceInit
 
Methods inherited from class cascading.tap.Tap
flowInit, getIdentifier, getQualifiedPath, getScheme, getSinkFields, getSinkMode, isAppend, isEquivalentTo, isKeep, isReplace, isSink, isUpdate, outgoingScopeFor, resolveFields, resolveIncomingOperationFields, setScheme, setWriteDirect, sink, sinkInit, taps
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate"})
public TemplateTap(Hfs parent,
                                              String pathTemplate)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.

Parameters:
parent - of type Hfs
pathTemplate - of type String

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","openTapsThreshold"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.

openTapsThreshold limits the number of files that may be open for output at once.

Parameters:
parent - of type Hfs
pathTemplate - of type String
openTapsThreshold - of type int

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","sinkMode"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              SinkMode sinkMode)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.

Parameters:
parent - of type Hfs
pathTemplate - of type String
sinkMode - of type SinkMode

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","sinkMode","keepParentOnDelete"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              SinkMode sinkMode,
                                              boolean keepParentOnDelete)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.

keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deletePath(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.

Parameters:
parent - of type Hfs
pathTemplate - of type String
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","sinkMode","keepParentOnDelete","openTapsThreshold"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              SinkMode sinkMode,
                                              boolean keepParentOnDelete,
                                              int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.

keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deletePath(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.

openTapsThreshold limits the number of files that may be open for output at once.

Parameters:
parent - of type Hfs
pathTemplate - of type String
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
openTapsThreshold - of type int

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","pathFields"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              Fields pathFields)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.

This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data that is not written to the result file can still be used in the template path name.
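To make the role of pathFields concrete, the following JDK-only sketch selects and orders named values before formatting them. It does not use Cascading's Fields or TupleEntry classes; PathFieldsSketch, subPath, the Map standing in for a tuple entry, and the field names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PathFieldsSketch {
    // Hypothetical helper: picks the named fields from an entry, in the order
    // given by pathFields, and feeds them to the Formatter-style template.
    static String subPath(String pathTemplate, String[] pathFields, Map<String, Object> entry) {
        Object[] values = new Object[pathFields.length];
        for (int i = 0; i < pathFields.length; i++) {
            values[i] = entry.get(pathFields[i]);
        }
        return String.format(pathTemplate, values);
    }

    public static void main(String[] args) {
        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("year", "2010");
        entry.put("country", "us");
        entry.put("amount", 42);
        // The selector reorders the fields; "amount" is not part of the path.
        System.out.println(subPath("%s/%s", new String[] {"country", "year"}, entry));
    }
}
```

The selector decides which values reach the template and in what order, independently of which fields end up in the output file.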

Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","pathFields","openTapsThreshold"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              Fields pathFields,
                                              int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.

This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data that is not written to the result file can still be used in the template path name.

openTapsThreshold limits the number of files that may be open for output at once.

Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
openTapsThreshold - of type int

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","pathFields","sinkMode"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              Fields pathFields,
                                              SinkMode sinkMode)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.

This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data that is not written to the result file can still be used in the template path name.

Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
sinkMode - of type SinkMode

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","pathFields","sinkMode","keepParentOnDelete"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              Fields pathFields,
                                              SinkMode sinkMode,
                                              boolean keepParentOnDelete)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.

This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data that is not written to the result file can still be used in the template path name.

keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deletePath(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.

Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean

TemplateTap

@ConstructorProperties(value={"parent","pathTemplate","pathFields","sinkMode","keepParentOnDelete","openTapsThreshold"})
public TemplateTap(Hfs parent,
                                              String pathTemplate,
                                              Fields pathFields,
                                              SinkMode sinkMode,
                                              boolean keepParentOnDelete,
                                              int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.

This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data that is not written to the result file can still be used in the template path name.

keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deletePath(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.

openTapsThreshold limits the number of files that may be open for output at once.

Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
openTapsThreshold - of type int

Method Detail

getParent

public Tap getParent()
Method getParent returns the parent Tap of this TemplateTap object.

Returns:
the parent (type Tap) of this TemplateTap object.

getPathTemplate

public String getPathTemplate()
Method getPathTemplate returns the pathTemplate Formatter format String of this TemplateTap object.

Returns:
the pathTemplate (type String) of this TemplateTap object.

isWriteDirect

public boolean isWriteDirect()
Description copied from class: Tap
Method isWriteDirect returns true if this instance's TupleEntryCollector should be used to sink values.

Overrides:
isWriteDirect in class Tap
Returns:
the writeDirect (type boolean) of this Tap object.

getPath

public Path getPath()
Description copied from class: Tap
Method getPath returns the Hadoop path to the resource represented by this Tap instance.

Specified by:
getPath in class Tap
Returns:
Path
See Also:
Tap.getPath()

getOpenTapsThreshold

public int getOpenTapsThreshold()
Method getOpenTapsThreshold returns the openTapsThreshold of this TemplateTap object.

Returns:
the openTapsThreshold (type int) of this TemplateTap object.

openForWrite

public TupleEntryCollector openForWrite(JobConf conf)
                                 throws IOException
Description copied from class: Tap
Method openForWrite opens the resource represented by this Tap instance.

Overrides:
openForWrite in class SinkTap
Parameters:
conf - of type JobConf
Returns:
TupleEntryCollector
Throws:
IOException - when

makeDirs

public boolean makeDirs(JobConf conf)
                 throws IOException
Description copied from class: Tap
Method makeDirs makes all the directories this Tap instance represents.

Specified by:
makeDirs in class Tap
Parameters:
conf - of type JobConf
Returns:
boolean
Throws:
IOException - when there is an error making directories
See Also:
Tap.makeDirs(JobConf)

deletePath

public boolean deletePath(JobConf conf)
                   throws IOException
Description copied from class: Tap
Method deletePath deletes the resource represented by this instance.

Specified by:
deletePath in class Tap
Parameters:
conf - of type JobConf
Returns:
boolean
Throws:
IOException - when the resource cannot be deleted
See Also:
Tap.deletePath(JobConf)

pathExists

public boolean pathExists(JobConf conf)
                   throws IOException
Description copied from class: Tap
Method pathExists returns true if the path represented by this instance exists.

Specified by:
pathExists in class Tap
Parameters:
conf - of type JobConf
Returns:
boolean
Throws:
IOException - when the status cannot be determined
See Also:
Tap.pathExists(JobConf)

getPathModified

public long getPathModified(JobConf conf)
                     throws IOException
Description copied from class: Tap
Method getPathModified returns the date this resource was last modified.

Specified by:
getPathModified in class Tap
Parameters:
conf - of type JobConf
Returns:
long
Throws:
IOException - when the modified date cannot be determined
See Also:
Tap.getPathModified(JobConf)

equals

public boolean equals(Object object)
Overrides:
equals in class Tap

hashCode

public int hashCode()
Overrides:
hashCode in class Tap

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2007-2010 Concurrent, Inc. All Rights Reserved.