public class PartitionTap extends cascading.tap.partition.BasePartitionTap<java.util.Properties,java.io.InputStream,java.io.OutputStream>
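Class PartitionTap can be used to write tuple streams out to files and sub-directories based on the values
in the Tuple instance.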
The constructor takes a FileTap and a Partition implementation. This allows Tuple values at given positions
to be used as directory names during write operations, and directory names to be used as data during read
operations.
The key value here is that there is no need to duplicate data values in both the directory names and
the data files.
Only values declared in the parent Tap are read from or written to the underlying files, while fields
declared by the Partition are read from or written to the directory names only. That is, the
PartitionTap instance will sink or source the partition fields plus the parent Tap fields. The partition
fields and parent Tap fields do not need to have common field names.
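The split between directory names and file contents can be pictured with a small model. The following is a minimal sketch in plain Java, not Cascading code: the class PartitionPathSketch, its methods, the "/" directory delimiter, and the tab field delimiter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Conceptual model only: hypothetical names, no Cascading classes involved.
final class PartitionPathSketch {
    static final String DIR_DELIMITER = "/";    // assumed delimiter between partition values
    static final String FIELD_DELIMITER = "\t"; // assumed delimiter inside the data file

    private static String join(Map<String, String> record, List<String> fields, String delimiter) {
        List<String> values = new ArrayList<>();
        for (String field : fields) {
            values.add(record.get(field));
        }
        return String.join(delimiter, values);
    }

    // Write side: the partition field values become the directory name.
    static String toDirectory(Map<String, String> record, List<String> partitionFields) {
        return join(record, partitionFields, DIR_DELIMITER);
    }

    // Write side: only the parent fields are written into the data file.
    static String toFileLine(Map<String, String> record, List<String> parentFields) {
        return join(record, parentFields, FIELD_DELIMITER);
    }

    // Read side: the directory name and the file line are recombined into one record.
    static Map<String, String> toRecord(String directory, String fileLine,
                                        List<String> partitionFields, List<String> parentFields) {
        Map<String, String> record = new LinkedHashMap<>();
        String[] dirValues = directory.split(Pattern.quote(DIR_DELIMITER), -1);
        String[] fileValues = fileLine.split(Pattern.quote(FIELD_DELIMITER), -1);
        for (int i = 0; i < partitionFields.size(); i++) {
            record.put(partitionFields.get(i), dirValues[i]);
        }
        for (int i = 0; i < parentFields.size(); i++) {
            record.put(parentFields.get(i), fileValues[i]);
        }
        return record;
    }

    public static void main(String[] args) {
        List<String> partitionFields = Arrays.asList("year", "month");
        List<String> parentFields = Arrays.asList("entry");

        Map<String, String> record = new LinkedHashMap<>();
        record.put("year", "2012");
        record.put("month", "June");
        record.put("entry", "hello");

        String directory = toDirectory(record, partitionFields); // "2012/June"
        String line = toFileLine(record, parentFields);          // "hello"
        System.out.println(directory);
        System.out.println(line);
        System.out.println(toRecord(directory, line, partitionFields, parentFields));
    }
}
```

In this model the year and month values never appear inside the data file, and the entry value never appears in the directory name, which mirrors the no-duplication point made above.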
openWritesThreshold limits the number of files that may be open for output at one time. This value
defaults to 300 files. Each time the threshold is exceeded, 10% of the least recently used open files
will be closed.
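The eviction policy described above can be modeled with an access-ordered map. This is a minimal sketch of the stated policy, not the library's implementation: the class OpenWritesSketch is hypothetical, in-memory buffers stand in for open files, and closing "10% of the threshold, at least one" is an assumed reading of the 10% rule.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only: hypothetical class, buffers stand in for open files.
final class OpenWritesSketch {
    private final int threshold;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, StringBuilder> open = new LinkedHashMap<>(16, 0.75f, true);
    private final List<String> closed = new ArrayList<>();

    OpenWritesSketch(int threshold) {
        this.threshold = threshold;
    }

    void write(String path, String line) {
        StringBuilder writer = open.get(path); // get() marks the path as recently used
        if (writer == null) {
            writer = new StringBuilder();
            open.put(path, writer);
        }
        writer.append(line).append('\n');
        if (open.size() > threshold) {
            closeLeastRecentlyUsed();
        }
    }

    private void closeLeastRecentlyUsed() {
        // Assumption: close 10% of the threshold, and always at least one entry.
        int toClose = Math.max(1, threshold / 10);
        Iterator<Map.Entry<String, StringBuilder>> it = open.entrySet().iterator();
        while (toClose > 0 && it.hasNext()) {
            closed.add(it.next().getKey());
            it.remove();
            toClose--;
        }
    }

    int openCount() {
        return open.size();
    }

    List<String> closedPaths() {
        return closed;
    }

    public static void main(String[] args) {
        OpenWritesSketch sketch = new OpenWritesSketch(10);
        for (int i = 0; i <= 10; i++) {
            sketch.write("2012/" + i, "line");
        }
        System.out.println(sketch.openCount());
        System.out.println(sketch.closedPaths());
    }
}
```

A low threshold bounds the number of open handles at the cost of reopening partitions that are written to again later, so the value is worth tuning against the number of distinct partitions in a run.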
PartitionTap will populate a given partition without regard to the case of the values being used. Thus
the resulting paths 2012/June/ and 2012/june/ will likely result in two open files into the same
location. Forcing the case to be consistent with a custom Partition implementation or an upstream
Function is recommended; see cascading.operation.expression.ExpressionFunction.
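To see why consistent case matters, the sketch below counts the distinct partition paths produced with and without normalizing the month value. It is plain Java for illustration only: PartitionCaseSketch and its methods are hypothetical, and lower-casing with Locale.ROOT is just one possible normalization that a custom Partition or an upstream Function might apply.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Conceptual model only: hypothetical class, no Cascading classes involved.
final class PartitionCaseSketch {
    // One possible normalization; upper-casing would work equally well.
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Collects the distinct partition paths that a stream of month values would produce.
    static Set<String> partitionPaths(String year, List<String> months, boolean normalizeCase) {
        Set<String> paths = new LinkedHashSet<>();
        for (String month : months) {
            String value = normalizeCase ? normalize(month) : month;
            paths.add(year + "/" + value + "/");
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> months = Arrays.asList("June", "june");
        System.out.println(partitionPaths("2012", months, false)); // [2012/June/, 2012/june/]
        System.out.println(partitionPaths("2012", months, true));  // [2012/june/]
    }
}
```

Without normalization each spelling yields its own path and therefore its own open file, even where the file system treats both paths as the same location.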
Constructor and Description |
---|
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition) Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the base path and default Scheme, and the partition. |
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, int openWritesThreshold) Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the base path and default Scheme, and the partition. |
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode) Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the base path and default Scheme, and the partition. |
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode, boolean keepParentOnDelete) Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the base path and default Scheme, and the partition. |
PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode, boolean keepParentOnDelete, int openWritesThreshold) Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the base path and default Scheme, and the partition. |
Modifier and Type | Method and Description |
---|---|
protected cascading.tuple.TupleEntrySchemeCollector | createTupleEntrySchemeCollector(cascading.flow.FlowProcess<? extends java.util.Properties> flowProcess, cascading.tap.Tap parent, java.lang.String path, long sequence) |
protected cascading.tuple.TupleEntrySchemeIterator | createTupleEntrySchemeIterator(cascading.flow.FlowProcess<? extends java.util.Properties> flowProcess, cascading.tap.Tap parent, java.lang.String path, java.io.InputStream input) |
boolean | deleteResource(java.util.Properties conf) |
protected java.lang.String | getCurrentIdentifier(cascading.flow.FlowProcess<? extends java.util.Properties> flowProcess) |
Methods inherited from class cascading.tap.partition.BasePartitionTap:
addSourcePartitionFilter, commitResource, createResource, equals, getChildPartitionIdentifiers, getFilteredPartitionIdentifiers, getIdentifier, getModifiedTime, getOpenWritesThreshold, getParent, getPartition, hashCode, openForRead, openForWrite, prepareResourceForRead, prepareResourceForWrite, resourceExists, rollbackResource, toString
Methods inherited from class cascading.tap.Tap:
createResource, deleteResource, flowConfInit, getConfigDef, getFullIdentifier, getFullIdentifier, getModifiedTime, getNodeConfigDef, getScheme, getSinkFields, getSinkMode, getSourceFields, getStepConfigDef, getTrace, hasConfigDef, hasNodeConfigDef, hasStepConfigDef, id, isKeep, isReplace, isSink, isSource, isTemporary, isUpdate, openForRead, openForWrite, outgoingScopeFor, presentSinkFields, presentSourceFields, resolveIncomingOperationArgumentFields, resolveIncomingOperationPassThroughFields, resourceExists, retrieveSinkFields, retrieveSourceFields, setScheme, sinkConfInit, sourceConfInit, taps

@ConstructorProperties(value={"parent","partition"})
public PartitionTap(FileTap parent, cascading.tap.partition.Partition partition)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the
base path and default Scheme, and the partition.
parent - of type Tap
partition - of type Partition

@ConstructorProperties(value={"parent","partition","openWritesThreshold"})
public PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, int openWritesThreshold)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the
base path and default Scheme, and the partition.
openWritesThreshold limits the number of files that may be open for output at one time.
parent - of type FileTap
partition - of type Partition
openWritesThreshold - of type int

@ConstructorProperties(value={"parent","partition","sinkMode"})
public PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the
base path and default Scheme, and the partition.
parent - of type Tap
partition - of type Partition
sinkMode - of type SinkMode

@ConstructorProperties(value={"parent","partition","sinkMode","keepParentOnDelete"})
public PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode, boolean keepParentOnDelete)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the
base path and default Scheme, and the partition.
keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when
BasePartitionTap.deleteResource(Object) is called, typically an issue when used inside a Cascade.
parent - of type Tap
partition - of type Partition
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean

@ConstructorProperties(value={"parent","partition","sinkMode","keepParentOnDelete","openWritesThreshold"})
public PartitionTap(FileTap parent, cascading.tap.partition.Partition partition, cascading.tap.SinkMode sinkMode, boolean keepParentOnDelete, int openWritesThreshold)
Constructor PartitionTap creates a new PartitionTap instance using the given parent FileTap as the
base path and default Scheme, and the partition.
keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when
BasePartitionTap.deleteResource(Object) is called, typically an issue when used inside a Cascade.
openWritesThreshold limits the number of files that may be open for output at one time.
parent - of type Tap
partition - of type Partition
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
openWritesThreshold - of type int

protected java.lang.String getCurrentIdentifier(cascading.flow.FlowProcess<? extends java.util.Properties> flowProcess)
getCurrentIdentifier in class cascading.tap.partition.BasePartitionTap<java.util.Properties,java.io.InputStream,java.io.OutputStream>

public boolean deleteResource(java.util.Properties conf) throws java.io.IOException
deleteResource in class cascading.tap.partition.BasePartitionTap<java.util.Properties,java.io.InputStream,java.io.OutputStream>
Throws: java.io.IOException

protected cascading.tuple.TupleEntrySchemeCollector createTupleEntrySchemeCollector(cascading.flow.FlowProcess<? extends java.util.Properties> flowProcess, cascading.tap.Tap parent, java.lang.String path, long sequence) throws java.io.IOException
createTupleEntrySchemeCollector in class cascading.tap.partition.BasePartitionTap<java.util.Properties,java.io.InputStream,java.io.OutputStream>
Throws: java.io.IOException

protected cascading.tuple.TupleEntrySchemeIterator createTupleEntrySchemeIterator(cascading.flow.FlowProcess<? extends java.util.Properties> flowProcess, cascading.tap.Tap parent, java.lang.String path, java.io.InputStream input) throws java.io.FileNotFoundException
createTupleEntrySchemeIterator in class cascading.tap.partition.BasePartitionTap<java.util.Properties,java.io.InputStream,java.io.OutputStream>
Throws: java.io.FileNotFoundException

Copyright © 2007-2015 Xplenty, Inc. All Rights Reserved.