public abstract class TupleEntryCollector extends Object
TupleEntryCollector is used to allow BaseOperation instances to emit one or more result Tuple values.
The general rule in Cascading is that if you are handed a Tuple, you cannot change or cache it. Attempts at modifying such a Tuple will result in an Exception. Preventing caching is harder; see below.
If you create the Tuple, you can re-use or modify it.
When calling add(Tuple) or add(TupleEntry), you are passing a Tuple to the downstream pipes and operations. Since no downstream operation may modify or cache the Tuple instance, it is safe to re-use the Tuple instance when add() returns.
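The re-use rule above can be illustrated with a minimal, self-contained sketch. It does not use the Cascading classes: `RecordingCollector` is a hypothetical stand-in for a collector whose `add()` fully consumes the values before returning, and a plain `List<Object>` stands in for a Tuple. Under those assumptions, one instance can be overwritten and emitted repeatedly.

```java
import java.util.ArrayList;
import java.util.List;

public class TupleReuseSketch {
    // Hypothetical stand-in for a collector: the values are consumed
    // (rendered to a String) before add() returns, and no reference
    // to the passed-in list is kept.
    static class RecordingCollector {
        final List<String> written = new ArrayList<>();

        void add(List<Object> tuple) {
            written.add(tuple.toString());
        }
    }

    // Emits the squares 1..count while re-using a single "tuple" instance.
    static List<String> emitSquares(int count) {
        RecordingCollector collector = new RecordingCollector();
        List<Object> result = new ArrayList<>();
        result.add(null); // one reusable slot

        for (int i = 1; i <= count; i++) {
            result.set(0, i * i);  // overwrite the previous value in place
            collector.add(result); // safe to re-use once add() has returned
        }
        return collector.written;
    }

    public static void main(String[] args) {
        System.out.println(emitSquares(3)); // prints [[1], [4], [9]]
    }
}
```

In a real operation the same pattern would presumably apply to a Tuple created by the operation itself and passed to add(Tuple); that mapping is an assumption of this sketch, not something it verifies.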
That said, Tuple copies do get cached in order to perform specific operations in the underlying platforms. Currently only a shallow copy is made (via the Tuple copy constructor). Thus, any mutable type or collection placed inside a Tuple will not be copied, but will likely be cached if the Tuple passed downstream is copied.
So any subsequent changes to that nested type or collection will be reflected in the cached copy, a likely source of hard-to-find errors.
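The shallow-copy hazard can be reproduced without Cascading at all. The sketch below is an illustration under stated assumptions: a `List<Object>` stands in for a Tuple, and the hypothetical `shallowCopy` helper stands in for a platform-side cache that copies element references only. It also shows one possible mitigation, copying the nested collection yourself before emitting it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShallowCopyHazard {
    // Hypothetical stand-in for a cached copy: a new outer list that
    // shares the same element references as the original.
    static List<Object> shallowCopy(List<Object> tuple) {
        return new ArrayList<>(tuple);
    }

    // Nested list is emitted, cached, then cleared for re-use.
    // Returns the nested size as seen through the cached copy.
    static int cachedSizeWhenNestedListIsReused() {
        List<String> tags = new ArrayList<>(Arrays.asList("a", "b"));
        List<Object> tuple = new ArrayList<>();
        tuple.add(tags);

        List<Object> cached = shallowCopy(tuple);
        tags.clear(); // the change leaks into the cached copy

        return ((List<?>) cached.get(0)).size();
    }

    // Same flow, but a defensive copy of the nested list is emitted.
    static int cachedSizeWhenNestedListIsCopied() {
        List<String> tags = new ArrayList<>(Arrays.asList("a", "b"));
        List<Object> tuple = new ArrayList<>();
        tuple.add(new ArrayList<>(tags)); // copy made by the emitter

        List<Object> cached = shallowCopy(tuple);
        tags.clear(); // the cached copy is unaffected

        return ((List<?>) cached.get(0)).size();
    }

    public static void main(String[] args) {
        System.out.println(cachedSizeWhenNestedListIsReused()); // prints 0
        System.out.println(cachedSizeWhenNestedListIsCopied()); // prints 2
    }
}
```

The defensive copy costs an allocation per emitted value, so it is only worth making for nested objects that the emitter actually mutates afterwards.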
There is currently no way to specify that a deep copy must be performed when making a Tuple copy.

Modifier and Type | Field and Description |
---|---|
protected TupleEntry | tupleEntry |

Modifier | Constructor and Description |
---|---|
protected | TupleEntryCollector() |
public | TupleEntryCollector(Fields declared) Constructor TupleEntryCollector creates a new TupleEntryCollector instance. |

Modifier and Type | Method and Description |
---|---|
void | add(Tuple tuple) Method add inserts the given Tuple into the outgoing stream. |
void | add(TupleEntry tupleEntry) Method add inserts the given TupleEntry into the outgoing stream. |
void | close() Method close closes the underlying resource being written to. |
protected abstract void | collect(TupleEntry tupleEntry) |
void | setFields(Fields declared) |
protected TupleEntry tupleEntry
protected TupleEntryCollector()
public TupleEntryCollector(Fields declared)
declared - of type Fields
public void add(TupleEntry tupleEntry)
Method add inserts the given TupleEntry into the outgoing stream. Note that add(Tuple) is more efficient, as this method simply calls TupleEntry.getTuple() on its argument.
See TupleEntryCollector on when and how to re-use a Tuple instance.
tupleEntry - of type TupleEntry
public void add(Tuple tuple)
Method add inserts the given Tuple into the outgoing stream.
See TupleEntryCollector on when and how to re-use a Tuple instance.
tuple - of type Tuple
protected abstract void collect(TupleEntry tupleEntry) throws IOException
IOException
public void close()
Method close closes the underlying resource being written to.
This method should be called when an instance is returned via Tap.openForWrite(cascading.flow.FlowProcess) and no more Tuple instances will be written out.
This method must not be called when an instance is returned from getOutputCollector() from any of the relevant OperationCall implementations (inside a Function, Aggregator, or Buffer).
Copyright © 2007-2015 Concurrent, Inc. All Rights Reserved.