When a Flow that uses Checkpoint pipes fails, a later execution of the Flow can resume after the last FlowStep that successfully wrote to a checkpoint file. That is, a Flow restarts only from the location of the last Checkpoint pipe.
This feature requires that the failed Flow be planned with a runID set on the FlowDef, and that the retry Flow use the same runID value. The retry Flow should also be (roughly) equivalent to the previous failed attempt.
Example 8.9. Setting runID
FlowDef flowDef = new FlowDef()
  .setName( "flow-name" )
  .addSource( rhs, rhsSource )
  .addSource( lhs, lhsSource )
  .addTailSink( groupBy, sink )
  .addCheckpoint( checkpoint, checkpointTap )
  .setRunID( "some-unique-value" ); // re-use this id to restart this flow

Flow flow = new HadoopFlowConnector().connect( flowDef );
Use caution when restarting Flows from checkpoints. If the input data has changed, or the pipe assembly has been significantly altered, the Flow may fail or may produce errors that go undetected.
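One way to reduce the risk of restarting against changed input is to derive the runID from the inputs themselves, so that a retry over unchanged data reuses the id while changed data yields a fresh one. The following is a minimal sketch of that idea, not part of Cascading: the class RunIds, the method derive, and the "path:size:modificationTime" description format are all hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; it is not part of the Cascading API. */
final class RunIds {
  private RunIds() {}

  /**
   * Builds a runID from the flow name and one description per input
   * (assumed here to look like "path:size:modificationTime").
   * Descriptions are sorted so argument order does not change the result.
   */
  static String derive(String flowName, List<String> inputDescriptions) {
    List<String> sorted = new ArrayList<>(inputDescriptions);
    Collections.sort(sorted);
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      digest.update(flowName.getBytes(StandardCharsets.UTF_8));
      for (String description : sorted) {
        digest.update((byte) 0); // separator so {"ab","c"} differs from {"a","bc"}
        digest.update(description.getBytes(StandardCharsets.UTF_8));
      }
      byte[] hash = digest.digest();
      StringBuilder id = new StringBuilder(flowName).append('-');
      for (int i = 0; i < 8; i++) { // the first 8 bytes keep the id short
        id.append(String.format("%02x", hash[i] & 0xff));
      }
      return id.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is mandated for every Java platform implementation
      throw new IllegalStateException(e);
    }
  }
}
```

The returned string could then be passed to setRunID on the FlowDef. This guards only against changed input; it does not detect an altered pipe assembly, which would still need to be handled by choosing a new runID by hand.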
Note that when using a runID, every Flow instance must use a unique value unless it is intended as a retry attempt. The runID value is used to scope the directories for the temporary checkpoint files, which prevents file name collisions.
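To see why a unique runID prevents collisions, it can help to picture the runID as one component of the checkpoint directory path. The sketch below is only a conceptual model: the class CheckpointPaths, the method checkpointDir, and the root/flow/runID/checkpoint layout are assumptions made for illustration, and the layout Cascading actually uses may differ.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration of runID-scoped paths; not Cascading's real layout. */
final class CheckpointPaths {
  private CheckpointPaths() {}

  /** Places each checkpoint under a directory named after the runID. */
  static Path checkpointDir(Path tempRoot, String flowName, String runID, String checkpointName) {
    return tempRoot.resolve(flowName).resolve(runID).resolve(checkpointName);
  }

  public static void main(String[] args) {
    Path root = Paths.get("tmp");
    // Two unrelated runs of the same flow write to different directories.
    System.out.println(checkpointDir(root, "flow-name", "run-a", "checkpoint"));
    System.out.println(checkpointDir(root, "flow-name", "run-b", "checkpoint"));
  }
}
```

Under this model a retry that reuses the runID resolves to the same directory as the failed attempt and so can find its checkpoint files, while two unrelated runs that mistakenly shared a runID would write to the same location.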
On successful completion of a Flow with a runID, any temporary checkpoint files are removed.
Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.