10. Best Practices

10.1 Unit Testing

Test every Operation, pipe assembly, and application in isolation. The cascading.CascadingTestCase class provides a number of helper methods for this purpose.

When testing custom Operations, use the invokeFunction(), invokeFilter(), invokeAggregator(), and invokeBuffer() methods.

When testing Flows, use the validateLength() methods. There are quite a few of them, and collectively they offer great flexibility. All of them read the sink tap, validate that it is the correct length and has the correct Tuple size, and check to see whether the values match a given regular expression pattern.
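
The three checks described above (length, Tuple size, and regular expression match) can be pictured with a small plain-Java sketch. This is only an illustration and not Cascading's implementation: SinkCheckSketch and check() are hypothetical names, rows are modelled as lists of strings rather than Tuples, and joining the fields with a tab and requiring a full match are both assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the checks described above; not part of Cascading.
final class SinkCheckSketch {

    // Throws AssertionError unless the sink has the expected number of rows,
    // every row has the expected number of fields, and (when a pattern is
    // given) every row's tab-joined text fully matches that pattern.
    static void check(List<List<String>> rows, int expectedLength,
                      int expectedSize, Pattern regex) {
        if (rows.size() != expectedLength) {
            throw new AssertionError(
                "expected " + expectedLength + " tuples, found " + rows.size());
        }
        for (List<String> row : rows) {
            if (row.size() != expectedSize) {
                throw new AssertionError(
                    "expected tuple size " + expectedSize + ", found " + row.size());
            }
            String text = String.join("\t", row); // assumption: tab-delimited form
            if (regex != null && !regex.matcher(text).matches()) {
                throw new AssertionError("tuple does not match " + regex + ": " + text);
            }
        }
    }

    public static void main(String[] args) {
        // Pretend these two rows were read back from the sink tap.
        List<List<String>> sink = Arrays.asList(
            Arrays.asList("1", "alpha"),
            Arrays.asList("2", "beta"));
        check(sink, 2, 2, Pattern.compile("^\\d+\\t[a-z]+$"));
        System.out.println("sink contents validated");
    }
}
```

In a real test the rows come from the Flow's sink tap and the validateLength() helpers perform the comparison; the sketch only shows what a failure on each of the three conditions means.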

As of Cascading 2, it is possible to write tests that are independent of the underlying platform or mode. Any such unit test should subclass cascading.PlatformTestCase and declare the PlatformRunner.Platform annotation on the test class.

For example, the annotation @PlatformRunner.Platform({LocalPlatform.class, HadoopPlatform.class}) causes the PlatformTestCase to run all the unit tests defined on the subclass with the LocalPlatform and HadoopPlatform platform instances.
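
To make the mechanism concrete, the following plain-Java sketch shows how a class-level annotation listing platform classes can drive one run of every test method per platform. It is not Cascading's PlatformRunner: every type here (SketchPlatform, OnPlatforms, SketchPlatformTestCase, PlatformRunnerSketch, and the example test) is a hypothetical stand-in, and selecting test methods by a "test" name prefix is an assumption made for brevity.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical; this is not Cascading's API.
interface SketchPlatform { String name(); }

final class SketchLocalPlatform implements SketchPlatform {
    public String name() { return "local"; }
}

final class SketchHadoopPlatform implements SketchPlatform {
    public String name() { return "hadoop"; }
}

// Class-level annotation naming the platforms a test class should run on.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface OnPlatforms {
    Class<? extends SketchPlatform>[] value();
}

abstract class SketchPlatformTestCase {
    SketchPlatform platform; // set by the runner before each test method
}

@OnPlatforms({SketchLocalPlatform.class, SketchHadoopPlatform.class})
class CopySketchTest extends SketchPlatformTestCase {
    public void testCopy() {
        // A real test would build and run a Flow using the current platform.
        if (platform == null) {
            throw new IllegalStateException("no platform injected");
        }
    }
}

final class PlatformRunnerSketch {

    // Runs every "test*" method once per platform listed in the annotation
    // and returns a log of "platform:method" entries in execution order.
    static List<String> runAll(Class<? extends SketchPlatformTestCase> testClass) {
        OnPlatforms annotation = testClass.getAnnotation(OnPlatforms.class);
        if (annotation == null) {
            throw new IllegalArgumentException("missing @OnPlatforms on " + testClass);
        }
        List<String> log = new ArrayList<>();
        try {
            for (Class<? extends SketchPlatform> platformClass : annotation.value()) {
                SketchPlatform platform = platformClass.getDeclaredConstructor().newInstance();
                for (Method method : testClass.getDeclaredMethods()) {
                    if (!method.getName().startsWith("test")) {
                        continue;
                    }
                    // Fresh test instance per method, with the platform injected.
                    SketchPlatformTestCase test = testClass.getDeclaredConstructor().newInstance();
                    test.platform = platform;
                    method.setAccessible(true);
                    method.invoke(test);
                    log.add(platform.name() + ":" + method.getName());
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("test run failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String entry : runAll(CopySketchTest.class)) {
            System.out.println(entry);
        }
    }
}
```

The point of the design is that the test body never names a platform; it only uses whatever instance the runner injects, so the same assertions are exercised against each listed platform.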

See the Cascading unit tests for examples.

To use any of these helper classes, make sure that cascading-test-x.y.z.jar is in your testing class path.

