10. Best Practices

10.1 Unit Testing

Discrete testing of all Operations, pipe assemblies, and applications is a must. The cascading.CascadingTestCase class provides a number of helper methods for this purpose.

When testing custom Operations, use the invokeFunction(), invokeFilter(), invokeAggregator(), and invokeBuffer() methods.

When testing Flows, use the validateLength() methods. There are quite a few of them, and collectively they offer great flexibility. All of them read the sink tap, validate that it is the correct length and has the correct Tuple size, and check to see whether the values match a given regular expression pattern.

As of Cascading 2, it is possible to write tests that are independent of the underlying platform. Any such unit test should subclass cascading.PlatformTestCase, located in the cascading-platform-x.y.z-tests.jar file. Any platform to be tested against should be added to the classpath as well. PlatformTestCase searches the classpath for all available platforms and runs each test in the subclass against each platform found.

You can exclude any platforms you don't want the tests to run against by omitting them from a system property named platform.includes. For example, to run only the local mode tests when both the local and Hadoop platforms are in the classpath, call System.setProperty( "platform.includes", "local" ); or pass an appropriate -D switch to the JVM.
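As a minimal sketch of the programmatic route, the following uses only the standard java.lang.System API to set the platform.includes property and read it back. The class name PlatformIncludesExample and the helper restrictTo() are hypothetical names introduced for illustration; they are not part of Cascading. It is assumed here that the property must be set before PlatformTestCase scans the classpath, so in a real test it would likely belong in a static initializer or in the build configuration.

```java
public class PlatformIncludesExample
  {
  // Property name as given in the text above.
  static final String PLATFORM_INCLUDES = "platform.includes";

  // Hypothetical helper: restricts test runs to the named platform and
  // returns the value now stored in the system property.
  static String restrictTo( String platformName )
    {
    System.setProperty( PLATFORM_INCLUDES, platformName );

    return System.getProperty( PLATFORM_INCLUDES );
    }

  public static void main( String[] args )
    {
    // Only the "local" platform is listed, so any other platform found
    // on the classpath would be skipped.
    System.out.println( restrictTo( "local" ) );
    }
  }
```

The equivalent JVM switch would be -Dplatform.includes=local on the command line that launches the tests.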

See the Cascading platform unit tests for examples.

For Maven users, be sure to add the tests classifier to any such dependencies. Note that the cascading-platform project has no main code, only tests, so it must be retrieved via the tests classifier.
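A dependency entry along the following lines illustrates the tests classifier. This is a sketch only: the groupId shown is an assumption about how the artifact is published, x.y.z stands for the Cascading version in use, and the test scope is a suggested choice rather than a requirement stated above.

```xml
<dependency>
  <!-- groupId is assumed; check the repository you pull Cascading from -->
  <groupId>cascading</groupId>
  <artifactId>cascading-platform</artifactId>
  <!-- replace x.y.z with the Cascading version in use -->
  <version>x.y.z</version>
  <!-- the project ships only test code, so the classifier is required -->
  <classifier>tests</classifier>
  <scope>test</scope>
</dependency>
```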

Copyright © 2007-2012 Concurrent, Inc. All Rights Reserved.