At runtime, Hadoop must be told which application jar file to push to the cluster. Typically this is done through the JobConf object in the Hadoop API.
Cascading offers a shorthand for configuring this parameter.
Properties properties = new Properties();

// pass in the class name of your application
// this will find the parent jar at runtime
FlowConnector.setApplicationJarClass( properties, Main.class );

// or pass in the path to the parent jar
FlowConnector.setApplicationJarPath( properties, pathToJar );

FlowConnector flowConnector = new FlowConnector( properties );
Above we see how to set the same property in two ways: first via the setApplicationJarClass() method, and second via the setApplicationJarPath() method. The first method takes a Class object that owns the 'main' function for this application. The assumption here is that Main.class is not located in a Java jar that is stored in the lib folder of the application jar. If it is, that nested jar will be pushed to the cluster, not the parent application jar. The second method takes the path to the parent application jar directly.
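The reason the location of Main.class matters can be illustrated with plain JDK calls. The sketch below is not Cascading's or Hadoop's actual lookup code; it only shows, as an assumption about the general idea, how a class can be traced back to the jar or directory it was loaded from. The names JarLocator, locationOf, and isJar are hypothetical and introduced here for illustration only.

```java
import java.net.URL;
import java.security.CodeSource;

public class JarLocator {

    // Returns the location (a jar file or a class directory) that the given
    // class was loaded from, or null when the JVM does not expose one
    // (for example, for classes loaded by the bootstrap class loader).
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null) {
            return null;
        }
        URL location = source.getLocation();
        return location == null ? null : location.toString();
    }

    // Returns true when the location string points at a jar file
    // rather than at a directory of class files.
    public static boolean isJar(String location) {
        return location != null && location.endsWith(".jar");
    }

    public static void main(String[] args) {
        // Hypothetical usage: replace JarLocator.class with your own Main.class.
        String location = locationOf(JarLocator.class);
        System.out.println("Loaded from: " + location);
        System.out.println("Is a jar: " + isJar(location));
    }
}
```

If a check like this reports a jar inside the lib folder for your Main class, a class-based lookup would likely resolve to that nested jar, which is the situation the paragraph above warns about. In that case, passing the path to the parent jar explicitly is the safer choice.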
In your application, only one of these methods needs to be called, but one of them must be called to properly configure Hadoop.
Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.