1. Cascading

1.1 What is Cascading?

Cascading is a Query API and Query Planner used for defining, sharing, and executing data processing workflows on a distributed data grid or cluster.

Cascading relies on Apache Hadoop. To use Cascading, Hadoop must be installed locally for development and testing, and a Hadoop cluster must be deployed for production applications.

Cascading greatly simplifies the complexities with Hadoop application development, job creation, and job scheduling.

Copyright © 2007-2008 Concurrent, Inc. All Rights Reserved.