MichiPUG: using Python to run reports in Hadoop clusters

Jun 2, 2009 20:01 · michipug

Zattoo’s Marshall Weir will be speaking at this week’s MichiPUG meeting (Thursday at 7 PM at SRT Solutions in downtown Ann Arbor). In his own words:

I’ve been working on a Python module for running reports in Hadoop. It’s sort of a wrapper around the Pig data processing language and some smarts for running reports on a Hadoop cluster and pushing and pulling data to it. It’s designed primarily to make it easier and more efficient to run complex sets of interdependent reports – I’ve been using it to do business reporting on our customer behavior at Zattoo.

This should be very interesting for folks like me who have never seen Hadoop in action!