Paver is now on GitHub, thanks to Almad

Paver, the project scripting tool for Python, has just moved to GitHub thanks to Almad. Almad has stepped forward and offered to properly bring Paver into the second decade of the 21st century (doesn’t have the same ring to it as bringing something into the 21st century, does it? 🙂)

Seriously, though, Paver reached the point where it was good enough for me and did what I wanted (and, apparently, what a good number of other people wanted as well). Almad has some thoughts on where the project should go next and I’m looking forward to hearing more about them. Sign up for the Google Group to see where Paver is going next.

Paver: project that works, has users, needs a leader

Paver is a Python project scripting tool that I initially created in 2007 to automate a whole bunch of tasks around projects that I was working on. It knows about setuptools and distutils, and it has some ideas on handling documentation with example code. It also has users who occasionally like to send in patches. The latest release has had more than 3700 downloads on PyPI.

Paver hasn’t needed a lot of work, because it does what it says on the tin: helps you automate project tasks. Sure, there’s always more that one could do. But, there isn’t more that’s required for it to be a useful tool, day-to-day.

Here’s the point of my post: Paver is in danger of being abandoned. At this point, everything significant that I am doing is in JavaScript, not Python. The email and patch traffic is low, but it’s still too much for someone who’s not even actively using the tool any more.

If you’re a Paver user and either:

1. want to take the project in fanciful new directions or,

2. want to keep the project humming along with a new .x release every now and then

please let me know.

4th anniversary MichiPUG meeting tomorrow!

I kicked off the Michigan Python Users Group (MichiPUG) in September 2005, so this month’s meeting marks 4 years since the group began!

This month’s meeting is going to be one of our “topic free” meetings. Despite the lack of a topic, we never have trouble finding Python things to discuss. If you’re going to be around Ann Arbor Thursday evening and have a burning Python question, do stop in!

The meeting will be at 7pm at SRT Solutions in downtown Ann Arbor. Parking is free and easy next to City Hall a couple blocks north (on Ann St.).

This month, I am stepping aside as the de facto leader of the group. While I am still a Python fan and heavy user, my interests have branched out enough that I plan to devote my rather limited “user group time” elsewhere. Stay tuned for more on that soon. Mark Ramm will be taking over my duties as “the guy who sends the monthly ‘what’s our topic?’ email message”.

I’m almost certainly going to be arriving late to tomorrow’s meeting, but I do hope to catch up with folks for drinks afterwards at the very least! See you there!

Python packaging/install: what I want

Python packaging and deployment can be annoying. It’s been nearly 4 years since I put out the first TurboGears release as an early adopter of setuptools/easy_install. Since then, virtualenv, pip and zc.buildout have all been released. Somehow, it still seems like more trouble than it should be to get development and production environments set up.

On Bespin, I’ve been using a combination of virtualenv and pip (scripted with Paver) in development and production environments. But, I’ve found pip freeze to be nearly unusable.

My Ideal World

After monkeying with this stuff a fair bit over the past few years, I have an idea of what I’d really like to have but I don’t think anyone’s working on it. I’d love to hear contrasting opinions or learn about projects that I’m not aware of.

  • Multiple version installation into global site-packages, as easy_install currently works (put the active package in the .pth file)
  • The better error reporting of pip (pip doesn’t meet my first desire, though, because it installs as single-version-externally-managed)
  • A tool to manage the installed packages (uninstall, select a different version)
  • In addition to a global site-packages, it would be nice to be able to specify a different site-dir for machines where I don’t have or don’t want to use root access
  • virtualenv that behaves like --no-site-packages but knows where site-packages (or the other site-dir) is
  • That tool that manages installed packages can selectively install specific versions of packages into the virtualenv by adding pointers in the .pth file that point to the site-packages directory
  • You can also install only into the virtualenv if you wish.
  • Install packages in that manner from a list of requirements (as with pip’s requirements file)
  • A way to freeze the set currently installed in the virtualenv as a new requirements file
  • An optional cache of all of the original sdists of the installed packages
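
The .pth mechanism mentioned in the list above can be sketched with nothing but the standard library. This is only an illustration of the idea, not an existing tool: the store layout, the function name activate_version and the package name pth_demo_pkg are all made up for the example.

```python
import importlib
import site
import tempfile
from pathlib import Path


def activate_version(site_dir: Path, store: Path, name: str, version: str) -> Path:
    """Point site_dir at one stored version of a package via a .pth file.

    Hypothetical layout: each version lives in store/<name>-<version>/.
    """
    target = store / f"{name}-{version}"
    if not target.is_dir():
        raise FileNotFoundError(f"{name} {version} is not in the store")
    pth = site_dir / f"{name}.pth"
    # Each line of a .pth file names a directory to append to sys.path.
    pth.write_text(f"{target}\n")
    return pth


def demo() -> str:
    root = Path(tempfile.mkdtemp())
    store = root / "store"      # stands in for the shared site-packages
    env = root / "env-site"     # stands in for a virtualenv's site dir
    env.mkdir()
    # Install two versions side by side in the shared store.
    for version in ("1.0", "2.0"):
        pkg = store / f"pth_demo_pkg-{version}" / "pth_demo_pkg"
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text(f"__version__ = {version!r}\n")
    # Select version 2.0 for this environment only.
    activate_version(env, store, "pth_demo_pkg", "2.0")
    site.addsitedir(str(env))   # processes every .pth file found in env
    importlib.invalidate_caches()
    module = importlib.import_module("pth_demo_pkg")
    return module.__version__
```

Switching versions would then be a matter of rewriting one small .pth file rather than reinstalling anything, which is where the time savings would come from.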

pip is close to being usable, except freeze doesn’t work. zc.buildout is close to being usable, too. I think there’s a “freeze”-like plugin for it, but I don’t know how well it works. I don’t like zc.buildout quite as much as virtualenv, and I see that people even use virtualenv+zc.buildout to keep site-packages from leaking in. I also find that it leaves tons of old packages around in every buildout, again with no way to manage them.
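
For reference, the output a “freeze” step is after is just a list of pinned name==version lines. Here is a minimal sketch of that idea using importlib.metadata from the current standard library (which postdates this post). It is not how pip implements freeze, and the function name freeze is chosen only for the example.

```python
from importlib import metadata
from typing import List


def freeze() -> List[str]:
    """Return 'name==version' pins for every distribution visible on sys.path."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        # Skip entries with incomplete metadata instead of guessing.
        if name and version:
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze()))
```

Run inside a virtualenv, this lists whatever that environment can import, which is the set a requirements file would need to capture.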

What I’ve found using both zc.buildout and pip is that they are slow and annoying, because they’re constantly reinstalling things that I already have. The main reason for having a shared site-packages as I suggest above is not to save on disk space, but to save on time. In development, I want to be able to update to the latest versions of packages quickly, installing/building only the ones that have changed. How fast something runs changes how you use it, and I know that the scripts that I have for updating development and production environments reflect that.

So, I think the main thing that I’m looking for is a new tool to manage the packages that I have installed globally and within virtualenvs. Are there tools out there that are heading down this path at all?

Also, I understand the starting point that Tarek is taking with Distribute (splitting it up into logical pieces), but is there any roadmap for where it’s going to go functionally from there? Or is the intention purely that tools like the one I’m angling for will be written against the newly refactored libraries? I do know about the uninstall PEP, and that’s pleasing.

One Python-based version control system to rule them all!

We’ve just released Bespin 0.4, the major new feature of which is the first bit of the collaboration feature. Bespin 0.4 includes a ton of other changes, including one that I’m going to focus on here: Subversion support, which Gordon P. Hemsley kicked off for us a few weeks back.

Bespin’s initial version control support showed up in 0.2 with support for Mercurial. Knowing that we wanted to support multiple version control systems (VCS), I took an unorthodox approach from the beginning. Rather than providing the “hg” commands that people know and love, I created a separate set of “vcs” commands. Ultimately, we want to make it easy to grab a random open source project off the net and start hacking on it. Using the “vcs” commands, for the most common version control operations you won’t even have to think about which VCS is used by a given project.

I can run “vcs clone” (“vcs checkout” also works) to check out Bespin (in a Mercurial repository), Paver (in a Subversion repo) and hopefully soon Narwhal (in a Git repo). Also new in Bespin 0.4: Bespin’s command line has been tricked out to be able to have fancier interactions with commands, so you can enter all of the extra information that Bespin needs for checking out a repository right in the output area.

If you’ve used Subversion and one of the Distributed VCSes, you’ll know that they have a different model. The DVCSes do almost everything in a local repository copy and only talk to a remote server for push/pull. That’s actually true of Subversion as well, with one notable exception: commit. For Subversion, the “vcs commit” command will simply save your commit message for later. When you run “vcs push”, that is when an actual “svn commit” operation is run.
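
The deferred commit described above can be sketched roughly as follows. This is not uvc’s actual code: the class name, the file used to hold the pending message and the decision to return the command instead of running it are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


class DeferredSvnCommit:
    """Hypothetical helper: hold a commit message until push time."""

    def __init__(self, working_copy: Path):
        # Hypothetical file name for the saved message.
        self.pending = working_copy / ".pending_commit_message"

    def commit(self, message: str) -> None:
        """'vcs commit': record the message locally without contacting a server."""
        self.pending.write_text(message, encoding="utf-8")

    def push(self) -> Optional[List[str]]:
        """'vcs push': return the svn command to run, or None if nothing is pending."""
        if not self.pending.exists():
            return None
        message = self.pending.read_text(encoding="utf-8")
        self.pending.unlink()
        # Only at this point does a real "svn commit" happen.
        return ["svn", "commit", "-m", message]


if __name__ == "__main__":
    helper = DeferredSvnCommit(Path(tempfile.mkdtemp()))
    helper.commit("Fix typo in docs")
    print(helper.push())
```

A real implementation would hand the returned list to something like subprocess.run inside the working copy; the sketch stops short of that so it can run without Subversion installed.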

What’s neat about the “vcs” commands is that they operate the same from VCS to VCS. svn doesn’t have a feature to “add all files that are unknown”, whereas Mercurial does. “vcs add -a” operates the same on both systems.
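
One way such a uniform “add -a” could be built is sketched below, assuming the VCS is detected from the .hg or .svn marker directory. The function names are hypothetical and this is not uvc’s API. It relies on two facts about the underlying tools: “hg add” with no file arguments adds all unknown files, and “svn status” marks unversioned files with a leading “?”.

```python
from pathlib import Path
from typing import List, Optional

# Marker directory -> VCS name (only the two systems discussed here).
MARKERS = {".hg": "hg", ".svn": "svn"}


def detect_vcs(start: Path) -> Optional[str]:
    """Walk up from start until a known marker directory is found."""
    for directory in [start, *start.parents]:
        for marker, kind in MARKERS.items():
            if (directory / marker).is_dir():
                return kind
    return None


def svn_unknown_files(status_output: str) -> List[str]:
    """Pick the unversioned paths out of 'svn status' output."""
    # For '?' lines the remaining status columns are blank, so stripping
    # everything after the first character leaves just the path.
    return [
        line[1:].strip()
        for line in status_output.splitlines()
        if line.startswith("?")
    ]


def add_all_commands(kind: str, svn_status_output: str = "") -> List[List[str]]:
    """Translate 'vcs add -a' into the commands the underlying VCS needs."""
    if kind == "hg":
        return [["hg", "add"]]
    if kind == "svn":
        unknown = svn_unknown_files(svn_status_output)
        return [["svn", "add", *unknown]] if unknown else []
    raise ValueError(f"unsupported VCS: {kind}")
```

The caller never needs to know which branch was taken, which is the whole point of having a single “vcs” command set.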

If you’re interested, you can also use these commands on the command line by installing the Über Version Controller (uvc) Python package. After doing so, you can head into a random Subversion or Mercurial working copy and type “uvc status” to see what’s different. I will note that the command line tool has been, um, lightly tested since uvc is mostly used as a library for Bespin at this point.

One final note: Bespin will soon also support native “svn” and “hg” commands so that you can stick to commands and options you’re familiar with or for performing more complex operations that don’t have equivalent “vcs” commands.

You can learn more about version control in Bespin from this section of the User Guide.

MichiPUG July meeting tonight! Python in vfx, Impressive, 3.1

The next MichiPUG meeting will be on Thursday, July 2nd at 7PM. Ryan Burns will talk about the Impressive presentation software (which was used at Tuesday’s Ignite Ann Arbor) and Terry Howald will talk about Python’s use in scripting visual effects. We may also talk about the Python 3.1 release.

The meeting will be in downtown Ann Arbor at the SRT Solutions office.

MichiPUG: using Python to run reports in Hadoop clusters

Zattoo’s Marshall Weir will be talking at this week’s MichiPUG (Thursday evening at 7PM at SRT Solutions in downtown Ann Arbor). In his own words:

I’ve been working on a python module for running reports in Hadoop. It’s sort of a wrapper around the pig data processing language and some smarts for running reports on a hadoop cluster and pushing and pulling data to it. It’s designed primarily to make it easier and more efficient to run complex sets of interdependent reports – I’ve been using it to do business reporting on our customer behavior at Zattoo.

This should be very interesting for folks like me who have never seen Hadoop in action!

Paver 1.0 released!

At long last, I’ve released Paver 1.0. Here’s the announcement that I sent to python-announce:

After months of use in production and about two months of public testing for 1.0, Paver 1.0 has been released. The changes between Paver 0.8.1, the most recent stable release, and 1.0 are quite significant. Paver 1.0 is easier, cleaner, less magical and just better all around. The backwards compatibility breaks should be easy enough to work around, are described in DeprecationWarnings and were introduced in 1.0a1 back in January.

Paver’s home page: http://www.blueskyonmars.com/projects/paver/

What is Paver?

Paver is a Python-based software project scripting tool along the lines of Make or Rake. It is not designed to handle the dependency tracking requirements of, for example, a C program. It *is* designed to help out with all of your other repetitive tasks (running documentation generators, moving files about, downloading things), all with the convenience of Python’s syntax and massive library of code.

If you’re developing applications in Python, you get even more… Most public Python projects use distutils or setuptools to create source tarballs for distribution. (Private projects can take advantage of this, too!) Have you ever wanted to generate the docs before building the source distribution? With Paver, you can, trivially. Here’s a complete pavement.py::

    from paver.easy import *
    from paver.setuputils import setup
    
    setup(
        name="MyCoolProject",
        packages=['mycool'],
        version="1.0",
        url="http://www.blueskyonmars.com/",
        author="Kevin Dangoor",
        author_email="dangoor@gmail.com"
    )
    
    @task
    @needs(['html', "distutils.command.sdist"])
    def sdist():
        """Generate docs and source distribution."""
        pass

With that pavement file, you can just run “paver sdist”, and your docs will be rebuilt automatically before creating the source distribution. It’s also easy to move the generated docs into some other directory (and, of course, you can tell Paver where your docs are stored, if they’re not in the default location.)

Installation

The easiest way to get Paver is if you have setuptools installed:

    easy_install Paver

Without setuptools, it’s still pretty easy. Download the Paver .tgz file from Paver’s Cheeseshop page, untar it and run:

    python setup.py install

Help and Development

You can get help from the mailing list.

If you’d like to help out with Paver, you can check the code out from Googlecode:

    svn checkout http://paver.googlecode.com/svn/trunk/ paver-read-only

You can also take a look at Paver’s project page on Googlecode.