Python packaging/install: what I want

Python packaging and deployment can be annoying. It’s been nearly 4 years since I put out the first TurboGears release as an early adopter of setuptools/easy_install. Since then, virtualenv, pip and zc.buildout have all been released. Somehow, it still seems like more trouble than it should be to get development and production environments set up.

On Bespin, I’ve been using a combination of virtualenv and pip (scripted with Paver) in development and production environments. But I’ve found pip --freeze to be nearly unusable.

My Ideal World

After monkeying with this stuff a fair bit over the past few years, I have an idea of what I’d really like to have but I don’t think anyone’s working on it. I’d love to hear contrasting opinions or learn about projects that I’m not aware of.

  • Multiple version installation into global site-packages, as easy_install currently works (put the active package in the .pth file)
  • The better error reporting of pip (pip doesn’t meet my first desire, though, because it installs as single-version-externally-managed)
  • A tool to manage the installed packages (uninstall, select a different version)
  • In addition to a global site-packages, it would be nice to be able to specify a different site-dir for machines where I don’t have or don’t want to use root access
  • virtualenv that behaves like --no-site-packages but knows where site-packages (or the other site-dir) is
  • The tool that manages installed packages can selectively install specific versions of packages into the virtualenv by adding entries to the virtualenv’s .pth file that point into the site-packages directory
  • You can also install only into the virtualenv if you wish.
  • Install packages in that manner from a list of requirements (as with pip’s requirements file)
  • A way to freeze the set of packages currently installed in the virtualenv as a new requirements file
  • An optional cache of all of the original sdists of the installed packages
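
To make the .pth part of that list concrete, here is a minimal sketch of how a tool could "activate" one version out of several installed side by side. It assumes a present-day Python 3. The helper name `activate_versions`, the file name `active-versions.pth`, the fake package and the directory layout are all made up for illustration; `site.addsitedir` is the standard-library call that reads .pth files. This is not how easy_install or pip lay things out.

```python
import os
import site
import sys
import tempfile


def activate_versions(site_dir, version_dirs, pth_name="active-versions.pth"):
    """Record the active version directories in a .pth file and load it.

    Hypothetical helper: the function name and the .pth file name are
    illustrative assumptions, not something an existing tool provides.
    """
    os.makedirs(site_dir, exist_ok=True)
    pth_path = os.path.join(site_dir, pth_name)
    with open(pth_path, "w") as pth:
        for directory in version_dirs:
            pth.write(directory + "\n")
    # addsitedir() appends site_dir to sys.path and then processes every
    # .pth file in it, appending each listed directory that exists.
    site.addsitedir(site_dir)
    return pth_path


def demo():
    """Keep two fake versions side by side and activate only 1.1."""
    root = tempfile.mkdtemp()
    store = os.path.join(root, "store")
    for version in ("1.0", "1.1"):
        pkg_dir = os.path.join(store, "fakepkg-" + version)
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "fakepkg_pth_demo.py"), "w") as mod:
            mod.write("VERSION = %r\n" % version)
    active = os.path.join(store, "fakepkg-1.1")
    activate_versions(os.path.join(root, "site-dir"), [active])
    import fakepkg_pth_demo  # found through the new sys.path entry
    return active, fakepkg_pth_demo.VERSION
```

Selecting a different version would then just mean rewriting the one line in the .pth file, which is the kind of job I’d want the management tool to do for me.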

pip is close to being usable, except that freeze doesn’t work. zc.buildout is close to being usable, too. I think there’s a “freeze”-like plugin for it, but I don’t know how well it works. I don’t like zc.buildout quite as much as virtualenv, and I see that people even use virtualenv+zc.buildout to keep site-packages from leaking in. I also find that it leaves tons of old packages around in every buildout, again with no way to manage them.

What I’ve found using both zc.buildout and pip is that they are slow and annoying, because they’re constantly reinstalling things that I already have. The main reason for having a shared site-packages as I suggest above is not to save on disk space, but to save on time. In development, I want to be able to update to the latest versions of packages quickly, installing/building only the ones that have changed. How fast something runs changes how you use it, and I know that the scripts that I have for updating development and production environments reflect that.
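
The "install only what has changed" step is mostly a comparison between the pinned requirements and what is already installed. A minimal sketch of that comparison follows; `parse_pins` and `plan_update` are hypothetical names, and the parser only understands plain `name==version` lines, not the full requirements-file syntax.

```python
def parse_pins(lines):
    """Parse 'name==version' lines, skipping blanks and '#' comments."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError("not a pinned requirement: %r" % raw)
        # Compare names case-insensitively (an assumption of this sketch).
        pins[name.strip().lower()] = version.strip()
    return pins


def plan_update(wanted, installed):
    """Return only the pins whose version differs from what is installed."""
    return {
        name: version
        for name, version in wanted.items()
        if installed.get(name) != version
    }


wanted = parse_pins([
    "TurboGears==1.0.8",
    "lxml==2.2  # C extension, slow to build",
    "",
])
installed = {"turbogears": "1.0.8", "lxml": "2.1"}
todo = plan_update(wanted, installed)  # -> {'lxml': '2.2'}
```

Everything that is not in `todo` can be left alone, which is where the time savings would come from.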

So, I think the main thing that I’m looking for is a new tool to manage the packages that I have installed globally and within virtualenvs. Are there tools out there that are heading down this path at all?

Also, I understand the starting point that Tarek is taking with Distribute (splitting it up into logical pieces), but is there any roadmap for where it’s going to go functionally from there? Or is the intention purely that tools like the one I’m angling for will be written against the newly refactored libraries? I do know about the uninstall PEP, and that’s pleasing.

7 thoughts on “Python packaging/install: what I want”

  1. I think Tarek’s work enables what you want but won’t have the tools to do all of it. virtualenv will still be needed. A tool to freeze requirements might still be needed (I haven’t followed that aspect of what he’s doing very well). Also, implementation details may vary. For instance, last I talked to Tarek, using .pth files was not necessary to achieve the same features.

  2. Re speed, what pip *really* needs is a package cache/local repo from which it can lift already downloaded (or even already compiled) packages to install them in new virtualenvs.

    Having to download and compile all of lxml (2.9 MB, a bunch of C) or Django (5.6 MB, pure Python) every single time is a pain.

  3. I think everyone who uses zc.buildout seriously has a ~/.buildout/default.cfg file with

    [buildout]
    eggs-directory = /home/(yourusername)/tmp/buildout-eggs
    download-cache = /home/(yourusername)/tmp/buildout-download-cache

    This makes buildout mostly usable, as it doesn’t keep reinstalling the same version of the same package into each new buildout environment that you use. (BTW it sucks that you cannot specify pathnames relative to your home directory in buildout config files. There’s an open bug about that.)

  4. > Multiple version installation into global site-packages, as easy_install currently works (put the active package in the .pth file)

    This was a solution pip avoided for (I think) a good reason: each .pth entry clogs up sys.path. This slows down imports across the board (each entry is another stat, multiplied by however many imports every library and sub-library does). It’d be like appending /usr/local/just-my-library-version onto your $PATH variable for every library you installed. As far as I know, this is a problem that hasn’t ever had a good solution in any language. I don’t think even CPAN does it, although there are Perl libraries that hook into imports to make that happen on an opt-in basis.

  5. Sorry for the late reply, folks. It’s been a crazy few days.

    @Toshio: Tarek’s work is indeed laying good groundwork, and I do hope the tooling catches up. We really are way behind Ruby on this one.

    @Marius: thanks for the tip. That sounds like it’ll make buildout a lot more pleasant. I hope to give it another try in a few days.

    @hlian: Yes, you’re right. What is *really* needed is a mechanism whereby packages are mapped directly to paths by an external virtualenv/buildout-like tool. That way, “import tg” will go directly to the correct version without a lot of hunting around. Given Python’s support for custom import hooks, that should be doable even without changes to Python itself.
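
    A rough sketch of that kind of import hook, written against the importlib API of a present-day Python 3 (an assumption; it is not what was available when this was posted). The class name `PinnedPathFinder`, the fake module and the name-to-file mapping are hypothetical; a virtualenv/buildout-like tool would generate the mapping.

```python
import importlib.abc
import importlib.util
import os
import sys
import tempfile


class PinnedPathFinder(importlib.abc.MetaPathFinder):
    """Map selected top-level module names straight to a file on disk.

    Hypothetical class for illustration only.
    """

    def __init__(self, locations):
        # e.g. {"tg": "/some/store/TurboGears-1.0/tg/__init__.py"}
        self.locations = dict(locations)

    def find_spec(self, fullname, path=None, target=None):
        location = self.locations.get(fullname)
        if location is None:
            return None  # not pinned: fall back to the normal sys.path search
        # spec_from_file_location picks a loader from the file suffix and
        # treats an __init__.py as a package.
        return importlib.util.spec_from_file_location(fullname, location)


def demo():
    """Pin a made-up module name to a file that is not on sys.path."""
    version_dir = os.path.join(tempfile.mkdtemp(), "fakepkg-2.0")
    os.makedirs(version_dir)
    module_file = os.path.join(version_dir, "fakepkg_hook_demo.py")
    with open(module_file, "w") as mod:
        mod.write("VERSION = '2.0'\n")
    # Put the finder first so pinned names never touch sys.path at all.
    sys.meta_path.insert(0, PinnedPathFinder({"fakepkg_hook_demo": module_file}))
    import fakepkg_hook_demo
    return fakepkg_hook_demo.VERSION
```

    Because the lookup is a single dictionary access, the cost does not grow with the number of installed packages the way extra sys.path entries do.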

  6. @Marius Gedminas
    Ah but you can! If you put:

    [env]
    recipe = gocept.recipe.env

    you can do:

    my_dir = ${env:HOME}/path/to/greatness

    You also get access to all variables in your env in the same manner.
    Ben
