Virtualized software environments

By now, I’ve ironed out most of the problems that arose while installing CellProfiler on our cluster. Though the actual science can now begin, and this is where my focus should lie (just after finishing my NSERC application, of course), I can’t help but think about simpler ways to install complex software. This application runs in Python, with many dependencies written in C (e.g. hdf5, ATLAS). There is currently no build system provided other than a makefile, which is a bit brittle. I think this underscores the focus of the CP developers on desktop installations, rather than cluster or cloud computing environments.

CellProfiler requires a version of Python that was not installed on our cluster nodes. This suggests a solution that would let it run in a virtual environment. Here are two such alternatives that would likely have made this a less time-consuming process:

  • Virtualenv: A Python package that provides virtual Python environments. This is actually recommended on the CP2 wiki. A great introduction to virtualenv is found here.
  • CDE: A form of Linux-centric lightweight application virtualization. It runs your application, notes everything that is imported, and creates a self-contained package with all the dependencies included. Yaroslav Bulatov has a short intro on his blog. A larger series of use cases resides at the CDE website.
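
To give a rough feel for the first option, here is a minimal sketch of creating an isolated Python environment from a script. Note the swap: it uses the standard-library `venv` module (available in Python 3.3 and later) rather than the virtualenv package itself, since `venv` needs nothing installed. The `create_env` helper and the `cp-env` directory name are made up for illustration and have nothing to do with CellProfiler's own setup.

```python
import os
import tempfile
import venv


def create_env(env_dir):
    """Create an isolated Python environment in env_dir.

    Returns the path to the environment's own interpreter.
    (Hypothetical helper, for illustration only.)
    """
    # with_pip=False keeps the sketch offline; set it to True to bootstrap pip.
    # clear=True wipes env_dir first if it already exists.
    builder = venv.EnvBuilder(with_pip=False, clear=True)
    builder.create(env_dir)

    # The interpreter lives in bin/ on Linux and macOS, Scripts\ on Windows.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(env_dir, bin_dir, exe)


# "cp-env" is an assumed name; a temp directory keeps the example tidy.
env_dir = os.path.join(tempfile.mkdtemp(), "cp-env")
python_path = create_env(env_dir)
print("Environment interpreter:", python_path)
```

Anything installed with that environment's interpreter stays inside `env_dir`, so the system Python on the cluster nodes is left alone. The environment is built on top of whichever interpreter runs the script, so a suitable Python version still has to be available somewhere first.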

Note that since I’ve gone through the hard part of creating a virtualized environment from scratch, I haven’t actually used either of these 🙂 The next time I’m asked to install something unwieldy with loads of dependencies, though, it will be a different story.