NIPS Top 6 papers roundup

I’ve been stalling on this post for a while. I had a fantastic time at NIPS this year. I was completely unprepared to go to a conference where everyone I encountered was a machine learning researcher. Never once did I have to use high-level terms to describe my research; I could just let the jargon fly free, and be understood!

I’ve chosen a small selection of papers that spoke to me. There were tonnes of good papers, and plenty of blog posts reviewing them (a good start is to look for the #nips2013 Twitter stream). This selection reflects my present interests: multi-task learning papers, stochastic gradient variance reduction, clustering, and of course anything applied to biological imaging.

Jason Chang, John W. Fisher III: Parallel Sampling of DP Mixture Models using Sub-Cluster Splits

Summary: They develop an MCMC sampler that can be parallelized, leading to good performance gains. They maintain two sub-clusters for each cluster by way of auxiliary variables, and use the sub-clusters to propose good splits. They claim that the Markov chain is guaranteed to converge to the correct stationary distribution (presumably the posterior over the cluster assignments) without approximations. Code is available, which is a big plus.

Michael C. Hughes, Erik B. Sudderth: Memoized online variational inference for Dirichlet Process mixture models

Summary: They develop a new variational inference algorithm that stores and maintains finite-dimensional sufficient statistics from batches of a (very) large dataset. Seems like a split-merge model. They also have code available @
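To make the "stores and maintains sufficient statistics from batches" idea concrete, here is a minimal sketch of the bookkeeping as I understand it. It is not the authors' code: the class name `MemoizedStats`, its fields, and the choice of statistics (soft counts and weighted sums per cluster) are my own assumptions for illustration. The point is that each batch's summary is cached, so when a batch is revisited its old contribution can be subtracted from the global totals before the new one is added.

```python
import numpy as np

class MemoizedStats:
    """Hypothetical helper: per-batch and global sufficient statistics."""

    def __init__(self, num_clusters, dim):
        self.global_counts = np.zeros(num_clusters)       # soft counts per cluster
        self.global_sums = np.zeros((num_clusters, dim))  # weighted data sums per cluster
        self.batch_cache = {}                             # batch_id -> (counts, sums)

    def update(self, batch_id, data, resp):
        """data: (n, dim) batch; resp: (n, num_clusters) soft assignments."""
        new_counts = resp.sum(axis=0)
        new_sums = resp.T @ data
        if batch_id in self.batch_cache:
            # Remove this batch's stale contribution before adding the fresh one.
            old_counts, old_sums = self.batch_cache[batch_id]
            self.global_counts -= old_counts
            self.global_sums -= old_sums
        self.global_counts += new_counts
        self.global_sums += new_sums
        self.batch_cache[batch_id] = (new_counts, new_sums)
```

After every batch has been visited at least once, the global totals equal what a full pass over the data would give with the most recent assignments, which is what makes the scheme exact rather than a decaying average.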

Dahua Lin: Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation

Summary: Another variational inference algorithm for DP mixture models that also scales to very large data sets. Didn’t catch much of it at the poster session, but it looks promising.

Marius Pachitariu, Adam Packer, Noah Pettit, Henry Dalgleish, Michael Hausser, Maneesh Sahani: Extracting regions of interest from biological images with convolutional sparse block coding

Summary: A great paper about formulating a generative model of biological images, based on convolutional sparse block coding. This is basically a grouped deformable prototype model, where prototypes are grouped into blocks. A cell image in this model is represented as an interpolation with different coefficients over the elements of the block, for several blocks. The code is not yet available, but it seems like a useful way to do segmentation of cell objects when they display more variation than yeast.
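As a toy illustration of "a cell is a weighted combination of the prototypes in one block", here is a minimal sketch. Everything in it is an assumption of mine: the function name `render_cell`, the array shapes, and the use of a plain linear combination placed at a single location. The actual model in the paper also has to infer where cells are and which block explains each one, which this does not attempt.

```python
import numpy as np

def render_cell(image_shape, block, coeffs, top_left):
    """Place one cell, built from a block of prototypes, into an empty image.

    block:    (K, h, w) array of K prototype patches (hypothetical layout)
    coeffs:   (K,) mixing coefficients for this particular cell
    top_left: (row, col) position of the patch in the image
    """
    patch = np.tensordot(coeffs, block, axes=1)  # weighted sum over prototypes -> (h, w)
    image = np.zeros(image_shape)
    r, c = top_left
    h, w = patch.shape
    image[r:r + h, c:c + w] += patch
    return image
```

Different cells of the same type would share the block but use different `coeffs`, which is how the model can absorb variation in appearance.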

Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep Ravikumar, Ambuj Tewari: Learning with Noisy Labels

Summary: They propose a model of how to learn a binary classifier in the presence of random classification noise, where each label has been independently flipped with some small probability. It’s a new interpretation of biased loss functions: they give two methods, both of which reduce to the functional form of the class-weighted (biased) SVM loss.
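Here is a minimal sketch of an unbiased loss correction for this kind of noise, as I understand the idea. It assumes the flip rates are known and that they sum to less than one; the names `noise_corrected_loss`, `rho_pos` (probability a true +1 is flipped) and `rho_neg` (probability a true -1 is flipped) are mine. The corrected loss is built so that its expectation over the noisy label equals the ordinary loss on the clean label.

```python
def hinge(t, y):
    """Ordinary hinge loss for score t and label y in {-1, +1}."""
    return max(0.0, 1.0 - y * t)

def noise_corrected_loss(loss, t, y_noisy, rho_pos, rho_neg):
    """Loss on a noisy label whose expectation matches the clean-label loss.

    Assumes rho_pos + rho_neg < 1.
    """
    rho_same = rho_pos if y_noisy == 1 else rho_neg    # flip rate of the observed class
    rho_other = rho_neg if y_noisy == 1 else rho_pos   # flip rate of the opposite class
    numerator = (1.0 - rho_other) * loss(t, y_noisy) - rho_same * loss(t, -y_noisy)
    return numerator / (1.0 - rho_pos - rho_neg)

def expected_under_noise(loss, t, y_true, rho_pos, rho_neg):
    """Exact expectation of the corrected loss over the random flip."""
    flip = rho_pos if y_true == 1 else rho_neg
    keep_term = (1.0 - flip) * noise_corrected_loss(loss, t, y_true, rho_pos, rho_neg)
    flip_term = flip * noise_corrected_loss(loss, t, -y_true, rho_pos, rho_neg)
    return keep_term + flip_term
```

Note that the corrected loss can go negative for individual examples, so it is only its average over the noise that behaves like the clean loss.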

Xinhua Zhang, Wee Sun Lee, Yee Whye Teh: Learning with Invariance via Linear Functionals on Reproducing Kernel Hilbert Space

Summary: A bit outside my field of interest, but this paper really impressed me. They derive a method for learning kernels whose parameters respect formalized invariance properties. The main contribution here is a representer theorem for their augmented loss function. Given a kernel family and a function that encodes a given invariance property (e.g. an affine transformation), their optimization program finds the appropriate kernel parameters.