# Category information in the brain

The Gallant lab has a new paper out in Neuron using fMRI to study the brain’s representation of visual scene categories. It’s a slick little paper, using a fun machine learning algorithm (latent Dirichlet allocation) to show that a substantial amount of latent semantic information is available at the fMRI macroscale – raising the question of how semantic information is exploited by the visual system. Chris and I wrote a preview of the paper in Neuron. Here are the first few paragraphs:

In a 1942 essay, Jorge Luis Borges discusses the categorization of animals, purportedly found in a fictitious Chinese encyclopedia named the ‘‘Celestial Empire of Benevolent Knowledge’’ (Borges, 1942). Animals therein are classified into 14 fanciful categories, including, ‘‘fabulous ones,’’ ‘‘those that have just broken the flower vase,’’ and ‘‘those that look like flies when viewed from a distance.’’ Borges uses this example to suggest that any attempt to categorize the contents of nature is ‘‘arbitrary and full of conjectures.’’

Nevertheless (again quoting Borges), ‘‘the impossibility of penetrating the divine scheme of the universe cannot dissuade us from outlining human schemes, even though we are aware that they are provisional.’’ In fact, such schemes can be quite useful in sensory neuroscience. A decade after Borges’s essay, Barlow (1953) discovered neurons that respond selectively to stimuli that look like flies when viewed from a distance. These ‘‘fly detectors’’ were found in the retinas of frogs and, hence, were linked to a specific category of behavior (feeding). Subsequently, Hubel and Wiesel (1962) identified visual cortical cells that were described as ‘‘simple’’ and ‘‘complex,’’ and these turned out to be useful labels for understanding many aspects of the visual cortex from anatomy to computation.

More recent imaging studies have led to the suggestion that neurons with particular stimulus selectivities are clustered together, forming brain modules responsible for encoding rather abstract categories of stimuli, including faces (Tsao et al., 2006), places (Epstein and Kanwisher, 1998), and buildings (Hasson et al., 2003). Of course, the number of such categories must be far greater than the number of brain regions, which leads to the profound question of how the brain organizes such a vast quantity of visual experience. In this issue of Neuron, Stansbury et al. (2013) address this question.

# Load pickle files in Matlab

It’s easy enough to load .mat files in Python via the scipy.io.loadmat function. But what about loading .pickle files into Matlab? That’s easy enough by calling a system command in Matlab, like so:

function [a] = loadpickle(filename)
    if ~exist(filename,'file')
        error('%s is not a file',filename);
    end
    outname = [tempname() '.mat'];
    pyscript = ['import cPickle as pickle; import scipy.io; ' ...
        'dat = pickle.load(open("' filename '","rb")); ' ...
        'scipy.io.savemat("' outname '", dat)'];
    system(['LD_LIBRARY_PATH=/opt/intel/composer_xe_2013/mkl/lib/intel64:' ...
        '/opt/intel/composer_xe_2013/lib/intel64 python -c ''' pyscript '''']);
    a = load(outname); %load the converted .mat file and return its contents
    delete(outname);   %clean up the temporary file
end


Note that I’m setting the LD_LIBRARY_PATH environment variable since Matlab seems to reset this variable internally, which causes .so module import errors in Python. Note also that the variable assignment must precede the python command directly (no semicolon), so that it is exported into the Python process’s environment.
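For readability, here is roughly what the Python one-liner embedded in pyscript does, written out as a script (cPickle is the Python 2 name; on Python 3 the module is simply pickle). The function name pickle_to_mat is just a label for this sketch:

```python
import pickle
import scipy.io

def pickle_to_mat(pickle_path, mat_path):
    """Convert a pickled dict of arrays into a .mat file loadable in Matlab."""
    with open(pickle_path, "rb") as f:
        data = pickle.load(f)
    # savemat expects a dict mapping variable names to values, so this
    # only works if the pickle file contains a dict at the top level
    scipy.io.savemat(mat_path, data)
```

One caveat: the keys of the dict must be valid Matlab variable names, or savemat will complain.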

# SciTrends – now with abstracts, search, RSS

I’ve added major new features to SciTrends, which should make it a more useful article-discovery tool. It now indexes the abstracts of trending papers; you can use the full-text search to narrow down the results to your field of interest, for instance “visual cortex” or “h1n1”. You can also narrow down results by journal.

While the general feed tends to be dominated by stories about academic funding, fraud, editorial policy, politics, climate change, mind-reading, etc., the narrowed-down results point to interesting articles – try the journal Neuron with a timespan of 1 month, for instance.

You can generate an RSS feed for a specific journal/text search combo – so you can receive relevant articles right in your news reader. It’s available now at scitrends.com.

### Behind the scenes

The AltMetric API doesn’t give abstracts of papers. So I decided to cache AltMetric results locally in MongoDB and add abstracts to them using public databases.

I was surprised to find that there is no single public database one can use to retrieve abstracts based on, say, a DOI. I made a script in Python that aggregates data from several APIs:

• arXiv
• PubMed
• PLOS
• Nature

This retrieves a bit over 60% of abstracts. To complete the set, I built a simple web scraper that uses a variety of heuristics to determine the location of an abstract within a web page. It’s not perfect, but it gets a bit over half of the remaining abstracts, so overall the hit rate is about 85%. Here’s the script.
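The aggregation logic itself is simple: try each source in turn and keep the first hit. Here is a minimal sketch, where the fetcher callables are hypothetical stand-ins for the per-API lookups (arXiv, PubMed, PLOS, Nature):

```python
def first_abstract(doi, fetchers):
    """Return the first non-empty abstract found for a DOI.

    fetchers is an ordered list of callables, each taking a DOI and
    returning an abstract string or None. Sources are tried in order;
    failures (network errors, missing records) are simply skipped.
    """
    for fetch in fetchers:
        try:
            abstract = fetch(doi)
        except Exception:
            continue  # this source failed; try the next one
        if abstract:
            return abstract
    return None  # nothing found; fall through to the web scraper
```

Ordering the fetchers from most to least reliable keeps the number of wasted requests down.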

# Python refuses to use multiple cores – solution

I was trying to get parallel Python to work and noticed that if I run two Python scripts simultaneously – say, in two different terminals – they use the same core, so I get no speedup from multiprocessing/parallel Python. After some searching around, I found out that in some circumstances importing numpy causes Python to stick all computations on one core. This is a CPU affinity issue, and apparently it only happens for some combinations of numpy and BLAS libraries – other packages may cause it as well.

There’s a package called affinity (Linux only AFAIK) that lets you set and get CPU affinity. Download it, run python setup.py install, and run this in Python or ipython:

In [1]: import affinity

In [2]: affinity.get_process_affinity_mask(0)
Out[2]: 63


This is good: 63 is a bitmask corresponding to 111111, meaning all 6 cores are available to Python. Now, importing numpy and checking again, I get:

In [3]: import numpy as np

In [4]: affinity.get_process_affinity_mask(0)
Out[4]: 1


So now only one core is available to Python. The solution is simply to set the CPU affinity back after importing numpy, for instance:

import numpy as np
import affinity
import multiprocessing

# restore access to all cores after numpy clobbers the affinity mask
affinity.set_process_affinity_mask(0, 2**multiprocessing.cpu_count() - 1)
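For what it’s worth, on Python 3.3+ under Linux the standard library exposes the same controls through os.sched_getaffinity and os.sched_setaffinity, so the third-party package isn’t needed there. A sketch of the same save-and-restore idea:

```python
import os

# Save the affinity mask before any import has a chance to clobber it
allowed = os.sched_getaffinity(0)
print(allowed)  # e.g. {0, 1, 2, 3, 4, 5} on a 6-core machine

# import numpy as np   # with some BLAS builds, this shrinks the mask

# Restore access to every core we started with
os.sched_setaffinity(0, allowed)
```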



# Zotero Reader for the Web

A few months back, I announced a web application that gives you access to your Zotero library from mobile devices. I’m thrilled to announce that Zotero Reader for the Web is now available for free at zoteroreader.com. It offers the following features:

• Collections, tags, and full text search
• Automatically find PDFs for orphaned items

Since it’s a web app, it runs in a web browser, so it works on Android phones and tablets, on iPhone/iPad, and in your desktop browser (IE is not supported).

## Fundraising campaign

With the help of web developer and longtime buddy Philippe Vachon-Rivard, we’re simultaneously launching a fundraising campaign on Indiegogo for Zotero Reader for Android. This will be a native Android app based on the web version, and it will be immensely better than the (already pretty good) web version. Most exciting is background syncing, which will allow the app to seamlessly fetch your library and PDFs in advance; that way, you’ll get offline access to your papers: in the subway, on the plane, or at a conference with spotty Wi-Fi.

Breaking out of the browser sandbox will also allow the app to upload your edited PDFs to Zotero without your intervention: your notes and highlights will be synced across your devices. The app will also be much faster thanks to a native UI. The code for the app will be released on GitHub.

All in all, our goal is to make it as easy to read papers on your tablet as it is to read a book on an ereader.

Our funding goal is 6,000$. That money will go toward Phil’s development time and toward funding the server for about a year. The campaign runs until June 20th. This is an all-or-nothing funding campaign – as on Kickstarter, if we don’t reach the funding goal by the end date, you get refunded. That means you have nothing to lose – you will only be debited if the campaign is successful and we can go ahead with development. The final app will be free – every dollar you contribute will help you and your fellow scientists keep up with the scientific literature. If you’re broke, you can still contribute! Just tweet about the campaign or link to it on your blog or Facebook.

# Whiten images in Matlab

Previously, I showed how to whiten a matrix in Matlab. This involves finding the inverse square root of the covariance matrix of a set of observations, which is prohibitively expensive when the observations are high-dimensional – for instance, high-resolution natural images. Thankfully, it’s possible to whiten a set of natural images approximately by multiplying the data, in the Fourier domain, by the inverse of the square root of the mean power spectrum. Assume that X is a matrix of size (nimages,height,width); then:

mX = bsxfun(@minus,X,mean(X)); %remove the mean image
fX = fft(fft(mX,[],2),[],3); %2D Fourier transform of each image
spectr = sqrt(mean(abs(fX).^2)); %mean amplitude spectrum
wX = ifft(ifft(bsxfun(@times,fX,1./spectr),[],2),[],3); %whitened X

### Why this works

It’s not obvious at all that this should give results similar to the other method, which involved finding the inverse square root of the covariance matrix M. Let’s assume that the images are stationary – that is, the second-order statistics are the same at every point in the image. In that case, the elements of the covariance matrix depend only on the relative position between two pixels. And of course, M is a symmetric matrix.
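Before going further, the recipe is easy to check numerically. Here is a NumPy sketch (mirroring the Matlab code above, with made-up random data standing in for natural images) that whitens a stack of images and verifies that the resulting mean power spectrum is flat:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 16, 16))        # (nimages, height, width)

mX = X - X.mean(axis=0)                      # remove the mean image
fX = np.fft.fft2(mX, axes=(1, 2))            # 2D FFT of each image
spectr = np.sqrt((np.abs(fX) ** 2).mean(axis=0))  # mean amplitude spectrum
wX = np.fft.ifft2(fX / spectr, axes=(1, 2)).real  # whitened images

# The whitened images have a flat (all-ones) mean power spectrum:
wf = np.fft.fft2(wX, axes=(1, 2))
print(np.allclose((np.abs(wf) ** 2).mean(axis=0), 1.0))  # True
```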
Stationarity and symmetry together mean that the product Mx performs a convolution of x with the kernel associated with M – call this kernel m. That kernel is the autocorrelation of the images, summed over the image set. If $Mx = m * x$, then by the convolution theorem, $Mx = F^{-1}(F(m) \cdot F(x))$. The Fourier transform F is a linear mapping from one n-dimensional complex vector to another, so $F(x) = \hat F x$, where $\hat F$ is an n-by-n matrix. Then we have $Mx = \hat F^{-1} \text{diag}(\hat F m) \hat F x$. If we take images X and transform them via $Y' = \hat F^{-1} \text{diag}((\hat F m)^{-1/2}) \hat F X'$, it’s easy to verify that the covariance of Y is a scalar times the identity matrix. Therefore, modulo edge effects, which break the stationarity assumption, images can be whitened by dividing them in the Fourier domain by the square root of the mean power spectrum.

# Canopy scientific Python editor for Windows

In my last post on IDEs for scientific Python, I couldn’t install Canopy, a commercial IDE developed by Enthought, who sponsor SciPy, and therefore couldn’t properly review it. I’ve since had a chance to install it on Windows and try it out.

Canopy’s main screen shows three options: an editor, a package manager, and a documentation browser. The package manager offers a graphical interface for much the same tasks as easy_install and pip. The documentation browser offers shortcuts to the online docs for SciPy, matplotlib, and more; it would be preferable, IMHO, if the installation included offline copies of this documentation.

The main editor interface is uncluttered and fairly basic. ipython is used as the interpreter, and interestingly, the interface offers an option to render matplotlib graphics inline via SVG rather than in separate windows.
Canopy’s editor is fairly smart, offering autocompletion and basic introspection, on par with Spyder’s but not as advanced as PyDev’s. Docstrings pop up in a tooltip when you hit the Tab key after an opening parenthesis. The editor supports running either a complete file or a selection via the run tools. And that’s pretty much it: there’s no function browser, project manager, or graphical debugger – although post-hoc debugging is supported in the ipython console via the debug command. I appreciate that the interface is solid and uncluttered – this was the main positive I mentioned when discussing IEP – but I was expecting more from a commercial product, especially one that costs 199$ for the 64-bit Windows version. Free versions are available for 32-bit platforms and academic users, but support is limited, AFAIK, to StackOverflow.

This is a new product, and it seems that Enthought wants to make scientific Python more accessible to novices by removing the main barriers to entry – the next version promises integrated e-learning tutorials in the interface. This is a noble goal, although I can see other, perhaps more productive, ways to accomplish it, for instance:

• Interactive notebooks on specific themes, running in ipython Notebook, similar to Maple’s tutorials
• Graphical access to plotting commands and common analysis commands, as in Matlab and RKWard
• A Matlab-to-Python table of equivalence integrated in the documentation
• Wizards for common tasks (for instance, a data import tool)
• Written introduction to Python with emphasis on science in the documentation, like here.

I like the editor, and in terms of bare-bones, smooth editors it’s a little better than IEP. In terms of features, however, Spyder is better. It’s a good option if you can get it for free (academic or 32-bit versions).

• Editor: 3.5/5
• Interpreter integration: 4/5
• Plotting: 4/5
• Debugging: 2/5
• Smooth: 4/5

Overall score: 3.5 if you can get it for free, 2.5 otherwise.