There’s a lot of excitement at the intersection of neuroscience and AI right now. I had a chance to go to the MAIN2019 conference (Montreal AI and neuroscience), and what an amazing line-up of people and interesting talks it was. This is a tour of the main ideas that caught my attention at the conference.
From natural to artificial intelligence
Humans are intelligent, and by emulating humans, whether at the level of circuits, algorithms, affordances or objectives, we could create intelligent machines. There’s a cogent explanation of how that’s worked in the past and how that might work in the future in this editorial from DeepMind. There are two concrete examples that people often point to:
- Convolutional neural nets, which are inspired by the visual system, ultimately from the work of Hubel and Wiesel
- Reinforcement learning, loosely inspired by the behaviour of the basal ganglia
In addition, there’s a long tail of smaller ideas; for instance, the idea of using different types of normalization to facilitate gradient-based learning (whether the role is the same in the brain and in machines is another question, but that’s a story for another day).
Constraining deep neural nets with brain data
It’s a lot easier to name instances where neuroscience has inspired AI almost by accident than to make a coherent research program based off of that idea. But some people are trying.
One idea is to use brain data (neurophysiological recordings or fMRI) to constrain deep neural nets to a good subspace with high performance. Leila Wehbe from CMU presented some ideas along these lines, fitting a language model on pooled fMRI data and text documents. She mentioned that right now, joint training predicts the fMRI data well but the results are equivocal on language tasks.
It might just be that there is vastly more text data than brain data. Pierre Bellec aims to collect hundreds of hours of brain data (MEG and fMRI) from a dozen subjects as part of the Courtois Neuromod project. These large-scale projects could generate the vast amounts of data necessary to train ANNs with brain data exclusively. Very recently there was a proof of concept that showed more robust classification using brain data from mouse visual cortex as a prior on image classification. These are very early days but I think this approach will work eventually. The only question is whether it will ever be economical to use this method (i.e. $500 an hour to run an MRI scanner versus pennies per classification on mturk).
Genes as an information bottleneck
Another very popular idea that has been put forward by many groups is that there are two scales of learning in human brains:
- a human lifetime scale, i.e. a lot of unsupervised learning, a sprinkle of reinforcement learning, and the tiniest sliver of supervised learning that happens over a single person’s lifetime. This timescale might involve changing billions of weights (synapses) in the brain.
- the evolutionary timescale, i.e. the massive information bottleneck that comes from the tens of thousands of genes we have to encode a nervous system composed of billions of neurons.
There’s an underlying sub-idea that many individual neurons don’t matter all that much in this context. This is a point that’s been made by many recently, most notably in an editorial in Nature Neuroscience.
Tony Zador had a talk provocatively titled “a critique of pure learning” that made the point (see also his paper) that humans and animals don’t really learn all that much. Instead, they have a number of built-in behaviours that get refined after birth; most of the intelligence is innate.
Evolution follows fundamentally different dynamics than the gradient-based learning algorithms commonly used in deep learning. Instead of an objective that is optimized by nudging the current estimate of weights, an ensemble of learners carries forward traits that can mutate or be exchanged.
One interesting consequence is that different kinds of collective behaviours with high fitness can emerge: cooperative, individualistic, etc. Evolution moves the whole population towards one of many deep local optima. These are not the benign saddle-point extrema or relabeling equivalences of deep learning that can be dealt with by stochastic gradient descent. You need new mathematical machinery to understand these ensembles, namely complexity theory (aka complex systems). It’s a really fascinating area of mathematical physics (here’s a book on the subject) that is ripe to be explored more in the context of neuroscience. Blaise Aguera y Arcas showed some very early work on this problem and I think it will prove fruitful in the next few years.
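To make the contrast between nudging a single estimate and evolving a population concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the one-dimensional quadratic `loss`, the learning rate, the population size, and the mutation-only selection scheme (no exchange of traits between learners). It is not taken from any of the talks, and a single-optimum toy like this does not show the multiple deep local optima discussed above.

```python
import random

def loss(w):
    # Toy objective with a single minimum at w = 3 (hypothetical, for illustration only).
    return (w - 3.0) ** 2

def grad(w):
    # Analytic derivative of the toy loss.
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=100):
    # One learner, nudged along the negative gradient at each step.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def evolve(pop_size=20, generations=100, sigma=0.3, seed=0):
    # A population of learners: keep the fitter half, refill with mutated copies.
    # No gradient is ever computed; only fitness comparisons are used.
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=loss)
        survivors = population[: pop_size // 2]
        children = [w + rng.gauss(0.0, sigma) for w in survivors]
        population = survivors + children
    return min(population, key=loss)

w_gd = gradient_descent(w0=-5.0)
w_evo = evolve()
print(f"gradient descent: {w_gd:.3f}, evolution: {w_evo:.3f}")
```

Both procedures end up near the same minimum here, but only the first one needs `grad`; the second needs nothing beyond a way to rank individuals, which is why it extends to settings where fitness depends on the rest of the population.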
Artificial to natural intelligence
There were also a number of talks on improving neuroscience research with the tools of AI. Many of the student posters followed the basic template:
I used machine learning method X to solve neuroscience subtask Y and got results Z
I think this is a good first step. The bigger impacts come when we create generic machine learning systems that can be used for a lot of different problems. For instance, Mackenzie Mathis spoke about DeepLabCut, which can segment animal videos to quantify behaviour in the lab and which has already had a huge impact. We also saw Gaël Varoquaux, one of the people behind sklearn. Just think of the number of papers that use the logistic regression implementation in sklearn – that is some high impact!
Yannick Roy presented a poster on a Kaggle-like environment quantifying performance for a wide range of EEG datasets and algorithms. The competition should reveal which models achieve state-of-the-art performance; these models can then be boxed so people can download pre-trained models. This will unlock state-of-the-art performance for non-experts, e.g. clinical researchers.
Separability and manifolds
People are starting to ask whether artificial neural networks and real neural networks work similarly to solve the same kinds of problems. SueYeon Chung presented some interesting work continuing the thread that Jim DiCarlo started several years back, namely that linear separability increases in natural and artificial neural nets as a function of depth, creating a reservoir of linearly separable manifolds [see this recent related paper]. She pulled straight from the old VC theory on complexity and capacity to define the capacity of a neural population in terms of convex hulls and pairwise linear separability.
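As a toy illustration of the idea that separability can increase with depth, the sketch below checks whether a perceptron can split the four XOR points, first in the raw input space and then after one hand-picked ReLU layer. The `hidden_layer` weights are an assumption chosen by hand, not a trained network, and this is not the manifold-capacity analysis from the talk; it only shows the basic notion of a representation becoming linearly separable.

```python
def perceptron_separates(points, labels, epochs=1000):
    # Runs the classic perceptron rule. Returns True if some epoch finishes with
    # zero mistakes, i.e. a separating hyperplane was found. A False result only
    # means none was found within the epoch budget.
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                errors += 1
        if errors == 0:
            return True
    return False

xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [-1, 1, 1, -1]

def hidden_layer(x):
    # Two hand-picked ReLU units (hypothetical weights, for illustration only).
    a, b = x
    return (max(0.0, a + b), max(0.0, a + b - 1.0))

raw_ok = perceptron_separates(xor_inputs, xor_labels)
deep_ok = perceptron_separates([hidden_layer(x) for x in xor_inputs], xor_labels)
print(f"separable in input space: {raw_ok}, after one layer: {deep_ok}")
```

XOR is not linearly separable in its input space, so the first check can never succeed; after the hidden layer the points map to (0, 0), (1, 0), (1, 0) and (2, 1), which a single hyperplane can split.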
There were also some strong claims about the importance of neural manifolds in understanding neural responses. Someone (I didn’t jot down whom) claimed that neurons are random projections from the neural manifolds. Do single neurons not matter anymore? Sara Solla showed some beautiful neural manifolds in the context of the control of a cursor via motor cortex. There was a linear projection of the neural activity preserved over days despite neurons dropping in and out of recording; furthermore, there was a linear transformation from neural space to manifold space preserving the spatial structure of the task, which is quite cool. My impression, however, is that in the context of the control of a cursor, only 3 variables matter: x, y and t. So we can embed a task that lives in 3 dimensions in a 3-dimensional neural space – what have we learned? I feel like making higher-dimensional versions of these tasks will be crucial to make strong claims about the importance of neural manifolds. People have started working on this in the context of vision, and the manifolds there are much higher dimensional, which must reflect the statistics of the task itself.
One thing that kept creeping into many talks is the idea of the usefulness of the tools of causality in research. Patricia Conrod presented very interesting and controversial results on computational psychiatry. She made a causal claim that marijuana use in adolescents decreased their grades, using observational data and the tools of Judea Pearl’s do-calculus. I have rarely seen an audience so visibly upset! It felt like listening to the Rite of Spring in 1913 Paris.
Causality is hugely important in epidemiology research, and people are starting to really home in on its role in neuroscience at large. Konrad Kording recently wrote an article on the use of causality in neuroscience, focused on the tools of econometrics. Recently, it’s been shown that there’s a link between causal estimation and the credit assignment problem. This suggests that neurons could perform something similar to gradient descent without global, symmetric signals; see this paper from Blake Richards about this line of work.
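The gap between an observational association and a causal effect, which is what makes causal claims from observational data so contentious, can be shown with a toy simulation. The generative model below is entirely hypothetical and is unrelated to any study mentioned here: a binary confounder `z` raises both the chance of exposure `x` and the outcome `y`, while the true effect of `x` on `y` is zero by construction. The adjustment shown is the standard backdoor formula, which only works here because `z` is assumed to be observed and to be the only confounder.

```python
import random

def simulate(n=20000, seed=0):
    # Hypothetical model: z influences both x and y; x has no effect on y at all.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0
        y = 2.0 * z + rng.gauss(0.0, 1.0)
        rows.append((z, x, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_difference(rows):
    # E[y | x=1] - E[y | x=0], ignoring the confounder.
    treated = [y for _, x, y in rows if x == 1]
    control = [y for _, x, y in rows if x == 0]
    return mean(treated) - mean(control)

def backdoor_adjusted_difference(rows):
    # Backdoor adjustment: sum over z of P(z) * (E[y | x=1, z] - E[y | x=0, z]).
    total = 0.0
    for z_value in (0, 1):
        stratum = [(x, y) for z, x, y in rows if z == z_value]
        treated = [y for x, y in stratum if x == 1]
        control = [y for x, y in stratum if x == 0]
        total += (len(stratum) / len(rows)) * (mean(treated) - mean(control))
    return total

rows = simulate()
naive = naive_difference(rows)
adjusted = backdoor_adjusted_difference(rows)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

Under these assumptions the naive comparison reports a sizeable difference even though `x` does nothing, and stratifying on `z` brings the estimate back towards zero. With real data the hard part is arguing that every relevant confounder has been measured.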
“I have massive imposter’s syndrome for applying to jobs after seeing this talk” (Postdoc friend)
Amy Orsborn presented a really impressive body of work on BCI, following from her work at Bijan Pesaran’s and Jose Carmena’s labs. The basic idea is that BCIs allow one to perform manipulations of the brain that separate different learning processes that are often conflated. Take a look at this editorial for more info.
“Can we all just define consciousness before talking about it?” (Suzanne Still)
I heard the C-word (consciousness) from Yoshua Bengio. The counterpoint is that we don’t really know what consciousness is and in the absence of knowledge we’re having a circular discussion. I think it’s a – very – interesting discussion to have over a glass of wine but my TL;DR is it’s not science yet and it’s unlikely it will be in my lifetime. But it was still cool!
Verdict: really great food for thought here. Creeping up into my top three conferences (with Photonics West and CCN). If you’re on the East Coast by all means come next year.