ResearchHere are some of the research projects I've worked on. You might also be interested in my Google Scholar profile.
Contrastive Learning Can Find An Optimal Basis For Approximately View-Invariant Functions [arXiv]
Daniel D. Johnson, Ayoub El Hanchi, Chris J. Maddison
We reveal a surprising connection between contrastive learning objectives and kernel methods: the optimum of these objectives can be reinterpreted as a positive-definite kernel, and the spectral decomposition of this kernel yields a min-max optimal basis for supervised learning under the assumption that positive pairs have similar labels.
Ningning Xie*, Daniel D. Johnson*, Dougal Maclaurin, Adam Paszke
ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation (PEPM) 2022
Motivated by the language Dex, we explore an extension of algebraic effects that ensure effects can be executed in parallel. We formalize this in an untyped lambda calculus and also provide a Haskell implementation.
Guy Lorberbom*, Daniel D. Johnson*, Chris J. Maddison, Daniel Tarlow, Tamir Hazan
NeurIPS 2021 (spotlight)
We generalize the Gumbel-max structural causal model to a family of parameterized causal models, which can be trained to minimize quantitative criteria such as the variance of counterfactual effects, and which then generalize to new counterfactual queries at test time.
Jacob Austin*, Daniel D. Johnson*, Jonathan Ho, Daniel Tarlow, Rianne van den Berg
We introduce Discrete Denoising Diffusion Probabilistic Models (D3PMs), diffusion-like generative models for discrete data that introduce non-uniform structure into the corruption process, including transition matrices that mimic Gaussian kernels, matrices based on nearest neighbors, and matrices that introduce absorbing states. We show strong results with these models on both text and quantized images.
Beyond in-place corruption: Insertion and deletion in denoising probabilistic models [arXiv]
Daniel D. Johnson, Jacob Austin, Rianne van den Berg, Daniel Tarlow
2021 ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models
We consider a class of discrete corruption processes and denoising models for sequence data that can insert and delete elements while still being efficient to train and sample from. We demonstrate that these models outperform standard in-place diffusion-like models on an arithmetic sequence task, and that when trained on the text8 dataset they can be used to fix spelling errors without any fine-tuning.
Dex programming language [code]
A research language for typed, functional array processing. I've made various contributions, including implementing extensible record and variant types, improving type inference and typeclass synthesis, and rewriting the pretty-printer. I also helped write the ICFP paper "Getting to the Point. Index Sets and Parallelism-Preserving Autodiff for Pointful Array Programming" (arXiv preprint).
Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow
NeurIPS 2020 (spotlight presentation); also presented at GRL+ 2020
We propose a differentiable layer that learns to add new edges to a graph based on a continuous relaxation of the accepting paths of a finite-state automaton. This layer can reproduce static analyses of Python programs and, when combined with a larger graph-based model and trained end to end, improves performance on the variable misuse program understanding task.
Latent Gaussian Activity Propagation: Using smoothness and structure to separate and localize sounds in large noisy environments [pdf, poster]
Daniel D. Johnson, Daniel Gorelik, Ross E. Mawhorter, Kyle Suver, Weiqing Gu, Steven Xing, Cody Gabriel, Peter Sankhagowit
We present an approach for simultaneously separating and localizing multiple sound sources recorded by multiple microphones using MAP inference and automatic differentiation. We start with a smoothness prior over the temporal, frequential, and spatial characteristics of the source sounds and use gradient ascent to find the most likely separation.
Daniel D. Johnson
ICLR 2017 (oral presentation)
This work introduces the GGT-NN, a model that uses a graph as an intermediate representation and learns to modify those graphs in response to textual input. The model is able to build knowledge graphs for most of the bAbI tasks and can also simulate the operation of simple Turing machines.
Daniel D. Johnson, Robert M. Keller, Nicholas Weintraut
International Conference on Computational Creativity 2017
We describe a neural network architecture that learns to generate jazz melodies over chord progressions. The predictions are generated by combining information from an "interval expert" that follows the melody contour and a "chord expert" that scores notes based on their relative position from the root note in the active chord.
Daniel D. Johnson
This work describes an architecture for generating polyphonic piano-roll music by using two arrays of LSTMs along perpendicular axes: time-axis LSTMs pass states forward in time, and note-axis LSTMs pass states upward across notes. The work started as a blog post, and was later published as a conference paper.