Visualizing Data using tSNE (2008)

Paper Laurens van der Maaten, Geoffrey Hinton. Visualizing Data using tSNE, J. Machine Learning Research,9(Nov):2579–2605, (2008)

Abstract
We present a new technique called “tSNE” that visualizes highdimensional data by giving each datapoint a location in a two or threedimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. tSNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for highdimensional data that lie on several different, but related, lowdimensional manifolds, such as images ofobjects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large data sets, we show how tSNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of tSNE on a wide variety of data sets and compare it with many other nonparametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by tSNE are significantly better than those produced by the other techniques on almost all of the data sets.