


HONG KONG BAPTIST UNIVERSITY
FACULTY OF SCIENCE
Department of Computer Science
Colloquium 2011 Series

Machine Learning, Persistent Topology, and Circular Coordinates

Prof. Vin de Silva
Pomona College, California

Date: March 14, 2011 (Monday)
Time: 2:30 – 3:30 pm
Venue: RRS905, Sir Run Run Shaw Building, Ho Sin Hang Campus

Abstract

I will discuss work from the early 2000s in two different fields: NLDR (nonlinear dimensionality reduction) and PCT (point-cloud topology). Then I will try to show how these two fields meet, in the problem of finding circular coordinates for a data set. This newer work has possible applications in signal processing and dynamical systems.

The idea behind NLDR is to take a high-dimensional data set, perhaps obtained as a collection of scientific measurements, and to find a small set of real-valued coordinates that reveal meaningful parameters of the data. The classical linear instance of this is principal components analysis (PCA). The paradigm was introduced by Josh Tenenbaum in the late 1990s. Two well-known algorithms, Isomap (Tenenbaum, dS, Langford) and LLE (Roweis, Saul), were published in 2000, and many other researchers have published NLDR algorithms since then. Each algorithm exploits a different aspect of the inherent geometry of the data in order to construct the coordinates.

Over roughly the same period, several groups of researchers have been developing tools and techniques for applying algebraic topology to scientific data. Here the idea is to detect the topological structure of a set of high-dimensional observed data points. The difficulty is that data are inherently noisy, and topological invariants are extremely sensitive to local noise. The early breakthrough came in 2000, with the publication of the persistence algorithm of Edelsbrunner, Letscher and Zomorodian. This new framework gives robust versions of the classical invariants of algebraic topology (such as homology and Betti numbers), which can be used to estimate the topology (or "shape") of a noisy data set.

In my talk, I will present recent work which combines ideas from both fields. From the NLDR side, one can generalize from real-valued coordinates to more general coordinates. We focus on circle-valued coordinates (such as angles). To discover these coordinates, we exploit not the geometry but the topology of the data. In order to do this robustly, it is necessary to use a persistence framework. I will indicate how these calculations are carried out, and give some examples of how one can exploit the resulting coordinates in applications.
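
As a rough illustration of the persistence idea mentioned in the abstract (not the method presented in the talk, and not the general persistence algorithm), the sketch below computes only the simplest case: persistence of connected components of a point cloud, using a union-find structure over edges sorted by length. The function names `zero_dim_persistence` and `persistent_components` and the sample points are hypothetical, chosen for this example. Every component is born at scale 0 and dies at the scale where it merges into another; components that survive over a long range of scales are treated as genuine features, and short-lived ones as noise.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Death scales of the connected components of a point cloud.

    Every component is born at scale 0. As the scale grows, points closer
    than the current scale are joined; each merge kills one component.
    The last surviving component never dies (death = infinity).
    """
    if not points:
        return []

    parent = list(range(len(points)))

    def find(i):
        # Find the representative of i, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge at this scale
            deaths.append(length)

    deaths.append(math.inf)  # the one component that survives at every scale
    return deaths


def persistent_components(points, min_persistence):
    """Count components whose lifetime exceeds min_persistence."""
    return sum(1 for d in zero_dim_persistence(points) if d > min_persistence)


if __name__ == "__main__":
    # Two small, well-separated clusters (made-up sample data).
    cloud = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
    print(zero_dim_persistence(cloud))
    print(persistent_components(cloud, 1.0))
```

With a threshold well above the spacing inside each cluster and well below the gap between them, the count reports two components, which is the sense in which persistence gives a noise-tolerant estimate of a topological invariant (here, the zeroth Betti number).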

My collaborators in this work are Mikael Vejdemo-Johansson, Dmitriy Morozov, and Primoz Skraba.

Biography

Vin de Silva studied mathematics at Cambridge and Oxford, completing a doctorate in symplectic geometry under the supervision of Simon Donaldson. Since 2000, he has worked in applied topology, spending five years working in Gunnar Carlsson's research group at Stanford. His work with Josh Tenenbaum and John Langford on the Isomap algorithm is widely cited to this day, and his collaboration with Robert Ghrist on sensor network topology was honored by a SciAm50 award in 2007. Vin is currently an assistant professor of mathematics at Pomona College, California, and holds a Digiteo Chair at INRIA Saclay Île-de-France.

********* ALL INTERESTED ARE WELCOME ***********
(For enquiry, please contact Computer Science Department at 3411 2385)
http://www.comp.hkbu.edu.hk/v1/?page=seminars&id=168

