Big Biological Data

An important motivation for the establishment of the Quantitative Biology Institute is the explosion of biological data generated by technological advances that have revolutionized our ability to measure biological processes in space and time. For example, new developments in electron microscopy allow the structures of large protein and RNA complexes to be solved at atomic resolution without the need for crystallization; they promise to elucidate the structures of entire cells and organelles at near-atomic resolution. Another example is the rapid and ongoing development of light microscopy, which has pushed resolution well beyond the conventional diffraction limit; furthermore, optical sectioning techniques allow real-time visualization, at cellular resolution, of the development of entire organisms. A third example is the rapid progress in DNA sequencing technology, which allows gene transcription, RNA processing and protein translation to be probed at the whole-genome level; advances in sensitivity allow sequencing and mass spectrometry at the single-cell level. This explosion of data has created demand for computational and statistical tools for data analysis, and for physical and mathematical theories that use this new information to address outstanding biological problems.
 
The interpretation of these data will require the development of new analytic tools. For example, the sorting of cryoEM images of heterogeneous particle assemblies, the segmentation of cells in a developing embryo, and the interpretation of single-cell and long-read RNA sequencing all pose computational challenges that are at the forefront of data science. Thus faculty from mathematics, statistics and computer science are needed. Tools such as machine learning and principal component analysis need to be complemented by newer mathematical approaches such as manifold learning and diffusion maps, spectral network theory and non-linear harmonic analysis. In addition to the practical benefit of solving these problems, we hope that faculty in the Quantitative Biology Institute will be inspired by the biological findings to discover new mathematical results and physical theories.