With the introduction of new technologies such as next-generation sequencing, it has become easier and cheaper to generate high-throughput biological data. Data are now generated so quickly that data analysis has become the bottleneck for scientific discovery. The main goal of computational biology is to develop and apply mathematical, statistical, and computational methods to process and analyze large-scale biological data efficiently. In the Department of Biology and within the Center for Genomics and Systems Biology, a rigorous curriculum has been developed to train future computational biologists in genomics, mathematics, statistics, and computer science.
Computational methods being developed in the Department to answer biological questions in a range of organisms, from yeast and viruses to plants and parasites, include:
- Developing computational pipelines to assemble, annotate, and analyze whole-genome sequences, transcriptomes of different tissues and developmental stages, and protein-protein interaction networks.
- Predicting biological functions of genes and their proteins using machine-learning methods.
- Predicting gene and protein structures using statistical methods such as hidden Markov models, together with homology information.
- Developing tools to enable biologists with no computational background to analyze their data.
Faculty and students have access to cutting-edge research facilities, including the Sequencing (GenCore) Facility for high-throughput sequencing and data generation, and NYU’s high-performance computing clusters. Students are trained in computer programming (e.g., shell and Python scripting), statistical programming in languages such as R, and the use of high-performance computing systems.