Yoseph Barash  |  Principal Investigator  |  yosephb@upenn.edu


I grew up in Israel and moved to Canada when I took a postdoc position at the University of Toronto. I arrived at UPenn in 2012. During my years in academia I had the great fortune to study at excellent institutions with brilliant mentors. My B.Sc. degree was in Physics and Computer Science, at the Hebrew University. Towards the end of my studies I also started to work at a start-up company as a programmer and algorithm developer. I later decided to continue my studies in CS under the supervision of Prof. Nir Friedman at the Hebrew University. For one year, I also worked with Dr. Naftali Kaminski at the functional genomics unit at Tel Hashomer Hospital. My research was in the area of machine learning, developing probabilistic models for transcriptional regulation. After graduation, I took a position at the University of Toronto as a joint postdoc between the labs of Prof. Brendan Frey and Prof. Ben Blencowe. I kept working in the fields of machine learning and computational biology, but my focus shifted to post-transcriptional regulation and alternative splicing. In 2012 I became an Assistant Professor at the University of Pennsylvania in the Department of Genetics in the Medical School, and the Department of Computer and Information Science in the Engineering School. My lab is part of the Institute for Biomedical Informatics (IBI). In 2018 I was tenured as an Associate Professor.

Scientific Focus

I am interested in solving problems from the bio-medical field using machine learning, and probabilistic graphical models in particular.


Why machine learning and computational biology?

Current high-throughput experimental technologies make bio-medical research incredibly rich with computational challenges tangled with fundamental scientific questions. To answer these questions we need a good handle on both the bio-medical and computational aspects. Specifically, these high-throughput experiments produce large and noisy datasets with complex relations, making machine learning a particularly useful approach for analyzing these data. I develop probabilistic models that integrate diverse sources of genomic and genetic data to decipher cell regulatory mechanisms. I then use these models to produce testable hypotheses about novel regulatory mechanisms, and how these mechanisms go awry in human disease.


What about wet lab?

The wet lab component of the lab serves to validate hypotheses generated by our models, and to provide feedback for further model improvements. With our collaborators we also work to design the experiments that generate the data serving as input for our algorithms.

What specific areas do you focus on?
We focus on understanding RNA biogenesis, its regulation, and its role in human disease and in phenotypic diversity. Much of our work involves modeling alternative splicing regulation. The lab works in three main directions that pose computational, engineering, and experimental challenges:

  • Deriving new mechanistic insights into RNA biogenesis.

  • Applying our predictive algorithms for RNA processing to the study of human disease and phenotypic diversity.

  • Developing software tools that allow the greater scientific community to employ our algorithms.



· Machine learning
· Computational Biology
· Bioinformatics

· RNA biogenesis
· Alternative splicing

· Genetic variations and genomics of human disease 


· GCB 537 - Advanced Computational Biology (part of GCB graduate program, co-directed with Prof. Li-San Wang).
· CIS 700-001 (tentative) - Advanced Machine Learning in Computational Biology.

· CIS 700-001 (summer & fall 2016) Deep Learning intro and beyond (covering deeplearningbook.org cover to cover + related current papers)

· CIS 800-001 (spring 2018) "Peeking into the black box of (deep) learning models" - current research on model interpretation. 

· GCB/CAMB 752 - I moderate the sessions about transcriptome methods & analysis in Prof. Diskin's Genomics seminar.