James Taylor

Ralph S. O'Connor Associate Professor of Biology and Associate Professor of Computer Science

391 UTL
410-516-0152
jxtx@jhu.edu
Group/Lab Website
Google Scholar Profile
ORCID Profile

Biography
Research
Publications

James Taylor is the Ralph S. O'Connor Associate Professor of Biology and associate professor of computer science at Johns Hopkins University. Until 2014, he was an associate professor in the departments of biology and mathematics and computer science at Emory University. He is one of the original developers of the Galaxy platform for data analysis, and his group continues to work on extending the Galaxy platform. His group also works on understanding genomic and epigenomic regulation of gene transcription through integrated analysis of functional genomic data. James received a PhD in computer science from Penn State University, where he was involved in several vertebrate genome projects and the ENCODE project.

Our lab’s research is in genome informatics, the use of computational and statistical approaches to understand genomes. Our ultimate goal is to achieve a complete understanding of the structure and function of genomes: specifically, how information is encoded in genomes and how this encoding allows for precise, reproducible biological processes and developmental programs, yet is harnessed by evolution to generate remarkable diversity. We work toward this goal both through the study of genome function and evolution, and through the development of tools that support the broader genomics community. Our work can be divided into two main areas.

First, we develop software and infrastructure to support data-intensive biomedical research. The transformative power of genomic techniques will not be fully realized until all researchers can take advantage of them. Our group researches, develops, and implements approaches to eliminate the informatics challenges that impede genomic research. The results of our work are provided as part of the Galaxy framework (http://galaxyproject.org), which has become one of the most widely used tools for genome analysis.

Second, we use genomic, comparative genomic, and functional genomic approaches to understand genome structure and gene regulation. Our group studies regulatory element structure and mechanism in evolutionary and functional context through the development of primary analysis, data mining, and integration methods. To do this we use data generated by a variety of experimental techniques, leveraging collaborations with experimentalists working in multiple model systems, and developing new analysis algorithms and models.

Making computational biology more accessible, transparent, and reproducible

High-throughput data production technologies are revolutionizing modern biology, but progress is frequently impeded by computational details completely unrelated to the scientific questions being investigated. It is our goal to remove these impediments and make complex computational analysis more accessible. Along with the Nekrutenko Lab at Penn State, we develop Galaxy, which allows computational tools to be trivially integrated into an analysis environment in which experimental biologists can construct complex analyses. Some specific recent projects include:

Developing an approach for transparent and reproducible publication of data-intensive analysis

Reproducibility of published results is fundamental to the scientific process. However, there are serious challenges to ensuring reproducibility as results become increasingly reliant on complex computational methods. We have developed an integrated approach for publishing analyses called Galaxy Pages. Pages are interactive, web-based documents that describe a complete experiment, allowing computational experiments to be documented and published with all analysis processes and results directly connected. Readers can then view the experiment at any level of detail, inspect intermediate data and analysis steps, reproduce some or all of the experiment, and extract methods to be modified and reused. The resulting framework was published in Genome Biology (Goecks et al. 2010).

Enabling scalable genomic analysis using cloud infrastructure

With ARRA support from NHGRI, we developed an approach for composing analysis environments using cloud resources that is extremely flexible while requiring minimal expertise from users. Building on this framework, we developed Galaxy CloudMan, which allows users to create their own Galaxy environment within a cloud computing service (e.g., Amazon Web Services). The environment automatically acquires and releases compute resources as needed, resulting in more efficient and cost-effective resource use.

Developing a visual analytics framework for genomic data

We have built a new, extensible framework for web-based genome browsing within the context of Galaxy. Genome browsers have existed as long as there have been genome assemblies. However, wide availability of sequencing technologies has created a need not just to browse publicly curated data on genomes; browsers now need to support researchers who are generating their own data. Because the results of sequencing-based experiments are often very different depending on how they are analyzed, browsers should support dynamic and interactive reanalysis of data. Since the Galaxy framework already integrates analysis tools, it provides a natural substrate for building interactive visual analysis. The goal of this framework is to explore new ways to visualize, visually analyze, and integrate genomic datasets. The first result of this work is the Galaxy Track Browser, which presents a new paradigm for rapid visual exploration of the complex parameter spaces associated with high-throughput sequence analysis tools.

Supporting de-centralization of analysis resources

The rapid increase in Galaxy use has created new challenges and opportunities. Many Galaxy users are now running their own dedicated instances, either on local or cloud resources, and other labs have started to build suites of analysis tools on top of Galaxy. Galaxy has become an important resource for the genomics community. In addition to continuing our work on best-practice analyses and visual analytics, a major focus of our future work will be building infrastructure to support an increasingly decentralized Galaxy community. The core of this work is the Galaxy Tool Shed, which will allow tools and best-practice workflows to be shared between Galaxy instances in a way that preserves reproducibility.

Genomics and epigenomics of gene regulation

My lab is currently engaged in several projects to better understand the structure, function, and evolution of the genomic elements that regulate gene expression, called cis-regulatory modules (CRMs), and the epigenomic features associated with gene regulation.

Establishing the genomic and epigenomic determinants of CRM activity

As part of a collaborative project with Ross Hardison at Penn State we are studying features associated with CRM activity in erythroid differentiation. Using an inducible differentiation model cell line, our collaborators have been mapping the locations of key histone modifications and transcription factors at multiple time points, as well as assaying transcription levels using RNA-seq. A major interest of the lab is the use of machine learning approaches to understand the relationship between these epigenomic features and changes in expression.

Understanding the relationship between evolutionary constraint and function

The extent to which CRMs are under evolutionary constraint remains controversial, and tissue-specific cross-species comparisons show surprisingly little overlap in transcription factor (TF) occupancy. As part of the Mouse ENCODE project, we are studying the extent to which TF-bound regions and other putative regulatory elements are conserved between human and mouse.

High-resolution mapping of chromatin structure

In a collaboration with Victor Corces at Emory University, we are developing novel analysis methods to allow high-resolution analysis of chromatin structure data. We have developed methods to model and correct for biases in 5C and Hi-C experiments, resulting in high-resolution measurement of 3D chromatin interactions. We have applied these approaches in both Drosophila and mouse to produce high-resolution, locus-specific interaction maps that have revealed new insights into chromatin structure. In ongoing work we are continuing to develop methods to incorporate this information with other functional genomic datasets.

Displaying the 20 most recent publications. View the Google Scholar Profile for the complete publications list.

M Freeberg, J Taylor
Approaches for small RNA-seq in Galaxy
F1000Research 6, 2017

BA Grüning, E Rasche, B Rebolledo-Jaramillo, C Eberhard, T Houwaart, ...
Jupyter and Galaxy: Easing entry barriers into complex data analyses for biomedical researchers
PLOS Computational Biology 13 (5), e1005425, 2017

YH Jung, MEG Sauria, X Lyu, MS Cheema, J Ausio, J Taylor, VG Corces
Chromatin States in Mouse Sperm Correlate with Embryonic and Adult Regulatory Landscapes
Cell reports 18 (6), 1366-1382, 2017

G Yardimci, H Ozadam, MEG Sauria, O Ursu, KK Yan, T Yang, ...
Measuring the reproducibility and quality of Hi-C data
bioRxiv, 188755, 2017

C Anderson, C Zhou, A Cho, H Siddiqi, B Mormann, CM Avelis, ...
Natural variation in stochastic photoreceptor specification and color preference in Drosophila
bioRxiv, 153445, 2017

TR Luperchio, MEG Sauria, X Wong, MC Gaillard, P Tsang, K Pekrun, ...
Chromosome Conformation Paints Reveal The Role Of Lamina Association In Genome Organization And Regulation
bioRxiv, 122226, 2017

CA Stewart, DY Hancock, M Vaughn, N Merchant, JM Lowe, J Fischer, ...
Jetstream (NSF Award 1445604) Year Program Year 2 Annual Report (Dec 1, 2015–Nov 30, 2016)

N Turaga, MA Freeberg, D Baker, J Chilton, A Nekrutenko, J Taylor, ...
A guide and best practices for R/Bioconductor tool integration in Galaxy
F1000Research 5, 2016

N Coraor, J Taylor, J Chilton, R Marenco
Galaxy Architecture
F1000Research 5, 2016

J Goecks, A Nekrutenko, J Taylor
Galaxy at scale: 2016 and beyond
F1000Research 5, 2016

N Goonasekera, A Lonie, J Taylor, E Afgan
CloudBridge: a Simple Cross-Cloud Python Library
Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at …, 2016

CA Stewart, DY Hancock, M Vaughn, J Fischer, T Cockerill, L Liming, ...
Jetstream: performance, early experiences, and early results
Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at …, 2016

E Afgan, A Lonie, J Taylor, K Skala, N Goonasekera
Architectural models for deploying and running virtual laboratories in the cloud
Information and Communication Technology, Electronics and Microelectronics …, 2016

CA Stewart, DY Hancock, M Vaughn, NC Merchant, JM Lowe, J Fischer, ...
System Acceptance Report for NSF award 1445604” High Performance Computing System Acquisition: Jetstream-A Self-Provisioned, Scalable Science and Engine...

E Afgan, D Baker, M Van den Beek, D Blankenberg, D Bouvier, M Čech, ...
The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update
Nucleic acids research 44 (W1), W3-W10, 2016

CA Stewart, I Foster, NC Merchant, J Taylor, MW Vaughn
High Performance Computing System Acquisition: Jetstream–A Self-Provisioned, Scalable Science and Engineering Cloud Environment (Year 1 Annual Report)

B Grüning, E Rasche, B Rebolledo-Jaramillo, C Eberhart, T Houwaart, ...
Enhancing pre-defined workflows with ad hoc analytics using Galaxy, Docker and Jupyter
bioRxiv, 075457, 2016

N Turaga, MA Freeberg, D Baker, J Chilton, G Team, A Nekrutenko, ...
A guide and best practices for R/Bioconductor tool integration in Galaxy [version 1; referees: awaiting peer review]

CA Stewart, I Foster, NC Merchant, J Taylor, MW Vaughn
Jetstream–A Self-Provisioned, Scalable Science and Engineering Cloud Environment-NSF Acceptance Report

E Afgan, N Coraor, J Chilton, D Baker, J Taylor
Enabling cloud bursting for life sciences within Galaxy
Concurrency and Computation: Practice and Experience 27 (16), 4330-4343, 2015