Bio
I’m currently a data scientist and bioinformatician in the team led by Leonardo Collado-Torres at LIBD.
Here, I use R, Python, shell scripting, and more to develop computational pipelines for processing genomic data, explore and implement machine-learning methods, and publish research papers showcasing open-source software and biological findings. My work in genomics spans topics such as bulk RNA sequencing (for which I published SPEAQeasy), whole-genome bisulfite sequencing (see BiocMAP), spatial transcriptomics (see visiumStitched), and cell-type deconvolution.
I also help install and maintain software for LIBD and collaborators, and I mentor others and share my knowledge through data science guidance sessions and R Stats Club presentations. Example topics I’ve taught include implementing machine-learning models in Python, system-wide software installations on high-performance computing clusters with Lmod, and running PyTorch-based software with GPUs in HPC environments.
Portfolio
Data Science Snippets
Here I showcase actual code I’ve developed for data-science tasks throughout my career.
- Building machine-learning models: Here I use scikit-learn to train and test some candidate cell-type-classification models.
- Data wrangling, cleaning, and visualization: Beginning with several messy, large datasets, I use dplyr in R to clean and integrate them. Along the way, I visualize key patterns with ggplot2.
- Deep learning: In a personal project to build a neural-network-powered chess engine, I use Keras/TensorFlow to build and fit a CNN-based model.
Bioinformatics Pipelines
These are Nextflow-based pipelines for processing genomic data, for which I was the lead developer.
- SPEAQeasy: bulk RNA-seq preprocessing workflow, quantifying features like genes into SummarizedExperiment R objects (code | documentation | paper)
- BiocMAP: preprocessing workflow for bisulfite sequencing data, quantifying methylation in bsseq R objects (code | documentation | paper)
Education
IBM | Online
Data Science Professional Certification | October 2024 - January 2025
University of Maryland, Baltimore County | Baltimore, MD
B.S. in Mathematics | September 2013 - May 2018
Experience
Lieber Institute for Brain Development | Research Associate II - Data Science | November 2018 - Present
Publications
Integrating gene expression and imaging data across Visium capture areas with visiumStitched
- Eagles et al., BMC Genomics, 2024.
BiocMAP: a Bioconductor-friendly, GPU-accelerated pipeline for bisulfite-sequencing data
- Eagles et al., BMC Bioinformatics, 2023.
- Eagles et al., BMC Bioinformatics, 2021.
My full list of publications is available at my ORCID profile.
Public Presentations
Bioconductor | Bioconductor | 2023
- Presented a workshop: A Bioconductor-style differential expression analysis powered by SPEAQeasy (recorded as a video).
- Presented a short talk: Spot Deconvolution in the Post-Mortem Human DLPFC (recorded as a video).
Biological Data Science | Cold Spring Harbor Laboratory | 2022
- Presented a poster: Benchmarking spot-level cell-type deconvolution methods using Visium immunofluorescence benchmark data on the human dorsolateral prefrontal cortex.
European Bioconductor | Bioconductor | 2020
- Presented my SPEAQeasy paper.