AI Usage in Cores
AI in Our Cores
| Core | Use of AI |
|---|---|
| Advanced Microscopy Facility Director: Sjie Hao, PhD pfa2xb@virginia.edu | AMF’s AI-related services and research focus on two main components: AI-guided imaging and AI-assisted image data analysis. AI-guided imaging: The LSM 980 is equipped with an AI-based sample finder that scans fluorescently labeled tissue sections using specialized objectives and coherent illumination. Automated, AI-assisted segmentation is then applied to identify and outline regions of interest. This process generates a Google Earth–style map of the entire sample, enabling efficient navigation and facilitating downstream high-resolution imaging. The segmentation model is user-trainable, allowing customization for specific experimental needs. AI-assisted image data analysis: AMF supports machine learning–based instance segmentation using both open-source tools (Fiji-Weka, Labkit, Ilastik, Cellpose) and commercial software (Bitplane Imaris and ZEN Intellesis). Leveraging the high-performance image server HIVE and the Google Colab platform, AMF develops customized image analysis pipelines using libraries such as scikit-image, scikit-learn, and TensorFlow for advanced image data analysis. |
| Bioinformatics Core Director: Pankaj Kumar, PhD pk7z@virginia.edu | The Bioinformatics Core is systematically integrating artificial intelligence and machine learning (AI/ML) into standard bioinformatics pipelines to improve data quality, reproducibility, and scalability across single-cell, spatial, and bulk omics analyses. Traditional workflows rely on subjective parameter thresholds (e.g., gene counts, mitochondrial read percentages) for quality control, which can introduce bias and reduce reproducibility. DSSR addresses these limitations by incorporating automated machine learning (AutoML) approaches that learn data-specific quality thresholds from curated examples, enabling more objective and precise filtering. ML-based methods are also applied for batch-effect detection and correction. Core staff have contributed to multiple pan-cancer projects, including an ML-based analysis of TCGA RNA-seq data that identified a 19-gene signature predictive of pathogenic lymph node involvement in melanoma, informing personalized surgical decision-making and resulting in a patent application and a publication in Annals of Surgery. In alignment with UVA’s strategic emphasis on AI and the continued expansion of the UVA Cancer Center, DSSR is advancing AI-driven integrative analyses that link institutional molecular datasets (e.g., ORIEN Avatar genomics) with clinical/EHR data and major public resources (TCGA, ICGC, CPTAC, All of Us, UK Biobank). These efforts position the Core to meet the growing demand for integrative, AI-enabled analyses as data scale and complexity increase. A Core-supported UVACCC trainee is developing a tissue-aware, AI-driven cell annotation framework that hierarchically classifies cells from broad lineages to fine-grained functional states while explicitly incorporating tissue context, an essential feature for accurately modeling immune and tumor populations. As an initial application, a glioblastoma-specific model is being developed to enable probabilistic, confidence-scored cell annotation of Visium HD spatial transcriptomics data, directly supporting ongoing and future spatial omics studies at UVA. To further strengthen AI/ML expertise, Core staff were selected for the AIM-AHEAD All of Us Training Program (Cohort 3), a competitive NIH initiative focused on advanced AI/ML methods applied to the All of Us Research Program dataset. The dataset integrates electronic health records and multi-omic data from over 600,000 participants, and the program provides hands-on training in scalable, equitable, and reproducible AI-driven biomedical research. |
| Biomolecular Analysis Facility Director: Nicholas Sherman, PhD nes3f@virginia.edu | The Biomolecular Analysis Facility Core (BAF) has a new mass spectrometer, the Thermo Orbitrap Astral Zoom, capable of DDA (data-dependent acquisition) and DIA (data-independent acquisition) on moderate- to high-complexity samples at 3 to 50 minutes per run with high sensitivity. Given this speed and sensitivity, the instrument often acquires fragmentation spectra (MS2) that contain data from multiple overlapping peptide ions within the isolation window. Previously, software often identified only the most abundant species or, at best, made a second pass to obtain two peptide identifications per spectrum. New software in the Core from Thermo, Proteome Discoverer 3.3, uses CHIMERYS, an AI-based search algorithm developed by MSAID, which allows many peptides to be identified from each MS2 fragment scan. CHIMERYS uses non-negative L1-regularized regression (LASSO) to explain the experimental spectrum using predicted fragment ion intensities and retention times from the INFERYS deep-learning model. It aims to explain as much experimental intensity as possible with the fewest candidate peptides. Together, these hardware and AI software advances allow identification of 3 to 5 times more peptides in most samples. |
| Biomolecular Magnetic Resonance Facility Director: Jeff Ellena, PhD jfe@virginia.edu | The BioNMR Core and its clients use NMR data processing and analysis applications that have AI components. These applications include ARTINA, NMRFAM-SPARKY, POKY, TALOS-N, and MestReNova. Most, but not all, of these applications are free and open source, and the BioNMR Core provides support for them. |
| Biostatistics and Population Science Shared Resource Director: Hong Zhu, PhD hzhu2m@virginia.edu | BPSSR personnel have collaborated with basic, clinical, translational, population, and data science investigators on the development, validation, and implementation of AI/ML tools for risk prediction and clinical decision support for patients with, or at risk of, cancer. Examples include: 1. Leverage integrative analyses, machine learning (ML) methods, and data from two large Phase III clinical trials (NCT00046930, NCT02085408) to develop and validate a novel risk prediction model for overall survival of older patients with acute myeloid leukemia (AML), in a P01-funded study. 2. Use ML methods to validate and extend risk prediction models for hepatotoxicity and acute neurotoxicity among a contemporary multiethnic population of children with acute lymphoblastic leukemia (ALL), integrating clinical, demographic, social determinants of health (SDoH), genomic, and lipidomic data, in a U54 SPORE-funded study. 3. Develop a digital clinical decision support tool that uses federated learning and natural language processing (NLP) with unstructured and structured clinical and real-world data to predict efficacy and toxicity in CAR-T cancer therapy (STTR R41 submission). 4. Conduct AI-driven analysis of large real-world oncology datasets (ORIEN) to inform systemic therapy selection and predict treatment outcomes and tolerability in patients with metastatic breast cancer. 5. Apply ML models to healthcare claims data (SEER-Medicare) to identify patients with metastatic breast cancer who are at high risk of functional decline and in need of geriatric assessment. 6. Use NLP to extract information about tobacco product use from TRICARE claims data. 7. Develop a modular multitask learning architecture and a unified heterogeneous event sequence framework for remission-level risk prediction. |
| Flow Cytometry Core Facility Director: Mike Solga, MS mds4z@virginia.edu | The Flow Cytometry Core Facility supports the application of artificial intelligence (AI) and machine learning for the analysis of high-dimensional single-cell and spatial cytometry datasets. Core staff assist investigators with experimental design to ensure datasets are suitable for advanced and automated analysis workflows that incorporate dimensionality reduction, unsupervised clustering, and automated population identification across conventional, spectral, imaging flow cytometry, and mass cytometry platforms. These approaches enable unbiased interrogation of multiparametric datasets and improved reproducibility for the identification of rare or novel cell populations. In addition, the core evaluates and pilots new commercial and open-source analysis platforms and provides hands-on training and workshops to support investigator data analysis. |
| Molecular Electron Microscopy Core Director: Michael Purdy, PhD mpurdy@virginia.edu | The Molecular Electron Microscopy Core (MEMC) is actively integrating artificial intelligence and machine learning tools across its research and operational workflows, positioning the core as a key resource for AI-driven structural biology at the University of Virginia School of Medicine. These efforts were highlighted in a poster presentation at the 2025 Frontiers in Clinical AI Symposium, which showcased several collaborative projects between MEMC staff and School of Medicine investigators that leverage AI to accelerate discovery. AI and ML tools are now embedded throughout the cryo-EM and cryo-ET data processing pipelines used in the MEMC. Deep learning-based particle pickers such as Topaz and crYOLO have significantly improved the identification of macromolecules in noisy cryo-EM micrographs, while deep denoising algorithms enhance raw data quality for downstream analysis. For resolving the structural dynamics of flexible and heterogeneous complexes, the MEMC supports researchers in using tools such as cryoDRGN and CryoSPARC 3DFlex. In collaboration with the Jomaa and Qi laboratories, these methods were applied to reveal conformational ensembles of ribosomes and to resolve the structure and dynamics of the human ERAD complex, a multi-protein machine critical for protein quality control. AlphaFold3 was used in the same study to generate accurate initial atomic models that were refined against cryo-EM density maps, dramatically accelerating the model-building process. A particularly promising frontier is the use of AI for therapeutic protein design coupled with experimental validation by cryo-EM. The Zimmer laboratory is using BindCraft, an automated pipeline built on AlphaFold2, to computationally design novel protein binders with predicted nanomolar affinity to target molecules. These designed binders can then be experimentally validated and their structures determined using the MEMC's cryo-EM infrastructure, establishing a rapid design-validate cycle with direct implications for structure-based drug design and the development of targeted cancer therapies. AI is also advancing the MEMC's expanding in situ structural biology program, which leverages the Aquilos 2+ cryo-FIB-SEM and Titan Krios to visualize macromolecules within their native cellular environment. Tools such as TARDIS, a deep learning-based segmentation platform, enable automated identification and annotation of macromolecules in complex cryo-electron tomograms. The Redemann laboratory used TARDIS to segment mitotic spindles in human U2OS cells, demonstrating the power of AI to extract biological insight from crowded cellular volumes that would be prohibitively time-consuming to analyze manually. Beyond research applications, the MEMC is participating in the School of Medicine's Claude.ai pilot program, exploring the use of large language models to improve operational efficiency in areas such as documentation, reporting, communication, and training material development. Together, these efforts reflect the MEMC's commitment to harnessing AI at every level, from atomic-resolution structure determination to day-to-day core management, to maximize its impact on biomedical research at UVA. |
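
The Biomolecular Analysis Facility entry above describes CHIMERYS as using non-negative L1-regularized regression (LASSO) to explain a mixed MS2 spectrum with as few candidate peptides as possible. The sketch below illustrates that general idea on made-up numbers; it is not the CHIMERYS implementation. The function name `nonneg_lasso`, the three candidate intensity templates, the penalty `lam`, and the coordinate-descent solver are all assumptions chosen for illustration, and the templates are hand-written rather than predicted by INFERYS.

```python
def nonneg_lasso(templates, observed, lam=0.05, n_iter=200):
    """Minimize 0.5 * ||observed - sum_j w_j * templates[j]||^2 + lam * sum_j w_j
    subject to w_j >= 0, using cyclic coordinate descent (illustrative solver)."""
    n_bins = len(observed)
    weights = [0.0] * len(templates)
    residual = list(observed)  # observed minus the current fit (fit is zero at start)
    for _ in range(n_iter):
        for j, t in enumerate(templates):
            norm_sq = sum(v * v for v in t)
            if norm_sq == 0.0:
                continue  # an all-zero template cannot explain any intensity
            # Correlation of template j with the residual, with j's own
            # current contribution added back in.
            rho = sum(t[i] * (residual[i] + weights[j] * t[i]) for i in range(n_bins))
            # Soft-threshold by lam, then clip at zero (non-negativity).
            new_w = max(0.0, (rho - lam) / norm_sq)
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n_bins):
                    residual[i] -= delta * t[i]
                weights[j] = new_w
    return weights


# Hypothetical fragment-intensity templates over four m/z bins.
candidates = {
    "peptide_A": [1.0, 0.5, 0.0, 0.0],
    "peptide_B": [0.0, 0.5, 1.0, 0.0],
    "peptide_C": [0.0, 0.0, 0.5, 1.0],
}
# Toy chimeric spectrum built as 2 * peptide_A + 1 * peptide_B.
observed = [2.0, 1.5, 1.0, 0.0]

weights = nonneg_lasso(list(candidates.values()), observed)
for name, w in zip(candidates, weights):
    print(f"{name}: {w:.3f}")
```

In this toy case the two peptides that actually contribute receive weights close to their true abundances (slightly shrunk by the L1 penalty), while the third candidate, which shares one bin with peptide_B but is not present, is driven to exactly zero. That sparsity is what lets one spectrum yield several confident identifications instead of only the most abundant one.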