GOOD 2025

Containerized bioinformatics applications and pipelines on Open OnDemand
03-19, 14:30–14:55 (US/Eastern), Tsai Auditorium (CGIS S010)

Tufts University hosts a vibrant bioinformatics community, many of whose members are new to using the Linux command line for high-performance computing (HPC). Open OnDemand (OOD) simplifies access to HPC resources with its user-friendly web interface. At Tufts, we have deployed over 30 bioinformatics applications and nf-core pipelines on OOD, including custom RStudio servers tailored for bioinformatics. The nf-core pipelines enable users to run complex workflows with ease. Here, we share our experiences in building a custom RStudio server container for bioinformatics, deploying containerized applications as OOD apps, and transforming the complex command-line interfaces of nf-core pipelines into user-friendly OOD web applications.


The decreasing cost of next-generation sequencing (NGS) has encouraged more wet-lab scientists to integrate bioinformatics into their research. However, transitioning from wet-lab work to bioinformatics requires new skills, with the Linux command-line interface posing a significant challenge. Open OnDemand (OOD) provides a solution by offering a web-based interface that simplifies HPC access without extensive command-line navigation.
Some bioinformatics applications come with a GUI, making them suitable for OOD. At Tufts HPC, we have deployed containerized versions of popular tools like CellProfiler, FastQC, QualiMap, and Relion as OOD apps. Our custom RStudio Server for bioinformatics is particularly popular, offering a user-friendly web-based interface superior to RStudio Desktop. We researched user needs and pre-installed over 1,300 R packages, supporting a wide range of analyses beyond bioinformatics. Users can also install additional packages in their $HOME directory.
While most OOD apps are GUI-based, some command-line tools also benefit greatly from OOD integration. AlphaFold, whose developers shared the 2024 Nobel Prize in Chemistry, is a prime example. Running AlphaFold from the command line requires specifying complex database paths and parameters. OOD simplifies this by managing the configuration behind the scenes. Since we deployed AlphaFold on OOD, most users have preferred this version. In classrooms, even undergraduates with limited command-line skills have successfully submitted AlphaFold GPU jobs via OOD and obtained predicted protein structures.
nf-core is a global initiative providing over 120 open-source analysis pipelines built with Nextflow, bringing consistency and reproducibility to research in genomics and beyond. Despite these benefits, HPC centers face challenges in adopting Nextflow and nf-core. Although nf-core pipelines are designed for general usability, they often need customization for specialized HPC environments, leaving many users unsure of how to adapt them effectively. In Spring 2024, we integrated nf-core pipelines into Tufts’ OOD. Custom scripts converted hundreds of pipeline parameters into OOD form widgets, enabling users to run complex workflows via an intuitive interface. Within just five months, over 12% of Tufts bioinformatics researchers had adopted Nextflow and nf-core in their research.
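To make the parameter-to-widget conversion concrete, here is a minimal sketch in Python. It is not the Tufts scripts, which this abstract does not show. It assumes that each pipeline describes its parameters in a JSON Schema file (nf-core's nextflow_schema.json) and that an OOD form is described by attributes with keys such as widget, label, help, value, and options. The function name schema_to_ood_form, the type-to-widget mapping, and the toy schema are all hypothetical and chosen only for illustration.

```python
import json

# Assumed mapping from JSON Schema types to OOD form widget names.
WIDGET_BY_TYPE = {
    "string": "text_field",
    "integer": "number_field",
    "number": "number_field",
    "boolean": "check_box",
}


def schema_to_ood_form(schema):
    """Turn a nextflow_schema.json-style dict into an OOD form-style dict.

    Hypothetical helper: a sketch of the idea, not a complete converter.
    """
    attributes = {}
    # Parameter groups may sit under "$defs" or "definitions",
    # depending on the schema version.
    groups = schema.get("$defs") or schema.get("definitions") or {}
    for group in groups.values():
        for name, spec in group.get("properties", {}).items():
            if spec.get("hidden"):
                continue  # leave out parameters the schema marks as hidden
            attr = {"label": name, "help": spec.get("description", "")}
            if "enum" in spec:
                # A fixed set of choices becomes a drop-down menu.
                attr["widget"] = "select"
                attr["options"] = [[str(v), str(v)] for v in spec["enum"]]
            else:
                attr["widget"] = WIDGET_BY_TYPE.get(spec.get("type"), "text_field")
            if "default" in spec:
                attr["value"] = spec["default"]
            attributes[name] = attr
    # "form" lists the attributes in display order.
    return {"form": list(attributes), "attributes": attributes}


# Toy schema for illustration only; not copied from a real pipeline.
example_schema = {
    "$defs": {
        "input_output_options": {
            "properties": {
                "input": {"type": "string", "description": "Path to the samplesheet"},
                "publish_dir_mode": {"type": "string", "hidden": True},
            }
        },
        "alignment_options": {
            "properties": {
                "aligner": {
                    "type": "string",
                    "enum": ["star_salmon", "hisat2"],
                    "default": "star_salmon",
                    "description": "Alignment method",
                },
                "skip_qc": {"type": "boolean", "description": "Skip quality control"},
            }
        },
    }
}

form = schema_to_ood_form(example_schema)
print(json.dumps(form, indent=2))
```

In a real deployment the resulting dictionary would still have to be written out in the form format OOD expects and paired with a job submission script that passes the chosen values to Nextflow. Those steps are site-specific and are left out of this sketch.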

Dr. Yucheng Zhang is a bioinformatics engineer at Tufts University Research Technology, specializing in bioinformatics workflows, high-performance computing (HPC), and containerized applications. Before joining Tufts, he served as a senior life scientist at the Rosen Center for Advanced Computing (RCAC) at Purdue University, where he contributed to advancing computational solutions for life sciences.