03-19, 16:00–16:25 (US/Eastern), Tsai Auditorium (CGIS S010)
AlphaFold's protein structure predictions were recognized with the 2024 Nobel Prize in Chemistry and are poised to transform disease understanding and drug discovery. AlphaFold was initially released as open source and is now proprietary, so researchers are working to reduce the resources the code requires and to keep it openly accessible. We present an open-source implementation of AlphaFold 2 and 3 that optimizes computational resource allocation by separating the CPU and GPU phases within a single Open OnDemand (OOD) instance. This makes AlphaFold more accessible by minimizing idle GPU cycles. After benchmarking on three major clusters (NCSA Delta, Jetstream2, and Roar), we developed a user-friendly OOD application built for efficient resource use.
Our benchmarking on NCSA Delta, Jetstream2, and Roar showed that AlphaFold's workflow splits cleanly into a CPU-intensive phase (multiple sequence alignment, or MSA, generation) and a GPU-intensive phase (structure prediction). Approximately 75% of the runtime is CPU-bound; GPU resources are required only for the final structure prediction phase.
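As a rough illustration of that split, the sketch below compares the GPU time held by one monolithic job with the GPU time held when the phases are separated. The 75% CPU-bound fraction is the benchmarking figure quoted above; the 8-hour runtime and the function name are hypothetical, chosen only for the example.

```python
def gpu_hours_held(total_hours: float, cpu_fraction: float, split: bool) -> float:
    """Return how many hours a GPU stays allocated for one prediction.

    A monolithic job holds the GPU for the whole runtime; a split
    workflow holds it only for the non-CPU-bound remainder.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if split:
        return total_hours * (1.0 - cpu_fraction)
    return total_hours


total = 8.0  # hypothetical end-to-end runtime in hours
monolithic = gpu_hours_held(total, 0.75, split=False)
separated = gpu_hours_held(total, 0.75, split=True)
print(f"monolithic: {monolithic:.1f} GPU-hours, split: {separated:.1f} GPU-hours")
```

Under these assumed numbers the GPU is held for 2 hours instead of 8, which corresponds to the "up to 75%" reduction in GPU allocation time.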
Key Innovations:
1. Workflow Optimization:
• Separated CPU and GPU phases to maximize resource efficiency
• Allocated GPU resources only after the CPU phase completes successfully
• Reduced unnecessary GPU allocation time by up to 75%
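One way to express "GPU only after a successful CPU phase" is a job dependency in the scheduler. The sketch below assumes a Slurm scheduler, which this abstract does not name, and uses hypothetical script names (`af_msa.sbatch`, `af_predict.sbatch`) and resource sizes. It is not the presented application's code, only an illustration of the pattern using standard `sbatch` options.

```python
import shlex
import subprocess


def cpu_phase_cmd(script: str, cpus: int = 8) -> list[str]:
    """Build the sbatch command for the CPU-only MSA phase (no GPU requested)."""
    return ["sbatch", "--parsable", f"--cpus-per-task={cpus}", script]


def gpu_phase_cmd(script: str, cpu_job_id: str) -> list[str]:
    """Build the sbatch command for the GPU phase.

    afterok makes the job eligible to start only if the CPU job exits
    successfully, so no GPU is held while the MSA phase runs.
    """
    return [
        "sbatch",
        "--parsable",
        "--gres=gpu:1",
        f"--dependency=afterok:{cpu_job_id}",
        script,
    ]


def submit(cmd: list[str]) -> str:
    """Run sbatch and return the job ID (--parsable prints 'jobid[;cluster]')."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip().split(";")[0]


if __name__ == "__main__":
    # Dry run: print the commands with a placeholder job ID instead of submitting.
    print(shlex.join(cpu_phase_cmd("af_msa.sbatch")))
    print(shlex.join(gpu_phase_cmd("af_predict.sbatch", "12345")))
```

On a real cluster, `submit(cpu_phase_cmd(...))` would return the CPU job's ID, which is then passed to `gpu_phase_cmd`. If the CPU job fails, the dependent GPU job never starts.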
2. User Interface Development:
• Created an intuitive web interface requiring only the amino acid sequence as input
• Eliminated coding requirements for researchers
• Automated resource management and job scheduling
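Because the sequence is the only user input, a form like this would typically check it before any job is queued. The sketch below is a hypothetical illustration, not the application's actual code: it assumes that only the 20 standard one-letter amino acid codes are accepted and that the sequence is handed to the pipeline as a FASTA record. The function names are invented for the example.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def normalize_sequence(raw: str) -> str:
    """Strip whitespace, uppercase, and reject non-standard residue codes."""
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"unsupported residue codes: {', '.join(invalid)}")
    return seq


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format a validated sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical pasted input with stray whitespace and lowercase letters.
    pasted = " mktay iakqr\n"
    print(to_fasta("query", normalize_sequence(pasted)), end="")
```

Rejecting bad input in the form, before submission, means a malformed sequence never consumes CPU or GPU allocation.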
3. Cross-Platform Validation:
• Comprehensive benchmarking across three distinct HPC environments.
• Documented performance metrics for various protein sizes.
• Established optimal configuration guidelines for different infrastructures.
Our solution is now deployed on Penn State's high-performance computing platform and is available as an open-source project, so other institutions can implement similar services. This work lowers the barrier to entry for structural biology research while making better use of computational resources. The application is particularly timely given the release of AlphaFold 3, which natively supports CPU/GPU separation and is therefore immediately compatible with our framework.
Target Audience:
• HPC administrators looking to deploy AlphaFold services.
• Researchers in structural biology and related fields.
• Scientific computing professionals interested in resource optimization.
Learning Outcomes (attendees will learn about):
• Efficient AlphaFold deployment strategies.
• Resource optimization techniques for mixed CPU/GPU workloads.
• Implementation patterns for user-friendly scientific applications.
This presentation will include live demonstrations and reference our open-source codebase, providing immediate practical value to attendees.
Assistant Research Professor
Director of the EpiGenomics Core Facility
Director of the Center for Vertebrate Genomics