03-19, 17:10–17:20 (US/Eastern), Belfer Case Study Room (CGIS S020)
The Advanced Research Computing (ARC) department at the University of British Columbia (UBC) has supported a local HPC cluster for the entire UBC research community for nearly a decade. In response to researcher demand for more interactive computing options, we have begun implementing Open OnDemand as a portal to our resources, which has brought challenges related to our existing architecture. This talk will give a high-level overview of the challenges we faced and the solutions we explored. By sharing our journey, we hope to show other system administrators both how readily Open OnDemand can be adapted to existing systems and which potential challenges to keep in mind when considering adopting it.
This talk will cover our experiences implementing Open OnDemand on our local HPC system, Sockeye. Sockeye has been in operation since 2019, and during that time the needs of researchers have evolved along with their research and the tools available to them. With a growing number of researchers interested in more interactive and easy-to-use options for accessing our system, we began exploring ways to expand our offerings. Our two main challenges arose from longstanding restrictions built into the original system design to provide a more stable environment and stronger security for sensitive data on the system, such as biomedical data.
The first challenge is that write access to home directories is restricted while non-interactive jobs are running. This made it harder to ensure that job submission and logging work properly when users submit job scripts through Open OnDemand. The second is that outbound networking is restricted on nodes that are running jobs. This limitation makes it harder for users to set up environments and workflows with standard interactive tools that rely on resources from the web.
A brief outline of the talk:
- Introduction to UBC, ARC, and Sockeye
- Current Architectural Challenges
- Handling Storage Restrictions
- Networking Access Limitations
- Questions
Jacob Boschee is a Systems Administrator with the Advanced Research Computing group at UBC. He began working in systems administration at the South Dakota School of Mines and Technology, where he maintained the campus research cluster while completing his PhD in Physics. His research background is in quantum computing, phononics, and computational modelling.
As part of UBC ARC, he helps researchers use the system and develops new services that support researchers of all backgrounds and skill levels.