5.1 Best Practices for Scientific Computing

Source repo: sdsc-summer-institute-2025 | Branch: main | Last synced: 2026-04-24 10:27:17.425 UTC

SDSC Summer Institute 2025

Session 5.1 Best Practices for Scientific Computing

Date: Thursday, August 7

Summary: Jupyter Notebooks excel as an exploratory visualization tool, letting you test and iterate on ideas instantly. AI and data engineers rely heavily on notebooks to create, tune, and refine their ML models, and to explore new hypotheses, concepts, or ideas by turning them directly into code. However, whether you're building a complex ML model to estimate housing prices in a geographical region, exploring the best architecture for your CNN, or just playing around with a fun toy physics model, once a model or analysis needs to be shared, scaled, or deployed, the notebook should mature into a structured, tested, and reproducible Python package.

In this talk we'll discuss why that transformation matters, how to perform it systematically, and the techniques to achieve it: CI/CD pipelines, unit testing, containerization, and the creation of revenue-generating products.

Converting notebooks into production-ready packages not only boosts reproducibility and code quality, it also showcases the software-engineering rigor that today’s employers expect from data and AI professionals.

Presented by: Fernando Garzon

Reading and Presentations:

Lecture material:
- Presentation Slides: will be made available closer to the session
Source Code/Examples:
- SkyDiving Model: toy physics model that simulates objects in free fall and calculates their terminal speed.
- mnist_ae: ML model that predicts handwritten digits (0-9).
  - Unit test suite
- mnist_ae PyPI package
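
To give a sense of what moving a notebook cell into a tested function might look like, here is a minimal sketch loosely based on the SkyDiving example. It is not taken from the session's repository: the function name `terminal_speed`, its parameters, and the default values are hypothetical, and it assumes the standard quadratic-drag model, in which terminal speed is reached when drag equals weight, so v_t = sqrt(2 m g / (rho A C_d)).

```python
import math


def terminal_speed(mass_kg, area_m2, drag_coefficient,
                   air_density=1.225, g=9.81):
    """Return the terminal speed in m/s of a falling object.

    Assumes quadratic drag, F_drag = 0.5 * rho * A * C_d * v**2, and
    solves F_drag = m * g for v. All names and defaults here are
    illustrative assumptions, not the session's actual API.
    """
    values = (mass_kg, area_m2, drag_coefficient, air_density, g)
    if any(value <= 0 for value in values):
        raise ValueError("all parameters must be positive")
    return math.sqrt(
        2.0 * mass_kg * g / (air_density * area_m2 * drag_coefficient)
    )


# pytest-style unit tests: plain functions whose names start with "test_".
def test_terminal_speed_balances_drag_and_weight():
    mass, area, cd = 80.0, 0.7, 1.0
    v = terminal_speed(mass, area, cd)
    drag = 0.5 * 1.225 * area * cd * v ** 2
    # At terminal speed the drag force should equal the weight.
    assert math.isclose(drag, mass * 9.81, rel_tol=1e-9)


def test_terminal_speed_rejects_nonpositive_input():
    try:
        terminal_speed(0.0, 0.7, 1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for zero mass")
```

Once the logic is a plain function in a module instead of a notebook cell, a test runner such as pytest can collect the `test_` functions automatically, which is what makes it practical to run them in a CI/CD pipeline.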

TASKS: None at this time.