3.2 Data Management
Source repo: sdsc-summer-institute-2024 | Branch: main | Last synced: 2026-04-24 10:27:17.425 UTC
SDSC Summer Institute 2024
Session 3.2 Data Management
Date: Tuesday, August 6th, 2024
Summary: Proper data management is essential to making effective use of high-performance computing (HPC) systems and other advanced cyberinfrastructure (CI) resources. This session gives an overview of filesystems, data compression, archives (tar files), checksums and MD5 digests, downloading data with wget and curl, data transfer, and long-term storage solutions.
Presented by: Marty Kandes (mkandes @sdsc.edu)
Data has a lifecycle. Data management is a lifestyle.

Image Credit: Harvard Biomedical Data Management
Reading and Presentations:
-
Lecture material:
-
Source Code/Examples: