FILESYSTEMS

More files, more problems: Advantages and limitations of different filesystems

The aim of this tutorial is to teach you about the advantages and limitations of the different filesystems you'll typically find available on an HPC system. If you are not already logged in to Expanse, please log in with your training account, either via the Expanse User Portal or directly via SSH from your terminal application.

Once you're logged into the system, try to clone the GitHub repository below, which contains the CIFAR-10 dataset as all 60K raw images in JPEG format. However, please be prepared to cancel the download.

Command:

git clone https://github.com/YoongiKim/CIFAR-10-images.git

Output:

[train108@login02 ~]$ git clone https://github.com/YoongiKim/CIFAR-10-images.git
Cloning into 'CIFAR-10-images'...
remote: Enumerating objects: 60027, done.
remote: Total 60027 (delta 0), reused 0 (delta 0), pack-reused 60027
Receiving objects: 100% (60027/60027), 19.94 MiB | 26.94 MiB/s, done.
Resolving deltas: 100% (59990/59990), done.
Updating files: 4% (2723/60001)

If you have not done so already, please go ahead and cancel your git clone command. It would take far too long for all of us to download this version of the dataset. How long? This is the runtime measured when downloading it on one of Expanse's login nodes.

Command:

time -p git clone https://github.com/YoongiKim/CIFAR-10-images.git

Output:

[train108@login02 ~]$ time -p git clone https://github.com/YoongiKim/CIFAR-10-images.git
real 1724.19
user 1.01
sys 3.36

Why does it take so much time? Notice that the real (wall-clock) time is nearly 29 minutes, while the user and sys CPU times total only a few seconds: the process spent almost all of its time waiting on the filesystem, not computing. What if you attempt to clone the dataset on your laptop? This is the runtime to download the dataset onto my laptop's local disk.

Command:

time -p git clone https://github.com/YoongiKim/CIFAR-10-images.git

Output:

mkandes@hardtack:~$ time -p git clone https://github.com/YoongiKim/CIFAR-10-images.git
Cloning into 'CIFAR-10-images'...
remote: Enumerating objects: 60027, done.
remote: Total 60027 (delta 0), reused 0 (delta 0), pack-reused 60027
Receiving objects: 100% (60027/60027), 19.94 MiB | 8.03 MiB/s, done.
Resolving deltas: 100% (59990/59990), done.
Updating files: 100% (60001/60001), done.
real 4.42
user 1.25
sys 1.17
mkandes@hardtack:~$

What is going on here? Why is there such a big difference in the time to download the same dataset? Well, not all filesystems are local. For example, your HOME directory on Expanse is not physically located on the login node. It is hosted remotely on another server using the Network File System (NFS), which is a distributed filesystem.

NFS Architecture
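A quick way to confirm this from the shell is to ask for the filesystem type backing your HOME directory; stat's -f flag reports filesystem (rather than file) metadata. This is a small check using standard GNU coreutils:

Command:

stat -f --format=%T "${HOME}"

On an NFS-mounted HOME directory, this should print nfs.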

You can find which NFS server your training account's HOME directory is located on with the following command.

Command:

cat /etc/auto.home | grep "${USER}"

Output:

[train108@login02 ~]$ cat /etc/auto.home | grep "${USER}"
etrain108 -fstype=bind :/expanse/nfs/home4/etrain108
train108 -fstype=bind :/expanse/nfs/home1/train108
[train108@login02 ~]$
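These entries are an autofs map: the first time you access /home/${USER}, autofs bind-mounts the corresponding path from the NFS-backed /expanse/nfs tree onto it. You can see the resulting live mount yourself (a quick check; the exact output will vary):

Command:

mount | grep "home/${USER}"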

How much space is available in your HOME directory? Can you answer this question by using the df command?

Command:

df -Th | grep "${USER}"

Output:

[train108@login02 ~]$ df -Th | grep "${USER}"
10.22.100.111:/pool1/home/train108 nfs 210T 14T 196T 7% /home/train108
10.22.100.114:/pool4/home/etrain108 nfs 205T 14T 191T 7% /home/etrain108
[train108@login02 ~]$

Nope; df reports the size of the entire NFS pool hosting your directory, not your individual quota. But it does allow you to see the different types of filesystems mounted and the total amount of storage space available on each.

Command:

df -Th

Output:

[train108@login02 ~]$ df -Th
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 63G 4.0K 63G 1% /dev
tmpfs tmpfs 63G 12M 63G 1% /run
/dev/sda2 ext4 32G 22G 7.9G 74% /
none tmpfs 63G 245M 63G 1% /dev/shm
tmpfs tmpfs 63G 0 63G 0% /sys/fs/cgroup
/dev/sda4 ext4 32G 1.5G 29G 5% /tmp
/dev/sda1 vfat 100M 0 100M 0% /boot/efi
/dev/sdb1 ext4 879G 44K 834G 1% /scratch
10.22.100.114:/pool4/home nfs 205T 14T 191T 7% /expanse/nfs/home4
10.22.100.111:/pool1/home nfs 210T 14T 196T 7% /expanse/nfs/home1
10.22.100.112:/pool2/home nfs 199T 18T 181T 10% /expanse/nfs/home2
ps-071.sdsc.edu:/ps-data/community-sw nfs 1.0T 301G 724G 30% /expanse/community
master:/home nfs 140G 83G 58G 60% /expanse/nfs/mgr1/home
10.22.100.113:/pool3/home nfs 196T 14T 182T 7% /expanse/nfs/home3
10.21.0.21:6789,10.21.11.7:6789,10.21.11.8:6789:/ ceph 1.6T 972G 652G 60% /cm/shared
192.168.43.5:6789,192.168.43.6:6789:/ ceph 3.3P 44T 3.3P 2% /expanse/ceph
10.22.101.123@o2ib:10.22.101.124@o2ib:/expanse/projects lustre 11P 3.8P 7.1P 35% /expanse/lustre/projects
10.22.101.123@o2ib:10.22.101.124@o2ib:/expanse/scratch lustre 11P 3.8P 7.1P 35% /expanse/lustre/scratch
10.22.100.112:/pool2/home/erfan nfs 199T 18T 181T 10% /home/erfan
10.22.100.111:/pool1/home/geyan1 nfs 210T 14T 196T 7% /home/geyan1
10.22.100.111:/pool1/home/lcmoore nfs 210T 14T 196T 7% /home/lcmoore
tmpfs tmpfs 13G 0 13G 0% /run/user/531940
tmpfs tmpfs 13G 0 13G 0% /run/user/532702
10.22.100.111:/pool1/home/kblighe1 nfs 210T 14T 196T 7% /home/kblighe1
10.22.100.114:/pool4/home/ksheriff nfs 205T 14T 191T 7% /home/ksheriff
10.22.100.113:/pool3/home/aah217 nfs 196T 14T 182T 7% /home/aah217
...

Here, you see there are a number of local filesystems (e.g., ext4) associated with storage devices physically attached to the login node, as well as several distributed filesystems in addition to NFS, such as Ceph and Lustre.

Expanse System Architecture

Do any of these filesystems solve the problem of speeding up the download of the CIFAR-10 dataset? Let's answer this question by starting an interactive session on a shared compute node using the following command alias. Once the scheduler has assigned you to a shared compute node, your interactive session will open.

Command:

srun-shared

Output:

[train108@login02 ~]$ srun-shared 
[train108@exp-3-21 ~]$
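The srun-shared alias is preconfigured for the training accounts. Under the hood it simply wraps an ordinary srun invocation; a plausible expansion looks like the following (a sketch only: the exact partition, account, resource, and time values here are assumptions, not the alias's actual definition).

# Request a single task on one shared node and open an interactive shell on it
srun --partition=shared --account=gue998 --nodes=1 --ntasks-per-node=1 \
    --cpus-per-task=1 --mem=2G --time=00:30:00 --pty --wait=0 /bin/bash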

Let's see if there are any local NVMe drives on the compute nodes.

Command:

df -Th | grep nvme

Output:

[train108@exp-3-21 ~]$ df -Th | grep nvme
/dev/nvme0n1p1 ext4 916G 792K 870G 1% /scratch
[train108@exp-3-21 ~]$

Let's go ahead and change your current working directory to the node-local /scratch disk and then download the CIFAR image repository.

Command:

cd /scratch/$USER/job_$SLURM_JOB_ID

Output:

[train108@exp-3-21 ~]$ cd /scratch/$USER/job_$SLURM_JOB_ID
[train108@exp-3-21 job_24468420]$
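The job_${SLURM_JOB_ID} directory on the node-local disk is created for you when your job starts. SLURM_JOB_ID is one of the environment variables Slurm sets inside every job session, so you can confirm it matches the directory name:

Command:

echo "${SLURM_JOB_ID}"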

Command:

time -p git clone https://github.com/YoongiKim/CIFAR-10-images.git

Output:

[train108@exp-9-56 job_24472115]$ time -p git clone https://github.com/YoongiKim/CIFAR-10-images.git
Cloning into 'CIFAR-10-images'...
remote: Enumerating objects: 60027, done.
remote: Total 60027 (delta 0), reused 0 (delta 0), pack-reused 60027
Receiving objects: 100% (60027/60027), 19.94 MiB | 29.17 MiB/s, done.
Resolving deltas: 100% (59990/59990), done.
Updating files: 100% (60001/60001), done.
real 2.53
user 0.67
sys 0.98
[train108@exp-9-56 job_24472115]$

Wow! That was fast! And that's what the node-local /scratch disk is good for: high I/O rates and fast metadata operations. Now that we have a working copy of the raw image dataset, let's package the repository into a compressed zip archive that will be easier to move to other filesystems on Expanse.

Command:

zip -r CIFAR-10-images.zip CIFAR-10-images

Output:

[train108@exp-9-56 job_24472115]$ zip -r CIFAR-10-images.zip CIFAR-10-images
...
adding: CIFAR-10-images/train/cat/2815.jpg (deflated 17%)
adding: CIFAR-10-images/train/cat/1910.jpg (deflated 18%)
adding: CIFAR-10-images/train/cat/2862.jpg (deflated 19%)
adding: CIFAR-10-images/train/cat/4829.jpg (deflated 18%)
adding: CIFAR-10-images/train/cat/3621.jpg (deflated 18%)
adding: CIFAR-10-images/train/cat/4036.jpg (deflated 18%)
adding: CIFAR-10-images/train/cat/4328.jpg (deflated 18%)
adding: CIFAR-10-images/train/cat/4564.jpg (deflated 22%)
[train108@exp-9-56 job_24472115]$
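Aside: the batch script at the end of this tutorial packages the same directory with tar instead of zip; the equivalent command, taken directly from that script, is:

Command:

tar -czf CIFAR-10-images.tar.gz CIFAR-10-images/

Either way, you end up with a single compressed archive, and moving one large file between filesystems is far cheaper than moving 60,000 tiny ones.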

Check the size of the zip archive.

Command:

ls -lh

Output:

[train108@exp-9-56 job_24472115]$ ls -lh
total 78M
drwxr-xr-x 5 train108 gue998 4.0K Aug 7 20:42 CIFAR-10-images
-rw-r--r-- 1 train108 gue998 78M Aug 7 20:49 CIFAR-10-images.zip
[train108@exp-9-56 job_24472115]$

What is the size of the original image repository? Use the du command to check disk usage.

Command:

du -h CIFAR-10-images

Output:

[train108@exp-9-56 job_24472115]$ du -h CIFAR-10-images
4.0M CIFAR-10-images/test/truck
4.0M CIFAR-10-images/test/frog
4.0M CIFAR-10-images/test/dog
4.0M CIFAR-10-images/test/horse
4.0M CIFAR-10-images/test/ship
4.0M CIFAR-10-images/test/bird
4.0M CIFAR-10-images/test/deer
4.0M CIFAR-10-images/test/automobile
4.0M CIFAR-10-images/test/airplane
4.0M CIFAR-10-images/test/cat
40M CIFAR-10-images/test
8.0K CIFAR-10-images/.git/logs/refs/heads
8.0K CIFAR-10-images/.git/logs/refs/remotes/origin
12K CIFAR-10-images/.git/logs/refs/remotes
24K CIFAR-10-images/.git/logs/refs
32K CIFAR-10-images/.git/logs
4.0K CIFAR-10-images/.git/objects/info
22M CIFAR-10-images/.git/objects/pack
22M CIFAR-10-images/.git/objects
64K CIFAR-10-images/.git/hooks
8.0K CIFAR-10-images/.git/info
4.0K CIFAR-10-images/.git/branches
8.0K CIFAR-10-images/.git/refs/heads
8.0K CIFAR-10-images/.git/refs/remotes/origin
12K CIFAR-10-images/.git/refs/remotes
4.0K CIFAR-10-images/.git/refs/tags
28K CIFAR-10-images/.git/refs
27M CIFAR-10-images/.git
20M CIFAR-10-images/train/truck
20M CIFAR-10-images/train/frog
20M CIFAR-10-images/train/dog
20M CIFAR-10-images/train/horse
20M CIFAR-10-images/train/ship
20M CIFAR-10-images/train/bird
20M CIFAR-10-images/train/deer
20M CIFAR-10-images/train/automobile
20M CIFAR-10-images/train/airplane
20M CIFAR-10-images/train/cat
197M CIFAR-10-images/train
263M CIFAR-10-images
[train108@exp-9-56 job_24472115]$
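If you only want the grand total rather than the per-directory breakdown, du's -s (summarize) flag reduces this to a single line.

Command:

du -sh CIFAR-10-images

Note that much of the 263M reported here is filesystem block overhead: each of the 60,000 tiny JPEGs occupies at least one 4K block on disk, which is a large part of why the same data fits in a 78M zip archive.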

Remove the original repository from the node local /scratch disk.

Command:

rm -rf CIFAR-10-images/

Output:

[train108@exp-9-56 job_24472115]$ rm -rf CIFAR-10-images
[train108@exp-9-56 job_24472115]$

Now extract only the test set's dog images from the archive.

Command:

unzip CIFAR-10-images.zip 'CIFAR-10-images/test/dog/*'

Output:

[train108@exp-9-56 job_24472115]$ unzip CIFAR-10-images.zip 'CIFAR-10-images/test/dog/*'
Archive: CIFAR-10-images.zip
creating: CIFAR-10-images/test/dog/
inflating: CIFAR-10-images/test/dog/0235.jpg
inflating: CIFAR-10-images/test/dog/0857.jpg
inflating: CIFAR-10-images/test/dog/0878.jpg
inflating: CIFAR-10-images/test/dog/0779.jpg
inflating: CIFAR-10-images/test/dog/0137.jpg
...
inflating: CIFAR-10-images/test/dog/0320.jpg
inflating: CIFAR-10-images/test/dog/0395.jpg
inflating: CIFAR-10-images/test/dog/0980.jpg
[train108@exp-9-56 job_24472115]$
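To preview an archive's contents without extracting anything, use unzip's -l flag; this is handy for working out the exact path pattern to pass for a selective extraction like the one above.

Command:

unzip -l CIFAR-10-images.zip | head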

Copy the zip archive back to your HOME (NFS) directory.

Command:

cp CIFAR-10-images.zip ~/

Output:

[train108@exp-9-56 job_24472115]$ cp CIFAR-10-images.zip ~/
[train108@exp-9-56 job_24472115]$

And then check to make sure you've got the copy in your HOME directory.

Command 1:

cd ~/

Command 2:

ls -lh

Output:

[train108@exp-9-56 job_24472115]$ cd ~/
[train108@exp-9-56 ~]$ ls -lh
total 373M
drwxr-xr-x 2 train108 gue998 10 Jun 4 2009 cifar-10-batches-py
-rw-r--r-- 1 train108 gue998 78M Aug 7 20:55 CIFAR-10-images.zip
-rw-r--r-- 1 train108 gue998 57 Aug 7 10:35 cifar-10-python.md5
-rw-r--r-- 1 train108 gue998 86 Aug 7 11:10 cifar-10-python.sha256
-rw-r--r-- 1 train108 gue998 163M Jun 4 2009 cifar-10-python.tar.gz
-rw-r--r-- 1 train108 gue998 163M Aug 7 11:05 cifar-10-python.tgz
lrwxrwxrwx 1 train108 gue998 32 Aug 6 14:45 data -> /cm/shared/examples/sdsc/si/2023
drwx------ 2 train108 gue998 2 Aug 7 15:42 Downloads
[train108@exp-9-56 ~]$
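For a stronger check than eyeballing the file size, compare checksums of the two copies before leaving the node; matching digests confirm the transfer was intact.

Command:

sha256sum "/scratch/${USER}/job_${SLURM_JOB_ID}/CIFAR-10-images.zip" ~/CIFAR-10-images.zip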

If so, then you can go ahead and exit your interactive job session.

Command:

exit

Output:

[train108@exp-9-56 ~]$ exit
exit
[train108@login01 ~]$

Finally, we're going to use some of the tools we've learned about to create a tarball of the CIFAR-10 raw image dataset as part of a batch job. Start by downloading the batch job script using wget.

Command:

wget https://raw.githubusercontent.com/sdsc/sdsc-summer-institute-2023/main/3.2_data_management/download-cifar-images.sh

Output:

[train108@login01 ~]$ wget https://raw.githubusercontent.com/sdsc/sdsc-summer-institute-2023/main/3.2_data_management/download-cifar-images.sh
--2023-08-07 21:11:26-- https://raw.githubusercontent.com/sdsc/sdsc-summer-institute-2023/main/3.2_data_management/download-cifar-images.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 780 [text/plain]
Saving to: ‘download-cifar-images.sh’

download-cifar-imag 100%[===================>] 780 --.-KB/s in 0s

2023-08-07 21:11:27 (74.6 MB/s) - ‘download-cifar-images.sh’ saved [780/780]

[train108@login01 ~]$

And then inspect the job script using cat.

Command:

cat download-cifar-images.sh 

Output:

[train108@login01 ~]$ cat download-cifar-images.sh 
#!/usr/bin/env bash

#SBATCH --job-name=download-cifar-images
#SBATCH --account=gue998
#SBATCH --reservation=hpcds23cpu
#SBATCH --partition=shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
#SBATCH --time=00:05:00
#SBATCH --output=%x.o%j.%N

declare -xr LUSTRE_PROJECTS_DIR="/expanse/lustre/projects/${SLURM_JOB_ACCOUNT}/${USER}"
declare -xr LUSTRE_SCRATCH_DIR="/expanse/lustre/scratch/${USER}/temp_project"

declare -xr LOCAL_SCRATCH_DIR="/scratch/${USER}/job_${SLURM_JOB_ID}"

module reset
module list
printenv

cd "${LOCAL_SCRATCH_DIR}"
git clone https://github.com/YoongiKim/CIFAR-10-images.git
tar -czf CIFAR-10-images.tar.gz CIFAR-10-images/
cp CIFAR-10-images.tar.gz "${HOME}"
cp CIFAR-10-images.tar.gz "${LUSTRE_SCRATCH_DIR}"
[train108@login01 ~]$

Note that the --output pattern %x.o%j.%N names the job's log file after the job name (%x), job ID (%j), and node (%N). Submit the job to the scheduler.

Command:

sbatch download-cifar-images.sh 

Output:

[train108@login01 ~]$ sbatch download-cifar-images.sh 
Submitted batch job 24472214
[train108@login01 ~]$

Check the status of the job.

Command:

squeue -u $USER

Output:

[train108@login01 ~]$ squeue -u $USER
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
24472214 shared download train108 PD 0:00 1 (Priority)
[train108@login01 ~]$
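Once the job no longer appears in the queue, you can confirm that it completed successfully by querying Slurm's job accounting records with sacct.

Command:

sacct -j 24472214 --format=JobID,JobName,State,Elapsed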

Once the job has finished, check that the new tarball is located in your HOME directory.

Command 1:

squeue -u $USER

Command 2:

ls -lh

Output:

[xdtr108@login01 ~]$ squeue -u $USER
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
[xdtr108@login01 ~]$ ls -lh
total 415M
drwxr-xr-x 2 xdtr108 uic157 10 Jun 4 2009 cifar-10-batches-py
drwxr-xr-x 3 xdtr108 uic157 3 Jul 26 09:17 CIFAR-10-images
-rw-r--r-- 1 xdtr108 uic157 42M Jul 26 09:41 CIFAR-10-images.tar.gz
-rw-r--r-- 1 xdtr108 uic157 78M Jul 26 09:15 CIFAR-10-images.zip
-rw-r--r-- 1 xdtr108 uic157 57 Jul 26 08:53 cifar-10-python.md5
-rw-r--r-- 1 xdtr108 uic157 86 Jul 26 08:55 cifar-10-python.sha256
-rw-r--r-- 1 xdtr108 uic157 163M Jun 4 2009 cifar-10-python.tar.gz
-rw-r--r-- 1 xdtr108 uic157 163M Jul 26 08:54 cifar-10-python.tgz
-rw-r--r-- 1 xdtr108 uic157 6.3K Jul 26 09:41 download-cifar-images.o14751956.exp-9-55
-rw-r--r-- 1 xdtr108 uic157 746 Jul 26 09:41 download-cifar-images.sh
