Source Information¶


Author:

Last Updated: October 01, 2024, by Gloria Seo

Resources: https://medium.com/analytics-vidhya/sequence-model-time-series-prediction-using-tensorflow-2-0-665257beb25f


Goal¶

This Jupyter notebook demonstrates how to build an image classification model using TensorFlow and Keras. The goal is to classify images of flowers into five categories: daisies, dandelions, roses, sunflowers, and tulips.

Required Modules for the Jupyter Notebook¶

Before running the notebook, make sure the following Python modules are available.

Modules: tensorflow, numpy, PIL (Pillow), matplotlib, pathlib, plus keras, layers, and Sequential from the tensorflow package

In [2]:
pip install tensorflow
Defaulting to user installation because normal site-packages is not writeable
Collecting tensorflow
  Downloading tensorflow-2.13.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (479.6 MB)
Collecting absl-py>=1.0.0
Collecting gast<=0.4.0,>=0.2.1
Collecting numpy<=1.24.3,>=1.22
Collecting keras<2.14,>=2.13.1
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3
Collecting tensorboard<2.14,>=2.13
...
Installing collected packages: absl-py, gast, numpy, keras, google-pasta, grpcio, tensorflow-io-gcs-filesystem, libclang, tensorflow-estimator, protobuf, opt-einsum, flatbuffers, astunparse, termcolor, pyasn1, rsa, pyasn1-modules, cachetools, google-auth, tensorboard-data-server, zipp, importlib-metadata, markdown, oauthlib, requests-oauthlib, google-auth-oauthlib, tensorboard, tensorflow
  Attempting uninstall: numpy
    Found existing installation: numpy 1.24.4
    Uninstalling numpy-1.24.4:
      Successfully uninstalled numpy-1.24.4
ERROR: Could not install packages due to an EnvironmentError: [Errno 16] Device or resource busy: '.nfs000000000cd4463b0000001a'

Note: you may need to restart the kernel to use updated packages.
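
The EnvironmentError above ([Errno 16] on an .nfs lock file) can happen on shared network filesystems when a file being replaced is still in use; restarting the kernel and re-running the install cell usually clears it. Before continuing, a quick sanity check like this minimal sketch confirms that TensorFlow is actually importable:

In [ ]:
# Verify the installation: import TensorFlow and report its version.
try:
    import tensorflow as tf
    print("TensorFlow version:", tf.__version__)
except ModuleNotFoundError:
    print("TensorFlow is missing; re-run the install cell and restart the kernel.")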
In [3]:
import matplotlib.pyplot as plt
import numpy as np
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
import pathlib
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-a97e297270c7> in <module>
      2 import numpy as np
      3 import PIL
----> 4 import tensorflow as tf
      5 from tensorflow import keras
      6 from tensorflow.keras import layers

ModuleNotFoundError: No module named 'tensorflow'

Loading the Data Set¶

The dataset is a collection of flower photos that is freely available online; it contains 3,670 images across the five flower categories. The notebook uses a TensorFlow utility function to download and extract the images.

In [ ]:
# Image Classification Example for Comet, Stratus, Expanse

dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)
data_dir
In [ ]:
image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)

The printed count confirms that the dataset contains 3,670 images in total.
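
To see how those 3,670 images are split across the five classes, a short sketch (assuming the directory layout created by get_file above, with one subdirectory per class) counts the JPEGs in each subfolder:

In [ ]:
# Count images per class: each subdirectory of data_dir holds one flower class.
for class_dir in sorted(data_dir.iterdir()):
    if class_dir.is_dir():
        print(f"{class_dir.name}: {len(list(class_dir.glob('*.jpg')))} images")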

In [ ]:
roses = list(data_dir.glob('roses/*'))
PIL.Image.open(str(roses[0]))
In [ ]:
batch_size = 32
img_height = 180
img_width = 180

# 80/20 train/validation split; the matching seed keeps the two subsets disjoint
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
In [ ]:
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
In [ ]:
class_names = train_ds.class_names
print(class_names)
In [ ]:
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")
In [ ]:
for image_batch, labels_batch in train_ds:
  print(image_batch.shape)
  print(labels_batch.shape)
  break
In [ ]:
num_classes = 5

model = Sequential([
  # Normalize pixel values from [0, 255] to [0, 1]
  layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  # Three convolution + max-pooling blocks with increasing filter counts
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  # Flatten the feature maps and classify; the final layer outputs raw logits
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

In Keras, you must compile the model before training: model.compile() attaches the optimizer, the loss function, and the metrics that model.fit() will track. Because the labels are integer class indices and the final Dense layer outputs raw logits, we use SparseCategoricalCrossentropy with from_logits=True.

In [ ]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()

Finally, we actually get to train something!¶

In [ ]:
epochs=10
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

Well, we've trained... but now what? Did I do good?!¶

The answer is no. Run the cell below and look at the graphs: the validation loss increases while the training loss continues to decrease. This means the model is memorizing the training set and won't perform well on "real world" data (data it has never seen before), a problem known as overfitting. We're going to fix that by randomly flipping, rotating, and zooming the images in our dataset to artificially increase the variety of pictures we have.

In [ ]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
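
If you prefer a numeric check over eyeballing the curves, a minimal sketch like this (assuming the history object returned by model.fit above) prints the gap between training and validation accuracy for each epoch; a steadily widening gap is the signature of overfitting.

In [ ]:
# Print the train/validation accuracy gap per epoch.
# A gap that keeps widening means the model is memorizing the training set.
for epoch, (train_acc, valid_acc) in enumerate(
        zip(history.history['accuracy'], history.history['val_accuracy']), start=1):
    print(f"epoch {epoch:2d}: train={train_acc:.3f}  val={valid_acc:.3f}  gap={train_acc - valid_acc:+.3f}")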
In [ ]:
data_augmentation = keras.Sequential(
  [
    layers.experimental.preprocessing.RandomFlip("horizontal", 
                                                 input_shape=(img_height, 
                                                              img_width,
                                                              3)),
    layers.experimental.preprocessing.RandomRotation(0.1),
    layers.experimental.preprocessing.RandomZoom(0.1),
  ]
)
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8"))
    plt.axis("off")

We're also going to introduce a dropout layer, which randomly sets a fraction of the layer's activations to zero on each training step (here 20%). This prevents individual units from co-adapting, i.e., relying too heavily on particular other units, so the network is forced to learn redundant, more robust features. Even if a useful activation is occasionally dropped, the net effect is less overfitting and a better model. At inference time dropout is disabled, so predictions use the full network.
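
To see the mechanics concretely, here is a tiny standalone illustration (not part of the classifier) showing that a Dropout layer zeroes a random subset of activations only in training mode and rescales the survivors by 1/(1 - rate), so the expected activation is unchanged; at inference it passes inputs through untouched.

In [ ]:
# Illustration only: Dropout with rate 0.5 on a vector of ones.
demo_dropout = layers.Dropout(0.5)
x = tf.ones((1, 10))
print(demo_dropout(x, training=True).numpy())   # ~half the entries are 0.0, the rest are 2.0
print(demo_dropout(x, training=False).numpy())  # all ones: dropout is inactive at inference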

In [ ]:
model = Sequential([
  data_augmentation,
  layers.experimental.preprocessing.Rescaling(1./255),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
In [ ]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.summary()
In [ ]:
epochs = 15
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

After running the cell below, notice that the validation loss now decreases along with the training loss instead of diverging from it.

In [ ]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
In [ ]:
sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)

img = keras.preprocessing.image.load_img(
    sunflower_path, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch

predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
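
Because tf.nn.softmax turns the model's raw logits into a probability distribution over the five classes, it can be informative to print the whole distribution rather than just the top class. A short sketch, reusing the score tensor computed above:

In [ ]:
# Print the predicted probability for every class, highest first.
for name, prob in sorted(zip(class_names, score.numpy()), key=lambda pair: -pair[1]):
    print(f"{name:12s} {100 * prob:6.2f}%")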

Submit Ticket¶

If you find anything that needs to be changed or edited, or if you would like to provide feedback or contribute to the notebook, please submit a ticket by contacting us at:

Email: consult@sdsc.edu

We appreciate your input and will review your suggestions promptly!
