Faster classification of the whole test set

The model we developed in the previous tutorial classified MNIST successfully but was rather slow. As with ANNs, to maximise performance when simulating small SNNs like this on a GPU, we need to simulate multiple copies of the model at once and run them on batches of input images. In this tutorial we will modify our model to do just that, as well as offloading further computation to the GPU to improve performance.

Install PyGeNN wheel from Google Drive

Download wheel file

[ ]:
if "google.colab" in str(get_ipython()):
    !pip install gdown --upgrade
    !gdown 1V_GzXUDzcFz9QDIpxAD8QNEglcSipssW
    !pip install pygenn-5.0.0-cp310-cp310-linux_x86_64.whl
    %env CUDA_PATH=/usr/local/cuda
Requirement already satisfied: gdown in /usr/local/lib/python3.10/dist-packages (5.1.0)
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from gdown) (4.12.3)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from gdown) (3.13.1)
Requirement already satisfied: requests[socks] in /usr/local/lib/python3.10/dist-packages (from gdown) (2.31.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from gdown) (4.66.2)
Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->gdown) (2.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (2024.2.2)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (1.7.1)
Downloading...
From: https://drive.google.com/uc?id=1V_GzXUDzcFz9QDIpxAD8QNEglcSipssW
To: /content/pygenn-5.0.0-cp310-cp310-linux_x86_64.whl
100% 8.29M/8.29M [00:00<00:00, 182MB/s]
Processing ./pygenn-5.0.0-cp310-cp310-linux_x86_64.whl
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from pygenn==5.0.0) (1.25.2)
Requirement already satisfied: deprecated in /usr/local/lib/python3.10/dist-packages (from pygenn==5.0.0) (1.2.14)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from pygenn==5.0.0) (5.9.5)
Requirement already satisfied: wrapt<2,>=1.10 in /usr/local/lib/python3.10/dist-packages (from deprecated->pygenn==5.0.0) (1.14.1)
pygenn is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel.
env: CUDA_PATH=/usr/local/cuda

Download pre-trained weights and MNIST test data

[ ]:
!gdown 1cmNL8W0QZZtn3dPHiOQnVjGAYTk6Rhpc
!gdown 131lCXLEH6aTXnBZ9Nh4eJLSy5DQ6LKSF
Downloading...
From: https://drive.google.com/uc?id=1cmNL8W0QZZtn3dPHiOQnVjGAYTk6Rhpc
To: /content/weights_0_1.npy
100% 402k/402k [00:00<00:00, 50.3MB/s]
Downloading...
From: https://drive.google.com/uc?id=131lCXLEH6aTXnBZ9Nh4eJLSy5DQ6LKSF
To: /content/weights_1_2.npy
100% 5.25k/5.25k [00:00<00:00, 23.2MB/s]

Install MNIST package

[ ]:
!pip install mnist
Requirement already satisfied: mnist in /usr/local/lib/python3.10/dist-packages (0.2.2)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from mnist) (1.25.2)

Build model

Import standard modules and PyGeNN functionality as before and configure simulation parameters

[ ]:
import mnist
import numpy as np
import matplotlib.pyplot as plt
from pygenn import (create_neuron_model, create_current_source_model, create_custom_update_model,
                    create_var_ref, init_postsynaptic, init_weight_update, GeNNModel)
from time import perf_counter
from tqdm.auto import tqdm

TIMESTEP = 1.0
PRESENT_TIMESTEPS = 100
INPUT_CURRENT_SCALE = 1.0 / 100.0

As we’re going to use it in a few places, we add an additional simulation parameter to define the batch size.

[ ]:
BATCH_SIZE = 128

Define the custom neuron and synapse models in exactly the same way as before

[ ]:
# Very simple integrate-and-fire neuron model
if_model = create_neuron_model(
    "if_model",
    params=["Vthr"],
    vars=[("V", "scalar"), ("SpikeCount", "unsigned int")],
    sim_code="V += Isyn * dt;",
    reset_code="""
    V = 0.0;
    SpikeCount++;
    """,
    threshold_condition_code="V >= Vthr")

cs_model = create_current_source_model(
    "cs_model",
    vars=[("magnitude", "scalar")],
    injection_code="injectCurrent(magnitude);")

As we increase the batch size of our model, the cost of resetting the spike counts and membrane voltages will also increase. To counteract this, we can offload tasks like this to the GPU using a custom update model. These are defined using very similar syntax to neuron and synapse models but have one additional feature: variable references. These allow custom updates to be attached to existing neuron or synapse populations so that they can modify those populations' variables outside of the standard neuron and synapse updates. All custom updates added to the same named group ("Reset" in this tutorial) can then be triggered together with a single model.custom_update("Reset") call from the simulation loop.

[ ]:
reset_model = create_custom_update_model(
    "reset",
    var_refs=[("V", "scalar"), ("SpikeCount", "unsigned int")],
    update_code="""
    V = 0.0;
    SpikeCount = 0;
    """)

Create a new model in exactly the same way as before

[ ]:
model = GeNNModel("float", "tutorial_3")
model.dt = TIMESTEP

Set the model batch size (this needs to be done before the model is built, as batching is implemented during code generation)

[ ]:
model.batch_size = BATCH_SIZE

Build model, load weights and create neuron, synapse and current source populations as before

[ ]:
# Load weights
weights_0_1 = np.load("weights_0_1.npy")
weights_1_2 = np.load("weights_1_2.npy")

if_params = {"Vthr": 5.0}
if_init = {"V": 0.0, "SpikeCount":0}
neurons = [model.add_neuron_population("neuron0", weights_0_1.shape[0],
                                       if_model, if_params, if_init),
           model.add_neuron_population("neuron1", weights_0_1.shape[1],
                                       if_model, if_params, if_init),
           model.add_neuron_population("neuron2", weights_1_2.shape[1],
                                       if_model, if_params, if_init)]
model.add_synapse_population(
        "synapse_0_1", "DENSE",
        neurons[0], neurons[1],
        init_weight_update("StaticPulse", {}, {"g": weights_0_1.flatten()}),
        init_postsynaptic("DeltaCurr"))
model.add_synapse_population(
        "synapse_1_2", "DENSE",
        neurons[1], neurons[2],
        init_weight_update("StaticPulse", {}, {"g": weights_1_2.flatten()}),
        init_postsynaptic("DeltaCurr"));

current_input = model.add_current_source("current_input", cs_model,
                                         neurons[0], {}, {"magnitude": 0.0})
[ ]:
for n in neurons:
    reset_var_refs = {"V": create_var_ref(n, "V"),
                      "SpikeCount": create_var_ref(n, "SpikeCount")}
    model.add_custom_update(f"{n.name}_reset", "Reset", reset_model,
                            {}, {}, reset_var_refs)
[ ]:
# Build and load our model
model.build()
model.load()

testing_images = mnist.test_images()
testing_labels = mnist.test_labels()

testing_images = np.reshape(testing_images, (testing_images.shape[0], -1))
assert testing_images.shape[1] == weights_0_1.shape[0]
assert np.max(testing_labels) == (weights_1_2.shape[1] - 1)
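
Before simulating, it can be helpful to eyeball the data we have just loaded. A minimal sketch (assuming the standard 28 × 28 MNIST images) that plots the first test image with its label, using the matplotlib import from earlier:

[ ]:
# Reshape the first flattened test image back to 28 x 28 and display it
# alongside its label (purely a sanity check, not required for the tutorial)
fig, axis = plt.subplots()
axis.set_title(f"Label: {testing_labels[0]}")
axis.imshow(testing_images[0].reshape((28, 28)), cmap="gray")
plt.show()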

First we determine the indices at which to split our test data into batches of BATCH_SIZE and then use np.split to perform the splitting (the last batch will contain fewer than BATCH_SIZE stimuli as 128 does not divide 10000 evenly)

[ ]:
batch_splits = range(BATCH_SIZE, testing_images.shape[0] + 1, BATCH_SIZE)

testing_image_batches = np.split(testing_images, batch_splits, axis=0)
testing_label_batches = np.split(testing_labels, batch_splits, axis=0)
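
As a quick sanity check: 78 × 128 = 9984, so we expect 79 batches with only 16 stimuli in the last one (the shapes in the comments assume the standard 10000-image, 784-pixel MNIST test set):

[ ]:
# Splitting 10000 images every 128 gives 79 batches, the last holding
# 10000 - (78 * 128) = 16 stimuli
print(len(testing_image_batches))        # 79
print(testing_image_batches[0].shape)    # (128, 784)
print(testing_image_batches[-1].shape)   # (16, 784)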

Simulate model

Our batched simulation loop looks very similar to the loop we defined in the previous tutorial, however:

* We now loop over batches of images and labels rather than over individual ones
* When we copy images into the input current view, we only copy as many images as are present in the batch, so the smaller final batch is handled correctly
* We specify an axis for np.argmax so that we get the neuron with the largest spike count for each stimulus in the batch (the shape of these batched views is illustrated in the sketch below)
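
With batching enabled, per-neuron variable views gain a leading batch dimension, which is why the code below slices the input current view with two indices and passes axis=1 to np.argmax. A minimal sketch of inspecting these shapes (the sizes in the comments assume the 784-pixel input and 10-class output used here):

[ ]:
# Views of a batched model's variables are 2D: (batch size, number of neurons)
print(current_input.vars["magnitude"].view.shape)    # (128, 784)
print(neurons[-1].vars["SpikeCount"].view.shape)     # (128, 10)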

[ ]:
current_input_magnitude = current_input.vars["magnitude"]
output_spike_count = neurons[-1].vars["SpikeCount"]
neuron_voltages = [n.vars["V"] for n in neurons]

# Simulate
num_correct = 0
start_time = perf_counter()
for img, lab in tqdm(zip(testing_image_batches, testing_label_batches),
                     total=len(testing_image_batches)):
    # Copy this batch of images into the input current view
    # (the final batch is smaller than BATCH_SIZE) and upload to the GPU
    current_input_magnitude.view[:img.shape[0],:] = img * INPUT_CURRENT_SCALE
    current_input_magnitude.push_to_device()

    # Run reset custom update
    model.custom_update("Reset")

    for t in range(PRESENT_TIMESTEPS):
        model.step_time()

    # Download spike count from last layer
    output_spike_count.pull_from_device()

    # For each stimulus in the batch, the prediction is the output neuron that spiked most
    predicted_lab = np.argmax(output_spike_count.view, axis=1)

    # Add the number of correct predictions in this batch
    # (ignoring the unused rows of the final, smaller batch)
    num_correct += np.sum(predicted_lab[:lab.shape[0]] == lab)

end_time = perf_counter()
print(f"\nAccuracy {((num_correct / float(testing_images.shape[0])) * 100.0)}%%")
print(f"Time {end_time - start_time} seconds")


Accuracy 97.54%
Time 0.34431284400000095 seconds

And… we get a speed-up of over 30x compared to the previous tutorial.