
Toeplitz KernelG #484

Merged

merged 27 commits, Dec 23, 2021

Conversation

Contributor

@neworderofjamie neworderofjamie commented Nov 18, 2021

Building on #478, this adds the TOEPLITZ_KERNELG matrix type which implements convolution-adjacent types of connectivity using a doubly-blocked Toeplitz approach. Essentially this PR involves:

  • A new CodeGenerator::InitToeplitzConnectivitySnippet::Base snippet type containing the required code strings, state variables etc., plus a CodeGenerator::InitToeplitzConnectivitySnippet::Init wrapper type for bundling snippets with their required parameters. I don't expect users to create many of these, but it's a much nicer way of implementing the connectivity than adding lots of special-case logic to SynapseGroup.
  • Additional overloads of ModelSpec::addSynapsePopulation which take a CodeGenerator::InitToeplitzConnectivitySnippet::Init object as the last parameter.
  • In the code generator (where it won't break backward compatibility), ConnectivityInitialiser has generally been renamed to SparseConnectivityInitialiser to clearly separate it from the new ToeplitzConnectivityInitialiser.
  • A new CodeGenerator::PresynapticUpdateStrategySIMT::PostSpanToeplitz with a first-pass implementation (no separation of input channels).
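As a sketch of the underlying idea (illustrative Python, not GeNN code — all function names below are hypothetical): a 2-D convolution can be written as multiplication of the flattened input by a doubly-blocked Toeplitz matrix, which is the structure this connectivity type generates on demand rather than storing.

```python
# Sketch: 2-D "full" convolution expressed as multiplication by a
# doubly-blocked Toeplitz matrix. Illustrative code, not GeNN's implementation.

def toeplitz_matrix(kernel, in_h, in_w):
    """Build T such that vec(full_conv(image, kernel)) == T @ vec(image)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = in_h + k_h - 1, in_w + k_w - 1
    T = [[0.0] * (in_h * in_w) for _ in range(out_h * out_w)]
    for i in range(out_h):
        for j in range(out_w):
            for a in range(in_h):
                for b in range(in_w):
                    # Block (i, a) depends only on i - a, and entries within
                    # a block only on j - b: Toeplitz at both levels, so the
                    # whole matrix is determined by the kernel alone.
                    if 0 <= i - a < k_h and 0 <= j - b < k_w:
                        T[i * out_w + j][a * in_w + b] = kernel[i - a][j - b]
    return T

def matvec(T, v):
    return [sum(t, 0.0) if False else sum(t_ij * x for t_ij, x in zip(row, v))
            for row in T for t in [row]]

def full_conv(image, kernel):
    """Direct 2-D full convolution, for reference."""
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out = [[0.0] * (in_w + k_w - 1) for _ in range(in_h + k_h - 1)]
    for a in range(in_h):
        for b in range(in_w):
            for u in range(k_h):
                for v in range(k_w):
                    out[a + u][b + v] += image[a][b] * kernel[u][v]
    return out

image = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0, 0.0], [0.0, -1.0]]
T = toeplitz_matrix(kernel, 2, 2)
flat = matvec(T, [x for row in image for x in row])
direct = [x for row in full_conv(image, kernel) for x in row]
assert flat == direct
```

Because every entry of the (potentially huge) connectivity matrix is a function of the small kernel, the matrix never needs to be materialised — which is exactly what makes a kernel-only (KERNELG) representation possible.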

At the optimal batch size, performance is approximately 3× faster than PROCEDURAL_KERNELG on VGG16 models (at both CIFAR-10 and ImageNet scale). However, at ImageNet scale this still equates to 7 hours rather than 24 hours to perform inference over the whole validation set, so it remains pretty slow. Because only kernel width × kernel height × kernel output channels threads are launched for each synapse group, this currently performs worse than PROCEDURAL_KERNELG at batch size 1; splitting spikes to obtain a factor of kernel input channels more threads is likely to solve this.
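To make the thread-count limitation concrete, here is an illustrative calculation assuming a VGG-style 3×3 layer with 3 input channels and 64 output channels (these dimensions are assumed for illustration, not taken from the PR):

```python
# Illustration with assumed layer dimensions (not from the PR):
# a 3x3 convolution, 3 input channels, 64 output channels.
kernel_h, kernel_w = 3, 3
in_channels, out_channels = 3, 64

# Threads currently launched per synapse group
# (kernel width * kernel height * kernel output channels):
current_threads = kernel_h * kernel_w * out_channels   # 576

# With the proposed spike-splitting, a factor of
# kernel input channels more threads:
proposed_threads = current_threads * in_channels       # 1728

print(current_threads, proposed_threads)
```

A few hundred threads is far too little parallelism to occupy a modern GPU, which is consistent with the poor batch-size-1 performance described above.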

[Figure_1: attached performance plot]

TODO

  • Unit tests
  • CPU implementation (unlike procedural, I think this should have one)
  • Add support for striding
  • Profiling
  • Documentation


codecov bot commented Nov 18, 2021

Codecov Report

Merging #484 (0b393c3) into master (dcfd43f) will decrease coverage by 0.55%.
The diff coverage is 77.13%.


@@            Coverage Diff             @@
##           master     #484      +/-   ##
==========================================
- Coverage   87.49%   86.94%   -0.56%     
==========================================
  Files          82       84       +2     
  Lines       17695    17972     +277     
==========================================
+ Hits        15483    15625     +142     
- Misses       2212     2347     +135     
Impacted Files Coverage Δ
include/genn/genn/code_generator/groupMerged.h 84.29% <ø> (ø)
...enn/genn/code_generator/synapseUpdateGroupMerged.h 100.00% <ø> (ø)
include/genn/genn/initSparseConnectivitySnippet.h 84.21% <ø> (ø)
include/genn/genn/synapseGroupInternal.h 75.00% <ø> (ø)
...enn/genn/code_generator/neuronUpdateGroupMerged.cc 95.28% <0.00%> (-0.14%) ⬇️
...nclude/genn/genn/initToeplitzConnectivitySnippet.h 50.00% <50.00%> (ø)
include/genn/genn/synapseMatrixType.h 75.00% <50.00%> (ø)
src/genn/backends/single_threaded_cpu/backend.cc 56.18% <56.21%> (-0.86%) ⬇️
...nn/code_generator/presynapticUpdateStrategySIMT.cc 87.77% <81.87%> (-1.57%) ⬇️
src/genn/genn/synapseGroup.cc 85.38% <81.91%> (-0.45%) ⬇️
... and 15 more


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@neworderofjamie neworderofjamie marked this pull request as ready for review November 22, 2021 10:25
Member

@tnowotny tnowotny left a comment


After thinking things through and a helpful discussion, I believe I understand the scheme and that the central calculations for the "on-demand" synaptic connections are right, including a cursory inspection of corner cases for different padding, which seem consistent with TF conventions. I think this can be merged.
