Attempting to implement convolution in CUDA following XNOR-net strategy

akhauriyash/XNOR-convolution

XNOR convolution

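An attempt at implementing convolution in CUDA following the XNOR-Net strategy: inputs and weights are binarized to {-1, +1} and packed into bits, so that the multiply-accumulate in each convolution window reduces to a bitwise XNOR followed by a population count.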
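
To make the idea concrete, here is a minimal host-side sketch of the XNOR/popcount dot product. It is not taken from xnorconv.cu: the names pack_signs and xnor_dot are hypothetical, vectors are assumed to hold at most 32 elements, and a value >= 0 is assumed to binarize to +1. It is plain C++, which nvcc also compiles. In a device kernel the count would typically use the CUDA intrinsic __popc instead of std::bitset.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack up to 32 values into one word (hypothetical helper).
// Bit i is 1 when v[i] >= 0 (treated as +1) and 0 otherwise (treated as -1).
std::uint32_t pack_signs(const std::vector<float>& v) {
    std::uint32_t word = 0;
    for (std::size_t i = 0; i < v.size() && i < 32; ++i) {
        if (v[i] >= 0.0f) word |= (1u << i);
    }
    return word;
}

// Dot product of two {-1,+1} vectors of length n (1 <= n <= 32) stored as bits.
// XNOR sets a bit wherever the signs agree, so
//   dot = agreements - disagreements = 2 * popcount(xnor) - n.
int xnor_dot(std::uint32_t a, std::uint32_t b, int n) {
    // Mask off the unused high bits, which XNOR would otherwise count as agreements.
    const std::uint32_t mask = (n >= 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
    const std::uint32_t agree = ~(a ^ b) & mask;
    const int matches = static_cast<int>(std::bitset<32>(agree).count());
    return 2 * matches - n;
}
```

For example, (+1, -1, +1, +1) and (+1, +1, -1, +1) agree in two of four positions, so xnor_dot returns 2 * 2 - 4 = 0, which matches the ordinary dot product 1 - 1 - 1 + 1.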

Prerequisites:

  • CUDA toolkit (provides nvcc)
  • CUDA-capable GPU

To run:

Navigate to the directory containing xnorconv.cu, then compile and run. Change -arch=sm_50 to match the compute capability of your GPU if needed.

nvcc -arch=sm_50 xnorconv.cu -std=c++11 && ./a.out

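To profile the application (nvprof is deprecated in recent CUDA toolkits and does not support GPUs of compute capability 8.0 or higher; on those, use Nsight Systems instead, e.g. nsys profile ./a.out):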

nvprof ./a.out

Note:

This is a work in progress and likely contains mistakes. I started learning CUDA a month ago, so please let me know if you find any logical errors in the code.

TO DO:

  • Add support for variable input sizes
  • Add support for 3D convolution
  • Parallelize per convolution
  • Add a function for general matrix multiplication (already written; message me for the code)
  • Maximize shared memory usage - balance channel parallelization
  • Create a full precision verification kernel
  • Add full support for custom kernel sizes
  • Build a parser to take in shape arguments
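
For the full precision verification item, one possible starting point is a plain CPU reference that the GPU output can be compared against. The sketch below is an assumption about how such a check might look, not code from this repository: conv2d_reference and all_close are hypothetical names, the input is a single-channel row-major H x W image, the kernel is K x K, and the convolution is computed in cross-correlation form with stride 1 and no padding.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Full-precision reference (hypothetical helper): "valid" 2D convolution in
// cross-correlation form, stride 1, no padding.
// in: H x W row-major, k: K x K row-major, result: (H-K+1) x (W-K+1) row-major.
std::vector<float> conv2d_reference(const std::vector<float>& in, int H, int W,
                                    const std::vector<float>& k, int K) {
    const int OH = H - K + 1;
    const int OW = W - K + 1;
    std::vector<float> out(static_cast<std::size_t>(OH) * OW, 0.0f);
    for (int y = 0; y < OH; ++y) {
        for (int x = 0; x < OW; ++x) {
            float acc = 0.0f;
            for (int i = 0; i < K; ++i) {
                for (int j = 0; j < K; ++j) {
                    acc += in[(y + i) * W + (x + j)] * k[i * K + j];
                }
            }
            out[y * OW + x] = acc;
        }
    }
    return out;
}

// True when both buffers have the same length and every pair of elements
// differs by at most tol.
bool all_close(const std::vector<float>& a, const std::vector<float>& b,
               float tol = 1e-4f) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::fabs(a[i] - b[i]) > tol) return false;
    }
    return true;
}
```

If the test input and kernel contain only -1 and +1, the XNOR result copied back from the device should match this reference exactly. With real-valued data the binarized result is only an approximation, so an exact match should not be expected there.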

Related/Relevant resources:

Paper on XNOR-Nets

Blog post 1

Blog post 2

Blog post 3

BinaryNet

XNOR-Net - AllenAI
