deepL is a custom deep learning framework designed for efficient graph optimization and reverse-mode autodifferentiation. The framework provides foundational tools for building and training neural networks with both Python and C++ bindings, enabling flexibility and performance for machine learning tasks.
# Clone the repository
git clone https://github.com/ashwin1596/deepL.git
cd deepL
# Setup environment
conda env create -f environment.yml
conda activate deepl
# Build the library
bash scripts/build.sh
# Set Python path
export PYTHONPATH="${PWD}/build:${PYTHONPATH}"
# Run example
python examples/python/mnist_classifier.py
- Python 3.8 or higher
- CUDA Toolkit 11.0 or higher
- CMake 3.15 or higher
- C++17 compatible compiler
- GPU with compute capability 6.0 or higher (for faster computation); if no GPU is available, CPU mode can be used
- Required Python packages:
- NumPy
- PyBind11
- PyTorch (for examples)
- torchvision (for MNIST example)
- Deep Learning Framework with Reverse-Mode Automatic Differentiation (AutoGrad)
- High-Performance Tensor Operations on both CPU and GPU
- Dynamic Computational Graph
- Python and C++ APIs
- CUDA-accelerated computations
- Built-in Components:
- Optimizers: SGD
- Layers: Sequential, Linear, ReLU
- Loss Functions: Cross Entropy Loss
- Automatic Differentiation Engine
- Clone the Repository:
git clone https://github.com/ashwin1596/deepL.git
cd deepL
- Create Conda Environment:
conda env create -f environment.yml
conda activate deepl
- Build the Library:
bash scripts/build.sh
- Set Python Path:
export PYTHONPATH="${PWD}/build:${PYTHONPATH}"
- CUDA Not Found: Verify the CUDA installation and ensure `CUDA_HOME` is set correctly
- Build Fails: Check compiler compatibility and the CMake version
- Import Errors: Verify `PYTHONPATH` is set correctly
- Missing Dependencies: Ensure all required packages are installed via conda/pip
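If you hit one of these issues, a quick environment check like the one below can help narrow things down. This is a minimal sketch: it assumes the bindings are importable as `deepl`, as in the Python examples in this document.
import os

print("CUDA_HOME :", os.environ.get("CUDA_HOME", "<not set>"))
print("PYTHONPATH:", os.environ.get("PYTHONPATH", "<not set>"))

try:
    import deepl  # fails if the build directory is not on PYTHONPATH
    print("deepl imported from:", deepl.__file__)
except ImportError as err:
    print("Import failed:", err)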
import deepl as dl
import numpy as np
# Create input and target data
input_data = np.random.randn(10, 2).astype(np.float32)
target_data = np.random.randn(10, 1).astype(np.float32)
# Create input and target nodes
input_node = dl.Tensor(input_data)
target_node = dl.Tensor(target_data)
# Create model
builder = dl.GraphBuilder()
model = dl.Sequential(builder)
model.add_layer(dl.Linear(2, 4, builder))
model.add_layer(dl.ReLU(builder))
model.add_layer(dl.Linear(4, 1, builder))
# Create loss function
loss_fn = dl.CrossEntropyLoss(builder)
# Create optimizer
parameters = model.parameters()
optimizer = dl.SGD(parameters, learning_rate=0.01)
# Training loop
for epoch in range(10):
    optimizer.zero_grad()
    # Forward pass
    outputs = model.forward(input_node)
    loss = loss_fn.forward(outputs, target_node)
    # Backward pass
    builder.backward(loss)
    optimizer.step()
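The execution device can be selected through the `dl.Config` singleton before building the graph. The calls below are the ones used in the MNIST example later in this document; the 'CPU' value is an assumption based on the CPU mode mentioned in the requirements.
config = dl.Config.get_instance()
config.set_device_type('GPU')   # 'CPU' is assumed to select CPU mode
config.set_cuda_devices('0')    # index of the GPU to use
config.set_batch_size(32)
config.set_num_epochs(50)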
#include <deepl/core/tensor.cuh>
#include <deepl/layers/sequential.h>
#include <deepl/builder/graph_builder.h>
int main() {
    // Create model
    auto builder = GraphBuilder();
    auto model = Sequential(builder);
    model.add_layer(Linear(784, 128, builder));
    model.add_layer(ReLU(builder));
    model.add_layer(Linear(128, 10, builder));
}
The project is organized into several components:
- bindings/python: Python bindings for easy integration with Python-based workflows.
- docs: Documentation for the library and examples.
- examples: Python and C++ based examples demonstrating usage of the library.
- include/deepl: Header files defining core components, layers, loss functions, optimizers, and utilities.
- src: Source files implementing the core functionality.
- scripts: Scripts for building and maintaining the project.
Navigate to the `scripts` directory and run `build.sh`:
bash build.sh
For Python, set `PYTHONPATH` to the build directory:
export PYTHONPATH=$(pwd)/build:$PYTHONPATH
To use the library from Python, import it as `import deeplearning` (see `mnist_classifier.py`).
To use the library from C++, link the necessary libraries as shown below:
nvcc examples/cpp/mnist_classifier.cpp -Iinclude/deepl -Lbuild/ -ldeeplearning_cpp -lcudart -lcublas -o example/cpp
- Tensors are used as storage objects.
- Tensors can store data on both CPU and GPU devices, depending on the selected device.
- Support for low-level operations (see the sketch after this list):
  - `reshape`: Change the shape of the tensor without altering its data.
  - `add`: Perform element-wise addition of two tensors.
  - `elementwise_multiply`: Perform element-wise multiplication of two tensors.
  - `matmul`: Perform matrix multiplication between tensors.
  - `transpose`: Transpose the dimensions of a tensor.
  - `divide`: Perform element-wise division of two tensors.
  - `exp`: Compute the exponential of each element in the tensor.
  - `binaralize`: Convert tensor elements to binary values (e.g., thresholding).
  - `neg`: Negate the elements of a tensor.
  - `log`: Compute the natural logarithm of each element in the tensor.
- Support for advanced operations:
  - `sumAlongAxis`: Sum tensor elements along a specified axis.
  - `softmax`: Apply the softmax function to normalize tensor values.
  - `batchMatmul`: Perform batch matrix multiplication for tensors with batch dimensions.
- All operations are supported on both GPU and CPU devices.
- Operations such as matrix multiplication (`matmul`), addition, subtraction, and negation leverage cuBLAS for GPU-accelerated computation, enhancing the framework's speed and efficiency by utilizing the GPU architecture effectively.
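The sketch below illustrates how a few of these operations might be called from Python. Only `Tensor` construction, `transpose`, and `get_data` appear in the examples in this document; the other method names are assumptions that mirror the operation list above, so check the bindings for the exact signatures.
import numpy as np
import deepl as dl

a = dl.Tensor(np.random.randn(2, 3).astype(np.float32), False)
b = dl.Tensor(np.random.randn(2, 3).astype(np.float32), False)

# Shown in the examples in this document: transpose() and get_data()
a_t = a.transpose()               # swap the two dimensions
print(a.get_data())               # raw tensor contents

# Assumed method names, mirroring the operation list above
c = a.add(b)                      # element-wise addition
d = a.elementwise_multiply(b)     # element-wise multiplication
e = a.matmul(b.transpose())       # (2, 3) x (3, 2) -> (2, 2)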
- Consists of graph nodes for storing data during the forward pass and adjoints during the backward pass.
- Provides wrappers for all tensor operations.
- Adjoint nodes store adjoint values and dependent nodes along with their partial derivatives.
- During the backward pass, each adjoint node processes the gradient in topological order, propagating the gradients backward.
DeepL employs a computation graph to represent neural network operations. The process includes:
- Graph Construction: Nodes represent operations, and edges define dependencies.
- Forward Pass: Compute outputs layer-by-layer.
- Backward Pass: Compute gradients using reverse-mode autodifferentiation.
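To make the backward pass concrete, here is a small, framework-independent sketch of reverse-mode accumulation over a topologically ordered graph. It only illustrates the idea described above and is not DeepL's internal implementation.
# Conceptual illustration only -- not DeepL's internal code.
# Each node keeps its parents together with the local partial
# derivatives recorded during the forward pass; backward() walks
# the nodes in reverse topological order and accumulates adjoints.
class Node:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # list of (parent_node, local_partial)
        self.adjoint = 0.0

def backward(output, topo_order):
    output.adjoint = 1.0         # d(output)/d(output)
    for node in reversed(topo_order):
        for parent, local_partial in node.parents:
            parent.adjoint += node.adjoint * local_partial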
Here is a simple example of building and training a neural network using DeepL:
import deepl as dl
import numpy as np
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
transform = transforms.Compose([
    transforms.ToTensor()
])
config = dl.Config.get_instance()
config.set_device_type('GPU')
config.set_cuda_devices('0')
config.set_batch_size(32)
config.set_num_epochs(50)
batch_size = config.get_batch_size()
num_epochs = config.get_num_epochs()
num_classes = 10
# Download and transform the MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='./data', train=False, transform=transform, download=True)
# Create data loaders
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
builder = dl.GraphBuilder()
model = dl.Sequential(builder)
model.add_layer(dl.Linear(28 * 28, 256, builder, layer_num=1)) # Input: 784, Output: 256
model.add_layer(dl.ReLU(builder, layer_num=2))
model.add_layer(dl.Linear(256, 128, builder, layer_num=3))
model.add_layer(dl.ReLU(builder, layer_num=4))
model.add_layer(dl.Linear(128, 10, builder, layer_num=5))
loss_fn = dl.CrossEntropyLoss(builder) # Loss function
parameters = model.parameters() # Get model parameters
optimizer = dl.SGD(parameters, learning_rate=0.001) # Optimizer
for epoch in range(num_epochs):
    total_loss = 0.0
    for batch_idx, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        # Flatten the images to match the input dimension (batch_size, 28*28)
        images = images.view(images.size(0), -1).numpy().astype(np.float32)  # Convert to numpy array
        labels = labels.numpy().astype(int)  # Convert labels to numpy array
        # One-hot encode the labels
        labels = np.eye(num_classes)[labels].astype(np.float32)
        # Convert to Tensor
        input_tensor = dl.Tensor(images, False)
        target_tensor = dl.Tensor(labels, False)
        input_node = builder.createVariable("input", input_tensor.transpose())
        target_node = builder.createVariable("target", target_tensor.transpose())
        # Forward pass
        outputs = model.forward(input_node)
        # Compute loss
        loss = loss_fn.forward(outputs, target_node)
        # Backward pass
        builder.backward(loss)
        optimizer.step()
        # Accumulate loss
        total_loss += loss.value().get_data()[0]
        print(f"Epoch: {epoch+1}/{num_epochs}, Batch: {batch_idx+1}, Loss: {loss.value().get_data()[0]}")
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(train_loader):.4f}")
deepL\
|   .gitignore
|   CMakeLists.txt
|   dependencies.yml
|   readme.md
|
+---bindings
|   \---python
|           deeplearning.cpp
|
+---examples
|   +---python
|   |       gradmtest.py
|   |       mnist_classifier.py
|   |       trans_test.py
|   \---cpp
|           mnist_classifier.cpp_
|
+---include
|   \---deepl
|       +---core
|       +---layers
|       +---loss
|       +---optimisers
|       \---utils
+---scripts
|       build.sh
+---src
    +---core
    +---layers
    +---loss
    +---optimisers
    \---utils
- Implement Tensor class and operations.
- Add Python bindings.
- Create example neural networks.
- Expand documentation.
- Optimize GPU implementations.
- Add loss functions and optimizers.