Skip to content

一个轻量化的深度学习框架,基于C++和CUDA编写,支持CPU和GPU加速,Python接口调用

License

Notifications You must be signed in to change notification settings

piood/LightTorch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 

Repository files navigation

LightTorch

一个轻量化的深度学习框架,基于C++和CUDA编写,支持CPU和GPU加速,使用pybind11,支持Python接口调用

实现内容:

  • 使用C++,CUDA编写底层Tensor,以及相应Ops,支持CPU,GPU加速,性能较好
  • 使用pybind11,支持Python调用,接口类似于Pytorch,方便易用
  • 使用拓扑排序实现计算图的构建,反向传播的梯度计算,构建SGD,Aadm优化器,更新参数
  • 实现了Conv, Layernorm,Linear等网络层,并在此基础上实现了Transformer, Resnet模型

安装教程

git clone https://github.com/piood/LightTorch
cd LightTorch/LightTorch
pip install -r requirements.txt #安装Python依赖库
make #编译C++,CUDA代码

使用示例

import ltorch #导入ltorch
from ltorch.apps.models import ResNet9
from ltorch.apps.simple_ml import train_cifar10, evaluate_cifar10
import numpy as np

device = ltorch.cuda() #GPU加速
dataset = ltorch.data.CIFAR10Dataset("ltorch_data/cifar-10-batches-py", train=True)
dataloader = ltorch.data.DataLoader(
         dataset=dataset,
         batch_size=256,
         shuffle=True, device=device)
model = ResNet9(device=device, dtype="float32")

loss_fn = ltorch.nn.SoftmaxLoss()
n_epochs=50
optimizer=ltorch.optim.Adam
lr=0.001
weight_decay=0.001
opt = optimizer(model.parameters(), lr=lr, weight_decay=weight_decay)

for epoch in range(n_epochs):
    correct, total_loss = 0, 0
    device = model.device
    model.train()
    for batch in dataloader:
        opt.reset_grad()
        X, y = batch
        X, y = ltorch.Tensor(X, device=device), ltorch.Tensor(y, device=device)
        out = model(X)
        loss = loss_fn(out, y)
        loss.backward()
        opt.step()
        correct += np.sum(np.argmax(out.numpy(), axis=1) == y.numpy())
        total_loss += loss.data.numpy() * y.shape[0]
    sample_nums = len(dataloader.dataset)
    avg_acc, avg_loss =  correct / sample_nums, total_loss / sample_nums
    print(f"Epoch: {epoch}, Acc: {avg_acc}, Loss: {avg_loss}")

示例文件

参考:

About

一个轻量化的深度学习框架,基于C++和CUDA编写,支持CPU和GPU加速,Python接口调用

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published