pygrad documentation
pygrad is a lightweight automatic differentiation engine:

- written entirely in Python,
- relying only on NumPy, Numba, and opt_einsum,
- verified against PyTorch*,
- and less than 300 kB in size.
Pygrad will be useful to you if you are looking to compute gradients and/or perform gradient descent for models with fewer than 1 million parameters.
Pygrad's Tensor object operates like a NumPy array, additionally storing gradients.
Tensors:

- store operations performed on them, with support for broadcasting
- perform backpropagation with .backward()
- store gradients in .grad
- support np.float16 to np.float128 data types
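For instance, a minimal sketch of that workflow, using only the operations that also appear in the gradient-descent example below:

from pygrad.tensor import Tensor
a = Tensor(3.0)   # wrap a value in a Tensor
b = a**2 + 1.0    # the operation is recorded on the result
b.backward()      # backpropagate through the recorded operations
a.grad            # d(a**2 + 1)/da at a = 3, i.e. 6.0
b.value           # 10.0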
A simple example performing gradient descent on a Tensor:
from pygrad.tensor import Tensor
loss_fn = lambda y, yh: (y - yh)**2    # squared-error (L2) loss
x = Tensor(1)                          # Tensor
y = x**2 + 0.25                        # model
yh = 0.5                               # float target
for _ in range(1000):
    y = x**2 + 0.25                    # forward pass
    loss_fn(y, yh).backward()          # populates x.grad
    x.value = x.value - 0.01*x.grad    # gradient descent step
x.value, loss_fn(y, yh).value          # 0.5, 0
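The converged value can be checked by hand: the objective is (x**2 - 0.25)**2, whose derivative 4*x*(x**2 - 0.25) vanishes at x = 0.5. A quick sketch of verifying a single gradient against that analytic derivative (assuming a fresh Tensor, so no gradients have accumulated):

from pygrad.tensor import Tensor
x = Tensor(1)
loss = (x**2 + 0.25 - 0.5)**2   # same objective as above
loss.backward()                 # populates x.grad
x.grad                          # analytic value: 4*1*(1**2 - 0.25) = 3.0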
This documentation includes examples that range from using Tensors to perform gradient descent on the very simplest of functions, to training a Vaswani Transformer with Adam.
For installation instructions and a quick glance at usage, see Usage. All classes and functions can be found in API. For in-depth module descriptions, check out Modules.
If you are interested in contributing, please click here. If you are wondering who I am, click here.
Note

*All operations are verified against PyTorch, except for Conv2D gradients when performing strictly more than one backward pass with reset_grad=False.