GitHub - yogimogi/yodf: 'Hello, World!' Forward Mode Autodiff library with TensorFlow 1 like interface.

A 'Hello, World!' forward mode autodiff library.

This small (~500 lines) library illustrates one way forward mode autodiff can be implemented. It lets you compute the value and the derivative of a function expressed as a computational graph built from the primitives the library provides. The interface is very similar to TensorFlow 1.*: all the samples in the examples folder also run under TensorFlow 1.* if you use import tensorflow as tf instead of import yodf as tf. It supports the following operations: { "add", "subtract", "divide", "multiply", "pow", "sin", "cos", "log", "exp", "matmul", "sigmoid", "reduce_mean", "reduce_sum" }.

Installation

pip install yodf installs the library. Its only dependency is numpy. The samples in the examples folder also depend on matplotlib and scipy.

Basic usage

The code below computes the value and the derivative of the function x^2 at x=5.0.

import yodf as tf
x = tf.Variable(5.0)
cost = x**2
with tf.Session() as s:
    # global_variables_initializer exists only to mirror the
    # TensorFlow API; it does hardly anything here
    s.run(tf.global_variables_initializer())
    s.run(cost)
print(x.value, cost.value, cost.gradient)

## Output
## 5.0 25.0 10.0
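
The gradient printed above can be cross-checked without the library. The sketch below is plain Python and uses a central finite difference; the names f, central_difference and the step size h are chosen here for illustration and are not part of yodf.

```python
def f(x):
    """The same function as in the example above: f(x) = x^2."""
    return x ** 2


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) numerically as (func(x+h) - func(x-h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


value = f(5.0)
approx_gradient = central_difference(f, 5.0)
print(value, approx_gradient)
```

Unlike this numerical approximation, autodiff applies exact derivative rules, so its result does not depend on a step size.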

Basic gradient descent example

The code below finds the minimum of the function x^2, along with the value of x at which it occurs, starting from x=5.0.

import yodf as tf
x = tf.Variable(5.0)
cost = x**2
train = tf.train.GradientDescentOptimizer(learning_rate=0.2).minimize(cost)
with tf.Session() as s:
    s.run(tf.global_variables_initializer())
    for _ in range(50):
        _, x_final, cost_final = s.run([train, x, cost])
print(f"Minima: {cost_final:.10f}, x at minima: {x_final:.10f}")

## Output
## Minima: 0.0000000000, x at minima: 0.0000000000
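
For intuition, the same optimization can be written by hand. This is a sketch in plain Python that assumes the optimizer applies the textbook update x = x - learning_rate * gradient; the derivative 2x is hard-coded here, whereas the library obtains it by autodiff.

```python
x = 5.0
learning_rate = 0.2

for _ in range(50):
    gradient = 2.0 * x                 # d(x^2)/dx, written out by hand
    x = x - learning_rate * gradient   # each step multiplies x by about 0.6

cost = x ** 2
print(f"Minima: {cost:.10f}, x at minima: {x:.10f}")
```

With this learning rate every step shrinks x by a constant factor, so 50 steps bring it very close to 0.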

How does it work?

The library has a class called Tensor, with Variable and _Constant as subclasses. A Tensor object holds a value and a gradient. The gradient of a constant is 0 and that of a variable is 1, since dx/dx = 1.
A tensor can also represent an operation. Such a tensor is created with a convenience function call like tf.sin() or tf.matmul().

import numpy as np
import yodf as tf
x = tf.Variable(np.array([[1,1],[2,2]]))
op_sin = tf.sin(x)
print(op_sin)

## Output
## <yod.Tensor type=TensorType.INT, shape=(2, 2), operation='sin'>
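
The idea of a value paired with a gradient can be sketched in a few lines of plain Python. This is not the library's code: the class Dual and the helpers variable, sin and _lift are hypothetical names, and the sketch evaluates eagerly, whereas yodf builds tensors first and evaluates them later in a session.

```python
import math


class Dual:
    """Hypothetical stand-in for a tensor: a value paired with its derivative."""

    def __init__(self, value, gradient=0.0):
        self.value = value
        self.gradient = gradient  # 0.0 by default, i.e. a constant

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.gradient + other.gradient)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.gradient * other.value + self.value * other.gradient)

    def __pow__(self, n):
        # Power rule for a constant exponent n: (u^n)' = n * u^(n-1) * u'
        return Dual(self.value ** n, n * self.value ** (n - 1) * self.gradient)


def _lift(x):
    """Wrap a plain number as a constant (gradient 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def variable(value):
    """A variable has gradient 1, because dx/dx = 1."""
    return Dual(value, 1.0)


def sin(x):
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.gradient)


x = variable(5.0)
y = x ** 2
print(y.value, y.gradient)  # 25.0 10.0

z = sin(x * x + 3)          # derivative is cos(x^2 + 3) * 2x
print(z.value, z.gradient)
```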

You typically pass a tensor to the run method of the Session class, which evaluates the tensor along with its derivative. The execute method of a tensor only knows how to compute the derivative of the basic arithmetic operations, the power function and some transcendental functions such as sin, cos, log and exp. It also knows how to compute the derivative when a matrix multiplication is involved. By applying the chain rule repeatedly to these operations, the derivative of an arbitrary function (represented as a tensor) is computed automatically. The run method simply builds a post-order traversal of the tree rooted at the tensor passed to it and evaluates all the nodes in that order. GradientDescentOptimizer simply updates the value of the variable based on the gradient of the cost tensor passed to its minimize function.
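
The evaluation order described above can be sketched as follows. This is a minimal illustration, not the library's implementation: Node, RULES, post_order and run are hypothetical names, and only four scalar operations are covered.

```python
import math


class Node:
    """Hypothetical graph node: a leaf (op is None) or an operation over inputs."""

    def __init__(self, op=None, inputs=(), value=None, gradient=0.0):
        self.op, self.inputs = op, inputs
        self.value, self.gradient = value, gradient


# Each rule maps already-evaluated input nodes to this node's (value, gradient).
RULES = {
    "add": lambda a, b: (a.value + b.value, a.gradient + b.gradient),
    "multiply": lambda a, b: (a.value * b.value,
                              a.gradient * b.value + a.value * b.gradient),
    "sin": lambda a: (math.sin(a.value), math.cos(a.value) * a.gradient),
    "exp": lambda a: (math.exp(a.value), math.exp(a.value) * a.gradient),
}


def post_order(node, visited=None, order=None):
    """List the nodes so that every input comes before the node that uses it."""
    if visited is None:
        visited, order = set(), []
    if id(node) not in visited:
        visited.add(id(node))
        for child in node.inputs:
            post_order(child, visited, order)
        order.append(node)
    return order


def run(node):
    """Evaluate value and derivative of every node, inputs first."""
    for n in post_order(node):
        if n.op is not None:
            n.value, n.gradient = RULES[n.op](*n.inputs)
    return node.value, node.gradient


x = Node(value=2.0, gradient=1.0)  # variable: gradient 1
c = Node(value=3.0, gradient=0.0)  # constant: gradient 0
f = Node("sin", (Node("multiply", (x, Node("add", (x, c)))),))  # sin(x * (x + 3))
value, gradient = run(f)
print(value, gradient)
```

Because inputs are always evaluated before the node that consumes them, each rule only needs the values and gradients of its direct inputs; the chain rule is applied one node at a time.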
With multiple independent variables, the partial derivative with respect to one variable is computed at a time, while the gradient of the remaining variables is set to 0. This is repeated for every variable, and the partial derivatives (the gradients) of all the variables are accumulated by GradientDescentOptimizer, which is not necessarily very clean.
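
The one-variable-at-a-time scheme can be illustrated in plain Python. In this sketch the function f and the helper gradient are made up for the example; each argument is a (value, derivative) pair, and one pass is made per variable with that variable's derivative seeded to 1 and all others to 0.

```python
import math


def f(x, y):
    """f(x, y) = x*y + sin(x), evaluated on (value, derivative) pairs."""
    (xv, xd), (yv, yd) = x, y
    value = xv * yv + math.sin(xv)
    derivative = xd * yv + xv * yd + math.cos(xv) * xd
    return value, derivative


def gradient(func, point):
    """Collect all partial derivatives using one forward pass per variable."""
    partials = []
    for i in range(len(point)):
        # Seed variable i with derivative 1 and every other variable with 0.
        args = [(v, 1.0 if j == i else 0.0) for j, v in enumerate(point)]
        _, partial = func(*args)
        partials.append(partial)
    return partials


grads = gradient(f, [2.0, 3.0])
print(grads)  # [df/dx, df/dy] = [y + cos(x), x] at (2, 3)
```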

Examples

Examples folder shows use of this library for

Limitation of forward mode autodiff

With forward mode autodiff, the derivative of a function of one independent variable is computed during the forward pass itself; no backward pass is needed, unlike reverse mode autodiff (generalized backpropagation). With multiple independent variables (say, the weights of a neural network), however, as many passes are needed as there are independent variables. As the linear regression sample shows, the time needed by gradient descent increases linearly with the degree of the polynomial you are trying to fit. For MNIST digit classification, this library becomes almost unusable because of the large number of independent variables whose gradients need to be computed. Machine learning frameworks such as PyTorch, TensorFlow and Theano use reverse mode autodiff for gradient computation.
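
The pass count can be stated in terms of the Jacobian. The following is a sketch in standard notation; the symbols f, J, v, u and e_i are introduced here and do not come from the library.

```latex
f : \mathbb{R}^n \to \mathbb{R}^m, \qquad
J = \frac{\partial f}{\partial x} \in \mathbb{R}^{m \times n}

\text{One forward mode pass with seed } v \in \mathbb{R}^n \text{ yields } J v
\quad (\text{column } i \text{ of } J \text{ when } v = e_i).

\text{One reverse mode pass with seed } u \in \mathbb{R}^m \text{ yields } u^{\top} J
\quad (\text{row } j \text{ of } J \text{ when } u = e_j).

\text{For a scalar cost } (m = 1): \quad
\nabla f = \left( J e_1, \dots, J e_n \right)
\text{ takes } n \text{ forward passes but a single reverse pass.}
```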
