Basic usage

PyADi performs first order automatic differentiation via source transformation.

Demos

Demo demo_babylonian

This module contains a function, fbabylonian() which takes a single argument x and returns the square root of x, the same as the builtin sqrt().

examples.demo_babylonian.fbabylonian(x)[source]
examples.demo_babylonian.gbabylonian(x, y=1, tol=1e-07)[source]
examples.demo_babylonian.run()[source]

We alias our function to f and define a starting value x0:

f = fbabylonian

x0 = 16

Then we are already good to go with calling DiffFor():

dr1, r1 = pyadi.DiffFor(f, x0, verbose=0)

print(f'dr = {dr1}, r = {r1}')

Since in general we will not know the correct result directly, it is always a good idea to compare against finite differences, at least once in the beginning, with DiffFD():

dr_fd1, r1 = pyadi.DiffFD(f, x0)

print(f'dr_fd = {dr_fd1}, r = {r1}')

This produces the following output:

x0 = 16
dr = [0.12500000000005562], r = 4.000000000000051
dr_fd = [0.12500001034254637], r = 4.000000000000051

When getting started with a new funcion, or for deeper insights, it may also be interesting to set verbose to some integer > 0. With verbose=1, information on the differentiation process is printed:

x0 = 16
Load and parse module __main__ source from /home/dev/src/projects/gh/pyadi/tests/examples/demo_babylonian.py: 3.3 ms
AD function produced for __main__.fbabylonian: d_fbabylonian
AD function produced for __main__.gbabylonian: d_gbabylonian
Timer fbabylonian adrun: 1.43 ms
AD factor fbabylonian: 1.43 ms / 5.72 µs = 249.33
dr = [0.12500000000005562], r = 4.000000000000051
dr_fd = [0.12500001034254637], r = 4.000000000000051
Timer fbabylonian adrun: 60.80 µs
AD factor fbabylonian: 60.80 µs / 4.53 µs = 13.42

With verbose=2, the differentiated function codes will also be printed, which for example for our function looks like this:

def d_fbabylonian(d_x, x):
    (d_r, r) = D(gbabylonian)((d_x, x))
    return (d_r, r)

The function calls another function gbabylonian() directly. The D() operator will proceed to differentiate it when d_fbabylonian is first called, as indicated by the messages above. For this reason the reported runtime of the first call and the AD runtime factor are relatively large. Obviously the result of the process will be cached, so the subsequent call executes much faster already, as seen in the last line.

The verbose=1 option produces these timing messages because option timings defaults to True. This not only to show off, but also to catch a common error, which is that something is not correct about the function, the arguments etc. These are detected by trying to run the function first.

To get the best performance you should of course set timings to False. Then, even with verbose enabled, the second call to DiffFor() produces no output at all.

As seen from the FD derivative values printed, this method inherently produces results that are only correct up to half the number of digits. The AD results however, should be correct up to the machine precision.

Demo demo_numpy

This demos shows that using numpy instead of plain floats works as well. We must however redefine our function slightly, because the stop condition does not work with arrays. The result is gbabylonian().

examples.demo_numpy.fbabylonian(x)[source]
examples.demo_numpy.gbabylonian(x, y=1, tol=1e-07)[source]
examples.demo_numpy.run()[source]

We alias our function to f and define a starting value x0:

f = fbabylonian

x0 = np.array([1,2,3,4,5], dtype=float)

When calling DiffFor() now, the AD function will be run five times automatically, and the result dr is a list of length five:

dr1, r1 = pyadi.DiffFor(f, x0, verbose=0)

We compare with DiffFD() as well:

dr_fd1, r1 = pyadi.DiffFD(f, x0)

Printing this output would by rather lengthy, and it contains a lot of zeros. The function is vector valued, so it does not make sense to iterate over all the inputs, we can get the derivatives as a vector using the seed parameter:

dr2, r2 = pyadi.DiffFor(f, x0, verbose=0, seed=[numpy.ones(5)])
dr_fd2, r2 = pyadi.DiffFD(f, x0, seed=[numpy.ones(5)])

The seed is a list, and each item defines a derivative direction, the AD function will run once for each item. The output is then:

x0 = [16.  2.  3.  4.  5.]
r0 = [4.000000000000051 1.414213562373095 1.732050807568877 2.
 2.23606797749979 ]
dr = [array([0.125000000000056, 0.353553390593274, 0.288675134594813,
       0.25             , 0.223606797749979])]
dr_fd = [array([0.125000010342546, 0.353553386567285, 0.288675128246041,
       0.250000009582863, 0.223606799742981])]

When we activate verbose=1 here, the output is something like this:

x0 = [16.  2.  3.  4.  5.]
r0 = [4.000000000000051 1.414213562373095 1.732050807568877 2.
 2.23606797749979 ]
Load and parse module __main__ source from /home/dev/src/projects/gh/pyadi/tests/examples/demo_numpy.py: 4.1 ms
AD function produced for __main__.fbabylonian: d_fbabylonian
AD function produced for __main__.gbabylonian: d_gbabylonian
Timer fbabylonian adrun: 1.55 ms
AD factor fbabylonian: 1.55 ms / 79.87 µs = 19.43
Timer fbabylonian adrun: 184.54 µs
AD factor fbabylonian: 184.54 µs / 79.87 µs = 2.31
Timer fbabylonian adrun: 152.11 µs
AD factor fbabylonian: 152.11 µs / 79.87 µs = 1.90
Timer fbabylonian adrun: 152.11 µs
AD factor fbabylonian: 152.11 µs / 79.87 µs = 1.90
Timer fbabylonian adrun: 151.16 µs
AD factor fbabylonian: 151.16 µs / 79.87 µs = 1.89
Timer fbabylonian adrun: 146.63 µs
AD factor fbabylonian: 146.63 µs / 68.19 µs = 2.15
dr = [array([0.125000000000056, 0.353553390593274, 0.288675134594813,
       0.25             , 0.223606797749979])]
dr_fd = [array([0.125000010342546, 0.353553386567285, 0.288675128246041,
       0.250000009582863, 0.223606799742981])]

The AD factors are a lot better than what we saw in the first example demo_babylonian. From the first AD factor reported, which includes the differentiation of the inner function call to gbabylonian(), we can conclude that the source transformation time is essentially the same. Since the function now takes longer, because numpy kicks in, the first AD factor is much lower already. Then the caching and the warming up leads to quite reasonable AD factors of sometimes even less than two. Of course this is still a small example, but this can be seen as an even harder case for AD, as the whole code is still interpreter-bound, the actual float ops play a very small role until we increase the vector length to 1000 at least.

Demo demo_directional

Let us explore in more detail how to use the seed option and what a directional derivative is, as the readers will certainly also remember from school.

examples.demo_directional.run()[source]

The vector function fbabylonian() is aliased to f and the starting value x0 is a short vevtor:

x0 = np.array([1,2,3,4,5], dtype=float)

We saw in demo_numpy that passing a derivative direction allows us to get all derivatives of this function as a single vector:

dx = numpy.ones(5)

How does this work? The AD process, running a differentiated function once, returns a single directional derivative. This is because we can chose one set of values for the derivative arguments, which are of the same type as the original arguments. The items in seed define those values, once for each direction:

dr2, r2 = pyadi.DiffFor(f, x0, verbose=0, seed=[dx])
dr_fd2, r2 = pyadi.DiffFD(f, x0, seed=[dx])

The result of this is exactly a directional derivative d/dx f(x) evaluated at the point x along the derivative direction dx, which is defined as d / dt f(x + t * dx) evaluated at the point t=0.

This means in particular, it can be expressed as the derivative of a function that has a single scalar float parameter t. We can try this just out by defining a suitable function foft:

def foft(t):
    return f(x0 + t * dx)

When we evaluate this function at t=0 we get f(x) and when we evaluate the derivatives at that point we get the directional derivatives of f:

dr2, r2 = pyadi.DiffFor(foft, 0, verbose=2)
dr_fd2, r2 = pyadi.DiffFD(foft, 0)

PyADi will differentiate foft to the following code, were the nonlocal values x0 and dx are treated as having no derivative, which produces the calls to dzeros():

def d_foft(d_t, t):
    d_t_c2 = ((d_t * dx) + (t * dzeros(dx)))
    t_c2 = (t * dx)
    return D(f)(((dzeros(x0) + d_t_c2), (x0 + t_c2)))

We now have two different ways to compute the same derivatives, which we can compare with each other, Again, the FD method will inherently produce results that are only correct up to half the number of digits. The AD results should be correct up to the machine precision.

The two FD results among each other will also be perfectly identical because the same float operations are performed in either case. For less well-behaved functions it may happen, however, that the directional derivative evaluated with FD is not the same as the full Jacobian evaluated with FD multiplied by dx. For AD these two will of course also be exactly identical.

Demo demo_args

This demo module shows how to pass additional parameters to the function while differentiating only w.r.t. the first.

examples.demo_args.run()[source]

The target function is f, alias gbabylonian().

This function has two parameters with defaults, so it can be called with one, two, or three positional arguments.

Two things are important to remember here:

  • parameters with defaults are often passed via keyword arguments. This however does not work with the entry point function DiffFor(), as it will consider all keyword arguments options. So don’t do this:

    # wrong: tol is now an option to DiffFor
    dr1, r1 = pyadi.DiffFor(f, x0, tol=1e-15, active=[0])
    

    In this case, we want to set the tol parameter, which is the third, so we also have to pass the second:

    # correct: use only positional arguments
    dr1, r1 = pyadi.DiffFor(f, x0, 1, 1e-15, active=[0])
    
  • we want to continue to differentiate w.r.t. to x only, so we use the active option, as seen above.

It is also possible to use the option f_kw to pass a dictionary as keywords to f, in which case, active would not be needed any more:

# also correct: use f_kw for keyword arguments
dr1, r1 = pyadi.DiffFor(f, x0, f_kw=dict(tol=1e-15))

Note that gbabylonian() itself does use keyword arguments normally. It is just the entry point function DiffFor() where the f_kw mechanism is applied.

Running this function produces the following output:

x0 = 12.4
dr = [0.1419904585617669], r = 3.5213633723318023
dr_fd = [0.1419904638311209], r = 3.5213633723318023
F error: 3.552713678800501e-15
AD/FD error: 5.269354008685667e-09
dr = [0.14199045856176618], r = 3.521363372331802
dr_fd = [0.1419904638311209], r = 3.521363372331802
F error: 0.0
AD/FD error: 5.269354730330633e-09
dr = [0.1422608716588418], r = 3.5219794497246726
dr_fd = [0.1422608919554591], r = 3.5219794497246726
F error: 0.004339244282906662
AD/FD error: 2.029661730351684e-08

This shows, the function varies in precision with the square of the tolerance threshold, but the AD vs. FD error remains roughly the same.

Demo demo_setrule

Show usage of setrule() to override a function’s automatic derivative, in this case. The same procedure must be used to provide derivatives for function that do not have Python source code. This concerns numpy in particular.

examples.demo_setrule.run()[source]

Use setrule() to override the derivative of gbabylonian().

Prints as output:

x0 = 12.4
dr = [0.1419904585617669], r = 3.5213633723318023
Rule called
dr2 = [0.14199045856176618], r = 3.5213633723318023

Differentiation

The main entry point to PyADi is DiffFor() which accepts a function with positional arguments, and evalutes the derivative at the point given by the arguments.

pyadi.DiffFor(function, *args, seed=1, active=[], f_kw=None, timings=True, verbose=0, dump=0, dumpdir='dump', **opts)[source]

Differentiate function and compute first-order derivatives evaluated at *args and **f_kw, w.r.t. all floats in args, possibly restricted by àctive.

Differentiate function function(*args) with forward mode AD to produce adfun. This function is the main entry point to start the differentiation process. This function basically does the following:

  • Differentiate function with DiffFunction() alias D()

  • Create one set of derivative arguments dx = dzeros(args) using dzeros()

  • For each seeddir in seed, initialize dx with seeddir using fill() and call adfun with dx and args appropriately.

The result is a tuple of the list of the derivative results thus produced, and the function result.

Although PyADi supports almost the full set of Python language features including keyword arguments, lambda functions, etc. the function given here must adhere to some restrictions:

  • function must be a regular Python function, defined with

    def, not a lambda expression.

  • This function only processes the positional arguments args and considers all keyword arguments as options to the process, additional keyword arguments can be passed using f_kw.

  • function can have parameter default values,

  • function can be a local function returned by whatever other function. This setup processs will not be differentiated.

  • function can also have a decorator, which will be differentiated.

It may in some cases be required, and it is no problem, to create additional toplevel functions that can be given to this function, for example to wrap a lambda expression.

It should not be required to build extra toplevel functions to inject global variables into the scope, since function can be a local function already. It will have access to the parent scopes as usual, but the values in it are treated as global values with zero derivative.

However, when a function returning a function is called within function, then this entire process, including possible calls to the result later, will be differentiated.

Parameters:
  • function (function) – Function to differentiate. Must be a regular function, defined with def, can be a local function.

  • args (list) – Function arguments. function will be differentiated with respect to all arguments or to those listed by active.

  • active (list or str) – Active arguments, like [0,1], [‘x’, ‘y’], or a comma-separated string like ‘x,y’. The empty list or string means all arguments. What actually happens is that a local function of only the active args, calling function, is generated by mkActArgFunction() and that is differentiated instead.

  • seed (1 or list) – Seed, derivative directions. When seed == 1, all derivative directions are computed. When seed is a list, then each entry must be a list or array of the same size as the total length of the active arguments. The function nvars() can compute that value.

  • f_kw (dict) – Further keyword arguments that will be passed to function as **f_kw. This wraps function with mkKwFunction().

  • opts (dict) – Further options opts including also verbose, dump and dumpdir are stored in a global variable transformOpts. These global options are added to the options of doDiffFunction() by each call to D() in the subsequent process.

Returns:

A tuple of the derivative and the function result. The derivative is a list with as many entries as there were seed directions.

Return type:

tuple

A similar result cam be obtained for comparison with DiffFD().

pyadi.DiffFD(f, *args, active=[], seed=1, h=1e-08, f_kw={}, **opts)[source]

Evaluate derivatves using central finite differences.

The function f is called two times for each derivative direction provided by seed, to evaluate a central finite difference with step size h. The function f is called once more to compute the original result.

Parameters:
  • f (function) – Function to differentiate.

  • args (list) – Function arguments.

  • h (float) – Step size, default 1e-8

  • active (list or str) – Active arguments, like [0,1], [‘x’, ‘y’], or a comma-separated string like ‘x,y’. The empty list or string means all arguments. What actually happens is that a local function of only the active args, calling function, is generated by mkActArgFunction() and that is differentiated instead.

  • seed (1 or list) – Seed, derivative directions. When seed == 1, all derivative directions are computed. When seed is a list, then each entry must be a list or array of the same size as the total length of the active arguments. The function nvars() can compute that value.

  • f_kw (dict) – Further keyword arguments that will be passed to f as **f_kw.

  • opts (dict) – Options, not used.

Returns:

A tuple of the derivative and the function result. The derivative is a list with as many entries as there were seed directions.

Return type:

tuple

When the source transformation fails with NoSource, the function setrule() must be used, maybe together with getrule() and delrule(), to install a runtime handler for the function in question into the default rule module forwardad.

pyadi.setrule(func, adfunc, mode='D')[source]

Install a runtime handler adfunc for func with mode.

This is a function that gets called with the current original and derivative parameters in the form adfunc(*args, **kw), where args is a list of 2*N arguments, derivative and original arguments interleaved.

Depending on mode the adfunc will be handled differently, in the first two cases a wrapper function is produced:

  • When mode='D', the handler will be called as adfunc(r, *args, **kw), where r is the function result, and has to return just the derivative d_r.

  • When mode='Dkw', the handler will be called as adfunc(r, d_kw, *args, **kw), where in addition d_kw, kw is the result of unjnd().

  • When mode='E', then the handler will be called directly as adfunc(*args, **kw), and has to return a tuple.

pyadi.getrule(func, mode='D')[source]

Return the runtime handler adfunc for func with mode. See setrule() for details.

Returns:

The currently installed handler.

Return type:

function

pyadi.delrule(func, mode='D')[source]

Delete the runtime handler for func with mode. See setrule() for details.

Returns:

The currently installed handler.

Return type:

function

exception pyadi.NoSource[source]

When no handler is found, AD will be performed behind the scenes, of which the results can be shown by using the parameters verbose or dump with DiffFor().

The main construct is the operator D(), which produces a differentiated function for any given function.

pyadi.D(function, **opts)

An alias for DiffFunction() so the generated code can be shorter.

The function py() produces the source code of the given function as a string, which is done by getting the AST with getast() and unparsing it with unparse(), which uses a slightly modified version of astunparse(), available at https://github.com/aiandit/astunparse/archive/refs/heads/main.zip, that is installed automatically via the requirements.txt file.

pyadi.py(func, info=False)[source]
pyadi.getast(func, **kw)[source]

The source transformation is implemented as a series of tree traversals, defined by the low-level AST transforming function differentiate(), and the more high-level function doSourceDiff(), the result of which can be seen with Dpy(). The result is an actual AST, of type astunparse.ASTNode, which has a str() converter that calls unparse(), while the repr() converter calls astunparse.unparse2j().

pyadi.Dpy(func, active=[], **kw)[source]
pyadi.differentiate(intree, activef=None, active=None, modules=None, filter=False, prefix=None, **kw)[source]
pyadi.doSourceDiff(function, opts)[source]

A crucial function is dzeros(), it is called in many situations by the differentiated code and the results of it are very important in several respects. For example, generic objects are not cloned, but all their fields are set to zeros. Arrays from numpy are cloned with zeros(), however. Which means, should a third type behave as zeros(), you would have to overwrite dzeros() (not yet possible). It is however not required to import dzeros() into your code, and likewise not the other runtime functions unzd(), joind(), and unjnd(), which, in addition to D() and Dc(), the differentiated code makes use of. For this reason, only dzeros() is exported, it may be useful when calling the functions produced by D() directly.

pyadi.dzeros(args, lev=0)[source]
pyadi.joind(ddl, dl)[source]
pyadi.unzd(d)[source]
pyadi.unjnd(d)[source]

An important function behind the scenes is also isbuiltin(), which basically checks if source code is available for the given object, class, or function. It does so by calling getmodule(), which returns the module file name

pyadi.isbuiltin(func)[source]
pyadi.getmodule(func)[source]

The heavy lifting for getast(), in the sense that there is a cache of Python modules, and their one, entire, master AST, and a mechanism to return clones of given functions or classes, is done by getmoddict() and resolveImports().

pyadi.astvisitor.getmoddict(mod, **opts)[source]

Return dictionary with ASTs of module classes and functions. Resolve from imports with resolveImports().

pyadi.astvisitor.resolveImports(mod, modfile, moddict, imports, modules, **opts)[source]

There are a couple of helper objects like, DWith which is a context manager that simply delegates to a tuple of context managers.

pyadi.DWith(*args, **kw)[source]

The function resolution in PyADi is entrirely generic, the behaviour of forwardad can by overwritten by any other python module, as for example by dummyad, where the first item of each of the tuples being past around is just any value that works.

The resulting functions can also be augmented, or decorated, by further rule modules that (maybe) compose an additional wrapper function of the function that the recieve from inner layers. Such modules are of the same structure but they do not interfere in the actual computations, for example timing and trace.