Basic usage¶
PyADi performs first order automatic differentiation via source transformation.
Demos¶
Demo demo_babylonian¶
This module contains a function, fbabylonian()
which takes a
single argument x
and returns the square root of x
, the same
as the builtin sqrt()
.
- examples.demo_babylonian.run()[source]¶
We alias our function to
f
and define a starting valuex0
:f = fbabylonian x0 = 16
Then we are already good to go with calling
DiffFor()
:dr1, r1 = pyadi.DiffFor(f, x0, verbose=0) print(f'dr = {dr1}, r = {r1}')
Since in general we will not know the correct result directly, it is always a good idea to compare against finite differences, at least once in the beginning, with
DiffFD()
:dr_fd1, r1 = pyadi.DiffFD(f, x0) print(f'dr_fd = {dr_fd1}, r = {r1}')
This produces the following output:
x0 = 16 dr = [0.12500000000005562], r = 4.000000000000051 dr_fd = [0.12500001034254637], r = 4.000000000000051
When getting started with a new funcion, or for deeper insights, it may also be interesting to set
verbose
to some integer > 0. With verbose=1, information on the differentiation process is printed:x0 = 16 Load and parse module __main__ source from /home/dev/src/projects/gh/pyadi/tests/examples/demo_babylonian.py: 3.3 ms AD function produced for __main__.fbabylonian: d_fbabylonian AD function produced for __main__.gbabylonian: d_gbabylonian Timer fbabylonian adrun: 1.43 ms AD factor fbabylonian: 1.43 ms / 5.72 µs = 249.33 dr = [0.12500000000005562], r = 4.000000000000051 dr_fd = [0.12500001034254637], r = 4.000000000000051 Timer fbabylonian adrun: 60.80 µs AD factor fbabylonian: 60.80 µs / 4.53 µs = 13.42
With verbose=2, the differentiated function codes will also be printed, which for example for our function looks like this:
def d_fbabylonian(d_x, x): (d_r, r) = D(gbabylonian)((d_x, x)) return (d_r, r)
The function calls another function
gbabylonian()
directly. TheD()
operator will proceed to differentiate it whend_fbabylonian
is first called, as indicated by the messages above. For this reason the reported runtime of the first call and the AD runtime factor are relatively large. Obviously the result of the process will be cached, so the subsequent call executes much faster already, as seen in the last line.The
verbose=1
option produces these timing messages because optiontimings
defaults toTrue
. This not only to show off, but also to catch a common error, which is that something is not correct about the function, the arguments etc. These are detected by trying to run the function first.To get the best performance you should of course set
timings
toFalse
. Then, even with verbose enabled, the second call toDiffFor()
produces no output at all.As seen from the FD derivative values printed, this method inherently produces results that are only correct up to half the number of digits. The AD results however, should be correct up to the machine precision.
Demo demo_numpy¶
This demos shows that using numpy instead of plain floats works as
well. We must however redefine our function slightly, because the stop
condition does not work with arrays. The result is
gbabylonian()
.
- examples.demo_numpy.run()[source]¶
We alias our function to
f
and define a starting valuex0
:f = fbabylonian x0 = np.array([1,2,3,4,5], dtype=float)
When calling
DiffFor()
now, the AD function will be run five times automatically, and the resultdr
is a list of length five:dr1, r1 = pyadi.DiffFor(f, x0, verbose=0)
We compare with
DiffFD()
as well:dr_fd1, r1 = pyadi.DiffFD(f, x0)
Printing this output would by rather lengthy, and it contains a lot of zeros. The function is vector valued, so it does not make sense to iterate over all the inputs, we can get the derivatives as a vector using the
seed
parameter:dr2, r2 = pyadi.DiffFor(f, x0, verbose=0, seed=[numpy.ones(5)]) dr_fd2, r2 = pyadi.DiffFD(f, x0, seed=[numpy.ones(5)])
The seed is a list, and each item defines a derivative direction, the AD function will run once for each item. The output is then:
x0 = [16. 2. 3. 4. 5.] r0 = [4.000000000000051 1.414213562373095 1.732050807568877 2. 2.23606797749979 ] dr = [array([0.125000000000056, 0.353553390593274, 0.288675134594813, 0.25 , 0.223606797749979])] dr_fd = [array([0.125000010342546, 0.353553386567285, 0.288675128246041, 0.250000009582863, 0.223606799742981])]
When we activate
verbose=1
here, the output is something like this:x0 = [16. 2. 3. 4. 5.] r0 = [4.000000000000051 1.414213562373095 1.732050807568877 2. 2.23606797749979 ] Load and parse module __main__ source from /home/dev/src/projects/gh/pyadi/tests/examples/demo_numpy.py: 4.1 ms AD function produced for __main__.fbabylonian: d_fbabylonian AD function produced for __main__.gbabylonian: d_gbabylonian Timer fbabylonian adrun: 1.55 ms AD factor fbabylonian: 1.55 ms / 79.87 µs = 19.43 Timer fbabylonian adrun: 184.54 µs AD factor fbabylonian: 184.54 µs / 79.87 µs = 2.31 Timer fbabylonian adrun: 152.11 µs AD factor fbabylonian: 152.11 µs / 79.87 µs = 1.90 Timer fbabylonian adrun: 152.11 µs AD factor fbabylonian: 152.11 µs / 79.87 µs = 1.90 Timer fbabylonian adrun: 151.16 µs AD factor fbabylonian: 151.16 µs / 79.87 µs = 1.89 Timer fbabylonian adrun: 146.63 µs AD factor fbabylonian: 146.63 µs / 68.19 µs = 2.15 dr = [array([0.125000000000056, 0.353553390593274, 0.288675134594813, 0.25 , 0.223606797749979])] dr_fd = [array([0.125000010342546, 0.353553386567285, 0.288675128246041, 0.250000009582863, 0.223606799742981])]
The AD factors are a lot better than what we saw in the first example
demo_babylonian
. From the first AD factor reported, which includes the differentiation of the inner function call togbabylonian()
, we can conclude that the source transformation time is essentially the same. Since the function now takes longer, becausenumpy
kicks in, the first AD factor is much lower already. Then the caching and the warming up leads to quite reasonable AD factors of sometimes even less than two. Of course this is still a small example, but this can be seen as an even harder case for AD, as the whole code is still interpreter-bound, the actual float ops play a very small role until we increase the vector length to 1000 at least.
Demo demo_directional¶
Let us explore in more detail how to use the seed
option and what
a directional derivative is, as the readers will certainly also
remember from school.
- examples.demo_directional.run()[source]¶
The vector function
fbabylonian()
is aliased tof
and the starting valuex0
is a short vevtor:x0 = np.array([1,2,3,4,5], dtype=float)
We saw in
demo_numpy
that passing a derivative direction allows us to get all derivatives of this function as a single vector:dx = numpy.ones(5)
How does this work? The AD process, running a differentiated function once, returns a single directional derivative. This is because we can chose one set of values for the derivative arguments, which are of the same type as the original arguments. The items in
seed
define those values, once for each direction:dr2, r2 = pyadi.DiffFor(f, x0, verbose=0, seed=[dx]) dr_fd2, r2 = pyadi.DiffFD(f, x0, seed=[dx])
The result of this is exactly a directional derivative d/dx f(x) evaluated at the point x along the derivative direction dx, which is defined as d / dt f(x + t * dx) evaluated at the point t=0.
This means in particular, it can be expressed as the derivative of a function that has a single scalar float parameter
t
. We can try this just out by defining a suitable functionfoft
:def foft(t): return f(x0 + t * dx)
When we evaluate this function at
t=0
we getf(x)
and when we evaluate the derivatives at that point we get the directional derivatives off
:dr2, r2 = pyadi.DiffFor(foft, 0, verbose=2) dr_fd2, r2 = pyadi.DiffFD(foft, 0)
PyADi will differentiate
foft
to the following code, were the nonlocal valuesx0
anddx
are treated as having no derivative, which produces the calls todzeros()
:def d_foft(d_t, t): d_t_c2 = ((d_t * dx) + (t * dzeros(dx))) t_c2 = (t * dx) return D(f)(((dzeros(x0) + d_t_c2), (x0 + t_c2)))
We now have two different ways to compute the same derivatives, which we can compare with each other, Again, the FD method will inherently produce results that are only correct up to half the number of digits. The AD results should be correct up to the machine precision.
The two FD results among each other will also be perfectly identical because the same float operations are performed in either case. For less well-behaved functions it may happen, however, that the directional derivative evaluated with FD is not the same as the full Jacobian evaluated with FD multiplied by
dx
. For AD these two will of course also be exactly identical.
Demo demo_args¶
This demo module shows how to pass additional parameters to the function while differentiating only w.r.t. the first.
- examples.demo_args.run()[source]¶
The target function is f, alias
gbabylonian()
.This function has two parameters with defaults, so it can be called with one, two, or three positional arguments.
Two things are important to remember here:
parameters with defaults are often passed via keyword arguments. This however does not work with the entry point function
DiffFor()
, as it will consider all keyword arguments options. So don’t do this:# wrong: tol is now an option to DiffFor dr1, r1 = pyadi.DiffFor(f, x0, tol=1e-15, active=[0])
In this case, we want to set the
tol
parameter, which is the third, so we also have to pass the second:# correct: use only positional arguments dr1, r1 = pyadi.DiffFor(f, x0, 1, 1e-15, active=[0])
we want to continue to differentiate w.r.t. to
x
only, so we use theactive
option, as seen above.
It is also possible to use the option
f_kw
to pass a dictionary as keywords tof
, in which case,active
would not be needed any more:# also correct: use f_kw for keyword arguments dr1, r1 = pyadi.DiffFor(f, x0, f_kw=dict(tol=1e-15))
Note that
gbabylonian()
itself does use keyword arguments normally. It is just the entry point functionDiffFor()
where thef_kw
mechanism is applied.Running this function produces the following output:
x0 = 12.4 dr = [0.1419904585617669], r = 3.5213633723318023 dr_fd = [0.1419904638311209], r = 3.5213633723318023 F error: 3.552713678800501e-15 AD/FD error: 5.269354008685667e-09 dr = [0.14199045856176618], r = 3.521363372331802 dr_fd = [0.1419904638311209], r = 3.521363372331802 F error: 0.0 AD/FD error: 5.269354730330633e-09 dr = [0.1422608716588418], r = 3.5219794497246726 dr_fd = [0.1422608919554591], r = 3.5219794497246726 F error: 0.004339244282906662 AD/FD error: 2.029661730351684e-08
This shows, the function varies in precision with the square of the tolerance threshold, but the AD vs. FD error remains roughly the same.
Demo demo_setrule¶
Show usage of setrule()
to override a function’s automatic
derivative, in this case. The same procedure must be used to provide
derivatives for function that do not have Python source code. This
concerns numpy
in particular.
- examples.demo_setrule.run()[source]¶
Use
setrule()
to override the derivative ofgbabylonian()
.Prints as output:
x0 = 12.4 dr = [0.1419904585617669], r = 3.5213633723318023 Rule called dr2 = [0.14199045856176618], r = 3.5213633723318023
Differentiation¶
The main entry point to PyADi is DiffFor()
which accepts a
function with positional arguments, and evalutes the derivative at the
point given by the arguments.
- pyadi.DiffFor(function, *args, seed=1, active=[], f_kw=None, timings=True, verbose=0, dump=0, dumpdir='dump', **opts)[source]
Differentiate
function
and compute first-order derivatives evaluated at*args
and**f_kw
, w.r.t. all floats inargs
, possibly restricted byàctive
.Differentiate function
function(*args)
with forward mode AD to produceadfun
. This function is the main entry point to start the differentiation process. This function basically does the following:Differentiate
function
withDiffFunction()
aliasD()
Create one set of derivative arguments
dx = dzeros(args)
usingdzeros()
For each seeddir in seed, initialize
dx
with seeddir usingfill()
and calladfun
withdx
andargs
appropriately.
The result is a tuple of the list of the derivative results thus produced, and the function result.
Although PyADi supports almost the full set of Python language features including keyword arguments, lambda functions, etc. the
function
given here must adhere to some restrictions:function
must be a regular Python function, defined withdef
, not a lambda expression.
This function only processes the positional arguments
args
and considers all keyword arguments as options to the process, additional keyword arguments can be passed usingf_kw
.function
can have parameter default values,function
can be a local function returned by whatever other function. This setup processs will not be differentiated.function
can also have a decorator, which will be differentiated.
It may in some cases be required, and it is no problem, to create additional toplevel functions that can be given to this function, for example to wrap a lambda expression.
It should not be required to build extra toplevel functions to inject global variables into the scope, since
function
can be a local function already. It will have access to the parent scopes as usual, but the values in it are treated as global values with zero derivative.However, when a function returning a function is called within
function
, then this entire process, including possible calls to the result later, will be differentiated.- Parameters:
function (function) – Function to differentiate. Must be a regular function, defined with
def
, can be a local function.args (list) – Function arguments.
function
will be differentiated with respect to all arguments or to those listed byactive
.active (list or str) – Active arguments, like [0,1], [‘x’, ‘y’], or a comma-separated string like ‘x,y’. The empty list or string means all arguments. What actually happens is that a local function of only the active
args
, callingfunction
, is generated bymkActArgFunction()
and that is differentiated instead.seed (1 or list) – Seed, derivative directions. When seed == 1, all derivative directions are computed. When seed is a list, then each entry must be a list or array of the same size as the total length of the active arguments. The function
nvars()
can compute that value.f_kw (dict) – Further keyword arguments that will be passed to
function
as**f_kw
. This wrapsfunction
withmkKwFunction()
.opts (dict) – Further options
opts
including also verbose, dump and dumpdir are stored in a global variabletransformOpts
. These global options are added to the options ofdoDiffFunction()
by each call toD()
in the subsequent process.
- Returns:
A tuple of the derivative and the function result. The derivative is a list with as many entries as there were seed directions.
- Return type:
A similar result cam be obtained for comparison with
DiffFD()
.
- pyadi.DiffFD(f, *args, active=[], seed=1, h=1e-08, f_kw={}, **opts)[source]
Evaluate derivatves using central finite differences.
The function f is called two times for each derivative direction provided by seed, to evaluate a central finite difference with step size h. The function f is called once more to compute the original result.
- Parameters:
f (function) – Function to differentiate.
args (list) – Function arguments.
h (float) – Step size, default 1e-8
active (list or str) – Active arguments, like [0,1], [‘x’, ‘y’], or a comma-separated string like ‘x,y’. The empty list or string means all arguments. What actually happens is that a local function of only the active
args
, callingfunction
, is generated bymkActArgFunction()
and that is differentiated instead.seed (1 or list) – Seed, derivative directions. When seed == 1, all derivative directions are computed. When seed is a list, then each entry must be a list or array of the same size as the total length of the active arguments. The function
nvars()
can compute that value.f_kw (dict) – Further keyword arguments that will be passed to
f
as**f_kw
.opts (dict) – Options, not used.
- Returns:
A tuple of the derivative and the function result. The derivative is a list with as many entries as there were seed directions.
- Return type:
When the source transformation fails with NoSource
, the
function setrule()
must be used, maybe together with
getrule()
and delrule()
, to install a runtime
handler for the function in question into the default rule module
forwardad
.
- pyadi.setrule(func, adfunc, mode='D')[source]
Install a runtime handler
adfunc
forfunc
withmode
.This is a function that gets called with the current original and derivative parameters in the form
adfunc(*args, **kw)
, whereargs
is a list of 2*N arguments, derivative and original arguments interleaved.Depending on
mode
theadfunc
will be handled differently, in the first two cases a wrapper function is produced:When
mode='D'
, the handler will be called asadfunc(r, *args, **kw)
, wherer
is the function result, and has to return just the derivatived_r
.When
mode='Dkw'
, the handler will be called asadfunc(r, d_kw, *args, **kw)
, where in additiond_kw, kw
is the result ofunjnd()
.When
mode='E'
, then the handler will be called directly asadfunc(*args, **kw)
, and has to return a tuple.
- pyadi.getrule(func, mode='D')[source]
Return the runtime handler
adfunc
forfunc
withmode
. Seesetrule()
for details.- Returns:
The currently installed handler.
- Return type:
function
- pyadi.delrule(func, mode='D')[source]
Delete the runtime handler for
func
withmode
. Seesetrule()
for details.- Returns:
The currently installed handler.
- Return type:
function
- exception pyadi.NoSource[source]
When no handler is found, AD will be performed behind the scenes, of
which the results can be shown by using the parameters verbose
or
dump
with DiffFor()
.
The main construct is the operator D()
, which produces a
differentiated function for any given function.
- pyadi.D(function, **opts)
An alias for
DiffFunction()
so the generated code can be shorter.
The function py()
produces the source code of the given
function as a string, which is done by getting the AST with
getast()
and unparsing it with unparse()
, which
uses a slightly modified version of astunparse()
, available
at https://github.com/aiandit/astunparse/archive/refs/heads/main.zip,
that is installed automatically via the requirements.txt
file.
- pyadi.py(func, info=False)[source]
- pyadi.getast(func, **kw)[source]
The source transformation is implemented as a series of tree
traversals, defined by the low-level AST transforming function
differentiate()
, and the more high-level function
doSourceDiff()
, the result of which can be seen with
Dpy()
. The result is an actual AST, of type
astunparse.ASTNode
, which has a str()
converter
that calls unparse()
, while the repr()
converter calls astunparse.unparse2j()
.
- pyadi.Dpy(func, active=[], **kw)[source]
- pyadi.differentiate(intree, activef=None, active=None, modules=None, filter=False, prefix=None, **kw)[source]
- pyadi.doSourceDiff(function, opts)[source]
A crucial function is dzeros()
, it is called in many
situations by the differentiated code and the results of it are very
important in several respects. For example, generic objects are not
cloned, but all their fields are set to zeros. Arrays from
numpy
are cloned with zeros()
,
however. Which means, should a third type behave as
zeros()
, you would have to overwrite dzeros()
(not yet possible). It is however not required to import
dzeros()
into your code, and likewise not the other runtime
functions unzd()
, joind()
, and unjnd()
,
which, in addition to D()
and Dc()
, the
differentiated code makes use of. For this reason, only
dzeros()
is exported, it may be useful when calling the
functions produced by D()
directly.
- pyadi.dzeros(args, lev=0)[source]
- pyadi.joind(ddl, dl)[source]
- pyadi.unzd(d)[source]
- pyadi.unjnd(d)[source]
An important function behind the scenes is also isbuiltin()
,
which basically checks if source code is available for the given
object, class, or function. It does so by calling
getmodule()
, which returns the module file name
- pyadi.isbuiltin(func)[source]
- pyadi.getmodule(func)[source]
The heavy lifting for getast()
, in the sense that there is a
cache of Python modules, and their one, entire, master AST, and a
mechanism to return clones of given functions or classes, is done by
getmoddict()
and resolveImports()
.
- pyadi.astvisitor.getmoddict(mod, **opts)[source]
Return dictionary with ASTs of module classes and functions. Resolve
from
imports withresolveImports()
.
- pyadi.astvisitor.resolveImports(mod, modfile, moddict, imports, modules, **opts)[source]
There are a couple of helper objects like, DWith
which is
a context manager that simply delegates to a tuple of context
managers.
The function resolution in PyADi is entrirely generic, the behaviour
of forwardad
can by overwritten by any other python module,
as for example by dummyad
, where the first item of each of the
tuples being past around is just any value that works.
The resulting functions can also be augmented, or decorated, by
further rule modules that (maybe) compose an additional wrapper
function of the function that the recieve from inner layers. Such
modules are of the same structure but they do not interfere in the
actual computations, for example timing
and
trace
.