• Higher Order Logic Programming and Autograd

    From Mild Shock@21:1/5 to All on Sat Mar 15 16:35:51 2025
    Somehow I shied away from implementing call/n for
    my new Prolog system. I thought that since my new Prolog
    system has only monomorphic caches, I would never be able
    to replicate what I did for my old Prolog system with its
    arity-polymorphic caches. This changed when I had the idea
    to dynamically add a cache for the duration of a higher
    order loop such as maplist/n, foldl/n etc.

    So this is the new implementation of maplist/3:

    % maplist(+Closure, +List, -List)
    maplist(C, L, R) :-
       sys_callable_cacheable(C, D),
       sys_maplist(L, D, R).

    % sys_maplist(+List, +Closure, -List)
    sys_maplist([], _, []).
    sys_maplist([X|L], C, [Y|R]) :-
       call(C, X, Y),
       sys_maplist(L, C, R).
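
    For comparison, here is a minimal sketch of the textbook
    maplist/3 without the reordering. With the closure in the
    first argument position, first argument indexing cannot
    select a clause by the list argument, so a choice point is
    left on every call:

    % textbook version, shown only for comparison
    maplist(_, [], []).
    maplist(C, [X|L], [Y|R]) :-
       call(C, X, Y),
       maplist(C, L, R).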

    The new maplist/3 is similar to the SWI-Prolog implementation
    in that it reorders the arguments for better first argument
    indexing. But the new thing is sys_callable_cacheable/2,
    which prepares the closure to be called more efficiently.
    The invocation of the closure is already quite fast, since
    call/3 is implemented natively, but the cache adds a touch
    more speed. Here are some measurements that I did:

    /* SWI-Prolog 9.3.20 */
    ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
          maplist(succ,L,_),fail; true)), fail.
    % 2,003,000 inferences, 0.078 CPU in 0.094 seconds

    /* Scryer Prolog 0.9.4-350 */
    ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
          maplist(succ,L,_),fail; true)), fail.
    % CPU time: 0.318s, 3_007_105 inferences

    /* Dogelog Player 1.3.1 */
    ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
          maplist(succ,L,_),fail; true)), fail.
    % Zeit 342 ms, GC 0 ms, Lips 11713646, Uhr 10.03.2025 09:18

    /* Trealla Prolog 2.64.6-2 */
    ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
          maplist(succ,L,_),fail; true)), fail.
    % Time elapsed 1.694s, 15004003 Inferences, 8.855 MLips

    Not surprisingly, SWI-Prolog is fastest. What was a little
    surprise is that Scryer Prolog can do it quite fast; possibly
    because they heavily use maplist/n all over the place, they
    came up with things like '$fast_call' etc. in their call/n
    implementation. Trealla Prolog is a little bit disappointing
    at the moment.

  • From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:36:48 2025
    What can we do with these new toys? We can implement vector
    operations and matrix operations, and then apply them, for
    example, to layered neural networks by representing them as:

    /**
     * Network is represented as [N0,M1,N1,...,Mn,Nn]
     * - Where N0 is the input neuron vector
     * - Where N1 .. Nn-1 are the hidden neuron vectors
     * - Where Nn is the output neuron vector
     * - Where M1 .. Mn are the transition weight matrices
     */

    ?- mknet([3,2], X).
    X = [''(-1, 1, 1), ''(''(1, 1, -1), ''(1, 1, -1)), ''(-1, 1)].
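
    The post does not show mknet/2 itself; a minimal sketch that
    produces the same shape of term, assuming the ''(..) representation
    above and a random weight initialization in [-1,1) (the actual
    initialization scheme is an assumption), could look like this:

    mknet([N|Ns], [V|Net]) :-
       randvec(N, V),
       sys_mknet(Ns, N, Net).

    sys_mknet([], _, []).
    sys_mknet([N|Ns], K, [M,V|Net]) :-
       length(Rows, N),             % N rows of K weights each
       maplist(randvec(K), Rows),
       M =.. [''|Rows],
       randvec(N, V),
       sys_mknet(Ns, N, Net).

    randvec(N, V) :-
       length(Ws, N),
       maplist(randweight, Ws),
       V =.. [''|Ws].

    randweight(W) :- random(R), W is 2*R-1.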

    The model evaluation at a data point
    is straightforward:

    eval([V], [V]) :- !.
    eval([V,M,_|L], [V,M|R]) :- !,
       matmul(M, V, H),
       vecact(H, expit, J),
       eval([J|L], R).

    The backward calculation of the deltas
    is straightforward:

    back([V], U, [D]) :- !,
       vecact(U, V, sub, E),
       vecact(E, V, mulderiv, D).
    back([V,M,W|L], U, [D2,M,D|R]) :-
       back([W|L], U, [D|R]),
       mattran(M, M2),
       matmul(M2, D, E),
       vecact(E, V, mulderiv, D2).

    You can use this to compute weight changes
    and drive a gradient algorithm.
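
    The helper predicates matmul/3, vecact/3, vecact/4 and mattran/2
    are not shown in the post. A minimal sketch of them, assuming the
    ''(..) representation from the mknet/2 example and covering only
    the matrix-vector product that eval and back need, could be:

    vec_list(V, L) :- V =.. [''|L].

    % vecact(+Vector, +Closure, -Vector): apply a closure elementwise
    vecact(V, F, R) :-
       vec_list(V, Xs),
       maplist(F, Xs, Ys),
       vec_list(R, Ys).

    % vecact(+Vector, +Vector, +Closure, -Vector): combine elementwise
    vecact(V, W, F, R) :-
       vec_list(V, Xs),
       vec_list(W, Ys),
       maplist(F, Xs, Ys, Zs),
       vec_list(R, Zs).

    % matmul(+Matrix, +Vector, -Vector): one dot product per row
    matmul(M, V, R) :-
       vec_list(M, Rows),
       maplist(dot(V), Rows, Zs),
       vec_list(R, Zs).

    dot(V, W, Z) :-
       vec_list(V, Xs),
       vec_list(W, Ys),
       sys_dot(Xs, Ys, 0, Z).

    sys_dot([], [], Z, Z).
    sys_dot([X|Xs], [Y|Ys], A, Z) :-
       B is A+X*Y,
       sys_dot(Xs, Ys, B, Z).

    % mattran(+Matrix, -Matrix): transpose
    mattran(M, T) :-
       vec_list(M, Rows),
       maplist(vec_list, Rows, Ls),
       sys_transpose(Ls, Ts),
       maplist(vec_list, Cols, Ts),
       vec_list(T, Cols).

    sys_transpose([], []).
    sys_transpose([[]|_], []) :- !.
    sys_transpose(Ls, [Hs|Rest]) :-
       maplist(sys_head_tail, Ls, Hs, Tls),
       sys_transpose(Tls, Rest).

    sys_head_tail([H|T], H, T).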

  • From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:37:37 2025
    But where is autograd, the automatic derivation from
    some symbolic input? In general you can objectify
    neural networks, which I already did with the Prolog
    list, and routines such as back/3 are pure Prolog.
    Basically you could symbolically derive expit
    (the activation), mulderiv (the product with the derivative
    of the activation) and mattran (the Jacobian without
    activation) from a DAG of vector functions. In a linear
    neural network, the Jacobian without activation is
    the same as the weights, and expit has a simple derivative
    that is based on the expit result itself, which is
    already stored as the activation:

    /* g(x) = logistic function */
    expit(X, Y) :- Y is 1/(1+exp(-X)).

    /* g'(x) = g(x)*(1-g(x)) */
    mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).
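
    The elementwise difference used by back/3 via vecact/4 is not
    shown either; presumably it is just the following (an assumption):

    /* e = u - v, target minus output */
    sub(X, Y, Z) :- Z is X-Y.
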
    See also:

    A Gentle Introduction to torch.autograd https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

  • From Mild Shock@21:1/5 to Mild Shock on Sun Mar 16 23:00:47 2025
    Ok, some progress report here. I currently have a
    library(linear) in the works, which is only a few
    lines of code, but it provides vectors and matrices.
    One can use the library to define matrix exponentiation:

    matexp(M, 1, M) :- !.
    matexp(M, N, R) :- N mod 2 =:= 0, !,
       I is N // 2,
       matexp(M, I, H),
       matmul(H, H, R).
    matexp(M, N, R) :-
       I is N-1,
       matexp(M, I, H),
       matmul(H, M, R).

    And then do fancy stuff, like answering the question:
    what are the last 8 digits of fibonacci(1000000)?

    ?- time((fib(1000000, _X), Y is _X mod 10^8)).
    % Zeit 28 ms, GC 0 ms, Lips 88857, Uhr 16.03.2025 22:48
    Y = 42546875

    The 28 ms execution time is not bad, given that the modulo was
    not integrated into matexp/3, so it computes the full
    fibonacci(1000000) before taking the modulo. Not sure whether
    JavaScript bigint is faster or slower than GMP?
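
    The fib/2 predicate itself is not shown; one way to get it from
    matexp/3, assuming matmul/3 also handles matrix-matrix products and
    the ''(..) representation, is the standard 2x2 Fibonacci matrix
    [[1,1],[1,0]], whose N-th power holds fib(N) off the diagonal:

    fib(N, X) :-
       matexp(''(''(1,1), ''(1,0)), N, M),
       arg(1, M, Row),      % first row is ''(fib(N+1), fib(N))
       arg(2, Row, X).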

    So what can we do with library(linear) besides implementing
    eval/3 and back/3? We can finally update a neural network,
    and do this iteratively, using a very simple random pick
    to choose a training data sample:

    update([V], _, [V]) :- !.
    update([V,M|L], [_,M3|R], [V,M4|S]) :-
       maplist(maplist(compose(add,mul(0.1))), M3, M, M4),
       update(L, R, S).

    iter(0, _, N, N) :- !.
    iter(I, Z, N, M) :-
       random(R), K is floor(R*4)+1,
       call_nth(data(Z, X, Y), K),
       eval(N, X, U),
       back(U, Y, V),
       update(U, V, W),
       J is I-1,
       iter(J, Z, W, M).

    Disclaimer: This is only a proof of concept. It most likely
    doesn't have all the finesse of Python's torch.autograd. Also,
    it uses a very simple update of the weights via μ·Δwij with
    μ = 0.1. But you can already use it to learn an AND

    or to learn an XOR.
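
    The data/3 facts and the starting network are not shown; hypothetical
    AND training data matching the four-sample random pick above, with an
    extra input fixed to 1 simulating the bias, might look like this:

    data(and, ''(0,0,1), ''(0)).
    data(and, ''(0,1,1), ''(0)).
    data(and, ''(1,0,1), ''(0)).
    data(and, ''(1,1,1), ''(1)).

    /* hypothetical training run: 1000 iterations from a random 3-to-1 net */
    ?- mknet([3,1], N), iter(1000, and, N, M).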



  • From Mild Shock@21:1/5 to Mild Shock on Sun Mar 16 23:10:55 2025
    What made me do the little prototype? Try this one,
    it has a little Java code. But it's a somewhat ancient
    technology, using the sigmoid activation function. And
    it seems to me it uses some graph data structure:

    Neural Networks
    Rolf Pfeifer et al. - 2012

    https://www.ifi.uzh.ch/dam/jcr:00000000-7f84-9c3b-ffff-fffffb34b58a/NN20120315.pdf

    I guess it corresponds to this here, which is a SWI-Prolog and C
    hybrid, when using FANN_SIGMOID:

    FANN - Fast Artificial Neural Network
    Package for SWI-Prolog - 2018
    https://www.swi-prolog.org/pack/list?p=plfann

    Translating the Java code from the Pfeifer paper to Prolog,
    into linear algebra using vectors and matrices, I now have
    a little piece of pure Prolog code, which also runs in the
    browser, that can already learn an AND, and it's using the
    ReLU activation function, i.e. not the FANN_SIGMOID
    activation function anymore.

    I simulated the bias by an extra input neuron
    which is always 1, because I was too lazy to have
    a bias in the model. Sample output:

    A --  0.99 ---\
                   \
    B --  0.99 -----+-- ReLU -->
                   /
    1 -- -0.98 ---/
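
    With ReLU instead of the logistic function, the two activation
    hooks from the earlier post would presumably become something
    like the following (the ReLU version of the code is not shown,
    so this is an assumption):

    /* g(x) = max(0, x), replacing expit/2 */
    relu(X, Y) :- Y is max(0, X).

    /* g'(x) = 1 for x > 0 else 0, replacing mulderiv/3; the
       derivative is read off the stored activation Y */
    relu_mulderiv(X, Y, Z) :- (Y > 0 -> Z = X ; Z = 0).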

    It can also learn an XOR. Libraries such as PyTorch
    cooperate with optimizer libraries that provide a
    variety of gradient search methods. One needs
    to study how these libraries are architected so that
    they provide plug and play. Maybe one can bring the
    same architecture to Prolog:

    A Gentle Introduction to torch.autograd

    Next, we load an optimizer, in this case SGD with a
    learning rate of 0.01 and momentum of 0.9. We register all
    the parameters of the model in the optimizer.

    optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

    https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
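
    One possible direction, sketched here only as an idea: generalize
    update/3 from the earlier post so the step rule is a closure that
    is plugged in via call/n, in the spirit of torch.optim (the names
    update/4 and sgd/4 are hypothetical, and G stands for the per-weight
    gradient matrices):

    /* plain SGD step with learning rate Mu */
    sgd(Mu, G, W0, W1) :- W1 is W0+Mu*G.

    update(_, [V], _, [V]) :- !.
    update(Opt, [V,M|L], [_,G|R], [V,M2|S]) :-
       maplist(maplist(Opt), G, M, M2),
       update(Opt, L, R, S).

    /* used in iter/4 as: update(sgd(0.1), U, V, W) */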
