Proceeding Paper

Quantum Finite Automata and Quiver Algebras †

Department of Mathematics and Statistics, Boston University, 111 Cummington Mall, Boston, MA 02215, USA
*
Author to whom correspondence should be addressed.
Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.
Phys. Sci. Forum 2022, 5(1), 32; https://doi.org/10.3390/psf2022005032
Published: 14 December 2022

Abstract

We apply the ideas and results of [JL21] and [JL22] to quantum finite automata. We reformulate quantum finite automata with multiple-time measurements using the algebraic notion of a near-ring. This gives a unified understanding of quantum computing and deep learning. When the near-ring comes from a quiver, we obtain a nice moduli space of computing machines with a metric that can be optimized by gradient descent.

1. Motivation: QFA and Near-Ring

In quantum theory, the evolution of states is unitary. An observable is modeled by a self-adjoint operator whose eigenvalues are the possible output values, and whose eigenvectors form an orthonormal basis of the state space. As a result, a typical quantum model simply consists of linear operators that form an algebra.
However, when passing from the quantum world to the real world, an actual probabilistic projection to an eigenstate is necessary. Such a probabilistic operation destroys the linear structure, so we need a non-linear (meaning non-distributive) algebraic structure to accommodate such operators. In the 20th century, there were several attempts to solve this problem; see for instance [1,2], and [3] (Chapter 3) for a nice survey. In particular, Pascual Jordan attempted to use near-rings for quantum mechanics.
Let us consider the scenario of quantum computing.
Definition 1
([4]). A quantum finite automaton (QFA) is a tuple $Q = (V, q_0, F, \Sigma, (U_\sigma)_{\sigma \in \Sigma})$ where:
1. $V$ is a finite set of states which generates the Hilbert space $H_V$;
2. $F \subseteq V$ is a set of final or accept states;
3. $q_0$ is the initial state, a unit vector in $H_V$;
4. $\Sigma$ is a finite set called the alphabet;
5. for each $\sigma \in \Sigma$, $U_\sigma$ is a unitary operator on $H_V$.
An input to a QFA consists of a word $w$ in the alphabet $\Sigma$, of the form $w = w_1 w_2 \cdots w_n$ where $w_i \in \Sigma$ for all $i$. The word $w$ acts on the initial state of the QFA by $\langle q_0 |\, U_w$, where $U_w := U_{w_1} U_{w_2} \cdots U_{w_n}$ and $\langle q_0 |$ is the row-vector presentation of $q_0$. The probability that the word $w$ ends in an accept state is
$$\Pr(w) = \left\| \langle q_0 |\, U_w\, P \right\|^2,$$
where $P : H_V \to H_F$ is the projection from $H_V$ to the subspace $H_F$ spanned by $F$.
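To make the definition concrete, the following is a minimal numerical sketch of Definition 1 (our own illustration in Python with numpy; the two-state automaton, its alphabet and its unitaries are hypothetical choices, not taken from the references):

import numpy as np

# A hypothetical 2-state QFA over the alphabet {a, b}: H_V = C^2, the accept set F
# is spanned by the second basis vector, and each letter acts by a unitary matrix.
theta = np.pi / 4
U = {"a": np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]]),   # a rotation, hence unitary
     "b": np.array([[0.0, 1.0],
                    [1.0, 0.0]])}                         # a swap, hence unitary
q0 = np.array([1.0, 0.0])                                 # initial unit vector in H_V
P = np.diag([0.0, 1.0])                                   # projection onto H_F

def accept_probability(word):
    row = q0.copy()                       # row-vector presentation <q0|
    for letter in word:
        row = row @ U[letter]             # <q0| U_{w_1} ... U_{w_n}
    return float(np.linalg.norm(row @ P) ** 2)

print(accept_probability("ab"))           # Pr(w) = ||<q0| U_w P||^2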
Please note that the above definition has not taken the probabilistic projection into account. We make the following reformulation.
Definition 2.
A quantum computing machine is a tuple $((H_V, h), H_F, e, \rho_G, \sigma_F)$, where
1. $(H_V, h)$ is a Hermitian vector space;
2. $H_F = \mathbb{C}^n$, equipped with the standard metric, is called the framing space;
3. $e : H_F \to H_V$ is an isometric embedding;
4. $\rho_G : G \to U(H_V, h)$ is a unitary representation of a group $G$;
5. $\sigma_F : H_F \to H_F$ is a probabilistic projection.
Ignoring the last item (5) for the moment, this coincides with Definition 1 by taking $G$ to be the free group generated by the set $\Sigma$ and fixing an initial vector $q_0 \in H_V$.
Here, we treat $H_F$ as a vector space of its own and take an isometric embedding $e : H_F \to H_V$, rather than directly identifying $H_F$ as a subspace of $H_V$. The state space $H_V$ is treated as an abstract vector space without a preferred basis, while $H_F$ is equipped with a fixed basis that has a real physical meaning (like up/down spin of an electron). The framing map $e : H_F \to H_V$ is interpreted as a bridge between the classical and the quantum world; the image of the fixed basis under $e$ determines a subset of pure state vectors of a certain observable. The adjoint $e^* : H_V \to H_F$ is an orthogonal projection. In the next section, $e$ is no longer required to be an embedding when we consider non-unitary generalizations for machine learning.
For the last item (5), the probabilistic projection $\sigma_F : H_F \to H_F$ can be modeled by a probability space. Specifically, consider an $H_F$-family of random variables
$$k : H_F \times \Omega \to \{1, \dots, |F|\},$$
where $\Omega$ is a probability space (equipped with a probability measure), with the assumption that $\Pr(k(v) = j) = \left| \left\langle \tfrac{v}{\|v\|},\, \epsilon_j \right\rangle \right|^2$ for every $v \in H_F$, where $\epsilon_j \in H_F$ denotes the $j$-th basis vector. Then $\sigma_F(v) := \epsilon_{k(v)}$.
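For illustration, $\sigma_F$ can be simulated by sampling a basis vector with the Born-rule weights (a small Python sketch of our own; the sample vector and random seed are hypothetical):

import numpy as np

rng = np.random.default_rng(0)

def sigma_F(v):
    # sample k(v) = j with probability |<v/||v||, epsilon_j>|^2, then return epsilon_{k(v)}
    probs = np.abs(v / np.linalg.norm(v)) ** 2
    j = rng.choice(len(v), p=probs)
    out = np.zeros_like(v, dtype=float)
    out[j] = 1.0
    return out

v = np.array([3.0, 4.0])
print(sigma_F(v))     # epsilon_1 with probability 9/25, epsilon_2 with probability 16/25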
The major additional ingredients in Definition 2, compared to Definition 1, are $e$, $e^*$ and $\sigma_F$. Please note that they are not yet included in the machine language, which is currently the group $G$. Since $e$ and $\sigma_F$ are not invertible, we cannot enlarge $G$ as a group to include $e$ or $\sigma_F$.
To remedy this, first note that item (4) can be replaced by an algebra rather than a group; an algebra is linear and its elements need not be invertible. Specifically, we require instead:
(4')
$\rho_A : A \to \mathrm{End}(H_V)$ is an algebra homomorphism for an algebra $A$ (with unit $1_A$).
For instance, $A$ can be the free algebra generated by a set $\Sigma$.
With such a modification, we can easily include the framing $e$ and $e^*$ in our language by taking the augmented algebra
$$\hat{A} = A \langle 1_F, e, e^* \rangle / R,$$
where $R$ is generated by the relations $1_F \cdot 1_F = 1_F$, $1_F \cdot e^* = e^*$, $e \cdot 1_F = e$, $1_A \cdot e = e$, $e \cdot e = 0$, $e^* \cdot e^* = 0$, $1_F \cdot e = 0$, $1_F \cdot a = 0$, $e \cdot 1_A = 0$, $e^* \cdot 1_F = 0$, $1_A \cdot e^* = 0$ for any $a \in A$. The unit of $\hat{A}$ is $1_A + 1_F$.
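One way to see that these relations are consistent (our own illustrative sketch, not part of the original construction) is to realize the generators as block operators on $H_V \oplus H_F$; the relations above can then be checked by block-matrix multiplication:
$$a \mapsto \begin{pmatrix} \rho_A(a) & 0 \\ 0 & 0 \end{pmatrix}, \qquad e \mapsto \begin{pmatrix} 0 & e \\ 0 & 0 \end{pmatrix}, \qquad e^* \mapsto \begin{pmatrix} 0 & 0 \\ e^* & 0 \end{pmatrix}, \qquad 1_F \mapsto \begin{pmatrix} 0 & 0 \\ 0 & \mathrm{id}_{H_F} \end{pmatrix},$$
with the unit $1_A + 1_F$ mapping to the identity on $H_V \oplus H_F$.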
However, we cannot further enlarge $\hat{A}$ to include $\sigma_F$ as an algebra. The reason is that $\sigma_F$ always maps to unit vectors and hence cannot be linear:
$$\sigma_F(v + w) \neq \sigma_F(v) + \sigma_F(w).$$
To extend $\hat{A}$ by $\sigma_F$, which models actual quantum measurement, we need the notion of a near-ring. It is a set $N$ with two binary operations $+$ and $\circ$ such that $N$ is a group under $+$, $\circ$ is associative, and right multiplication is distributive over addition: $(x + y) \circ z = x \circ z + y \circ z$ for all $x, y, z \in N$ (but left multiplication is not required to be distributive: $z \circ (x + y) \neq z \circ x + z \circ y$ in general).
Define $\widetilde{A}$ to be the near-ring
$$\widetilde{A} := (1_F + e^* \cdot \hat{A} \cdot e)\{\sigma_F\}.$$
This near-ring can be understood as the language that controls quantum computing machines. Elements of $\widetilde{A}$ can be recorded as rooted trees. An example is $a_1\, \sigma_F(a_{11} + a_{12})$, where $a_1, a_{11}, a_{12} \in (1_F + e^* \cdot \hat{A} \cdot e)$. See also the tree on the left-hand side of Figure 1.
The advantage of putting all the algebraic structures into a single near-ring is that we can consider all the quantum computing machines (mathematically, $\widetilde{A}$-modules) controlled by a single near-ring at the same time. An element of $\widetilde{A}$ is a quantum algorithm, which can run on all quantum computers controlled by $\widetilde{A}$.

2. Near-Ring and Differential Forms

In the setting of Definition 1 and Example 1, it is natural to relax the representations from unitary groups to matrix algebras $\mathrm{gl}(n, \mathbb{C})$. Moreover, the quantum measurement can also be simulated by a non-linear function (called an activation function). Such a modification produces a computational model of deep learning.
Definition 3.
An activation module consists of:
1. a noncommutative algebra $A$ and vector spaces $V$, $F$;
2. a family of metrics $h_{(\rho, e)}$ on $V$ over the space of framed $A$-modules
$$R = \mathrm{Hom}_{\mathrm{alg}}(A, \mathrm{End}(V)) \times \mathrm{Hom}(F, V),$$
which is $G$-equivariant, where $G = \mathrm{GL}(V)$;
3. a collection of possibly non-linear functions $\sigma_j^F : F \to F$.
As in Section 1, we take the augmented algebra $\hat{A} = A \langle 1_F, e, e^* \rangle / R$, which produces linear computations in all framed $A$-modules simultaneously. With item (3), elements in the near-ring
$$\widetilde{A} := (1_F + e^* \cdot \hat{A} \cdot e)\{\sigma_1^F, \dots, \sigma_N^F\}$$
induce non-linear functions on $F$, and so they are called non-linear algorithms. An example of how $\widetilde{A}$ induces non-linear functions on $F$ upon fixing a point in $R$ is given in Example 1.
$R = \mathrm{Hom}_{\mathrm{alg}}(A, \mathrm{End}(V)) \times \mathrm{Hom}(F, V)$ is understood as a family of computing machines: a point $(w, e) \in R$ fixes how $A$ acts on $V$ together with the framing map $e \in \mathrm{Hom}(F, V)$, and hence entirely determines how an algorithm runs in the machine corresponding to $(w, e)$.
Let us emphasize that the state space $V$ is basis-free. The family of metrics is $\mathrm{GL}(V)$-equivariant: $h_{(\rho, e)}(v, w) = h_{g \cdot (\rho, e)}(g \cdot v, g \cdot w)$ for any $g \in \mathrm{GL}(V)$. Thus, given $a \in \widetilde{A}$, the non-linear functions that $a$ induces on the two machines $r \in R$ and $g \cdot r \in R$ are equal. In other words, an algorithm $a \in \widetilde{A}$ drives all machines parametrized by the moduli stack $[R/\mathrm{GL}(V)]$ to produce functions on $F$:
$$\widetilde{A} \times [R/\mathrm{GL}(V)] \to \mathrm{Map}(F, F).$$
As mentioned above, the advantage is that the single near-ring $\widetilde{A}$ controls all machines in $[R/\mathrm{GL}(V)]$, for all $V$ simultaneously (independent of $\dim V$).
In [5], we formulated noncommutative differential forms on a near-ring $\widetilde{A}$, which induce $\mathrm{Map}(F, F)$-valued differential forms on the moduli $[R/\mathrm{GL}(V)]$. This construction extends the Karoubi-de Rham complex [6,7,8,9] from algebras to near-rings. The map $\widetilde{A} \times [R/\mathrm{GL}(V)] \to \mathrm{Map}(F, F)$ above is the special case of 0-forms, which are simply elements of $\widetilde{A}$. The cases of 0-forms and 1-forms are particularly important for gradient descent: recall that the gradient of a function is the metric dual of the differential of that function.
Theorem 1
([5]). There exists a degree-preserving map
$$DR(\widetilde{A}) \to \left( \Omega(R, \mathrm{Map}(F, F)) \right)^{\mathrm{GL}(V)}$$
which commutes with $d$ on the two sides.
A differential form on $\widetilde{A}$ can be recorded as a form-valued tree; see the right-hand side of Figure 1. These are rooted trees whose edges are labeled by $\phi \in DR(\mathrm{Mat}_F(\hat{A}))$; leaves are labeled by $\alpha \in \widetilde{A}$; the root is labeled by $1$ (if it is not a leaf); and nodes which are neither leaves nor the root are labeled by the symbols $D\sigma^{(p)}|_\alpha$, which correspond to the $p$-th order symmetric differentials of $\sigma$.
In applications to machine learning, an algorithm $\tilde{\gamma} \in \widetilde{A}$ induces a 0-form of $\widetilde{A}$, for instance
$$\int_K \left\| \tilde{\gamma}(x) - f(x) \right\|^2 dx$$
for a given dataset encoded as a function $f : K \to \mathbb{R}$. This 0-form and its differential induce the cost function and its differential on $[R/G]$, respectively, which are the central objects in machine learning.
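As a down-to-earth sketch (our own illustration; the one-parameter machine, the ReLU activation and the sample dataset are hypothetical), the cost 0-form can be approximated by a mean-squared error over sample points in $K$:

import numpy as np

def cost(gamma, f, xs):
    # discretization of \int_K ||gamma(x) - f(x)||^2 dx as a mean over sample points
    return float(np.mean([(gamma(x) - f(x)) ** 2 for x in xs]))

def make_gamma(w, e):
    # a hypothetical one-neuron machine: gamma(x) = w * relu(e * x)
    return lambda x: w * max(e * x, 0.0)

xs = np.linspace(-1.0, 1.0, 201)            # sample points in K = [-1, 1]
f = lambda x: max(x, 0.0)                   # dataset encoded as a function f : K -> R
print(cost(make_gamma(0.5, 1.0), f, xs))    # value of the cost at the machine (w, e)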
The differential forms are $G$-equivariant by construction. There have been many recent works on learning for input datasets that carry Lie group symmetry [10,11,12,13,14,15,16]. Our work, on the other hand, focuses on the internal symmetry of the computing machine.
In general, the existence of fine moduli is a deep problem in mathematics: the moduli stack $[R/G]$ may be singular and pose difficulties for applying gradient descent. Fortunately, if $A$ is a quiver algebra, its moduli space of framed quiver representations $[R/G]$ is a smooth manifold $\mathcal{M}$ (with respect to a chosen stability condition) [17]. This leads us to the deep learning setting explained in the next section.

3. Deep Learning over the Moduli Space of Quiver Representations

An artificial neural network (see Figure 2 for a simple example) consists of:
  • a graph $Q = (Q_0, Q_1)$, where $Q_0$ is a (finite) set of vertices (neurons) and $Q_1$ is a (finite) set of arrows starting and ending at vertices in $Q_0$ (transmissions between neurons);
  • a quiver representation of $Q$, which associates a vector space $V_i$ to each vertex $i$ and a linear map $w_a$ (called a weight) to each arrow $a$; we denote by $t(a)$ and $h(a)$ the tail and head of an arrow $a$, respectively;
  • a non-linear function $V_i \to V_i$ for each vertex $i$ (called the activation function of the neuron).
Activation functions are an important ingredient of neural networks; on the other hand, they rarely appear in quiver theory. Their presence allows the neural network to produce non-linear functions.
Remark 1.
Recently there has been rising interest in the relations between machine learning and quiver representations [5,18,19,20]. Here, we simply take quiver representations as part of the formulation of an artificial neural network.
In many applications, the dimension vector $d \in \mathbb{Z}_{\geq 0}^{Q_0}$ is set to be $(1, \dots, 1)$, i.e., all vector spaces $V_i$ associated with the vertices are one-dimensional. For us, this is an unnecessary constraint, and we allow $d$ to be any fixed integer vector.
Any non-trivial non-linear function $V_i \to V_i$ cannot be $\mathrm{GL}(V_i)$-equivariant. However, in quiver theory, $V_i$ is understood as a basis-free vector space, which requires $\mathrm{GL}(V_i)$-equivariance. We resolved this conflict between neural networks and quiver theory in [19] using framed quiver representations. The key idea is to put the non-linear function on the framing rather than on the basis-free vector spaces $V_i$.
Combining with the setting of the last section (Definition 2), we take:
  • $A = \mathbb{C}Q$, the quiver algebra. Its elements are formal linear combinations of paths in $Q$ (including the trivial paths at vertices), and the product is given by concatenation of paths.
  • $V = \bigoplus_{i \in Q_0} V_i$, the direct sum of the vector spaces over all vertices.
  • Each vertex $i$ is associated with a framing vector space $F_i$. Then $F = \bigoplus_{i \in Q_0} F_i$.
  • Each point $(w, e) \in R = \mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V)) \times \prod_{i \in Q_0} \mathrm{Hom}(F_i, V_i)$ is a framed quiver representation. Specifically, $w \in \mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V))$ associates a matrix $w_a$ to each arrow $a$ of $Q$, and $e^{(i)} \in \mathrm{Hom}(F_i, V_i)$ are the framing linear maps.
  • The group $G$ is taken to be $\prod_{i \in Q_0} \mathrm{GL}(V_i)$. An element $g = (g_i)_{i \in Q_0}$ acts on $R$ by
$$g \cdot \left( (w_a)_{a \in Q_1},\, (e^{(i)})_{i \in Q_0} \right) = \left( (g_{h(a)} \cdot w_a \cdot g_{t(a)}^{-1})_{a \in Q_1},\, (g_i \cdot e^{(i)})_{i \in Q_0} \right).$$
  • We have (possibly non-linear) maps $\sigma_i : F_i \to F_i$ for each vertex. To match the notation of Definition 2, $\sigma_i$ can be taken as maps $F \to F$ by extension by zero.
By the celebrated result of [17], we have a fine moduli space of framed quiver representations $\mathcal{M} = \mathcal{M}_{n,d} = R /\!\!/ G$, where $n, d$ are the dimension vectors of the framing $\{F_i\}_{i \in Q_0}$ and of the representation $\{V_i\}_{i \in Q_0}$, respectively. In particular, we have the universal vector bundles $\mathcal{V}_i$ over $\mathcal{M}$, whose fiber over each framed representation $[w, e] \in \mathcal{M}$ is the representing vector space $V_i$ at the vertex $i$.
$\mathcal{V}_i$ plays an important role in our computational model: a vector $v \in \mathcal{V}_i$ over a point $[w, e] \in \mathcal{M}$ is the state of the $i$-th neuron in the machine parametrized by $[w, e]$.
Remark 2.
The topology of $\mathcal{M}$ is well understood: [21] describes it as an iterated Grassmannian bundle. Framed quiver representations and their doubled counterparts play an important role in geometric representation theory [22,23].
To fulfill Definition 2 (see Item (2)), we need to equip each $\mathcal{V}_i$ with a bundle metric $h_i$, so that the adjoint $e^*$ makes sense. In [19], we found a bundle metric that is written purely in terms of the algebra. This means the formula works for the (infinitely many) quiver moduli of all dimension vectors of representations simultaneously.
Theorem 2
([19]). For a fixed vertex $i$, let $\rho_i$ be the row vector whose entries are all the elements of the form $w_\gamma\, e^{(t(\gamma))} : R_{n,d} \to \mathrm{Hom}(\mathbb{C}^{n_{t(\gamma)}}, \mathbb{C}^{d_i})$ such that $h(\gamma) = i$. Consider
$$H_i := \rho_i\, \rho_i^* = \sum_{h(\gamma) = i} w_\gamma\, e^{(t(\gamma))} \left( w_\gamma\, e^{(t(\gamma))} \right)^*$$
as a map $\rho_i \rho_i^* : R_{n,d} \to \mathrm{End}(\mathbb{C}^{d_i})$. Then $(\rho_i \rho_i^*)^{-1}$ is $\mathrm{GL}(d)$-equivariant and descends to a Hermitian metric on $\mathcal{V}_i$ over $\mathcal{M}$.
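The following self-contained numerical check (our own illustration, with a hypothetical vertex $i$ having two incoming arrows and all dimensions equal to 2) verifies the $\mathrm{GL}(d)$-equivariance of the metric $(\rho_i \rho_i^*)^{-1}$ stated in Theorem 2:

import numpy as np

rng = np.random.default_rng(2)
# framings e^(i), e^(j1), e^(j2) and arrow weights w_b1 : V_j1 -> V_i, w_b2 : V_j2 -> V_i
e_i, e_j1, e_j2 = (rng.standard_normal((2, 2)) for _ in range(3))
w_b1, w_b2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))

def H(e_i, w_b1, e_j1, w_b2, e_j2):
    # rho_i has entries e^(i) (trivial path), w_b1 e^(j1), w_b2 e^(j2); H_i = rho_i rho_i^*
    terms = [e_i, w_b1 @ e_j1, w_b2 @ e_j2]
    return sum(t @ t.T for t in terms)

# apply g = (g_i, g_j1, g_j2): w_b -> g_i w_b g_j^{-1}, e^(k) -> g_k e^(k)
g_i, g_j1, g_j2 = (rng.standard_normal((2, 2)) for _ in range(3))
H0 = H(e_i, w_b1, e_j1, w_b2, e_j2)
H1 = H(g_i @ e_i, g_i @ w_b1 @ np.linalg.inv(g_j1), g_j1 @ e_j1,
       g_i @ w_b2 @ np.linalg.inv(g_j2), g_j2 @ e_j2)

v, u = rng.standard_normal(2), rng.standard_normal(2)
print(np.allclose(v @ np.linalg.inv(H0) @ u,
                  (g_i @ v) @ np.linalg.inv(H1) @ (g_i @ u)))   # True: the metric descends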
Example 1.
Consider the network in Figure 2. The quiver has the arrows $a_{1,k}^{(1)}, a_{2,k}^{(1)}$ for $k = 1, \dots, s$ (between the input and hidden layers) and $a_k^{(2)}$ (between the hidden and output layers). In applications, we consider the algorithm
$$\tilde{\gamma} := \sum_{k=1}^{s} \hat{a}_k^{(2)}\, \sigma_k\!\left( \sum_{j=1}^{2} \hat{a}_{jk}^{(1)} \right) \in \widetilde{A},$$
where $\hat{a}_{jk}^{(1)} := e^{(k)*}\, a_{jk}^{(1)}\, e^{(\mathrm{in}_j)}$ and $\hat{a}_k^{(2)} := e^{(\mathrm{out})*}\, a_k^{(2)}\, e^{(k)}$. Please note that the adjoints $e^{(k)*}$ and $e^{(\mathrm{out})*}$ are taken with respect to the metrics $H_k$ and $H_{\mathrm{out}}$, respectively.
$\tilde{\gamma}$ is recorded by the activation tree on the left-hand side of Figure 1 for the case $s = 2$. $\tilde{\gamma}$ drives any activation module (with this given quiver algebra) to produce a function $F_{\mathrm{in}_1} \times F_{\mathrm{in}_2} \to F_{\mathrm{out}}$. For instance, setting the representing dimension to be 1 and taking $\sigma_i$ to be the ReLU function $\max\{x, 0\}$ on $\mathbb{R}$ is a popular choice. Data passes from the leaves to the root, which is called forward propagation.
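For concreteness, here is a minimal forward-propagation sketch of this algorithm for $s = 2$ with all representing and framing dimensions equal to 1 (our own illustration with hypothetical weights; for simplicity the adjoints are taken with respect to the standard metric rather than the metrics $H_k$, $H_{\mathrm{out}}$ of Theorem 2, which reduces the computation to the familiar two-layer network):

import numpy as np

relu = lambda x: np.maximum(x, 0.0)

w1 = np.array([[0.7, -0.3],     # w1[k, j] = weight of the arrow a_{jk}^(1) : in_j -> hidden_k
               [0.2,  0.5]])
w2 = np.array([0.4, -0.6])      # w2[k]    = weight of the arrow a_k^(2) : hidden_k -> out

def gamma(x):
    # forward propagation from the leaves to the root of the activation tree
    hidden = relu(w1 @ x)       # sigma_k( sum_j a-hat_{jk}^(1) ) at each hidden neuron
    return w2 @ hidden          # sum_k a-hat_k^(2) applied to the activated values

print(gamma(np.array([1.0, 2.0])))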
Figure 1 shows the differential of $\tilde{\gamma} \in \widetilde{A}$. This 1-form is given by
$$d\left( a_1\, \sigma_1(a_{11} + a_{12}) + a_2\, \sigma_2(a_{21} + a_{22}) \right)$$
$$= da_1\, \alpha_1 + a_1\, D\sigma_1^{(1)}\big|_{\alpha_1}(da_{11} + da_{12}) + da_2\, \alpha_2 + a_2\, D\sigma_2^{(1)}\big|_{\alpha_2}(da_{21} + da_{22}),$$
where $\alpha_j = \sigma_j(a_{j1} + a_{j2})$. The terms are obtained by starting at the output node and moving backwards through the activation tree, which is well known as the backpropagation algorithm. Please note that this works at the algebraic level and is not specific to any representation.
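As a scalar sanity check of this expansion (our own illustration, using $\sigma = \tanh$ and hypothetical values), the coefficients of $da_1$ and $da_{11}$ agree with numerical differentiation:

import numpy as np

sigma, dsigma = np.tanh, lambda t: 1.0 - np.tanh(t) ** 2
a1, a11, a12, a2, a21, a22 = 0.3, 0.8, -0.2, -0.5, 0.1, 0.4
f = lambda a1, a11, a12, a2, a21, a22: a1 * sigma(a11 + a12) + a2 * sigma(a21 + a22)

coeff_da1  = sigma(a11 + a12)            # alpha_1, the coefficient of da_1
coeff_da11 = a1 * dsigma(a11 + a12)      # backpropagated coefficient of da_11

h = 1e-6
fd_da1  = (f(a1 + h, a11, a12, a2, a21, a22) - f(a1 - h, a11, a12, a2, a21, a22)) / (2 * h)
fd_da11 = (f(a1, a11 + h, a12, a2, a21, a22) - f(a1, a11 - h, a12, a2, a21, a22)) / (2 * h)
print(np.allclose([coeff_da1, coeff_da11], [fd_da1, fd_da11]))   # True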
$d\tilde{\gamma}$ induces a $\mathrm{Map}(F_{\mathrm{in}_1} \times F_{\mathrm{in}_2}, F_{\mathrm{out}})$-valued 1-form on $\mathcal{M}(n, d)$. We can also easily produce $\mathbb{R}$-valued 1-forms, for instance from the cost 0-form $\int_K \|\tilde{\gamma}(x) - f(x)\|^2\, dx$ above.
For stochastic gradient descent over the moduli $\mathcal{M}_{n,d}$, to find the optimal machine, we still need one more ingredient: a metric on $\mathcal{M}_{n,d}$ that turns a 1-form into a vector field. Very nicely, the Ricci curvature of the bundle metric of Theorem 2 gives a well-defined metric on $\mathcal{M}_{n,d}$. So far, all the ingredients involved (namely the algorithm $\tilde{\gamma}$, its differential, the bundle metric $H_i$, and the metric on the moduli) are written purely in algebraic symbols and work for moduli spaces in all dimensions $(n, d)$ simultaneously.
Theorem 3
([19]). Suppose Q has no oriented cycles. Then
$$H^T := \sum_i \partial\bar{\partial} \log\det H_i = \sum_i \left[ \mathrm{tr}\!\left( \bar{\partial}\rho_i^{\,*}\; H_i^{-1}\; \partial\rho_i \right) - \mathrm{tr}\!\left( H_i^{-1}\; \partial\rho_i\; \rho_i^{\,*}\; H_i^{-1}\; \rho_i\; \bar{\partial}\rho_i^{\,*} \right) \right]$$
defines a Kähler metric on $\mathcal{M}_{n,d}$ for any $(n, d)$.
Example 2.
Let us consider the network of Figure 2 again, with $s = 2$ for simplicity. Let $n = d = (1, \dots, 1)$. Over the chart where $e^{(i)} \neq 0$ for all $i = \mathrm{in}_1, \mathrm{in}_2, 1, 2, \mathrm{out}$, the $\mathrm{GL}(d)$-equivariance allows us to assume that $e^{(i)} (e^{(i)})^* = 1$. Then $H_{\mathrm{in}_j}$ is trivial for $j = 1, 2$, and so $\partial\bar{\partial} \log H_{\mathrm{in}_j} = 0$. Let $x_1 = (w_{11}^{(1)}, w_{21}^{(1)})$ and $x_2 = (w_{12}^{(1)}, w_{22}^{(1)})$. We have
$$\partial\bar{\partial} \log H_j = \partial\bar{\partial} \log\left(1 + |x_j|^2\right) = \frac{(1 + |x_j|^2)\, dx_j \wedge d\bar{x}_j^{\,t} - \bar{x}_j\, dx_j^{\,t} \wedge d\bar{x}_j\, x_j^{\,t}}{(1 + |x_j|^2)^2};$$
$$\partial\bar{\partial} \log H_{\mathrm{out}} = \partial\bar{\partial} \log\left(1 + |w_1^{(2)}|^2 |x_1|^2 + |w_2^{(2)}|^2 |x_2|^2\right) = \frac{\sum_j \left( dx_j \wedge d\bar{x}_j^{\,t} + dw_j^{(2)} \wedge d\bar{w}_j^{(2)} \right)}{1 + |w_1^{(2)}|^2 |x_1|^2 + |w_2^{(2)}|^2 |x_2|^2}$$
$$+ \frac{\sum_j \left( |w_j^{(2)}|^2\, \bar{x}_j\, dx_j^{\,t} \wedge d\bar{x}_j\, x_j^{\,t} + |x_j|^2\, dw_j^{(2)} \wedge d\bar{w}_j^{(2)} + \bar{w}_j^{(2)}\, x_j\, dw_j^{(2)} \wedge d\bar{x}_j^{\,t} + w_j^{(2)}\, \bar{x}_j\, dx_j^{\,t} \wedge d\bar{w}_j^{(2)} \right)}{\left(1 + |w_1^{(2)}|^2 |x_1|^2 + |w_2^{(2)}|^2 |x_2|^2\right)^2}.$$

4. Uniformization of Metrics over the Moduli

The original formulation of deep learning is over the flat vector space of representations $\mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V))$, rather than over the moduli space $\mathcal{M} = R /\!\!/ G$ of framed representations, which carries the semi-positive metric $H^T$. In [5], we found the following way of connecting our new approach with the original one, by varying the bundle metric $H_i$ of Theorem 2.
We shall assume $n_i \geq d_i$ for all $i$. Let us write the framing map (which is a rectangular matrix) as $e^{(i)} = (\epsilon_i \;\; b_i)$, where $\epsilon_i$ is the largest square block and $b_i$ is the remaining part. (In applications, $b_i$ usually consists of `bias vectors'.) This allows us to rewrite the metric of Theorem 2 in the following way:
$$H_i(\alpha) = \left( \epsilon_i \epsilon_i^* + \alpha_{e^{(i)}}\, b_i b_i^* + \sum_{\gamma:\, h(\gamma) = i,\ \gamma \neq e^{(i)}} \alpha_\gamma\, w_\gamma\, e^{(t(\gamma))} \left( w_\gamma\, e^{(t(\gamma))} \right)^* \right)^{-1}$$
with $\alpha = (\alpha_\gamma)_{\gamma:\, h(\gamma) = i} = (1, \dots, 1)$.
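A small numerical sketch of this family (our own illustration, for a single vertex $i$ with one incoming arrow $\gamma$ and square framing, so that $b_i$ is empty): $\alpha_\gamma = 1$ recovers the bundle metric of Theorem 2, $\alpha_\gamma = 0$ gives the Euclidean one, and $\alpha_\gamma = -1$ may fail to be positive definite, which is the point of the definition below.

import numpy as np

rng = np.random.default_rng(3)
eps_i = np.eye(2)                              # square framing block epsilon_i (normalized)
we    = 2.0 * rng.standard_normal((2, 2))      # the single term w_gamma e^(t(gamma))

def S(alpha):
    # H_i(alpha) is the inverse of this matrix
    return eps_i @ eps_i.T + alpha * (we @ we.T)

for alpha in (1.0, 0.0, -1.0):
    pd = bool(np.all(np.linalg.eigvalsh(S(alpha)) > 0))
    print(alpha, "H_i(alpha) positive definite:", pd)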
If we instead set the $\alpha_\gamma$ to different values, then $H_i(\alpha)$ will still be $G$-equivariant, but it may no longer be positive definite on the whole space $\mathcal{M}$. Motivated by the brilliant construction of dual Hermitian symmetric spaces, we define
$$\mathcal{M}(\alpha) := \left\{ [w, e] \in \mathcal{M} : H_i(\alpha) \text{ is positive definite at } [w, e] \right\}.$$
Elements $[w, e] \in \mathcal{M}(\alpha)$ are called space-like representations with respect to $H_i(\alpha)$. If we set $\alpha = 0$, the metric becomes
$$H_i^0 := \left( \epsilon_i \epsilon_i^* \right)^{-1}$$
and $\mathcal{M}^0 := \mathcal{M}(\alpha = 0)$ is exactly the flat vector space $\mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V))$. This recovers the original Euclidean learning.
On the other hand, setting $\alpha = (-1, \dots, -1)$, we obtain a semi-negative moduli space $(\mathcal{M}, H^T)$ [5], which is a generalization of hyperbolic spaces, or of the non-compact duals of Grassmannians. This is a very useful setting, as there have been several fascinating works on machine learning performed over hyperbolic spaces, for example [24,25,26,27].
Remark 3.
The above construction gives a family of metrics $H_i$ parametrized by $\alpha$. The $\alpha_\gamma$ can be interpreted as filtering parameters that encode the importance of the paths $\gamma$. It is interesting to compare this with the celebrated attention mechanism. We can build models that use these parameters to filter away noise signals during the learning process. We are investigating applications in transfer learning, especially in situations where we only have a small sample of representative data.

Author Contributions

Conceptualization, G.J. and S.-C.L.; writing—original draft preparation, G.J.; writing—review and editing, S.-C.L.; initializing and advising the overall ideas and structures, S.-C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to acknowledge Bernd Henschenmacher, whose communications with us about the historical developments of quantum physics have partially inspired this project.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Jordan, P. Zur Axiomatik der Quanten-Algebra; Verlag der Akademie der Wissenschaften und der Literatur in Mainz, in Komm. F. Steiner Verlag: Mainz, Germany, 1950.
2. Segal, I. Postulates for General Quantum Mechanics. Ann. Math. 1947, 48, 930.
3. Liebmann, M.; Ruhaak, H.; Henschenmacher, B. Non-Associative Algebras and Quantum Physics—A Historical Perspective. arXiv 2019, arXiv:1909.04027.
4. Moore, C.; Crutchfield, J.P. Quantum automata and quantum grammars. Theor. Comput. Sci. 2000, 237, 275–306.
5. Jeffreys, G.; Lau, S.-C. Noncommutative Geometry of Computational Models and Uniformization for Framed Quiver Varieties. arXiv 2022, arXiv:2201.05900.
6. Connes, A. Noncommutative differential geometry. Publ. Math. l'IHÉS 1985, 62, 41–144.
7. Cuntz, J.; Quillen, D. Algebra extensions and nonsingularity. J. Am. Math. Soc. 1995, 8, 251–289.
8. Ginzburg, V. Lectures on Noncommutative Geometry. arXiv 2005, arXiv:math/0506603.
9. Tacchella, A. An introduction to associative geometry with applications to integrable systems. J. Geom. Phys. 2017, 118, 202–233.
10. Barbaresco, F. Souriau-Casimir Lie Groups Thermodynamics and Machine Learning. In Geometric Structures of Statistical Physics, Information Geometry, and Learning, SPIGL'20, Les Houches, France, 27–31 July 2021; Springer: Cham, Switzerland, 2021; pp. 53–83.
11. Cohen, T.; Welling, M. Group Equivariant Convolutional Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; Volume 48, pp. 2990–2999.
12. Cohen, T.; Geiger, M.; Weiler, M. A General Theory of Equivariant CNNs on Homogeneous Spaces. arXiv 2019, arXiv:1811.02017.
13. Cohen, T.; Geiger, M.; Koehler, J.; Welling, M. Spherical CNNs. In Proceedings of the ICLR, Vancouver, BC, Canada, 30 April–3 May 2018.
14. Cohen, T.; Weiler, M.; Kicanaoglu, B.; Welling, M. Gauge Equivariant Convolutional Networks and the Icosahedral CNN. In Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA, 10–15 June 2019.
15. Cheng, M.; Anagiannis, V.; Weiler, M.; de Haan, P.; Cohen, T.; Welling, M. Covariance in Physics and Convolutional Neural Networks. arXiv 2019, arXiv:1906.02481.
16. de Haan, P.; Cohen, T.; Welling, M. Natural Graph Networks. arXiv 2020, arXiv:2007.08349.
17. King, A. Moduli of representations of finite-dimensional algebras. Q. J. Math. Oxf. Ser. 1994, 45, 515–530.
18. Armenta, M.A.; Jodoin, P.M. The Representation Theory of Neural Networks. arXiv 2020, arXiv:2007.12213.
19. Jeffreys, G.; Lau, S.-C. Kähler Geometry of Quiver Varieties and Machine Learning. arXiv 2021, arXiv:2101.11487.
20. Ganev, I.; Walters, R. The QR decomposition for radial neural networks. arXiv 2021, arXiv:2107.02550.
21. Reineke, M. Framed quiver moduli, cohomology, and quantum groups. J. Algebra 2008, 320, 94–115.
22. Nakajima, H. Instantons on ALE spaces, quiver varieties, and Kac-Moody algebras. Duke Math. J. 1994, 76, 365–416.
23. Nakajima, H. Quiver varieties and finite-dimensional representations of quantum affine algebras. J. Am. Math. Soc. 2001, 14, 145–238.
24. Nickel, M.; Kiela, D. Poincaré Embeddings for Learning Hierarchical Representations. In Proceedings of the NIPS, Long Beach, CA, USA, 4–9 December 2017.
25. Ganea, O.E.; Bécigneul, G.; Hofmann, T. Hyperbolic Entailment Cones for Learning Hierarchical Embeddings. arXiv 2018, arXiv:1804.01882.
26. Sala, F.; Sa, C.D.; Gu, A.; Ré, C. Representation Tradeoffs for Hyperbolic Embeddings. Proc. Mach. Learn. Res. 2018, 80, 4460–4469.
27. Ganea, O.E.; Bécigneul, G.; Hofmann, T. Hyperbolic Neural Networks. arXiv 2018, arXiv:1805.09112.
Figure 1. An example of the backpropagation algorithm for the network in Figure 2 when s = 2.
Figure 2. A simple neural network with one hidden layer with s-many neurons. Neural networks in applications are typically much more complicated, but in nature are still quiver representations equipped with activation functions.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
