Proceeding Paper

Quantum Finite Automata and Quiver Algebras †

Department of Mathematics and Statistics, Boston University, 111 Cummington Mall, Boston, MA 02215, USA
*
Author to whom correspondence should be addressed.
Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.
Phys. Sci. Forum 2022, 5(1), 32; https://doi.org/10.3390/psf2022005032
Published: 14 December 2022

Abstract

We apply the ideas and results of [JL21] and [JL22] to quantum finite automata. We reformulate quantum finite automata with multiple-time measurements using the algebraic notion of a near-ring. This gives a unified understanding of quantum computing and deep learning. When the near-ring comes from a quiver, we obtain a nice moduli space of computing machines with a metric that can be optimized by gradient descent.

1. Motivation: QFA and Near-Ring

In quantum theory, the evolution of states is unitary. An observable is modeled by a self-adjoint operator whose eigenvalues are the possible output values, and whose eigenvectors form an orthonormal basis of the state space. As a result, a typical quantum model simply consists of linear operators that form an algebra.
However, when passing from the quantum world to the real world, an actual probabilistic projection to an eigenstate is necessary. Such a probabilistic operation destroys the linear structure, so we need a non-linear (meaning non-distributive) algebraic structure to accommodate such operators. In the 20th century, there were several attempts to solve this problem; see for instance [1,2], and [3] (Chapter 3) for a nice survey. In particular, Pascual Jordan attempted to use near-rings for quantum mechanics.
Let us consider the scenario of quantum computing.
Definition 1
([4]). A quantum finite automaton (QFA) is a tuple $Q = (V, q_0, F, \Sigma, (U_\sigma)_{\sigma \in \Sigma})$ where:
1. $V$ is a finite set of states which generates the Hilbert space $H_V$;
2. $F \subseteq V$ is a set of final or accept states;
3. $q_0$ is the initial state, a unit vector in $H_V$;
4. $\Sigma$ is a finite set called the alphabet;
5. for each $\sigma \in \Sigma$, $U_\sigma$ is a unitary operator on $H_V$.
An input to a QFA consists of a word $w$ in the alphabet $\Sigma$, of the form $w = w_1 w_2 \cdots w_n$ where $w_i \in \Sigma$ for all $i$. The word $w$ acts on the initial state of the QFA by $\langle q_0 |\, U_w$, where $U_w := U_{w_1} U_{w_2} \cdots U_{w_n}$ and $\langle q_0 |$ is the row-vector presentation of $q_0$. The probability that the word $w$ ends in an accept state is
$$\Pr(w) = \left\| \langle q_0 |\, U_w\, P \right\|^2,$$
where $P : H_V \to H_F$ is the projection from $H_V$ to the subspace $H_F$ spanned by $F$.
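To make the definition concrete, the following is a minimal numerical sketch of Definition 1 (our own illustration in Python with numpy; the two-state automaton, its alphabet and its unitaries are hypothetical choices, not taken from the references):

import numpy as np

# A hypothetical 2-state QFA over the alphabet {a, b}: H_V = C^2, the accept set F
# is spanned by the second basis vector, and each letter acts by a unitary matrix.
theta = np.pi / 4
U = {"a": np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]]),   # a rotation, hence unitary
     "b": np.array([[0.0, 1.0],
                    [1.0, 0.0]])}                         # a swap, hence unitary
q0 = np.array([1.0, 0.0])                                 # initial unit vector in H_V
P = np.diag([0.0, 1.0])                                   # projection onto H_F

def accept_probability(word):
    row = q0.copy()                       # row-vector presentation <q0|
    for letter in word:
        row = row @ U[letter]             # <q0| U_{w_1} ... U_{w_n}
    return float(np.linalg.norm(row @ P) ** 2)

print(accept_probability("ab"))           # Pr(w) = ||<q0| U_w P||^2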
Please note that the above definition has not taken the probabilistic projection into account. We make the following reformulation.
Definition 2.
A quantum computing machine is a tuple $((H_V, h), H_F, e, \rho_G, \sigma_F)$, where
1. $(H_V, h)$ is a Hermitian vector space;
2. $H_F = \mathbb{C}^n$, equipped with the standard metric, is called the framing space;
3. $e : H_F \to H_V$ is an isometric embedding;
4. $\rho_G : G \to U(H_V, h)$ is a unitary representation of a group $G$;
5. $\sigma_F : H_F \to H_F$ is a probabilistic projection.
Ignoring the last item (5) for the moment, this coincides with Definition 1 by taking $G$ to be the free group generated by the set $\Sigma$ and fixing an initial vector $q_0 \in H_V$.
Here, we treat $H_F$ as a vector space of its own and take an isometric embedding $e : H_F \to H_V$, rather than directly identifying $H_F$ as a subspace of $H_V$. The state space $H_V$ is treated as an abstract vector space without a preferred basis, while $H_F$ is equipped with a fixed basis that has a real physical meaning (like up/down spin of an electron). The framing map $e : H_F \to H_V$ is interpreted as a bridge between the classical and the quantum world; the image of the fixed basis under $e$ determines a subset of pure state vectors of a certain observable. The adjoint $e^* : H_V \to H_F$ is an orthogonal projection. In the next section, $e$ is no longer required to be an embedding when we consider non-unitary generalizations for machine learning.
For the last item (5), the probabilistic projection $\sigma_F : H_F \to H_F$ can be modeled by a probability space. Specifically, consider an $H_F$-family of random variables
$$k : H_F \times \Omega \to \{1, \dots, |F|\},$$
where $\Omega$ is a probability space (equipped with a probability measure), with the assumption that $\Pr(k(v) = j) = \left| \left\langle \tfrac{v}{\|v\|},\, \epsilon_j \right\rangle \right|^2$ for every $v \in H_F$, where $\epsilon_j \in H_F$ denotes the $j$-th basis vector. Then $\sigma_F(v) := \epsilon_{k(v)}$.
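For illustration, $\sigma_F$ can be simulated by sampling a basis vector with the Born-rule weights (a small Python sketch of our own; the sample vector and random seed are hypothetical):

import numpy as np

rng = np.random.default_rng(0)

def sigma_F(v):
    # sample k(v) = j with probability |<v/||v||, epsilon_j>|^2, then return epsilon_{k(v)}
    probs = np.abs(v / np.linalg.norm(v)) ** 2
    j = rng.choice(len(v), p=probs)
    out = np.zeros_like(v, dtype=float)
    out[j] = 1.0
    return out

v = np.array([3.0, 4.0])
print(sigma_F(v))     # epsilon_1 with probability 9/25, epsilon_2 with probability 16/25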
The major additional ingredients in Definition 2, compared to Definition 1, are $e$, $e^*$ and $\sigma_F$. Please note that they are not yet included in the machine language, which is currently the group $G$. Since $e$ and $\sigma_F$ are not invertible, we cannot enlarge $G$ as a group to include $e$ or $\sigma_F$.
To remedy this, first note that item (4) can be replaced by an algebra rather than a group; an algebra is linear and its elements need not be invertible. Specifically, we require instead:
(4')
$\rho_A : A \to \mathrm{End}(H_V)$ is an algebra homomorphism for an algebra $A$ (with unit $1_A$).
For instance, $A$ can be the free algebra generated by a set $\Sigma$.
With such a modification, we can easily include the framing $e$ and $e^*$ in our language by taking the augmented algebra
$$\hat{A} = A \langle 1_F, e, e^* \rangle / R,$$
where $R$ is generated by the relations $1_F \cdot 1_F = 1_F$, $1_F \cdot e^* = e^*$, $e \cdot 1_F = e$, $1_A \cdot e = e$, $e \cdot e = 0$, $e^* \cdot e^* = 0$, $1_F \cdot e = 0$, $1_F \cdot a = 0$, $e \cdot 1_A = 0$, $e^* \cdot 1_F = 0$, $1_A \cdot e^* = 0$ for any $a \in A$. The unit of $\hat{A}$ is $1_A + 1_F$.
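One way to see that these relations are consistent (our own illustrative sketch, not part of the original construction) is to realize the generators as block operators on $H_V \oplus H_F$; the relations above can then be checked by block-matrix multiplication:
$$a \mapsto \begin{pmatrix} \rho_A(a) & 0 \\ 0 & 0 \end{pmatrix}, \qquad e \mapsto \begin{pmatrix} 0 & e \\ 0 & 0 \end{pmatrix}, \qquad e^* \mapsto \begin{pmatrix} 0 & 0 \\ e^* & 0 \end{pmatrix}, \qquad 1_F \mapsto \begin{pmatrix} 0 & 0 \\ 0 & \mathrm{id}_{H_F} \end{pmatrix},$$
with the unit $1_A + 1_F$ mapping to the identity on $H_V \oplus H_F$.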
However, we cannot further enlarge $\hat{A}$ to include $\sigma_F$ as an algebra. The reason is that $\sigma_F$ always maps to unit vectors and hence cannot be linear:
$$\sigma_F(v + w) \neq \sigma_F(v) + \sigma_F(w).$$
To extend $\hat{A}$ by $\sigma_F$, which models actual quantum measurement, we need the notion of a near-ring. It is a set $N$ with two binary operations $+$ and $\circ$ such that $N$ is a group under $+$, $\circ$ is associative, and right multiplication is distributive over addition: $(x + y) \circ z = x \circ z + y \circ z$ for all $x, y, z \in N$ (but left multiplication is not required to be distributive: $z \circ (x + y) \neq z \circ x + z \circ y$ in general).
Define $\widetilde{A}$ to be the near-ring
$$\widetilde{A} := (1_F + e^* \cdot \hat{A} \cdot e)\{\sigma_F\}.$$
This near-ring can be understood as the language that controls quantum computing machines. Elements of $\widetilde{A}$ can be recorded as rooted trees. An example is $a_1\, \sigma_F(a_{11} + a_{12})$, where $a_1, a_{11}, a_{12} \in (1_F + e^* \cdot \hat{A} \cdot e)$. See also the tree on the left-hand side of Figure 1.
The advantage of putting all the algebraic structures into a single near-ring is that we can consider all the quantum computing machines (mathematically, $\widetilde{A}$-modules) controlled by a single near-ring at the same time. An element of $\widetilde{A}$ is a quantum algorithm, which can run on all quantum computers controlled by $\widetilde{A}$.

2. Near-Ring and Differential Forms

In the setting of Definition 1 and Example 1, it is natural to relax the representations from unitary groups to matrix algebras $\mathrm{gl}(n, \mathbb{C})$. Moreover, the quantum measurement can also be simulated by a non-linear function (called an activation function). Such a modification produces a computational model of deep learning.
Definition 3.
An activation module consists of:
1. a noncommutative algebra $A$ and vector spaces $V$, $F$;
2. a family of metrics $h_{(\rho, e)}$ on $V$ over the space of framed $A$-modules
$$R = \mathrm{Hom}_{\mathrm{alg}}(A, \mathrm{End}(V)) \times \mathrm{Hom}(F, V),$$
which is $G$-equivariant, where $G = \mathrm{GL}(V)$;
3. a collection of possibly non-linear functions $\sigma_j^F : F \to F$.
As in Section 1, we take the augmented algebra $\hat{A} = A \langle 1_F, e, e^* \rangle / R$, which produces linear computations in all framed $A$-modules simultaneously. With item (3), elements in the near-ring
$$\widetilde{A} := (1_F + e^* \cdot \hat{A} \cdot e)\{\sigma_1^F, \dots, \sigma_N^F\}$$
induce non-linear functions on $F$, and so they are called non-linear algorithms. An example of how $\widetilde{A}$ induces non-linear functions on $F$ upon fixing a point in $R$ is given in Example 1.
$R = \mathrm{Hom}_{\mathrm{alg}}(A, \mathrm{End}(V)) \times \mathrm{Hom}(F, V)$ is understood as a family of computing machines: a point $(w, e) \in R$ fixes how $A$ acts on $V$ together with the framing map $e \in \mathrm{Hom}(F, V)$, and hence entirely determines how an algorithm runs in the machine corresponding to $(w, e)$.
Let us emphasize that the state space $V$ is basis-free. The family of metrics is $\mathrm{GL}(V)$-equivariant: $h_{(\rho, e)}(v, w) = h_{g \cdot (\rho, e)}(g \cdot v, g \cdot w)$ for any $g \in \mathrm{GL}(V)$. Thus, given $a \in \widetilde{A}$, the non-linear functions that $a$ induces on the two machines $r \in R$ and $g \cdot r \in R$ are equal. In other words, an algorithm $a \in \widetilde{A}$ drives all machines parametrized by the moduli stack $[R/\mathrm{GL}(V)]$ to produce functions on $F$:
$$\widetilde{A} \times [R/\mathrm{GL}(V)] \to \mathrm{Map}(F, F).$$
As mentioned above, the advantage is that the single near-ring $\widetilde{A}$ controls all machines in $[R/\mathrm{GL}(V)]$, for all $V$ simultaneously (independent of $\dim V$).
In [5], we formulated noncommutative differential forms on a near-ring $\widetilde{A}$, which induce $\mathrm{Map}(F, F)$-valued differential forms on the moduli $[R/\mathrm{GL}(V)]$. This construction extends the Karoubi-de Rham complex [6,7,8,9] from algebras to near-rings. The map $\widetilde{A} \times [R/\mathrm{GL}(V)] \to \mathrm{Map}(F, F)$ above is the special case of 0-forms, which are simply elements of $\widetilde{A}$. The cases of 0-forms and 1-forms are particularly important for gradient descent: recall that the gradient of a function is the metric dual of the differential of that function.
Theorem 1
([5]). There exists a degree-preserving map
$$DR(\widetilde{A}) \to \left( \Omega(R, \mathrm{Map}(F, F)) \right)^{\mathrm{GL}(V)}$$
which commutes with $d$ on the two sides.
A differential form on $\widetilde{A}$ can be recorded as a form-valued tree; see the right-hand side of Figure 1. These are rooted trees whose edges are labeled by $\phi \in DR(\mathrm{Mat}_F(\hat{A}))$; leaves are labeled by $\alpha \in \widetilde{A}$; the root is labeled by $1$ (if it is not a leaf); and nodes which are neither leaves nor the root are labeled by the symbols $D\sigma^{(p)}|_\alpha$, which correspond to the $p$-th order symmetric differentials of $\sigma$.
In applications to machine learning, an algorithm $\tilde{\gamma} \in \widetilde{A}$ induces a 0-form of $\widetilde{A}$, for instance
$$\int_K \left\| \tilde{\gamma}(x) - f(x) \right\|^2 dx$$
for a given dataset encoded as a function $f : K \to \mathbb{R}$. This 0-form and its differential induce the cost function and its differential on $[R/G]$, respectively, which are the central objects in machine learning.
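As a down-to-earth sketch (our own illustration; the one-parameter machine, the ReLU activation and the sample dataset are hypothetical), the cost 0-form can be approximated by a mean-squared error over sample points in $K$:

import numpy as np

def cost(gamma, f, xs):
    # discretization of \int_K ||gamma(x) - f(x)||^2 dx as a mean over sample points
    return float(np.mean([(gamma(x) - f(x)) ** 2 for x in xs]))

def make_gamma(w, e):
    # a hypothetical one-neuron machine: gamma(x) = w * relu(e * x)
    return lambda x: w * max(e * x, 0.0)

xs = np.linspace(-1.0, 1.0, 201)            # sample points in K = [-1, 1]
f = lambda x: max(x, 0.0)                   # dataset encoded as a function f : K -> R
print(cost(make_gamma(0.5, 1.0), f, xs))    # value of the cost at the machine (w, e)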
The differential forms are $G$-equivariant by construction. There have been many recent works on learning for input datasets that carry Lie group symmetry [10,11,12,13,14,15,16]. Our work, on the other hand, focuses on the internal symmetry of the computing machine.
In general, the existence of fine moduli is a deep problem in mathematics: the moduli stack $[R/G]$ may be singular and pose difficulties for applying gradient descent. Fortunately, if $A$ is a quiver algebra, its moduli space of framed quiver representations $[R/G]$ is a smooth manifold $\mathcal{M}$ (with respect to a chosen stability condition) [17]. This leads us to the deep learning setting explained in the next section.

3. Deep Learning over the Moduli Space of Quiver Representations

An artificial neural network (see Figure 2 for a simple example) consists of:
  • a graph $Q = (Q_0, Q_1)$, where $Q_0$ is a (finite) set of vertices (neurons) and $Q_1$ is a (finite) set of arrows starting and ending at vertices in $Q_0$ (transmissions between neurons);
  • a quiver representation of $Q$, which associates a vector space $V_i$ to each vertex $i$ and a linear map $w_a$ (called a weight) to each arrow $a$; we denote by $t(a)$ and $h(a)$ the tail and head of an arrow $a$, respectively;
  • a non-linear function $V_i \to V_i$ for each vertex $i$ (called the activation function of the neuron).
Activation functions are an important ingredient of neural networks; on the other hand, they rarely appear in quiver theory. Their presence allows the neural network to produce non-linear functions.
Remark 1.
Recently there has been rising interest in the relations between machine learning and quiver representations [5,18,19,20]. Here, we simply take quiver representations as part of the formulation of an artificial neural network.
In many applications, the dimension vector $d \in \mathbb{Z}_{\geq 0}^{Q_0}$ is set to be $(1, \dots, 1)$, i.e., all vector spaces $V_i$ associated with the vertices are one-dimensional. For us, this is an unnecessary constraint, and we allow $d$ to be any fixed integer vector.
Any non-trivial non-linear function $V_i \to V_i$ cannot be $\mathrm{GL}(V_i)$-equivariant. However, in quiver theory, $V_i$ is understood as a basis-free vector space, which requires $\mathrm{GL}(V_i)$-equivariance. We resolved this conflict between neural networks and quiver theory in [19] using framed quiver representations. The key idea is to put the non-linear function on the framing rather than on the basis-free vector spaces $V_i$.
Combining with the setting of the last section (Definition 2), we take:
  • $A = \mathbb{C}Q$, the quiver algebra. Its elements are formal linear combinations of paths in $Q$ (including the trivial paths at vertices), and the product is given by concatenation of paths.
  • $V = \bigoplus_{i \in Q_0} V_i$, the direct sum of the vector spaces over all vertices.
  • Each vertex $i$ is associated with a framing vector space $F_i$. Then $F = \bigoplus_{i \in Q_0} F_i$.
  • Each point $(w, e) \in R = \mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V)) \times \prod_{i \in Q_0} \mathrm{Hom}(F_i, V_i)$ is a framed quiver representation. Specifically, $w \in \mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V))$ associates a matrix $w_a$ to each arrow $a$ of $Q$, and $e^{(i)} \in \mathrm{Hom}(F_i, V_i)$ are the framing linear maps.
  • The group $G$ is taken to be $\prod_{i \in Q_0} \mathrm{GL}(V_i)$. An element $g = (g_i)_{i \in Q_0}$ acts on $R$ by
$$g \cdot \left( (w_a)_{a \in Q_1},\, (e^{(i)})_{i \in Q_0} \right) = \left( (g_{h(a)} \cdot w_a \cdot g_{t(a)}^{-1})_{a \in Q_1},\, (g_i \cdot e^{(i)})_{i \in Q_0} \right).$$
  • We have (possibly non-linear) maps $\sigma_i : F_i \to F_i$ for each vertex. To match the notation of Definition 2, $\sigma_i$ can be taken as maps $F \to F$ by extension by zero.
By the celebrated result of [17], we have a fine moduli space of framed quiver representations $\mathcal{M} = \mathcal{M}_{n,d} = R /\!\!/ G$, where $n, d$ are the dimension vectors of the framing $\{F_i\}_{i \in Q_0}$ and of the representation $\{V_i\}_{i \in Q_0}$, respectively. In particular, we have the universal vector bundles $\mathcal{V}_i$ over $\mathcal{M}$, whose fiber over each framed representation $[w, e] \in \mathcal{M}$ is the representing vector space $V_i$ at the vertex $i$.
$\mathcal{V}_i$ plays an important role in our computational model: a vector $v \in \mathcal{V}_i$ over a point $[w, e] \in \mathcal{M}$ is the state of the $i$-th neuron in the machine parametrized by $[w, e]$.
Remark 2.
The topology of $\mathcal{M}$ is well understood: [21] describes it as an iterated Grassmannian bundle. Framed quiver representations and their doubled counterparts play an important role in geometric representation theory [22,23].
To fulfill Definition 2 (see Item (2)), we need to equip each $\mathcal{V}_i$ with a bundle metric $h_i$, so that the adjoint $e^*$ makes sense. In [19], we found a bundle metric that is written purely in terms of the algebra. This means the formula works for the (infinitely many) quiver moduli of all dimension vectors of representations simultaneously.
Theorem 2
([19]). For a fixed vertex $i$, let $\rho_i$ be the row vector whose entries are all the elements of the form $w_\gamma\, e^{(t(\gamma))} : R_{n,d} \to \mathrm{Hom}(\mathbb{C}^{n_{t(\gamma)}}, \mathbb{C}^{d_i})$ such that $h(\gamma) = i$. Consider
$$H_i := \rho_i\, \rho_i^* = \sum_{h(\gamma) = i} w_\gamma\, e^{(t(\gamma))} \left( w_\gamma\, e^{(t(\gamma))} \right)^*$$
as a map $\rho_i \rho_i^* : R_{n,d} \to \mathrm{End}(\mathbb{C}^{d_i})$. Then $(\rho_i \rho_i^*)^{-1}$ is $\mathrm{GL}(d)$-equivariant and descends to a Hermitian metric on $\mathcal{V}_i$ over $\mathcal{M}$.
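The following self-contained numerical check (our own illustration, with a hypothetical vertex $i$ having two incoming arrows and all dimensions equal to 2) verifies the $\mathrm{GL}(d)$-equivariance of the metric $(\rho_i \rho_i^*)^{-1}$ stated in Theorem 2:

import numpy as np

rng = np.random.default_rng(2)
# framings e^(i), e^(j1), e^(j2) and arrow weights w_b1 : V_j1 -> V_i, w_b2 : V_j2 -> V_i
e_i, e_j1, e_j2 = (rng.standard_normal((2, 2)) for _ in range(3))
w_b1, w_b2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))

def H(e_i, w_b1, e_j1, w_b2, e_j2):
    # rho_i has entries e^(i) (trivial path), w_b1 e^(j1), w_b2 e^(j2); H_i = rho_i rho_i^*
    terms = [e_i, w_b1 @ e_j1, w_b2 @ e_j2]
    return sum(t @ t.T for t in terms)

# apply g = (g_i, g_j1, g_j2): w_b -> g_i w_b g_j^{-1}, e^(k) -> g_k e^(k)
g_i, g_j1, g_j2 = (rng.standard_normal((2, 2)) for _ in range(3))
H0 = H(e_i, w_b1, e_j1, w_b2, e_j2)
H1 = H(g_i @ e_i, g_i @ w_b1 @ np.linalg.inv(g_j1), g_j1 @ e_j1,
       g_i @ w_b2 @ np.linalg.inv(g_j2), g_j2 @ e_j2)

v, u = rng.standard_normal(2), rng.standard_normal(2)
print(np.allclose(v @ np.linalg.inv(H0) @ u,
                  (g_i @ v) @ np.linalg.inv(H1) @ (g_i @ u)))   # True: the metric descends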
Example 1.
Consider the network in Figure 2. The quiver has the arrows $a_{1,k}^{(1)}, a_{2,k}^{(1)}$ for $k = 1, \dots, s$ (between the input and hidden layers) and $a_k^{(2)}$ (between the hidden and output layers). In applications, we consider the algorithm
$$\tilde{\gamma} := \sum_{k=1}^{s} \hat{a}_k^{(2)}\, \sigma_k\!\left( \sum_{j=1}^{2} \hat{a}_{jk}^{(1)} \right) \in \widetilde{A},$$
where $\hat{a}_{jk}^{(1)} := e^{(k)*}\, a_{jk}^{(1)}\, e^{(\mathrm{in}_j)}$ and $\hat{a}_k^{(2)} := e^{(\mathrm{out})*}\, a_k^{(2)}\, e^{(k)}$. Please note that the adjoints $e^{(k)*}$ and $e^{(\mathrm{out})*}$ are taken with respect to the metrics $H_k$ and $H_{\mathrm{out}}$, respectively.
$\tilde{\gamma}$ is recorded by the activation tree on the left-hand side of Figure 1 for the case $s = 2$. $\tilde{\gamma}$ drives any activation module (with this given quiver algebra) to produce a function $F_{\mathrm{in}_1} \times F_{\mathrm{in}_2} \to F_{\mathrm{out}}$. For instance, setting the representing dimension to be 1 and taking $\sigma_i$ to be the ReLU function $\max\{x, 0\}$ on $\mathbb{R}$ is a popular choice. Data passes from the leaves to the root, which is called forward propagation.
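For concreteness, here is a minimal forward-propagation sketch of this algorithm for $s = 2$ with all representing and framing dimensions equal to 1 (our own illustration with hypothetical weights; for simplicity the adjoints are taken with respect to the standard metric rather than the metrics $H_k$, $H_{\mathrm{out}}$ of Theorem 2, which reduces the computation to the familiar two-layer network):

import numpy as np

relu = lambda x: np.maximum(x, 0.0)

w1 = np.array([[0.7, -0.3],     # w1[k, j] = weight of the arrow a_{jk}^(1) : in_j -> hidden_k
               [0.2,  0.5]])
w2 = np.array([0.4, -0.6])      # w2[k]    = weight of the arrow a_k^(2) : hidden_k -> out

def gamma(x):
    # forward propagation from the leaves to the root of the activation tree
    hidden = relu(w1 @ x)       # sigma_k( sum_j a-hat_{jk}^(1) ) at each hidden neuron
    return w2 @ hidden          # sum_k a-hat_k^(2) applied to the activated values

print(gamma(np.array([1.0, 2.0])))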
Figure 1 shows the differential of $\tilde{\gamma} \in \widetilde{A}$. This 1-form is given by
$$d\left( a_1\, \sigma_1(a_{11} + a_{12}) + a_2\, \sigma_2(a_{21} + a_{22}) \right)$$
$$= da_1\, \alpha_1 + a_1\, D\sigma_1^{(1)}\big|_{\alpha_1}(da_{11} + da_{12}) + da_2\, \alpha_2 + a_2\, D\sigma_2^{(1)}\big|_{\alpha_2}(da_{21} + da_{22}),$$
where $\alpha_j = \sigma_j(a_{j1} + a_{j2})$. The terms are obtained by starting at the output node and moving backwards through the activation tree, which is well known as the backpropagation algorithm. Please note that this works at the algebraic level and is not specific to any representation.
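As a scalar sanity check of this expansion (our own illustration, using $\sigma = \tanh$ and hypothetical values), the coefficients of $da_1$ and $da_{11}$ agree with numerical differentiation:

import numpy as np

sigma, dsigma = np.tanh, lambda t: 1.0 - np.tanh(t) ** 2
a1, a11, a12, a2, a21, a22 = 0.3, 0.8, -0.2, -0.5, 0.1, 0.4
f = lambda a1, a11, a12, a2, a21, a22: a1 * sigma(a11 + a12) + a2 * sigma(a21 + a22)

coeff_da1  = sigma(a11 + a12)            # alpha_1, the coefficient of da_1
coeff_da11 = a1 * dsigma(a11 + a12)      # backpropagated coefficient of da_11

h = 1e-6
fd_da1  = (f(a1 + h, a11, a12, a2, a21, a22) - f(a1 - h, a11, a12, a2, a21, a22)) / (2 * h)
fd_da11 = (f(a1, a11 + h, a12, a2, a21, a22) - f(a1, a11 - h, a12, a2, a21, a22)) / (2 * h)
print(np.allclose([coeff_da1, coeff_da11], [fd_da1, fd_da11]))   # True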
$d\tilde{\gamma}$ induces a $\mathrm{Map}(F_{\mathrm{in}_1} \times F_{\mathrm{in}_2}, F_{\mathrm{out}})$-valued 1-form on $\mathcal{M}(n, d)$. We can also easily produce $\mathbb{R}$-valued 1-forms, for instance from the cost 0-form $\int_K \|\tilde{\gamma}(x) - f(x)\|^2\, dx$ above.
For stochastic gradient descent over the moduli $\mathcal{M}_{n,d}$, to find the optimal machine, we still need one more ingredient: a metric on $\mathcal{M}_{n,d}$ that turns a 1-form into a vector field. Very nicely, the Ricci curvature of the bundle metric of Theorem 2 gives a well-defined metric on $\mathcal{M}_{n,d}$. So far, all the ingredients involved (namely the algorithm $\tilde{\gamma}$, its differential, the bundle metric $H_i$, and the metric on the moduli) are written purely in algebraic symbols and work for moduli spaces in all dimensions $(n, d)$ simultaneously.
Theorem 3
([19]). Suppose Q has no oriented cycles. Then
$$H^T := \sum_i \partial\bar{\partial} \log\det H_i = \sum_i \left[ \mathrm{tr}\!\left( \bar{\partial}\rho_i^{\,*}\; H_i^{-1}\; \partial\rho_i \right) - \mathrm{tr}\!\left( H_i^{-1}\; \partial\rho_i\; \rho_i^{\,*}\; H_i^{-1}\; \rho_i\; \bar{\partial}\rho_i^{\,*} \right) \right]$$
defines a Kähler metric on $\mathcal{M}_{n,d}$ for any $(n, d)$.
Example 2.
Let us consider the network of Figure 2 again, with $s = 2$ for simplicity. Let $n = d = (1, \dots, 1)$. Over the chart where $e^{(i)} \neq 0$ for all $i = \mathrm{in}_1, \mathrm{in}_2, 1, 2, \mathrm{out}$, the $\mathrm{GL}(d)$-equivariance allows us to assume that $e^{(i)} (e^{(i)})^* = 1$. Then $H_{\mathrm{in}_j}$ is trivial for $j = 1, 2$, and so $\partial\bar{\partial} \log H_{\mathrm{in}_j} = 0$. Let $x_1 = (w_{11}^{(1)}, w_{21}^{(1)})$ and $x_2 = (w_{12}^{(1)}, w_{22}^{(1)})$. We have
$$\partial\bar{\partial} \log H_j = \partial\bar{\partial} \log\left(1 + |x_j|^2\right) = \frac{(1 + |x_j|^2)\, dx_j \wedge d\bar{x}_j^{\,t} - \bar{x}_j\, dx_j^{\,t} \wedge d\bar{x}_j\, x_j^{\,t}}{(1 + |x_j|^2)^2};$$
$$\partial\bar{\partial} \log H_{\mathrm{out}} = \partial\bar{\partial} \log\left(1 + |w_1^{(2)}|^2 |x_1|^2 + |w_2^{(2)}|^2 |x_2|^2\right) = \frac{\sum_j \left( dx_j \wedge d\bar{x}_j^{\,t} + dw_j^{(2)} \wedge d\bar{w}_j^{(2)} \right)}{1 + |w_1^{(2)}|^2 |x_1|^2 + |w_2^{(2)}|^2 |x_2|^2}$$
$$+ \frac{\sum_j \left( |w_j^{(2)}|^2\, \bar{x}_j\, dx_j^{\,t} \wedge d\bar{x}_j\, x_j^{\,t} + |x_j|^2\, dw_j^{(2)} \wedge d\bar{w}_j^{(2)} + \bar{w}_j^{(2)}\, x_j\, dw_j^{(2)} \wedge d\bar{x}_j^{\,t} + w_j^{(2)}\, \bar{x}_j\, dx_j^{\,t} \wedge d\bar{w}_j^{(2)} \right)}{\left(1 + |w_1^{(2)}|^2 |x_1|^2 + |w_2^{(2)}|^2 |x_2|^2\right)^2}.$$

4. Uniformization of Metrics over the Moduli

The original formulation of deep learning is over the flat vector space of representations $\mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V))$, rather than over the moduli space $\mathcal{M} = R /\!\!/ G$ of framed representations, which carries the semi-positive metric $H^T$. In [5], we found the following way of connecting our new approach with the original one, by varying the bundle metric $H_i$ of Theorem 2.
We shall assume $n_i \geq d_i$ for all $i$. Let us write the framing map (which is a rectangular matrix) as $e^{(i)} = (\epsilon_i \;\; b_i)$, where $\epsilon_i$ is the largest square block and $b_i$ is the remaining part. (In applications, $b_i$ usually consists of `bias vectors'.) This allows us to rewrite the metric of Theorem 2 in the following way:
$$H_i(\alpha) = \left( \epsilon_i \epsilon_i^* + \alpha_{e^{(i)}}\, b_i b_i^* + \sum_{\gamma:\, h(\gamma) = i,\ \gamma \neq e^{(i)}} \alpha_\gamma\, w_\gamma\, e^{(t(\gamma))} \left( w_\gamma\, e^{(t(\gamma))} \right)^* \right)^{-1}$$
with $\alpha = (\alpha_\gamma)_{\gamma:\, h(\gamma) = i} = (1, \dots, 1)$.
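A small numerical sketch of this family (our own illustration, for a single vertex $i$ with one incoming arrow $\gamma$ and square framing, so that $b_i$ is empty): $\alpha_\gamma = 1$ recovers the bundle metric of Theorem 2, $\alpha_\gamma = 0$ gives the Euclidean one, and $\alpha_\gamma = -1$ may fail to be positive definite, which is the point of the definition below.

import numpy as np

rng = np.random.default_rng(3)
eps_i = np.eye(2)                              # square framing block epsilon_i (normalized)
we    = 2.0 * rng.standard_normal((2, 2))      # the single term w_gamma e^(t(gamma))

def S(alpha):
    # H_i(alpha) is the inverse of this matrix
    return eps_i @ eps_i.T + alpha * (we @ we.T)

for alpha in (1.0, 0.0, -1.0):
    pd = bool(np.all(np.linalg.eigvalsh(S(alpha)) > 0))
    print(alpha, "H_i(alpha) positive definite:", pd)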
If we instead set the $\alpha_\gamma$ to different values, then $H_i(\alpha)$ will still be $G$-equivariant, but it may no longer be positive definite on the whole space $\mathcal{M}$. Motivated by the brilliant construction of dual Hermitian symmetric spaces, we define
$$\mathcal{M}(\alpha) := \left\{ [w, e] \in \mathcal{M} : H_i(\alpha) \text{ is positive definite at } [w, e] \right\}.$$
Elements $[w, e] \in \mathcal{M}(\alpha)$ are called space-like representations with respect to $H_i(\alpha)$. If we set $\alpha = 0$, the metric becomes
$$H_i^0 := \left( \epsilon_i \epsilon_i^* \right)^{-1}$$
and $\mathcal{M}^0 := \mathcal{M}(\alpha = 0)$ is exactly the flat vector space $\mathrm{Hom}_{\mathrm{alg}}(\mathbb{C}Q, \mathrm{End}(V))$. This recovers the original Euclidean learning.
On the other hand, setting $\alpha = (-1, \dots, -1)$, we obtain a semi-negative moduli space $(\mathcal{M}, H^T)$ [5], which is a generalization of hyperbolic spaces, or of the non-compact duals of Grassmannians. This is a very useful setting, as there have been several fascinating works on machine learning performed over hyperbolic spaces, for example [24,25,26,27].
Remark 3.
The above construction gives a family of metrics $H_i$ parametrized by $\alpha$. The $\alpha_\gamma$ can be interpreted as filtering parameters that encode the importance of the paths $\gamma$. It is interesting to compare this with the celebrated attention mechanism. We can build models that use these parameters to filter away noise signals during the learning process. We are investigating applications in transfer learning, especially in situations where we only have a small sample of representative data.

Author Contributions

Conceptualization, G.J. and S.-C.L.; writing—original draft preparation, G.J.; writing—review and editing, S.-C.L.; initializing and advising the overall ideas and structures, S.-C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to acknowledge Bernd Henschenmacher, whose communications with us about the historical developments of quantum physics have partially inspired this project.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Jordan, P. Zur Axiomatik der Quanten-Algebra; Verlag der Akademie der Wissenschaften und der Literatur in Mainz, in Komm. F. Steiner Verlag: Mainz, Germany, 1950.
2. Segal, I. Postulates for General Quantum Mechanics. Ann. Math. 1947, 48, 930.
3. Liebmann, M.; Ruhaak, H.; Henschenmacher, B. Non-Associative Algebras and Quantum Physics—A Historical Perspective. arXiv 2019, arXiv:1909.04027.
4. Moore, C.; Crutchfield, J.P. Quantum automata and quantum grammars. Theor. Comput. Sci. 2000, 237, 275–306.
5. Jeffreys, G.; Lau, S.-C. Noncommutative Geometry of Computational Models and Uniformization for Framed Quiver Varieties. arXiv 2022, arXiv:2201.05900.
6. Connes, A. Noncommutative differential geometry. Publ. Math. l'IHÉS 1985, 62, 41–144.
7. Cuntz, J.; Quillen, D. Algebra extensions and nonsingularity. J. Am. Math. Soc. 1995, 8, 251–289.
8. Ginzburg, V. Lectures on Noncommutative Geometry. arXiv 2005, arXiv:math/0506603.
9. Tacchella, A. An introduction to associative geometry with applications to integrable systems. J. Geom. Phys. 2017, 118, 202–233.
10. Barbaresco, F. Souriau-Casimir Lie Groups Thermodynamics and Machine Learning. In Geometric Structures of Statistical Physics, Information Geometry, and Learning, SPIGL'20, Les Houches, France, 27–31 July 2021; Springer: Cham, Switzerland, 2021; pp. 53–83.
11. Cohen, T.; Welling, M. Group Equivariant Convolutional Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; Volume 48, pp. 2990–2999.
12. Cohen, T.; Geiger, M.; Weiler, M. A General Theory of Equivariant CNNs on Homogeneous Spaces. arXiv 2019, arXiv:1811.02017.
13. Cohen, T.; Geiger, M.; Koehler, J.; Welling, M. Spherical CNNs. In Proceedings of the ICLR, Vancouver, BC, Canada, 30 April–3 May 2018.
14. Cohen, T.; Weiler, M.; Kicanaoglu, B.; Welling, M. Gauge Equivariant Convolutional Networks and the Icosahedral CNN. In Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA, 10–15 June 2019.
15. Cheng, M.; Anagiannis, V.; Weiler, M.; de Haan, P.; Cohen, T.; Welling, M. Covariance in Physics and Convolutional Neural Networks. arXiv 2019, arXiv:1906.02481.
16. de Haan, P.; Cohen, T.; Welling, M. Natural Graph Networks. arXiv 2020, arXiv:2007.08349.
17. King, A. Moduli of representations of finite-dimensional algebras. Q. J. Math. Oxf. Ser. 1994, 45, 515–530.
18. Armenta, M.A.; Jodoin, P.M. The Representation Theory of Neural Networks. arXiv 2020, arXiv:2007.12213.
19. Jeffreys, G.; Lau, S.-C. Kähler Geometry of Quiver Varieties and Machine Learning. arXiv 2021, arXiv:2101.11487.
20. Ganev, I.; Walters, R. The QR decomposition for radial neural networks. arXiv 2021, arXiv:2107.02550.
21. Reineke, M. Framed quiver moduli, cohomology, and quantum groups. J. Algebra 2008, 320, 94–115.
22. Nakajima, H. Instantons on ALE spaces, quiver varieties, and Kac-Moody algebras. Duke Math. J. 1994, 76, 365–416.
23. Nakajima, H. Quiver varieties and finite-dimensional representations of quantum affine algebras. J. Am. Math. Soc. 2001, 14, 145–238.
24. Nickel, M.; Kiela, D. Poincaré Embeddings for Learning Hierarchical Representations. In Proceedings of the NIPS, Long Beach, CA, USA, 4–9 December 2017.
25. Ganea, O.E.; Bécigneul, G.; Hofmann, T. Hyperbolic Entailment Cones for Learning Hierarchical Embeddings. arXiv 2018, arXiv:1804.01882.
26. Sala, F.; Sa, C.D.; Gu, A.; Ré, C. Representation Tradeoffs for Hyperbolic Embeddings. Proc. Mach. Learn. Res. 2018, 80, 4460–4469.
27. Ganea, O.E.; Bécigneul, G.; Hofmann, T. Hyperbolic Neural Networks. arXiv 2018, arXiv:1805.09112.
Figure 1. An example of the backpropagation algorithm for the network in Figure 2 when s = 2.
Figure 2. A simple neural network with one hidden layer with s-many neurons. Neural networks in applications are typically much more complicated, but in nature are still quiver representations equipped with activation functions.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
