Proceeding Paper

On Foundational Physics †

by John Skilling 1,‡ and Kevin H. Knuth 2,*,‡

1 Maximum Entropy Data Consultants, V93 H7VW Kenmare, Ireland
2 Department of Physics, University at Albany (SUNY), Albany, NY 12222, USA
* Author to whom correspondence should be addressed.
† Presented at the 41st International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Paris, France, 18–22 July 2022.
‡ These authors contributed equally to this work.
Phys. Sci. Forum 2022, 5(1), 40; https://doi.org/10.3390/psf2022005040
Published: 3 January 2023

Abstract

As physicists, we wish to make mental models of the world around us. For such models to be useful, we need to classify features of the world into symbols and develop a rational calculus for their manipulation. In seeking maximal generality, we aim for minimal restrictive assumptions. The inquiry starts by developing basic arithmetic and proceeds to the formalism of quantum theory and relativity.

1. Simple Mathematics

We seek a calculus that will allow us to manipulate objects of discourse (electrons, elephants, playing cards, …, or even ideas such as beliefs). We will represent the objects with symbols, on which the calculus will operate.

1.1. Combining in Parallel: Addition

Consider a collection of playing cards, for simplicity all from the same pack, so individually identifiable as needed. Cards can be collected into hands, which can themselves be collected into bigger hands. For our initial purpose, the identity and order of the cards do not matter within a hand. In other words, the hands can be shuffled. In formal terms, we call this associative commutativity.
A with B = B with A        (commutative)
A with (B with C) = (A with B) with C        (associative)
(Assumptions are boxed.) Mathematicians will recognize this particular example of “with” as the union of disjoint sets, but we prefer to use informal language, which emphasizes symmetries over objects. From those symmetries, we proceed to develop basic equations of physics.
Obviously, we can represent this with numbers a, b, c (scalars) representing A, B, C, and + representing “with”.

Quantity is represented by a scalar.

That enables us to get started by bringing single properties into mathematical representation. What gets added up, after other content has been abstracted away, is the number (formally, the cardinality): say a = 3, b = 2, and c = 1 for three hands of cards. This “sum rule”
a + b = b + a
a + (b + c) = (a + b) + c
is not difficult: young children understand it, and generalization from integers to real numbers is immediate. Less obviously, associative commutativity forces this basic addition [1,2,3]. Technically, we must also assume closure (cards with cards = cards, which is really just a common-sense definition of what a set is) and extensibility (if there is a limit, such as 52 in this example, we never reach it, so it does not matter). There happens to be no alternative to + other than recoding the numbers, formally known as isomorphism: encode x as y = f(x) if you want, but compensate by summing f⁻¹(y) rather than y. Mathematicians call additive stuff measures.
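The isomorphism freedom can be sketched numerically: any invertible recoding f merely relabels quantities, and the combination rule it induces decodes back to ordinary addition. In this illustrative sketch, f(x) = x³ is an arbitrary hypothetical encoding, not anything prescribed by the text.

```python
# Sketch: an invertible recoding f of quantities induces a combination
# y1 (+) y2 = f(f_inv(y1) + f_inv(y2)) that is still commutative and
# associative, i.e. isomorphic to ordinary +.  f(x) = x**3 is a toy choice.

def f(x):        # encode a quantity
    return x ** 3

def f_inv(y):    # decode back to the quantity
    return y ** (1 / 3) if y >= 0 else -((-y) ** (1 / 3))

def combine(y1, y2):
    """'Addition' acting directly on encoded values."""
    return f(f_inv(y1) + f_inv(y2))

a, b, c = f(3), f(2), f(1)                       # encoded quantities 3, 2, 1
assert abs(combine(a, b) - combine(b, a)) < 1e-9                           # commutative
assert abs(combine(a, combine(b, c)) - combine(combine(a, b), c)) < 1e-9  # associative
assert abs(f_inv(combine(a, b)) - (3 + 2)) < 1e-9                          # decodes to the sum
```

The symmetries survive the recoding, which is exactly why such alternatives carry no new content: they are ordinary addition in disguise.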
In science, we learn about the world and model it by decomposing it into nominally independent parts, which can be analyzed in isolation. To accommodate the advantageous and practical concept of independent systems, quantification must obey the sum rule (or be isomorphic to it). Adopting non-additive quantification precludes the possibility of considering systems that are independent, which needlessly impairs a quantitative physical theory.

1.2. Combination in Series: Multiplication

We may also want to transmit stuff, possibly with modification. This is how we deal with identifiable operations. We want this to be distributive so that whatever is transmitted adds up properly at the other end to preserve shuffling invariance.
A then (B with C) = (A then B) with (A then C)        (left distributive)
(A with B) then C = (A then C) with (B then C)        (right distributive)
In the context of scalar representation, this is the foundation of multiplication. If leftward modification by A is to leave addition intact, then the operation “A then” must be linear, and similarly for “then C” on the right. Hence, “then” is multiplication “·”. It follows that the transmission operation “X then Y” must be represented by k · x · y where k is some constant that can, without loss of generality, be scaled to 1. This is the Product Rule. In symbols,
a · (b + c) = a · b + a · c
(a + b) · c = a · c + b · c
We would also wish to have associativity
A then (B then C) = (A then B) then C        (associative)
so that a chained sequence of operations could be arbitrarily decomposed into subsequences. Combining things in series, sequencing, possesses the symmetries of associative distributivity. However, in the scalar context the representation a · ( b · c ) = ( a · b ) · c of associative “then” holds automatically, so there is no need for the extra assumption.

2. Probability, Information, and Geometry

Probability Pr(X | T) is to be a quantification of some restricted set X of possibilities within some surrounding “provisional truth” set T. Memberships of a set can be shuffled without changing the set, and successive contractions T ⊇ X ⊇ Y can be sequenced. With associative commutativity and distributivity assured, the standard sum and product rules of probability calculus necessarily follow [3]
Pr(A or B) = Pr(A) + Pr(B)        (A and B disjoint)
Pr(A and B) = Pr(A) Pr(B | A)
(all conditional on T)
with overlapping summation Pr(A or B) + Pr(A and B) = Pr(A) + Pr(B) an immediate consequence. Probabilities can be variously interpreted, but they are normalized distributions and the rules are firm. The range is 0 ≤ Pr ≤ 1, and any derogation would imply conflict with the founding symmetries of shuffling and sequencing.
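These rules can be checked directly on a finite example. The following sketch is our own illustration, taking a standard 52-card deck as the provisional truth T and verifying the overlapping summation and the product rule with exact fractions.

```python
from fractions import Fraction

# A 52-card deck as the "provisional truth" T.
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

def pr(event):
    """Pr(event | T) as an exact fraction."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

heart = lambda c: c[1] == "H"
face  = lambda c: c[0] >= 11          # Jack, Queen, King

# Overlapping summation: Pr(A or B) + Pr(A and B) = Pr(A) + Pr(B)
pr_or  = pr(lambda c: heart(c) or face(c))
pr_and = pr(lambda c: heart(c) and face(c))
assert pr_or + pr_and == pr(heart) + pr(face)

# Product rule: Pr(A and B) = Pr(A) * Pr(B | A)
hearts = [c for c in deck if heart(c)]
pr_face_given_heart = Fraction(sum(1 for c in hearts if face(c)), len(hearts))
assert pr_and == pr(heart) * pr_face_given_heart
```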

2.1. Information

Typically, in inference, a prior distribution q is modified to a posterior p by incorporating new data. We may then seek to quantify such q-to-p change as the information H(p, q) supplied by the data, either as an abstract quantity in its own right or in order to select a minimally disturbed destination p. It is easy to check that
$$H(p,q)=\sum_i p_i\log\frac{p_i}{q_i}\qquad(4)$$
is additive under direct-product combination p″ = p × p′, q″ = q × q′ of normalized and independent assignments (q, q′)-to-(p, p′). It can also be proved [3] that (4) is the only additive form in which the arguments can be probabilities, so the form of H is uniquely specified.
Note that H is asymmetric in its arguments. That must be so, because it is easy for new data to annihilate a possibility whereas, once annihilated, it cannot be recovered. That is, qᵢ > 0 can induce pᵢ = 0 without difficulty, but qᵢ = 0 cannot give rise to pᵢ > 0, the symptom being that H would be infinite.
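Both properties, additivity under independence and asymmetry of the arguments, are easy to verify numerically. A minimal sketch, with arbitrarily chosen example distributions:

```python
import numpy as np

def H(p, q):
    """Information H(p, q) = sum_i p_i log(p_i / q_i)  (Eq. 4)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0          # 0 log 0 = 0; q_i = 0 with p_i > 0 would be infinite
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Two independent prior-to-posterior updates (example numbers, our own choice)
p1, q1 = [0.7, 0.3], [0.5, 0.5]
p2, q2 = [0.2, 0.3, 0.5], [1/3, 1/3, 1/3]

# Direct products of the independent distributions
p12 = np.outer(p1, p2).ravel()
q12 = np.outer(q1, q2).ravel()

assert abs(H(p12, q12) - (H(p1, q1) + H(p2, q2))) < 1e-12   # additive
assert abs(H(p1, q1) - H(q1, p1)) > 1e-6                    # asymmetric in its arguments
```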

2.2. Geometry

Probability distributions p = (p₁, p₂, …, pₙ) on n cells cover the unit simplex Σᵢ pᵢ = 1 continuously, on which there is wide freedom to impose a local metric (ds)² = Σᵢⱼ gᵢⱼ dpᵢ dpⱼ integrating to macroscopic distances s(p, q). However, no such distance can agree with H, because H is asymmetric whereas distances are symmetric by definition. Arguably, the least damaging distance comes from using the curvature of H to define distance elements as
$$(ds)^2=\sum_i\frac{(dp_i)^2}{p_i}\approx 2\,H(p,\,p-dp)$$
This Fisher information metric gᵢⱼ = δᵢⱼ/pᵢ integrates to macroscopic distances
$$s(p,q)=2\arccos\Big(\sum_i\sqrt{p_i\,q_i}\Big)$$
varying between 0 and π . This is known as information geometry [4,5].
However, this s (indeed any s) conflicts with H, which was forced by independence. The local curvature is the same, but the quantity is different. That means that geometry necessarily introduces dependence between supposedly independent systems. Information geometry is mathematics, but it is not science.
In science, we wish to relate distributions that are substantially different, but in order to isolate systems for analysis, we need to be able to assume independence. Without independence, we do not have probability. Without probability, the derivation of information falls. Users can either have probability distributions (with independence represented by coordinate-wise direct product multiplication) or they can have geometrical distances between distributions (which may make sense as mathematics, but disobey independence), but they can not have both.
Geometry does have a place in physics, where particles follow geometrical trajectories. However, those trajectories are defined by Hamiltonians (energy) not information (entropy). In elementary quantum applications, the Fisher metric on quantum states p = | ψ | 2 reproduces the spherical symmetry of wavefunction ψ on its Hilbert sphere, so conflict is not expected there, but that limited special case does not justify wider application.

2.3. Counter-Examples

The central symptom of failure for science is that, as applied to normalized distributions, geometry leaks across dimensions. Take an example: problem A goes from q = (½, ½) to p = (1, 0). There being only one degree of freedom in the 2-simplex, the geodesic path is necessarily parameterized by (a, 1−a) for ½ ≤ a ≤ 1, with distance s = π/2. Problem B again goes from q′ = (½, ½) to p′ = (1, 0), but on different cells. Its geodesic is similarly parameterized by (b, 1−b) for ½ ≤ b ≤ 1, with distance s = π/2. However, when A and B are treated as a joint problem A × B going from q″ = q × q′ = (¼, ¼, ¼, ¼) to p″ = p × p′ = (1, 0, 0, 0), geometry cannot, and does not, keep the supposedly independent dimensions apart. Instead, with three cells out of four the same at start and end, the geodesic follows (1−3c, c, c, c) for ¼ ≥ c ≥ 0, with distance s = 2π/3.
Information H is of course properly additive (here log 2 + log 2 = log 4 ); however, distances are not. Information is arbitrarily additive as independent systems are included without limit, whereas distances are not (and cannot be because they are bounded by π ).
H(A×B) = log 4 = H(A) + H(B) = log 2 + log 2        (independent ✓)
s(A×B) = 2π/3 ≠ s(A) + s(B) = π/2 + π/2        (connected ✗)
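The arithmetic of this counter-example can be reproduced in a few lines. The sketch below (our own check) evaluates the Fisher distance s(p,q) = 2 arccos Σ√(pᵢqᵢ) and the information H for problems A, B, and A × B:

```python
import numpy as np

def s(p, q):
    """Fisher information distance s(p,q) = 2 arccos(sum_i sqrt(p_i q_i))."""
    return 2 * np.arccos(np.sum(np.sqrt(np.asarray(p) * np.asarray(q))))

def H(p, q):
    """Information H(p,q) = sum p_i log(p_i/q_i), with 0 log 0 = 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

qA, pA = [0.5, 0.5], [1.0, 0.0]          # problem A (and identically B)
qAB = np.outer(qA, qA).ravel()           # joint prior (1/4, 1/4, 1/4, 1/4)
pAB = np.outer(pA, pA).ravel()           # joint posterior (1, 0, 0, 0)

assert np.isclose(s(pA, qA), np.pi / 2)          # each problem alone: pi/2
assert np.isclose(s(pAB, qAB), 2 * np.pi / 3)    # joint: 2*pi/3, not pi/2 + pi/2
assert np.isclose(H(pAB, qAB), 2 * H(pA, qA))    # information stays additive (log 4)
```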
Geodesic paths, integrating to lengths, are basic to geometry and define areas and volumes. The Fisher metric induces invariant volumes with local density ρ = ∏ᵢ pᵢ^(−1/2). However, this form depends on the cellular decomposition, whereas macroscopic applications which do not depend on (arbitrarily fine) decomposition are ubiquitous in science. Statisticians call probability distributions on arbitrary decompositions “processes”, but distributions incorporating the Fisher density lack this property. For a start, coarse-graining two cells into one, P = p₁ + p₂, yields a density Pr(P) which is constant, not of the P^(−1/2) Fisher form. Again, we see information geometry (if useful at all) being locked to a particular dimension.
Linear constraints, which reduce dimensionality, also destroy the original metric. For example, the constraint p₁ = p₃ on a 3-cell application p₁ + p₂ + p₃ = 1 with Fisher density ρ ∝ (p₁ p₂ p₃)^(−1/2) leads to Pr(p₂) ∝ (1 − p₂)^(−1) p₂^(−1/2), dominated by the singularity at p₂ = 1. Observing p₁ = p₃ seems to imply p₁ = p₃ → 0, which in science would usually be a very dubious inference. Inferences derived through geometrical formulation could be correct by “chance” or for special applications, but there can be no general guarantee.

3. Simple Physics: Quantification and Uncertainty

Physics is more subtle than elementary mathematics. In mathematics, quantities are idealized with potentially unlimited precision. Yet, in a physical world lacking probes of unlimited delicacy, our knowledge of quantities is necessarily accompanied by uncertainty. Consequently, physics requires a calculus of number pairs, not just scalars for quantity alone.
(Quantity & uncertainty) is represented by a number pair.        (pair postulate)
Note that we aim to work with a pair of numbers, quantity and uncertainty. This minimalist abstraction is deeper than (hence preferable to) the familiar x ± σ , which summarizes a probability density function Pr ( x ) by supplying the first two moments while leaving higher moments (which need not exist) undefined. We will still require shuffling and sequencing. Shuffling can be thought of as proto-space and sequencing as proto-time, without which our intellectual endeavors fail.

3.1. Addition and the Sum Rule

Our description of physical objects still needs to obey shuffling just as before
A with B = B with A        (commutative)
A with (B with C) = (A with B) with C        (associative)
but the representation should be by number pairs instead of simple scalars. Moreover, the applications are now focused on basic physics. Obviously, component-wise addition of pairs
$$\begin{pmatrix}a_1\\a_2\end{pmatrix}+\begin{pmatrix}b_1\\b_2\end{pmatrix}=\begin{pmatrix}a_1+b_1\\a_2+b_2\end{pmatrix}$$
obeys shuffling because it is just two copies of scalar addition so it could represent “with”. Less obviously, this basic addition is forced: there happens to be no alternative (up to isomorphism). The proof of that—in any dimension as it happens—follows that for scalars [6].

3.2. Multiplication and the Product Rules

Our description should still be distributive and also associative (sequencing).
A then (B with C) = (A then B) with (A then C)        (left distributive)
(A with B) then C = (A then C) with (B then C)        (right distributive)
A then (B then C) = (A then B) then C        (associative)
For scalars, representation of associativity was automatic. With pairs, multiplication is more complicated and associativity needs to be assumed.
As before, distributivity implies linearity, so the representation of “then” needs to be a multiplication that is linear in both factors. However, the factors are now number pairs so that general bilinear multiplication has eight terms, each with some coefficient φ .
$$(a\cdot b)_i=\sum_{j,k}\varphi_{ijk}\,a_j b_k,\qquad\text{index range is }\{1,2\}$$
Associativity a · ( b · c ) = ( a · b ) · c then imposes
$$\sum_k\varphi_{spk}\,\varphi_{kqr}=\sum_k\varphi_{skr}\,\varphi_{kpq}\qquad\forall\,p,q,r,s$$
These 16 quadratic equations restrict the eight coefficients φ significantly, but not completely. Quadratics have multiple solutions and the equations allow (up to isomorphism) three different product rules [6] for pairs. We call these pair-valued product rules A, B and C.
$$\begin{pmatrix}a_1\\a_2\end{pmatrix}\cdot\begin{pmatrix}b_1\\b_2\end{pmatrix}=\underbrace{\begin{pmatrix}a_1b_1-a_2b_2\\a_1b_2+a_2b_1\end{pmatrix}}_{\text{A}}\ \text{or}\ \underbrace{\begin{pmatrix}a_1b_1+a_2b_2\\a_1b_2+a_2b_1\end{pmatrix}}_{\text{B}}\ \text{or}\ \underbrace{\begin{pmatrix}a_1b_1\\a_1b_2+a_2b_1\end{pmatrix}}_{\text{C}}$$
There are also degenerate scalar–pair and scalar–scalar rules a · b = a₁b or b₁a or a₁b₁, so that ordinary scalar multiplication is included in this coherent foundation. There can be no conflict with probabilistic inference, because the founding symmetries of shuffling and sequencing are the same.
Each rule can be cast as a matrix operator
$$\begin{pmatrix}a_1\\a_2\end{pmatrix}\cdot\;=\;\underbrace{\begin{pmatrix}a_1&-a_2\\a_2&a_1\end{pmatrix}}_{\text{A}}\ \text{or}\ \underbrace{\begin{pmatrix}a_1&a_2\\a_2&a_1\end{pmatrix}}_{\text{B}}\ \text{or}\ \underbrace{\begin{pmatrix}a_1&0\\a_2&a_1\end{pmatrix}}_{\text{C}}\qquad(13)$$
From these three rules follows the rich subtlety of physics—all following from sequencing and shuffling of quantity-and-uncertainty according to the pair postulate.
Note that usage of all three pair–pair rules A, B, and C is not an assumption. Prohibition of a rule would be an assumption, amounting to a new restrictive law of physics. Acceptance of the rules is not. Everything is allowed that is not prohibited by some extra law.
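The three product rules can be written out and their defining symmetries checked mechanically. In this sketch (our own illustration), rule A is recognizably complex multiplication, rule B split-complex multiplication, and rule C dual-number multiplication; all three are distributive over component-wise addition and associative:

```python
# The three admissible pair-valued products (rules A, B, C), written directly
# from their component forms.

def mul_A(a, b):   # complex multiplication
    return (a[0]*b[0] - a[1]*b[1], a[0]*b[1] + a[1]*b[0])

def mul_B(a, b):   # split-complex multiplication
    return (a[0]*b[0] + a[1]*b[1], a[0]*b[1] + a[1]*b[0])

def mul_C(a, b):   # dual-number (nilpotent) multiplication
    return (a[0]*b[0], a[0]*b[1] + a[1]*b[0])

def add(a, b):     # component-wise pair addition (the forced sum rule)
    return (a[0] + b[0], a[1] + b[1])

# Dyadic test values so that floating-point arithmetic is exact
x, y, z = (1.5, -2.0), (0.5, 3.0), (-1.0, 0.25)
for mul in (mul_A, mul_B, mul_C):
    assert mul(x, add(y, z)) == add(mul(x, y), mul(x, z))   # left distributive
    assert mul(add(x, y), z) == add(mul(x, z), mul(y, z))   # right distributive
    assert mul(x, mul(y, z)) == mul(mul(x, y), z)           # associative (assumed for pairs)
```

The square of the second unit (0, 1) distinguishes the rules: it is (−1, 0) under A, (+1, 0) under B, and (0, 0) under C.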

4. Complex Numbers and the Born Rule

We have the three product rules, A, B, and C, as matrix operators (13). Among these, pick out those of unit determinant | det ( a · ) | = 1 whose repeated operation sends operands overall neither towards infinity nor towards zero. We identify these to have unit quantity. They can be rewritten with just one free parameter θ , which we call phase.
$$a(\theta)\cdot\;=\;\underbrace{\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{pmatrix}}_{\text{A}}\ \text{or}\ \underbrace{\begin{pmatrix}\cosh\theta&\sinh\theta\\\sinh\theta&\cosh\theta\end{pmatrix}}_{\text{B}}\ \text{or}\ \underbrace{\begin{pmatrix}1&0\\\theta&1\end{pmatrix}}_{\text{C}}\qquad(14)$$
We are aiming to represent “quantity & uncertainty”, so somehow we need to express the uncertainty around fixed unit quantity as some probability distribution Pr ( θ ) over this phase.

4.1. Phase

Phase is additive, a(θ)·a(ϕ)· = a(θ+ϕ)·, for each of the rules, and conforms to commutative associativity (shuffling). Hence, any numerical assignment needs to be linear in the range of the phase interval(s), consequently of uniform density, so that Pr(θ) = constant. For rule A, where phase is 2π-periodic, this means
$$\Pr(\theta)=\frac{1}{2\pi},\qquad\theta\in[0,2\pi)\qquad(15)$$
For rules B and C, the range of θ is infinite, (−∞, ∞), which admits no meaningful uniform probability distribution: uniform density over an infinite range would assign zero probability to any finite interval, which is rightfully rejected as “improper”. So it is rule A alone that enables meaningful identification of uncertainty with phase, along with unit quantity being unit modulus.
In rule A, we recognize complex arithmetic (a = a₁ + i a₂, with a(θ) generated as e^{iθ} for unit quantity). This is why quantum theory uses complex numbers. There is no alternative and no mystery. Unit objects are sampled as ψ = e^{iθ} for uniformly random θ. Since θ can be arbitrary, the sum rule then fills out the complex plane continuously. Continuity is then automatic.
Every unknown phase represents an independent object. More precisely, phases unknown to us on the basis of our current information represent what appear to us to be independent objects. Somebody else (God, perhaps) might know better, but we cannot take advantage of knowledge that we do not possess. We represent ψ computationally as an ensemble { ψ } θ of random phases.

4.2. Amplitude

With phase being uniformly distributed, general quantity will be identified with some phase-independent scalar function f(|a|) of modulus |a|, with f(1) = 1 from unit quantity. Suppose we have two independent objects, sampled as x e^{iθ} and y e^{iϕ}, where x and y are moduli (positive real). Their quantities will be f(x) and f(y). The sum rule dictates that the combination is represented by ψ = x e^{iθ} + y e^{iϕ}.
If we knew θ and ϕ, then we would assign f(|ψ|) as the quantity. However, we do not know θ and ϕ for independent objects, and the rules of probability instruct us to average over what we do not know in order to estimate a result. So, quantity is the average ⟨⟨f(|ψ|)⟩⟩_{θ,ϕ} over the uniformly distributed phases, where double angle brackets denote the statistical mean.
On the other hand, shuffling invariance requires the quantity of independent objects to be additive, so in this case the quantity is f ( x ) + f ( y ) .
$$\Big\langle\Big\langle f\big(|x\,e^{i\theta}+y\,e^{i\phi}|\big)\Big\rangle\Big\rangle_{\theta,\phi}=f(x)+f(y)$$
The unique solution of this functional equation is f(u) = u². In general, then,
$$\text{Quantity}(\psi)=|\psi|^2\qquad(18)$$
which is the Born rule. There is no restriction in the calculus to unit quantity and the calculus is not requiring normalization of objects as commonly assumed in quantum formalism.
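The functional equation picking out f(u) = u² can be probed by Monte Carlo. This sketch (our own check, with arbitrarily chosen moduli x and y) confirms that the phase-averaged modulus squared is additive while, for comparison, the bare modulus is not:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = 1.3, 0.4                      # moduli of two independent unit-phase objects

# Unknown phases: independent and uniform on [0, 2*pi)
theta, phi = rng.uniform(0, 2 * np.pi, (2, 1_000_000))
psi = x * np.exp(1j * theta) + y * np.exp(1j * phi)   # sum rule combines the objects

# <<|psi|^2>> over the unknown phases reproduces additivity f(x) + f(y)
assert abs(np.mean(np.abs(psi) ** 2) - (x**2 + y**2)) < 1e-2
# whereas a rival choice such as f(u) = u fails to be additive
assert abs(np.mean(np.abs(psi)) - (x + y)) > 0.1
```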
If an object has several different properties, they can be stacked together in a vector, which can be arbitrarily scaled and rotated by the sum rule to reach
$$\text{Quantification}(\psi)=\langle\psi|X|\psi\rangle$$
where X is a general Hermitian matrix. Here we employ the Dirac braket notation, which denotes object states with the ket |ψ⟩, whose Hermitian conjugate is the bra ⟨ψ|. Eigenvector k of X is associated with property k, with the associated real eigenvalue representing the corresponding responsivity. Hermitian matrices represent how objects can be observed and hence manipulated. This extension is standard.

5. The Pauli Matrices

Quantification-with-uncertainty is represented by complex numbers, but the symmetries of shuffling and sequencing still apply to vectors ( ψ ( 1 ) , ψ ( 2 ) , ψ ( 3 ) , ) of them. In particular, for pairs of complex numbers, we still have the two-dimensional sum rule and the three two-dimensional product rules A, B, and C with generators (19). The original domain of A, B, and C was real, but their derivation [6] holds equally in the complex domain, which we now consider.
We use matrix exponentials to write the rules (14) in generator form
$$a(\theta)\cdot\;=\;\exp\theta\underbrace{\begin{pmatrix}0&-1\\1&0\end{pmatrix}}_{G_{\text{A}}}\ \text{or}\ \exp\theta\underbrace{\begin{pmatrix}0&1\\1&0\end{pmatrix}}_{G_{\text{B}}}\ \text{or}\ \exp\theta\underbrace{\begin{pmatrix}0&0\\1&0\end{pmatrix}}_{G_{\text{C}}}\qquad(19)$$
Generator C is not Hermitian, so it cannot (yet) be interpreted as an observable operator that could be used for physical manipulation. However, generator B is already Hermitian, and generator A can be made so by multiplying by √−1. Their product yields a third “Pauli matrix”, which completes a Hermitian matrix representation of the Pauli group.
$$\sigma_0=\begin{pmatrix}1&0\\0&1\end{pmatrix}=\mathbb{1},\quad\sigma_x=\begin{pmatrix}0&1\\1&0\end{pmatrix}=G_{\text{B}},\quad\sigma_y=\begin{pmatrix}0&-j\\j&0\end{pmatrix}=j\,G_{\text{A}},\quad\sigma_z=\begin{pmatrix}1&0\\0&-1\end{pmatrix}=-j\,\sigma_x\sigma_y\qquad(20)$$
Suffixes x, y, z anticipate later identification with spatial axes, and we have used the engineering symbol j for this √−1 because the parity of coordinates x, y, z (which is what j will become) need not be the same as the earlier i = √−1, which defines the phases of objects placed in those coordinates (and will come to define the arrow of time). Different quantities should have different symbols.
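The algebra of these matrices can be confirmed directly. A short sketch (our own check, using Python's built-in complex unit to play the role of j):

```python
import numpy as np

j = 1j   # the coordinate sqrt(-1), distinct in role from the object phase i

sigma_0 = np.eye(2, dtype=complex)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)   # generator B
sigma_y = np.array([[0, -j], [j, 0]])                 # j * generator A
sigma_z = -j * (sigma_x @ sigma_y)                    # product of the other two

assert np.allclose(sigma_z, np.diag([1, -1]))                 # the familiar sigma_z
for s in (sigma_x, sigma_y, sigma_z):
    assert np.allclose(s.conj().T, s)                         # Hermitian: observable
    assert np.allclose(s @ s, sigma_0)                        # unit square
assert np.allclose(sigma_x @ sigma_y + sigma_y @ sigma_x, 0)  # anticommutation
```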

6. Spinors

Take a pair of complex numbers, representing an object that can exist in either or both of two states, perhaps ↑ and ↓, with no eventual loss of generality because multiple properties can be built up from just two.
$$\psi=\begin{pmatrix}\psi_\uparrow\\\psi_\downarrow\end{pmatrix}=\begin{pmatrix}\psi_0+i\psi_1\\\psi_2+i\psi_3\end{pmatrix}\qquad(21)$$
Such pairs of complex numbers are called spinors. The Pauli observables are
$$q=\begin{pmatrix}q_0\\q_x\\q_y\\q_z\end{pmatrix},\qquad\begin{aligned}q_0&=\langle\psi|\sigma_0|\psi\rangle=\bar\psi_\uparrow\psi_\uparrow+\bar\psi_\downarrow\psi_\downarrow\\q_x&=\langle\psi|\sigma_x|\psi\rangle=\bar\psi_\uparrow\psi_\downarrow+\bar\psi_\downarrow\psi_\uparrow\\q_y&=\langle\psi|\sigma_y|\psi\rangle=j\,(\bar\psi_\downarrow\psi_\uparrow-\bar\psi_\uparrow\psi_\downarrow)\\q_z&=\langle\psi|\sigma_z|\psi\rangle=\bar\psi_\uparrow\psi_\uparrow-\bar\psi_\downarrow\psi_\downarrow\end{aligned}\qquad(22)$$
These scalar quadruplets obey the “null identity”
$$q_0^2-q_x^2-q_y^2-q_z^2=0\qquad(23)$$
Being invariant, that means that different views of a spinor (or, equivalently, a single observer’s view of a changing spinor) must be related through the Minkowski metric
$$g=\begin{pmatrix}1&0&0&0\\0&-1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}\qquad(24)$$
Minkowski defines the bilinear scalar product
$$u\cdot v=u_0v_0-u_xv_x-u_yv_y-u_zv_z=u_0v_0-\mathbf{u}\cdot\mathbf{v}\qquad(25)$$
and generates covariant dual vectors
$$q_\mu=(q_0,\,-q_x,\,-q_y,\,-q_z)$$
from contravariant vectors
$$q^\mu=(q_0,\,q_x,\,q_y,\,q_z).$$
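The null identity, and hence the Minkowski structure, holds for any spinor whatsoever, which a random numerical check makes vivid. A minimal sketch (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Pauli quadruplet sigma_mu = (sigma_0, sigma_x, sigma_y, sigma_z)
sigma = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def quadruplet(psi):
    """q_mu = <psi|sigma_mu|psi> for a 2-component spinor psi."""
    return np.array([np.real(psi.conj() @ s @ psi) for s in sigma])

def minkowski(u, v):
    """Minkowski scalar product u.v = u0 v0 - ux vx - uy vy - uz vz."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]

psi = rng.normal(size=2) + 1j * rng.normal(size=2)   # arbitrary spinor
q = quadruplet(psi)
assert abs(minkowski(q, q)) < 1e-9                   # null identity: q . q = 0
```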

Spinor Transform

Because of the null identity, there are only three independent ways of manipulating a spinor, not four. Accordingly, any small manipulation can be expressed as an operator having x , y , z components only
$$d\Lambda=\tfrac12\big(d\lambda_x\sigma_x+d\lambda_y\sigma_y+d\lambda_z\sigma_z\big)\equiv\tfrac12\,d\boldsymbol\lambda\cdot\boldsymbol\sigma$$
(the ½ is mere convention) defined by three coefficients dλₖ = dξₖ + i dηₖ, which may be complex, giving six real control parameters. Evaluation shows that spinor quadruplets (22) and their duals transform under dΛ as
$$d\begin{pmatrix}q^0\\\mathbf q\end{pmatrix}=\begin{pmatrix}0&d\boldsymbol\xi\cdot\\ d\boldsymbol\xi&-ij\,d\boldsymbol\eta\times\end{pmatrix}\begin{pmatrix}q^0\\\mathbf q\end{pmatrix}=dS\,q\qquad(29)$$
$$d\begin{pmatrix}q_0\\\mathbf q\end{pmatrix}=\begin{pmatrix}0&-d\boldsymbol\xi\cdot\\ -d\boldsymbol\xi&-ij\,d\boldsymbol\eta\times\end{pmatrix}\begin{pmatrix}q_0\\\mathbf q\end{pmatrix}=d\bar S\,q\qquad(30)$$
where · and × are the standard three-dimensional scalar and vector products. The imaginary coefficients η clearly represent a three-dimensional rotation, while the real parts ξ represent boost, later to be associated with motion. Scalar products (25), u · v , are invariant so that we may legitimately call spinor quadruplets Lorentz 4-vectors.
However, these changes depend on ij, which could be either −1 or +1 depending on whether or not i was selected to be the same √−1 as j. Either choice would be a restriction that would exclude the other, with observable consequences, including an inability to depart from null quadruplets. We find no justification for such a choice, and our strategy for science is to allow everything that is not prohibited by a justified restriction. That means using both signs for ij.

7. Bispinors

To use both signs of ij, it is conventional (and simpler) to fix the coordinates (with j defining σ) and duplicate the contents (with i defining ψ). Hence, a spinor ψ involving +i is to be accompanied by a conjugate spinor ϕ involving −i. So, we need to represent the fundamental object, not with a single spinor as in (21), but with a bispinor
$$\tilde\Psi=\begin{pmatrix}\psi\\\phi\end{pmatrix}=\begin{pmatrix}\psi_\uparrow\\\psi_\downarrow\\\phi_\uparrow\\\phi_\downarrow\end{pmatrix}=\begin{pmatrix}\psi_0+i\psi_1\\\psi_2+i\psi_3\\\phi_0-i\phi_1\\\phi_2-i\phi_3\end{pmatrix}\quad\begin{matrix}\text{right-hand chirality}\\[2pt]\text{left-hand chirality}\end{matrix}\qquad(31)$$
Whereas j denoted right-or-left parity, i denotes chirality, which is not the same but (confusingly) is also termed right or left. However, j sets the axes and i sets what is placed in them, which is different. The signs +i and −i in (31) are exchangeable, because exchange simply swaps the artificial chirality labels; we call this ambiconjugacy.

7.1. Bispinor Transform

To apply a Lorentz transform to a bispinor, the two chiral component spinors ψ and ϕ each need to transform, so that if ψ transforms through λ·σ, then ϕ transforms through the conjugate λ̄·σ. (Conjugation is on λ, which operates on the i in Ψ̃, but not on j, which would inappropriately reverse the coordinate σ_y.) However, the ambiconjugacy relationship is involutory (self-inverse), and this allows a minus sign. In fact, a minus sign after the style of (30) is required.
$$d\tilde\Psi=d\Lambda\,\tilde\Psi,\qquad d\Lambda=\frac12\begin{pmatrix}d\boldsymbol\lambda\cdot\boldsymbol\sigma&0\\0&-d\bar{\boldsymbol\lambda}\cdot\boldsymbol\sigma\end{pmatrix}\qquad(32)$$
While right-hand ψ transforms by contravariant dλ as in (29), left-hand ϕ transforms by −dλ̄. That has the same effect on observables (29) as deconjugating by unreversing i, so the net effect for fixed j is that ϕ transforms by covariant dλ. Right and left components transform oppositely, through dψ = dS ψ and dϕ = dS̄ ϕ. Consequently,
$$\psi_\uparrow\phi_\uparrow+\psi_\downarrow\phi_\downarrow\ \text{ is a Lorentz invariant.}$$
This linkage between right and left chirality is the key to physics, underlying time and space.

7.2. Bispinor Observables

In order to build observables on bispinors as a whole, we put the left-hand (covariant) quadruplet ℓ, built from ϕ, on the same (contravariant) footing as the right-hand quadruplet r, built from ψ, by adopting ℓ → gℓ. We can then set sum and difference 4-vectors
$$p=r+\ell:\quad\begin{aligned}p_0&=r_0+\ell_0,&\qquad s_0&=r_0-\ell_0\\p_x&=r_x-\ell_x,&s_x&=r_x+\ell_x\\p_y&=r_y-\ell_y,&s_y&=r_y+\ell_y\\p_z&=r_z-\ell_z,&s_z&=r_z+\ell_z\end{aligned}\quad:\;s=r-\ell$$
whose components are accessed, with q = ⟨Ψ̃|Q|Ψ̃⟩ as a template, through block-diagonal Hermitian unitary operators
$$\begin{aligned}p_0&=\begin{pmatrix}\mathbb 1&0\\0&\mathbb 1\end{pmatrix},&\quad p_x&=\begin{pmatrix}\sigma_x&0\\0&-\sigma_x\end{pmatrix},&\quad p_y&=\begin{pmatrix}\sigma_y&0\\0&-\sigma_y\end{pmatrix},&\quad p_z&=\begin{pmatrix}\sigma_z&0\\0&-\sigma_z\end{pmatrix}\\s_0&=\begin{pmatrix}\mathbb 1&0\\0&-\mathbb 1\end{pmatrix},&s_x&=\begin{pmatrix}\sigma_x&0\\0&\sigma_x\end{pmatrix},&s_y&=\begin{pmatrix}\sigma_y&0\\0&\sigma_y\end{pmatrix},&s_z&=\begin{pmatrix}\sigma_z&0\\0&\sigma_z\end{pmatrix}\end{aligned}$$
With r and being null, the bispinor observables obey
$$p\cdot p=+m^2,\qquad s\cdot s=-m^2,\qquad p\cdot s=0$$
where m = |ψ↑ϕ↑ + ψ↓ϕ↓| is the Lorentz invariant. Physics is no longer restricted to null 4-vectors, and m becomes known as proper mass, while p is 4-momentum and s is spin. Without the minus sign in (32), ψ and ϕ would have transformed the same, making the chiral components observationally disconnected. Physics would have remained restricted to null 4-vectors, negating our strategy of generality.
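These observables can be checked numerically by building r and ℓ from random spinors. A sketch (our own illustration, with ℓ regauged by g as described in the text):

```python
import numpy as np

rng = np.random.default_rng(2)

sigma = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def quadruplet(spinor):
    """q_mu = <spinor|sigma_mu|spinor>, a null 4-vector for any 2-spinor."""
    return np.array([np.real(spinor.conj() @ s @ spinor) for s in sigma])

def dot(u, v):   # Minkowski scalar product
    return u[0] * v[0] - u[1:] @ v[1:]

psi = rng.normal(size=2) + 1j * rng.normal(size=2)   # right-hand spinor
phi = rng.normal(size=2) + 1j * rng.normal(size=2)   # left-hand spinor
r, ell = quadruplet(psi), quadruplet(phi)
ell = np.array([ell[0], -ell[1], -ell[2], -ell[3]])  # regauge: ell -> g.ell

p, s = r + ell, r - ell                               # 4-momentum and spin
assert abs(dot(r, r)) < 1e-9 and abs(dot(ell, ell)) < 1e-9  # both chiral parts null
assert np.isclose(dot(p, p), -dot(s, s))              # p.p = +m^2, s.s = -m^2
assert abs(dot(p, s)) < 1e-9                          # p.s = 0
```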

8. Review

With the basic calculus of bispinors in place, development proceeds to operators and the Dirac matrices. Then, product rule C generates equations in which we recognize time and space, not directly but through differentials ∂/∂t and (∂/∂x, ∂/∂y, ∂/∂z), which lead to the Schrödinger representation and the Dirac equation. All these standard equations of physics follow from the same simple shuffling and sequencing that underlie measures, probability, and information.
We see why complex numbers underlie physics. (14, rule A)
We see why uncertainty is represented by unknown phase. (15)
We see why quantification uses the modulus-squared Born rule. (18)
We see why observables are modelled by Hermitian matrices. (19)
We see why the Pauli matrices are required. (20)
We see why a bispinor representation is needed. (31, chirality)
We see why we have Lorentz transforms. (29)
We see why there are three dimensions of space. (24)
We see why the metric is locally Minkowski. (24)
We see why there is a limiting speed (of light). (23, null identity)
Spacetime is seen to be a property of quantum theory, not a container for it.
These deep foundations of physics are an inevitable consequence of accepting uncertainty, and arise from pure thought alone with no need for further assumptions beyond shuffling and sequencing. Further development of the calculus of interactions is expected to lead to forces and to quantization through measurement.

Author Contributions

The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Aczél, J. Lectures on Functional Equations and Their Applications; Academic Press: New York, NY, USA, 1966. [Google Scholar]
  2. Aczél, J. The associativity equation re-revisited. In Proceedings of the 23rd International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, Jackson Hole, WY, USA, 3–8 August 2003; Erickson, G.J., Zhai, Y., Eds.; AIP Conf. Proc. 707. AIP: New York, NY, USA, 2004; pp. 195–203. [Google Scholar] [CrossRef]
  3. Knuth, K.H.; Skilling, J. Foundations of inference. Axioms 2012, 1, 38–73. [Google Scholar] [CrossRef] [Green Version]
  4. Amari, S.I. Differential-Geometrical Methods in Statistics; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1985; Volume 28. [Google Scholar]
  5. Rao, C.R. Differential metrics in probability spaces. Differ. Geom. Stat. Inference 1987, 10, 217–240. [Google Scholar]
  6. Skilling, J.; Knuth, K.H. The symmetrical foundation of Measure, Probability and Quantum theories. Ann. Phys. 2019, 531, 1800057. [Google Scholar] [CrossRef]
