# Wave-function collapse versus Euler’s formula*

# Wave-function parametrization

### The wave-function is a parametrization of any probability measure

### Quantum Mechanics versus a non-commutative generalization of probability theory

### Symmetries and unitary representations

### Deterministic transformations

### Euler’s formula for the probability clock

### The Stern-Gerlach experiment

### Black hole information paradox and the Stern-Gerlach experiment

### Euler’s formula for a phase-space with 4 states

### Euler’s formula for a generic phase-space

### Complex and Quaternionic Hilbert spaces

### Comparing the time evolution with a stochastic process

### Time translation is a stochastic process if and only if it is deterministic

### Symmetries as irreversible processes

### Quantum Mechanics is EPR-complete

### Any deterministic theory compatible with relativistic Quantum Mechanics necessarily respects relativistic causality

### A deterministic theory compatible with relativistic Quantum Mechanics

### The Young’s double slit experiment

### Do the Bell inequalities hold?

### Conditioned probability and constrained systems

### A translation-invariant time-evolution and (classical) statistical field theory

Definition of a class of statistical field theories where the probability distribution of the infinite-dimensional phase-space is mathematically defined by a wave-function in a Fock-space, allowing Hamiltonians which are non-polynomial in the fields.

Published on Nov 05, 2021


Measure spaces (in particular, algebraic measure theory as defined below) have many applications in classical mechanics and dynamical systems [1], when non-linear equations and/or complex phase spaces become an obstacle.

Any measure space can be defined as a probability space: a probability space is a measure space where a probability density was chosen, which is any measurable function normalized such that its measure is 1. In the historical (that is, Kolmogorov’s) probability theory, a probability space has three parts: a phase space (which is the set of possible states of a system); a complete Boolean algebra of events (where each event is a subset of the set of possible states); and a probability measure which assigns a probability to each event. The probability is a map from complex random events to abstract random events, shifting all the ambiguity in the notion of randomness onto the abstract random events, as follows: saying that the probability of an event is $0.32$ means that our event is as likely as finding a treasure that we know is hidden in the sand of a 1 km wide beach when we search for it with a metal detector only in a 320 m interval. While this treasure hunt is ambiguous (are there any clues to the location of the treasure? etc.), the map from our complex events to this treasure hunt is unambiguous.

On the other hand, a *standard* measure space is isomorphic, up to sets of null measure, to the real Lebesgue measure on the unit interval, to a discrete (finite or countable) measure space, or to a mixture of the two. Thus, topological notions such as dimension do not apply to standard measure spaces. Most probability spaces with real-world applications are *standard* measure spaces. Equivalently, a *standard* measure space can be defined such that the following correspondence holds: every commutative von Neumann algebra on a separable real Hilbert space is isomorphic to $L^\infty(X,\mu)$ for some standard measure space $(X,\mu)$ and conversely, for every standard measure space $(X,\mu)$ the corresponding Lebesgue space $L^\infty(X,\mu)$ is a von Neumann algebra. As expected, the representation of an algebra of events in a real Hilbert space uses projection-valued measures [2][3][4][5]. A projection-valued measure assigns a self-adjoint projection operator of a real Hilbert space to each event, in such a way that the Boolean algebra of events is represented by the commutative von Neumann algebra. Thus, intersection/union of events is represented by products/sums of projections, respectively. The state of the ensemble is a linear functional which assigns a probability to each projection. Thus, there is an algebraic measure theory [1][6][7][8] based on commutative von Neumann algebras on a separable Hilbert space which is essentially equivalent to measure theory (for standard measure spaces).

To be sure, the algebraic measure theory is based on *commutative* algebras, thus it is not a non-commutative generalization of probability or information theory1. That is, there is no need for a conceptual or foundational revolution such as qubits replacing bits when switching from the historical to the algebraic probability theory [6]. Moreover, this is a common procedure in mathematics, as illustrated in the following quote [9] (note that a probability measure is related to integration):

The fundamental notions of calculus, namely differentiation and integration, are often viewed as being the quintessential concepts in mathematical analysis, as their standard definitions involve the concept of a limit. However, it is possible to capture most of the essence of these notions by purely algebraic means (almost completely avoiding the use of limits, Riemann sums, and similar devices), which turns out to be useful when trying to generalize these concepts [...] (T. Tao, 2013 [9])

The relation of the algebraic measure theory with probability theory is the following: let $p(x|y)$ be a conditional probability density between two *standard* measure spaces $X, Y$ (possibly with continuous and discrete parts), then $p(x|y)p(y)=p(y)T^2(x,y)=p(y|x)p(x)$ for any probability density $p(y)$ and some bounded operator $T$ on the separable Hilbert space. From the condition $\int dx\ p(y|x)p(x)=p(y)$, we conclude that $T$ is an isometry. But since the Hilbert space is separable, there is an orthonormal discrete basis and we can build a unitary operator $U$ through the Gram-Schmidt process such that $TT^\dagger U=T$. We can then enlarge the discrete part of the phase-space $Y$ to include the indices corresponding to the elements of the basis that were missing, setting $p(y)=0$ for the new indices. Thus $p(x|y)p(y)=p(y)U^2(x,y)=p(y|x)p(x)$ and so any conditioned probability density can be represented by a unitary operator on the separable Hilbert space. Conversely, any operator on the separable Hilbert space also corresponds to a conditioned probability density. In the particular case where the *standard* measure space $Y$ has just one element, we get the result that the wave-function is a parametrization of any probability measure.

The linearity of the commutative algebra, the absence of a phase space fixed *a priori*, and the fact that we can map complex random phenomena to an abstract random process unambiguously are obvious advantages of algebraic measure theory when we want to compare probability theory with Quantum Mechanics, where: the linearity of the canonical transformations is guaranteed by Wigner’s theorem (it is essentially a consequence of Born’s rule applied to a non-commutative algebra of operators [10][11][12]); the Hilbert space of wave-functions replaces the phase space; and the canonical transformations are non-deterministic.

The algebraic measure theory is also different from defining the phase space as a reproducing kernel Hilbert space [13][14], since no phase space (whether it is a Hilbert space or not) is defined *a priori*. Note that defining the phase space as a Sobolev Hilbert space is common in classical field theory[15], but defining a general probability measure in such space is still an open problem.

Quantum Mechanics leaves room for a non-commutative generalization of probability theory, since the wave-function could also assign a probability to non-diagonal projections; these non-diagonal projections would generate a non-commutative algebra [16].

Consider for instance the projection $P_X$ to a region of space $X$ and a projection $U P_p U^\dagger$ to a region of momentum $p$, where $P_X$ and $P_p$ are diagonal in the same basis. The projections $P_X$ and $U P_p U^\dagger$ are related by a Fourier transform $U$ and thus are diagonal in different bases and do not commute (they are complementary observables). Since we can choose to measure position or momentum, it seems that Quantum Mechanics is a non-commutative generalization of probability theory [16].

But due to the wave-function collapse, Quantum Mechanics is not a non-commutative generalization of probability theory despite the appearances: the measurement of the momentum is only possible if a physical transformation of the statistical ensemble also occurs, as we show in the following.

Suppose that $E(P_X)$ is the probability that the system is in the region of space $X$, for the state of the ensemble $E$ diagonal (i.e. verifying $E(O)=0$ for operators $O$ with null diagonal). Using the notation of [17], we have $E(A)=\mathrm{tr}(\rho A)$ where $A$ is any operator and $\rho$ is a self-adjoint operator with $\mathrm{tr}(\rho)=1$ and $\rho=\sum_{X} P_X \rho P_X$, because $\rho$ is diagonal in the same basis where the projection operators $P_X$ are diagonal. If an operator $O$ has null diagonal in the same basis where $P_X$ is diagonal, then $P_X O P_X=0$ for any $X$ and then $E(O)=\sum_{X} \mathrm{tr}(P_X \rho P_X O)=0$.

If we consider a unitary transformation $U$ on the ensemble, then after the wave-function collapse we have a new ensemble with state $E_U$ given by:

$\begin{aligned}
E_U(A)=&\mathrm{tr}(\rho_U U A U^\dagger)\\
\rho_U=&\sum_{p} U P_p U^\dagger \rho U P_p U^\dagger\end{aligned}$(1)

If an operator $O$ has null diagonal in the same basis where $P_p$ is diagonal, then $P_p O P_p=0$ for any $p$ and then:

$\begin{aligned}
E_U(O)=&\sum_{p} \mathrm{tr}(P_p U^\dagger \rho U P_p O)=0\end{aligned}$(2)

If an operator $D$ is diagonal in the same basis where $P_p$ is diagonal, then $D=\sum_{p} P_p D P_p$ and then:

$\begin{aligned}
E_U(D)&=\sum_{p} \mathrm{tr}(P_p U^\dagger \rho U P_p D)=\\
&=\mathrm{tr}(\rho U D U^\dagger)=E(U D U^\dagger)\end{aligned}$(3)

Thus, we define:

$\begin{aligned}
E_U(D)&=E(U D U^\dagger)\\
E_U(O)&=0\end{aligned}$(4)

Where $D$ is a diagonal operator and $O$ is an operator with null diagonal. The equation ([eq:collapse]) is due to the wave-function collapse. Thus $E_U(P_p)=E(U P_p U^\dagger)$ is the probability that the system is in the region of momentum $p$, for the state of the ensemble $E_U$. But the ensembles $E$ and $E_U$ are different: there is a physical transformation relating them.

Without collapse, we would have $E_U(O)=E(U O U^\dagger)\neq 0$ for operators $O$ with null-diagonal and we could talk about a common state of the ensemble $E$ assigning probabilities to a non-commutative algebra. But the collapse keeps Quantum Mechanics as a standard probability theory, even when complementary observables are considered. We could argue that the collapse plays a key role in the consistency of the theory, as we will see below.
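The role of the collapse in Equations 1 to 4 can be checked numerically in the smallest case. The following is only a sketch under stated assumptions: a 2-state real Hilbert space, a rotation chosen as $U$, and an arbitrary diagonal $\rho$; the helper names (`collapsed_state`, `expectation_after`) are hypothetical and not taken from the references.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def collapsed_state(rho, u):
    """rho_U = sum_p (U P_p U^T) rho (U P_p U^T), Equation 1 in the real case."""
    n = len(rho)
    ut = transpose(u)
    total = [[0.0] * n for _ in range(n)]
    for p in range(n):
        proj = [[1.0 if i == j == p else 0.0 for j in range(n)] for i in range(n)]
        upu = matmul(matmul(u, proj), ut)
        term = matmul(matmul(upu, rho), upu)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def expectation_after(rho, u, a):
    """E_U(A) = tr(rho_U U A U^T), Equation 1."""
    uau = matmul(matmul(u, a), transpose(u))
    return trace(matmul(collapsed_state(rho, u), uau))

# Assumed example data (not from the source): a diagonal ensemble and a rotation.
rho = [[0.8, 0.0], [0.0, 0.2]]
U = rotation(0.3)
D = [[1.0, 0.0], [0.0, 0.0]]  # diagonal operator
O = [[0.0, 1.0], [1.0, 0.0]]  # operator with null diagonal

e_u_d = expectation_after(rho, U, D)                               # E_U(D)
e_udu = trace(matmul(rho, matmul(matmul(U, D), transpose(U))))      # E(U D U^T)
e_u_o = expectation_after(rho, U, O)                               # E_U(O), with collapse
no_collapse_o = trace(matmul(rho, matmul(matmul(U, O), transpose(U))))  # E(U O U^T)

print(e_u_d, e_udu, e_u_o, no_collapse_o)
```

In this example the first two numbers agree and the third vanishes up to rounding, while the last one (the value without collapse) does not vanish.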

At first sight, our result that the wave-function is merely a parametrization of any probability measure resembles Gleason’s theorem [18][19]. However, there is a key difference: we are dealing with commuting projections and consequently with the wave-function, while Gleason’s theorem says that any probability measure for all *non-commuting* projections defined in a Hilbert space (with dimension $\geq 3$) can be parametrized by a density matrix. Note that a density matrix includes mixed states, and thus it is more general than a pure state which is represented by a wave-function.

We can check the difference in the 2-dimensional real case. Our result is that there is always a wave-function $\Psi$ such that $\Psi^2(1)=\cos^2(\theta)$ and $\Psi^2(2)=\sin^2(\theta)$ for any $\theta$.

However, if we consider non-commuting projections and a diagonal constant density matrix $\rho=\frac{1}{2}$, then we have:

$\begin{aligned}
\begin{cases}
\mathrm{tr}(\rho \left[\begin{smallmatrix} 1 & 0\\ 0 & 0 \end{smallmatrix}\right])=\frac{1}{2}\\
\mathrm{tr}(\rho \frac{1}{2}\left[\begin{smallmatrix} 1 & 1\\ 1 & 1 \end{smallmatrix}\right])=\frac{1}{2}
\end{cases}\end{aligned}$(5)

Our result implies that there is a pure state, such that:

$\begin{aligned}
\mathrm{tr}(\rho \left[\begin{smallmatrix} 1 & 0\\ 0 & 0 \end{smallmatrix}\right])=\frac{1}{2}\end{aligned}$(6)

(e.g. $\rho=\frac{1}{2}\left[\begin{smallmatrix} 1 & 1\\ 1 & 1 \end{smallmatrix}\right]$)

And there is another possibly different pure state, such that:

$\begin{aligned}
\mathrm{tr}(\rho \frac{1}{2}\left[\begin{smallmatrix} 1 & 1\\ 1 & 1 \end{smallmatrix}\right])=\frac{1}{2}\end{aligned}$(7)

(e.g. $\rho=\left[\begin{smallmatrix} 1 & 0\\ 0 & 0 \end{smallmatrix}\right]$)

But there is no $\rho$ which is a pure state, such that:

$\begin{aligned}
\begin{cases}
\mathrm{tr}(\rho \left[\begin{smallmatrix} 1 & 0\\ 0 & 0 \end{smallmatrix}\right])=\frac{1}{2}\\
\mathrm{tr}(\rho \frac{1}{2}\left[\begin{smallmatrix} 1 & 1\\ 1 & 1 \end{smallmatrix}\right])=\frac{1}{2}
\end{cases}\end{aligned}$(8)

On the other hand, Gleason’s theorem implies that there is a $\rho$ which is a mixed state, such that:

$\begin{aligned}
\begin{cases}
\mathrm{tr}(\rho \left[\begin{smallmatrix} 1 & 0\\ 0 & 0 \end{smallmatrix}\right])=\frac{1}{2}\\
\mathrm{tr}(\rho \frac{1}{2}\left[\begin{smallmatrix} 1 & 1\\ 1 & 1 \end{smallmatrix}\right])=\frac{1}{2}
\end{cases}\end{aligned}$(9)

Gleason’s theorem is relevant if we neglect the wave-function collapse, since it attaches a unique density matrix to non-commuting operators. However, the wave-function collapse affects the density matrix differently when different non-commuting operators are considered, so that after measurement the density matrix is no longer unique. In contrast, without the wave-function collapse, the wave-function parametrization of a probability measure would not be possible.
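The 2-dimensional real comparison above can be checked with a short scan. This is a sketch, assuming real pure states $\Psi=(\cos\theta,\sin\theta)$ sampled on a grid of angles; the function names are hypothetical.

```python
import math

def prob_first(theta):
    """tr(rho P_1) for the pure state rho = psi psi^T, psi = (cos theta, sin theta)."""
    return math.cos(theta) ** 2

def prob_plus(theta):
    """tr(rho P_+) with P_+ = (1/2) [[1, 1], [1, 1]]."""
    return 0.5 * (math.cos(theta) + math.sin(theta)) ** 2

def trace_product(a, b):
    """tr(a b) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

# Smallest achievable worst-case deviation from (1/2, 1/2) over real pure states.
steps = 3600
worst = min(
    max(abs(prob_first(th) - 0.5), abs(prob_plus(th) - 0.5))
    for th in (2 * math.pi * k / steps for k in range(steps))
)

# The mixed state rho = 1/2 satisfies both conditions at once.
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]
P1 = [[1.0, 0.0], [0.0, 0.0]]
P_plus = [[0.5, 0.5], [0.5, 0.5]]
mixed_first = trace_product(rho_mixed, P1)
mixed_plus = trace_product(rho_mixed, P_plus)

print(worst, mixed_first, mixed_plus)
```

The deviations are $|\cos(2\theta)|/2$ and $|\sin(2\theta)|/2$, which cannot vanish together, so `worst` stays well away from zero, while the mixed state gives $1/2$ for both projections.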

Another difference is that our result applies to standard probability theory, while Gleason’s theorem applies to a non-commutative generalization of probability theory.

A dynamical system can be classified by all possible transformations that can occur in the phase space at each evolution step. When these transformations are a function of a group, then the classification of the dynamical system becomes independent from the evolution stage (usually called the *time*). The group/transformations are then called the symmetry group/transformations (also called canonical transformations). Note that the function relating the group to the phase space does not need to conserve the group action because in a dynamical system all possible transformations that can occur at each evolution step do not need to form a group at all.

In the case of Quantum Mechanics, all possible transformations that can occur in the phase space at each evolution step are given by conditional probability densities between two standard measure spaces. There is a surjective function from the group of linear and unitary operators on a separable Hilbert space to all such conditional probability densities. Thus there is not necessarily a group action of a symmetry group on the probability density (for the state of a system) itself. We address in Section 12 when such action on the probability distribution exists and when it does not exist.

Crucially, the symmetry transformations include all the deterministic transformations, which will be defined in the following. Thus the symmetry transformations are a generalization of the deterministic transformations.

A deterministic transformation acts as $E(P_A)\to E(P_B)$ where $A,B$ are events and $P_A$ is a projection operator, for any expectation functional $E$ and event $A$. When the probability is concentrated in the neighborhood of a single outcome (say $A$), we have effectively a deterministic case and this transformation ($A\to B$) conserves the determinism, thus it is a deterministic transformation.

Note that above, $P_A$ and $P_B$ necessarily commute. On the other hand, if the transformation is such that $E(P_A)\to E(U P_A U^\dagger)$ where $U$ is a unitary operator and $P_A$ and $U P_A U^\dagger$ do not commute, then the transformation cannot be deterministic. Consider, for instance, the discrete case with $E(P_n)$ given by $\mathrm{tr}(P_m P_n)=\delta_{mn}$ up to a normalization factor. Then $\mathrm{tr}(P_mP_n)\to \mathrm{tr}(P_mU P_n U^\dagger)=U^2_{nm}$. If the transformation were deterministic, then necessarily $U^2_{nm}=\delta_{kn}$ for some $k=f(n)$ dependent on $n$, and so $U P_n U^\dagger=P_{l}$ with $l=f^{-1}(n)$ would commute with $P_n$.

We conclude that a transformation $U$ is deterministic if and only if $P_A$ and $U P_A U^\dagger$ commute for all events $A$. Thus, the complementarity of two observables (e.g. position and momentum) is due to the random nature of the symmetry transformation relating the two observables. This clarifies that probability theory has no trouble in dealing with non-commuting observables, as long as the collapse of the wave-function occurs. Note that Quantum Mechanics is not a generalization of probability theory, but it is definitely a generalization of classical mechanics since it involves non-deterministic symmetry transformations. For instance, the time evolution may be non-deterministic unlike in classical mechanics.
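The commutation criterion can be illustrated on two small examples. The sketch below is an illustration rather than a proof; it assumes real orthogonal matrices as the unitary operators, and the two example matrices (a permutation of 3 states and a 45-degree rotation of 2 states) are chosen here for illustration only.

```python
import itertools
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def event_projector(event, size):
    """Diagonal projection P_A for an event A given as a set of state indices."""
    return [[1.0 if i == j and i in event else 0.0 for j in range(size)]
            for i in range(size)]

def commutes_for_all_events(u, tol=1e-12):
    """True if P_A and U P_A U^T commute for every non-empty event A."""
    size = len(u)
    ut = transpose(u)
    for r in range(1, size + 1):
        for event in itertools.combinations(range(size), r):
            p = event_projector(event, size)
            q = matmul(matmul(u, p), ut)
            pq, qp = matmul(p, q), matmul(q, p)
            if any(abs(pq[i][j] - qp[i][j]) > tol
                   for i in range(size) for j in range(size)):
                return False
    return True

def transition_probabilities(u):
    """Squared matrix elements, i.e. the probabilities tr(P_m U P_n U^T)."""
    return [[u[m][n] ** 2 for n in range(len(u))] for m in range(len(u))]

def is_zero_one(matrix, tol=1e-12):
    """True if every entry is 0 or 1, i.e. each state has a definite image."""
    return all(min(abs(x), abs(x - 1.0)) <= tol for row in matrix for x in row)

permutation = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
c, s = math.cos(math.pi / 4), math.sin(math.pi / 4)
rotation = [[c, -s], [s, c]]

for name, u in (("permutation", permutation), ("rotation", rotation)):
    print(name, commutes_for_all_events(u), is_zero_one(transition_probabilities(u)))
```

For the permutation both checks succeed (a deterministic relabeling of states); for the rotation both fail, matching the non-deterministic case.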

The previous sections established that the ensemble interpretation is self-consistent. However, the ensemble interpretation does not address the question why the wave-function plays a central role in the calculation of the probability distribution, unlike most other interpretations of quantum mechanics. By being compatible with most (if not all) interpretations of Quantum Mechanics, the ensemble interpretation is in practice a common denominator of most interpretations of Quantum Mechanics. It is useful, but it is not enough.

In this and in the following sections we will show that the wave-function is nothing else than one possible parametrization of any probability distribution. The wave-function can be described as a multi-dimensional generalization of Euler’s formula, and its collapse as a generalization of taking the real part of Euler’s formula. The wave-function plays a central role because it is a good parametrization that allows us to represent a group of transformations using linear transformations of the hypersphere. It is precisely the fact that the hypersphere is not the phase-space of the theory that implies the collapse of the wave-function. Without collapse, the wave-function parametrization would be inconsistent.

Suppose that we have an oscillatory motion of a ball, with position $x=\cos(t)$ and we want to make a translation in time2, $\cos(t)\to \cos(t+a)$. This is a non-linear transformation. However, if we consider not only the position but also the velocity of the ball, we have the “wave-function” given by the Euler’s formula $q(t)=e^{it}$ and $x$ is the real part of $q$. Then, a translation is represented by a rotation $q(t+a)=e^{ia} q(t)$. To know $x$ after the translation, we need to take the real part of the wave-function $e^{ia} q(t)$, *after* applying the translation operator.
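As a minimal sketch of this step, the translation can be applied as a multiplication on $q$ and the position recovered afterwards by taking the real part. The numerical values of $t$ and $a$ are arbitrary choices for illustration.

```python
import cmath
import math

def translate(q, a):
    """Time translation acting linearly on q(t) = exp(i t)."""
    return cmath.exp(1j * a) * q

t, a = 0.7, 0.4               # arbitrary example values
q = cmath.exp(1j * t)         # position and velocity packed in one complex number
x_before = q.real             # cos(t)
x_after = translate(q, a).real  # real part taken *after* the translation

print(x_before, x_after, math.cos(t + a))
```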

Of course, $\cos(t)$ can be negative and so it cannot itself be a probability. However, we can easily apply Euler’s formula to a probability clock. A probability clock [@NOOM] is a time-varying probability distribution for a phase-space with 2 states, such that the probabilities are $\cos^2(t)$ and $\sin^2(t)$, for the first and second states respectively.

A 2-dimensional real wave-function allows us to apply the Euler’s formula to the probability clock:

$\begin{aligned}
\Psi(t)=\exp\left(\left[\begin{smallmatrix} 0 & -1\\ 1 & 0 \end{smallmatrix}\right] t\right)\left[\begin{smallmatrix} 1\\ 0 \end{smallmatrix}\right]=\left[\begin{smallmatrix} \cos(t)\\ \sin(t) \end{smallmatrix}\right]\end{aligned}$(10)

The Euler’s formula for the density matrix is:

$\begin{aligned}
\Psi \Psi^\dagger=\left[\begin{smallmatrix} \cos^2(t) & \cos(t)\sin(t)\\ \cos(t)\sin(t) & \sin^2(t) \end{smallmatrix}\right]=\frac{1}{2}+ \left[\begin{smallmatrix} \frac{1}{2} & 0\\ 0 & -\frac{1}{2} \end{smallmatrix}\right](\cos(2t) +J\sin(2t))\end{aligned}$(11)

Where $J=\left[\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\right]$ plays the role of the imaginary unit in the Euler’s formula for the probability clock. A measurement using a diagonal projection triggers the collapse of the wave-function, such that a new density matrix is obtained by setting the off-diagonal part (i.e. the part proportional to $J$) of the original density matrix to zero. The probability distribution is given by the diagonal part of the density matrix, i.e. by taking the “real part” of the “complex number” $\cos(2t) +J\sin(2t)$:

$\begin{aligned}
\mathrm{diag}(\Psi\Psi^\dagger)=\left[\begin{smallmatrix} \cos^2(t) & 0\\ 0 & \sin^2(t) \end{smallmatrix}\right]=\frac{1}{2}+\left[\begin{smallmatrix} \frac{1}{2} & 0\\ 0 & -\frac{1}{2} \end{smallmatrix}\right]\cos(2t)\end{aligned}$(12)

Since $\cos^2(t)+\sin^2(t)=1$ and $0\leq\cos^2(t)\leq 1$, we can confirm that the wave-function parametrizes all probability distributions for a phase-space with 2 states, i.e. for any probability $p$ there is an angle $t$ such that $\cos^2(t)=p$. Moreover, two wave-functions are always related by a rotation $\Psi(t+a)=\exp\left(-J a\right)\Psi(t)$, for some $a$ (the sign follows from Equation 10, whose generator is $-J$ for $J$ as defined above).
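The probability clock of Equations 10 to 12 can be verified numerically. The following sketch uses hypothetical helper names and arbitrary sample values; the rotation is written with the generator that appears in Equation 10.

```python
import math

def clock_wave_function(t):
    """Psi(t) of Equation 10."""
    return [math.cos(t), math.sin(t)]

def density_matrix(psi):
    return [[psi[i] * psi[j] for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def euler_form(t):
    """Right-hand side of Equation 11: 1/2 + Z (cos(2t) + J sin(2t))."""
    z = [[0.5, 0.0], [0.0, -0.5]]
    j = [[0.0, 1.0], [-1.0, 0.0]]
    phase = [[math.cos(2 * t) * (1.0 if r == c else 0.0) + math.sin(2 * t) * j[r][c]
              for c in range(2)] for r in range(2)]
    zp = matmul(z, phase)
    return [[(0.5 if r == c else 0.0) + zp[r][c] for c in range(2)] for r in range(2)]

def rotate(psi, a):
    """exp(G a) psi with G = [[0, -1], [1, 0]], the generator in Equation 10."""
    c, s = math.cos(a), math.sin(a)
    return [c * psi[0] - s * psi[1], s * psi[0] + c * psi[1]]

def angle_for_probability(p):
    """An angle t with cos^2(t) = p, for 0 <= p <= 1."""
    return math.acos(math.sqrt(p))

t, a = 0.3, 1.1                      # arbitrary sample values
lhs = density_matrix(clock_wave_function(t))
rhs = euler_form(t)
rotated = rotate(clock_wave_function(t), a)
shifted = clock_wave_function(t + a)
t_for_p = angle_for_probability(0.32)

print(lhs, rhs, rotated, shifted, math.cos(t_for_p) ** 2)
```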

Note that the rotation is an invertible linear transformation that preserves the space of wave-functions. This does not happen with probability distributions: the most general linear transformation of a probability distribution that preserves the space of probability distributions is:

$\begin{aligned}
M(a,b)=\left[\begin{smallmatrix} \cos^2(a) & \cos^2(b)\\ \sin^2(a) & \sin^2(b)\end{smallmatrix}\right]\ \ \ \ \mathrm{(where\ a,b\ are\ real\ numbers)}\end{aligned}$(13)

because if we apply $M$ to a deterministic distribution $\left[\begin{smallmatrix} 1\\ 0 \end{smallmatrix}\right]$ or $\left[\begin{smallmatrix} 0\\ 1 \end{smallmatrix}\right]$ we must obtain probability distributions, which leads to the constraints $\cos^2(a)+\sin^2(a)=\cos^2(b)+\sin^2(b)=1$ and $\cos^2(a),\sin^2(a),\cos^2(b),\sin^2(b)\geq 0$. In particular, the matrix $M$ such that:

$\begin{aligned}
M\frac{1}{2}\left[\begin{smallmatrix} 1 \\ 1 \end{smallmatrix}\right]=\left[\begin{smallmatrix} 1\\ 0 \end{smallmatrix}\right]\end{aligned}$(14)

is necessarily singular and so it is not suitable to represent a symmetry group.
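A short check of this claim: Equation 14 requires $\frac{1}{2}(\cos^2(a)+\cos^2(b))=1$, which forces $\cos^2(a)=\cos^2(b)=1$. The sketch below (hypothetical function names) builds that matrix and compares its determinant with that of a rotation.

```python
import math

def stochastic_matrix(a, b):
    """M(a, b) of Equation 13."""
    return [[math.cos(a) ** 2, math.cos(b) ** 2],
            [math.sin(a) ** 2, math.sin(b) ** 2]]

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# The only solution of Equation 14 has cos^2(a) = cos^2(b) = 1, e.g. a = b = 0.
M = stochastic_matrix(0.0, 0.0)
image = apply(M, [0.5, 0.5])
det_M = det(M)

# For comparison, a rotation of the wave-function is always invertible.
t = 0.9  # arbitrary angle
R = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
det_R = det(R)

print(image, det_M, det_R)
```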

The wave-function is thus a good parametrization which allows us to represent a group of transformations using linear transformations of the points of a circle. The collapse of the wave-function is nothing more than taking the real part of a complex number as in most applications of Euler’s formula in Engineering, reflecting the fact that the circle is not the phase-space of the theory. Thus the wave-function is nothing more than a parametrization of the probability distribution.

We follow reference [@sakurai] for the description of the Stern-Gerlach experiment, first carried out in Frankfurt by O. Stern and W. Gerlach in 1922. This experiment makes a strong case for generalizing the symmetry transformations to be non-deterministic; moreover, the theoretical predictions only require a phase-space with two states, like the one already discussed in the previous section. Note that we only make measurements along the z and x-axis; if we had also made measurements along the y-axis, then the phase space would require four states or a parametrization with a complex wave-function, see Section 10. Some articles, such as reference [@sgquantum], argue that a “full quantum” analysis of the Stern-Gerlach experiment must involve the position degrees of freedom and thus a phase-space with more than two states. But, as in every theoretical model of a real experiment, we should consider a phase-space only as large as strictly necessary to compute all predictions for all practical purposes, and not waste time with redundant calculations which only add complexity and increase the likelihood of mistakes. Of course, the real Stern-Gerlach experiment involves much more than two states: for instance, if the electrical power feeding the experiment is shut down due to an earthquake, or if the person running the experiment has a heart attack, the experimental results will be affected. Still, all predictions for all practical purposes can be computed using a phase-space with only two degrees of freedom.

In the Stern-Gerlach experiment, a beam of silver atoms is sent through a magnetic field with a gradient along the z or x-axis and their deflection is observed. The results show that the silver atoms possess an intrinsic angular momentum (spin) that takes only one of two possible values (here represented by the symbols + and -). Moreover, in sequential Stern-Gerlach experiments (see figure 1), the measurement of the spin along the z-axis destroys the information about an atom’s spin along the x-axis.

We consider in the phase-space, not only the spin of one atom of the beam, but also the angle of orientation of a macroscopic object which serves as a reference, for pedagogical purposes. The corresponding complete wave-function is thus a reducible representation of the rotation group. When we apply a rotation to the phase-space, the rotation is a non-deterministic transformation of the spin of the atom and a deterministic transformation of the macroscopic object. Thus, to keep track of the part of the wave-function corresponding to the angle of orientation of the reference macroscopic object we only need the central value of the probability distribution for such angle, which we will call simply “the angle” for brevity. And then we only consider the part of the wave-function corresponding to the spin of the atom.

In Equation 12, $\cos^2(t)$ is the probability for the spin to be in the state $+$, while $\sin^2(t)$ is the probability for the spin to be in the state $-$. The non-deterministic symmetry transformation given by a rotation of the spin along the $x$-$z$ plane is parametrized by the parameter $t$ and its linear representation on the wave-function is described in Equation 10.

In the first measurement, the angle of the reference macroscopic object is 0 with respect to the z-axis; and we know for sure that the spin is in the state $+$ ($t=0$) because we are measuring the spin along the z-axis of atoms that were previously filtered to be in the state $+$ when measuring the spin along the z-axis (see the first graph in figure 1).

A second sequential measurement along the x-axis means that we rotate the reference macroscopic object 90 degrees along the x-z plane so the new angle is 90 degrees; for the atom we first make a 45-degrees rotation along the x-z plane ($t=\pi/4$)3 and then we determine whether the spin is in the $+$ or $-$ state (i.e. the wave-function collapses, see the second graph in figure 1). The probability for the spin to be in the states $+/-$ is now $50\%/50\%$, because the rotation is a non-deterministic symmetry transformation.

A third sequential measurement along the z-axis means that we rotate the reference macroscopic object -90 degrees along the x-z plane so the new angle is again 0 degrees; for the atom we first apply a -45-degrees rotation along the x-z plane ($t=-\pi/4$) to the atoms with spin $+$ and then we determine whether the spin is in the $+$ or $-$ state (i.e. the wave-function collapses one more time, see the third graph in figure 1). Despite that in the first measurement the spin was in the state $+$, the probability for the spin to be in the states $+/-$ is $50\%/50\%$ in the third measurement, because the rotation is a non-deterministic symmetry transformation and we applied it in the second and third measurements to switch from the z to the x-axis and then to switch again from the x to the z-axis.
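The three sequential measurements can be reproduced with the 2-state real wave-function of Equation 10. This is a sketch under the assumptions of the text (only the $+$ beam is kept after each measurement); the function names are hypothetical.

```python
import math

def rotate(psi, t):
    """Rotation of the wave-function by the parameter t, as in Equation 10."""
    c, s = math.cos(t), math.sin(t)
    return (c * psi[0] - s * psi[1], s * psi[0] + c * psi[1])

def born(psi):
    """Probabilities of the states + and -."""
    return (psi[0] ** 2, psi[1] ** 2)

PLUS = (1.0, 0.0)  # state + after the filter of the first measurement

# Second measurement (x-axis): rotate by t = pi/4, then collapse.
second = born(rotate(PLUS, math.pi / 4))

# Third measurement (z-axis): the kept beam is again in the state +
# (collapse), then rotate by t = -pi/4.
third = born(rotate(PLUS, -math.pi / 4))

# For comparison only: composing both rotations with no collapse in between
# would return the initial state with certainty.
no_collapse = born(rotate(rotate(PLUS, math.pi / 4), -math.pi / 4))

print(second, third, no_collapse)
```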

As we have seen in the previous sections, generalizing the symmetry transformations to be non-deterministic suffices to account for all experimental results described by Quantum Mechanics, with the Stern-Gerlach experiment being one example. The question remaining is whether the Euler’s formula applies for phase-spaces with more than 2 states, which would imply that the collapse of the wave-function is merely a mathematical artifact of the wave-function parametrization.

What exactly is a black hole from the point of view of a quantum theory? That’s a tough question. Because of that, the black hole information paradox is not necessarily related to *real* black holes.

Nevertheless, we can always think of the Stern-Gerlach experiment, described in the previous section. The argument here is that there is always a unitary transformation such that the corresponding probability distribution is necessarily the constant distribution, for all initial states in the same orthogonal basis. Thus, if a black-hole erases most information about an object that comes inside of it by turning this information to random, that is not incompatible with a unitary time-evolution. We have seen an analogous case in the previous section for a 2-state phase space.

Certainly, the collapse of the wave-function is not unitary and thus the transformation on the ensemble is also not unitary. If we measure the properties of the black-hole immediately after the object comes inside, the information is erased. However, since the time-evolution is unitary, if the transformation is not only about the object coming inside but about more events, then the information is not necessarily lost. If such events do not affect the degrees of freedom that were erased (which is expected since a black-hole is defined by few parameters), then the information will remain erased. Only with a quantum theory for black holes can we know for sure which events can happen after an object comes inside a black-hole.

In any case, a transformation which erases information is compatible with a unitary time-evolution.
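As an illustration of the argument, the sketch below exhibits one such transformation for 4 states. The choice of a normalized Hadamard matrix is an assumption made here for concreteness; the text only asserts that some unitary transformation with this property exists.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Normalized 4x4 Hadamard matrix: real and orthogonal, hence unitary.
SIGNS = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
H = [[0.5 * x for x in row] for row in SIGNS]

def is_orthogonal(u, tol=1e-12):
    """Check U U^T = identity."""
    product = matmul(u, transpose(u))
    n = len(u)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def outcome_distribution(u, n):
    """Born probabilities after applying U to the basis state n."""
    return [u[m][n] ** 2 for m in range(len(u))]

distributions = [outcome_distribution(H, n) for n in range(4)]

print(is_orthogonal(H), distributions)
```

Every initial basis state is mapped to the constant distribution, although the transformation itself is invertible.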

We address now a system with 4 possible states. A real normalized wave-function $\varphi_1$ can be parametrized in terms of Euler angles (i.e. standard hyper-spherical coordinates and following reference [@eulerangles]) as:

$\begin{aligned}
\varphi_{1}=&c_{1}\ l_{1}+s_{1}\ \varphi_{2}\\
\varphi_{2}=&c_{2}\ l_{2}+s_{2}\ \varphi_{3}\\
\varphi_{3}=&c_{3}\ l_{3}+s_{3}\ \varphi_{4}\\
\varphi_{4}=&l_{4}
\end{aligned}$(15)

Where $c_n=\cos(\theta_n)$ and $s_n=\sin(\theta_n)$ stand for the cosine and sine of an arbitrary angle $\theta_n$ (i.e. $\theta_n$ is an arbitrary real number), respectively; and $n$ is an integer number verifying $1\leq n<4$. The set $l_{1}, l_{2}, l_{3}, l_{4}$ are normalized vectors forming an orthonormal basis of a 4-dimensional real vector space.

The Euler’s formula for the corresponding density matrices is:

$\begin{aligned}
\varphi_{1}\varphi_{1}^\dagger=&\frac{1}{2}+\frac{1}{2}(l_{1}l_{1}^\dagger-\varphi_{2}\varphi_{2}^\dagger) (\cos(2\theta_{1})+J_{1}\sin(2\theta_{1}))\\
\varphi_{2}\varphi_{2}^\dagger=&\frac{1}{2}+\frac{1}{2}(l_{2}l_{2}^\dagger-\varphi_{3}\varphi_{3}^\dagger) (\cos(2\theta_{2})+J_{2}\sin(2\theta_{2}))\\
\varphi_{3}\varphi_{3}^\dagger=&\frac{1}{2}+\frac{1}{2}(l_{3}l_{3}^\dagger-\varphi_{4}\varphi_{4}^\dagger) (\cos(2\theta_{3})+J_{3}\sin(2\theta_{3}))\\
\varphi_{4}\varphi_{4}^\dagger=&l_{4}l_{4}^\dagger
\end{aligned}$(16)

Where $J_n=(l_{n}\varphi_{n+1}^\dagger -\varphi_{n+1}l_{n}^\dagger)$ plays the role of the imaginary unit in the Euler’s formula, in the subspace generated by the vectors $\{l_{n}, \varphi_{n+1}\}$. Thus, the collapse of the wave-function for a phase-space with 4 states is a recursion of collapses of 2-dimensional real wave-functions. The conditional probabilities are given by the diagonal part of the density matrix, i.e. by taking the “real part” of the “complex numbers” $\cos(2\theta_n) +J_n\sin(2\theta_n)$:

$\begin{aligned}
P( 1 | (1 \mathrm{\ or\ above}))&=\frac{1}{2}+\frac{1}{2}\cos(2\theta_{1})\ \ \ \ \
P( (2 \mathrm{\ or\ above}) | (1 \mathrm{\ or\ above}))= \frac{1}{2}-\frac{1}{2}\cos(2\theta_{1})\\
P( 2 | (2 \mathrm{\ or\ above}))&=\frac{1}{2}+\frac{1}{2}\cos(2\theta_{2})\ \ \ \ \
P( (3 \mathrm{\ or\ above}) | (2 \mathrm{\ or\ above}))= \frac{1}{2}-\frac{1}{2}\cos(2\theta_{2})\\
P( 3 | (3 \mathrm{\ or\ above}))&=\frac{1}{2}+\frac{1}{2}\cos(2\theta_{3})\ \ \ \ \
P( (4 \mathrm{\ or\ above}) | (3 \mathrm{\ or\ above}))= \frac{1}{2}-\frac{1}{2}\cos(2\theta_{3})\\
P( 4 | (4 \mathrm{\ or\ above}))&=1
\end{aligned}$(17)

where $P( 2 | (2 \mathrm{\ or\ above}))$ stands for the probability that the state is $n=2$, knowing that the state is either $n=2$, $n=3$ or $n=4$. Note that these conditional probabilities are arbitrary, i.e. for any probability $p$ there is an angle $\theta_n$ such that the cosine $c_n=\cos(\theta_n)$ of that angle verifies $c_n^2=p$.
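One level of the Euler's formula for density matrices can be checked numerically. The following is a minimal sketch, assuming coordinates in which $l_n=(1,0)$ and $\varphi_{n+1}=(0,1)$ span the 2-dimensional subspace, and reading the scalar $\frac{1}{2}$ as $\frac{1}{2}$ times the identity of that subspace; the function names are hypothetical and chosen only for illustration.

```python
import math

def outer(u, v):
    """Outer product u v^T of two 2-component real vectors."""
    return [[u[i] * v[j] for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def euler_formula_sides(theta):
    """Return (lhs, rhs) of one level of the Euler formula for density matrices.

    Assumption: coordinates l_n = (1, 0) and phi_{n+1} = (0, 1).
    """
    l, phi_next = [1.0, 0.0], [0.0, 1.0]
    c, s = math.cos(theta), math.sin(theta)
    phi = [c * l[i] + s * phi_next[i] for i in range(2)]
    lhs = outer(phi, phi)  # phi_n phi_n^T

    identity = [[1.0, 0.0], [0.0, 1.0]]
    ll, pp = outer(l, l), outer(phi_next, phi_next)
    lp, pl = outer(l, phi_next), outer(phi_next, l)
    diff = [[ll[i][j] - pp[i][j] for j in range(2)] for i in range(2)]
    # J_n = l_n phi_{n+1}^T - phi_{n+1} l_n^T squares to minus the identity.
    J = [[lp[i][j] - pl[i][j] for j in range(2)] for i in range(2)]
    rotation = [[math.cos(2 * theta) * identity[i][j]
                 + math.sin(2 * theta) * J[i][j]
                 for j in range(2)] for i in range(2)]
    product = matmul(diff, rotation)
    # The scalar 1/2 is interpreted as 1/2 times the identity of the subspace.
    rhs = [[0.5 * identity[i][j] + 0.5 * product[i][j] for j in range(2)]
           for i in range(2)]
    return lhs, rhs
```

The diagonal entries of either side are the two conditional probabilities $\frac{1}{2}\pm\frac{1}{2}\cos(2\theta_n)$, while the off-diagonal entries carry the $\sin(2\theta_n)$ term that is suppressed by the collapse.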

The fact that the previous conditional probabilities are arbitrary implies that the probability distribution is arbitrary, since for any probability distribution we have:

$\begin{aligned}
P(1)=& P(1 | (1 \mathrm{\ or\ above}))\\
P(2)=& P( (2 \mathrm{\ or\ above}) | (1 \mathrm{\ or\ above}))*\\
& P(2 | (2 \mathrm{\ or\ above}))\\
P(3)=& P( (2 \mathrm{\ or\ above}) | (1 \mathrm{\ or\ above}))*\\
& P( (3 \mathrm{\ or\ above}) | (2 \mathrm{\ or\ above}))*\\
& P(3 | (3 \mathrm{\ or\ above}))\\
P(4)=& P( (2 \mathrm{\ or\ above}) | (1 \mathrm{\ or\ above}))*\\
& P( (3 \mathrm{\ or\ above}) | (2 \mathrm{\ or\ above}))*\\
& P( (4 \mathrm{\ or\ above}) | (3 \mathrm{\ or\ above}))*\\
& P(4 | (4 \mathrm{\ or\ above}))
\end{aligned}$(18)

Moreover, two wave-functions are always related by a rotation. Thus we can confirm that any probability distribution for 4 states can be reproduced by the Born rule for some wave-function:

$\begin{aligned}
P(n)=&|\varphi_{1}^\dagger l_n|^2\\
P(1)=&(c_{1})^2\\
P(2)=&(s_{1}c_{2})^2\\
P(3)=&(s_{1} s_{2}c_{3})^2\\
P(4)=&(s_{1} s_{2}s_{3})^2
\end{aligned}$(19)
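The construction above can be sketched in code: starting from any probability distribution, compute the angles from the conditional probabilities, build the real wave-function, and recover the distribution through the Born rule. This is only an illustrative sketch; the function names are hypothetical, and the convention of setting $\theta_n=0$ when no probability remains is an assumption made here to keep the angles defined.

```python
import math

def angles_from_distribution(p):
    """Angles theta_n with cos^2(theta_n) = P(n | (n or above))."""
    thetas = []
    remaining = 1.0  # P((n or above))
    for p_n in p[:-1]:
        # Assumed convention: if nothing remains, take theta_n = 0.
        cond = p_n / remaining if remaining > 0.0 else 1.0
        cond = min(1.0, max(0.0, cond))  # guard against rounding errors
        thetas.append(math.acos(math.sqrt(cond)))
        remaining -= p_n
    return thetas

def wave_function(thetas):
    """Components c_1, s_1 c_2, s_1 s_2 c_3, ..., s_1 s_2 ... s_{N-1}."""
    components = []
    prefix = 1.0  # running product of sines
    for theta in thetas:
        components.append(prefix * math.cos(theta))
        prefix *= math.sin(theta)
    components.append(prefix)
    return components

def born_rule(phi):
    """P(n) = |phi^T l_n|^2 in the basis l_1, ..., l_N."""
    return [x * x for x in phi]

target = [0.1, 0.2, 0.3, 0.4]
phi_1 = wave_function(angles_from_distribution(target))
recovered = born_rule(phi_1)
```

Here `recovered` agrees with `target` up to floating-point rounding, and the components of `phi_1` are normalized by construction.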

A probability distribution can be discrete or continuous. A continuous probability distribution is a probability distribution that has a cumulative distribution function that is continuous. Thus, any partition of the phase-space (where each part of the phase-space has a non-null Lebesgue measure) is countable.

Consider now a countable (possibly infinite) partition of the phase-space. The corresponding countable orthonormal basis for the separable Hilbert space is $\{l_n\}$, where each index $n>0$ corresponds to an element of the partition of the phase-space. We can parametrize a normalized vector in the Hilbert space [@eulerangles], as $v_n=c_n l_n+s_n v_{n+1}$, where $c_n=\cos(\theta_n)$ and $s_n=\sin(\theta_n)$ stand for the cosine and sine of an arbitrary angle $\theta_n$ (i.e. $\theta_n$ is an arbitrary real number), respectively; and $n>0$ is an integer number. The first vector $v_1$ is the wave-function of the full phase-space. Note that the parametrization is valid for infinite dimensions, because in the recursive equation all we need to assume about the vector $v_{n+1}$ is that it is normalized and orthogonal to $\{l_1, l_2, ... l_n\}$, which is a valid assumption in infinite dimensions. Then we define $v_{n+1}$ in terms of $v_{n+2}$ in the same way, and so on. The recursion does not need to stop.

Then, the projection to the linear space generated by $v_n$ is:

$\begin{aligned}
v_nv_n^\dagger=&\frac{1}{2}+\frac{1}{2}(l_{n}l_{n}^\dagger-v_{n+1}v_{n+1}^\dagger) (\cos(2\theta_{n})+J_{n}\sin(2\theta_{n}))
\end{aligned}$(20)

Where $J_n=(l_{n}v_{n+1}^\dagger-v_{n+1}l_{n}^\dagger)$ plays the role of the imaginary unit in the Euler’s formula, in the subspace generated by the vectors $\{l_{n}, v_{n+1}\}$. Thus, the collapse of the wave-function for a generic phase-space is a recursion of collapses of 2-dimensional real wave-functions. The conditional probabilities are given by the diagonal part of the density matrix, i.e. by taking the “real part” of the “complex numbers” $\cos(2\theta_n) +J_n\sin(2\theta_n)$. The operator $v_n v_n^\dagger$ is a projection thanks to the off-diagonal4 terms $c_ns_n(l_n v_{n+1}^\dagger+v_{n+1} l_{n}^\dagger)$.

Defining $(n\mathrm{\ or\ above})=\{k : k\geq n\}$ as the event which contains all parts of the phase-space with index starting at $n$, we can write the probability distribution as:

$\begin{aligned}
P(n)&=P((n\mathrm{\ or\ above}))P(n| (n \mathrm{\ or\ above}))\\
&=\left(\prod\limits_{k=1}^{n-1} P((k+1 \mathrm{\ or\ above})|(k \mathrm{\ or\ above}))\right)P(n| (n \mathrm{\ or\ above}))\end{aligned}$(21)

That is, as a product of the probabilities

$\begin{aligned}
&P(n|(n \mathrm{\ or\ above}))\mathrm{\ and\ }P((n+1 \mathrm{\ or\ above})|(n \mathrm{\ or\ above}))\mathrm{,\ which\ verify}\\
&P(n|(n \mathrm{\ or\ above}))\ +\ \
P((n+1 \mathrm{\ or\ above})|(n \mathrm{\ or\ above}))=1.\end{aligned}$(22)

If the off-diagonal terms are suppressed (collapsed), we obtain a diagonal operator which represents the probability distribution $P(n)$ in the Hilbert space:

$\begin{aligned}
\mathrm{diag}(v_nv_n^\dagger)=c_n^2 l_nl_n^\dagger+s_n^2 v_{n+1}v_{n+1}^\dagger\end{aligned}$(23)

That is, $P(n)=\mathrm{tr}(\mathrm{diag}(v_1v_1^\dagger) l_nl_n^\dagger)$ and $P(O)=0$ for operators $O$ with null diagonal. Note that $c_n^2=P(n| (n \mathrm{\ or\ above}))$ and $s_n^2=P((n+1 \mathrm{\ or\ above})| (n \mathrm{\ or\ above}))$ and these probabilities are arbitrary, i.e. for any probability $p$ there is an angle $\theta_n$ such that the cosine $c_n=\cos(\theta_n)$ of that angle verifies $c_n^2=p$.

The fact that these conditional probabilities are arbitrary implies that the probability distribution is arbitrary, since the probability distribution can be written in terms of these conditional probabilities as shown in Equation (21).
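As a worked instance of Equation (21), offered only as an illustrative sketch (the choice $\theta_n=\pi/4$ for every $n$ is an assumption made here, not an example taken from the text), all conditional probabilities are equal to $\frac{1}{2}$ and the recursion never stops:

$\begin{aligned}
c_n^2=&s_n^2=\frac{1}{2}\\
P(n)=&\left(\prod\limits_{k=1}^{n-1} s_k^2\right)c_n^2=\frac{1}{2^{n}}\\
\sum\limits_{n=1}^{\infty}P(n)=&1
\end{aligned}$

So this choice of angles reproduces a geometric distribution over a countably infinite partition of the phase-space.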

While the parametrization with a real wave-function is always possible, it may not be the best one. As we have seen, the wave-function parametrization allows us to apply group theory to the states of the ensemble, since unitary transformations (i.e. multi-dimensional rotations) preserve the properties of the parametrization (in particular the conservation of total probability).

The union of a set of projection operators and the unitary representation of a group is a set of normal operators. Suppose that there is no non-trivial closed subspace of the Hilbert space left invariant by this set of normal operators. The real version of Schur’s lemma [@realpoincare; @Oppio:2016pbf; @realoperatoralgebras] implies that the set of operators commuting with the normal operators forms a real associative division algebra; such a division algebra is isomorphic to either the real numbers, the complex numbers or the quaternions.

If we parametrize with a real wave-function and consider only expectation values of operators that commute with a set of operators isomorphic to the complex numbers or the quaternions, then we can equivalently define wave-functions in complex and quaternionic Hilbert spaces [@realQM; @realpoincare; @Oppio:2016pbf].

Let us consider the quaternionic case (the complex case then follows easily). We have a discrete state space defined by two integers $n,m$, with $1\leq m\leq 4$, and we only consider the probabilities for $n$ independently of $m$, $P(n)=\sum_{m=1}^4 P(n,m)$.

Then a more meaningful parametrization, reflecting by construction the restriction on the operators we are considering, uses a quaternionic wave-function $v_1$. Let $\{l_n\}$ be an orthonormal basis of quaternionic wave-functions; then we have:

$\begin{aligned}
v_n v_n^\dagger=c_n^2 l_nl_n^\dagger+s_n^2 v_{n+1}v_{n+1}^\dagger+c_ns_n(l_n v_{n+1}^\dagger+v_{n+1} l_{n}^\dagger)\end{aligned}$(24)

Note that there is a basis where $l_nl_n^\dagger$ is real diagonal and thus upon collapse $v_nv_n^\dagger$ becomes real diagonal as well.

The complex case is just the above case with complex numbers replacing quaternions and a state space which is the union of 2 identical spaces. The continuous case is analogous, since there is a partition of the phase-space which is countable.
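For the complex case, the correspondence can be illustrated with a short sketch. It assumes that the two copies of the state space are labelled $m=1,2$ and that the real amplitudes for $(n,1)$ and $(n,2)$ are taken as the real and imaginary parts of a complex amplitude; this labelling and the function names are assumptions made here for illustration.

```python
import cmath

def marginal_from_real(real_phi):
    """P(n) = P(n, 1) + P(n, 2) from real amplitudes (a_n, b_n)."""
    return [a * a + b * b for a, b in real_phi]

def to_complex(real_phi):
    """Assumed identification: psi_n = a_n + i b_n."""
    return [complex(a, b) for a, b in real_phi]

def born_rule(psi):
    """P(n) = |psi_n|^2 for a complex wave-function."""
    return [abs(z) ** 2 for z in psi]

real_phi = [(0.6, 0.0), (0.0, 0.8)]  # normalized: 0.36 + 0.64 = 1
psi = to_complex(real_phi)
# A global phase acts as a rotation that mixes the two copies m = 1, 2
# but leaves the marginal P(n) unchanged.
rotated = [cmath.exp(0.7j) * z for z in psi]
```

Both `born_rule(psi)` and `born_rule(rotated)` agree with `marginal_from_real(real_phi)`, which is the sense in which only operators commuting with the imaginary unit are being considered.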

Quantum Mechanics is not a generalization of probability theory, but it is definitely a generalization of classical mechanics since it involves non-deterministic transformations to the state of the system. For instance, the time evolution may be non-deterministic unlike in classical mechanics.

There are three major metaphysical views of time [@bebecome]: presentism, eternalism and possibilism. The possibilism consists in considering the presentism for the future and the eternalism for the past, so it is inconsistent with a time translation symmetry. The presentism view coincides with the Hamiltonian formalism of physics, that the state of the system is defined by a point in the phase space. When the time evolution of the system is deterministic it traces a phase space trajectory for the system, however the definition of the state of the system does not involve time, i.e. only the present exists 5. The eternalism view coincides with the Lagrangian formalism of physics, that the state of the system is defined by a function of time. When the time evolution of the system is deterministic, this function of time coincides with the phase-space trajectory of the classical Hamiltonian formalism and so which metaphysical view of time we use is irrelevant from an experimental point of view (in the deterministic case).

But when the time-evolution of the system is non-deterministic, we may have a hard time studying the time-evolution from the Lagrangian formalism and/or eternalism metaphysical view. The key fact about Quantum Mechanics which makes it incompatible with the eternalism/Lagrangian point of view is that the time-evolution is not necessarily a stochastic process, i.e. there is not necessarily a collection of random events indexed by time6. We only apply one non-deterministic transformation of the state of the system, however there are many different transformations we can choose from and the set of choices is indexed by a parameter we call time, which is fine from the presentism/Hamiltonian point of view since only the present exists.

Note that a random experiment always involves a preparation followed by a measurement. For instance, we shake a dice in our hand and throw it over a table until it stops (preparation), then we check the position where it stopped (measurement).

If we just throw the dice without shaking our hand, the probability distribution for the measurement outcome is different than if we shake our hand. There is nothing mysterious about this: two different preparations lead to two different probability distributions. Whether or not we actually do the measurement does not change anything, what changes the probability distribution is the preparation.

Then we can think about a preparation which is a function of an element of a symmetry group, for instance translation in time. From the point of view of probability theory or experimental physics, this is a valid option. However, it is important to note that a preparation which is a function of time is not a stochastic process in time. A stochastic process in time is a set of random experiments indexed by time, while a preparation which is a function of time gives a single random experiment dependent on the time parameter. As an example, consider a) throwing the dice 10 times, once per minute during 10 minutes and b) shaking the dice in our hand for a number of minutes $T$ between 0 and 10 and then throwing the dice once. The preparation in b) depends on the time parameter $T$, while in a) the time selects one of the many identically prepared experiments, namely the one done at the selected time.

Note that the experiments a) and b) above are different but can be combined: we could do many random experiments, each of them would be dependent on a parameter. This fact is important in Section 12.

In the remainder of this section, we comment on conditioned probability and the random walk. It is well-known that quantum mechanics can be described as the Wick-rotation of a Wiener stochastic process [@nonperturbativefoundations]. In other words, the time evolution in Quantum Mechanics is a Wiener process for imaginary time. This is the origin of Feynman’s path integral approach to Quantum Mechanics and Quantum Field Theory.

Since the Wiener process is one of the best known Lévy processes (a Lévy process is the continuous-time analog of a random walk), this fact often leads to an identification of Quantum Mechanics with a random walk. In particular, it often leads to an identification of the probabilities calculated in Quantum Mechanics with conditioned probabilities, since the next state in a random walk is conditioned by the previous state.

Certainly, the usefulness of group theory is common to both a random walk and to Quantum Mechanics and this unavoidably leads to similarities between a random walk and Quantum Mechanics. However, imaginary time is very different from real time and thus the probabilities calculated in Quantum Mechanics are not necessarily conditioned probabilities in a random walk.

In order to relate a random walk (or any other stochastic process) with Quantum Mechanics correctly, we need the probability distribution for the complete paths of the random walk. Then, we can use a wave-function parametrization of the probability distribution for the complete paths of the random walk. Finally, we can apply quantum methods to this wave-function. The result is a Quantum Stochastic Process [@qsc], which is not a generalization of a stochastic process due to the wave-function collapse, but merely the parametrization of a stochastic process with a wave-function.

Now we are able to prove one of the main results of this paper, namely that there is a group action of a Wigner’s symmetry group on the probability distribution for the state of a system, if and only if the Wigner’s symmetry group transforms deterministic (probability) distributions into deterministic (probability) distributions. A corollary is that time translation in Quantum Mechanics is a stochastic process if and only if it is deterministic. This mathematical fact is overlooked by the assumptions of both Bell’s theorem and the Einstein-Podolsky-Rosen (EPR) paradox.

As discussed in Section 3, Wigner’s theorem [@2014PhLA; @Ratz1996; @wignertheorem] implies that the action of a symmetry group on the wave-function is necessarily linear and unitary. In Section 4, we showed that the action of a symmetry group on the wave-function is deterministic if and only if $P_A$ and $U_g P_B U^\dagger_g$ commute for all events $A,B$ and for all the elements $g$ of the group, where $P_A$ is a projection-valued-measure.

This means that $U$ is a deterministic transformation if and only if $U_{l a} U_{m a}^*=0$ for all $a,l,m$ such that $l\neq m$.

Now we check the necessary and sufficient conditions for the action of a symmetry group on the wave-function to correspond to an action on the corresponding probability distribution.

That is, if we start with some probability distribution $\mathrm{diag}(\rho_1)$, then the action of each element $g$ of the group on the wave-function will produce (after the collapse) a different probability distribution $\mathrm{diag}(\rho_g)$. The composition of the actions of two group elements $g,h$ on the probability distribution is given by the succession of the two random experiments corresponding to $g$ and $h$: $P(A)=\mathrm{tr}(\mathrm{diag}(\rho_g)U_h P_A U_h^\dagger)$.

However, Wigner’s theorem [@2014PhLA; @Ratz1996; @wignertheorem] implies that the action of a symmetry group on the wave-function is necessarily linear and unitary, thus $P(A)=\mathrm{tr}(\rho_g U_h P_A U_h^\dagger)$.

Thus there is a group action of the symmetry group on the probability distribution if and only if $\mathrm{tr}(\mathrm{diag}(\rho_g)U_h P_A U_h^\dagger)=\mathrm{tr}(\rho_g U_h P_A U_h^\dagger)$ for any pure density matrix $\rho_g$ and any event $A$ and group element $h$.

The equality above is equivalent to $\sum_{k,b: k\neq b} U_{ka}^*\Psi_k\Psi_b^* U_{ba}=0$, where $U_{ba}$ are the elements of the matrix $U_h$. We can see that if $U$ is a deterministic transformation, then the equality is satisfied, since $U_{ka}^* U_{ba}=0$ for all $a,k,b$ such that $k\neq b$. On the other hand, if $U$ is a non-deterministic transformation then for some $a,l,m$ such that $l\neq m$, we have $U_{m a}^* U_{l a}\neq 0$. Then for $\Psi_k=\frac{1}{\sqrt{2}}(\delta_{km}+\delta_{kl})$, we get $\sum_{k,b: k\neq b} U_{ka}^*\Psi_k\Psi_b^* U_{ba}=\mathrm{Re}(U_{m a}^* U_{la})$, which for a real matrix $U$ equals $U_{m a}^* U_{la}\neq 0$ (for a complex $U$, a relative phase between the two components of $\Psi$ can be chosen such that the sum does not vanish), i.e. there is no group action of the symmetry group on the probability distribution.
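The criterion can be illustrated numerically. The sketch below assumes real wave-functions and real orthogonal matrices and compares the probability computed with and without an intermediate collapse; the function names and the two example matrices (a permutation and a rotation by $\pi/4$) are chosen here for illustration and are not taken from the text.

```python
import math

def is_deterministic(U, tol=1e-12):
    """Check U[l][a] * U[m][a] == 0 for all a and all l != m."""
    n = len(U)
    return all(abs(U[l][a] * U[m][a]) <= tol
               for a in range(n) for l in range(n) for m in range(n)
               if l != m)

def prob_without_collapse(psi, U, a):
    """tr(rho U P_a U^T) with rho = psi psi^T, for real psi and U."""
    amplitude = sum(psi[k] * U[k][a] for k in range(len(psi)))
    return amplitude ** 2

def prob_with_collapse(psi, U, a):
    """tr(diag(rho) U P_a U^T): the off-diagonal terms of rho are dropped."""
    return sum((psi[k] * U[k][a]) ** 2 for k in range(len(psi)))

c = s = math.cos(math.pi / 4)
swap = [[0.0, 1.0], [1.0, 0.0]]   # permutation: deterministic
rotation = [[c, -s], [s, c]]      # rotation by pi/4: non-deterministic
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]

gap_swap = prob_without_collapse(psi, swap, 0) - prob_with_collapse(psi, swap, 0)
gap_rotation = (prob_without_collapse(psi, rotation, 0)
                - prob_with_collapse(psi, rotation, 0))
```

For the permutation the two probabilities coincide, while for the rotation they differ by $U_{ma}U_{la}=\frac{1}{2}$ (with $a$ the first column), in agreement with the sum over off-diagonal terms above.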

The concept of (ir)reversible process from thermodynamics also needs a careful discussion in quantum mechanics. A non-deterministic symmetry transformation, when acting on a deterministic ensemble increases the entropy of the ensemble after the wave-function collapse and therefore must be an irreversible transformation. Yet, a symmetry transformation always has an inverse symmetry transformation, because it is included in a symmetry group, so it must be considered reversible in some sense.

The way out of this apparent contradiction is the role of time in the quantum formalism, which was discussed in Sections 11 and 12. In the ensemble interpretation, the individual system is entirely defined by a standard phase-space, which implies that the time plays no fundamental role in quantum mechanics nor in classical Hamiltonian mechanics. Then, time-evolution in quantum mechanics is not a stochastic process unless it is deterministic. Therefore, there is not a probability distribution for each time (or for other parameter corresponding to the symmetry group).

If we consider a stochastic process with only two probability distributions corresponding to the initial and final times, then the complete symmetry transformation is irreversible (if it is non-deterministic and it acts in a deterministic ensemble). However, this does not imply that it is a “bad” symmetry, because no stochastic process can be defined in between the initial and final times. On the other hand, if the symmetry group contains only deterministic transformations then a stochastic process can be defined in between the initial and final times and such process is reversible, as expected.
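The entropy increase can be illustrated with a 2-state toy model. This is a minimal sketch, assuming a rotation by an angle $\theta$ as the non-deterministic symmetry transformation and Shannon entropy as the measure of entropy; both choices, and the function names, are assumptions made here for illustration.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def rotate_and_collapse(p, theta):
    """Rotate a collapsed (diagonal) 2-state ensemble by theta, then collapse."""
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return [p[0] * c2 + p[1] * s2, p[0] * s2 + p[1] * c2]

deterministic = [1.0, 0.0]
after = rotate_and_collapse(deterministic, math.pi / 4)
# Applying the inverse rotation after the collapse does not undo it.
back = rotate_and_collapse(after, -math.pi / 4)
```

The entropy goes from $0$ to $\log 2$ and stays there after the inverse rotation, whereas composing the two rotations before any collapse gives the identity and returns the deterministic ensemble.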

The main claim of Einstein-Podolsky-Rosen (EPR) [@epr] (namely, that Quantum Mechanics is an incomplete description of physical reality) is defended by a reductio ad absurdum of its negation, i.e. by arguing that it is absurd for position (Q) and momentum (P) not to be simultaneous elements of reality. In the EPR article it is stated: “*one would not arrive at our conclusion if one insisted that two or more physical quantities can be regarded as simultaneous elements of reality only when they can be simultaneously measured or predicted.*[...]* This makes the reality of P and Q depend upon the process of measurement carried out on the first system, which does not disturb the second system in any way. No reasonable definition of reality could be expected to permit this.*”

The reductio ad absurdum of the negation of the claim could only be a satisfactory argument if the claim itself (namely, that the position and momentum of the same particle are simultaneous elements of reality, even though they cannot be simultaneously measured or predicted) were not absurd as well. But the claim itself raises eyebrows, to say the least, once we remember that (in Quantum Mechanics, by definition) measuring the position with infinite precision completely erases any knowledge about the momentum of the same particle.

In Quantum Mechanics as in classical Hamiltonian mechanics, the state of an individual system is a point in a phase space, and the phase space is both the domain and image of the deterministic physical transformations. As in any statistical theory, we may know only the probability distribution for the state of the individual system, instead of knowing the state of the individual system. The relation between quantum mechanics and a statistical theory is clear: the wave-function is a parametrization for any probability distribution [@parametrization].

There are two kinds of incompleteness in a non-Markov stochastic process. The two kinds of incompleteness are in correspondence with the two concepts: stochastic and non-Markov, respectively.

1) Stochastic: From the point of view of (classical) information theory [@info], the root of probabilities (i.e. non-determinism) is by definition the absence of information. Statistical methods are required whenever we lack complete information about a system, as so often occurs when the system is complex [@bertinstatistical]. Thus we can convert a deterministic process to a stochastic process unambiguously (using trivial probability distributions); but we cannot convert a stochastic process into a deterministic process unambiguously since we need new information 7.

2) non-Markov: any non-Markov stochastic process can be described as a Markov stochastic process where some variables defining the state of the system are hidden (i.e. unknown) [@allmarkov; @non_markov_examples]. Conversely, by definition any irreducible 8 Markov process where some variables defining the state of the system are hidden will give rise to a non-Markov process. For instance, the physical phenomenon which generates examples of Brownian motion is deterministic and thus Markov, but real-world Brownian motion is often non-Markov (because we cannot measure the state of the system completely [@brownian; @brownian2]) despite the fact that Brownian motion is one of the most famous examples of a Markov process.

In reference [@reality] (authored by A. Einstein and contemporary with the EPR paradox) the two kinds of incompleteness are clearly distinguished:

“*[...] I believe that the [quantum] theory is apt to beguile us into error in our search for a uniform basis for physics, because, in my belief, it is an incomplete representation of real things, although it is the only one which can be built out of the fundamental concepts of force and material points (quantum corrections to classical mechanics). The incompleteness of the representation is the outcome of the statistical nature (incompleteness) of the laws. I will now justify this opinion.*”

The incompleteness of the representation corresponds to the non-Markov kind, while the incompleteness of the laws corresponds to the stochastic kind. By definition, in Quantum Mechanics any sequence of measurements is a Markov stochastic process (thus it has the stochastic kind of incompleteness) 9. Note that any non-Markov stochastic process can be described as a Markov stochastic process where some variables defining the state of the system are hidden (i.e. unknown) [@allmarkov; @non_markov_examples].

Since Quantum Mechanics does not have the non-Markov kind of incompleteness, position and momentum can only be simultaneous elements of reality in another theory very different from Quantum Mechanics. That both the claim and its negation are absurd, is strong evidence that some of the assumptions leading to the Einstein-Podolsky-Rosen (EPR) paradox [@epr] do not hold.

So, why did the author try to justify (using the EPR paradox [@epr], among other arguments) that in Quantum Mechanics the stochastic kind of incompleteness necessarily leads to a non-Markov kind of incompleteness?

The following paragraph from the same reference [@reality] suggests that the author was trying to favor the cause that any future theoretical basis should be deterministic, not just Markov (since statistical mechanics is often Markov).

“*There is no doubt that quantum mechanics has seized hold of a beautiful element of truth, and that it will be a test stone for any future theoretical basis, in that it must be deducible as a limiting case from that basis, just as electrostatics is deducible from the Maxwell equations of the electromagnetic field or as thermodynamics is deducible from classical mechanics. However, I do not believe that quantum mechanics will be the starting point in the search for this basis, just as, vice versa, one could not go from thermodynamics (resp. statistical mechanics) to the foundations of mechanics.*”

However and as discussed in Section 4, there is no mathematical argument that suggests that in general a deterministic model is more fundamental than a stochastic one, quite the opposite. Since the wave-function is merely a possible parametrization of any probability distribution [@parametrization], we also cannot claim that a deterministic model is more fundamental than Quantum Mechanics. Thus, the stochastic kind of incompleteness is harmless.

So, the EPR paradox appears as an attempt to justify a mathematical statement (that a deterministic model is more fundamental than Quantum Mechanics) with arguments from physics (trying to link to the non-Markov kind of incompleteness), for which no mathematical arguments could be found. Note that a statement referring to any future theoretical basis is essentially a mathematical statement, because the physical model is arbitrary (since the theoretical basis is arbitrary).

However, it is a failed attempt because it missed the fact discussed in Section 12, that the time evolution is a stochastic process if and only if it is deterministic.

In the EPR paradox, there is no probability distribution for the state of the system after the spatial separation of the entangled particles and before the transformation involved in the measurement takes place, because the time evolution (being in this case non-deterministic) is not a stochastic process. We can only consider the probability distribution for the state of the system after the spatial separation of the entangled particles and after the transformation involved in the measurement takes place. This is overall a non-local physical transformation since it involves the spatial separation of the entangled particles. But it does not violate relativistic causality, since neither the spatial separation of the entangled particles nor the transformation involved in the measurement violates relativistic causality by itself, so their composition does not violate causality either.

Unlike many popular no-go arguments [@nogo], we are not arguing against the requirement that a physical theory should be complete, in fact we claim that Quantum Mechanics is a complete statistical theory (as defined by EPR).

Note that Bohr already declared Quantum Mechanics as a “complete” theory, however he did it at the cost of a radical revision of the classical notions of causality and physical reality [@bohrcomplete]. He wrote: “Indeed the finite interaction between object and measuring agencies conditioned by the very existence of the quantum of action entails —because of the impossibility of controlling the reaction of the object on the measuring instruments if these are to serve their purpose—the necessity of a final renunciation of the classical ideal of causality and a radical revision of our attitude towards the problem of physical reality.” [@bohrcomplete] Such notion of a “complete” theory mostly favors the EPR claim: the only way that Quantum Mechanics could be complete is if it is incompatible with the classical notions of causality and physical reality. Thus from a logic point of view, there is no disagreement between Einstein and Bohr, their disagreement is about what basic features an acceptable theory should have, whether or not it should be compatible with the classical notions of causality and physical reality.

In contrast, the fact—that the time evolution is a stochastic process if and only if it is deterministic—which was overlooked is perfectly compatible with the classical notions of physical reality (because Quantum Mechanics has a standard phase-space) and causality (as we will show in Section 15). We claim that Quantum Mechanics—being non-deterministic and thus a generalization of classical mechanics—does not entail a radical departure from the basic features that an acceptable theory should have, according to EPR [@epr]. In fact in Quantum Mechanics and in classical Hamiltonian mechanics, the state of an individual system is a point in a phase space, and the phase space is both the domain and image of the deterministic physical transformations.

The only known theory consistent with the experimental results in high energy physics [@pdg] is a quantum gauge field theory which is mathematically ill-defined [@prize]. Because it is mathematically ill-defined, the relation of such a theory with Quantum Mechanics is still an object of debate and it will be addressed soon in another article by the present author.

In the meantime we will have to consider a free system, which suffices to address the EPR paradox. For a free system, we know well what relativistic Quantum Mechanics is [@realpoincare]. The time evolution of the wave-function is described by the Dirac equation for a free particle, which is a real (i.e. non-complex) equation.

Relativistic causality is satisfied in relativistic Quantum Mechanics, meaning that there is a propagator which vanishes for a space-like propagation [@realpoincare]. In other words, the probability that the system moves faster than light is null.

A deterministic theory compatible with relativistic Quantum Mechanics is one which when applied to an ensemble of free systems, will reproduce the statistical predictions of Quantum Mechanics.

Since in relativistic Quantum Mechanics the probability that the system moves faster than light is null, no system (described by the deterministic theory) in the ensemble moves faster than light. Thus any deterministic theory compatible with relativistic Quantum Mechanics necessarily respects relativistic causality. The question we leave open here, and address in the next section, is whether such a deterministic theory exists.

Does a deterministic theory, consistent with the non-deterministic time evolution of Quantum Mechanics, exist?

The answer is yes, and we will build one example of such a deterministic theory in this section.

In an experimental setting, we always have a discrete set of possible outcomes and thus Quantum Mechanics always predicts a cumulative distribution function. This allows us to apply the inverse-transform sampling method [@sampling] for generating pseudo-random numbers consistently with the probability distribution predicted by Quantum Mechanics.

An experiment in Quantum Mechanics always involves the repetition of an experimental procedure many times. In the deterministic theory however, each time we execute the experimental procedure we are not executing exactly the same experimental procedure. We consider a number (any number will do) which will be the seed of the pseudo-random number generator and then we generate pseudo-random numbers consistent with the probability distribution predicted by Quantum Mechanics. The experimental procedure is: 1) generate one pseudo-random number and 2) modify the state of the system according to the pseudo-random number.
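The procedure can be sketched as follows. This is a minimal illustration of inverse-transform sampling with a seeded pseudo-random number generator, assuming the probabilities predicted by Quantum Mechanics are already given as a list; the name `make_sampler` and the example distribution are hypothetical.

```python
import bisect
import itertools
import random

def make_sampler(probabilities, seed):
    """Return a deterministic sampler for a discrete distribution.

    Inverse-transform sampling: draw u in [0, 1) and return the first
    outcome whose cumulative probability exceeds u.
    """
    cdf = list(itertools.accumulate(probabilities))
    rng = random.Random(seed)  # the seed fixes the whole sequence

    def sample():
        u = rng.random()
        index = bisect.bisect_right(cdf, u)
        # Guard against a cumulative sum slightly below 1 due to rounding.
        return min(index, len(probabilities) - 1)

    return sample

# Hypothetical predicted distribution; outcome 1 has null probability.
predicted = [0.5, 0.0, 0.5]
sampler = make_sampler(predicted, seed=2021)
outcomes = [sampler() for _ in range(20000)]
```

Two samplers built with the same seed produce the same sequence of outcomes, so each run of the procedure is fully determined by the seed, and an outcome with null probability (such as outcome 1 here) is never generated.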

In the case of relativistic Quantum Mechanics, the probability of violating relativistic causality is null. Thus, the experimental procedure never violates relativistic causality. The modifications of the state of the system are however necessarily not infinitesimal since the phase space of the experimental setting is discrete. This does not violate relativistic causality, since the finite modifications to the state of the system occur in finite intervals of time.

We can however consider intervals of time as small as we like and thus modifications to the state of the system as small as we like. The only requirement for this is that the computational resources involved in the pseudo-random number generation are as large as needed (which is valid from a logical point of view). Note that since time evolution in quantum mechanics is not necessarily a stochastic process, we will often have that a sequence of experimental procedures executed at regular and small intervals of time produces different statistical data than just one experimental procedure executed at once after the same total time has passed (e.g. in the double-slit experiment). But this cannot be considered a radical departure from the classical notion of physical reality, since in the (very old) presentism view of classical Hamiltonian mechanics, the phase space (i.e. the physical reality) does not involve the notion of time [@bebecome]. Moreover, when the time evolution is deterministic then it is a stochastic process; therefore, if we study only deterministic transformations then we can recover the eternalism view of classical Lagrangian mechanics without any conflict with relativistic causality. For instance, this implies that in the double-slit experiment we can in principle reconstruct the trajectory of each particle and determine which slit the particle went through.

From a logical point of view, this deterministic theory is valid and by definition it always agrees with the experimental predictions of Quantum Mechanics, thus it is experimentally indistinguishable from Quantum Mechanics.

From the metaphysics point of view, this deterministic theory is unacceptable, since it involves pseudo-random number generation. For instance, in the double-slit experiment we (or some supernatural entity) would need to somehow “program” each particle to follow a different path determined by a different number, which is absurd. However, the present author has no interest in building a nice deterministic theory compatible with Quantum Mechanics, for the reasons explained in Section 4.

Note that this deterministic theory is not super-deterministic, i.e. the experimental physicists are free to choose which measurements and which transformations of the state of the system to do [@superdeterminism][43]. However, an experimental procedure involves a symmetry transformation of the state of the system. Since the symmetry transformation in this deterministic theory is reproduced by the pseudo-random number generation, then when we apply the inverse-transform sampling method we need to know already what is the symmetry transformation. Thus there is a kind of conspiracy between the symmetry transformation and the pseudo-random generator, but such conspiracy is part of the definition of the deterministic symmetry transformation itself. There are assumptions about freedom of choice in the literature which exclude our deterministic (but not super-deterministic) theory, because the authors erroneously consider that an experimental procedure which involves a transformation of the state of the system is instead an observation without consequences to the system [@superdeterminism][43].

The ensemble interpretation does not give any explanation as to why it looks like the electron’s wave-function interferes with itself in the Young’s double-slit experiment [44][45][46]—that would imply that the wave-function describes (in some sense) an individual system. We will fill that gap in this section.

The key to understanding the results of the double-slit experiment is the role of time in the quantum formalism, which was discussed in detail in Section 12. In the ensemble interpretation the individual system is entirely defined by a standard phase-space, which implies that time plays no fundamental role in quantum mechanics nor in classical Hamiltonian mechanics. Moreover, the time-evolution in quantum mechanics is not a stochastic process unless it is deterministic. Therefore, there is not a probability distribution for each time (or for any other parameter corresponding to the symmetry group).

In the double-slit experiment, the time-evolution of the electron after being fired (S1) is a product of two non-deterministic symmetry transformations: first, going through one or another slit with a 50/50 probability (S2); and second, a non-deterministic propagation from (S2) until (F). If at least one of these two symmetry transformations would be deterministic, then we could define a stochastic process including the 3 instants in time (S1), (S2) and (F). But since both transformations are nondeterministic, the only stochastic process that can be defined only includes the 2 instants in time (S1) and (F), and the corresponding transformations from (S1) to (S2) and from (S2) to (F) have never occurred.

The only “mystery” that needs to be clarified is the fact that the non-deterministic propagation of the electron from (S2) until (F) is such that it appears that the electron interferes with itself, just like a classical wave would do. To simplify the discussion we will only consider the electrons that reach the detector along 2 different angles $\sin(\theta_1)=\frac{2 \pi}{p d}$ and $\sin(\theta_2)=\frac{\pi}{p d}$, where $p$ is the electron’s linear momentum and $d$ is the distance between the two slits (with $\hbar=1$). So, a selected electron can only go through one of these 2 angles; the electrons that go through other angles are discarded.

The wave-function at (S1) is $\Psi=\left[\begin{smallmatrix} 1 \\ 0\end{smallmatrix}\right]$. The time-evolution from (S1) until (S2) may be the identity matrix $U=\left[\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\right]$ or $U=\frac{1}{\sqrt{2}}\left[\begin{smallmatrix} 1 & 1 \\ 1 & -1 \end{smallmatrix}\right]$, depending on whether the second slit is closed or open, respectively. If the second slit is open, then $U\Psi=\frac{1}{\sqrt{2}}\left[\begin{smallmatrix} 1 \\ 1 \end{smallmatrix}\right]$ meaning that the electron may go through both slits with equal probability.

The time-evolution from (S2) until (F) is given by the unitary transformation $U^{'}=\frac{1}{\sqrt{2}}\left[\begin{smallmatrix} 1 & 1 \\ 1 & -1 \end{smallmatrix}\right]$, that is, it sums the wave-functions from both slits for the first angle and it subtracts the wave-functions from both slits for the second angle.

Thus, if the second slit is closed, we have at (F) the wave-function $\Psi=\frac{1}{\sqrt{2}}\left[\begin{smallmatrix} 1 \\ 1 \end{smallmatrix}\right]$ meaning that the electron may come along angles 1 or 2 with equal probability$^{10}$. But if the second slit is open, we have at (F) the wave-function $\Psi=\left[\begin{smallmatrix} 1 \\ 0\end{smallmatrix}\right]$ meaning that the electron will only come along angle 1; since the electron would have come through both slits with equal probability if we had observed what happened at (S2), it appears that from (S2) until (F) it interferes with itself constructively along angle 1 and destructively along angle 2.
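The two cases can be checked numerically with the 2×2 matrices given above. The following is a minimal sketch using NumPy; the function name `probabilities_at_detector` is hypothetical and introduced only for this illustration.

```python
import numpy as np

# Matrices taken from the text: the identity (second slit closed) and the
# matrix (1/sqrt(2)) [[1, 1], [1, -1]] (second slit open, and also U').
identity = np.eye(2)
mixing = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

psi_s1 = np.array([1.0, 0.0])  # wave-function at (S1)


def probabilities_at_detector(u_s1_to_s2):
    """Probabilities of angles 1 and 2 at (F) for a given evolution (S1)->(S2)."""
    psi_f = mixing @ (u_s1_to_s2 @ psi_s1)  # apply U, then U'
    return np.abs(psi_f) ** 2


p_closed = probabilities_at_detector(identity)  # second slit closed
p_open = probabilities_at_detector(mixing)      # second slit open
```

Here `p_closed` is the 50/50 distribution over the two angles, while `p_open` assigns all the probability to angle 1, because the matrix $U'$ is its own inverse.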

The “mystery” is therefore similar to the probability clock (Section 5): How is it possible that a $50/50$ probability becomes $100/0$? It is possible precisely because time plays no fundamental role in quantum mechanics nor in classical Hamiltonian mechanics. There is not a probability distribution for each time (or for any other parameter corresponding to the symmetry group). The symmetry transformation $U'U$ is different from a stochastic process where the symmetry transformations $U$ and then $U'$ are applied, and there is no reason why it should not be different.
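The difference between the single transformation $U'U$ and a two-step stochastic process can be made concrete. The sketch below assumes that each step of the hypothetical stochastic process uses the transition probabilities $|U_{ij}|^2$; the helper `transition_matrix` is introduced here only for illustration.

```python
import numpy as np

U = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)  # (S1) -> (S2), slit open
U_prime = U.copy()                                       # (S2) -> (F)


def transition_matrix(unitary):
    """Transition probabilities |U_ij|^2 between basis states for one step."""
    return np.abs(unitary) ** 2


start = np.array([1.0, 0.0])  # the electron starts in the first basis state

# Stochastic process: a probability distribution is assumed to exist at (S2),
# so the two steps are chained as transition matrices.
p_chained = transition_matrix(U_prime) @ transition_matrix(U) @ start

# Single symmetry transformation U'U from (S1) to (F), with nothing at (S2).
p_single = transition_matrix(U_prime @ U) @ start
```

Under these assumptions `p_chained` stays at 50/50 while `p_single` is 100/0, so the two descriptions give different statistics at (F).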

The Bell inequalities [@bell] do not hold—since Quantum Mechanics cannot be distinguished from a complete statistical theory—because the assumptions of the Bell inequalities overlooked the fact that time-evolution is a stochastic process if and only if it is deterministic. As long as the time-evolution of the phase-space is a symmetry and it respects relativistic causality, there is no reasonable argument why a complete statistical theory should be a stochastic process. The whole point of the Bell inequalities is to distinguish Quantum Mechanics from a “standard” statistical theory, but a “standard” statistical theory means that the theory is completely defined by a probability distribution in a phase-space (which is the case of Quantum Mechanics and classical statistical mechanics).

One could argue instead that the inequalities do hold, but there is an implicit assumption that the theory which is being compared to Quantum Mechanics has a time-evolution which is a stochastic process. Even in that case (see Section 12) we have that for any set of experimental results supporting relativistic Quantum Mechanics, there is a deterministic theory (and so the time-evolution is a stochastic process) which is also compatible with the same experimental results. So, to save the Bell inequalities we would need now to find fundamental arguments against such deterministic theory. But, which arguments? Such deterministic theory is compatible with any experimental test about relativistic causality and it is not super-deterministic. These arguments would need to be somehow against the existence of pseudo-random number generators in Nature, but such generators *do* exist in Nature because we humans built some of them and we are part of Nature.

To be sure, the present author does not expect that a reasonable deterministic theory will in the future replace Quantum Mechanics. But once it is established that Quantum Mechanics is a complete statistical theory, the idea that we can rule out a reasonable deterministic theory is also absurd: it would imply affirming the Bayesian point of view and ruling out the Frequentist point of view. Two logical constructions can always be mutually incompatible, despite being both consistent when considered independently of each other (e.g. the Bayesian and Frequentist points of view). In the Bayesian point of view, the probability expresses a degree of belief, and so the probability *is* an entity which exists by itself. In the Frequentist point of view the root of probabilities is the absence of deterministic information that does exist somehow and is revealed through events. But if such information exists, then we cannot rule out that there is a reasonable deterministic theory which describes such information.

In summary, either we can say that the Bell inequalities do not hold or instead, we can also say that the Bell inequalities (despite being mathematically valid inequalities) involve unrealistic assumptions which render them innocuous.

A probability distribution can be discrete or continuous (or a linear combination of discrete and continuous probability distributions). A continuous probability distribution is a probability distribution that has a cumulative distribution function that is continuous.

In the case of continuous probability distributions, each and every single point in the phase-space has null probability. This is fortunate for the wave-function parametrization, since in the linear space of square-integrable functions ($L^2$), the point evaluation is not a continuous linear functional (that is, $L^2$ is not a reproducing kernel Hilbert space). In fact, $L^2$ is a Hilbert space of equivalence classes of functions that are equal almost everywhere (that is, up to sets with null Lebesgue measure, and null Lebesgue measure implies null probability in the context of continuous probability distributions).

But it is not obvious how to extend the wave-function parametrization to conditioned probabilities of continuous probability distributions. A conditioned probability distribution is in itself a probability distribution and so it admits a wave-function parametrization. However, the original probability distribution also admits a wave-function parametrization and the question we address now is how to relate the parametrization of the conditioned probability with the parametrization of the original probability distribution.

When deriving the continuous probability distribution from the wave-function parametrization, the value of the probability distribution at a single point of the phase-space is ambiguous and thus we cannot calculate the conditioned probability without ambiguity. The extension is not obvious because in the conditioned probability we may know that an event has happened, even if the probability of such an event was null (e.g. a single point in the phase space). We could argue that the conditioned probability could be only an intermediate calculation, but this would clash with the Bayesian point of view where there are only conditioned probabilities. Also from a classical mechanics point of view, a single point in the phase space does have a meaning. This ambiguity is also at the root of the need for the renormalization process in Quantum Field Theory$^{11}$.

The conditioned probability is a particular case of a constrained system and the ambiguity described above also appears in constrained systems in general, whenever we want to define a wave-function parametrization of a probability distribution on a subset of the phase-space defined by constraints. The constraints are from a technical point of view, a representation of an ideal by the zero number. By an ideal we mean an ideal in the algebraic sense. Regarding the normalization of the conditional probability distribution, it is automatic since the wave-function parametrization is defined independently from the ideal.

The correspondence between geometric spaces and commutative algebras is important in algebraic geometry$^{12}$. It is usually argued that the phase space in quantum mechanics corresponds to a non-commutative algebra and thus it is a non-commutative geometric space in some sense [@connesnoncommutative]. However, after the wave-function collapse, only a commutative algebra of operators remains (see Section 1). Thus, the phase space in quantum mechanics is a standard geometric space and the standard spectral theory (where the correspondence between geometric spaces and commutative algebras plays a main role [@spectralhistory]) suffices.

It suffices to constrain to zero the Casimir operators of the (possibly non-commutative) Lie algebra of constraints. This imposes the constraints without the need for the constraints to be part of the commutative algebra; only the Casimir operators are included in the commutative algebra.

Once non-determinism is taken into account, then non-commutative operators can be taken into account and the constraints are the generators of a gauge symmetry group. In case the Lie group is infinite-dimensional, there is some ambiguity in its definition [@infinitelie]. We consider the $C^*$-algebra [@realoperatoralgebras] generated by the unitary operators on a Hilbert space of the form $e^{i\int d^4 x \theta(x) G(x)}$ where $G(x)$ is a constraint and $\partial_{\mu}\theta(x)$ is a square integrable function of space-time $x$ (see also Section 20).

Note that the algebra of observable operators already conserves the constraints (i.e. it is a trivial representation of the gauge symmetry), so the Hilbert space does not need to verify the constraints (i.e. it may be a non-trivial representation of the gauge symmetry). In fact, in many cases it would be impossible for the cyclic state of the Hilbert space to verify the constraints, as it was noted long ago:

“So we have the situation that we cannot define accurately the vacuum state. We therefore have to work with a standard ket $|S>$ which is ill-defined. One can, however, do many calculations without using the accurate conditions *[vacuum verifies constraints]* and the successes of quantum electrodynamics are obtained in this way.”

Paul Dirac (1955) [@Dirac:1955uv]

Indeed, there are some symmetries of the algebra of operators which the expectation functional necessarily cannot have (see also [@Klauder:2000gu]), since the expectation functional is a trace-class operator (the expectation of the identity operator is 1) and its dual space (the space of bounded operators) is bigger.

For instance, consider an infinite-dimensional discrete basis $\{e_k\}$ of a Hilbert space (indexed by the integer numbers $k$) and the symmetry group generated by the transformation $e_k\to e_{k+1}$ (translation). There is no normalized wave-function (and thus no expectation functional) which is translation-invariant, while there is a translation-invariant algebra of bounded operators (starting with the identity operator).
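The first claim can be verified in one step. As a sketch, write the wave-function as $\Psi=\sum_k \psi_k e_k$ (the coefficients $\psi_k$ and the phase $\alpha$ are notation introduced here only for this check). The translation maps $\Psi$ to $\sum_k \psi_{k-1} e_k$, so invariance up to a constant phase requires:

$\begin{aligned}
&\psi_{k-1}=e^{i\alpha}\psi_k \quad\Rightarrow\quad |\psi_k|=|\psi_0| \ \text{ for all integers } k\\
&\Rightarrow\quad \sum_{k=-\infty}^{+\infty}|\psi_k|^2=\sum_{k=-\infty}^{+\infty}|\psi_0|^2\in\{0,+\infty\}\end{aligned}$

The sum is therefore never equal to 1. On the other hand, the identity operator trivially commutes with the translation, which is why the algebra of bounded operators can be translation-invariant while the expectation functional cannot.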

We define gauge-fixing as comprehensive whenever it crosses all possible gauge-orbits at least once. On the other hand, we define gauge-fixing as complete whenever it crosses all possible gauge-orbits at most once, i.e. when there is no remnant gauge symmetry. The Dirac brackets require the gauge-fixing to be both comprehensive and complete, which is not possible in general due to the Gribov ambiguity [@henneaux1992quantization]. In a non-abelian gauge-theory, the Gribov ambiguity forces us to consider a phase-space formed by fields defined on not only space but also time. This is related to the fact that in a fibre bundle (the mathematical formulation of a classical gauge theory) the time cannot be factored out from the total space because the topology of the total space is not a product of the base-space (time) and the fibre-space, despite the fact that the total space is *locally* a product space. Thus, the Hamiltonian constraints cannot be interpreted literally, that is, as mere constraints in a too large phase-space whose “non-physical” degrees of freedom need to be eliminated. Moreover, this picture makes little sense in infinite dimensions: the gauge potentials can be fully reconstructed from the algebra of gauge-invariant functions, apart from the gauge potential and its derivatives at one specific arbitrary point in space-time [@wilsonloops]; thus the number of “non-physical” degrees of freedom would be finite at most, which clearly does not match the uncountably infinite number of constraints.

If we consider instead a commutative C*-algebra and its spectrum, such that any non-trivial gauge transformation necessarily modifies the spectrum while conserving the commutative C*-algebra (e.g. the gauge field $A_\mu$ which is a function of space-time), then one point in the spectrum is one example of a complete non-comprehensive gauge-fixing. The gauge-fixing is non-comprehensive because the action of the gauge group on the spectrum is not transitive. Such a commutative algebra has the crucial advantage that the constraints are necessarily excluded from the algebra, so that it can be used to construct a standard Hilbert space which is compatible with the constraints because the relevant operators of the commutative algebra are the ones commuting with the constraints, saving us the need to eliminate the “non-physical” degrees of freedom.

In the absence of constraints, we also consider a (particular) commutative C*-algebra: the AW*-algebra. A commutative AW*-algebra is a commutative C*-algebra whose projections form a complete Boolean algebra. Conversely, any complete Boolean algebra is isomorphic to the projections of some commutative AW*-algebra [55]. Therefore, the notion of probability is a particular case of a functional on a commutative C*-algebra; such a notion only arises in the absence of constraints.

Thus, the Hamiltonian constraints are in fact a tool to define an (effective) probability measure for a manifold with a non-trivial topology (a principal fibre bundle for the gauge group) [@gaugewhy], because a phase-space of gauge fields defined *globally* on a 4-dimensional space-time (i.e. a fibre bundle with a trivial topology, when the base space is the Minkowski space-time) produces well-defined expectation functionals for the gauge-invariant operators acting on a fibre bundle with a non-trivial topology [@gaugewhy]$^{13}$. On the other hand, setting non-abelian gauge generators to zero in the wave-function would require solving a non-linear partial differential equation with no obvious solution [@gaussYM; @integralYM; @globalYM; @dressYM]$^{14}$.

Note that it is crucial that the C*-algebra in the gauge-fixing is commutative and it is conserved by the gauge transformations. While this is not possible in the canonical quantization, it is possible with the quantization due to time-evolution [@pedro_1442442]. Note also that since only gauge-invariant operators are allowed, we must distinguish between the concrete manifold appearing in the phase-space and the family of manifolds (obtained from the concrete manifold through different choices of transition maps between local charts) to which the expectation values correspond.

The gauge symmetry is different from anomalies. An anomaly is a failure of a symmetry of the wave-function to be restored in the limit in which a symmetry-breaking parameter (usually introduced due to the mathematical consistency of the theory) goes to zero. We only consider symmetries of the Hamiltonian as candidate symmetries of the wave-function, since only these are respected by the time-evolution.

On the other hand, the constraints (which generate the gauge symmetry) cannot modify the wave-functions of the Hilbert space. Since in the case of a gauge symmetry there is no way to introduce a symmetry-breaking parameter, we can never observe an anomaly.

The ideal (gauge generator) in the gauge mechanics system is the charge operator:

$\begin{aligned}
Q(t)=-\dot{p}_\lambda(t)+\pi(t)\phi(t)+\pi^*(t)\phi^*(t)\end{aligned}$(25)

For consistency with General Relativity, we also impose a constraint for the observables to be translation invariant in the coordinate $t$. As will be discussed below, the cyclic vector defining the Hilbert space need not be translation-invariant; only the operators corresponding to observables need to commute with the translation operator:

$\begin{aligned}
T(\tau)=e^{i \frac{\tau}{2} \int dt \left[p_\lambda(t)\partial_t \lambda(t)+\pi(t)\partial_t \phi(t)-i\psi^\dagger(t)\partial_t \psi(t)+\mathrm{h.c.}\right]}
\label{eq:translation}\end{aligned}$(26)

In the Hamiltonian formalism, the constraints are from a technical point of view, a representation of an ideal by the zero number. By an ideal we mean an ideal in the algebraic sense.

We need to separate the ideal (gauge generator) from the gauge-invariant algebra. That is, not only the gauge-invariant algebra must commute with the ideal, but also the ideal cannot be included in the gauge-invariant algebra. This is guaranteed by non-comprehensive gauge-fixing: the gauge-invariant algebra is the sub-algebra of the commutative algebra with spectrum$^{15}$ given by the fields $\phi(t),\phi^*(t),\lambda(t)$, such that the sub-algebra commutes with the constraints. The conjugate field $p_\lambda(t)$ or its derivative $\dot{p}_\lambda(t)$ or the gauge generator are not part of the gauge-invariant algebra, since they do not commute with the corresponding field $\phi(t)$ which is included in the commutative algebra.

See also Weinberg, *The Quantum Theory of Fields*, p. 204 (flavor symmetry).

While there is a mathematically rigorous definition of classical field theory [@cftmath], so far the definition of a (classical) statistical field theory [@mussardosft] is tied to the definition of a quantum field theory [@Lang:1985nw], which involves a lattice spacing necessary to regularize and renormalize the ultraviolet divergences of the field theory. The notion of continuum limit in a discrete lattice is that for a large enough energy scale the predictions of the theory are independent from the type of discrete regularization used [@Lang:1985nw], thus the lattice in the regularized theory is always discrete. The regularization and renormalization are related to the decomposition of a field defined in the continuum through discrete wavelets and it is roughly the translation of the products of fields into products of wavelet components [@battle1999wavelets] (a related approach involves a semivariogram [@spatialdata]). Such translation of products of fields only allows polynomial Hamiltonians; in particular when using the Fock space without regularization (which is a possible way to implement a continuous tensor product [@continuoustensorp]), only quadratic Hamiltonians are allowed (i.e. for free fields).

This excludes a rigorous definition of the classical statistical version of many classical field theories (such as General Relativity), since so far there is no reason why the Hamiltonian of a classical field theory should be polynomial in the fields, not to mention the problems with Quantum Gravity [@Katanaev:2005xd]. This is unacceptable: for most classical field theories, the definition of the corresponding classical statistical field theories should be straightforward, because the real-world measurements are never fully accurate.

The above is an indication that an alternative definition of Statistical Field Theory which allows the definition of non-polynomial Hamiltonians should not be too hard to find. Indeed, the essential obstruction to an infinite-dimensional Lebesgue measure is its $\sigma$-finite property (to be the countable direct sum of finite measures) [@baker1991lebesgue; @baker2004lebesgue]. Once we drop the $\sigma$-finite property, several relatively simple candidates exist [@baker1991lebesgue; @baker2004lebesgue]. In our case, we are not looking for an infinite-dimensional Lebesgue measure (no one expects the probability measure itself to be translation-invariant), but only for a translation-invariant time-evolution of the probability measure (i.e. the time-evolution is an operator, not a real number). Thus there is no reason to expect such an operator to be $\sigma$-finite until it is evaluated against a probability measure, when it becomes another probability measure, which cannot be translation-invariant.

The time-evolution for any quantum system is a (unitary) linear operator. This is only possible because the linear space is infinite-dimensional, which allows non-linear equations to be converted into linear equations. In the case of field theory, while only free fields are allowed in Fock-spaces without renormalization [@Petz:1990gb], nothing prevents us from defining a free field over $\mathbb{R}^n$ with the number of dimensions $n$ finite but as large as needed.

There is an important theorem about bosonic Fock-spaces which is useful for our case [@indicator; @partitionfock; @skeide; @indicator2]. It states that the closed linear span of the exponential vectors

$\begin{aligned}
&\biggl\{\exp\Bigl(\int\limits_{0}^{+\infty} \mathop{}\!\mathrm{d}t\, (\chi_{[s_1,t_1]}+\cdots+\chi_{[s_n,t_n]})(t)\,a^\dagger(t)\Bigr)\left|0\right\rangle:\\
&0 \leq s_1 \leq t_1 \leq \cdots \leq s_n \leq t_n\biggr\}\end{aligned}$(27)

is the bosonic Fock-space $\Gamma(L^2(\mathbb{R}_+))$, where $n$ is a natural number, $\chi_{[s_n,t_n]}(t)$ is the indicator function in the interval $[s_n,t_n]$, $a^\dagger(t), a(t)$ are the creation and annihilation operators respectively and $\left|0\right\rangle$ is the vacuum state of the bosonic Fock-space. Note that the above defined exponential vector $\phi$ is normalized such that $\left\langle 0\middle| \phi\right\rangle=1$.
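The overlaps between such exponential vectors are easy to compute, using the standard identity $\left\langle e(f)\middle| e(g)\right\rangle=\exp\left\langle f\middle| g\right\rangle$ for exponential vectors $e(f)=\exp(\int \mathrm{d}t\, f(t)a^\dagger(t))\left|0\right\rangle$. For sums of indicator functions of disjoint intervals, $\left\langle f\middle| g\right\rangle$ is just the length of the overlap. The sketch below is an illustration only; the function names and the example intervals are hypothetical.

```python
import math


def overlap(intervals_a, intervals_b):
    """Total length of the intersection of two unions of disjoint intervals.

    Each argument is a list of (s, t) pairs with s <= t, representing the sum
    of the indicator functions of the intervals [s, t].
    """
    total = 0.0
    for s1, t1 in intervals_a:
        for s2, t2 in intervals_b:
            total += max(0.0, min(t1, t2) - max(s1, s2))
    return total


def inner_product(intervals_a, intervals_b):
    """Inner product of two exponential vectors of indicator functions."""
    return math.exp(overlap(intervals_a, intervals_b))


vacuum = []                      # empty union: f = 0, so e(0) is the vacuum
phi = [(0.0, 1.0), (2.0, 2.5)]   # chi_[0,1] + chi_[2,2.5]
chi = [(0.5, 2.25)]              # chi_[0.5,2.25]

vacuum_overlap = inner_product(vacuum, phi)  # <0|phi>
norm_squared = inner_product(phi, phi)       # <phi|phi>
cross_term = inner_product(phi, chi)         # <phi|chi>
```

In this example the overlap with the vacuum is exactly 1 for every exponential vector, whereas the squared norm $\exp(1.5)$ grows with the total length of the intervals.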

The theorem can be extended to $\Gamma(L^2(\mathbb{R}^n))$ [@indicator] with indicators of Borel sets. Since for the fermionic Fock-space there are only two possible values for the field at each point of $\mathbb{R}^n$, the fermionic Fock-space is also the closed linear span of exponential vectors with indicator functions. It remains to address the cartesian product of a discrete space with $\mathbb{R}^n$: the case of a finite discrete space is equivalent to introducing a finite number of flavours of free fields and thus the resulting Hilbert space is also the closed linear span of exponentials of indicators. The generalization to an infinite discrete space must be consistent with the finite case, in particular when selecting an arbitrarily large finite subset. Therefore, the resulting Hilbert space is also the closed linear span of exponentials of indicators.

We conclude that for the case of a discrete space only (without the cartesian product with $\mathbb{R}^n$), the appropriate generalization of the Fock-space is not the Quantum Harmonic Oscillator, but instead it is the closed linear span of exponentials of indicators. In particular, when an ideal selects a single point of $\mathbb{R}^n$, the corresponding Hilbert space at that point is only 2-dimensional and not the Hilbert space of a Quantum Harmonic Oscillator. Thus, the exponentials of indicators are a complete basis of the Fock-space, unlike the coherent states in general which are overcomplete. The Quantum Harmonic Oscillator is only related to the Fock-space in the sense that in the case of $\mathbb{R}^n$, if we do a discrete wavelet transform we obtain an infinite discrete number of Quantum Harmonic Oscillators, one Oscillator per element of the wavelet basis. Note that each discrete wavelet is necessarily non-local. Thus the widespread myth that a quantum field is an infinite set of quantum harmonic oscillators with one oscillator at each space-time point, is severely misleading at best because it ignores the crucial fact that a point in $\mathbb{R}^n$ has null Lebesgue measure. For the same reason, we cannot expect lattice field theory to provide a solid mathematical foundation to Quantum Field Theory beyond being a numerical approximation to a discrete wavelet transform of the Fock space in $\mathbb{R}^n$.

Thus, the (bosonic or fermionic) Fock-space can be used as the wave-function parametrization of an arbitrary probability distribution of a classical field over space (for instance the 3D space), in the following way. A commutative AW*-algebra is a commutative C*-algebra whose projections form a complete Boolean algebra. Conversely, any complete Boolean algebra is isomorphic to the projections of some commutative AW*-algebra [@awalgebras]. For a Hamiltonian involving at most second-order derivatives of the fields, the projections of the commutative AW*-algebra of a (non-free) field at an arbitrary discrete finite number $n$ of points in space are parametrized by the Fock-space over $S=\mathbb{R}^{d*(1+2*m+(d-1)*d/2)+m}$, where $d$ is the number of space dimensions and $m$ is the number of bosonic fields (to include fermionic fields we replace $\mathbb{R}$ by the discrete set $\{0,1\}$). The projection corresponding to a proposition is:

$\begin{aligned}
&\pi(\phi(x_1)\cap ...\cap \phi(x_n))=
\psi(\phi(x_1)\cap ...\cap \phi(x_n))\psi^\dagger(\phi(x_1)\cap ...\cap \phi(x_n))\\
&\psi(\phi(x_1)\cap ...\cap \phi(x_n))=a^\dagger(\phi_1,x_1)...a^\dagger(\phi_n,x_n)|0>\end{aligned}$(28)

A maximal ideal is defined by the exponential vector of indicator functions and the Fock-space parametrizes the space of probability distributions, such that the probability corresponding to a maximal ideal is the modulus squared of the complex number corresponding to the maximal ideal which appears in the expansion of the wave-function in terms of exponential vectors of indicator functions. Therefore, the maximal ideal takes care of what happens sequentially along several indexes, while the probability distribution takes care of what happens in parallel at different elementary events.

The above is consistent with the fact that a complete physical system is also a free system. The free field associated to a free system can be made an orthogonal (fermion) or symplectic (boson) real representation of the Poincaré group, depending on whether its spin is half-integer or integer respectively, regardless of the interactions occurring within the free system [@spinstatistics; @wigner]. Thus in field theory the wave-function parametrization includes a free field parametrization.

In (quantum or classical) statistical field theory, the problem we want to solve is about a probability distribution, so it is not about an eigenvalue problem and diagonalizing a time-evolution operator, because the eigenfunction need not even exist and we use ideals instead (see the previous section). On the other hand, in classical field theory (including in numerical calculations such as the finite element method [@sobolevfem; @loggfem]) it is about the fields themselves, so the solution must be part of a Hilbert space (because completeness of the space is crucial for the existence proofs) and we need an alternative to the $L^2$ norm, since the differential operator is unbounded with respect to the $L^2$ norm: such an alternative is the Sobolev Hilbert space [@sobolev].
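
A standard example (not taken from the references above) shows why the $L^2$ norm alone is not enough. On the interval $[0,2\pi]$ take $f_k(x)=\sin(kx)$ with integer $k\geq 1$:

$\begin{aligned}
&\|f_k\|_{L^2}^2=\int_0^{2\pi}\sin^2(kx)\,dx=\pi\\
&\|\partial_x f_k\|_{L^2}^2=\int_0^{2\pi}k^2\cos^2(kx)\,dx=k^2\pi\\
&\|f_k\|_{H^1}^2=\|f_k\|_{L^2}^2+\|\partial_x f_k\|_{L^2}^2=(1+k^2)\pi
\end{aligned}$

The ratio $\|\partial_x f_k\|_{L^2}/\|f_k\|_{L^2}=k$ grows without bound, whereas $\|\partial_x f_k\|_{L^2}\leq\|f_k\|_{H^1}$ for every $k$, so the derivative is a bounded map from the Sobolev space $H^1$ to $L^2$.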

The fact that the Hamiltonian only involves local interactions allows us to introduce a *-homomorphism where a finite number of points of the continuum space is selected. Then we can do a wave-function parametrization, which allows for a non-deterministic (infinitesimal) time-evolution for these selected points. Moreover, the (deterministic or non-deterministic) time-evolution of this finite number of points can be determined independently of the full probability distribution of the initial state, which may be a complex problem because it may involve an infinite-dimensional Sobolev phase-space with some correlation between the points due to differentiability requirements [@ringstrom2009cauchy]. This allows us to approximate the probability distribution of the initial and final state through numerical methods for partial differential equations and regression (Gaussian process regression [@gpr] or statistical finite element method [@statfem], for instance).

The fact that we are dealing with a *commutative* algebra is key to allowing the selection of only a finite number of points of the continuum space. This is only possible because we make use of the wave-function parametrization only when it is convenient, in this case only after the selection of a finite number of points of the continuum space. We can do it because the wave-function really is just a parametrization, without a physical counterpart.

If we had assumed that there is an infinite-dimensional canonical commutation relation algebra [@Petz:1990gb] from the beginning (as in most literature about Quantum Field Theory, instead of the commutative algebra we considered) then the *-homomorphism where a finite number of points of the continuum space is selected would not be possible. So our formalism includes the Fock-space (i.e. free fields), but not the other way around. Therefore, the Hamiltonian is quadratic in the creation/annihilation operators and no further regularization is needed (the free field parametrization can be considered a regularization by itself).

Moreover, in the classical statistical field theory case where the time-evolution is deterministic[^16], the wave-function parametrization is crucial to define an expectation functional and its time-evolution which are mathematically well defined. Without the wave-function parametrization, the selection of only a finite number of points of the continuum space is much harder already for classical field theory [@finitecft].

The cost of the free field parametrization is that we need to implement derivatives and coordinates in continuum space as an extra structure at the local level using constraints, which allows well-defined products of fields and their derivatives at the same point in the continuum space. For a Hamiltonian which depends on the field derivatives up to second order, the constraints are $iD_x-i\partial_x=0$ where:

$\begin{aligned}
&[p_{(j)},\phi^{(k)}]=i\delta_j^k\\
&[p_{(j)},\phi^{(2)}(x)]=i\delta_j^2\\
&[\phi^{(j)},p_{(0)}(x)]=-i\delta_0^j\\
&p(x)=p_{(0)}(x)\\
&\phi(x)=\phi^{(0)}+\phi^{(1)}x+\frac{1}{2}\phi^{(2)}(x)x^2\\
&D_x=[\partial_x,p_{(0)}(x)]\phi^{(0)}+\sum_{j=1}^{2} p_{(j-1)}\phi^{(j)}+p_{(2)}[\partial_x,\phi^{(2)}(x)]\\
&[iD_x-i\partial_x, H]=0\end{aligned}$(29)

Crucially, due to the Bianchi identity and that the Hamiltonian is translation invariant, we have:

$\begin{aligned}
&\int d\phi dx a^\dagger(\phi,x)[[iD_x,\phi(x)],H]a(\phi,x)=-\int d\phi dx a^\dagger(\phi,x)[iD_x,[H,\phi(x)]]a(\phi,x)\\
&\int d\phi dx a^\dagger(\phi,x)[[iD_x,p(x)],H]a(\phi,x)=-\int d\phi dx
a^\dagger(\phi,x)[iD_x,[H,p(x)]]a(\phi,x) \end{aligned}$
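
The sign in these relations can be traced as follows. This is a sketch assuming that the identity invoked is the Jacobi identity for commutators, and that translation invariance, $[i\partial_x,H]=0$, combined with the last line of Eq. (29) gives $[iD_x,H]=0$:

$\begin{aligned}
&[[iD_x,\phi(x)],H]+[[\phi(x),H],iD_x]+[[H,iD_x],\phi(x)]=0\\
&[[iD_x,\phi(x)],H]=-[[\phi(x),H],iD_x]=-[iD_x,[H,\phi(x)]]
\end{aligned}$

The same steps apply with $\phi(x)$ replaced by $p(x)$.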