Functional analysis is a well-developed field in the discipline of mathematics, which provides unifying frameworks for solving many problems in applied sciences and engineering. In particular, several important topics in signal processing (e.g., spectrum estimation, linear prediction, and wavelet analysis) were initiated and developed through collaborative efforts of engineers and mathematicians, who used results from Hilbert spaces, Hardy spaces, weak topology, and other topics of functional analysis to establish essential analytical structures for many subfields of signal processing. This paper presents a concise tutorial on the theoretical concepts of the essential elements of functional analysis, which form a mathematical framework and backbone for central topics in signal processing, specifically statistical and adaptive signal processing. Applying these concepts to formulate and analyze signal processing problems may often be difficult for researchers in applied sciences and engineering who are not adequately familiar with the terminology and concepts of functional analysis. Moreover, these concepts are often not explained in sufficient detail in the signal processing literature; on the other hand, they are well studied in textbooks on functional analysis, yet without emphasizing the perspectives of signal processing applications. Therefore, assimilating the pertinent information on functional analysis and explaining its relevance to signal processing applications should have significant importance and utility for the professional communities of applied sciences and engineering. The information presented in this paper is intended to provide an adequate mathematical background with a unifying concept for apparently diverse topics in signal processing.
The main objectives of this paper from the above perspectives are summarized below: (1) Assimilation of the essential information from different sources of functional analysis literature, which are relevant to developing the theory and applications of signal processing. (2) Description of the underlying concepts in a way that is accessible to non-specialists in functional analysis (e.g., those with bachelor-level or first-year graduate-level training in signal processing and mathematics). (3) Signal-processing-based interpretation of functional-analytic concepts and their concise presentation in a tutorial format.
Functional analysis is built upon normed vector spaces, and particularly inner product spaces, which are merged with diverse notions of topology and geometry, linear algebra, probability theory, and real and complex analysis (see, e.g., [1,2,3,4,5]). Topics in functional analysis include various concepts such as Banach spaces and Hilbert spaces, linear operators and their spectral theory, as well as group and semigroup theory. Knowledge of these mathematical structures is often essential for understanding and solving a variety of analytical problems in signal processing and related fields, as well as in mathematics itself. For example, in functional analysis, objects like functions are considered as elements or points in a space of functions, and hence the name functional analysis.
Results generated from functional analysis form key concepts in the frameworks of advanced scientific and engineering disciplines, including the fields of statistical signal processing and adaptive signal processing. Although adaptive signal processing can be viewed as a branch of statistical signal processing, the special properties of this field and their roles in engineering applications have led many specialists to consider them as two separate fields. Therefore, in many universities and research institutions around the world, statistical signal processing and adaptive signal processing are taught as independent graduate courses in engineering and applied sciences, and many textbooks have been devoted to studying these important fields individually (e.g., see [8,9] and references therein). Nevertheless, both statistical signal processing and adaptive signal processing form the backbone of the so-called modern signal processing, in which signals are generally considered as random processes. Modern signal processing covers many topics of current interest, such as signal modeling and estimation, signal prediction, signal compression, adaptive lattice filtering, adaptive joint process estimation, recursive least squares lattice filtering, and spectrum estimation. The issues related to processing of both deterministic and random signals are further discussed below.
While an estimation error may typically converge to zero for deterministic signals, this is generally not the case for random signals. Therefore, in statistical and adaptive signal processing, it is common practice to make estimators of random signals unbiased (i.e., the expectation of the estimation error converges to zero). As explained later, this type of convergence is of a special kind, which is known in functional analysis as weak convergence (see, for example, [10,11]). Therefore, many important results in functional analysis are obtained in terms of weak convergence and weak topology, which have potentially significant implications for the subfields of statistical signal processing and adaptive signal processing. Moreover, it is usually desirable in estimation theory to identify optimal filters, which bridges the discipline of signal processing to that of optimization theory. To this end, researchers in modern signal processing often deal with random processes, for which optimization problems become more challenging and the usage of advanced mathematical tools is justified.
From a historical perspective, some of the spaces used in functional analysis are named after the mathematicians who originally developed their theories. Indeed, much of the theoretical work has been associated with the names of eminent mathematicians (e.g., Gauss, Lagrange, Euler, and Kolmogorov). In fact, the Hilbert space, which is a central topic in functional analysis, is one of the most commonly used mathematical frameworks of signal processing and the associated optimization. The unique features of Hilbert spaces are explained in the paper from these perspectives. However, the names of other well-known spaces (e.g., metric spaces and normed spaces) were given based on the technical properties of these spaces; many of the spaces frequently used in functional analysis have been named based on quite different historical backgrounds.
This paper presents a concise and focused review of key concepts of functional analysis that have strong relevance to modern signal processing. The most important spaces from the perspective of functional analysis, considered in this paper, are metric/topological spaces, Banach spaces, and Hilbert spaces. The relations among these and other spaces are illustrated in Figure 1. Other relevant vector spaces, such as the summable (ℓ^p), Lebesgue-integrable (L^p), and Hardy (H^p) spaces, are also introduced in the paper.
The paper is organized in four sections, including the current section, and an Appendix A. Section 2 introduces Banach spaces and their relevant theorems, with special emphasis on the ℓ^p/L^p spaces, Hardy spaces, spectral factorization, and weak topology in the setting of Banach spaces. Section 3 presents Hilbert spaces and their relevant features (e.g., Fourier series expansion and the orthogonality principle) along with some applications to signal processing and detection theory, such as wavelets, the Karhunen-Loève (KL) expansion, and reproducing-kernel Hilbert spaces (RKHS). Section 4 summarizes and concludes the paper. The Appendix A introduces elementary concepts and definitions in real analysis, probability theory, and topological spaces, which should be helpful for understanding the fundamental principles of functional analysis as applied to various concepts of signal processing; however, readers who are familiar with these concepts may only selectively refer to the Appendix A.
2. Banach Spaces for Signal Analysis
This section deals with Banach spaces for general applications to signal processing; it also introduces the concepts of Hardy spaces, especially for digital signal processing. Further details on Banach spaces are provided in standard books on functional analysis, such as Bachman and Narici and Naylor and Sell.
2.1. Introduction to Banach Spaces
We start this subsection with the definition of a Banach space, i.e., a complete normed space.
(Banach Spaces). Let a vector space X be defined over a field F, where examples of F are the field ℝ of real numbers and the field ℂ of complex numbers. Let a function ‖·‖ : X → [0, ∞), called a norm, have the following properties:
(positivity) ‖x‖ ≥ 0 for all x ∈ X, and ‖x‖ = 0 if and only if x = 0.
(homogeneity) ‖αx‖ = |α| ‖x‖ for all α ∈ F and x ∈ X.
(triangle inequality) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ X.
Then, (X, ‖·‖) is called a normed vector space, where the norm serves as a metric via d(x, y) = ‖x − y‖. A real (resp. complex) normed linear space that is complete (i.e., where every Cauchy sequence converges in the space) is called a real (resp. complex) Banach space.
The spaces of sequences, ℓ^p with 1 ≤ p ≤ ∞, form an important class of Banach spaces, which are extensively used in digital signal processing. These are linear vector spaces of all real (resp. complex) sequences x = (x₁, x₂, …) such that ‖x‖_p < ∞, where the ℓ^p-norm is defined as ‖x‖_p = (Σ_k |x_k|^p)^{1/p} for 1 ≤ p < ∞, and ‖x‖_∞ = sup_k |x_k|.
Some of the theorems on ℓ^p spaces, which are extensively used in the analyses of discrete-time signals, are presented below.
(Hölder Inequality). Let p, q ∈ (1, ∞) with 1/p + 1/q = 1. If x ∈ ℓ^p and y ∈ ℓ^q, then the pointwise product xy ∈ ℓ^1 and ‖xy‖₁ ≤ ‖x‖_p ‖y‖_q.
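As a numerical illustration of the Hölder inequality for sequences, the sketch below checks ‖xy‖₁ ≤ ‖x‖_p ‖y‖_q on random finite sequences; the conjugate pair p = 3, q = 3/2 and the sequence length are illustrative choices, not from the paper.

```python
import numpy as np

# Numerical check of the Hölder inequality for finite sequences:
# ||x y||_1 <= ||x||_p * ||y||_q  with conjugate exponents 1/p + 1/q = 1.
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = rng.standard_normal(100)

p, q = 3.0, 1.5                       # conjugate pair: 1/3 + 2/3 = 1
lhs = np.sum(np.abs(x * y))           # ||x y||_1
rhs = np.sum(np.abs(x)**p)**(1/p) * np.sum(np.abs(y)**q)**(1/q)
assert lhs <= rhs                     # Hölder bound holds
```

The same check with p = q = 2 reduces to the Cauchy-Schwarz inequality.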
It is noted that the Lebesgue-integrable versions of ℓ^p spaces, for applications to continuous-time signal processing, are called L^p spaces. In L^p spaces, the Hölder and Minkowski inequalities are similar to their respective ℓ^p versions in Theorems 1 and 2.
Next we focus on a few systems-theoretic applications of Banach spaces, which would require the operation of convolution.
(Convolution inequality). For sequences h ∈ ℓ^1 and u ∈ ℓ^p for some p ∈ [1, ∞], the convolution product h ∗ u ∈ ℓ^p and ‖h ∗ u‖_p ≤ ‖h‖₁ ‖u‖_p.
Let us assume that is true. Then, there exists a subsequence bounded below by a real number , which implies that is bounded below by so that as . This contradicts the assertion . □
Let a linear discrete-time dynamical system with an impulse response be excited by an input signal to yield an output signal .
(BIBO-stability). A system is said to be bounded-input-bounded-output (BIBO)-stable if the output y ∈ ℓ^∞ for every input u ∈ ℓ^∞. More generally, the system is called ℓ^p-stable if y ∈ ℓ^p for every u ∈ ℓ^p, where p ∈ [1, ∞].
For a linear shift-invariant (LSI) system, the impulse response takes the form , where the output is given by the convolution as :
Using Theorem 3, if and for some , then it follows that :
It is noted that h ∈ ℓ^1 is a sufficient condition for the system to be ℓ^p-stable. Furthermore, using Lemma 1, it follows that if u ∈ ℓ^p for some p ∈ [1, ∞), then y[n] → 0 as n → ∞. This information is useful, for example, in the design of a linear shift-invariant estimation system, where the output signal represents the estimation error. If the system impulse response h ∈ ℓ^1, then the estimation error is bounded and converges asymptotically to zero if the input signal u ∈ ℓ^p for some p ∈ [1, ∞).
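The ℓ¹ sufficient condition for BIBO stability can be checked numerically. The sketch below uses a hypothetical exponential impulse response h[n] = aⁿ (a = 0.9) whose ℓ¹-norm is 1/(1−a), and verifies the output bound ‖y‖_∞ ≤ ‖u‖_∞ ‖h‖₁ for a bounded input; all numeric choices are illustrative.

```python
import numpy as np

# h in l^1 is sufficient for BIBO stability of an LSI system.
# Hypothetical impulse response h[n] = a**n (|a| < 1), whose l^1 norm
# is 1/(1 - a); the infinite sum is truncated at 5000 terms.
a = 0.9
n = np.arange(5000)
h = a**n
l1_norm = np.sum(np.abs(h))          # ~ 1/(1 - 0.9) = 10

# Bounded input => bounded output: |y[n]| <= ||u||_inf * ||h||_1.
u = np.sign(np.sin(0.1 * n))         # bounded input, ||u||_inf <= 1
y = np.convolve(h, u)[: n.size]
assert np.max(np.abs(y)) <= l1_norm + 1e-9
```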
(Adaptive Filtering). In a general setting, let us consider an adaptive filtering problem in Figure 2, where a measurement vector is used to construct an estimate of the desired signal by a linear shift-variant filter. Then, the task is to synthesize an adaptive algorithm to update the filter such that the estimation error converges to zero. Using Lemma 1, this could be achieved if the error sequence belongs to ℓ^p for some p ∈ [1, ∞) in the adaptive algorithm.
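The adaptive filtering setup above can be sketched with a least-mean-squares (LMS) update, a standard choice for this problem class (the paper does not prescribe a specific algorithm); the filter length, step size, and the noise-free linear model generating the desired signal are all illustrative assumptions.

```python
import numpy as np

# Minimal LMS sketch: update w so that the estimation error
# e[n] = d[n] - w^T x[n] decays.  The true filter w_true, the filter
# length (3), and the step size mu are hypothetical choices.
rng = np.random.default_rng(1)
w_true = np.array([0.5, -0.3, 0.1])
w = np.zeros(3)
mu = 0.05                            # step size

errors = []
for _ in range(2000):
    x = rng.standard_normal(3)       # measurement vector
    d = w_true @ x                   # desired (noise-free) signal
    e = d - w @ x                    # estimation error
    w += mu * e * x                  # LMS weight update
    errors.append(abs(e))
```

With a noise-free model the weight vector converges to w_true and the error sequence decays toward zero, consistent with the ℓ^p requirement above.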
If a dynamical system at any time n does not depend on future inputs (i.e., the system depends only on past and present inputs), then the system is said to be causal, and the convolution in Equation (1) reduces to
If, in addition, , then it follows that
2.2. Hardy Spaces and Spectral Factorization for Signal Processing
This subsection introduces the concept of Hardy spaces H^p, which constitute a class of Banach spaces with a special structure; this structure is very useful for digital signal processing. In particular, the H² and H^∞ spaces are of importance in robust control theory, and it will be seen later in this section that the H² space also plays an important role in power spectrum factorization in digital signal processing.
Recall that, for a linear shift-invariant system with an impulse response and input, the output is obtained by convolution. Then, setting the complex variable z = e^{jω}, where ω is the frequency in radians, the z-transform of the impulse response is defined as:
which is known as the system transfer function (The one-dimensional z-transform of the discrete-time impulse response is the ratio of two polynomials: , where the degree of is less than or equal to that of for physically realizable systems. However, for the multi (i.e., n)-dimensional z-transform, where , the resulting transfer function is given as the ratio of the numerator and denominator multinomials:
The analysis of multi-dimensional z-transform (e.g., in signal processing of spatio-temporal processes) is significantly more complicated than that of one-dimensional z-transform, because the fundamental theorem of algebra may not be applicable to multinomials while it is always applicable to polynomials.) in the z-domain.
The system is stable if the sum in Equation (4) converges, and the region of convergence (ROC) is called the stability region, where all poles of the transfer function are located inside the unit circle with its center at zero in the complex z-plane. The system is said to be minimum-phase if all zeros of the transfer function are located inside the unit circle. If all zeros are located outside the unit circle, then the system is called maximum-phase, and the system is called non-minimum-phase if at least one zero is located outside the unit circle.
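These root-location criteria are easy to check numerically for rational transfer functions. The sketch below classifies a hypothetical H(z) = B(z)/A(z) by the roots of its numerator and denominator polynomials (coefficients in descending powers of z); the example polynomials are illustrative, not from the paper.

```python
import numpy as np

# Classify a rational transfer function H(z) = B(z)/A(z) by root
# locations: stable <=> all poles inside the unit circle;
# minimum-phase <=> all zeros inside the unit circle.
def classify(b, a):
    zeros, poles = np.roots(b), np.roots(a)
    stable = bool(np.all(np.abs(poles) < 1))
    if np.all(np.abs(zeros) < 1):
        phase = "minimum-phase"
    elif np.all(np.abs(zeros) > 1):
        phase = "maximum-phase"
    else:
        phase = "non-minimum-phase (mixed)"
    return stable, phase

# H(z) = (1 - 0.5 z^-1) / (1 - 0.9 z^-1): zero at 0.5, pole at 0.9.
print(classify([1, -0.5], [1, -0.9]))   # -> (True, 'minimum-phase')
```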
(Analytic Functions). Let be the open disc of radius with center at . A complex-valued function , where , is said to be analytic in if the derivative of exists at each point of .
Given , the Hardy space is a set of analytic functions with bounded -norm defined as:
The following theorem, due to Paley and Wiener, presents a fundamental result in the -space, which is important for spectral factorization in signal processing and for innovation representation of random processes.
(Paley-Wiener). Let be a complex-valued function of the complex variable z. If , then there exists a real positive constant and a complex-valued function corresponding to a causal stable system with a causal stable inverse such that
where the superscript “her” indicates the Hermitian, i.e., the complex conjugate of the transpose of a vector/matrix, and is the complex conjugate of z. If, in addition, is a rational polynomial, the above factors and are the minimum-phase and maximum-phase components, respectively. This is called the Paley-Wiener condition.
The proof of the Paley-Wiener Theorem is given in detail by Therrien. □
It follows from Equation (1) that, for a linear shift-invariant stable system with a deterministic LSI impulse response and a wide sense stationary (WSS) input signal , the expected value of the output is:
Since the input is WSS, the expected values of the output and the input are related as:
Autocorrelation of a random vector is denoted as , and the cross-correlation between the output and the input is given by
The above equation leads to the following important relations between correlation functions :
where the superscript her indicates the Hermitian, i.e., the complex conjugate of transpose of a vector/matrix.
The Fourier transform of for a WSS random sequence is called the power spectral density function, defined as:
and its inverse Fourier transform, which is equal to the autocorrelation function, is obtained as:
The z-transform of the autocorrelation function for a WSS random sequence is called the complex spectral density function and is defined as:
and its inverse is given by the contour integral
Since the autocorrelation function of a zero-mean white noise with variance is given by , the power spectral density is a constant for a stationary white noise.
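The flat spectrum of stationary white noise can be observed numerically by averaging periodograms over many independent realizations; the record length, number of trials, and variance σ² = 2 below are illustrative choices.

```python
import numpy as np

# The power spectral density of zero-mean white noise with variance
# sigma^2 is the constant sigma^2.  Averaging periodograms over many
# realizations exposes this numerically.
rng = np.random.default_rng(2)
sigma2 = 2.0
N, trials = 256, 2000

psd = np.zeros(N)
for _ in range(trials):
    w = rng.normal(0.0, np.sqrt(sigma2), N)
    psd += np.abs(np.fft.fft(w))**2 / N      # single-record periodogram
psd /= trials                                # averaged periodogram

# The averaged periodogram is approximately flat at sigma^2.
assert np.allclose(psd.mean(), sigma2, rtol=0.05)
```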
Using the property that the convolution in the time domain is a product in the Fourier transform domain and using Equation (9), it follows that
where is the system transfer function (i.e., the Fourier transform of ). A few algebraic computations yield the following relation :
In a similar manner, the following relations are obtained for the complex spectral density
Let us consider a WSS random sequence whose complex spectral density satisfies the Paley-Wiener condition:
Then, by Theorem 4, there exists a real positive constant and a complex-valued transfer function of a causal stable system with a causal stable inverse such that
A process, whose (complex) spectral density satisfies Equation (17), is called a regular process (see [7,14]). The spectral density factorization given by Equation (18) has important applications in signal processing. This includes what is called innovations representation of the random process , in view of which, any regular process can be realized as the output of a causal linear filter driven by a white noise with variance as shown in Figure 3.
It is worth mentioning that this type of process covers a wide range of random processes. In particular, any process whose complex spectral density is a rational function of z is a regular process.
(Example 1). Consider a random sequence with a complex spectral density function:
which could be re-written as:
Using the Paley-Wiener Theorem, can be realized as the output of a causal stable system, given by:
excited by a zero-mean white noise with unit variance . It is important to note that since is a rational polynomial, should be minimum-phase. This is the case for the one given by Equation (19).
Since the function can be factored as:
a possible pitfall here is to choose
The term in Equation (20) is not minimum-phase because it has a zero at . Moreover, the inverse is not causal. Therefore, the spectral factorization with given by Equation (20) is not physically realizable for the given random sequence .
As mentioned before, any random process whose complex spectral density is a rational function of z is a regular process, and therefore it satisfies the Paley-Wiener condition. However, rationality is not a necessary condition for being a regular process, as seen in the following example.
(Example 2). Let a random sequence have a complex spectral density .
Then, the corresponding power spectral density satisfies the Paley-Wiener condition that is given as:
Therefore, the given random sequence is regular and has an innovations representation. The spectral factorization can be done as follows:
Then, the causal factor is given by
which converges everywhere except at . The impulse response of the filter is: , where is the unit (discrete) step function, because
So, the given random sequence can be realized as the output of a system, with a transfer function given by Equation (21), which is driven by a zero-mean white noise with a unit variance (i.e., ).
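The innovations representation can be demonstrated numerically: driving a causal, causally invertible filter with unit-variance white noise reproduces the second-order statistics of the regular process. The sketch below uses the hypothetical AR(1) factor H(z) = 1/(1 − 0.8 z⁻¹) (not the examples above, whose specific spectra are omitted here) and compares the sample autocorrelation against the closed form r[k] = a^|k|/(1 − a²).

```python
import numpy as np

# Innovations representation: a regular process realized by filtering
# unit-variance white noise e[n] through a causal minimum-phase filter.
# Hypothetical choice: H(z) = 1 / (1 - a z^-1) with a = 0.8 (AR(1)).
rng = np.random.default_rng(3)
a = 0.8
e = rng.standard_normal(200_000)            # unit-variance innovations
h = a ** np.arange(200)                     # truncated impulse response of H(z)
x = np.convolve(e, h)[: e.size]             # x[n] = a x[n-1] + e[n]

# Theoretical AR(1) autocorrelation: r[k] = a**|k| / (1 - a**2).
r0_theory = 1.0 / (1.0 - a**2)
r0_est = np.mean(x * x)
assert abs(r0_est - r0_theory) / r0_theory < 0.05
```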
In fact, a regular process is related to the corresponding predictable process, which can be predicted with zero error. The relation between these two processes is given by the following fundamental theorem.
(Wold Decomposition Theorem). A general random sequence can be written as the sum of two processes as:
where is a regular process and is a predictable process, with being orthogonal to , i.e., .
It follows from the Appendix A that an appropriate collection of open sets in a metric space defines its topology, and such a topology is called a metric topology or strong topology. In fact, a base for the strong topology on a Banach space X is the collection of all open balls, i.e., sets of the form:
where the center g is a vector/function in X and the radius r is a positive real number. In this topology, convergence of a sequence, , of functions in X to a limit g in X is referred to as strong convergence, which implies that and is denoted by . Besides strong convergence, other notions of convergence (e.g., weak convergence and uniform convergence) have been introduced in the literature, which play significant roles in the theory of Banach algebra .
We now introduce the notions of weak convergence and weak topology. Given a Banach space X over a field , let be a set of bounded linear functionals (A functional is a mapping of a vector space X into its field . Then, the set of all linear bounded (equivalently, linear continuous) functionals in X is called the dual space .) on X, i.e., each is an element in the dual space and hence . Given an and a vector/function , let us define the set:
A class of such sets is obtained by varying in Equation (24) to establish the notions of weak convergence and weak topology. Some of these convergence concepts in the space of linear bounded operators are briefly explained in the following definitions, which are introduced for different notions of convergence of sequences of bounded linear operators in Banach spaces.
(Convergence in operator norm or uniform convergence). Let be a bounded linear operator from V into V. Then, the sequence converges to some in the operator norm (also called uniform convergence) if the induced norm , which is denoted as: .
(Strong convergence). Let be a bounded linear operator from V into V. Then, the sequence converges strongly to some if , which is denoted as .
(Weak convergence). Let be a bounded linear operator from V into V. Then, the sequence converges weakly to some if
which is denoted as .
(Convergence in operator norm) ⇒ (Strong convergence) ⇒ (Weak convergence). The converse is not true, in general.
To show () ⇒ (), we proceed as:
, implies that, given , i.e., , it follows that .
To show () ⇒ (), we proceed as:
. Let ; then, it follows from linearity and boundedness of the functional f that . Therefore,
We demonstrate the falsity of the converse by two counterexamples, one for each case.
(Strong convergence) ⇏ (Convergence in operator norm): Let us define and a sequence of bounded linear operators as:
Therefore, is a bounded linear operator, i.e., . Since , it follows that
However, the limit may not converge in the induced norm, as seen by choosing with .
Therefore, (Strong convergence) ⇏ (Convergence in operator norm).
(Weak convergence) ⇏ (Strong convergence): Let us define a sequence of bounded linear operators as:
where . It is given that is a sequence of bounded linear operators, i.e., each . Furthermore, in this Hilbert space setting, it follows from the Riesz Representation Theorem that every can be represented as:
It is noted that, for finite-dimensional vector spaces, the notions of strong convergence and weak convergence are indistinguishable. Equivalently, we make the following statement:
In a finite-dimensional Banach space V, the weak topology generated by is the same as the strong topology generated by V.
However, in the analysis of stochastic processes, we deal with infinite-dimensional spaces of signal functions, which may not have the same criteria for weak convergence and strong convergence. This is especially applicable to statistical signal processing, where the expectation of the estimation error is required to weakly converge to zero without having the strong convergence of the error signal itself to zero.
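The classical counterexample separating weak from strong convergence is the orthonormal basis sequence eₙ in ℓ²: every inner product ⟨eₙ, y⟩ tends to zero, yet ‖eₙ‖ = 1 for all n. The sketch below illustrates this on a finite truncation; the dimension and the test vector y = (1/n) are illustrative choices.

```python
import numpy as np

# In an infinite-dimensional Hilbert space, the orthonormal basis
# vectors e_n converge weakly to zero (<e_n, y> -> 0 for every y in l^2)
# but not strongly (||e_n|| = 1 for all n).  Finite truncation below.
N = 10_000
y = 1.0 / np.arange(1, N + 1)               # a fixed l^2 vector: y[n] = 1/(n+1)

inner_products, norms = [], []
for n in [10, 100, 1000, 9999]:
    e_n = np.zeros(N)
    e_n[n] = 1.0                            # n-th standard basis vector
    inner_products.append(np.dot(e_n, y))   # equals y[n] = 1/(n+1) -> 0
    norms.append(np.linalg.norm(e_n))       # always exactly 1

assert inner_products[-1] < 1e-3            # weak convergence to zero
assert all(abs(v - 1.0) < 1e-12 for v in norms)   # no strong convergence
```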
Based on the concept of weak convergence, weak topology is defined as follows:
(Convergence in weak topology). Given a Banach space X, let there be a class of bounded linear functionals , and let be the topology in X generated by . Then, for a given vector/function , a sequence is said to converge to g in the weak topology , denoted as in , provided that converges strongly to , denoted as .
Weak convergence in Definition 7 is a generalization of weak convergence as introduced in the functional analysis literature, which implies that a sequence converges weakly to some if .
The concepts of topological spaces and weak topology are important for learning using statistical invariants (LUSI). In a machine learning paradigm, learning machines often compute statistical invariants for specific problems with the objective of reducing the expected values of errors in such a way that these invariants are preserved. In contrast to classical machine learning, which employs the mechanism of strong convergence for approximations to the desired function, LUSI can significantly increase the rate of convergence by combining the mechanisms of strong convergence and weak convergence. Furthermore, the notion of weak topology is also important when dealing with shift spaces for signal analysis that uses symbolic dynamics, as explained in [18,19].
3. Hilbert Spaces for Signal Processing
This section introduces the concept of Hilbert spaces, which forms the backbone in the disciplines of signal processing and other fields of engineering. Details are provided in many textbooks such as Naylor and Sell .
(Hilbert Spaces). Let a vector space X be defined over a field F, which is either ℝ or ℂ. A function is called an inner product if, for all vectors in X and scalars in F, the following conditions hold:
(positive definiteness) when ;
Then, X equipped with an inner product is called an inner product space or a pre-Hilbert space, and a complete inner product space (i.e., where every Cauchy sequence converges in the space) is called a real (resp. complex) Hilbert space, depending on whether the vector space is defined over ℝ (resp. ℂ).
The following two properties are immediate consequences of the four properties in Definition 8:
It is also noted that every inner product space is a normed space with the norm .
An example of a Hilbert space is the space of square-summable sequences. Given two sequences and in , the inner product is given by . Two vectors x and y in a Hilbert space H are said to be orthogonal if . Given a subspace , its orthogonal complement is denoted as: ; consequently,
Hilbert spaces share many interesting properties that make them important in optimization theory. As we will see in the sequel, these properties form the core of many fundamental results in adaptive and statistical signal processing, and they are established through the following theorem.
(Riesz Representation Theorem [2,5]). Let H be a Hilbert space. Then, for every bounded linear functional , there exists a unique such that .
For the Hilbert spaces (resp. ), this result can be obtained by using a theorem, which states that, given , (resp. ) is isometrically isomorphic to the dual space of (resp. ) provided that , where q is called the conjugate of p. Since the conjugate of is , it follows that is isometrically isomorphic to , and similar relations hold for and (for example, see ); hence, and are reflexive. Generalization of this fact is stated as the following theorem.
Every Hilbert space is reflexive, i.e., H is isometrically isomorphic to its dual space .
The proof of this theorem is given in many textbooks on functional analysis (e.g., [5,10,20]).
Another important property of Hilbert spaces, which is widely used in signal processing in combination with the previous two properties (see Theorems 6 and 7), is given as the following theorem:
(Orthogonal Projections). Let H be a Hilbert space, and let V be a closed subspace of H, implying that V is also a Hilbert space. Then, it follows that
(i) . That is, given , there exists a unique pair and such that .
(ii) is the unique vector in V having minimal distance from a vector , while is the unique vector in having minimal distance from x.
(iii) The orthogonal projections and are linear continuous operators, with norm .
It follows from Theorem 8 that the decomposition in Equation (22) in Section 2 is indeed unique. Based on this fact, any random process generally consists of two unique orthogonal components: a predictable component and an unpredictable component. That is, if one wants to predict by using N past observations , then let denote an optimal linear prediction of . Such a prediction can be obtained by applying the orthogonality principle, where the prediction error is given by , and the process can be expressed as:
Hence, the part represents the predictable part of , which corresponds to in the Wold Decomposition Theorem in Equation (22), while the error represents the unpredictable part of , which corresponds to in the Wold Decomposition Theorem. That is, the regular process represents the difference between the random process and its optimal prediction. Therefore, the output of represents only the new information brought by , which cannot be extracted from the past observations; hence, this output is called the innovations process, as depicted in Figure 3b.
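The orthogonality principle reduces optimal linear prediction to solving the normal equations R w = r, where R is the autocorrelation matrix of the observations. The sketch below solves them for a hypothetical AR(1) process with a = 0.9 and predictor order N = 3 (illustrative choices), for which the optimal one-step predictor is known in closed form.

```python
import numpy as np

# Optimal linear prediction via the orthogonality principle: the error
# x[n] - w^T [x[n-1], ..., x[n-N]] must be orthogonal to the
# observations, which yields the normal equations R w = r.
a = 0.9
N = 3                                        # predictor order (illustrative)
# Autocorrelation of the unit-innovation AR(1) process: r[k] = a**|k| / (1 - a**2).
r = a ** np.arange(N + 1) / (1.0 - a**2)
R = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])
w = np.linalg.solve(R, r[1:])                # predictor coefficients

# For AR(1), the optimal one-step predictor is x_hat[n] = a * x[n-1],
# so the extra taps are zero.
assert np.allclose(w, [a, 0.0, 0.0], atol=1e-10)
```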
Another interesting result on Hilbert spaces is stated in the following theorem [2,10].
(Bessel Inequality). Let H be a Hilbert space and let V be a subspace of H. If denotes the orthogonal projection of elements in H into V, then the Bessel inequality
holds for every . Moreover,
This theorem has an important implication for signal processing, as explained in the following subsection.
3.1. Fourier Series Expansion in a Hilbert Space
In the Hilbert space , which is the space of all square-integrable periodic functions with a period , an inner product is defined as:
Let the set of functions , , where
Then, it follows by setting that
where is called the Kronecker delta. Moreover, it turns out that is dense in , i.e., the completion (see ). Therefore, given any , there exists a sequence of scalars such that
Therefore, is in fact an orthonormal basis of . The infinite sum is called the Fourier series expansion of f, where are the Fourier coefficients. Using Equation (29) and taking the inner product of both sides of Equation (30) by , for any , yields
Hence, Equation (30) can be rewritten in the following form:
Moreover, it follows from Equation (26) in Theorem 9 that
In view of Equation (32), the Fourier expansion of any square-integrable signal can be decomposed as a linear combination of harmonic modes with frequencies , where each Fourier coefficient represents the signal’s component associated with each mode . Furthermore, Equation (33) reveals how the signal’s energy is distributed over its components and demonstrates an important fact: for each component , the value represents the part of the signal’s energy contributed by that component. This fact plays a central role in signal compression, where a signal is approximated by using as few Fourier coefficients as possible; this is accomplished with a minimum approximation error by retaining those coefficients with large magnitudes and discarding those with small magnitudes.
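The compression idea above can be sketched with the discrete Fourier transform as a stand-in for the Fourier series: keep only the K largest-magnitude coefficients and reconstruct. The two-tone test signal, the record length, and K = 20 are illustrative assumptions; for this signal only four coefficients are nonzero, so the reconstruction is lossless up to round-off.

```python
import numpy as np

# Signal compression with Fourier coefficients: keep only the K
# largest-magnitude DFT coefficients and reconstruct.  By Parseval,
# the squared error equals the energy of the discarded coefficients.
N, K = 1024, 20                              # illustrative sizes
t = np.linspace(0.0, 1.0, N, endpoint=False)
f = np.sin(2*np.pi*5*t) + 0.5*np.sin(2*np.pi*12*t)

c = np.fft.fft(f) / N                        # Fourier coefficients
idx = np.argsort(np.abs(c))[::-1][:K]        # K largest in magnitude
c_k = np.zeros_like(c)
c_k[idx] = c[idx]                            # discard the rest
f_hat = np.real(np.fft.ifft(c_k) * N)        # compressed reconstruction

# Two pure tones occupy only 4 bins, so K = 20 captures them all.
assert np.max(np.abs(f - f_hat)) < 1e-10
```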
Now we summarize the main results of Fourier series expansion of periodic functions as a theorem.
(Fourier Series Theorem).Let be an orthonormal set in a Hilbert space H. Then, the following statements are equivalent:
is an orthonormal basis of H, i.e., is a complete orthonormal set in H.
(Fourier series expansion) Any vector can be expanded as: . Note: The inner products are called Fourier coefficients of the vector x.
(Parseval Equality) For any two vectors , the inner product:
Let U be a subspace of H such that U contains the sequence . Then, U is dense in H, i.e., .
3.2. Fourier Transform and Inverse Fourier Transform
For decomposition by Fourier series expansion, a function needs to be periodic, as seen in Section 3.1. To extend this analysis to non-periodic functions, we first consider square-integrable periodic functions and let the period tend to infinity so that the periodicity restriction can be removed. Then, a combination of Equations (28) and (31) yields:
and we define:
Having and as , it follows that
Now we have the Fourier transform of a signal. Since the space of square-integrable functions is obtained as a completion, we impose a mild restriction: f is required to be both absolutely integrable and square-integrable. Nevertheless, this restriction is satisfied if f is an analytic function.
To obtain the inverse Fourier transform, we substitute Equations (28) and (35) into Equation (30), which yields
By defining , we have . Then, substitution of into Equation (37) for yields
In the limits and , Equation (38) becomes the inverse Fourier transform upon interpreting the sum as a Riemann sum:
This formula shows that a signal has, at any given time t, (possibly) uncountably many harmonic components distributed over the frequency range , and the magnitude of the harmonic component at a frequency is given by the signal’s Fourier transform . By taking the limits and , it follows from Equations (33) and (34) that
The above relation is known as Plancherel's theorem, which implies that the total energy of the signal, obtained in the time domain, is re-distributed over the frequency domain such that the energy density at each frequency is given by the squared magnitude of the Fourier transform. It is worth mentioning that the inner products of two functions f and g in the time domain and the frequency domain are related by:
Therefore, a sufficient condition that guarantees that the DTFT is well-defined is that the sequence is absolutely summable. The original sequence can be recovered from its DTFT by the inverse discrete-time Fourier transform (IDTFT)
Converting the frequency from radians/second to hertz, i.e., by setting , it follows that
Although the Fourier transform plays a central role in signal analysis, it considers the time-averaged frequency behavior of the signal by integrating over the entire time domain. This property reduces the capability of capturing abrupt (i.e., rapid) changes which may occur in the signal; capturing such rapid changes is crucial in many applications, such as detection of faults and anomalies. In order to remedy this shortcoming of the Fourier transform, the signal is integrated over a time window, instead of over the entire time domain. This gives rise to the so-called windowed Fourier transform (WFT), which augments the Fourier transform with a time-localization property that provides information about the signal simultaneously in the time and frequency domains [21,22]; a quantum-mechanics-based explanation of time-frequency localization is also available in the literature.
3.3. Windowed Fourier Transform in a Hilbert Space
A function g is said to have compact support if it vanishes outside a compact set B. Given such a window function g and a time shift, let us define
where the complex conjugate of g is used, and T is a positive real number denoting the window length. Hence, the result is a localized version of f. Then, the windowed Fourier transform (WFT) of f is the Fourier transform of this localized version, which is given as:
and the inverse WFT is obtained as:
Following Equations (48) and (49), an inner product is defined as:
Using Parseval’s identity in Equation (42), we have
An example of the window function g is:
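A common concrete choice of window is a Gaussian. The sketch below (a hypothetical chirp-like signal and discretized integrals, not taken from the paper) illustrates the time-frequency localization of the WFT: the transform computed before t = 0 responds to the low frequency, and after t = 0 to the high frequency:

```python
import cmath
import math

def wft(f, g, tau, omega, t_grid, dt):
    """Discretized WFT: F(tau, omega) ~ sum_t f(t) g(t - tau) e^{-i omega t} dt
    (the window g here is real-valued, so conjugation is trivial)."""
    return sum(f(t) * g(t - tau) * cmath.exp(-1j * omega * t)
               for t in t_grid) * dt

sigma = 0.5
g = lambda t: math.exp(-t * t / (2 * sigma ** 2))  # Gaussian window

# Hypothetical test signal: 5 rad/s before t = 0, 20 rad/s afterwards.
f = lambda t: math.sin(5 * t) if t < 0 else math.sin(20 * t)

dt = 0.01
t_grid = [k * dt for k in range(-1000, 1000)]  # t in [-10, 10)

early_lo = abs(wft(f, g, -5.0, 5.0, t_grid, dt))   # large response
early_hi = abs(wft(f, g, -5.0, 20.0, t_grid, dt))  # negligible
late_lo = abs(wft(f, g, 5.0, 5.0, t_grid, dt))     # negligible
late_hi = abs(wft(f, g, 5.0, 20.0, t_grid, dt))    # large response
```

Because the Gaussian window decays rapidly, each evaluation of the WFT only "sees" the signal within a few window widths of the chosen time shift.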
As mentioned in a previous subsection, if the signal is discrete, its DTFT is used to provide a frequency representation of the signal. Although the signal in this case is discrete in the time domain, its DTFT is a continuous function of the frequency. However, most devices used in signal processing are digital, and therefore it is more convenient to deal with a discrete frequency representation of the signal. Moreover, the discrete signal in many cases represents measurement data provided by sensors, and such signals are usually of finite length. The discrete Fourier transform (DFT) is a useful tool in signal processing that accommodates these two issues, as explained below.
Given a finite-length discrete signal , the DFT is given as:
The inverse discrete Fourier transform IDFT is:
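A direct (quadratic-time) transcription of these two formulas is given below, assuming the common convention that places the 1/N factor in the inverse transform (Equations (52) and (53) may normalize differently):

```python
import cmath
import math

def dft(x):
    """X[k] = sum_{n=0}^{N-1} x[n] exp(-i 2 pi k n / N),  k = 0, ..., N-1."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) sum_{k=0}^{N-1} X[k] exp(+i 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0]
X = dft(x)
x_rec = [v.real for v in idft(X)]  # round trip recovers the original samples
```

In practice, the fast Fourier transform (FFT) computes the same quantity in O(N log N) operations.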
Returning to the continuous WFT, if a function f is windowed over a time interval, the resulting WFT has a time-localization property, as seen earlier at the end of Section 3.2. Moreover, Equation (51) shows that the WFT of f also localizes to a neighborhood in the frequency domain. Therefore, the WFT has both time and frequency localization. However, due to the uncertainty principle, these two kinds of localization have different physical interpretations and are mutually exclusive in the sense that making a WFT sharper in time makes it flatter in frequency, and vice versa [21,23]. Moreover, the WFT is not efficient in scanning signals that involve time intervals much shorter or much longer than the window length T. To address these issues, the notion of the wavelet transform is introduced in Section 3.4, which provides an efficient analysis tool to capture signal features occurring over both short and long intervals.
3.4. Wavelet Transform in a Hilbert Space
The continuous wavelet transform (CWT) of a function is defined as:
where , called the wavelet, is a scaled and translated version
of what is called a mother (or basic) wavelet ; and is the complex conjugate of .
It is noted from Equation (55) that when the scale exceeds one, the wavelet is a stretched version of the mother wavelet, and when the scale is less than one, it is a compressed version. Moreover, if the scale is negative, then the wavelet is a reflected version of the mother wavelet. These stretching, compression, and reflection operations are conveniently performed on the time axis. The exponent p in Equation (55) is a real number that stretches or compresses the wavelet along the vertical axis. The idea of using p in Equation (55) is to keep a desired norm unchanged when scaling the wavelet. For example, if p = 1/2, then the scaled wavelet and the mother wavelet have the same L²-norm; and if p = 1, then they have the same L¹-norm.
Using Parseval’s identity, Equation (54) can be written as
where is the Fourier transform of . This equality shows that the wavelet transform localizes signals in both the time and frequency domains, where the sharpness of these localizations is controlled by the scaling factor s and the choice of the mother wavelet.
Morlet wavelet is a (frequency-modulated) mother wavelet which is given in the time domain as:
whose Fourier transform is
where is the center frequency around which the signal is localized in the frequency domain.
Various forms of the mother wavelet have been reported in the wavelet literature [21,22]. All of these wavelet forms should satisfy the admissibility condition:
where is the Fourier transform of .
At a fixed scale s, the CWT of a signal yields information relevant to the feature contained in the signal at that scale, and the behavior of this feature over time is captured by translating over t. This process is then repeated for different scales by changing s, so as to capture other features of the signal that are relevant to those scales.
Given a CWT of a signal , the original signal f can be reconstructed by
where is a constant depending on the wavelet , and Equation (60) shows that any signal can be represented as a superposition of shifted and dilated wavelets .
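The scale-scanning behavior described above can be sketched numerically. The example below (hypothetical signal, Mexican hat mother wavelet, L²-preserving normalization p = 1/2, and a discretized integral) shows that a small-scale CWT responds strongly at the location of a narrow feature but only weakly at a wide one:

```python
import math

def psi(t):
    """Mexican hat mother wavelet (up to a constant normalization)."""
    return (1 - t * t) * math.exp(-t * t / 2)

def cwt(f, s, t, grid, du):
    """Discretized CWT with p = 1/2:
    W(s, t) ~ (1 / sqrt(s)) * sum_u f(u) psi((u - t) / s) du."""
    return sum(f(u) * psi((u - t) / s) for u in grid) * du / math.sqrt(s)

# Hypothetical signal: a narrow bump at u = 2 and a wide bump at u = -2.
f = lambda u: math.exp(-((u - 2) / 0.2) ** 2) + math.exp(-((u + 2) / 2.0) ** 2)

du = 0.01
grid = [k * du for k in range(-800, 801)]  # u in [-8, 8]

at_narrow = abs(cwt(f, 0.2, 2.0, grid, du))  # strong response at scale 0.2
at_wide = abs(cwt(f, 0.2, -2.0, grid, du))   # weak response at this scale
```

Repeating the scan at a larger scale would reverse the situation, with the wide bump producing the stronger response.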
For a discrete signal , the discrete wavelet transform (DWT) is used with a discrete wavelet as:
where s is the scaling parameter and t is the shifting parameter. The most commonly used discrete wavelets have the following values of the parameters:
where j is an integer that controls the scaling parameter and specifies the level of wavelet decomposition of the signal, and k is another integer which controls the shifting parameter. Substitution of these values into Equation (61) yields the most common form of the discrete wavelet
Notice that large values of j result in large scaling parameters which stretch the wavelet function and let the DWT capture low-frequency features in the signal. On the other hand, small values of j would make the DWT more capable of capturing high-frequency features by decreasing the scaling parameter [21,22].
Given a wavelet level j, the DWT of a sequence consists of the following two parts:
The average coefficients are given by:
and the detail coefficients are described by:
where the scaling function is associated with the wavelet function; full details are available in the literature.
Let us now consider a special case of the DWT. Until now, the analyses (i.e., the computation of the quantities in Equation (50) or Equation (54), or of their discrete samples) have been made directly from the relevant integrals with the necessary values of the time-frequency or time-scale parameters. Around 1980, a new method for performing the DWT was created, which is known as Multiresolution Analysis (MRA). This method is completely recursive and is therefore well-suited for computation, as succinctly described below.
In MRA, we may think of the level-1 DWT of a signal as the output of two filters connected in parallel, consisting of a low-pass filter with the impulse response g and a high-pass filter with the impulse response h, as seen in Figure 4. This is known as the filter bank implementation of DWT, consisting of different levels j. The cutoff frequency of each filter in the filter bank equals half the bandwidth of the respective input signal. Hence, the output of each filter has half the bandwidth of the original sequence, so it is subsampled by 2. That is,
Therefore, given a level-j DWT of a discrete-time signal, if the sequence of average coefficients is passed through a parallel combination of identically structured filters g and h, then the output is the level-(j+1) DWT, as seen in Figure 4. The features associated with different frequency components of the signal can be captured by using a multilevel wavelet decomposition via iterative implementation of filter banks in the setting of time and frequency localization (see, for example, [21,23,24]).
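A minimal sketch of one level of this filter bank, using the Haar filters as the simplest concrete choice of g and h (the paper does not fix a particular wavelet here):

```python
import math

S = 1 / math.sqrt(2)  # Haar filter coefficient

def haar_dwt_level(x):
    """One analysis level: low-pass (average) and high-pass (detail)
    filtering, each followed by downsampling by 2.  len(x) must be even."""
    avg = [S * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]
    det = [S * (x[2 * i] - x[2 * i + 1]) for i in range(len(x) // 2)]
    return avg, det

def haar_idwt_level(avg, det):
    """Synthesis level: perfect reconstruction of the input sequence."""
    x = []
    for a, d in zip(avg, det):
        x += [S * (a + d), S * (a - d)]
    return x

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
avg, det = haar_dwt_level(x)
x_rec = haar_idwt_level(avg, det)
```

Feeding `avg` back into `haar_dwt_level` yields the level-2 coefficients, which is exactly the recursion of Figure 4; since the Haar filters are orthonormal, the energy of `x` is split exactly between `avg` and `det`.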
Let us consider a function having a wavelet transform , which can be interpreted as the “details” contained at fixed scales . This interpretation is especially useful in the discrete case for understanding the principles of MRA, as seen below.
Let be a zero-mean unit-variance probability density function, which has the following properties:
Assuming that ϕ is at least n times differentiable, it follows that its derivatives up to order n are well-defined. Now, letting ψ be such a derivative of ϕ, we have
Thus, satisfies the admissibility condition in Equation (59) and hence can be used to define a CWT.
For and , let and . Then, is a probability density with mean t and standard deviation ; and is qualified to be a wavelet family by setting in Equation (61).
As a numerically explicit example, let ϕ represent the zero-mean unit-variance Gaussian density. Since the Gaussian density is infinitely differentiable, n can be taken to be any positive integer; for instance, the first and second derivatives can serve as mother wavelets, and so on. Because of the shape of its graph, the (negated) second derivative of the Gaussian density is popularly known as the Mexican hat mother wavelet, which is often used in engineering applications.
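As a quick numerical check of this construction, the sketch below takes the Mexican hat in its common unnormalized form (1 − t²)e^{−t²/2}, a constant multiple of the negated second derivative of the Gaussian density, and verifies that it has zero mean (a necessary consequence of the admissibility condition):

```python
import math

def mexican_hat(t):
    """Mexican hat wavelet: (1 - t^2) exp(-t^2 / 2), an (unnormalized)
    negated second derivative of the unit Gaussian density."""
    return (1 - t * t) * math.exp(-t * t / 2)

# Zero mean: the integral of the wavelet over the real line vanishes.
dt = 0.001
mean = sum(mexican_hat(k * dt) for k in range(-10000, 10001)) * dt

peak = mexican_hat(0.0)  # central peak of the "hat"
lobe = mexican_hat(2.0)  # negative side lobe (the "brim")
```

The positive central peak and negative side lobes give the graph its sombrero shape, hence the name.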
3.5. Karhunen-Loève Expansion of Random Signals
Karhunen-Loève (K-L) expansion is a powerful tool that generalizes the Fourier series expansion for the analysis of random time-dependent signals. The K-L expansion is frequently used in statistical signal processing and detection theory, using deterministic time-dependent orthonormal functions with random-variable coefficients.
(Karhunen-Loève Expansion). Let be a zero-mean, second-order random process, defined over where , with a continuous covariance function . Then, it follows that
where the (countable) sequence of (deterministic) functions is a complete orthonormal set of solutions to the following integral equation:
and the random coefficients are mutually statistically orthogonal, i.e.,
The deterministic functions are orthonormal in the following sense:
(K-L expansion of white noise).Let the covariance function of zero-mean stationary white noise be . Then, the orthonormal functions satisfy the K-L integral equation, for all , as:
It is also true that , which implies that . Thus, the choice of these orthonormal functions is arbitrary and all ’s are identically equal to . It is concluded that, for any zero-mean white noise, the K-L expansion functions can be any set of orthonormal functions with all eigenvalues .
(K-L expansion as an application to detection theory). Let us assume that a waveform is observed over a finite time interval to decide whether it contains a recoverable signal buried in noise, or whether the signal is completely noise-corrupted (i.e., the signal cannot be recovered). In this regard, we formulate a binary hypothesis testing problem with the hypothesis of having a recoverable signal and the hypothesis of complete noise corruption, i.e.,
where the signal is a deterministic function of time, and the noise is modeled as zero-mean, unit-variance, white Gaussian. Using the K-L expansion, we simplify the above decision problem by replacing the waveform with a sequence , which reduces to a sequence of simpler problems as:
where and are the respective (at most countably many) K-L coefficients of the signal and noise .
Now we take the K-L transform (instead of the Fourier transform) of the received signal, where the transform space is the space of sequences of K-L coefficients that are mutually statistically orthogonal random variables. By taking advantage of the facts that the noise is zero-mean Gaussian and that the K-L coefficients are mutually statistically orthogonal, the random variables become jointly independent, i.e., they form a sequence of independent and identically distributed (iid) random variables. By selecting the first orthonormal function as:
we can complete the rest of the orthonormal set in a valid way. We also notice that all of the signal's K-L coefficients, with the exception of the first one, will be zero, i.e., only the first coefficient is affected by the presence or absence of the recoverable signal. Thus, the distributed detection problem is reduced to the following scalar detection problem:
We note that the scalar can be computed as:
which is commonly referred to as a matching operation. In fact, this operation can be performed by sampling the output of a filter whose impulse response is:
where the parameter T should be chosen sufficiently large to make the impulse response causal. The output of the physically realizable filter at time T is then . This filter is called a matched filter and is widely used in the disciplines of communications and pattern recognition.
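A discrete-time sketch of the matched-filter statistic follows (a hypothetical unit-energy signal; a deterministic bounded sequence stands in for the noise so that the outcome is reproducible; comparing the statistic against a mid-level threshold then separates the two hypotheses):

```python
import math

# Hypothetical known signal over N samples, normalized to unit energy.
N = 64
raw = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
scale = math.sqrt(sum(v * v for v in raw))
s = [v / scale for v in raw]

def matched_statistic(y):
    """R1 = sum_n y[n] s[n]: correlation of the observation with the known
    signal -- equivalently, the output at the final sample of a filter whose
    impulse response is the time-reversed signal h[n] = s[N-1-n]."""
    return sum(a * b for a, b in zip(y, s))

# Deterministic stand-in for the noise (bounded energy, not matched to s).
noise = [0.3 * (-1) ** n for n in range(N)]

A = 10.0                                          # signal amplitude under H1
y_h1 = [A * si + ni for si, ni in zip(s, noise)]  # H1: signal present
y_h0 = noise                                      # H0: noise only

stat_h1 = matched_statistic(y_h1)  # near A (Cauchy-Schwarz bounds the rest)
stat_h0 = matched_statistic(y_h0)  # bounded by the noise norm
```

By the Cauchy-Schwarz inequality, no other unit-norm correlator can extract more of the signal's amplitude than this matched one.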
3.6. Reproducing Kernel Hilbert Spaces
This subsection develops the concept of reproducing kernel Hilbert spaces (RKHS), in which evaluation at each point of the domain is a bounded (equivalently, continuous) linear functional. This continuity implies that if two functions f and g are close to each other in the function space (i.e., the norm of their difference is small), then f and g are also close to each other pointwise, i.e., their pointwise difference is small for all t.
The RKHS has many engineering and scientific applications, including those in harmonic analysis, wavelet analysis, and quantum mechanics. In particular, functions from an RKHS have special properties that make them useful for function estimation problems in high-dimensional spaces, which is critically important in the fields of statistical learning theory and machine learning. In fact, every function in an RKHS that minimizes an empirical risk functional can be expressed as a linear combination of the kernel functions evaluated at the training points. This property reduces the problem from an infinite-dimensional one to a finite-dimensional one.
We now present a formal definition of reproducing kernel Hilbert spaces (RKHS). The presented theory is often applied to real-valued Hilbert spaces and can be extended to complex-valued Hilbert spaces; examples of complex-valued RKHS are spaces of analytic functions.
(Reproducing Kernel Hilbert Spaces).Let T be an arbitrary non-empty set (e.g., the time domain or the spatial domain of a function) and let H be a Hilbert space of real-valued (resp. complex-valued) functions on T, equipped with pointwise vector addition and pointwise scalar multiplication, and the continuous functions in H are evaluated at each point . Then, H is defined to be a reproducing kernel Hilbert space (RKHS) if there exist a positive real and a continuous linear functional on H such that . [Note: Although is constrained to be a positive real, it is possible that .]
Definition 9 is rather a weak condition to ensure the existence of an inner product and the evaluation of every functional on H at every point in the domain T. From the application perspectives, a more useful definition would be to construct an inner product of a given function with another function , which is the so-called reproducing kernel function for the Hilbert space H; the RKHS has taken its name from here.
To make Definition 9 more useful for many applications, we make use of the Riesz representation theorem (Theorem 6), which states that there exists a unique with the following reproducing property for each , which takes values at any given as:
Since, for a given , the function takes values in (resp. ) and having another associated with the parameter and a corresponding functional on H, it follows that
The above situation can be interpreted as follows: is a time translation of from t to if the set T is the time domain of the functions in the Hilbert space. This allows us to redefine the reproducing kernel of the Hilbert space H as a function as: .
(Bandlimited approximation of Dirac delta function in the RKHS setting).Let us consider the space of continuous signals that are also band-limited with frequencies under the compact support, i.e., in the range of , where the cutoff frequency . It is noted that is a bandlimited version of the Dirac delta function, because converges to the delta distribution, expressed as in the weak sense, as the cutoff frequency Ω tends to infinity.
Let us define and , where is the space of continuous functions whose domain is T, and the Fourier transform of f is: and the inverse Fourier transform of is: . Then, it follows by Cauchy-Schwarz inequality and Plancherel theorem that:
It follows from the relation: , established earlier, that the functional and the RKHS kernel function are bounded. Therefore, H is indeed an RKHS.
By choosing the kernel function in this case as: , and by taking , it follows that as , the Fourier transform of the kernel becomes
This is a consequence of frequency modulation due to the time-shifting property of Fourier transform. Then, it follows by using Plancherel theorem that:
Thus, the reproducing property of the kernel is established as the cutoff frequency .
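The reproducing property can also be checked numerically. The sketch below (a hypothetical band-limited test signal sinc², bandwidth 2, kernel cutoff Ω = 3, and truncated Riemann sums) verifies that the inner product with the sinc kernel reproduces pointwise values:

```python
import math

OMEGA = 3.0  # kernel cutoff frequency

def K(t, tau):
    """Sinc reproducing kernel sin(OMEGA (t - tau)) / (pi (t - tau)),
    with the limiting value OMEGA / pi on the diagonal tau = t."""
    d = t - tau
    if abs(d) < 1e-12:
        return OMEGA / math.pi
    return math.sin(OMEGA * d) / (math.pi * d)

def f(t):
    """Band-limited test signal (squared sinc): bandwidth 2 < OMEGA."""
    if abs(t) < 1e-12:
        return 1.0
    return (math.sin(t) / t) ** 2

def reproduce(t, half_width=200.0, dtau=0.01):
    """<f, K(t, .)> ~ sum_tau f(tau) K(t, tau) dtau (truncated Riemann sum)."""
    n = int(half_width / dtau)
    return sum(f(k * dtau) * K(t, k * dtau) for k in range(-n, n + 1)) * dtau

err0 = abs(reproduce(0.0) - f(0.0))  # reproducing property at t = 0
err1 = abs(reproduce(1.5) - f(1.5))  # ... and at t = 1.5
```

Since the signal's spectrum lies inside the kernel's passband, convolving with the sinc kernel acts as an ideal low-pass filter that leaves the signal unchanged.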
4. Summary and Conclusions
This paper sheds light on some of the key concepts from functional analysis, which provide a unified mathematical framework for solving problems in engineering and applied sciences, and especially in modern signal processing. Additionally, the simple (and yet elegant) way in which this framework facilitates the formulation of different topics in signal processing, along with several relevant examples, enables solving many problems in science and engineering by utilizing concepts from the discipline of functional analysis. Some of the important results from functional analysis can find their way to contribute to further advances in statistical and adaptive signal processing. Nevertheless, one of the main difficulties in doing so is the existing gap between the terminologies and technical languages used in these two (apparently different) fields; this paper attempts to (at least partially) bridge this gap.
Conceptualization, N.F.G., A.R. and W.K.J.; methodology, N.F.G., A.R. and W.K.J.; software, N.F.G. and A.R.; formal analysis, N.F.G. and A.R.; model preparation and validation, N.F.G. and A.R.; data curation, N.F.G. and A.R.; writing—original draft preparation, N.F.G., A.R. and W.K.J.; writing—review and editing, N.F.G., A.R. and W.K.J.; funding acquisition, A.R. All authors have read and agreed to the published version of the manuscript.
The reported work has been supported in part by the U.S. Air Force Office of Scientific Research under Grant No. FA9550-15-1-0400, by the U.S. Army Research Office under Grant No. W911NF-20-1-0226, and by the U.S. National Science Foundation under Grant no. CNS-1932130. Findings and conclusions or recommendations, expressed in this publication, are those of the authors and do not necessarily reflect the views of the sponsoring agencies.
Data Availability Statement
The data presented in this study are available on request from the corresponding author.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Appendix A. Preliminary Concepts
This appendix introduces several preliminary (but critical) concepts from real analysis, probability theory, and topology, mainly taken from Naylor and Sell , Royden , Stark and Woods , Ash , and Munkres . These references are expected to be helpful, along with their cited materials therein, for appropriate understanding of key concepts that are frequently encountered in the discipline of modern signal processing.
Appendix A.1. Metric Spaces and Topological Spaces
This subsection introduces rudimentary concepts of metric and topological spaces. Details are available in the afore-mentioned standard textbooks.
(Metric Spaces).Let X be a non-empty set. A function , where is the space of real numbers, is called a metric (or a distance function) on X if the following conditions hold for all :
Positivity: ρ(x,y) ≥ 0, and ρ(x,y) = 0 iff x = y;
Symmetry: ρ(x,y) = ρ(y,x);
Triangle Inequality: ρ(x,z) ≤ ρ(x,y) + ρ(y,z).
The pair is called a metric space; if there is no ambiguity on ρ, the metric space is denoted only by X.
A well-known example of a metric space is the n-dimensional Euclidean space , where n is a positive integer, for every vector and in .
The set X, upon which the metric ρ operates, is an arbitrary nonempty set. The conditions (i)–(iii) in Definition A1 are obvious if ρ operates on the real line. However, in general, there can be other types of metric operators, and X need not be a Euclidean space; an example is the Hamming distance defined on sets of symbol sequences, which is widely used in error-correction theory to measure the distance between two code words.
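A minimal sketch of the Hamming distance (the code words are illustrative):

```python
def hamming(u, v):
    """Hamming distance: the number of positions at which two equal-length
    symbol sequences differ."""
    if len(u) != len(v):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(u, v))

d = hamming("10110", "11100")  # these two code words differ in 2 positions
```

Positivity and symmetry are immediate, and the triangle inequality holds because every position where u and w differ must differ between u and v or between v and w.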
(Open and Closed Sets). A set in a metric space is called open if, for every point y in the set, there exists an open ball of some radius centered at y that is contained in the set. A set is called closed if its complement is open.
(Cauchy Sequence).A sequence in a metric space is called a Cauchy sequence if . In other words, as .
(Completeness of a metric space). A metric space is called complete if every Cauchy sequence converges in the metric space.
(Sequential Compactness).A metric space is said to be sequentially compact if every sequence of points in contains a convergent subsequence , and it is called sequentially precompact if every sequence of points in contains a Cauchy subsequence (see Definition A3 and also see ).
In contrast to metric spaces, where a distance function on a set X is used to introduce key concepts (e.g., neighborhood, open set, closed set, and convergence), a more powerful approach is to specify a system in terms of open sets to introduce such properties; this leads to the notion of a topological space (e.g., ).
(Topological Spaces).A topology on a nonempty set X is a collection ℑ of subsets of X, which has the following properties:
ℑ must contain the empty set ∅ and the set X.
Any union of the members in ℑ, belonging to an arbitrary (i.e., finite, countable, or uncountable) (A set is defined to be countable if it is bijective to the set, , of positive integers; and a finite or a countably infinite set is often called at most countable. An infinite set, which is not countable, is called uncountable. For example, the set of integers, , is countable, while an interval is uncountable. These concepts lead to the fundamental difference between “continuous-time (CT) analog” and “discrete-time (DT) digital” signal processing.) subcollection of sets must be contained in ℑ.
The intersection of the members of any finite subcollection of ℑ must be contained in ℑ.
Then, the pair is called a topological space, and the members of ℑ are called open sets of ; if B is an open set in , the complement of B (i.e., ) is called a closed set in . If there is no confusion regarding ℑ, then ℑ is often omitted from and only X is referred to as a topological space.
(Topological Basis).A basis for a topology is a collection of open sets in ℑ, called basis elements, if the following two conditions hold:
For each , there exists at least one basis element such that .
If , where , then there exists a basis element such that and .
Appendix A.2. Random Variables and Stochastic Processes
This subsection introduces rudimentary concepts of random variables and stochastic processes. Details are available in textbooks such as [25,27,30].
(Algebra and σ-algebra). Let be a (non-empty) collection of subsets of a (non-empty) set Ω having some or all of the following properties.
If then , where .
If then .
If then .
Then, is called an algebra (or a field ) if the properties (a), (b) and (c) are true. If, in addition, the property (d) is true, then is called a σ- algebra (or a σ- field ).
The largest σ-algebra of a nonempty set Ω is the collection of all subsets of Ω, which is the power set. On the other hand, the smallest σ-algebra consists of the two sets ∅ and Ω, i.e., the indiscrete σ-algebra.
(Borel Sets). Given a non-empty collection of subsets of Ω, the smallest σ-algebra containing it is called the σ-algebra generated by the collection. The Borel σ-algebra is the σ-algebra generated by the collection of all open intervals in the usual topology of the real line. Members of the Borel σ-algebra are called Borel sets.
(Measure). A countably additive measure μ on a σ-algebra is a non-negative, extended real-valued function on the σ-algebra such that the measure of an at most countable (i.e., finite or countably infinite) union of disjoint measurable sets equals the sum of their measures. A measurable space is a pair consisting of a non-empty set Ω and a σ-algebra of subsets of Ω, and a measure space is a triple that also includes a measure μ on that σ-algebra. The sets in the σ-algebra are called measurable sets.
Let , where and the Borel set is the associated σ-algebra. Then, is called the n-dimensional Lebesgue measure, and is called the n-dimensional Lebesgue measure space. For , i.e., in the 1-dimensional real space , given an interval , the measure is the length of the interval S. Similarly, for two-dimensional (i.e., ) and three-dimensional (i.e., ) Lebesgue measures, denotes the area and volume measures, respectively.
(Probability Spaces).If , then μ is called a probability measure, usually denoted by P, and the triplet is called a probability space.
(Measurable Functions). Let two measurable spaces be given. A function between them is called measurable if the inverse image of every measurable set is measurable. If both spaces are the real line equipped with its Borel σ-algebra, then f is said to be Borel measurable.
(Random Variables and Random Sequences).A random variable X on a probability space is a Borel measurable function from Ω to . Similarly, a sequence of random variables is called a discrete random process.
A function is continuous in the usual topology if the inverse image of every open set is open. Therefore, any continuous function is Borel measurable. Furthermore, a function which is continuous almost everywhere (i.e., except on a set of measure zero) is also Borel measurable. As another example, the unit step function, which is discontinuous in the usual topology, is also a Borel-measurable function.
Now, we introduce the concept of the expected value of a random variable in a probability measure space. Given a random variable X, the expected value of X is denoted by E[X]. Along this line, two random variables X and Y are said to be equal in the mean square (ms) sense if E[(X − Y)²] = 0. Similarly, X and Y are said to be equal in the almost sure (as) sense if they differ only on an event of probability zero, i.e., P(X = Y) = 1.
Given a random process , the autocorrelation is defined as:
and the autocovariance is defined as
where the superscript, called Hermitian, indicates the complex conjugate of a complex scalar, or the conjugate transpose of a complex vector/matrix.
A random process x(t) is called stationary (in the strict sense) if its statistics are not affected by a time translation, i.e., x(t) and its time-translated version have the same statistics for any real translation. A random process x(t) is said to be wide-sense stationary [7,25] if
The expected value is a constant for all t;
The autocorrelation depends only on the time difference between the two instants, not explicitly on both of them.
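As an illustration of these two conditions (a standard example, not from the paper): a cosine with a uniformly distributed random phase is wide-sense stationary. The ensemble average below is approximated over a uniform grid of phases:

```python
import math

omega = 0.7  # fixed angular frequency of the random-phase cosine

def acorr(t, lag, K=1000):
    """R(t, t + lag) = E[x(t) x(t + lag)] for x(t) = cos(omega t + theta),
    theta ~ Uniform[0, 2 pi), averaged over a uniform grid of K phases."""
    return sum(math.cos(omega * t + th) * math.cos(omega * (t + lag) + th)
               for th in (2 * math.pi * k / K for k in range(K))) / K

r1 = acorr(0.0, 2.0)
r2 = acorr(5.0, 2.0)  # same lag at a different time origin
# Both equal (1/2) cos(omega * lag): the autocorrelation depends on the lag only.
```

The phase-grid average is exact here because the oscillatory cross term sums to zero over equally spaced phases.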
Bachman, G.; Narici, L. Functional Analysis; Academic Press: New York, NY, USA, 1966.
Naylor, A.; Sell, G. Linear Operator Theory in Engineering and Science, 2nd ed.; Springer-Verlag: New York, NY, USA, 1982.
Rudin, W. Real and Complex Analysis; McGraw-Hill: Boston, MA, USA, 1987.
Royden, H. Real Analysis, 3rd ed.; Macmillan: New York, NY, USA, 1989.
Kreyszig, E. Introductory Functional Analysis with Applications; John Wiley & Sons: Hoboken, NJ, USA, 1978.
Bobrowski, A. Functional Analysis for Probability and Stochastic Processes; Cambridge University Press: Cambridge, UK, 2005.
Hayes, M. Statistical Digital Signal Processing and Modeling, 1st ed.; Wiley: Hoboken, NJ, USA, 1996.
Haykin, S. Adaptive Filter Theory, 4th ed.; Prentice Hall: Englewood Cliffs, NJ, USA, 2002.
Farhang-Boroujeny, B. Adaptive Filters Theory and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2003.
Bressan, A. Lecture Notes on Functional Analysis with Applications to Linear Partial Differential Equations; American Mathematical Society: Providence, RI, USA, 2013.
Reed, M.; Simon, B. Methods of Modern Mathematical Physics Part 1: Functional Analysis; Academic Press: Cambridge, MA, USA, 1980.
Luenberger, D. Optimization by Vector Space Methods; John Wiley & Sons: Hoboken, NJ, USA, 1969.
Desoer, C.; Vidyasagar, M. Feedback Systems: Input-Output Properties; Academic Press: Cambridge, MA, USA, 1975.
Therrien, C. Discrete Random Signals and Statistical Signal Processing; Prentice Hall: Englewood Cliffs, NJ, USA, 1992.
Proakis, J.; Manolakis, D. Digital Signal Processing: Principles, Algorithms, and Applications, 3rd ed.; Macmillan Publishing Company: New York, NY, USA, 1998.
Oppenheim, A.; Schafer, R. Discrete-Time Signal Processing; Prentice Hall: Englewood Cliffs, NJ, USA, 1989.
Vapnik, V.; Izmailov, R. Rethinking statistical learning theory: Learning using statistical invariants. Mach. Learn. 2019, 108, 381–423.
Ghalyan, N.F.; Ray, A. Symbolic Time Series Analysis for Anomaly Detection in Measure-invariant Ergodic Systems. J. Dyn. Syst. Meas. Control 2020, 142, 061003.
Ghalyan, N.F.; Ray, A. Measure invariance of symbolic systems for low-delay detection of anomalous events. Mech. Syst. Signal Process. 2021, 159, 107746.
Lorch, E. Spectral Analysis; Oxford University Press: New York, NY, USA, 1962.
Kaiser, G. A Friendly Guide to Wavelets; Birkhauser: Boston, MA, USA, 1994.
Mallat, S. A Wavelet Tour of Signal Processing: The Sparse Way, 3rd ed.; Academic Press: Amsterdam, The Netherlands, 2009.
Ray, A. On State-space Modeling and Signal Localization in Dynamical Systems. ASME Lett. Dyn. Syst. Control 2022, 2, 011006.
Vetterli, M.; Kovacevic, J. Wavelets and Subband Coding; Prentice-Hall: Hoboken, NJ, USA, 1995.
Stark, H.; Woods, J. Probability and Random Processes with Applications to Signal Processing; Prentice-Hall: Upper Saddle River, NJ, USA, 2002.
Helstrom, C. Elements of Signal Detection and Estimation; Prentice Hall: Englewood Cliffs, NJ, USA, 1995.
Ash, R. Real Analysis and Probability; Academic Press: Boston, MA, USA, 1972.
Munkres, J. Topology, 2nd ed.; Prentice-Hall: Upper Saddle River, NJ, USA, 2000.
Shilov, G. Elementary Real and Complex Analysis; Dover Publications: Mineola, NY, USA, 1996.
Papoulis, A. Probability, Random Variables, and Stochastic Processes, 2nd ed.; McGraw-Hill: Boston, MA, USA, 1984.
Relationship among different spaces in functional analysis.
An adaptive filter consisting of a shift-variant filter h with an adaptive algorithm for updating the filter coefficients.
Innovations representation of a random process. (a) Signal model. (b) Inverse filter.
Implementation of level-j and level- MRA filter banks.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.