Article

Preimage Problem Inspired by the F-Transform

by Jiří Janeček 1,*,† and Irina Perfilieva 2,*,†
1 Department of Mathematics, Faculty of Science, University of Ostrava, 30. dubna 22, 701 03 Ostrava, Czech Republic
2 Institute for Research and Applications of Fuzzy Modeling, University of Ostrava, 30. dubna 22, 701 03 Ostrava, Czech Republic
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2022, 10(17), 3209; https://doi.org/10.3390/math10173209
Submission received: 10 June 2022 / Revised: 23 August 2022 / Accepted: 1 September 2022 / Published: 5 September 2022
(This article belongs to the Special Issue Fuzzy Natural Logic in IFSA-EUSFLAT 2021)

Abstract:
In this article, we focus on discrete data processing. We propose to use the concept of closeness, which is less restrictive than a metric, to describe a certain relationship between objects. We establish a fuzzy partition of a given set of objects in a way that admits a closeness space to emerge. The fuzzy (F-) transform is a tool that maps objects with common characteristics to the same discrete image—the direct F-transform. We are interested in the inverse (preimage) problem: how can we describe the class of all functions mapped onto the same direct F-transform? This manuscript formulates and solves this preimage problem. Its solution is presented from three different points of view, showing which functions belong to the same class determined by a given image (by the direct F-transform). We formulate conditions under which a solution to the preimage problem is given by the inverse F-transform over the same fuzzy partition, or by transforming a given image using a new system of basic functions. The developed theory contributes to a better understanding of the ill-posed problems that are typical of machine learning. The appendix contains illustrative numerical examples.

1. Introduction

In this paper, we focus on spaces with closeness that are characterized using weighted graphs, as in [1], or local neighborhoods of elements, as in [2]. Formally, closeness can be described by a graph adjacency matrix that contains weights—a higher weight is assigned to the edge that connects closer objects. Such matrices were always square, as they contained the mutual closeness values of all objects in the dataset.
In our previous research [1,3], the notion of closeness was used in a new approach to dimensionality reduction, for which the fuzzy transform (F-transform for short, [4]) proved to be useful. More broadly, the concept of closeness appears in multiple contexts and under different names, where it serves auxiliary purposes. For example, in [5], the phase space of a dynamical system is described by a network (graph) in which each point represents one state of the system and the closeness-describing weights (between two points) equal the frequency with which a transition occurred between the two states. There is a related concept of closeness centrality (an aggregation of closeness-describing weights with respect to all neighbors except oneself, computed by the arithmetic or harmonic mean) that describes the data density around one particular point in a network, used, e.g., in [6]. In the area of image processing (image compression, image segmentation, image retrieval, e.g., [7], etc.), the related concept of a proximity space is widely used. It originated in [8] and then evolved—currently, it is used, e.g., to describe the similarity between pixels or patches.
In this research, we consider closeness determined by a fuzzy partition of a universe of discourse. This particular space structure is used in the theory of F-transforms [4,9]. The notion of F-transform was introduced in [4], where we explained modelling with fuzzy IF-THEN rules as a specific transformation. From this point of view, F-transform bridges fuzzy modelling and the theory of linear (in particular, integral) transforms. The generalization to a higher degree version was proposed in [9]. Generally speaking, F-transform performs a transformation of the original universe of functions into a universe of their skeleton models (vectors or matrices of F-transform components), for which further computations are easier. In [4], the approximation property of F-transform was described, and in [10], the effect of the shapes of basic functions on the approximation quality was demonstrated. F-transform has many other useful properties and great potential in various applications, such as special numerical methods as well as solutions to ordinary and partial differential equations with fuzzy initial conditions, mining dependencies from numerical data, signal processing, compression and decompression of images [11] and image fusion.
Among the recent applications of F-transform, we refer to an image classification problem, in which data is scarce [12], a numerical solution to fuzzy integral equations [13] and improving the JPEG compression algorithm in the cases where a high ratio compression is required [11].
Similarly to F-transform, which, in its direct phase, aggregates a large number of functional values of all points into a small number of components associated with selected nodes, we consider that within our (possibly large) dataset, there are several selected points (nodes) with known closeness values with respect to all the other points and that closeness among other points is undefined, which gives rise to a rectangular closeness matrix that describes the closeness space. Nodes can be thought of as the most prominent data points because the dataset can be sparsely represented as a collection of their neighborhoods.
Following the methodology of F-transform, we aim to demonstrate that partial knowledge about the mutual positions of all points in the dataset is sufficient for the successful retrieval of a discrete function (up to non-significant differences that distinguish particular functions from each other within the same equivalence class) from its F-transform components. Thus, we present a tool, of more than theoretical value, for demarcating the set of functions that can be mutually replaced without losing an important feature in the space.
This allows us to express similarity between functions defined on spaces with closeness. In specific cases, a representative function is given by the inverse F-transform. Moreover, we contribute to the general theory of F-transforms in the aspects described in this section. The task of reconstructing an element of an input space (or finding an approximate solution if it does not exist) based on (a lower-dimensional representation of) an element of a feature space has been studied in various machine learning domains, e.g., in kernel-based methods (e.g., [14], where nonlinear dimensionality reduction is performed using the kernel trick on an input image and the task is to recover the image from its denoised version in the feature space, or [15], where algebraic tools based on the simple relationship between distances in the input space and in the feature space induced by a kernel are used to find the preimage of a feature vector) and graph-based methods (where the task is to recover structured data from a single point of a feature space, e.g., to find a representative graph based on the features of a found data cluster). In these applications, the concept of the preimage problem has already been used. This paper is focused on finding all functions that share a set of features (given by the F-transform components) utilizing the above-described concept. We do not consider that these features might be noisy. On the other hand, in [16], we used a similar initial setting of the input space and assumed that the input signal (a function placed in this space) is noisy in a certain sense. The signal was processed in the same manner (based on closeness) and we proposed how to denoise the signal by finding the appropriate closeness parameters. The inverse F-transform is known to reduce the noise of the input signal (provided a suitable fuzzy partition), as was shown in the continuous case, e.g., in [17].
As explained in [18], preimage problems are useful not only in (pattern) denoising and Kernel Dependency Estimation but also in signal compression, where the kernel technique serves as an encoder and the preimage technique as a decoder. Moreover, in [18], a technique to learn the preimage without the need to solve a difficult nonlinear optimization problem is presented. Additionally, in [15], the authors consider that a noisy pattern is mapped to a feature space using the Kernel Principal Component Analysis and then the approximate preimage is found using their technique. To summarize the main stream of preimage problems in machine learning, we can say that a nonlinear optimization, nonlinear iteration, or learning method is used to find a function (the space of all possible functions comprises an input space), s.t. its image in a specified direct mapping has the minimal squared distance from the given point in a feature space. The existence of a precise solution is considered to be a coincidence. The purposes of computing the preimage in machine learning are various (to denoise a pattern, reconstruct a signal, compress an image, find a representative example of a data cluster, or learn a general mapping between the input space and the feature space) but all of them are highly application-oriented, and, hence, the initial settings are adjusted accordingly. This limits their transferability to another task.
In contrast, our approach pays close attention to the structure of the input space and this structure is quite flexible. We are focused on finding the precise solution (we ensure that it always exists) to the special case of the preimage problem. Using the assumption of the finiteness of the universe and endowing the input space with a fuzzy partition and the corresponding closeness, we solve it by the means of linear algebra. The consequent mapping forms a special case of direct mappings mentioned above. Since our theoretical work is not aimed at solving any specific task, we compute the whole class (usually infinite) from which we do not need to choose a particular solution (all of them are equivalent).
In Section 2, we give details about the notion of closeness considered in this article. After that, the fuzzy partition and transform are recalled and modified for the discrete case. We show that a certain fuzzy partitioned space (a space with a fuzzy partition) is a special case of a space with closeness. Section 3 discusses the formulation of the main topic of the paper—called the preimage problem—that takes place in the space introduced in Section 2. The preimage problem is among the basic problems of algebra (e.g., in [19]), where it is considered between two structured spaces (we propose to use the space with closeness and the space of scaled F-transform components). The solution to this problem is described in three different ways. Theorem 1 is based on a commonly used solution to a set of algebraic equations. In Section 4, we examine the conditions under which the inverse F-transform forms the solution to this problem. In Section 5, we propose a technique for checking whether a solution to the preimage problem can be obtained by a certain transformation of the given element of the space of scaled F-transform components, given by a new set of fuzzy partition units. Appendix A provides numerical examples related to the four preceding sections, except for the Conclusions (Section 6).

Relationship of Closeness, Metric and Similarity

A space with closeness (X, w) is more general than a metric space (X, ρ): they share the axioms of symmetry and non-negativity, but a metric is more restrictive—it requires two additional axioms (ρ(x, y) = 0 iff x = y, and the triangle inequality ρ(x, y) ≤ ρ(x, z) + ρ(y, z) for all x, y, z ∈ X). As closeness is a relaxed version of a reciprocal distance metric, it is more suitable for the description of data with graph structure (where the triangle inequality is generally non-enforceable). Another example of a context where closeness is more suitable is data that is assumed to lie on a topological manifold, as there is no straightforward way to establish a metric there. Similarly to a metric, closeness encodes mutual relations between data points.
There is a close connection between closeness and similarity measures—each of these concepts applies in different contexts. An algebraic, fuzzy, or probabilistic background is usually assumed as a similarity requirement. There is no standard axiomatization of its definition. Similarly to closeness, a similarity value is higher for more similar objects (that are, e.g., more correlated, have a higher intersection, smaller cross-entropy, or belong to the same cluster), its values can be negative, it can be non-symmetric, etc. Similarity spaces emerging in various applied fields are even more general than spaces with closeness. A solution to a particular type of problem is often associated with a certain type of similarity measure.
This explains why we introduce the notion of closeness and why we prefer it over metric (too-restrictive axioms) and similarity (too-loose axiomatization). This trade-off allows us to express various established concepts, e.g., from graph theory (edge weights used, e.g., in the dimensionality reduction technique of Laplacian eigenmaps [2]), fuzzy logic (values of the biresiduum of truth values of two formulae), and clustering problems (class membership degrees in, e.g., the KNN clustering algorithm), in terms of closeness and to build a theoretical apparatus upon it. Therefore, as the closeness values can be extracted from data in various ways, the closeness space creates a platform on which both data-driven and model-driven approaches can be utilized.

2. Preliminaries

In this section, we describe the mathematical background by giving some basic definitions and properties. Throughout the manuscript, all settings are discrete.

2.1. Closeness Space

We are interested in the notion of closeness on finite sets that, as described below, agrees with [1,2]. To emphasize its general applicability, we express it using the language of classical set theory. We show that it can also be expressed in the languages of fuzzy set theory and graph theory.
Definition 1 (Closeness).
Let X = {x_i | i = 1, …, L} be a set, L ∈ N. Then closeness on X is any symmetric, non-negative function w: X² → R, where symmetry means that for all x, y ∈ X: w(x, y) = w(y, x), and X² = X × X = {(x, y) | x, y ∈ X} is the Cartesian product of X with itself. The pair (X, w) is called a space with closeness or simply a closeness space.
We provide the following semantic interpretation of closeness: objects that are closer in a certain sense (as explained in the Introduction, in some cases, metric cannot be defined) have a greater closeness value (with respect to natural order on R ) than the less-close ones. Non-close objects have a closeness value equal to zero, and, vice versa, if the closeness value of a pair of objects is greater, we consider these objects closer in this sense than a pair of objects with a smaller closeness value. The closeness value hence quantifies a certain quality of mutual alikeness between objects.
Example 1.
Let X = {0, 10, 20, 30, 40, 50} and, for all x_i, x_j ∈ X, w(x_i, x_j) = 1 / (1 + |x_i − x_j|). Then w is closeness on X and (X, w) is a space with closeness.
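To make this concrete, here is a minimal sketch (in Python with NumPy; our illustration, not part of the article) that builds the closeness matrix of Example 1 and checks the closeness axioms:

```python
import numpy as np

# Example 1: X = {0, 10, 20, 30, 40, 50} with w(x_i, x_j) = 1 / (1 + |x_i - x_j|).
X = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])

def closeness(xi, xj):
    """Closeness from Example 1; reflexive, since w(x, x) = 1."""
    return 1.0 / (1.0 + abs(xi - xj))

# Arrange all closeness values into the L x L closeness matrix W.
W = np.array([[closeness(xi, xj) for xj in X] for xi in X])

# The closeness axioms: symmetry and non-negativity.
assert np.allclose(W, W.T)
assert np.all(W >= 0.0)
```

Because every object in the dataset is compared with every other, the resulting matrix is square, as noted in the Introduction.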
Example 2.
Let ρ be a metric and (X′, ρ) a metric space. Let, moreover, X ⊆ X′ and, for all x_i, x_j ∈ X, w_1(x_i, x_j) = 1 / (1 + ρ(x_i, x_j)) and w_2(x_i, x_j) = e^(−ρ(x_i, x_j)). Then (X, w_1) and (X, w_2) are closeness spaces.
Definition 2 ((Weakly) Reflexive Closeness).
Let X = {x_i | i = 1, …, L} be a set, L ∈ N, and let w: X² → R be closeness on X. Then w is weakly reflexive closeness on X if there exists a positive real number w̄, s.t. for all x, y ∈ X: w(x, y) ≤ w(x, x) = w̄. If, moreover, w̄ = 1, w is called reflexive closeness on X. The pair (X, w) is then a space with (weakly) reflexive closeness or simply a (weakly) reflexive closeness space.
Remark 1.
In this paper, we consider only a finite closeness domain X; hence, there always exists the constant m = max{w(x_i, x_j) | (x_i, x_j) ∈ X²}. In the case 1 < m < +∞, it would be more appropriate to define strong reflexivity by w̄ = m, but later we work with closeness w: X² → [0, 1].
Example 2 contains two reflexive closeness-defining functions. Examples of weakly reflexive closenesses derived from an arbitrary metric ρ include 1 / (2 + ρ(x_i, x_j)) and (1/2) e^(−ρ(x_i, x_j)), both with w̄ = 1/2.
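As a quick numerical check of the first of these (our illustration, with the absolute difference playing the role of the metric ρ), the sketch below verifies the weak reflexivity bound w(x, y) ≤ w(x, x) = w̄ = 1/2:

```python
import numpy as np

X = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])

def w_weak(x, y):
    """Weakly reflexive closeness 1 / (2 + rho(x, y)) with rho(x, y) = |x - y|."""
    return 1.0 / (2.0 + abs(x - y))

W = np.array([[w_weak(x, y) for y in X] for x in X])
w_bar = 0.5

assert np.allclose(np.diag(W), w_bar)  # w(x, x) = w_bar = 1/2 for every x
assert np.all(W <= w_bar + 1e-12)      # weak reflexivity: w(x, y) <= w(x, x)
```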
Let (X, w) be a space with closeness, where X = {x_i | i = 1, …, L} is a set, L ∈ N. Let us denote, for all x_i, x_j ∈ X,

w_ij = w(x_i, x_j),    (1)

and consider a matrix that contains the closeness values,

W = (w_ij), i, j = 1, …, L, W ∈ R^(L×L),    (2)

then the matrix W is called the matrix of closeness on X or simply the closeness matrix. The functional values of the closeness w for the elements of X² can thus be arranged into the matrix W—the matrix of closeness on X.

2.2. Fuzzy Set, Relation and Partition

Definition 3 (Fuzzy Set).
Let X be a non-empty universe and K: X → [0, 1] be a function. The fuzzy set K is the set of pairs {(x, K(x)) | x ∈ X}. The function K is a membership function, and its functional value K(x) is called the membership degree of the element x to K. A fuzzy set is conventionally identified with its membership function. We call it a fuzzy subset of X, which is denoted by K ⊆ X. We say that the point x ∈ X is covered by the fuzzy set K if K(x) > 0.
Remark 2.
Note that if K: X → {0, 1}, then K is a characteristic function of an ordinary set. The membership function K: X → [0, 1] is a generalization of the characteristic function of a set K ⊆ X given by a function χ_K: X → {0, 1}. If the universe of a (fuzzy) set is clear, we can talk about the (fuzzy) set K without referring to it.
Definition 4 (Fuzzy Relation).
Let V and X be non-empty sets; then a binary fuzzy relation r is any fuzzy subset of V × X, i.e., r ⊆ V × X.
Hence, the binary fuzzy relation r is identified with a membership function r: V × X → [0, 1].
Definition 5 (Convex Fuzzy Set).
Let X ⊆ R and K ⊆ X. Then K is a convex fuzzy set if, for all x_c, x_d, x_e ∈ X with x_c < x_d < x_e: K(x_d) ≥ min{K(x_c), K(x_e)}.
Following [4], we recall the notion of a fuzzy partition and modify it for the discrete case. Subsequently, we define the fuzzy transform in Section 2.6.
Definition 6 (Fuzzy Partition of a Real Interval with Nodes).
Let [y_0, y_(l+1)] be a real interval and y_1, …, y_l ∈ R, s.t. y_0 < y_1 < y_2 < … < y_l < y_(l+1) and 0 < l < +∞. Then a fuzzy partition of [y_0, y_(l+1)] is given by a set of fuzzy sets {A_1, …, A_l} if:
1. for all i = 1, …, l: A_i: [y_0, y_(l+1)] → [0, 1],
2. for all i = 1, …, l: A_i is continuous with respect to the topologies on [y_0, y_(l+1)] and [0, 1] ⊆ R induced by the natural topology on R,
3. for all i = 1, …, l: A_i(x) > 0 iff x ∈ (y_(i−1), y_(i+1)),
4. for all i = 1, …, l: A_i strictly increases on (y_(i−1), y_i],
5. for all i = 1, …, l: A_i strictly decreases on [y_i, y_(i+1)).
Elements of the set {A_1, …, A_l} are called basic functions (fuzzy partition units). Based on condition 3, A_1, …, A_l are associated with the points y_1, …, y_l, respectively. These points are called nodes of the fuzzy partition.
In Definition 6, conditions 3–5 ensure that each basic function is a convex fuzzy set. Moreover, condition 3 ensures that each basic function covers at least one point. This might not hold in the case of a discrete universe, which motivates the following definition.
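A standard way to satisfy conditions 1–5 of Definition 6 is a triangular partition. The sketch below (our illustration; the interval and nodes are assumed, not taken from the paper) implements triangular basic functions that peak at their nodes and vanish outside (y_(i−1), y_(i+1)):

```python
import numpy as np

y0, y_end = 0.0, 50.0                 # the interval endpoints y_0 and y_{l+1}
nodes = np.array([10.0, 25.0, 40.0])  # nodes y_1 < y_2 < y_3

def basic_function(i, x):
    """Triangular A_i: positive exactly on (y_{i-1}, y_{i+1}), with A_i(y_i) = 1."""
    left = nodes[i - 1] if i > 0 else y0
    right = nodes[i + 1] if i < len(nodes) - 1 else y_end
    yi = nodes[i]
    if left < x <= yi:
        return (x - left) / (yi - left)    # strictly increasing on (y_{i-1}, y_i]
    if yi < x < right:
        return (right - x) / (right - yi)  # strictly decreasing on [y_i, y_{i+1})
    return 0.0

# Each basic function peaks at its own node and vanishes at the neighboring ones.
assert basic_function(1, 25.0) == 1.0
assert basic_function(1, 10.0) == 0.0 and basic_function(1, 40.0) == 0.0
```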
Definition 7 (Sufficient Density with respect to Fuzzy Partition).
Let X be a set, s.t. X ⊆ [y_0, y_(l+1)] ⊆ R, and 0 < l < +∞. Let P = {A_1, …, A_l} be a fuzzy partition of the interval [y_0, y_(l+1)]. Then X is sufficiently dense with respect to P if, for all i = 1, …, l, there exists x ∈ X such that A_i(x) > 0.
Definition 7 states that a set is sufficiently dense with respect to a given fuzzy partition if every basic function covers at least one point of that set. This notion allows us to define the fuzzy partition of a discrete set.
Definition 8 (Fuzzy Partition of a Discrete Subset of a Real Interval with Nodes).
Let X = {x_1, …, x_L} be a set, s.t. X ⊆ [y_0, y_(l+1)] ⊆ R, and 0 < l ≤ L < +∞. Let P′ = {A′_1, …, A′_l} be a fuzzy partition of the interval [y_0, y_(l+1)] with nodes in Y = {y_1, …, y_l} ⊆ X, and let X be sufficiently dense with respect to P′. Then P = {A_1, …, A_l} is the fuzzy partition of X with nodes in Y if, for all i = 1, …, l: A_i = A′_i|_X, i.e., if each basic function is replaced by its restriction to X.
Remark 3.
Definition 8 can be extended in the converse direction: let X = {x_1, …, x_L} be a set, s.t. X ⊆ [y_0, y_(l+1)] ⊆ R, Y = {y_1, …, y_l} ⊆ X and 0 < l ≤ L < +∞. Let P = {A_1, …, A_l}, s.t. for all i = 1, …, l: A_i ⊆ X and A_i(y_i) > 0. Then P will also be called the fuzzy partition of X with nodes in Y if there exists a fuzzy partition P′ = {A′_1, …, A′_l} with nodes in Y, s.t. X is sufficiently dense with respect to P′ and, for each i = 1, …, l, A′_i is a continuous extension of A_i to [y_0, y_(l+1)], i.e., A′_i ⊆ [y_0, y_(l+1)], A′_i is continuous and A_i = A′_i|_X.
Below, we show how the space with closeness can be represented by a fuzzy partitioned space and by a weighted graph-structured space that, satisfying certain conditions, are examples of closeness spaces.

2.3. Closeness as Fuzzy Relation and Closeness Given by a Fuzzy Partition

Using the language of fuzzy set theory, we can say that any symmetric fuzzy relation w ⊆ X², where symmetry means that for all x, y ∈ X: w(x, y) = w(y, x), is closeness on X.
Note that in the context of fuzzy relations, it holds for weakly reflexive closeness that w̄ ≤ 1. For w̄ = 1, we obtain a special case called reflexive closeness on X.
Lemma 1 (Closeness Given by a Fuzzy Partition – 1).
Let {A_1, …, A_l} be a fuzzy partition associated with the nodes y_1, …, y_l of the set {x_1, …, x_L} ⊆ [y_0, y_(l+1)], s.t. y_0, …, y_(l+1), x_1, …, x_L ∈ R, 0 < l = L < +∞ and y_0 < y_1 = x_1 < y_2 = x_2 < … < y_l = x_L < y_(l+1). For all i, j = 1, …, l, let it hold that

A_i(y_j) = A_j(y_i),    (3)

then the function w: {x_1, …, x_L}² → R given by, for all i, j = 1, …, L: w(x_i, x_j) = A_i(x_j), is closeness on {x_1, …, x_L}.
Proof. 
The function w is obviously real, bivariate and non-negative. Using property (3), we have, for all i, j = 1, …, L: w(x_i, x_j) = A_i(x_j) = A_i(y_j) = A_j(y_i) = A_j(x_i) = w(x_j, x_i); hence, w is also symmetric. □
Lemma 1 shows that a fuzzy partition of a subset of a real interval, s.t. its every point is a node and satisfying condition (3), determines closeness on this set.

2.4. Closeness as Weighted Adjacency of Graph Vertices

In this subsection, we show that closeness can be expressed using the language of graph theory. We do not recall the basic concepts of this theory (details can be found, e.g., in [20]). Generally, closeness between any two objects x i and x j in X is described by the values (weights) w ( x i , x j ) of a real, non-negative, symmetric bivariate function w.
Let G(X, E, W) be a weighted graph where the set of vertices corresponds to the objects in X, the set of edges E ⊆ X², and the weights of the edges are given by the adjacency matrix W in accordance with (2). The objects are non-close if and only if the corresponding vertices are not connected by a direct edge, i.e., there is a fictional edge with zero weight between them. We denote this disconnectedness or non-closeness by ⊥, which is a classical (meaning non-fuzzy) symmetric binary relation on X, i.e., ⊥ ⊆ X²:

x_i ⊥ x_j ⇔ w(x_i, x_j) = 0.    (4)

In other words, the value of closeness determines the existence of an edge.
By defining the values of closeness {w(x_i, x_j) | x_i, x_j ∈ X}, i.e., by defining the entries of the closeness matrix, we simultaneously define the entries of the adjacency matrix W, which uniquely determines a weighted graph G(X, E, W) by (1), (2) and (4). That is why the notion of closeness can also be established from the perspective of weighted graphs: let X = {x_i | i = 1, …, L} be a set of vertices, L ∈ N; then, following the convention given by (4) with E ⊆ X², any evaluation of the set of all edges X² described by a symmetric, non-negative function w: X² → R is closeness on X.
Below, we join the theory of closeness with a certain type of fuzzy partition to describe the space this paper deals with.

2.5. Initial Assumptions

Let us have a finite set of points in R:

X = {x_i | i = 1, …, L},    (5)

and its non-empty subset Y ⊆ X, 0 < |Y| = l ≤ L = |X| < +∞.
Let {A_t | t ∈ Y} be a fuzzy partition of X with nodes in Y (in the sense of Remark 3), s.t. with each node t ∈ Y we associate a basic function A_t: X → [0, 1]. Let, moreover, (3) be satisfied, i.e., for all t, s ∈ Y: A_t(s) = A_s(t), and let all points in X be covered by at least one basic function, i.e.,

⋃_(t ∈ Y) {x ∈ X | A_t(x) > 0} = X.    (6)
Let us define node degrees by

for all t ∈ Y: d_tt = Σ_(x ∈ X) A_t(x).    (7)

Following condition 3 in Definition 6, we know that A_t(t) > 0 for all t ∈ Y, which implies that each node degree d_tt ∈ (0, ∞). Moreover, based on Definition 7, it means that the set Y is sufficiently dense with respect to the fuzzy partition {A_t | t ∈ Y}; hence, the same holds for the universe X.
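The node degrees in (7) are simply the row sums of the (here rectangular) closeness matrix whose rows are the basic functions evaluated on X. A small sketch with assumed values (our illustration, not from the paper):

```python
import numpy as np

# Rows are basic functions A_t on X (two nodes, five points; assumed values).
# Every point of X is covered by at least one basic function, as required by (6).
W = np.array([
    [1.0, 0.5, 0.0, 0.0, 0.0],   # A_t for the first node
    [0.0, 0.5, 1.0, 0.5, 0.2],   # A_t for the second node
])

d = W.sum(axis=1)                # node degrees d_tt from (7)

assert np.all(d > 0)             # each d_tt lies in (0, infinity)
assert np.all(W.sum(axis=0) > 0) # coverage condition (6): no zero column
```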
It is worth noting that instead of working with the closeness on the whole universe X, given by a function on X², we work only with its restriction to Y × X (the values of closeness on (X \ Y) × X are undefined, as explained in the Introduction). In other words, closeness w: X² → R is a partial function (a partial function f: A → B maps each element of a set A to at most one element of the set B; it is a functional binary relation whose carrier set is a subset of A × B) given by

for all x_i, x_j ∈ X: w(x_i, x_j) = A_t(x_j) if t = x_i ∈ Y, and undefined otherwise.
This description justifies the selection of nodes within the universe (the semantic description of nodes is also explained in the Introduction). Despite its partiality, we will still call the total function w: Y × X → R (a total function f: A → B maps each element of a set A to exactly one element of the set B; it is a classical function) closeness on X and denote the corresponding closeness space by (X, w).
Lemma 2 (Closeness Given by a Fuzzy Partition—2).
Let {A_1, …, A_l} be a fuzzy partition associated with the nodes y_1, …, y_l of the set {x_1, …, x_L} ⊆ [y_0, y_(l+1)], s.t. y_0, …, y_(l+1), x_1, …, x_L ∈ R, y_0 < y_1 < y_2 < … < y_l < y_(l+1), y_0 < x_1 < x_2 < … < x_L < y_(l+1), {y_1, …, y_l} ⊆ {x_1, …, x_L}, 0 < l ≤ L < +∞, and let condition (3) hold. Let us denote, for all i = 1, …, L:

κ(i) = j if x_i = y_j; κ(i) = l + 1 if x_i ≠ y for all y ∈ {y_1, …, y_l}

(note that in the case x_i = y_j, we have y_j = y_κ(i)). Then the function w: {x_1, …, x_L}² → R given by w(x_i, x_j) = A_κ(i)(x_j), where κ(i) ≤ l, is closeness on {x_1, …, x_L}.
Proof. 
The function w is obviously real, bivariate and non-negative. Using property (3), we have, for all i, j = 1, …, L with κ(i), κ(j) ≤ l: w(x_i, x_j) = A_κ(i)(x_j) = A_κ(i)(y_κ(j)) = A_κ(j)(y_κ(i)) = A_κ(j)(x_i) = w(x_j, x_i); hence, w is also symmetric. □
Lemma 2 shows that a fuzzy partition of a subset of a real interval, s.t. it contains all nodes and satisfies condition (3), determines closeness on this set. We will follow Lemma 2, and hence closeness is given by

w(t, x) = A_t(x), where t ∈ Y, x ∈ X.    (8)

By that, the closeness space (X, w) is specified. The values of closeness w: Y × X → R are inserted into the closeness matrix W ∈ R^(l×L), given by

for all t ∈ Y, x ∈ X: W(t, x) = w(t, x).    (9)
It means that by fixing a set of basic functions, we determine the values of the closeness-describing function w: Y × X → R.
It follows from (4), (6) and (8) that for every point x of the universe X there exists a node t Y , s.t. w ( t , x ) > 0 . It means that every point is connected with at least one node.
Section 2.1 and Section 2.2 covered the basic theory of closeness and the theory of fuzzy partitions, respectively. In the following Section 2.6, we recall the theory of fuzzy transform that we need to describe a fundamental property (the F-transform identity) of the space with closeness that will be used in Section 3 to describe the preimage problem.

2.6. Fuzzy Transform

In Section 2.2, we defined the fuzzy partition and its units, the basic functions. Below, we proceed to establish the fuzzy transform. The closeness space (X, w) serves as the ground on which the F-transform is established: its key element, a fuzzy partition, is selected subject to condition (3), which allows the creation of (X, w) based on (8) in Section 2.5.
Definition 9
(Direct F-Transform [4]). Let X be a set, Y ⊆ X ⊆ [y_0, y_(l+1)] ⊆ R, 0 < |Y| = l ≤ |X| = L < +∞, and let P = {A_t: X → [0, 1] | t ∈ Y} be a fuzzy partition of X with nodes in Y, s.t. y_0 < x_1 < x_2 < … < x_L < y_(l+1), satisfying (3), i.e., for all t, s ∈ Y: A_t(s) = A_s(t). Let d_tt be a node degree given by (7) and u: X → R be a function, where for i = 1, …, L: u(x_i) = u_i. Then the direct discrete F-transform of u with respect to P is the vector F[u] ∈ R^l of F-transform components defined by

for all t ∈ Y: F[u]_t = (Σ_(i=1)^L u_i A_t(x_i)) / (Σ_(i=1)^L A_t(x_i)) = (Σ_(i=1)^L u_i A_t(x_i)) / d_tt.    (10)
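A minimal sketch of formula (10) follows (our own data: a small partition given as a matrix whose rows are basic functions, and a sample function u; nothing here is taken from the paper's examples):

```python
import numpy as np

X = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
u = X ** 2 / 100.0                 # sample function u: X -> R, u_i = u(x_i)

# A[t, i] = A_t(x_i); rows are basic functions for nodes x_1, x_3, x_5.
A = np.array([
    [1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.5, 1.0],
])

d = A.sum(axis=1)                  # node degrees d_tt, formula (7)
F_u = (A @ u) / d                  # F-transform components, formula (10)

# Each component is a weighted mean of u over the corresponding basic function.
assert F_u.shape == (3,)
```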
Remark 4.
Note that in [4], the direct F-transform is defined only with respect to a fuzzy partition (without closeness), and in that case, the requirement that P satisfies (3) is omitted. This paper, however, assumes a closeness background. That is why, for every node t ∈ Y, the value

d_tt = Σ_(i=1)^L w_ti    (11)

is also called a node degree, which fully agrees with (7). For the corresponding space with closeness (X, w), where the closeness values are denoted according to (1), we equivalently define the direct F-transform of u as the vector F[u] ∈ R^l of F-transform components defined by

for all t ∈ Y: F[u]_t = (Σ_(i=1)^L u_i w_ti) / (Σ_(i=1)^L w_ti) = (Σ_(i=1)^L u_i w_ti) / d_tt,    (12)

which fully agrees with (10).
Let F be a real vector-valued operator that acts on the space of all real functions on the universe X and yields a real vector in R^l given by (10); this vector can be uniquely identified with a function on the set of nodes Y (an element of a functional space on Y), s.t. for all t ∈ Y: F[u](t) = F[u]_t. So,

F: u ↦ F[u], where u: X → R and F[u]: Y → R.

Then Definition 9 describes the F-transform as a vector of F-transform components, i.e., as an image of the operator F.
Lemma 3 (F-Transform Identity).
Let X be a set, Y ⊆ X, 0 < |Y| = l ≤ |X| = L < +∞, P = {A_t: X → [0, 1] | t ∈ Y} be a fuzzy partition of X with nodes in Y satisfying the assumptions of Lemma 2, and (X, w) be the corresponding space with closeness, where W ∈ R^(l×L) is the closeness matrix given by (8) and (9), and D ∈ R^(l×l) is a diagonal scaling matrix with diagonal elements (d_tt), t ∈ Y, given by (7) and (11). Let F[u] ∈ R^l be the vector of F-transform components of the function u: X → R with respect to P and u ∈ R^L be the vector form of the function u: X → R given by, for i = 1, …, L: u_i = u(x_i). Then the following holds:

D F[u] = W u.    (13)
Equation (13) is called the F-transform identity.
Lemma 3 characterizes the operator F in a matrix form utilizing the fact that both functional spaces contain discrete functions; hence, they can be represented as vectors (matrices of finite order):
$$F : \mathbb{R}^L \to \mathbb{R}^l, \quad u \mapsto F[u] = D^{-1} W u.$$
There is a historical reason for representing F-transform as a mapping.
Definition 10 (Inverse F-Transform).
Let X be a set, Y ⊆ X, 0 < |Y| = l ≤ |X| = L < +∞, P = {A_t : X → [0,1] | t ∈ Y} be a fuzzy partition of X with nodes Y satisfying the assumptions of Lemma 2, and (X, w) be a corresponding space with closeness, where W ∈ ℝ^{l×L} is the closeness matrix given by (8) and (9). Let W_t be its t-th row and F[u] be the direct F-transform of a function u : X → ℝ in accordance with Definition 9 and (12). Then the inverse F-transform of the function u is the vector
$$\hat{F}[u] = W^\top F[u] = \sum_{t \in Y} F[u]_t\, W_t^\top \;\in \mathbb{R}^L.$$
Definition 10 agrees with the standard case given in [4] where it is shown that the inverse F-transform approximates the original function defined on the fuzzy partitioned space (the function is sparsely represented by the direct F-transform). Note that it is called “inverse” as it transforms the vector of direct F-transform components back to the original space. It is not an inverse mapping, though.
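The direct and inverse F-transforms above can be sketched in a few lines of NumPy (an assumption of this illustration; the paper itself prescribes no software). The matrix below is the closeness matrix of Example A1 in Appendix A, i.e., triangular basic functions on X = {1, …, 8} with nodes Y = {3, 4, 6}:

```python
import numpy as np

# Closeness matrix W of Example A1 (triangular fuzzy partition, nodes Y = {3, 4, 6}).
W = np.array([
    [1/3, 2/3, 1.0, 2/3, 1/3, 0.0, 0.0, 0.0],
    [0.0, 1/3, 2/3, 1.0, 2/3, 1/3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1/3, 2/3, 1.0, 2/3, 1/3],
])
u = np.arange(1, 9, dtype=float) ** 2   # vector form of u(x) = x^2

d = W.sum(axis=1)        # node degrees d_tt, the diagonal of D, see (7) and (11)
F_u = (W @ u) / d        # direct F-transform: F[u] = D^{-1} W u
F_hat_u = W.T @ F_u      # inverse F-transform: W^T F[u], a vector in R^L
```

Running this reproduces the components F[u] = (31/3, 52/3, 112/3)^⊤ of Example A1; the inverse F-transform only approximates u, as noted above.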

3. Preimage Problem

In this section, we describe a form of preimage problem (see Definition 11 below) that we would like to solve in order to describe a similarity between functions. The inverse F-transform of a function u provides a vector F ^ [ u ] that approximates the vector form of u. Therefore, we were motivated to solve the problem of how to find the class of functions that have the same inverse F-transform but decided to solve a more specific problem described in the following paragraph.
Let us have a fixed mapping (based on the F-transform identity in Lemma 3) that maps functions on vectors: u D F [ u ] = W u . We would like to characterize the similarity between functions in the sense that two functions are similar if their images (scaled vectors of direct F-transform components) coincide. In that case, their inverse F-transforms coincide as well ( D F [ u ] = D F [ u ¯ ] F ^ [ u ] = F ^ [ u ¯ ] ). We use the singular value decomposition (formalized, e.g., in [21] and commonly abbreviated as SVD) of the closeness matrix W to specify the problem and its solution.
We assume that the universe X, its subset Y of nodes and the closeness matrix W R l × L given by (8) and (9), where 0 < | Y | = l | X | = L < + , are fixed. Recall that the node degrees d t t ’s are given by (7) and form the diagonal of the matrix D R l × l . In the following text, we will respect the aforegiven assumptions.
Moreover, for simplicity, we will use the language of linear algebra in the following two subsections. That is why the clarification of the correspondence between discrete functions (functional spaces) and vectors (vector spaces) follows.
From the functional viewpoint, let A be the space of all functions u : X R and let B be the space of such functions f : Y R that assign to each node t Y the value of the corresponding F-transform component multiplied by the corresponding node degree, i.e.,
$$f \in B \iff \exists u : X \to \mathbb{R} \;\; \forall t \in Y: \; f(t) = d_{tt}\, F[u]_t.$$
From the viewpoint of linear algebra, recalling that |X| = L, A = ℝ^L is a vector space whose elements are later denoted by u or v. Space A contains all vector representations of functions on X. As each element of A corresponds to one function on the universe X, we can further identify X, given by (5), with the set of its indices {1, …, L}. Hence, the correspondence between the function u : X → ℝ and the vector u ∈ ℝ^L is given by the following equation:
$$\forall x_i \in X: \quad u(x_i) = u_i.$$
Following (8) and (9), recall that each row of the closeness matrix W ∈ ℝ^{l×L} defines one basic function. Then, based on the F-transform identity, B = range W (the range, or column space, of the matrix W ∈ ℝ^{l×L} is the linear subspace of ℝ^l spanned by all columns of W) is a vector subspace of ℝ^l containing the scaled images of the operator F that are uniquely determined by W. In other words, the set of all vectors (elements) f ∈ B ⊆ ℝ^l is defined by
$$f \in B \iff \exists u \in A: \; f = D F[u] = W u.$$
Hence, the correspondence between the function f : Y R (satisfying t Y : f ( t ) = d t t F [ u ] t ) and the vector f range W is given by the following equation:
$$\forall t \in Y: \quad f(t) = f_t.$$
Note that, for convenience, the vectors in ℝ^l are indexed by the indices of the nodes in Y, so their indices need not form an arithmetic sequence. The same convention will also be respected for the rows of the matrix W.
In the following text, we will also respect the notation of the vector spaces A and B.

3.1. Problem Formulation

In this subsection, we describe the preimage problem in terms of a mapping between two linear vector spaces.
Definition 11 (Preimage Problem).
Let (X, w) be a closeness space with a closeness matrix W ∈ ℝ^{l×L} given by ∀t ∈ Y, ∀x ∈ X : w(t, x) = W(t, x), where Y is a non-empty subset of X and 0 < l ≤ L < +∞. Let A = ℝ^L, B = range W ⊆ ℝ^l, and let an element b ∈ B be given. Let the direct mapping W : A → B be given by the closeness matrix W, i.e., ∀v ∈ A : W(v) = W v. Then the preimage problem is to find the set {v ∈ A | W v = b}. This set is the preimage of b.
Although the solution to the preimage problem is a set, we will call any of its elements a solution to the preimage problem. By choosing the element b B , we want to utilize the closeness on X to induce an equivalence relation on A to describe the similarity between vectors.
The closeness matrix W generally exists for any closeness (based on (1) and (2)) on a finite set. It can be also defined for the special case that originates in a certain fuzzy partition (based on (8) and (9)). Hence, the preimage problem described in Definition 11 is a general formulation of the problem. By solving the general one (described by the closeness matrix), we simultaneously solve the problem specified by F-transform.
Assume that we found an SVD (see [21] to recall this technique) of the closeness matrix W in the form of the product of three matrices P S Z^⊤, where P ∈ ℝ^{l×l} and Z ∈ ℝ^{L×L} are orthogonal (these matrices are not determined uniquely) and S ∈ ℝ^{l×L} is diagonal (∀i, j : i ≠ j ⟹ s_{ij} = 0) with the so-called singular values σ_i = s_{ii}, i = 1, …, l, on its diagonal, s.t. σ_1 ≥ σ_2 ≥ … ≥ σ_k > σ_{k+1} = … = σ_l = 0, where k = dim(range W) = rank W is the number of linearly independent rows of W (0 < k ≤ l). Any matrix can be represented in an SVD form, and its singular values are invariant under different choices of P and Z.
Lemma 4.
Let W ∈ ℝ^{l×L} be a closeness matrix and P S Z^⊤ be its SVD as above; then
$$\operatorname{range} W = \operatorname{range}(P S Z^\top) = \left\{ \sum_{i=1}^{k} c_i p_i \;\middle|\; c_i \in \mathbb{R} \right\},$$
where p i is the i-th column of the matrix P.
Lemma 4 states that range W is spanned by the left singular vectors ( p i , i = 1 , , k ) corresponding to positive singular values of W.
Proof of Lemma 4.
Let y ∈ range W, i.e.,
$$y = \sum_{j=1}^{L} d_j\, W^j,$$
where, for j = 1, …, L, d_j ∈ ℝ and W^j is the j-th column of W. For any r = 1, …, l, we have
$$y_r = \sum_{j=1}^{L} d_j\, w_{rj} = \sum_{j=1}^{L} d_j\, (P S Z^\top)_{rj} = \sum_{j=1}^{L} d_j \sum_{i=1}^{k} p_{ri}\, \sigma_i\, z_{ji} = \sum_{i=1}^{k} p_{ri} \sum_{j=1}^{L} d_j\, \sigma_i\, z_{ji}.$$
Let
$$\mathcal{S} = \left\{ \sum_{i=1}^{k} c_i p_i \;\middle|\; c_i \in \mathbb{R} \right\}.$$
If, for each i = 1, …, k, we set
$$c_i = \sum_{j=1}^{L} d_j\, \sigma_i\, z_{ji},$$
this proves that y ∈ 𝒮 because the value c_i does not depend on r. Hence, range W ⊆ 𝒮. As both of these sets are k-dimensional subspaces of ℝ^l, they must coincide. □

3.2. Problem Solution

Firstly, we discuss conditions when the preimage problem has a solution and when this solution is unique.
As Y ≠ ∅, the matrix W has a positive number of rows and, hence, a positive number of columns, so range W ≠ ∅ and a vector b can always be selected. Following Definition 11, we are given a vector b ∈ B. This means that a solution always exists. Moreover, as a set, the solution is unique: it is the set of all vectors v ∈ A that are mapped onto b.
Let us now discuss when the solution to the preimage problem is formed by a set with exactly one element. Following the well-known Fredholm alternative, if the homogeneous system W v 0 = 0 has a nontrivial solution v 0 A , then the set has more than one element. If W v 0 = 0 has only the trivial solution v 0 = 0 , then there is only one vector v solving the problem. Details can be found in Section 3.2.3.
Secondly, we characterize the solution from three perspectives: the first one is a useful characterization (Lemmas 5 and 6), the latter two (Theorem 1 together with Corollary 1, and Lemma 9) describe what the solution looks like.

3.2.1. Weighted Arithmetic Mean

Definition 12 (Weighted Arithmetic Mean of a Vector).
With respect to the real weights (w_i)_{i=1}^{L} that satisfy Σ_{i=1}^{L} w_i ≠ 0, the weighted arithmetic mean of a vector v = (v_1, …, v_L)^⊤ ∈ ℝ^L is the real number
$$\frac{\sum_{i=1}^{L} v_i\, w_i}{\sum_{i=1}^{L} w_i}.$$
The equation of the preimage problem, W v = b , is a set of linear equations W t v = b t where W t is a row of the matrix W corresponding to the index t Y , and we solve
$$b_t = \sum_{i=1}^{L} v_i\, w_{ti}, \quad \text{where } t \in Y.$$
We have l = | Y | equations in the form (16) of L l variables v 1 , , v L R , from which stems the following lemma.
Lemma 5.
Any vector v A is a solution to the preimage problem W v = b if and only if for every index t Y , its weighted arithmetic mean with respect to closeness-defining weights ( w t i ) i = 1 L is equal to b t d t t , where b t is the t-th component of the vector b and d t t is the node degree given by (11).
Proof. 
Dividing both sides of (16) by d t t = i = 1 L w t i , the claim is obvious. □
Lemma 5 states that with respect to every row of W, the weighted arithmetic means of the solutions to the preimage problem are equal. By that, Lemma 5 characterizes the set of all solutions to the preimage problem W v = b for a given vector b B . Hence, the similarity we are looking for is determined by the weighted mean of vectors. Moreover, it demarcates the solution as the equivalence class of vectors (discrete functions):
Lemma 6.
Let us have the equivalence relation on A given by: for all v , v ¯ A , we have
$$v \sim \bar{v} \iff \forall t \in Y: \; \frac{\sum_{i=1}^{L} v_i\, w_{ti}}{\sum_{i=1}^{L} w_{ti}} = \frac{\sum_{i=1}^{L} \bar{v}_i\, w_{ti}}{\sum_{i=1}^{L} w_{ti}}.$$
Let b = W u for some u A (i.e., b B ), then the solution to the preimage problem W v = b is the set { v A | v u } .
Proof. 
This lemma is a direct consequence of Lemma 5 where for any v u , we have
$$\forall t \in Y: \; \frac{\sum_{i=1}^{L} v_i\, w_{ti}}{\sum_{i=1}^{L} w_{ti}} = \frac{\sum_{i=1}^{L} u_i\, w_{ti}}{\sum_{i=1}^{L} w_{ti}} = \frac{b_t}{d_{tt}},$$
and because u is a solution to the preimage problem, any v u is a solution.
Conversely, consider a vector v* ≁ u, s.t. W v* = b. Then
$$\exists t^* \in Y: \; \frac{\sum_{i=1}^{L} v^*_i\, w_{t^* i}}{\sum_{i=1}^{L} w_{t^* i}} \neq \frac{\sum_{i=1}^{L} u_i\, w_{t^* i}}{\sum_{i=1}^{L} w_{t^* i}} = \frac{b_{t^*}}{d_{t^* t^*}},$$
which, based on (16), contradicts W v* = b. This contradiction proves that the equivalence class contains exactly all solutions to the preimage problem. □
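The equivalence of Lemma 6 is easy to observe numerically: shifting a vector by an element of null W changes the vector but leaves all row-wise weighted arithmetic means (and hence the image b) untouched. A NumPy sketch on the Example A1 data (an illustration assumption):

```python
import numpy as np

# Lemmas 5 and 6: two vectors solve W v = b for the same b exactly when all
# their row-wise weighted arithmetic means coincide.
W = np.array([
    [1/3, 2/3, 1.0, 2/3, 1/3, 0.0, 0.0, 0.0],
    [0.0, 1/3, 2/3, 1.0, 2/3, 1/3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1/3, 2/3, 1.0, 2/3, 1/3],
])
d = W.sum(axis=1)
u = np.arange(1, 9, dtype=float) ** 2
b = W @ u

# Shift u by an element of null W: an equivalent vector in the sense of Lemma 6.
null_vec = np.linalg.svd(W)[2][-1]   # right singular vector with zero singular value
v = u + 10.0 * null_vec

means_u = (W @ u) / d                # weighted means, equal to b_t / d_tt
means_v = (W @ v) / d
```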

3.2.2. Pseudoinverses

Decomposing W into its SVD W = P S Z^⊤ gives the preimage problem in the form P S Z^⊤ v = b, where P ∈ ℝ^{l×l} and Z ∈ ℝ^{L×L} are orthogonal and S ∈ ℝ^{l×L} is diagonal (i.e., ∀i ≠ j : s_{ij} = 0) with the singular values of W, σ_t = s_{tt}, t ∈ Y, on its diagonal. The equation at hand is thus
$$\forall t \in Y: \quad \sum_{q \in X} \sum_{r \in Y} \sigma_r\, p_{tr}\, z_{qr}\, v_q = b_t,$$
where z_{qr} = (Z^⊤)_{rq} is the entry of the matrix Z in its q-th row and r-th column.
Theorem 1 (Explicit Solution).
Let W ∈ ℝ^{l×L} be the closeness matrix given by (8) and (9), let its SVD W = P S Z^⊤ be given by matrices P ∈ ℝ^{l×l}, S ∈ ℝ^{l×L} and Z ∈ ℝ^{L×L}, s.t. the rows of P, the columns of P and the rows of S are indexed by Y, and let Y_+ = {t ∈ Y | σ_t > 0}. Let, moreover, b ∈ B. Then an explicit solution to the matrix equation W v = b with respect to the given SVD is the vector
$$v = \sum_{t \in Y_+} \frac{b^\top p_t}{\sigma_t}\, z_t \;\in A,$$
where p t is the t-th column of the matrix P and z t is the t-th column of the matrix Z.
Proof. 
Let 1_t denote the vector containing all zeros but 1 on the t-th coordinate and (σ_t^{-1})_t denote the vector containing all zeros but σ_t^{-1} on the t-th coordinate. Following the orthogonality of P and Z, we have:
$$W v = P S Z^\top \sum_{t \in Y_+} \frac{b^\top p_t}{\sigma_t}\, z_t = \sum_{t \in Y_+} \frac{b^\top p_t}{\sigma_t}\, P S Z^\top z_t = \sum_{t \in Y_+} \frac{b^\top p_t}{\sigma_t}\, P S\, \mathbf{1}_t = \sum_{t \in Y_+} b^\top p_t\; P S\, (\sigma_t^{-1})_t = \sum_{t \in Y_+} b^\top p_t\; P\, \mathbf{1}_t = \sum_{t \in Y_+} (b^\top p_t)\, p_t = b,$$
where the last equality follows from Lemma 4: b ∈ range W lies in the span of the orthonormal vectors p_t, t ∈ Y_+. □
Remark 5.
Note that the matrices P and Z in Theorem 1 are not determined uniquely, and so there is generally more than one vector v ∈ A solving W v = b. Following Lemma 6, the solution to the corresponding preimage problem is given by the set {v̄ ∈ A | v̄ ∼ v} formed by all vectors equivalent to the explicit solution v.
Another general description of the solution is expressed using the notion of a right inverse matrix, also called a (right) pseudoinverse. Any matrix W^{-1} ∈ ℝ^{L×l} satisfying W W^{-1} = I, where I ∈ ℝ^{l×l} is the identity matrix, is a right inverse matrix of W ∈ ℝ^{l×L}; if we additionally demanded that W^{-1} W be symmetric, W^{-1} would be unique; otherwise, there can be multiple right inverse matrices.
Lemma 7.
Let W^{-1} ∈ ℝ^{L×l} be any right inverse matrix of the closeness matrix W and b ∈ B; then the vector v = W^{-1} b is a solution to the preimage problem W v = b with respect to the given right inverse matrix W^{-1}.
Proof. 
W v = W W^{-1} b = I b = b. □
The following corollary is a special case of Theorem 1 written in a matrix form.
Corollary 1.
Let W ∈ ℝ^{l×L} be the closeness matrix and rank W = l. If W = P S Z^⊤ is its fixed SVD, then its right inverse is obtained as W^{-1} = Z S^+ P^⊤, where S^+ ∈ ℝ^{L×l} is the right inverse of S, i.e., a diagonal matrix with the inverted singular values of S on its diagonal. Let b ∈ B; then v = Z S^+ P^⊤ b is a solution to the preimage problem W v = b.
Proof. 
Following the orthogonality of the matrices P and Z, we have W W^{-1} = W Z S^+ P^⊤ = P S Z^⊤ Z S^+ P^⊤ = P S S^+ P^⊤ = P P^⊤ = I; hence, W v = W W^{-1} b = b. □
Lemma 8.
If a matrix W ∈ ℝ^{l×L}, 0 < l ≤ L, has linearly independent rows, then W^{-1} = W^⊤ (W W^⊤)^{-1} is its right inverse.
Proof. 
Following the SVD of W used in Corollary 1, the full row rank of W ensures the existence of (W W^⊤)^{-1}. Then, W W^{-1} = W W^⊤ (W W^⊤)^{-1} = I. □
Note that the condition that no singular value of W is zero is equivalent to the condition in Lemma 8 that W has linearly independent rows.
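Lemmas 7 and 8 can be checked numerically; the sketch below (NumPy assumed, Example A1 data) builds the right inverse W^⊤ (W W^⊤)^{-1} and uses it to solve the preimage problem:

```python
import numpy as np

# Lemma 8: W^T (W W^T)^{-1} is a right inverse of a full-row-rank W;
# Lemma 7: v = W^{-1} b then solves the preimage problem.
W = np.array([
    [1/3, 2/3, 1.0, 2/3, 1/3, 0.0, 0.0, 0.0],
    [0.0, 1/3, 2/3, 1.0, 2/3, 1/3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1/3, 2/3, 1.0, 2/3, 1/3],
])
W_right = W.T @ np.linalg.inv(W @ W.T)   # right inverse: W @ W_right = I
b = np.array([31.0, 52.0, 112.0])
v = W_right @ b                          # one solution of W v = b
```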
Example A1 in the Appendix A illustrates the preimage problem and shows one vector that solves it.

3.2.3. Affine Subspace

The solution to the preimage problem does not form a linear subspace because, e.g., W(v + v̄) = 2b, which is generally not equal to b. It is a linear subspace of A if and only if b = 0, i.e., if and only if it is equal to null W (the right nullspace, or kernel, of a matrix W ∈ ℝ^{l×L} contains all vectors a ∈ ℝ^L, s.t. W a = 0; it is a linear subspace of ℝ^L). In other words, if b = 0, then the preimage problem W v = b becomes a homogeneous system of linear equations, making its solution a linear subspace of A.
Coming back to the general case where the vector b B is arbitrary, we obtain that the solution to the preimage problem forms an affine subspace of A.
Lemma 9.
Let b B and v A be a particular solution to the preimage problem W v = b , then the set of all vectors that solve this problem, forms an affine subspace v + v 0 | v 0 null W of A.
Proof. 
For the vector v, it holds that W v = b, and for any vector v_0 ∈ null W, it holds that W v_0 = 0. Hence, for any element a = v + v_0 of the set V = {v + v_0 | v_0 ∈ null W}, it holds that W a = W v + W v_0 = b + 0 = b, so every element of V is a solution to the preimage problem.
Conversely, consider a vector a ∈ A, s.t. W a = b and a ∉ V. Then W(a − v) = W a − W v = b − b = 0, so the vector a − v ∈ null W. As the vector a can be expressed as v + (a − v), we see that a ∈ V. This contradiction proves that the set V contains exactly all solutions to the preimage problem. Moreover, it proves that the set V is unique, i.e., if we express the solution to the preimage problem as V̄ = {v̄ + v̄_0 | v̄_0 ∈ null W} for any vector v̄ ∈ A, s.t. W v̄ = b, then V = V̄.
As null W ⊆ A, V is an affine subspace of A with the displacement vector v. □
Using the SVD of W as above, we can express the solution to the preimage problem by the right singular vectors corresponding to the zero singular values of W as
$$\left\{ v + \sum_{i=k+1}^{L} c_i z_i \;\middle|\; c_i \in \mathbb{R} \right\}.$$
The dimension of this affine subspace is dim(null W), which is zero (and hence there exists only one vector solving the preimage problem) if and only if k = l = L, i.e., if and only if the closeness matrix W is square and regular. If W is rectangular, there are infinitely many vectors a ∈ A solving the preimage problem.
Corollary 2.
Any weighted mean of elements of the set of all vectors solving the preimage problem with respect to weights with nonzero sum is again an element of this set.
Proof. 
Let n N and i = 1 , , n : W v i = b , i.e., each v i A be a solution to the preimage problem. Let c 1 , , c n R , s.t. i = 1 n c i 0 . Then it holds that
$$W\!\left( \frac{\sum_{i=1}^{n} c_i v_i}{\sum_{i=1}^{n} c_i} \right) = \frac{\sum_{i=1}^{n} c_i\, W v_i}{\sum_{i=1}^{n} c_i} = \frac{\sum_{i=1}^{n} c_i\, b}{\sum_{i=1}^{n} c_i} = b,$$
which proves the claim. □
In this subsection, we described the solution to the preimage problem from three different perspectives: using weighted means of functional values (determining an equivalence class of mutually similar functions), using right inverses of the closeness matrix (that can be obtained also by SVD) and using the notion of affine subspace (its displacement vectors can be also obtained by SVD of W, based on Theorem 1).
In the next section, we show the connection between the inverse F-transform and a solution to the preimage problem as the rows of W are formed by the basic functions of the fuzzy partition.

4. Inverse F-Transform

In this section, we omit the assumption (6) stating that every point of the universe X must be covered by at least one basic function. Admitting this degenerate case allows us to express a useful characterization of when the inverse F-transform, given by Definition 10, provides a solution to the preimage problem.
Lemma 10.
Let the assumptions of Lemma 3 be satisfied. If
$$W W^\top = D,$$
then the inverse F-transform v = F̂[u] given by Definition 10 is a solution to the preimage problem according to Definition 11, i.e., D F[v] = W v = W u = D F[u] = b, for any vector u ∈ A.
Proof. 
We see that the inverse F-transform is a solution to the preimage problem if and only if for any u A , it holds:
$$W u = D F[u] = b = W v = W \hat{F}[u] = W W^\top F[u] = W W^\top D^{-1} W u,$$
which holds if
$$W W^\top D^{-1} = I,$$
or equivalently if (17) holds true. □
Corollary 3.
Condition (17) holds true if both of the following equations are fulfilled:
1. 
$$\forall t \in Y: \quad d_{tt} = \sum_{i=1}^{L} w_{ti} = \sum_{i=1}^{L} w_{ti}^2,$$
2. 
$$\forall t, s \in Y,\; t \neq s: \quad d_{ts} = 0 = \sum_{i=1}^{L} w_{ti}\, w_{si}.$$
Recalling that w_{ti} = W(t, x_i) = A_t(x_i) ∈ [0, 1], we infer that condition 1 is equivalent to stating that ∀t ∈ Y, ∀x_i ∈ X : w_{ti} ∈ {0, 1}, i.e., only crisp (0/1) assignments of closeness can be used. Condition 2 is equivalent to stating that every column of the matrix W contains at most one non-zero value; let us denote the system of all such matrices by 𝒲. Putting these two conditions together, we obtain the following corollary.
Corollary 4.
If the closeness matrix W contains only the values 0 and 1 and has at most one non-zero entry in each column, then the inverse F-transform v = F̂[u] given by Definition 10 is a solution to the preimage problem.
This property is demonstrated in Examples A2 and A3 in the Appendix A.

5. New Set of Basic Functions

This section was inspired by the paper [17] where one of the main goals consists in finding conditions under which a noisy signal must be sampled so that it can be reconstructed from its F-transform components. The authors showed that a fuzzy partition adjoint to the partition (with the same nodes) that determined the F-transform components, ensures that the associated inverse F-transform provides the best approximation of the original, continuous function in a certain space.
We analyzed a possible connection between discrete fuzzy partitions on the closeness space and decided to address the following question: can we find a solution to the preimage problem W v = b by creating a new set of basic functions, s.t. they determine a linear transformation of the vector b that would solve the problem? If so, what are the conditions that this new set of basic functions must satisfy?
Throughout this section, assume that the sets X = { 1 , , L } , Y and { A t : X [ 0 , 1 ] | t Y } , the matrices W and D and the vector b are related as in Lemma 3 and fixed.
We are looking for a new set of basic functions (a new fuzzy partition) {B_t : X → ℝ | t ∈ Y}, written in the rows of a newly created matrix M ∈ ℝ^{l×L}, i.e.,
$$\forall t \in Y,\; \forall x \in X: \quad m_{tx} = B_t(x),$$
s.t. the vector v = M^⊤ b solves the preimage problem.
Therefore, we require that W M^⊤ b = b, and from this, we deduce properties of the matrix M and express them in terms of the basic functions A_t's and B_t's. For an arbitrary matrix N, let N_t denote its t-th row, N^t denote its t-th column and N_{ts} denote its entry n_{ts}.
Hence, we require that
$$\forall t \in Y: \quad (W M^\top)_t\, b = \sum_{s \in Y} (W M^\top)_{ts}\, b_s = \sum_{s \in Y} W_t\, (M^\top)^s\, b_s = b_t.$$
Recalling that W_{tx} = A_t(x) and (M^⊤)_{xs} = m_{sx} = B_s(x), for t, s ∈ Y, x ∈ X, we have
$$\forall t \in Y: \quad \sum_{s \in Y} b_s \sum_{x \in X} A_t(x)\, B_s(x) = b_t.$$
Substituting the constants
$$c_{ts} = \sum_{x \in X} A_t(x)\, B_s(x), \qquad (t, s) \in Y \times Y,$$
in (19), we obtain that
$$\forall t \in Y: \quad \sum_{s \in Y} b_s\, c_{ts} = b_t.$$
In other words, by creating the matrix C = W M^⊤ ∈ ℝ^{l×l} from the constants c_{ts}, (t, s) ∈ Y × Y, we solve C b = b. As B can be the whole space ℝ^l, we generally demand that C be the identity matrix I (in the special case of rank W < l, this requirement is unnecessarily strong but again leads to a good choice of the new basic functions B_t's).
This means that for all ( t , s ) Y × Y , we demand
1.
t = s : c t s = x X A t ( x ) B s ( x ) = 1 , and
2.
t s : c t s = x X A t ( x ) B s ( x ) = 0 .
This observation motivates the following definition.
Definition 13 (Compatible Set of Basic Functions).
Let the assumptions of Lemma 3 be satisfied. We say that the set { B t : X R | t Y } is a compatible set of basic functions with respect to { A t : X R | t Y } in the closeness space ( X , w ) , if
$$\forall t \in Y: \quad \sum_{x \in X} A_t(x)\, B_t(x) = 1,$$
and
$$\forall t, s \in Y,\; t \neq s: \quad \sum_{x \in X} A_t(x)\, B_s(x) = 0.$$
Theorem 2 (Solution Determined by a Compatible Set of Basic Functions).
Let the assumptions of Lemma 3 be satisfied and let {B_t : X → ℝ | t ∈ Y} be a compatible set of basic functions with respect to {A_t : X → ℝ | t ∈ Y} in the closeness space (X, w). Let b ∈ B and a matrix M ∈ ℝ^{l×L} be defined by m_{tx} = B_t(x), t ∈ Y, x ∈ X; then the vector v = M^⊤ b is a solution to the preimage problem W v = b.
Proof. 
W v = W M^⊤ b ∈ ℝ^l, and for each of its components, we have
$$\forall t \in Y: \quad W_t\, v = (W M^\top)_t\, b = \sum_{s \in Y} (W M^\top)_{ts}\, b_s = \sum_{s \in Y} \sum_{x \in X} W_{tx}\, m_{sx}\, b_s = \sum_{s \in Y} \sum_{x \in X} A_t(x)\, B_s(x)\, b_s = b_t.$$
This proves that W v = b . □
If we can find a set of new basic functions B_t's compatible with the A_t's, then the vector v = M^⊤ b can be described as an inverse F-transform of an unknown vector u ∈ A, s.t. b = D F[u], with respect to {B_t : X → ℝ | t ∈ Y} written in the rows of M.
In other words, the direct mapping between the vector spaces A and B is given by W (W : A → B), and any such M gives an element of the induced inverse relation, i.e., (b, M^⊤ b) ∈ W^{-1}, where W^{-1} ⊆ B × A is the inverse relation to the mapping W. This leads to multiple solutions forming a subset of the affine subspace described in Lemma 9.
Lemma 11.
Let the assumptions of Lemma 3 be satisfied. If rank W = l , then there always exists a compatible set of basic functions with respect to { A t : X R | t Y } in the closeness space ( X , w ) .
Proof. 
Following Lemma 8, W^{-1} = W^⊤ (W W^⊤)^{-1} is a right inverse to W. By setting M = (W^{-1})^⊤, we obtain a compatible set of basic functions {B_t : X → ℝ | t ∈ Y} given by (18). □
To conclude, we found an inverse-F-transform-like procedure associated with the system of new basic functions that, applied to b, gives a solution to the preimage problem. Examples A4 and A5 in Appendix A illustrate the results of this section.

6. Conclusions

In this paper, we discussed the preimage problem in the space where the relationship between objects is determined by closeness. We showed that any metric can be transformed into closeness and therefore, the latter is weaker than the former. We interlaced closeness with fuzzy partition and characterized both by the closeness matrix.
We expressed similarity between functions based on their images (coinciding with vectors of scaled F-transform components) computed using the closeness matrix. By that (and by setting the basic structure of the space without metric, or norm), we contributed to the mathematical theory in the field of functional analysis. We formulated the preimage problem using the language of matrix calculus. The preimage problem solution is given by (i) a weighted arithmetic mean, (ii) any right inverse of the closeness matrix or (iii) any element of a certain affine subspace. Singular value decomposition was applied to describe the problem and its solution.
We defined the notion of compatible set of basic functions and found conditions under which the inverse F-transform with respect to the given and compatible set of basic functions forms a solution to the preimage problem.
Theoretical results were illustrated by numerical examples. They demonstrate, e.g., that requiring reflexivity of closeness can be counterproductive.
Future research will focus on imposing further conditions on both collections of basic functions (A_t's and B_t's) to reveal stronger connections between the spaces A and B.

Author Contributions

Investigation, J.J. and I.P.; resources, J.J. and I.P.; writing—original draft preparation, J.J.; writing—review and editing, J.J. and I.P.; supervision, I.P. All authors have read and agreed to the published version of the manuscript.

Funding

The support of the project SGS20/PřF-MF/2022 of the University of Ostrava is greatly appreciated.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following symbols and abbreviations are used in this manuscript:
wcloseness, closeness-describing bivariate weight function
( X , w ) closeness space with the carrier set (universe) X and closeness w
×Cartesian product
Wcloseness matrix storing weights w i j = w ( x i , x j ) describing closeness between
x i and x j
fuzzy subset of a given universe
A t basic function associated with the node t Y
l = | Y | number of nodes within the universe X
L = | X | number of all points of the universe X
d t t degree of the node t Y , diagonal element of the scaling (degree) matrix D
F [ u ] = F [ u ] t t Y F-transform of the function u, vector of F-transform components
F ^ [ u ] inverse F-transform of the function u
· transpose of ·
W t t-indexed row of the closeness matrix W
Aspace of all real functions on the universe X or all corresponding
L-dimensional vectors
B = range W space of all real l-dimensional vectors that belong to the column space of the
closeness matrix W
P, S, ZSVD matrices: m. of left singular vectors, m. with singular values σ t on its
diagonal, m. of right singular vectors, respectively
equivalence relation between vectors in A
W 1 right inverse matrix of the closeness matrix W
W system of all real l × L matrices containing at most one non-zero entry in each column
F-transformfuzzy transform
SVDsingular value decomposition
s.t.such that

Appendix A

This section contains numerical examples related to the previous five sections (excluding Section 6) to increase their legibility.
Example A1 relates to Section 3.2, explains the notations of matrices and shows the detailed computation of the F-transform components which will be omitted in other examples. It illustrates the preimage problem and shows one vector that solves it.
Example A1.
Let a set of nodes Y = { 3 , 4 , 6 } be a selected subset of X = { 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 } , the universe, and
$$\forall t \in Y: \quad A_t(x) = \begin{cases} \dfrac{x - t}{3} + 1 & x \in [t - 3,\, t] \cap X, \\[4pt] \dfrac{t - x}{3} + 1 & x \in [t,\, t + 3] \cap X, \\[4pt] 0 & x \in X \setminus [t - 3,\, t + 3] \end{cases}$$
be triangular basic functions on X associated with the elements of Y that determine the closeness-describing (weight) matrix
$$W = \begin{pmatrix} \frac{1}{3} & \frac{2}{3} & 1 & \frac{2}{3} & \frac{1}{3} & 0 & 0 & 0 \\ 0 & \frac{1}{3} & \frac{2}{3} & 1 & \frac{2}{3} & \frac{1}{3} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{3} & \frac{2}{3} & 1 & \frac{2}{3} & \frac{1}{3} \end{pmatrix},$$
and the scaling (node degree) matrix
$$D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$
Note that these two matrices, W and the derived D, are independent of the functions acting on X. Let u = x² be a particular function on X which has the vector form
$$u = (1, 4, 9, 16, 25, 36, 49, 64)^\top.$$
The function u is described by its values u i i = 1 L = u ( x i ) x i X that represent registered values of a certain variable (physical, chemical, etc.) measured at all points (on all objects) of the universe (data set) X. Another possible interpretation of u consists in assuming that there is a continuous signal that spreads over a certain medium during a time frame described by an interval [ y 0 , y l + 1 ] X and we are able to register that signal only in discrete time steps x i i = 1 L . The components of the direct F-transform of the function u corresponding to all A t ’s ( t Y ) are:
$$F[u]_1 = \frac{1 \cdot \frac{1}{3} + 4 \cdot \frac{2}{3} + 9 \cdot 1 + 16 \cdot \frac{2}{3} + 25 \cdot \frac{1}{3}}{\frac{1}{3} + \frac{2}{3} + 1 + \frac{2}{3} + \frac{1}{3}} = \frac{31}{3},$$
$$F[u]_2 = \frac{4 \cdot \frac{1}{3} + 9 \cdot \frac{2}{3} + 16 \cdot 1 + 25 \cdot \frac{2}{3} + 36 \cdot \frac{1}{3}}{\frac{1}{3} + \frac{2}{3} + 1 + \frac{2}{3} + \frac{1}{3}} = \frac{52}{3},$$
$$F[u]_3 = \frac{16 \cdot \frac{1}{3} + 25 \cdot \frac{2}{3} + 36 \cdot 1 + 49 \cdot \frac{2}{3} + 64 \cdot \frac{1}{3}}{\frac{1}{3} + \frac{2}{3} + 1 + \frac{2}{3} + \frac{1}{3}} = \frac{112}{3}.$$
Writing them in the vector form, we obtain
$$F[u] = \left( \frac{31}{3},\; \frac{52}{3},\; \frac{112}{3} \right)^\top.$$
Obviously, it holds that D F[u] = W u, i.e.,
$$\begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} \frac{31}{3} \\ \frac{52}{3} \\ \frac{112}{3} \end{pmatrix} = W \begin{pmatrix} 1 \\ 4 \\ \vdots \\ 64 \end{pmatrix} = \begin{pmatrix} 31 \\ 52 \\ 112 \end{pmatrix}.$$
The preimage problem given by the vector we ended up with (this ensures that b ∈ B) then has the form
$$b = D F[u] = \begin{pmatrix} 31 \\ 52 \\ 112 \end{pmatrix} = W v.$$
We will use SVD of W to find one of the right inverse matrices that constitute a solution.
For one particular SVD decomposition, we obtain the explicit solution
$$v = Z S^+ P^\top b \approx (10.563025,\; 8.92437,\; 7.285714,\; 6.403361,\; 29.92437,\; 53.445378,\; 43.764706,\; 21.882353)^\top.$$
To verify it, we compute
$$W v \approx (31, 52, 112)^\top,$$
and comparing it to b we see that we found a solution to the preimage problem.
In Example A1, we obtained the same particular solution using the explicit formula from Theorem 1, the right inverse given by the SVD of W (Corollary 1), and the right inverse given by Lemma 8. As SVD is widely used to compute pseudoinverses (and our software is no exception), this was not too surprising.
Examples A2 and A3 relate to Section 4 and show that, under certain conditions, the inverse F-transform is one of the possible solutions to the preimage problem. The former example illustrates the degenerate case where not all points of the universe are covered by basic functions (i.e., some points are not connected with any node); the latter uses non-convex basic functions.
Example A2.
Let Y = { 3 , 4 , 6 } be a selected subset of nodes of the set X = { 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 } ,
$$\forall t \in Y: \quad A_t(x) = \begin{cases} 1 & t = x, \\ 0 & t \neq x \end{cases}$$
be singleton basic functions on X associated with the elements of Y that determine the closeness matrix
$$W = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \end{pmatrix},$$
and the scaling matrix
$$D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Let u be a particular function on X with the vector form
$$u = (0, 0, 9, 16, 0, 36, 0, 0)^\top.$$
The components of the direct F-transform of u corresponding to all A_t's (t ∈ Y) form the vector
$$F[u] = (9, 16, 36)^\top.$$
Obviously, it holds that D F[u] = W u, i.e.,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 9 \\ 16 \\ 36 \end{pmatrix} = W u = \begin{pmatrix} 9 \\ 16 \\ 36 \end{pmatrix}.$$
The preimage problem given by the vector we ended up with then has the form
$$b = D F[u] = \begin{pmatrix} 9 \\ 16 \\ 36 \end{pmatrix} = W v.$$
The inverse F-transform of u with respect to W has the form
F ^ [ u ] = W F [ u ] = 0 0 9 16 0 36 0 0 .
Since this vector is equal to $u$, we obtain that $W \hat{F}[u] = W u$, so $\hat{F}[u]$ is a solution to the preimage problem. Note that $W$ is such that $W W^{\top} = I = D$.
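Example A2 can be checked with a minimal plain-Python sketch (an illustration only, not the authors' software). For singleton basic functions, the closeness matrix satisfies $W W^{\top} = I = D$, so the inverse F-transform $W^{\top} F[u]$ reproduces $u$ exactly:

```python
# Plain-Python verification of Example A2 (illustrative sketch).

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

W = [[0, 0, 1, 0, 0, 0, 0, 0],   # singleton basic functions at nodes 3, 4, 6
     [0, 0, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 1, 0, 0]]
u = [0, 0, 9, 16, 0, 36, 0, 0]

F = matvec(W, u)                  # direct F-transform; D = I, so F[u] = W u
assert F == [9, 16, 36]

F_hat = matvec(transpose(W), F)   # inverse F-transform W^T F[u]
assert F_hat == u                 # singletons reproduce u exactly
assert matvec(W, F_hat) == F      # hence F^[u] solves the preimage problem
```

The final assertion is exactly the preimage condition $W \hat{F}[u] = b$.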
Example A3.
Let $Y = \{3, 4, 6\}$ be a selected subset of nodes of the set $X = \{1, 2, 3, 4, 5, 6, 7, 8\}$ and let the closeness be determined by the matrix
$$W = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix},$$
and the corresponding scaling matrix
$$D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}.$$
Let $u$ be a particular function on $X$ with the vector form
$$u = \begin{pmatrix} 0 & 0 & 9 & 16 & 0 & 36 & 0 & 0 \end{pmatrix}^{\top}.$$
The components of the direct F-transform of $u$ corresponding to all $A_t$'s ($t \in Y$) form the vector
$$F[u] = \begin{pmatrix} 3 & 8 & 12 \end{pmatrix}^{\top}.$$
Obviously, it holds that $D\,F[u] = W u$, i.e.,
$$\begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 8 \\ 12 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 0 \\ 9 \\ 16 \\ 0 \\ 36 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 9 \\ 16 \\ 36 \end{pmatrix}.$$
The preimage problem determined by this vector then has the form
$$b = D\,F[u] = \begin{pmatrix} 9 \\ 16 \\ 36 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} \cdot v = W v.$$
The inverse F-transform of u with respect to W has the form
$$\hat{F}[u] = W^{\top} F[u] = \begin{pmatrix} 3 & 3 & 3 & 8 & 8 & 12 & 12 & 12 \end{pmatrix}^{\top}.$$
We see that
$$W \hat{F}[u] = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 3 \\ 3 \\ 8 \\ 8 \\ 12 \\ 12 \\ 12 \end{pmatrix} = \begin{pmatrix} 9 \\ 16 \\ 36 \end{pmatrix} = W u,$$
so $\hat{F}[u]$ is a solution to the preimage problem. Note that again $W$ is such that $W W^{\top} = D$.
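The same kind of plain-Python sketch (again illustrative, not the authors' code) confirms Example A3: the basic functions now cover several points each, so $\hat{F}[u]$ differs from $u$, yet it has the same image, $W \hat{F}[u] = W u$, and the scaling matrix arises as $D = W W^{\top}$:

```python
# Plain-Python verification of Example A3 (illustrative sketch).

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

W = [[1, 1, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 1, 1, 0, 0, 0],
     [0, 0, 0, 0, 0, 1, 1, 1]]
u = [0, 0, 9, 16, 0, 36, 0, 0]

Wt = [list(col) for col in zip(*W)]
D = matmul(W, Wt)
assert D == [[3, 0, 0], [0, 2, 0], [0, 0, 3]]   # D = W W^T

# direct F-transform: F[u] = D^{-1} W u (componentwise, since D is diagonal)
F = [wu / D[i][i] for i, wu in enumerate(matvec(W, u))]
assert F == [3, 8, 12]

F_hat = matvec(Wt, F)                     # inverse F-transform W^T F[u]
assert F_hat != u                         # no longer reproduces u ...
assert matvec(W, F_hat) == matvec(W, u)   # ... but it is still a preimage
```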
Examples A4 and A5 relate to Section 5 and illustrate that creating a compatible set of basic functions leads to a solution to the preimage problem. The latter example shows a drawback of requiring reflexivity of closeness.
Example A4.
Let $X = \{1, 2, 3, 4, 5, 6, 7, 8, 9\}$ be the universe with a selected subset $Y = \{2, 5, 8\}$ of nodes and let the closeness be determined by the matrix
$$W = \begin{pmatrix} \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} \end{pmatrix},$$
and the corresponding scaling matrix
$$D = \begin{pmatrix} \frac{7}{4} & 0 & 0 \\ 0 & \frac{7}{4} & 0 \\ 0 & 0 & \frac{7}{4} \end{pmatrix}.$$
Let $u = x^2$ be a particular function on $X$, which has the vector form
$$u = \begin{pmatrix} 1 & 4 & 9 & 16 & 25 & 36 & 49 & 64 & 81 \end{pmatrix}^{\top}.$$
The components of the direct F-transform of the function $u$ corresponding to all $A_t$'s ($t \in Y$) form the vector
$$F[u] = \begin{pmatrix} \frac{32}{7} & \frac{179}{7} & \frac{452}{7} \end{pmatrix}^{\top}.$$
Obviously, it holds that $D\,F[u] = W u$, i.e.,
$$\begin{pmatrix} \frac{7}{4} & 0 & 0 \\ 0 & \frac{7}{4} & 0 \\ 0 & 0 & \frac{7}{4} \end{pmatrix} \cdot \begin{pmatrix} \frac{32}{7} \\ \frac{179}{7} \\ \frac{452}{7} \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 4 \\ 9 \\ 16 \\ 25 \\ 36 \\ 49 \\ 64 \\ 81 \end{pmatrix} = \begin{pmatrix} 8 \\ \frac{179}{4} \\ 113 \end{pmatrix}.$$
The preimage problem determined by this vector then has the form
$$b = \begin{pmatrix} 8 \\ \frac{179}{4} \\ 113 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} \end{pmatrix} \cdot v = W v.$$
Let us consider, e.g., the following matrix $M$ satisfying $W M^{\top} = I$:
$$M = \begin{pmatrix} \frac{1}{2} & \frac{2}{3} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{2}{3} & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{2}{3} & \frac{1}{2} \end{pmatrix}.$$
Then the vector
$$v = M^{\top} b = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ \frac{2}{3} & 0 & 0 \\ \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & \frac{2}{3} & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{2} \\ 0 & 0 & \frac{2}{3} \\ 0 & 0 & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 8 \\ \frac{179}{4} \\ 113 \end{pmatrix} = \begin{pmatrix} 4 & \frac{16}{3} & 4 & \frac{179}{8} & \frac{179}{6} & \frac{179}{8} & \frac{113}{2} & \frac{226}{3} & \frac{113}{2} \end{pmatrix}^{\top}$$
is a solution to the preimage problem, which can be verified by
$$W v = \begin{pmatrix} \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{3}{4} & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 4 \\ \frac{16}{3} \\ 4 \\ \frac{179}{8} \\ \frac{179}{6} \\ \frac{179}{8} \\ \frac{113}{2} \\ \frac{226}{3} \\ \frac{113}{2} \end{pmatrix} = \begin{pmatrix} 8 \\ \frac{179}{4} \\ 113 \end{pmatrix} = b.$$
Moreover, it demonstrates the fact that the sets of basic functions $A_t$ (given by the rows of $W$) and $B_t$ (given by the rows of $M$), $t \in Y$, are compatible.
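Example A4 can also be verified exactly, using rational arithmetic. The sketch below (illustrative only, not the paper's software) checks the compatibility condition $W M^{\top} = I$ and that $v = M^{\top} b$ solves the preimage problem:

```python
# Exact verification of Example A4 with stdlib fractions.
from fractions import Fraction as Fr

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

h, q = Fr(1, 2), Fr(3, 4)
W = [[h, q, h, 0, 0, 0, 0, 0, 0],   # basic functions A_t at nodes 2, 5, 8
     [0, 0, 0, h, q, h, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, h, q, h]]
t = Fr(2, 3)
M = [[h, t, h, 0, 0, 0, 0, 0, 0],   # compatible basic functions B_t
     [0, 0, 0, h, t, h, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, h, t, h]]

u = [x * x for x in range(1, 10)]   # u(x) = x^2 on X = {1, ..., 9}
b = matvec(W, u)
assert b == [8, Fr(179, 4), 113]

Mt = [list(col) for col in zip(*M)]
assert matmul(W, Mt) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # W M^T = I

v = matvec(Mt, b)                   # candidate preimage v = M^T b
assert matvec(W, v) == b            # v solves the preimage problem
```

Since $W M^{\top} = I$, the last assertion holds for any right-hand side $b$, not only this one.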
If we assumed that $A_t(t) = 1$ for each $t \in Y$ (reflexive closeness), the compatible set of convex basic functions also satisfying $B_t(t) = 1$ for each $t \in Y$ would be uniquely formed by singletons (the degenerate case). This is illustrated by Example A5.
Example A5.
Let $X = \{1, 2, 3, 4, 5, 6, 7, 8, 9\}$ be the universe with the selected subset $Y = \{2, 5, 8\}$ of nodes and let the closeness be determined by the matrix
$$W = \begin{pmatrix} \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} \end{pmatrix},$$
and the corresponding scaling matrix
$$D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
Let $u = x^2$ be a particular function on $X$, which has the vector form
$$u = \begin{pmatrix} 1 & 4 & 9 & 16 & 25 & 36 & 49 & 64 & 81 \end{pmatrix}^{\top}.$$
The components of the direct F-transform of the function $u$ corresponding to all $A_t$'s ($t \in Y$) form the vector
$$F[u] = \begin{pmatrix} \frac{9}{2} & \frac{51}{2} & \frac{129}{2} \end{pmatrix}^{\top}.$$
Obviously, it holds that $D\,F[u] = W u$, i.e.,
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \cdot \begin{pmatrix} \frac{9}{2} \\ \frac{51}{2} \\ \frac{129}{2} \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 4 \\ 9 \\ 16 \\ 25 \\ 36 \\ 49 \\ 64 \\ 81 \end{pmatrix} = \begin{pmatrix} 9 \\ 51 \\ 129 \end{pmatrix}.$$
The preimage problem determined by this vector then has the form
$$b = \begin{pmatrix} 9 \\ 51 \\ 129 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} \end{pmatrix} \cdot v = W v.$$
Let us consider the following matrix $M$ satisfying $W M^{\top} = I$ (if we require each of its rows to form a convex fuzzy set $B_t$ with $B_t(t) = 1$, then this is the only option):
$$M = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}.$$
Note that if $A_t(s) > 0$ for some $s \in Y \setminus \{t\}$, then no compatible matrix $M$ exists.
The vector
$$v = M^{\top} b = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} 9 \\ 51 \\ 129 \end{pmatrix} = \begin{pmatrix} 0 & 9 & 0 & 0 & 51 & 0 & 0 & 129 & 0 \end{pmatrix}^{\top}$$
is a solution to the preimage problem, which can be verified by
$$W v = \begin{pmatrix} \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 1 & \frac{1}{2} \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 9 \\ 0 \\ 0 \\ 51 \\ 0 \\ 0 \\ 129 \\ 0 \end{pmatrix} = \begin{pmatrix} 9 \\ 51 \\ 129 \end{pmatrix} = b.$$
This again demonstrates the fact that the sets of basic functions $A_t$ (given by the rows of $W$) and $B_t$ (given by the rows of $M$), $t \in Y$, are compatible. More importantly, it demonstrates that requiring $A_t(t) = 1$ and $B_t(t) = 1$ for all node indices $t \in Y$ can lead to an empty set of compatible convex basic functions.
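The degenerate character of Example A5 is easy to check in the same plain-Python style (illustrative sketch, not the authors' code): the singleton rows of $M$ are compatible with $W$, and $v = M^{\top} b$ is a valid, if degenerate, preimage:

```python
# Exact verification of Example A5 with stdlib fractions.
from fractions import Fraction as Fr

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

h = Fr(1, 2)
W = [[h, 1, h, 0, 0, 0, 0, 0, 0],   # reflexive closeness: A_t(t) = 1
     [0, 0, 0, h, 1, h, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, h, 1, h]]
M = [[0, 1, 0, 0, 0, 0, 0, 0, 0],   # singletons at the nodes 2, 5, 8
     [0, 0, 0, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 1, 0]]

u = [x * x for x in range(1, 10)]   # u(x) = x^2
b = matvec(W, u)
assert b == [9, 51, 129]

Mt = [list(col) for col in zip(*M)]
v = matvec(Mt, b)                   # v = M^T b, nonzero only at the nodes
assert v == [0, 9, 0, 0, 51, 0, 0, 129, 0]
assert matvec(W, v) == b            # the singleton preimage solves W v = b
```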

Janeček, J.; Perfilieva, I. Preimage Problem Inspired by the F-Transform. Mathematics 2022, 10, 3209. https://doi.org/10.3390/math10173209
