Design Plan for an Evolution Study of Related Characteristics of a Population

Rodríguez-Díaz, Juan M.; Pruneda, Rosa E.; Rodríguez-Hernández, Mercedes

doi:10.3390/math10050792


by Juan M. Rodríguez-Díaz ¹,*, Rosa E. Pruneda ² and Mercedes Rodríguez-Hernández ³

¹ Department of Statistics, Universidad de Salamanca, 37008 Salamanca, Spain
² Department of Mathematics, University of Castilla-La Mancha, 13071 Ciudad Real, Spain
³ Department of Didactic of Mathematics and Experimental Sciences, Universidad de Salamanca, 05003 Ávila, Spain
* Author to whom correspondence should be addressed.

Mathematics 2022, 10(5), 792; https://doi.org/10.3390/math10050792

Submission received: 9 January 2022 / Revised: 22 February 2022 / Accepted: 28 February 2022 / Published: 2 March 2022

(This article belongs to the Special Issue Optimal Experimental Design and Statistical Modeling)


Abstract

The objective is to study how different characteristics of a population evolve through time. These response variables may be related within each experimental unit and, in addition, the observations of each response may be correlated over time, producing a complex correlation structure. The number of responses that can be observed is usually limited by budget, resources, or time, so it is useful to select the most informative time points at which data should be taken. This is done using optimal design of experiments techniques. Some analytical results are presented and then applied to find the most convenient time points for testing two variables related to the ability of primary school students to solve mathematical problems.

Keywords: covariance structure; evolution study; mental representation; optimal design of experiments; problem solving

MSC: 62K05

1. Introduction

When studying several characteristics of a group of experimental units, one of the main concerns is the independence of the obtained data, since most of the usual techniques assume independence between the observations. However, this assumption may be difficult to maintain when observing different variables of a specific subject through time. In this situation, at least two kinds of covariance structures may arise: one between the different types of observations taken on the same subject at each time point, and another between the values of a specific characteristic observed through time [1,2]. This may be a complex problem even when only one experimental unit is observed. Considering a group of units in the study adds another layer to the structure, a problem that will be addressed in the present work.

The origin of this work was the study of young children’s capacity for solving mathematical problems and the evolution of this skill through the years they spend in primary school. This variable is very often related to others that could be observed as well in the same students in order to check the kind of relationship or estimate the model linking the variables. However, obtaining the data has certain costs, especially those coming from the time that teachers and administrative personnel will spend designing and performing the tests, correcting them, transferring the outcomes to a computer program, and analyzing the results. It will be very difficult to have these tests performed very often, or even once a year. For this reason, a design plan stating the conditions under which these observations should be taken is quite convenient.

In this work, such designs will be studied for a general case in which some variables will be observed on a cohort of subjects at different time periods, trying to decide the most informative time-points where observations/tests should be collected using optimal experimental design techniques.

In Section 2, the main concepts and latest results of the optimal design of experiments theory applied to models describing several response variables for one subject will be outlined, and new analytical results for a general case with N subjects and k variables of interest will be obtained. In Section 3, the theory will be applied to the joint study of the variables ’Ability in problem solving’ (PS) and ’Mental representation capacity’ (MR) of the primary school students [3], and the results will be commented on in Section 4. Finally, the main conclusions and further applications will be addressed in Section 5.

2. Background and New Results

Let $x$ denote the experimental conditions, with possible values in the experimental domain $\mathcal{X}$. An exact design $ξ$ is a collection of points $\{x_{1}, \dots, x_{n}\}$ where samples are to be taken. For $n$ observations $Y = {(y_{1}, \dots, y_{n})}^{T}$ of the one-response linear model $y = f {(x)}^{T} β + u$, the system can be expressed in matrix notation as

$$Y = X β + U, \tag{1}$$

where $β$ is the $m$-parameter vector, $U = {(u_{1}, \dots, u_{n})}^{T}$ the vector of error terms, and $X = {(f (x_{1}), \dots, f (x_{n}))}^{T}$ the design matrix, with $f (x) = {(f_{1} (x), \dots, f_{m} (x))}^{T}$ and the $f_{i} (x)$ linearly independent in $\mathcal{X}$. When there exists a correlation structure $Σ = \mathrm{Var} (Y)$ between the samples, the estimator of the parameters for normally distributed errors is $\hat{β} = {(X^{T} Σ^{- 1} X)}^{- 1} X^{T} Σ^{- 1} Y$. One of the most important tools in the optimal design of experiments framework is the information matrix of the design $ξ$,

$$M (ξ) = X^{T} Σ^{- 1} X, \tag{2}$$

which is proportional to the inverse of the variance of $\hat{β}$.

Usually the objective of practitioners is to find optimal designs that produce precise estimators of the model parameters, thus minimizing $\mathrm{Var} (\hat{β})$ or equivalently maximizing $M (ξ)$. Different criterion functions may be considered, usually convex functions of $M^{- 1} (ξ)$. The most popular criterion is $D$-optimality, which pays attention to the determinant of $M^{- 1} (ξ)$, and that will be the criterion used in this work. When a design is $D$-optimal, it minimizes the volume of the confidence ellipsoid of the estimators of the model parameters. For non-linear models, the information matrix will depend on the unknown parameters. In this case, nominal values are needed for them, and locally optimal designs will be obtained. Reference books on the topic are, for instance, [4,5,6]. To compute optimal designs, from the analytical expression of the model the linearized version is obtained by computing the derivatives with respect to the parameters. When it is not possible to obtain the analytical expression, alternative methods for computing the derivatives can be employed [7].
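As a minimal sketch of the quantities just introduced, the following NumPy code computes the information matrix of Equation (2), the generalized least squares estimator, and the log-determinant used by the $D$-criterion. The function names, the three design points, and the exponentially decaying covariance are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def information_matrix(X, Sigma):
    """M = X^T Sigma^{-1} X, as in Equation (2)."""
    return X.T @ np.linalg.solve(Sigma, X)

def gls_estimate(X, Sigma, Y):
    """beta_hat = (X^T Sigma^{-1} X)^{-1} X^T Sigma^{-1} Y."""
    M = information_matrix(X, Sigma)
    return np.linalg.solve(M, X.T @ np.linalg.solve(Sigma, Y))

def d_criterion(X, Sigma):
    """log det M(xi); a D-optimal design maximizes this value."""
    sign, logdet = np.linalg.slogdet(information_matrix(X, Sigma))
    return logdet if sign > 0 else -np.inf

# Hypothetical example: straight line f(x) = (1, x)^T observed at three points,
# with an assumed covariance exp(-|x_i - x_j|) between observations.
x = np.array([0.0, 1.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
Sigma = np.exp(-np.abs(x[:, None] - x[None, :]))

beta_true = np.array([2.0, 0.5])
Y = X @ beta_true                      # noise-free responses, for checking only
beta_hat = gls_estimate(X, Sigma, Y)
print(beta_hat, d_criterion(X, Sigma))
```

With noise-free responses the estimator recovers `beta_true` up to rounding, which is a quick sanity check that the formula is coded consistently.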

When the observations are correlated, the size of the design (number of samples to be taken) should be decided in advance. Furthermore, for an evolution study, the design variable is time, and thus it will be assumed that $x_{i} \neq x_{j}$ for all $i \neq j$ because there is no reason to take several observations on the same subject at the same time. When different responses are of interest (multiresponse models), the usual assumption was to consider that the $k$ responses observed on the same experimental unit under the conditions $x$, $y (x) = {(y_{1} (x), \dots, y_{k} (x))}^{T}$, were correlated, but the measures taken at different points, $y (x)$ and $y (x')$, were independent. However, when the design variable is time, this last assumption is no longer valid [2], and two types of correlation should be considered: a static or intra covariance structure between different responses observed at the same time and a longitudinal or inter correlation between the same type of response obtained at different times.

Thus, let us assume that we observe $k$ characteristics of one subject at different times, and for each time $t$ let $S (t)$ denote the covariance matrix of the sample $y (t) = (y_{1} (t), \dots, y_{k} (t))$, that is, $S (t) = \mathrm{Cov} [y_{1} (t), \dots, y_{k} (t)]$. It is usual to assume that the relation between the different variables is similar for every $t$; thus, a constant covariance $S$ will be considered (intra correlation). On the other hand, the covariance between the same type of observations taken at different points will be assumed to depend only on the distance between points, that is,

$$\mathrm{Cov} [y_{i} (t_{j}), y_{i} (t_{l})] = ρ (| t_{j} - t_{l} |), \quad i = 1, \dots, k; \; j, l = 1, \dots, n,$$

where $ρ$ is a known stationary covariance kernel (different proposals of these functions can be found, for instance, in [8,9]); thus, the longitudinal covariance will be the same for the $k$ different responses, $R = \mathrm{Cov} [y_{i} (t_{1}), \dots, y_{i} (t_{n})]$ $\forall i$. The assumption of a known covariance function may be a controversial issue; however, it becomes more acceptable when restricting to a local consideration of the model, thus finally obtaining locally optimal designs.

In previous studies [1,2], the double (inter and intra) covariance structure has been taken into account for measures over a single experimental unit through time. Now, a multisubject scenario will be considered, observing $N$ subjects at different times $t_{1}, \dots, t_{n}$ (design $ξ$). Balanced designs will be considered, that is, for each $t_{i}$ in $ξ$ several variables $Y_{1}, \dots, Y_{k}$ will be observed for every subject; for unbalanced designs, a procedure similar to the one employed in Example 2 of [1] can be employed. The aim will be to select the most informative design for the models describing the evolution of the variables.

The Kronecker product of the different covariance structures can be used to express the covariance matrix of the $N k n$ observations. Throughout this work, a non-trivial $R$ (but the same for each of the $k$ variables) will be assumed; therefore, for each subject it will be more convenient to use the order $y_{1} (t_{1}), \dots, y_{1} (t_{n}), \dots, y_{k} (t_{1}), \dots, y_{k} (t_{n})$, getting the covariance matrix $Σ_{0} = S \otimes R$. Furthermore, it is quite usual to assume as well that the $N$ subjects are independent; thus the covariance matrix of the $N k n$ observations can be expressed as

$$Σ = I_{N} \otimes S \otimes R = I_{N} \otimes Σ_{0}, \tag{3}$$

where $I_{N}$ is the identity matrix of order $N$. That is, $Σ$ is the block-diagonal (BD) matrix $\mathrm{BD} \{Σ_{0}, \dots, Σ_{0}\}$.
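The structure of Equation (3) can be sketched numerically as follows. The exponential kernel stands in for a generic stationary $ρ$, and the values of $S$, the time points, and $N$ are hypothetical choices made only for illustration.

```python
import numpy as np

def exp_kernel_matrix(times, lam=1.0):
    """R with entries exp(-lam |t_j - t_l|); an assumed example of a stationary kernel."""
    t = np.asarray(times, dtype=float)
    return np.exp(-lam * np.abs(t[:, None] - t[None, :]))

def full_covariance(S, R, N):
    """Return (Sigma, Sigma_0) with Sigma_0 = S kron R and Sigma = I_N kron Sigma_0."""
    Sigma0 = np.kron(S, R)
    return np.kron(np.eye(N), Sigma0), Sigma0

# Hypothetical values: k = 2 responses, n = 3 time points, N = 3 subjects.
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])
R = exp_kernel_matrix([1.0, 2.0, 4.0])
N = 3
Sigma, Sigma0 = full_covariance(S, R, N)

print(Sigma.shape)                     # (N*k*n, N*k*n)
print(np.allclose(Sigma[:6, :6], Sigma0), np.allclose(Sigma[:6, 6:12], 0.0))
```

With the ordering by response type within each subject, the off-diagonal block of $Σ_{0}$ is $S_{12} R$, so the covariance between $y_{1} (t_{1})$ and $y_{2} (t_{2})$ is `S[0, 1] * R[0, 1]`.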

Let us now fix the notation and the models that will be studied for the case of several subjects: let $y_{i j}^{(v)} = y_{i}^{(v)} (t_{j})$ denote the value of the $i$-th variable observed on the $v$-th subject at time $t_{j}$, $y_{i}^{(v)} = {(y_{i 1}^{(v)}, \dots, y_{i n}^{(v)})}^{T}$ the observations of type $i$ taken on the $v$-th subject, and $y^{(v)} = {({y_{1}^{(v)}}^{T}, \dots, {y_{k}^{(v)}}^{T})}^{T}$ all the observations taken on the $v$-th subject, ordered by type. Then $Σ$ in Equation (3) denotes the covariance matrix of the $N k n$-dimensional observation vector $Y = {({y^{(1)}}^{T}, \dots, {y^{(N)}}^{T})}^{T}$. It is convenient to recall here some properties of the Kronecker product of matrices that will be used later; they can be checked, for instance, in [10]:

(a) $(A \otimes B) \otimes C = A \otimes (B \otimes C)$.
(b) $(A \otimes B) (C \otimes D) = (A C) \otimes (B D)$. In particular, ${(A \otimes B)}^{- 1} = A^{- 1} \otimes B^{- 1}$ and ${(A \otimes B)}^{T} = A^{T} \otimes B^{T}$.
(c) If $A$ and $B$ are square matrices of respective dimensions $n_{A}$ and $n_{B}$, then $\det (A \otimes B) = \det {(A)}^{n_{B}} \det {(B)}^{n_{A}}$.
(d) $A \otimes B = P (B \otimes A) Q$ for certain permutation matrices $P$ and $Q$. If both $A$ and $B$ are square matrices, then $Q = P^{T}$.

Here $A$, $B$, $C$, $D$, $P$, and $Q$ are assumed to have the right dimensions for the products of matrices.
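The properties listed above can be checked numerically on small random matrices. The sketch below is only an illustration: the helper names are hypothetical, and the explicit construction of the permutation matrix $P$ for square $A$ and $B$ is one standard choice rather than something specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_invertible(n):
    """Random n x n matrix with a heavy diagonal, comfortably invertible for this seed."""
    return rng.normal(size=(n, n)) + 10.0 * n * np.eye(n)

def shuffle_permutation(p, q):
    """Permutation P such that A kron B = P (B kron A) P^T for A (p x p) and B (q x q)."""
    P = np.zeros((p * q, p * q))
    for i in range(p):
        for j in range(q):
            # Row index of (i, j) in A kron B, column index of (j, i) in B kron A.
            P[i * q + j, j * p + i] = 1.0
    return P

A, C = random_invertible(2), random_invertible(2)
B, D = random_invertible(3), random_invertible(3)
E = random_invertible(4)
P = shuffle_permutation(2, 3)
kron, inv, det = np.kron, np.linalg.inv, np.linalg.det

checks = {
    "associativity": np.allclose(kron(kron(A, B), E), kron(A, kron(B, E))),
    "mixed product": np.allclose(kron(A, B) @ kron(C, D), kron(A @ C, B @ D)),
    "inverse": np.allclose(inv(kron(A, B)), kron(inv(A), inv(B))),
    "transpose": np.allclose(kron(A, B).T, kron(A.T, B.T)),
    "determinant": np.isclose(det(kron(A, B)), det(A) ** 3 * det(B) ** 2),
    "commutation": np.allclose(kron(A, B), P @ kron(B, A) @ P.T),
}
print(checks)
```

In the determinant check the exponents are swapped relative to the matrix sizes ($n_{B} = 3$ for $\det A$ and $n_{A} = 2$ for $\det B$), which is exactly the pattern used later in the proofs.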

The aim is to obtain $D$-optimal designs for fitting the models of the $k$ variables involved, assuming $m$-parameter linear models with the same structure. The setup is somewhat similar to that in [11], where the double covariance structure was applied to a compositional-response model. That case was roughly equivalent to a one-subject model, and the samples of the same variable were assumed independent, while here a non-trivial correlation is considered between the different observations of the same variable, and multiple subjects are observed. Two scenarios will be considered:

Model I. Different models of the variables for each subject ($N k$ models):

$${\hat{y}}_{i}^{(v)} (t) = f {(t)}^{T} β_{i}^{(v)}, \tag{4}$$

with $β_{i}^{(v)} = {(β_{i 1}^{(v)}, \dots, β_{i m}^{(v)})}^{T}$, $i = 1, \dots, k$, $v = 1, \dots, N$.

Model II. The model of each variable is valid for all the subjects ($k$ models):

$${\hat{y}}_{i}^{(v)} (t) = f {(t)}^{T} β_{i}, \tag{5}$$

with $β_{i} = {(β_{i 1}, \dots, β_{i m})}^{T}$, $i = 1, \dots, k$, $v = 1, \dots, N$.

In the following, analytical results will be obtained for the two scenarios, beginning with Model I:

Theorem 1.

The $D$-optimal designs for the individual models of each variable in each subject are also $D$-optimal for Model I, given by Equation (4).

Proof.

If $n$ observations are taken for each variable and subject at times $t_{1}, \dots, t_{n}$, the model can be expressed as (1) with $β = {({β^{(1)}}^{T}, \dots, {β^{(N)}}^{T})}^{T}$, $β^{(v)} = {({β_{1}^{(v)}}^{T}, \dots, {β_{k}^{(v)}}^{T})}^{T}$, and $β_{i}^{(v)} = {(β_{i 1}^{(v)}, \dots, β_{i m}^{(v)})}^{T}$, and the design matrix is

$$X = \mathrm{BD} \{X_{0}, \dots, X_{0}\} = I_{N k} \otimes X_{0},$$

where $X_{0} = {(f (t_{1}), \dots, f (t_{n}))}^{T}$ is the design matrix for each individual model. Then, the information matrix (2) is given by

$$\begin{aligned} M &= X^{T} Σ^{- 1} X \\ &= (I_{N} \otimes I_{k} \otimes X_{0}^{T}) (I_{N} \otimes S^{- 1} \otimes R^{- 1}) (I_{N} \otimes I_{k} \otimes X_{0}) \\ &= I_{N} \otimes S^{- 1} \otimes M_{0}, \end{aligned}$$

where $M_{0} = X_{0}^{T} R^{- 1} X_{0}$ is the information matrix of the model of each variable for each subject. Thus $\det [M] = {(\det [S]^{- m} \det [M_{0}]^{k})}^{N}$, which finishes the proof since $S$ does not depend on the design. □
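The factorization used in this proof can be verified numerically. The following sketch assumes a straight-line regressor $f (t) = {(1, t)}^{T}$, an exponential kernel for $R$, and hypothetical values for $N$, $S$, and the time points; none of these choices come from the theorem itself.

```python
import numpy as np

def exp_kernel_matrix(times, lam=1.0):
    """Exponential covariance exp(-lam |t - t'|), used here only as an example of a non-trivial R."""
    t = np.asarray(times, dtype=float)
    return np.exp(-lam * np.abs(t[:, None] - t[None, :]))

# Illustrative sizes and values (assumptions).
N, k = 3, 2
times = np.array([1.0, 2.5, 6.0])
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])
R = exp_kernel_matrix(times)
X0 = np.column_stack([np.ones_like(times), times])   # f(t) = (1, t)^T, so m = 2
m = X0.shape[1]

# Model I: one parameter block per subject and variable.
X = np.kron(np.eye(N * k), X0)
Sigma = np.kron(np.eye(N), np.kron(S, R))

M = X.T @ np.linalg.solve(Sigma, X)
M0 = X0.T @ np.linalg.solve(R, X0)
M_kron = np.kron(np.eye(N), np.kron(np.linalg.inv(S), M0))
det_formula = (np.linalg.det(S) ** (-m) * np.linalg.det(M0) ** k) ** N

print(np.allclose(M, M_kron), np.isclose(np.linalg.det(M), det_formula))
```

Because $\det [M]$ is a monotone function of $\det [M_{0}]$ for fixed $S$, any design search can work with the small matrix $M_{0}$ alone.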

Theorem 2.

The parameters of the individual models (4) can be estimated independently, and their estimators do not depend on $S$. However, the variance of the set of parameter estimators for each subject does depend on $S$ because $\mathrm{Var} ({\hat{β}}^{(v)}) = S \otimes M_{0}^{- 1}$.

Proof.

The estimator of the parameter vector is

$$\begin{aligned} \hat{β} &= {(X^{T} Σ^{- 1} X)}^{- 1} X^{T} Σ^{- 1} Y \\ &= (I_{N} \otimes S \otimes M_{0}^{- 1}) (I_{N} \otimes I_{k} \otimes X_{0}^{T}) (I_{N} \otimes S^{- 1} \otimes R^{- 1}) Y \\ &= (I_{N} \otimes I_{k} \otimes W) Y, \end{aligned}$$

where $W = M_{0}^{- 1} X_{0}^{T} R^{- 1}$ is such that ${\hat{β}}_{i}^{(v)} = W y_{i}^{(v)}$ for all $i = 1, \dots, k$, $v = 1, \dots, N$, and $\mathrm{Var} (\hat{β}) = M^{- 1} = I_{N} \otimes S \otimes M_{0}^{- 1}$. □
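A short numerical sketch of this result: the joint estimator is computed under two different intra-covariance matrices and compared with the block-by-block estimates $W y_{i}^{(v)}$. The data are random numbers, and the sizes, the kernel, and the two values of the intra-covariance are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def gls(X, Sigma, Y):
    """Generalized least squares estimator (X^T Sigma^{-1} X)^{-1} X^T Sigma^{-1} Y."""
    return np.linalg.solve(X.T @ np.linalg.solve(Sigma, X),
                           X.T @ np.linalg.solve(Sigma, Y))

# Illustrative setting (assumptions): N = 3 subjects, k = 2 variables, n = 3 times.
N, k = 3, 2
times = np.array([1.0, 2.5, 6.0])
n = len(times)
R = np.exp(-np.abs(times[:, None] - times[None, :]))
X0 = np.column_stack([np.ones(n), times])             # f(t) = (1, t)^T
X = np.kron(np.eye(N * k), X0)
Y = rng.normal(size=N * k * n)                        # ordered by subject, then variable, then time

def sigma_for(s):
    """Return S = [[1, s], [s, 1]] and the full covariance I_N kron S kron R."""
    S = np.array([[1.0, s], [s, 1.0]])
    return S, np.kron(np.eye(N), np.kron(S, R))

S_a, Sigma_a = sigma_for(0.4)
S_b, Sigma_b = sigma_for(-0.7)

M0 = X0.T @ np.linalg.solve(R, X0)
W = np.linalg.solve(M0, np.linalg.solve(R, X0).T)     # W = M0^{-1} X0^T R^{-1}
beta_blockwise = np.concatenate([W @ y for y in Y.reshape(N * k, n)])

beta_a, beta_b = gls(X, Sigma_a, Y), gls(X, Sigma_b, Y)
var_a = np.linalg.inv(X.T @ np.linalg.solve(Sigma_a, X))
var_formula = np.kron(np.eye(N), np.kron(S_a, np.linalg.inv(M0)))

print(np.allclose(beta_a, beta_blockwise), np.allclose(beta_b, beta_blockwise))
print(np.allclose(var_a, var_formula))
```

The estimates agree for both choices of $S$, while the covariance matrix of the estimators changes with $S$ through the factor $S \otimes M_{0}^{- 1}$.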

Let us now pay attention to Model II given in (5). In this case the parameter vector is $\tilde{β} = {({\tilde{β}}_{1}^{T}, \dots, {\tilde{β}}_{k}^{T})}^{T}$, with ${\tilde{β}}_{i} = {({\tilde{β}}_{i 1}, \dots, {\tilde{β}}_{i m})}^{T}$. There are only $k \times m$ parameters since the model of each variable $Y_{i}$ is the same for all the subjects. For this reason, it will be convenient to place together all the observations of each variable, that is, to consider the following observations vector:

$$\tilde{Y} = {\Big( \underbrace{{y_{1}^{(1)}}^{T}, \dots, {y_{1}^{(N)}}^{T}}_{\text{obs. variable } 1}, \dots, \underbrace{{y_{k}^{(1)}}^{T}, \dots, {y_{k}^{(N)}}^{T}}_{\text{obs. variable } k} \Big)}^{T}$$

The following results can be derived, the first one similar to that of Theorem 1 but now for the second model:

Theorem 3.

The $D$-optimal designs for the individual models of each variable in each subject are also $D$-optimal for Model II, given by Equation (5).

Proof.

The design matrix $\tilde{X}$ corresponding to the ordering given by $\tilde{Y}$ will be

$$\tilde{X} = \begin{pmatrix} X_{0} & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ X_{0} & 0 & \cdots & 0 \\ 0 & X_{0} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & X_{0} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_{0} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & X_{0} \end{pmatrix} = A \otimes X_{0},$$

with $A = I_{k} \otimes 1_{N}$, where $1_{N}$ denotes a column vector of ones of length $N$.

Let $P$ be the permutation matrix of order $N k n$ that turns $Y$ into $\tilde{Y}$, that is, $P Y = \tilde{Y}$. Since $P$ in fact permutes the vectors of observations $y_{i}^{(v)}$, it can be expressed as $P = Q \otimes I_{n}$, with $Q$ the permutation matrix of order $N k$ expressed by the vector $v_{Q}$,

$$\begin{aligned} v_{Q} = ( & 1, k + 1, 2 k + 1, \dots, (N - 1) k + 1, \\ & 2, k + 2, 2 k + 2, \dots, (N - 1) k + 2, \\ & \dots \\ & k, 2 k, 3 k, \dots, N k ), \end{aligned}$$

meaning that for every row $i$ of $Q$ ($i = 1, \dots, N k$), we have $Q_{i, v_{Q} (i)} = 1$ and $Q_{i, j} = 0$ if $j \neq v_{Q} (i)$.

Now the covariance matrix of $\tilde{Y}$ is

$$\begin{aligned} \tilde{Σ} &= Σ_{\tilde{Y}} = Σ_{P Y} = P Σ_{Y} P^{T} \\ &= (Q \otimes I_{n}) [(I_{N} \otimes S) \otimes R] (Q^{T} \otimes I_{n}) \\ &= [Q (I_{N} \otimes S) Q^{T}] \otimes R \\ &= S \otimes I_{N} \otimes R, \end{aligned}$$

and the information matrix can be computed as

$$\begin{aligned} \tilde{M} &= {\tilde{X}}^{T} {\tilde{Σ}}^{- 1} \tilde{X} \\ &= (A^{T} \otimes X_{0}^{T}) [(S^{- 1} \otimes I_{N}) \otimes R^{- 1}] (A \otimes X_{0}) \\ &= [A^{T} (S^{- 1} \otimes I_{N}) A] \otimes (X_{0}^{T} R^{- 1} X_{0}) \\ &= N S^{- 1} \otimes M_{0}, \end{aligned}$$

with the last expression obtained using that $A = I_{k} \otimes 1_{N}$. Then, $\det [\tilde{M}] = N^{k m} \det [S]^{- m} \det [M_{0}]^{k}$, so maximizing $\det [\tilde{M}]$ amounts to maximizing $\det [M_{0}]$. □
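The permutation argument can be reproduced numerically. In the sketch below the vector $v_{Q}$ is built with zero-based indices, and the sizes, the matrix $S$, the kernel, and the straight-line regressor are illustrative assumptions.

```python
import numpy as np

# Illustrative setting (assumptions): N = 3 subjects, k = 2 variables, n = 3 times.
N, k = 3, 2
times = np.array([1.0, 2.5, 6.0])
n = len(times)
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])
R = np.exp(-np.abs(times[:, None] - times[None, :]))
X0 = np.column_stack([np.ones(n), times])             # f(t) = (1, t)^T
m = X0.shape[1]
M0 = X0.T @ np.linalg.solve(R, X0)

# Zero-based version of v_Q = (1, k+1, ..., (N-1)k+1, 2, k+2, ..., Nk).
v_Q = [v * k + i for i in range(k) for v in range(N)]
Q = np.zeros((N * k, N * k))
Q[np.arange(N * k), v_Q] = 1.0
P = np.kron(Q, np.eye(n))

# Covariance in the original ordering and after permuting to the ordering of Y tilde.
Sigma = np.kron(np.eye(N), np.kron(S, R))
Sigma_tilde = P @ Sigma @ P.T

# Model II design matrix, in the original ordering and in the permuted one.
X_orig = np.kron(np.ones((N, 1)), np.kron(np.eye(k), X0))
A = np.kron(np.eye(k), np.ones((N, 1)))
X_tilde = np.kron(A, X0)

M_tilde = X_tilde.T @ np.linalg.solve(Sigma_tilde, X_tilde)
det_formula = N ** (k * m) * np.linalg.det(S) ** (-m) * np.linalg.det(M0) ** k

print(np.allclose(Sigma_tilde, np.kron(S, np.kron(np.eye(N), R))))
print(np.allclose(P @ X_orig, X_tilde))
print(np.allclose(M_tilde, N * np.kron(np.linalg.inv(S), M0)))
print(np.isclose(np.linalg.det(M_tilde), det_formula))
```

The second check confirms that the same permutation that reorders the observations also turns the stacked per-subject design matrix into $A \otimes X_{0}$.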

Theorem 4.

Regarding Model II given by (5), the estimation of the parameters of the $i$-th response is the average of the corresponding estimations for each subject and does not depend on $S$ (but their covariance matrix, which is proportional to the inverse of the information matrix, does depend on $S$).

Proof.

The estimator of the parameter vector is

$$\begin{aligned} \hat{\tilde{β}} &= {({\tilde{X}}^{T} {\tilde{Σ}}^{- 1} \tilde{X})}^{- 1} {\tilde{X}}^{T} {\tilde{Σ}}^{- 1} \tilde{Y} \\ &= (S / N \otimes M_{0}^{- 1}) (A^{T} \otimes X_{0}^{T}) [(S^{- 1} \otimes I_{N}) \otimes R^{- 1}] \tilde{Y} \\ &= \{ [(S / N) A^{T} (S^{- 1} \otimes I_{N})] \otimes (M_{0}^{- 1} X_{0}^{T} R^{- 1}) \} \tilde{Y} \\ &= (I_{k} \otimes 1_{N}^{T} / N \otimes W) \tilde{Y}, \end{aligned}$$

where the last equality comes from $(S / N) A^{T} = (S \otimes I_{1} / N) (I_{k} \otimes 1_{N}^{T}) = S \otimes 1_{N}^{T} / N$.

Let us have a close look at this expression. It means that

$$\begin{aligned} {\hat{\tilde{β}}}_{i} &= (1_{N}^{T} / N \otimes W) {({y_{i}^{(1)}}^{T}, \dots, {y_{i}^{(N)}}^{T})}^{T} \\ &= \frac{1}{N} (W, \dots, W) {({y_{i}^{(1)}}^{T}, \dots, {y_{i}^{(N)}}^{T})}^{T} \\ &= \frac{1}{N} \sum_{v = 1}^{N} W y_{i}^{(v)}, \end{aligned}$$

which, taking into account that $W y_{i}^{(v)}$ is the estimation of the parameters of the model of the variable $Y_{i}$ for the $v$-th subject, finishes the proof. □
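The averaging property can be illustrated with random data. This is a minimal sketch: the array layout `Y[v, i, :]` for $y_{i}^{(v)}$, the sizes, and the values of $S$ and the time points are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative setting (assumptions): N = 4 subjects, k = 2 variables, n = 3 times.
N, k = 4, 2
times = np.array([1.0, 3.0, 6.0])
n = len(times)
S = np.array([[1.0, 0.3],
              [0.3, 1.0]])
R = np.exp(-np.abs(times[:, None] - times[None, :]))
X0 = np.column_stack([np.ones(n), times])             # f(t) = (1, t)^T
M0 = X0.T @ np.linalg.solve(R, X0)
W = np.linalg.solve(M0, np.linalg.solve(R, X0).T)     # W = M0^{-1} X0^T R^{-1}

Y = rng.normal(size=(N, k, n))                        # Y[v, i, :] plays the role of y_i^{(v)}
Y_tilde = Y.transpose(1, 0, 2).reshape(-1)            # grouped by variable, then subject

A = np.kron(np.eye(k), np.ones((N, 1)))
X_tilde = np.kron(A, X0)
Sigma_tilde = np.kron(S, np.kron(np.eye(N), R))
M_tilde = X_tilde.T @ np.linalg.solve(Sigma_tilde, X_tilde)
beta_tilde = np.linalg.solve(M_tilde, X_tilde.T @ np.linalg.solve(Sigma_tilde, Y_tilde))

# Individual estimates W y_i^{(v)}, then their average over subjects.
per_subject = np.einsum("mn,vin->vim", W, Y)
beta_average = per_subject.mean(axis=0).reshape(-1)

print(np.allclose(beta_tilde, beta_average))
print(np.allclose(np.linalg.inv(M_tilde), np.kron(S, np.linalg.inv(M0)) / N))
```

The second line of output checks the accompanying statement about the covariance of the estimators, which is $S \otimes M_{0}^{- 1} / N$ and therefore still depends on $S$.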

3. Optimal Designs for Evolution of MR and PS

There is general agreement that problem solving should be the main objective in school mathematics instruction [12,13]. It seems clear that this ability may be related to the capability of understanding the semantic structure of the statements of mathematical problems, and that it increases with age. Children may be able to solve real-world problems at an early age [14], but the acquisition of academic language comes later, after they master everyday language [15]. The relation between problem solving and linguistic comprehension is explored in [16]. However, apart from understanding the statement of the problem, the students must be able to construct a mental representation of it [17,18,19].

All these facts are discussed in [3], where a test on additive problems using expressions close to practical everyday language is proposed. That work studied the influence of mental representation (MR) on the ability to solve additive problems (PS) for children from 6 to 12 years old and found a model relating the two variables. Once this relationship has been established, the next step should be to find a convenient design plan for a follow-up study of the evolution of both characteristics. To this end, the results of Section 2 will be employed to obtain an optimal design plan for the observation of these two variables through time. The design points $t_{i}$ can denote any convenient temporal unit (school level, semester, etc.).

A constant variance $σ^{2}$ will be assumed for the observations of the two variables. Should they have different variances $σ_{1}^{2}$ and $σ_{2}^{2}$, the covariance structure would depend (by a constant term) on the ratio $σ_{1} / σ_{2}$, which has no influence on the computation of the optimal designs (see the discussion at the end of Section 2.1 of [2]). In this situation (constant variance), for ease of computation, $σ^{2} = 1$ will be assumed without loss of generality. Assuming constant covariance between $y_{i} (t)$ and $y_{j} (t)$ for every $i, j = 1, \dots, k$ may be arguable, mainly because the $k$ variables may refer to quite different characteristics and even use different scales. An alternative that may soften this problem would be using normalized values of the variables. However, the case $k = 2$ of this example is not controversial, since for $k = 2$ the intra-covariance matrix $S$ will have this shape:

$$S = \begin{pmatrix} 1 & s \\ s & 1 \end{pmatrix},$$

with $s = \mathrm{Cov} [y_{1} (t), y_{2} (t)]$ assumed constant for all $t$. Thus, when observing $Y_{1} = M R$ and $Y_{2} = P S$, the covariance matrix for (the observations taken on) each student will be

$$Σ_{0} = S \otimes R = \begin{pmatrix} R & s R \\ s R & R \end{pmatrix}, \tag{6}$$

and the previous results can be applied. From Theorems 1 to 4, the $D$-optimal designs can be computed assuming just one variable and one subject. An inter-correlation structure usually employed when the measurements are taken on the same individual is the exponential covariance [20], decreasing with the increasing distance in time between measurements,

$$\mathrm{Cov} [y (t), y (t + d)] = e^{- λ d}, \tag{7}$$

where the parameter $λ$ is characteristic of the individual. When there is no reason to think that the parameter may vary very much between individuals, the same characteristic (reference individual) is used for all of them. In this work, $λ = 1$ will be assumed, which is a typical choice.

Two types of evolution models will be considered:

Linear regression model: $\hat{y} = β_{0} + β_{1} t$ ,
that is, $f (t) = {(1, t)}^{T}$. Since there are $m = 2$ parameters, at least two observations are needed.
○ If $n = 2$, it will be convenient to express the design as $ξ = \{t, t + d\}$, where $t$ is the first observation and $d > 0$ the distance between the two samples, with $t + d \leq t_{\max}$, the maximum value for performing the tests. Then, assuming the inter-correlation $R$ given by (7),

$$X_{0} = \begin{pmatrix} f {(t)}^{T} \\ f {(t + d)}^{T} \end{pmatrix} = \begin{pmatrix} 1 & t \\ 1 & t + d \end{pmatrix}, \quad R = \begin{pmatrix} 1 & e^{- d} \\ e^{- d} & 1 \end{pmatrix} \tag{8}$$

and

$$\begin{aligned} M_{0} &= X_{0}^{T} R^{- 1} X_{0} \\ &= \frac{1}{1 - e^{- 2 d}} \begin{pmatrix} 1 & 1 \\ t & t + d \end{pmatrix} \begin{pmatrix} 1 & - e^{- d} \\ - e^{- d} & 1 \end{pmatrix} \begin{pmatrix} 1 & t \\ 1 & t + d \end{pmatrix} \\ &= \frac{1}{1 + e^{- d}} \begin{pmatrix} 2 & 2 t + d \\ 2 t + d & 2 t (t + d) + \frac{d^{2}}{1 - e^{- d}} \end{pmatrix} . \end{aligned}$$

Thus, $\det (M_{0}) = d^{2} / (1 - e^{- 2 d})$ is an increasing function of $d$ , and the optimal design will be the one having the maximum distance $d$ between the observations, therefore taking the samples at the beginning and at the end of the period of study.
○ When $n = 3$, let us consider the design $ξ = \{t, t + d_{1}, t + d_{1} + d_{2}\}$, with $d_{1}, d_{2} > 0$ and $t + d_{1} + d_{2} \leq t_{\max}$. Now

$X_{0} = (\begin{matrix} 1 & t \\ 1 & t + d_{1} \\ 1 & t + d_{1} + d_{2} \end{matrix}), R = (\begin{matrix} 1 & e^{- d_{1}} & e^{- (d_{1} + d_{2})} \\ e^{- d_{1}} & 1 & e^{- d_{2}} \\ e^{- (d_{1} + d_{2})} & e^{- d_{2}} & 1 \end{matrix})$

and

$\det (M_{0}) = \frac{2 e^{2 d_{1} + d_{2}} (e^{d_{2}} (e^{d_{1}} - 1) d_{1}^{2} + (- 2 e^{d_{2}} + e^{d_{1} + d_{2}} + 1) d_{2} d_{1} + e^{d_{2}} (e^{d_{1}} - 1) d_{2}^{2})}{(e^{d_{1} + d_{2}} - 1) (e^{d_{1}} - 2 e^{d_{2}} + e^{2 d_{1} + d_{2}})} .$

Assuming a minimum distance $d_{0}$ between consecutive sample points, the maximum is attained for $d_{1} = t_{m a x} - d_{0}$ , and $d_{2} = d_{0}$ . Thus, the first test should be taken at the beginning and the rest at the end, with the minimum possible distance $d_{0}$ between these last ones.
Exponential regression model: $\hat{y} = β_{0} e^{β_{1} t}$ .
After linearizing, the model can be expressed as $f (t) = e^{β_{1} t} {(1, β_{0} t)}^{T}$ , which depends on the unknown values of the parameters. In this case, the D-optimal designs will depend as well on these values; thus, they will be in fact locally optimal, that is, good for (or near to) those nominal values used in the computation. It is well known that the optimal designs will not depend on the parameters that ‘appear linearly’ in the model [21], that is, $β_{0}$ in this case, while they will depend on the ‘non-linear parameter’ $β_{1}$ . Thus, an initial value will be needed just for this last one. Again, the cases of two and three observations will be studied:
○ $n = 2 \Rightarrow ξ = \{t, t + d\}$. The determinant of the information matrix is

$\det (M_{0}) = \frac{β_{0}^{2} d^{2} e^{2 (β_{1} + 1) d + 4 β_{1} t}}{e^{2 d} - 1},$

and it is clear that the values $t$ and $d$ maximizing this determinant will not depend on $β_{0}$ . Figure 1 shows the determinant of the information matrix after removing $β_{0}^{2}$ and using the nominal value $β_{10} = 1$ for the non-linear parameter. For primary school levels ( $t_{m a x} = 6$ ) and assuming a minimum distance $d_{0} = 1$ between tests, the recommended design uses $t = 5$ and $d = 1$ , that is, making the tests the last two years.
○ $n = 3 \Rightarrow ξ = \{t, t + d_{1}, t + d_{1} + d_{2}\}$. The determinant of $M_{0}$ is

$$\begin{aligned} \det (M_{0}) = {} & \frac{β_{0}^{2} e^{2 (β_{1} + 1) d_{1} + d_{2} + 4 β_{1} t}}{(e^{d_{1} + d_{2}} - 1) (e^{d_{1}} - 2 e^{d_{2}} + e^{2 d_{1} + d_{2}})} \\ & \times \big[ e^{d_{2}} d_{1}^{2} (- 2 e^{β_{1} d_{2}} + e^{2 β_{1} d_{2} + d_{1}} + e^{d_{1}}) \\ & \quad - 2 d_{2} d_{1} e^{β_{1} d_{2}} (- e^{β_{1} d_{1}} - e^{(β_{1} + 1) d_{2} + d_{1}} + e^{β_{1} d_{1} + (β_{1} + 1) d_{2}} + e^{d_{2}}) \\ & \quad + d_{2}^{2} e^{(2 β_{1} + 1) d_{2}} (- 2 e^{β_{1} d_{1}} + e^{(2 β_{1} + 1) d_{1}} + e^{d_{1}}) \big], \end{aligned}$$

and when removing $β_{0}^{2}$ and assuming $t_{\max} = 6$ and $d_{0} = 1$, the maximum for $β_{10} = 1$ is attained when $t = 4$, $d_{1} = 1$, and $d_{2} = 1$, that is, testing the students in the last three courses.
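The two-point cases discussed above can be reproduced with a short numerical sketch. It evaluates $\det (M_{0})$ directly from $X_{0}$ and the exponential covariance (7), compares it with the closed forms for $n = 2$, and runs a simple grid search. The function names, the integer grid of school years starting at $t = 1$, and the nominal value $β_{0} = 1$ used in the search are assumptions made for illustration; the design function accepts any number of time points, but only the $n = 2$ cases are checked here.

```python
import itertools
import numpy as np

def det_m0(f, times, lam=1.0):
    """det(X0^T R^{-1} X0) for regressors f and covariance exp(-lam |t - t'|)."""
    t = np.asarray(times, dtype=float)
    X0 = np.array([f(ti) for ti in t])
    R = np.exp(-lam * np.abs(t[:, None] - t[None, :]))
    return np.linalg.det(X0.T @ np.linalg.solve(R, X0))

def linear(t):
    """Straight-line model, f(t) = (1, t)^T."""
    return np.array([1.0, t])

def make_exponential(beta0, beta1):
    """Linearized exponential model f(t) = e^{beta1 t} (1, beta0 t)^T at nominal values."""
    return lambda t: np.exp(beta1 * t) * np.array([1.0, beta0 * t])

def best_two_point_design(f, t_min, t_max, d0, step=1.0):
    """Grid search over designs {t, t + d} with d >= d0; the grid is an illustrative assumption."""
    grid = np.arange(t_min, t_max + step / 2, step)
    candidates = [(t, s - t) for t, s in itertools.combinations(grid, 2)
                  if s - t >= d0 - 1e-12]
    return max(candidates, key=lambda td: det_m0(f, [td[0], td[0] + td[1]]))

# Closed forms for n = 2, evaluated at arbitrary test values.
t, d, beta0, beta1 = 2.0, 3.0, 2.0, 1.0
closed_linear = d ** 2 / (1 - np.exp(-2 * d))
closed_exp = (beta0 ** 2 * d ** 2 * np.exp(2 * (beta1 + 1) * d + 4 * beta1 * t)
              / (np.exp(2 * d) - 1))
print(np.isclose(det_m0(linear, [t, t + d]), closed_linear))
print(np.isclose(det_m0(make_exponential(beta0, beta1), [t, t + d]), closed_exp))

# Grid search with t_max = 6 and d0 = 1, as in the text.
best_lin = best_two_point_design(linear, t_min=1.0, t_max=6.0, d0=1.0)
best_exp = best_two_point_design(make_exponential(1.0, 1.0), t_min=1.0, t_max=6.0, d0=1.0)
print(best_lin, best_exp)
```

On this grid the linear model selects the two extremes of the interval, while the exponential model with nominal $β_{1} = 1$ selects $t = 5$, $d = 1$. Changing the nominal value passed to `make_exponential` is a simple way to explore how strongly the locally optimal design depends on it.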

4. Discussion

In the last few years, there has been a remarkable increase in the relevance of the design of experiments in social science areas [22]. Major issues in the education research area must be treated on a sequential multiscale temporal level. Educational policies such as curricula, reduction in class size, programs for students’ support, etc. are developed considering educational theories based on solid experimentation [23]. In addition, large-scale experiments may be unattainable, and some authors warn against drawing generalized conclusions from experiments in education [24]. There is general agreement on trying to “maximize the scientific benefit using the resources available for an investigation” [25]; thus, the optimal design of experiments becomes a key piece of the research process.

Randomized experiments with multilevel implications have been used by educational researchers in order to determine the effect of some treatment through time. One of the most used techniques in this line is the sequential multiple assignment randomized trial (SMART) [23,26,27]. This approach gives information about the effectiveness of some intervention or treatment through time, but it does not discriminate factor variables [25,28]. As an alternative, factorial designs, with the feature of determining the influence of factors, have also been applied for similar purposes [25,29]. These factorial designs have been used as well to provide a relation between four different mathematical problem representations and the problem solving abilities of elementary school students from grades 1 to 3 [30]. Nevertheless, some limitations regarding the independence of the variables, the quantity of considered factors, or the presence of more than two levels per factor are addressed in [25]. Most of these studies assume independent observations; few consider some kind of relation between samples (for example, the intraclass correlation between observations within a cluster in [25,31]), but not complex correlation structures like the ones derived from the consideration of multiresponse models such as the ones studied in this work.

In educational research, observations are rarely independent, as pointed out in [32]. Mixed-effect regression models cover this and other issues, such as correlated data organized in a multilevel or hierarchical structure, missing data, variability, etc. [33,34,35]. Applications of these models in educational projects can be found in [32], where the correlation between the results of two tests taken before and after a treatment is studied, controlled for a variable with three levels. The levels correspond to three selected difficult topics to study, and a model selection procedure based on the AIC criterion is applied to conclude. In [36], mixed linear regression analysis is applied to a large database to establish the correlation between the student’s relationship and their academic performance. Finally, ref. [37] evaluated the impact of sequences of parents’ input on children’s language outcomes. Although all these applications are longitudinal or level studies, they are focused on selecting the best model for prediction, and none of them addresses the issue of optimizing data collection over time.

Despite being less known in the education area, generalized estimating equations (GEEs) are introduced in [38] as an alternative for the analysis of cross-sectional clustered data with repeated measurements over time [39,40,41]. That work presents GEEs as a generalization of general linear models and shows that their performance is similar to that of multilevel models (random effects or mixed models) and ordinary least squares. However, some limitations have to be considered, for example the impossibility of applying classical model selection techniques based on likelihood estimation, the lack of non-random data, and the amount of missing data. In addition, the number of clusters should be relatively high, and the observations in different clusters must be independent, although within-cluster observations may be correlated, as noted in [40].

Again, it is necessary to highlight that none of these approaches seem to be fully appropriate to conduct experiments involving multiresponse and multisubject models with repeated observations in time, with potential correlation structures as the ones described in this paper. The novel approach presented here seems the most convenient for studies that could be similar to the one described in the example.

The relation between the variables ‘Mental Representation’ and ‘Mathematical Problem Resolution’ in primary school students was extensively studied in [3]. Now, the theory developed in Section 2 has been applied to that case to find the best design (the times at which the students should be tested) in order to gain insight into the evolution of the two variables.

Assuming a sensible hypothesis about the correlation structure, two scenarios have been considered: one with different models of the variables for each subject and another assuming the same model of each variable for every subject, with interesting theoretical results obtained especially in the second case. Two types of evolution models have been considered as well, in order to stress that the optimal allocation of the tests may depend very much on the assumed model. Under the linear regression model, the best design places the tests near the extremes of the evaluation period. However, when the evolution is described by an exponential model, the results may be quite different, since the optimal designs are very sensitive to the initial value of the parameters. The widely used exponential inter-correlation has been assumed. The preliminary study in [3] found that the group of the youngest (first-year) students had great variability, and thus it was not very informative. For this reason, when a design contains any temporal point belonging to the first school year (for instance, when assuming a linear trend evolution), it will be convenient to delay these tests so that the analysis interval starts in the second year of the primary school period.

A robustness study has been carried out for the exponential evolution model in order to check whether the optimal designs obtained are sensitive to the choice of the nominal value of $\beta_{1}$. It has been found that in fact the optimal designs can vary very much; for instance if $n = 2$, the best designs found vary from the ones that take the extremes of the design interval when $\beta_{10} = 0.1$ or the school years 4 and 6 when $\beta_{10} = 0.5$ to those choosing the last temporal points when $\beta_{10} \geq 1$. A similar variability of results is obtained for $n = 3$: when $\beta_{10} = 0.1$, the proposal is to make one test at the beginning of the design interval and the other two at the end, and when $\beta_{10} = 0.5$ assuming $d_{0} = 1$, the most convenient school years to perform the tests are 3, 4, and 6.
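The kind of sensitivity check described above can be sketched numerically. The following is only an illustration, not the authors' code: it assumes a mean function $\beta_{0} e^{\beta_{1} t}$, an exponential correlation kernel $\exp(-\rho |t_i - t_j|)$ with an arbitrary $\rho$, a grid of school years 1 to 6, and a single response for a single subject. The function names and all numeric values are hypothetical, so the designs it returns need not match those reported in the paper.

```python
import itertools
import numpy as np

def info_matrix(times, beta0, beta1, rho=1.0):
    """Information matrix F^T Sigma^{-1} F for the ASSUMED model beta0*exp(beta1*t)."""
    t = np.asarray(times, dtype=float)
    # Gradient of the assumed mean function with respect to (beta0, beta1).
    F = np.column_stack([np.exp(beta1 * t), beta0 * t * np.exp(beta1 * t)])
    # Assumed exponential correlation between observations taken at different times.
    Sigma = np.exp(-rho * np.abs(t[:, None] - t[None, :]))
    return F.T @ np.linalg.solve(Sigma, F)

def best_design(grid, n, beta0, beta1, rho=1.0):
    """Exhaustive search for the n distinct grid points that maximize det(M)."""
    return max(
        itertools.combinations(grid, n),
        key=lambda d: np.linalg.det(info_matrix(d, beta0, beta1, rho)),
    )

if __name__ == "__main__":
    grid = [1, 2, 3, 4, 5, 6]  # hypothetical school years
    for b1 in (0.1, 0.5, 1.0):  # nominal values of beta1 to compare
        print(b1, best_design(grid, n=2, beta0=1.0, beta1=b1))
```

Repeating the search over several nominal values of $\beta_{1}$ shows directly whether the selected time points move, which is the question the robustness study addresses.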

Optimal designs have been computed for the $D$-optimality criterion, which is the most used, and two popular 2-parameter evolution models have been considered assuming always exponential inter-correlation. However, the procedure can be immediately extended to other optimality criteria, evolution models, and correlation kernels, provided that these can be assumed similar for all the variables.

5. Conclusions

The need to study the problem-solving ability of primary school students, its relation with other variables, and the evolution of all of them was the inspiration for this work, which builds on the previous study [3]. In order to choose the best temporal points at which samples should be taken to maximize the information in the collected data, a general procedure for several variables and subjects has been developed from the point of view of the theory of optimal design of experiments.

It should be noted that the results obtained in Section 2 are quite general and could be applied in other types of studies involving several subjects, with a number $k > 2$ of variables of interest. Depending on the cost of every type of measurement and the available budget, sometimes it might be convenient to select the variables that will be observed at a specific point instead of sampling all of them. The reason may be to save the cost of those measurements that are not considered so important or informative. Cost constraints could be incorporated into the model in a similar way as described in Chapter 5 of [42]. For instance, when two types of variables are to be observed, the problem can be stated as follows: two types of measurements $y_{1}$, $y_{2}$ can be made for every experimental unit, with costs $c_{1}$ and $c_{2}$, respectively. In the general case, every time that a subject is measured entails some cost $c$. For the total cost there are three possibilities:

Measure $y_{1}$, with cost $c_{1}$. Total cost: $C_{1} = c + c_{1}$;
Measure $y_{2}$, with cost $c_{2}$. Total cost: $C_{2} = c + c_{2}$;
Measure both $y_{1}$ and $y_{2}$. Now the overall cost is $C_{3} = c + c_{1} + c_{2}$.

Possible scenarios are described by $t_{x} = \{t, x_{1}, x_{2}\}$, with $0 \leq t \leq t_{max}$ and $x_{1}$, $x_{2}$ in $\{0, 1\}$ and $x_{1} + x_{2} > 0$, where $x_{i} = 1$ when test $y_{i}$ is made at a time $t$, and $x_{i} = 0$ otherwise. The best designs fitting this budget may be non-balanced. In this case, the covariance matrices could be obtained as described in Example 2 of [1]. Ref. [42] discusses a procedure for studying this issue, but only for one subject and considering intra-correlation and not inter-correlation between samples, which are assumed independent. When there are more subjects and the two types of correlation are non-trivial, the problem becomes far more complex to deal with.
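The cost bookkeeping for these scenarios can be illustrated with a short sketch. It is not part of the original study: the function names, the candidate time points, and the budget are hypothetical, and the enumeration only lists the measurement plans that fit a budget without ranking them by any optimality criterion.

```python
from itertools import product

def scenario_cost(x1, x2, c, c1, c2):
    """Cost of one visit: fixed cost c plus the cost of each test performed."""
    if x1 + x2 == 0:
        raise ValueError("at least one test must be made (x1 + x2 > 0)")
    return c + x1 * c1 + x2 * c2

def feasible_plans(times, budget, c, c1, c2):
    """Yield (plan, total cost) for every non-empty plan within the budget.

    A plan maps each candidate time to (x1, x2), or to None if no visit is made.
    """
    options = [None, (1, 0), (0, 1), (1, 1)]
    for choice in product(options, repeat=len(times)):
        total = sum(scenario_cost(*x, c, c1, c2) for x in choice if x is not None)
        if 0 < total <= budget:
            yield dict(zip(times, choice)), total

if __name__ == "__main__":
    # Hypothetical costs: c = 1 per visit, c1 = 2 for y1, c2 = 3 for y2.
    for plan, total in feasible_plans([1, 2], budget=6, c=1, c1=2, c2=3):
        print(plan, total)
```

Plans in which different time points carry different subsets of tests are the non-balanced designs mentioned above; a full treatment would evaluate each feasible plan with the chosen optimality criterion.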

For this study, it has been assumed that the evolution models were similar for the different variables. Should these models be different, the covariance and information matrices would be much more complex, as would the computation of the optimal designs. The assumptions of equal intra-correlation at all the design points and equal inter-correlation for all the variables are quite sensible. If either of them (or both) did not hold, the covariance matrices could not be expressed by Kronecker products, and the optimal designs for the global model would not be the same as the ones for the individual models of each variable for each subject; in fact, this global optimal design may be extremely difficult to compute.
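The computational convenience of the Kronecker structure can be seen in a small numerical sketch. It relies on two standard properties of Kronecker products: for $A$ of size $k \times k$ and $B$ of size $n \times n$, $\det(A \otimes B) = \det(A)^{n} \det(B)^{k}$ and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$. The matrices, the value of $\rho$, and the ordering of the stacked observations below are illustrative assumptions, not values taken from the study.

```python
import numpy as np

def exp_kernel(times, rho):
    """Exponential correlation between observations at the given times."""
    t = np.asarray(times, dtype=float)
    return np.exp(-rho * np.abs(t[:, None] - t[None, :]))

times = [2.0, 4.0, 6.0]              # hypothetical design points
S_time = exp_kernel(times, rho=0.5)  # assumed correlation across time points
S_var = np.array([[1.0, 0.3],        # hypothetical correlation between two responses
                  [0.3, 1.0]])

# Covariance of the stacked observations (ordering assumed: response, then time).
Sigma = np.kron(S_var, S_time)

k, n = S_var.shape[0], S_time.shape[0]
det_direct = np.linalg.det(Sigma)
det_factored = np.linalg.det(S_var) ** n * np.linalg.det(S_time) ** k

inv_direct = np.linalg.inv(Sigma)
inv_factored = np.kron(np.linalg.inv(S_var), np.linalg.inv(S_time))

print(np.isclose(det_direct, det_factored), np.allclose(inv_direct, inv_factored))
```

Because the determinant and the inverse factorize in this way, the part that depends on the time points can be handled separately from the part that depends on the responses. Without the Kronecker structure no such separation is available, and the full covariance matrix has to be handled as a whole.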

Author Contributions

Conceptualization, M.R.-H., R.E.P. and J.M.R.-D.; methodology, J.M.R.-D.; software, J.M.R.-D.; validation, M.R.-H., R.E.P. and J.M.R.-D.; formal analysis, J.M.R.-D.; investigation, J.M.R.-D.; resources, M.R.-H., R.E.P. and J.M.R.-D.; data curation, M.R.-H., R.E.P. and J.M.R.-D.; writing—original draft preparation, J.M.R.-D.; writing—review and editing, M.R.-H., R.E.P. and J.M.R.-D.; visualization, M.R.-H., R.E.P. and J.M.R.-D.; supervision, M.R.-H., R.E.P. and J.M.R.-D.; project administration, J.M.R.-D.; funding acquisition, J.M.R.-D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish Ministerio de Economía y Competitividad and Junta de Castilla y León (Projects ‘MTM2016-80539-C2-2-R’ and ‘SA080P17’, ‘SA105P20’).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Rodríguez-Díaz, J.M.; Sánchez-León, G. Efficient parameter estimation in multiresponse models measuring radioactivity retention. Radiat. Environ. Biophys. 2019, 58, 167–182.
2. Rodríguez-Díaz, J.M.; Sánchez-León, G. Optimal designs for multiresponse models with double covariance structure. Chemometr. Intell. Lab. Syst. 2019, 189, 1–7.
3. Rodríguez-Hernández, M.; Pruneda, R.E.; Rodríguez-Díaz, J.M. Statistical Analysis of the Evolutionary Effects of Language Development in the Resolution of Mathematical Problems in Primary School Education. Mathematics 2021, 9, 1081.
4. Fedorov, V.V.; Hackl, P. Model-Oriented Design of Experiments; Springer: New York, NY, USA, 1997.
5. Pukelsheim, F. Optimal Design of Experiments; SIAM: Philadelphia, PA, USA, 2006.
6. Atkinson, A.C.; Donev, A.N.; Tobias, R.D. Optimum Experimental Designs, with SAS; Oxford University Press: Oxford, UK, 2007.
7. Rodríguez-Díaz, J.M.; Sánchez-León, G. Design optimality for models defined by a system of ordinary differential equations. Biometr. J. 2014, 56, 886–900.
8. Dette, H.; Pepelyshev, A.; Zhigljavsky, A. Design for linear regression models with correlated errors. In Handbook of Design and Analysis of Experiments; CRC Press: Boca Raton, FL, USA, 2015; pp. 237–276.
9. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006.
10. Harville, D.A. Matrix Algebra from a Statistician’s Perspective; Springer: New York, NY, USA, 1997.
11. Rodríguez-Díaz, J.M.; Rivas-López, M.J.; Santos-Martín, M.T.; Mariñas Collado, I. Optimal designs for a linear-model compositional response. Stoch. Environ. Res. Risk Assess. 2020, 34, 139–148.
12. NCTM. Principles to Actions: Ensuring Mathematical Success for All; National Council of Teachers of Mathematics: Reston, VA, USA, 2014.
13. Cockcroft, W.H. Mathematics Counts: Report of the Committee of Inquiry into the Teaching of Mathematics in Schools Under the Chairmanship of W.H. Cockcroft; Technical Report; Her Majesty’s Stationery Office and the Queen’s Printer for Scotland: London, UK, 1982.
14. Carpenter, T.P.; Ansell, E.; Franke, M.L.; Fennema, E.; Weisbeck, L. Models of Problem Solving: A Study of Kindergarten Children’s Problem-Solving Processes. J. Res. Math. Educ. 1993, 24, 428–441.
15. Cummins, J. Language, Power, and Pedagogy: Bilingual Children in the Crossfire; Multilingual Matters Ltd.: Clevedon, UK, 2000.
16. Dixon, J.A.; Moore, C.F. The developmental role of intuitive principles in choosing mathematical strategies. Dev. Psychol. 1996, 32, 241–253.
17. De Corte, E.; Verschaffel, L.; De Win, L. Influence of rewording verbal problems on children’s problem representations and solutions. J. Educ. Psychol. 1985, 77, 460–470.
18. Hegarty, M.; Mayer, R.E.; Monk, C.A. Comprehension of arithmetic word problems: A comparison of successful and unsuccessful problem solvers. J. Educ. Psychol. 1995, 87, 18–32.
19. Pape, S.J. Compare word problems: Consistency hypothesis revisited. Contemp. Educ. Psychol. 2003, 28, 396–421.
20. Cressie, N.; Wikle, C.K. Statistics for Spatio-Temporal Data; John Wiley and Sons: Hoboken, NJ, USA, 2011.
21. Hill, P.D.H. D-optimal designs for partially nonlinear regression models. Technometrics 1980, 22, 275–276.
22. Jackson, M.; Cox, D.R. The principles of experimental design and their application in sociology. Annu. Rev. Sociol. 2013, 39, 27–49.
23. Raudenbush, S.W.; Schwartz, D. Randomized experiments in education, with implications for multilevel causal inference. Annu. Rev. Stat. Its Appl. 2020, 7, 177–208.
24. Cook, T.D. Randomized experiments in educational policy research: A critical examination of the reasons the educational evaluation community has offered for not doing them. Educ. Eval. Policy Anal. 2002, 24, 175–199.
25. Dziak, J.J.; Nahum-Shani, I.; Collins, L.M. Multilevel factorial experiments for developing behavioral interventions: Power, sample size, and resource considerations. Psychol. Methods 2012, 17, 153.
26. Murphy, S.A. An experimental design for the development of adaptive treatment strategies. Stat. Med. 2005, 24, 1455–1481.
27. Almirall, D.; Kasari, C.; McCaffrey, D.F.; Nahum-Shani, I. Developing optimized adaptive interventions in education. J. Res. Educ. Eff. 2018, 11, 27–34.
28. Flay, B.R.; Collins, L.M. Historical review of school-based randomized trials for evaluating problem behavior prevention programs. Ann. Am. Acad. Political Soc. Sci. 2005, 599, 115–146.
29. Collins, L.M.; Dziak, J.J.; Li, R. Design of experiments with multiple independent variables: A resource management perspective on complete and reduced factorial designs. Psychol. Methods 2009, 14, 202.
30. Elia, I.; Gagatsis, A.; Demetriou, A. The effects of different modes of representation on the solution of one-step additive problems. Learn. Instr. 2007, 17, 658–672.
31. Cho, S.-J.; Preacher, K.J.; Bottge, B.A. Detecting Intervention Effects in a Cluster-Randomized Design Using Multilevel Structural Equation Modeling for Binary Responses. Appl. Psychol. Meas. 2015, 39, 627–642.
32. Theobald, E. Students are rarely independent: When, why, and how to use random effects in discipline-based education research. CBE—Life Sci. Educ. 2018, 17, rm2.
33. Raudenbush, S.W. Educational applications of hierarchical linear models: A review. J. Educ. Stat. 1988, 13, 85–116.
34. McCulloch, C.E.; Searle, S.R. Generalized, Linear, and Mixed Models; John Wiley & Sons: Hoboken, NJ, USA, 2004.
35. Gałecki, A.; Burzykowski, T. Linear mixed-effects model. In Linear Mixed-Effects Models Using R; Springer: New York, NY, USA, 2013; pp. 245–273.
36. Zhou, H.; Jiang, S.; Liu, X. Regression analysis of intelligent education based on linear mixed effect model. J. Ambient Intell. Humaniz. Comput. 2021, 1–13.
37. Silvey, C.; Demir-Lira, Ö.E.; Goldin-Meadow, S.; Raudenbush, S.W. Effects of time-varying parent input on children’s language outcomes differ for vocabulary and syntax. Psychol. Sci. 2021, 32, 536–548.
38. Huang, F.L. Analyzing cross-sectionally clustered data using generalized estimating equations. J. Educ. Behav. Stat. 2022, 47, 101–125.
39. Burton, P.; Gurrin, L.; Sly, P. Extending the simple linear regression model to account for correlated responses: An introduction to generalized estimating equations and multi-level mixed modelling. Stat. Med. 1998, 17, 1261–1291.
40. Ghisletta, P.; Spini, D. An introduction to generalized estimating equations and an application to assess selectivity effects in a longitudinal study on very old individuals. J. Educ. Behav. Stat. 2004, 29, 421–437.
41. Muth, C.; Bales, K.L.; Hinde, K.; Maninger, N.; Mendoza, S.P.; Ferrer, E. Alternative models for small samples in psychological research: Applying linear mixed effects models and generalized estimating equations to repeated measures data. Educ. Psychol. Meas. 2016, 76, 64–87.
42. Fedorov, V.V.; Gagnon, R.; Leonov, S. Optimal Design for Multiple Responses with Variance Depending on Unknown Parameters; GSK BDS Technical Report 2001–2003; GlaxoSmithKline Pharmaceuticals: Collegeville, PA, USA, 2003.

Figure 1. Determinant of the information matrix of $\xi = \{t, t + d\}$ for the exponential model for $\beta_{0} = \beta_{1} = 1$.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Rodríguez-Díaz, J.M.; Pruneda, R.E.; Rodríguez-Hernández, M. Design Plan for an Evolution Study of Related Characteristics of a Population. Mathematics 2022, 10, 792. https://doi.org/10.3390/math10050792
