Research

32 pages, 469 KiB

Open Access Article

Estimation and Inference for Spatio-Temporal Single-Index Models

by Hongxia Wang, Zihan Zhao, Hongxia Hao and Chao Huang

Mathematics 2023, 11(20), 4289; https://doi.org/10.3390/math11204289 - 14 Oct 2023

Viewed by 551

To better fit real data, this paper considers both spatio-temporal correlation and heterogeneity in building the model. To overcome the “curse of dimensionality” in nonparametric methods, we improve the estimation method of the single-index model and combine it with the correlation and heterogeneity of the spatio-temporal model to obtain an effective estimation method. Assuming that the spatio-temporal process satisfies the α-mixing condition, a nonparametric procedure is developed for estimating the variance function based on a fully nonparametric function or a dimension-reduction structure, and the resulting estimator is consistent. A reweighted estimator of the parametric component is then obtained by taking the estimated variance function into account. The rate of convergence and the asymptotic normality of the new estimators are established under mild conditions. Simulation studies are conducted to evaluate the efficacy of the proposed methodologies, and a case study on estimating the air quality evaluation index in Nanjing is provided for illustration.

(This article belongs to the Special Issue Statistical Modeling for Analyzing Data with Complex Structures)

18 pages, 869 KiB

Open Access Article

Partially Functional Linear Models with Linear Process Errors

by Yanping Hu and Zhongqi Pang

Mathematics 2023, 11(16), 3581; https://doi.org/10.3390/math11163581 - 18 Aug 2023

Viewed by 650

Abstract

In this paper, we focus on the partial functional linear model with linear process errors generated by random variables that are not necessarily independent. Based on Mercer’s theorem and the Karhunen–Loève expansion, we give estimators of the slope parameter and the coefficient function in the model, establish the asymptotic normality of the parameter estimator, and discuss the weak convergence, with rates, of the proposed estimators. In addition, a penalized estimator of the parameter is defined using the SCAD penalty, and its oracle property is investigated. The finite-sample behavior of the proposed estimators is also analysed via simulations.

13 pages, 1884 KiB

Open Access Article

Modeling the Cigarette Consumption of Poor Households Using Penalized Zero-Inflated Negative Binomial Regression with Minimax Concave Penalty

by Yudhie Andriyana, Rinda Fitriani, Bertho Tantular, Neneng Sunengsih, Kurnia Wahyudi, I Gede Nyoman Mindra Jaya and Annisa Nur Falah

Mathematics 2023, 11(14), 3192; https://doi.org/10.3390/math11143192 - 20 Jul 2023

Viewed by 799

Abstract

Cigarettes are the second largest contributor to the food poverty line. Several indicators suggest that poor people consume cigarettes despite having a minimal income. In this study, we investigate the factors that influence poor people to be active smokers. Because consumption is recorded as count data with excess zeros, the data are overdispersed, so a standard Poisson regression is not appropriate. At the same time, the factors entering the model need to be selected simultaneously. We therefore propose a zero-inflated negative binomial (ZINB) regression with a minimax concave penalty (MCP) to determine the dominant factors influencing cigarette consumption in poor households. The data used in this study are microdata from the National Socioeconomic Survey (SUSENAS) conducted in March 2019 in East Java Province, Indonesia. The results show that poor households with a male head of household, no education, work in the informal sector, many adult household members, and social assistance tend to consume more cigarettes than others. Additionally, cigarette consumption decreases as the age of the head of household increases.
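For readers unfamiliar with the minimax concave penalty named in this abstract, the following is a minimal sketch of its standard textbook form for a single coefficient, compared with the lasso penalty. It is not the authors' implementation; the function names and the default `gamma = 3.0` are assumptions chosen for illustration.

```python
def lasso_penalty(beta: float, lam: float) -> float:
    """L1 (lasso) penalty: grows linearly without bound."""
    return lam * abs(beta)


def mcp_penalty(beta: float, lam: float, gamma: float = 3.0) -> float:
    """Minimax concave penalty for one coefficient.

    Standard form: lam*|b| - b^2 / (2*gamma) while |b| <= gamma*lam,
    and the constant gamma*lam^2 / 2 beyond that point.
    gamma = 3.0 is an illustrative default, not a value from the paper.
    """
    if lam < 0 or gamma <= 1:
        raise ValueError("require lam >= 0 and gamma > 1")
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


if __name__ == "__main__":
    # Small coefficients are penalized almost like the lasso;
    # large coefficients pay a flat cost, which reduces shrinkage bias.
    for beta in (0.0, 0.5, 1.0, 3.0, 10.0):
        print(beta, lasso_penalty(beta, 1.0), mcp_penalty(beta, 1.0))
```

Because the penalty flattens beyond `gamma * lam`, large coefficients are not shrunk further, which is the usual motivation for preferring MCP over the lasso in variable selection.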

28 pages, 753 KiB

Open Access Article

ADMM-Based Differential Privacy Learning for Penalized Quantile Regression on Distributed Functional Data

by Xingcai Zhou and Yu Xiang

Mathematics 2022, 10(16), 2954; https://doi.org/10.3390/math10162954 - 16 Aug 2022

Cited by 1 | Viewed by 1272

Abstract

The Alternating Direction Method of Multipliers (ADMM) is a widely used machine learning tool in distributed environments. In this paper, we propose an ADMM-based differential privacy learning algorithm (FDP-ADMM) for penalized quantile regression on distributed functional data. The FDP-ADMM algorithm resists adversarial attacks to avoid possible privacy leakage in distributed networks. It is designed using functional principal analysis, an approximate augmented Lagrange function, the ADMM algorithm, and a privacy policy based on the Gaussian mechanism with time-varying variance. It is also a noise-resilient, convergent, and computationally effective distributed learning algorithm, even under high privacy protection. The theoretical analysis yields privacy and convergence guarantees and a privacy–utility trade-off: a weaker privacy guarantee results in better utility. Evaluations on simulated distributed functional datasets demonstrate the effectiveness of the FDP-ADMM algorithm even under a high privacy guarantee.
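Two building blocks named in this abstract can be sketched generically: the check loss that defines quantile regression, and the addition of Gaussian noise to a vector before it is shared. The sketch below is not the FDP-ADMM algorithm; the function names are hypothetical, and the noise scale `sigma` is passed in directly rather than calibrated to any privacy budget.

```python
import random
from typing import List, Sequence


def check_loss(residual: float, tau: float) -> float:
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def gaussian_perturb(values: Sequence[float], sigma: float,
                     rng: random.Random) -> List[float]:
    """Add independent N(0, sigma^2) noise to each entry.

    Assumption: sigma is supplied by the caller. Choosing sigma so that a
    formal differential privacy guarantee holds is outside this sketch.
    """
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    return [v + rng.gauss(0.0, sigma) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    # A time-varying noise scale, here simply shrinking with the iteration
    # index k (an illustrative schedule, not the one used in the paper).
    for k in range(1, 4):
        print(k, gaussian_perturb([1.0, -2.0], sigma=1.0 / k, rng=rng))
    print(check_loss(2.0, 0.25), check_loss(-2.0, 0.25))
```

With `tau = 0.5` the check loss is half the absolute error, so median regression is the special case most readers already know.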

27 pages, 2037 KiB

Open Access Article

Deep Neural Networks for Form-Finding of Tensegrity Structures

by Seunghye Lee, Qui X. Lieu, Thuc P. Vo and Jaehong Lee

Mathematics 2022, 10(11), 1822; https://doi.org/10.3390/math10111822 - 25 May 2022

Cited by 8 | Viewed by 2246

Abstract

Analytical paradigms have limited conventional form-finding methods for tensegrities; an innovative approach is therefore needed. This paper proposes a new form-finding method based on state-of-the-art deep learning techniques. One of the statical paradigms, the force density method, is replaced by trained deep neural networks to obtain the necessary information about tensegrities. The method is based on the differential evolution algorithm, and neither the eigenvalue decomposition of the force density matrix nor the process of the equilibrium matrix is needed to find feasible sets of nodal coordinates. Three well-known tensegrity examples, a 2D two-strut, a 3D truncated tetrahedron, and an icosahedron tensegrity, are presented for numerical verification. The ReLU and Leaky ReLU activation functions give better results than ELU and SELU. Moreover, the results of the proposed method are in good agreement with the analytical super-stable lines. The three examples show that the proposed method yields more uniform final tensegrity shapes and much faster convergence than the conventional method.
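The four activation functions compared in this abstract have standard scalar definitions, sketched below for reference. The leaky slope of 0.01 and the ELU `alpha` of 1.0 are common defaults assumed here, not values taken from the paper; the SELU constants are the usual published ones.

```python
import math

# Standard SELU constants (self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return x if x > 0 else 0.0


def leaky_relu(x: float, slope: float = 0.01) -> float:
    """Leaky ReLU: small linear slope for negative inputs (slope assumed)."""
    return x if x > 0 else slope * x


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: alpha * (exp(x) - 1) for negative inputs."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x: float) -> float:
    """Scaled ELU with fixed alpha and scale constants."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, relu(x), leaky_relu(x), elu(x), selu(x))
```

All four agree with the identity (up to the SELU scale) for positive inputs and differ only in how they treat negative inputs.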

15 pages, 1373 KiB

Open Access Article

A New Case-Mix Classification Method for Medical Insurance Payment

by Hongliang Liu, Jinpeng Tan, Kyongson Jon and Wensheng Zhu

Mathematics 2022, 10(10), 1640; https://doi.org/10.3390/math10101640 - 11 May 2022

Viewed by 1525

Abstract

Rapidly rising medical expenses can be controlled by a well-designed medical insurance payment system that ensures the stability and development of medical insurance funds. At present, China is exploring the reform of its medical insurance payment system. One of the significant tasks is to establish an appropriate reimbursement model for disease treatment expenses, so as to meet patients' needs for medical services. In this paper, we propose a case-mix decision tree method that considers the homogeneity within the same case subgroup as well as the heterogeneity between different case subgroups. The optimal case mix is determined by maximizing the inter-group difference and minimizing the intra-group difference. To handle the instability of tree-based methods with small amounts of data, we propose a multi-model ensemble decision tree method. This method first extracts and merges the inherent rules of the data using a stacking-based ensemble learning method, then generates a new sample set by aggregating the original data with the additional samples obtained by applying these rules, and finally trains the case-mix decision tree on the augmented dataset. The proposed method ensures both the interpretability of the grouping rules and the stability of the grouping. Experimental results on real-world data demonstrate that our case-mix method can provide reasonable medical insurance payment standards and appropriate medical insurance compensation payments for different patient groups.

13 pages, 549 KiB

Open Access Article

GR-GNN: Gated Recursion-Based Graph Neural Network Algorithm

by Kao Ge, Jian-Qiang Zhao and Yan-Yong Zhao

Mathematics 2022, 10(7), 1171; https://doi.org/10.3390/math10071171 - 04 Apr 2022

Cited by 1 | Viewed by 2070

Abstract

With artificial intelligence and big data now pervasive on the internet, unstructured, materialized, graph-structured network data, such as social networks, knowledge graphs, and compound molecules, have gradually entered various business scenarios. An urgent problem in industry is how to perform feature extraction, transformation, and computation on graph-structured data to solve downstream tasks such as node classification and graph classification in real business scenarios. This paper therefore proposes a gated recursion-based graph neural network (GR-GNN) algorithm for tasks such as depth-dependent node feature extraction and node classification on graph-structured data. The GRU neural network unit is used to complete the node classification task and thereby construct the GR-GNN model. To verify the accuracy, effectiveness, and superiority of the algorithm, it is compared on the open datasets Cora, CiteseerX, and PubMed with the classical graph neural network baselines GCN, GAT, and GraphSAGE. The experimental results show that, on the validation set, the accuracy and target loss of GR-GNN are better than or equal to those of the baselines; in terms of convergence speed, GR-GNN is comparable to GCN and faster than the other algorithms. These results indicate that the proposed GR-GNN algorithm has high accuracy and computational efficiency and broad applicability.

21 pages, 583 KiB

Open Access Article

Communication-Efficient Distributed Learning for High-Dimensional Support Vector Machines

by Xingcai Zhou and Hao Shen

Mathematics 2022, 10(7), 1029; https://doi.org/10.3390/math10071029 - 23 Mar 2022

Cited by 1 | Viewed by 1408

Abstract

Distributed learning has received increasing attention in recent years and is particularly needed in the era of big data. For the support vector machine (SVM), a powerful binary classification tool, we propose a novel, efficient distributed sparse learning algorithm, the communication-efficient surrogate likelihood support vector machine (CSLSVM), in high dimensions with convex or nonconvex penalties, based on the communication-efficient surrogate likelihood (CSL) framework. We extend the CSL to distributed SVMs without the need to smooth the hinge loss or the gradient of the loss. For a CSLSVM with the lasso penalty, we prove that its estimator achieves a near-oracle property for $l_{1}$-penalized SVM estimators on whole datasets. For a CSLSVM with the smoothly clipped absolute deviation penalty, we show that its estimator enjoys the oracle property, and we use local linear approximation (LLA) to solve the optimization problem. Furthermore, we show that the LLA is guaranteed to converge to the oracle estimator, even in our distributed framework and the ultrahigh-dimensional setting, if an appropriate initial estimator is available. The proposed approach is highly competitive with the centralized method within a few rounds of communication. Numerical experiments provide supporting evidence.
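The hinge loss and the smoothly clipped absolute deviation (SCAD) penalty mentioned in this abstract have standard forms, sketched below together with the SCAD derivative, which is the quantity a local linear approximation uses as a per-coefficient weight. This is a generic illustration, not the CSLSVM algorithm; the function names are hypothetical and `a = 3.7` is the conventional default assumed here.

```python
def hinge_loss(label: int, score: float) -> float:
    """SVM hinge loss max(0, 1 - y*f(x)) for a label in {-1, +1}."""
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    return max(0.0, 1.0 - label * score)


def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty for one coefficient (a > 2; 3.7 is a common default)."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    b = abs(beta)
    if b <= lam:
        return lam * b  # lasso-like near zero
    if b <= a * lam:
        # quadratic transition between the linear and flat regions
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0  # flat for large coefficients


def scad_derivative(beta: float, lam: float, a: float = 3.7) -> float:
    """Derivative of SCAD in |beta|; LLA uses it as an adaptive L1 weight."""
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    b = abs(beta)
    if b <= lam:
        return lam
    return max(a * lam - b, 0.0) / (a - 1.0)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0, 3.7, 10.0):
        print(beta, scad_penalty(beta, 1.0), scad_derivative(beta, 1.0))
    print(hinge_loss(1, 0.3), hinge_loss(-1, 0.3), hinge_loss(1, 2.0))
```

In an LLA step, each coefficient is given the weight `scad_derivative` evaluated at the current estimate, so coefficients that are already large receive zero weight and are no longer shrunk.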

14 pages, 393 KiB

Open Access Article

Dynamic Analysis of a Stochastic Rumor Propagation Model with Regime Switching

by Fangju Jia and Chunzheng Cao

Mathematics 2021, 9(24), 3277; https://doi.org/10.3390/math9243277 - 16 Dec 2021

Cited by 3 | Viewed by 2091

Abstract

We study the rumor propagation model with regime switching, considering both colored and white noises. First, by constructing suitable Lyapunov functions, sufficient conditions for an ergodic stationary distribution and for extinction are obtained. We then obtain the threshold $R^{s}$, which determines the extinction of the rumor and the existence of its stationary distribution. Finally, numerical simulations are performed to verify our model. The results indicate that there is a unique ergodic stationary distribution when $R^{s} > 1$, and that the rumor becomes extinct exponentially with probability one when $R^{s} < 1$.
