The Application of ResNet-34 Model Integrating Transfer Learning in the Recognition and Classification of Overseas Chinese Frescoes

Gao, Le; Zhang, Xin; Yang, Tian; Wang, Baocang; Li, Juntao

doi:10.3390/electronics12173677


by Le Gao ¹, Xin Zhang ¹, Tian Yang ²,*, Baocang Wang ³ and Juntao Li ⁴

¹ The Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529000, China

² Institute for Guangdong Qiaoxiang Studies, Wuyi University, Jiangmen 529000, China

³ The State Key Laboratory of Integrated Service Networks, The Cryptographic Research Center, Xidian University, Xi’an 710071, China

⁴ School of Economics and Management, Wuyi University, Jiangmen 529000, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(17), 3677; https://doi.org/10.3390/electronics12173677

Submission received: 19 July 2023 / Revised: 27 August 2023 / Accepted: 28 August 2023 / Published: 31 August 2023

(This article belongs to the Special Issue Artificial Intelligence Technologies and Applications)


Abstract:

The unique characteristics of frescoes on overseas Chinese buildings can attest to the integration and historical background of Chinese and Western cultures. Reasonable analysis and preservation of overseas Chinese frescoes can provide sustainable development for culture and history. This research adopts image analysis technology based on artificial intelligence and proposes a ResNet-34 model and method integrating transfer learning. This deep learning model can identify and classify the source of the frescoes of the emigrants, and effectively deal with problems such as the small number of fresco images on the emigrants’ buildings, poor quality, difficulty in feature extraction, and similar pattern text and style. The experimental results show that the training process of the model proposed in this article is stable. On the constructed Jiangmen and Haikou fresco JHD datasets, the final accuracy is 98.41%, and the recall rate is 98.53%. The above evaluation indicators are superior to classic models such as AlexNet, GoogLeNet, and VGGNet. It can be seen that the model in this article has strong generalization ability and is not prone to overfitting. It can effectively identify and classify the cultural connotations and regions of frescoes.

Keywords:

artificial intelligence; deep learning; transfer learning; image classification; fresco


1. Introduction

1.1. The Cultural and Historical Background of Overseas Chinese

Ancestral buildings of overseas Chinese in China reflect the Mediterranean, Indian, and Nanyang architectural cultures that overseas Chinese brought back to their ancestral country along the Maritime Silk Road. The multicultural frescoes painted on these buildings also have a strong local painting style, which has spread to multiple cultural regions in the southeastern coastal areas of China [1]. This article presents a comparative study of the pattern style and color of Jiangmen overseas Chinese frescoes, representing the Wuyi Overseas Chinese Cultural District, and Haikou overseas Chinese frescoes, representing the Qionglei Cultural District. The architectural fresco styles of Jiangmen and Haikou exemplify the integration of local culture and Western culture. The frescoes on overseas Chinese buildings are a direct and reliable external language that reflects local cultural origins. However, because the two regions differ in overseas Chinese culture, historical background, and level of Sino-foreign exchange, their architectural frescoes differ in painting style, decorative patterns, and styling design.

1.2. Fresco Culture

Fresco painting, also known as wall art, is one of the earliest forms of painting in human history. In recent years, many scholars have studied the recognition and protection of mural paintings from different angles and have achieved encouraging results [2,3,4,5]. The digital protection and restoration of frescoes can preserve fresco information intact and replicate it as needed. By collecting and processing high-quality fresco images and establishing a systematic database of the originals, frescoes can be displayed virtually, which also gives future generations a clear window onto the historical background of the frescoes and the culture of overseas Chinese locations. Lerme et al. [6] proposed a set of artificial intelligence image virtual reconstruction technologies that achieved the digital restoration of frescoes. Jiang et al. [7] addressed the issues of natural weathering and detachment in Dunhuang frescoes and utilized computer virtual restoration to assist in the replication and protection of frescoes. Dondi et al. [8] used a large dataset of hundreds of thousands of artificially generated fresco images as reference objects to identify frescoes of different colors in different periods and to match and repair severely damaged frescoes.

1.3. Fresco Recognition and Classification Methods

Many studies recognize and classify frescoes using traditional methods, such as feature extraction followed by a classifier. For example, Cao et al. [9] addressed the small number of mural images and the difficulty of feature extraction by proposing an Inception-v3 model fused with transfer learning to identify and classify mural paintings, which effectively extracted high-level features. Tang et al. [10] used extracted fresco image features as a measure of image similarity to express the overall similarity between two images. Teixeira et al. [11] used feature extraction algorithms to select key points that match fresco fragments to reference images in order to repair damaged frescoes. Although traditional fresco recognition and classification methods can extract certain features, the diversity of textures and colors in frescoes prevents them from learning richer features, which limits their generalization ability in feature extraction and classification. In this paper, the ResNet-34 model fused with transfer learning extracts the rich texture and color features of fresco paintings more effectively, and transfer learning mitigates the problems of limited image quantity and insufficient generalization ability. As deep learning has advanced, the convolutional neural network (CNN) has achieved excellent results in image recognition and classification. CNNs are widely used in fields such as medical image segmentation [12], resource prediction images [13], human motion recognition [14], cell images [15], and other image classification tasks. Recently, CNNs have also been gradually applied to fresco image restoration, reconstruction, and classification.

1.4. Heritage Conservation and Cultural Significance

Previous fresco recognition and classification methods extract features insufficiently, and traditional manual identification often fails to reach a consensus on where a fresco originated, so the source areas of frescoes should be identified in a more scientific and convincing way. A method that can scientifically and effectively identify the source of frescoes is therefore particularly important. In this paper, a large number of overseas Chinese frescoes were collected in the Jiangmen and Haikou areas, and a pretrained ResNet-34 network model integrating transfer learning is proposed that effectively identifies the area to which a fresco belongs, completing the task of identifying and classifying overseas Chinese frescoes. This research is an interdisciplinary application of artificial intelligence technology and cultural anthropology. By identifying and categorizing the frescoes of overseas Chinese in Haikou and Jiangmen, this study investigates how Nanyang and North American cultures shaped the style of the frescoes, as well as the proportions in which local cultural elements appear. The study has practical and cultural significance for protecting and restoring century-old overseas Chinese cultural heritage. Compared with the manual data processing of traditional anthropological field research, it strengthens the depth and breadth of cultural research and is both innovative and practical.

2. Related Theories

2.1. Convolutional Neural Network (CNN)

CNN is one of the most prominent neural networks in deep learning [16,17,18,19,20]. The method was first proposed by LeCun et al. [21], and in recent years it has been rapidly improved and widely applied. CNN is at the core of many deep learning algorithms. Its name and structure are inspired by the human brain and mimic the way biological neurons transmit signals to each other. A CNN generally consists of input, convolution, normalization, activation, pooling, fully connected, and softmax output operations. LeNet marked the official debut of CNN [21], followed by AlexNet and VGG [22,23]. ResNet is now widely adopted [24], and CNN continues to improve and has been extensively applied in various fields [25,26,27,28,29,30].

2.2. Convolution Layer

The convolution layer of a convolutional neural network operates over two spatial dimensions, height and width, so two-dimensional convolution is the most common form. In general, a convolution operation applies a two-dimensional kernel array (also known as a convolution kernel) to the input data to obtain new two-dimensional data. The kernel slides over the input data, and each position yields one output value. Repeating this over the whole input extracts one feature of the data, so k convolution kernels extract k features. The operation is shown in Formula (1), where X is the input matrix and W is the convolution kernel.

S_{i, j} = (X * W)_{i, j} = \sum_{m} \sum_{n} x_{i + m, j + n} \, w_{m, n} \qquad (1)
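The sliding-window sum in Formula (1) can be illustrated with a minimal NumPy sketch. The function name `conv2d_valid`, the toy 4 × 4 input, and the 2 × 2 kernel are assumptions made for illustration; stride 1 and no padding are assumed as well.

```python
import numpy as np

def conv2d_valid(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Slide kernel w over input x (stride 1, no padding), as in Formula (1)."""
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    s = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # S(i, j) = sum_m sum_n x(i + m, j + n) * w(m, n)
            s[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return s

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input
w = np.array([[1.0, 0.0], [0.0, -1.0]])        # toy 2x2 kernel
feature_map = conv2d_valid(x, w)               # 3x3 output feature map
```

Each kernel produces one feature map, so a layer with several kernels would simply call this function once per kernel.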

2.3. ResNet-34 Network

The residual network (ResNet) is a CNN model proposed by four researchers that performs well in image classification and object recognition [24]. In principle, the deeper the network, the more information it can capture and the richer its features. In practice, however, experiments show that as a plain network grows deeper, optimization becomes harder, and the accuracy on both test data and training data decreases. If the gradient factor passed between layers is between 0 and 1, the gradient shrinks layer by layer and eventually vanishes. Conversely, if the factor is greater than 1, the gradient grows layer by layer and eventually explodes. Simply stacking more layers therefore tends to cause network degradation. To make deeper networks easier to train, He et al. proposed a new network structure, namely, ResNet [31]. The advantage of the residual network is that it alleviates the vanishing gradient problem in a neural network. ResNet-34 is used as the primary network in this article. Table 1 lists the specific parameters of ResNet-34 [24].
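The effect described above can be made concrete with a small sketch. It first multiplies a per-layer gradient factor across 34 layers, then shows a residual block whose identity shortcut passes the input through unchanged when the residual branch contributes nothing. This is a simplified illustration under stated assumptions: the block uses dense matrices rather than the 3 × 3 convolutions and batch normalization of the real ResNet-34, and the names `residual_block`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

# A per-layer factor below 1 shrinks the gradient; a factor above 1 inflates it.
shrink = 0.9 ** 34
grow = 1.1 ** 34

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is two linear maps with a ReLU in between."""
    f = relu(x @ w1) @ w2      # residual branch F(x)
    return relu(f + x)         # identity shortcut adds the input back

rng = np.random.default_rng(0)
x = rng.random(8)              # non-negative toy feature vector
zero = np.zeros((8, 8))
# With F(x) = 0 the block reduces to the identity mapping.
y_identity = residual_block(x, zero, zero)
```

Because the shortcut adds `x` directly, a deeper network can always fall back to the identity mapping, which is the intuition behind why residual connections ease the training of deep models.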

The pooling layer is mainly used for dimensionality reduction, which improves the fault tolerance of the model by reducing the number of parameters. Maximum pooling selects the maximum feature value in the region, which can better retain the texture features; average pooling selects the average characteristic value in the region, which can better retain the background characteristics. The calculation of average and maximum pooling is shown in Figure 1.
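As a small illustration of the two pooling variants, the sketch below applies non-overlapping 2 × 2 max and average pooling to a toy 4 × 4 feature map. The helper name `pool2d` and the sample values are assumptions for illustration only.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping size x size pooling; height and width must be divisible by size."""
    h, w = x.shape
    # Split the map into (h/size) x (w/size) blocks of size x size pixels.
    blocks = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [9, 2, 1, 0],
              [3, 4, 5, 6]], dtype=float)
max_pooled = pool2d(x, mode="max")   # keeps the strongest response per block
avg_pooled = pool2d(x, mode="avg")   # keeps the mean response per block
```

Both outputs are 2 × 2, a fourfold reduction in the number of values passed to the next layer.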

2.4. Image Feature Extraction

The frescoes of overseas Chinese residences are rich in color, and there are significant differences in the color expression of frescoes in the Jiangmen and Haikou regions. This article will use the histogram method to extract the color features of images in frescoes and analyze the color features in fresco images through calculations of different color ratios. The definition of the color histogram is shown in Formula (2):

H(m) = \frac{n_{m}}{N}, \quad m = 0, 1, \ldots, L - 1 \qquad (2)

In Formula (2), m is a grayscale level; n_m is the number of pixels at grayscale level m; N is the total number of pixels; and L is the total number of grayscale levels. Because fresco images are painted on walls, their texture is more complex than that of typical natural images. This study uses local binary patterns (LBP) to calculate the texture features of frescoes. LBP features remain unchanged under monotonic grayscale transformations, and the algorithm can provide more than 90% of the features of fresco images. The LBP algorithm is defined in Formula (3):

LBP(x_c, y_c) = \sum_{p = 0}^{P - 1} 2^{p} \, s(i_{p} - i_{c}) \qquad (3)

In Formula (3), (x_c, y_c) is the central element of the neighborhood, with pixel value i_c; i_p is the pixel value of the p-th neighboring element; P is the number of neighboring elements; and s(x) is the sign function defined in Formula (4).

s(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases} \qquad (4)
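The histogram of Formula (2) and the LBP code of Formulas (3) and (4) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names, the 3 × 3 neighborhood with P = 8, and the clockwise neighbor ordering are assumptions.

```python
import numpy as np

def gray_histogram(img, levels=256):
    """H(m) = n_m / N for m = 0..levels-1 (Formula (2))."""
    counts = np.bincount(img.ravel(), minlength=levels)
    return counts / img.size

def lbp_code(patch):
    """LBP code of the center pixel of a 3x3 patch (Formulas (3) and (4))."""
    center = patch[1, 1]
    # Eight neighbors, clockwise from the top-left corner (an assumed ordering).
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    # s(i_p - i_c) = 1 when i_p >= i_c; comparing directly avoids uint8 underflow.
    return sum((1 << p) for p, i_p in enumerate(neighbors) if i_p >= center)

img = np.array([[10, 20, 30],
                [40, 25, 10],
                [ 5, 25, 60]], dtype=np.uint8)
hist = gray_histogram(img)   # normalized grayscale histogram
code = lbp_code(img)         # one LBP code for the center pixel
```

Computing `lbp_code` at every pixel and taking the histogram of the resulting codes gives an LBP texture histogram of the kind the text refers to.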

2.5. Transfer Learning

In most tasks in machine learning, deep learning, and data mining, we assume that the data used in training and inference follow the same distribution and come from the same feature space. In practice, however, this assumption rarely holds, and two problems are often encountered: (1) the number of labeled training samples is limited; and (2) the data distribution changes. In such cases, transfer learning is a good choice: the knowledge in domain B is transferred to domain A, which improves the classification performance in domain A without spending a lot of time labeling the data in domain A. Because the number of frescoes in this study is limited and far smaller than the 10 million samples in ImageNet, it is difficult to train a deep network model from scratch, since good training results require sufficient data. To solve this problem, we use transfer learning [32]. Because the parameters of a trained model transfer well across features, they can be reused directly when extracting features from other datasets, which can not only improve the efficiency of network model development but also strengthen model performance and accelerate training. This paper adopts a fine-tuning migration strategy. During the training process, only the softmax layer is changed, and the other layers load the weight parameters that ResNet-34 has learned on the ImageNet dataset.

3. Materials and Methods

3.1. Study Area and Datasets

Jiangmen is a famous homeland of diaspora Chinese in Guangdong province, China, while Haikou is also an ancestral homeland for overseas Chinese. The geographical locations and sampling points of the two cities are shown in Figure 2. The experimental dataset of this study was captured and collected by a high-resolution SLR camera and high-definition smart phone. The classification of frescoes was studied using the styles and colors of fresco patterns from different regions as sample features.

The Jiangmen data were collected from overseas Chinese architecture in Jiangmen. Its architectural style is a combination of Chinese and Western styles, mainly baroque style, imitation Renaissance style, Roman arcade style, Ionian column style, and a small amount of south China traditional style. Jiangmen architecture constitutes the historical imprint and urban memory of the changes in the Jiangmen urban architectural landscape caused by overseas Chinese investing in real estate in their homelands. The experimental data of Haikou were collected in the old arcade street of Haikou. The mural history reflected in the architectural heritage of the arcades in Haikou is closely related to the early overseas Chinese in Haikou who went overseas to make a living. Overseas Chinese who made a living in Nanyang returned home to invest and build on a large scale. The Haikou arcade has a strong Nanyang style. With an area of 25,000 square kilometers, there are nearly 500 arcade buildings. In this study, a total of 2385 overseas Chinese architectural frescoes were collected in the Jiangmen and Haikou regions.

3.2. Data Preprocessing

Data preprocessing eliminates duplicate, erroneous, and poor-quality images. To facilitate the construction of the CNN and network training, each sample image is resized to 224 × 224 pixels. Generally, the larger the amount of training data, the higher the recognition accuracy of the system. A neural network with a large number of parameters therefore needs a large amount of training data. In practice, however, it is difficult to find enough original data for training. In this case, the data must be expanded before being provided to the model, a step known as data enhancement.

Dataset enhancement is mainly carried out to increase the amount of training data, improve the generalization ability and robustness of the model, and reduce overfitting. The data enhancement methods adopted in this paper (as shown in Figure 3) mainly included Gaussian noise [33], contrast enhancement [34], image sharpening [35], and image rotation [36], which expanded the database to a total of 11,380 images. We constructed the Jiangmen Haikou fresco image dataset (JHD). The cross-validation method was used for dataset division. For the preprocessed dataset, the images of each category were randomly divided into three parts: 80% for the training set, 10% for the test set, and 10% for the validation set. CNN is the most popular neural network model for image classification. In image classification, the model is given a set of images, each labeled with a single category, and must predict the categories of a new set of test images; the accuracy of these predictions is then measured. This research involved two different styles of wall painting. We used ResNet-34 to train the model and identify the image features. After fine-tuning the parameters, our model could better classify the architectural frescoes.
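A few of the enhancement operations named above can be sketched with NumPy alone. This is a rough illustration under stated assumptions: the function names, the noise level `sigma`, the contrast `factor`, and the restriction to 90-degree rotations are all hypothetical choices, the input is a random stand-in image, and sharpening is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

def add_gaussian_noise(img, sigma=10.0):
    """Add zero-mean Gaussian noise and clip back to the valid pixel range."""
    noisy = img.astype(float) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def enhance_contrast(img, factor=1.5):
    """Stretch pixel values away from the image mean."""
    mean = img.mean()
    out = (img.astype(float) - mean) * factor + mean
    return np.clip(out, 0, 255).astype(np.uint8)

def rotate_90(img, k=1):
    """Rotate the image by k * 90 degrees in the height-width plane."""
    return np.rot90(img, k=k, axes=(0, 1))

# Random stand-in for a 224 x 224 RGB fresco image.
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
augmented = [add_gaussian_noise(image), enhance_contrast(image), rotate_90(image)]
```

Each original image yields several variants this way, which is how a collection of a few thousand photographs can grow to a dataset several times larger.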

3.3. Classification Model of Expatriate Frescoes Integrating Transfer Learning

Due to the low quality and small quantity of fresco images, in order to extract the characteristics of fresco images in depth on the JHD fresco dataset in this study, the model in this paper will conduct pretraining on the large ImageNet dataset and apply the knowledge learned from transfer learning to the JHD fresco dataset so as to identify and classify the fresco images. The classification model for overseas Chinese fresco images proposed in this article includes feature extraction and classification sections. The feature extraction section uses a convolutional neural network, a color histogram, and an LBP texture feature histogram. The classification section is the softmax layer. The classification model is shown in Figure 4.

In Figure 4, the proposed classification model of expatriate frescoes integrating transfer learning is mainly divided into three parts for regional classification of frescos. Firstly, a pretrained ResNet-34 model is adapted to extract high-dimensional features from frescos. In order to better express the features extracted from the front-end convolutional layer, three consecutive fully connected layers are used to extract the deep features of the fresco image. Then, the color histogram is used to extract the color features of the fresco, and the LBP texture histogram is used to extract the texture features of the fresco image. Finally, the high-dimensional features extracted from the pretrained model are fused with artistic features to generate feature vectors as the required output nodes in the softmax layer.
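One simple way to picture the fusion step is to concatenate the three feature vectors. The sketch below assumes concatenation with per-vector L2 normalization, which the text does not specify; the random vectors are placeholders, and 512 is the length of the pooled feature vector in the standard ResNet-34.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit length so no feature group dominates by magnitude."""
    return v / (np.linalg.norm(v) + eps)

def fuse_features(deep, color_hist, lbp_hist):
    """Concatenate normalized deep, color, and texture features into one vector."""
    parts = [l2_normalize(np.asarray(p, dtype=float))
             for p in (deep, color_hist, lbp_hist)]
    return np.concatenate(parts)

rng = np.random.default_rng(0)
deep = rng.normal(size=512)     # placeholder for ResNet-34 pooled features
color_hist = rng.random(256)    # placeholder color histogram
lbp_hist = rng.random(256)      # placeholder LBP texture histogram
fused = fuse_features(deep, color_hist, lbp_hist)
```

The fused vector would then be passed to the classification head that ends in the softmax layer.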

3.4. Improved Classification Model for Overseas Chinese Frescoes

3.4.1. Integrating Transfer Learning to Enhance the Stability of the Model

Because existing overseas Chinese frescoes are local in character, limited in quantity, severely damaged, and of poor image quality, collecting and preprocessing their images is relatively difficult. Regional classification requires fresco images from different cities, which makes it even harder to collect and organize a large amount of data. To improve model training and better extract the deep features of fresco images, and to overcome the model instability caused by the complexity of fresco features and the cliff problem during feature extraction, this method builds on the ResNet-34 model and integrates transfer learning. The purpose of transfer learning is to transfer valuable information learned in one field to another, so that the final classification results are not affected by changes in fresco image pixels.

The method of integrating transfer learning is to pretrain the ResNet-34 model on the large dataset ImageNet, extract the shallow features of the image, and then apply the transfer learning knowledge as the output of the model bottleneck layer to the JHD fresco dataset. This model freezes the convolutional layers before the fully connected and softmax layers of the ResNet-34 model, trains a new fully connected and softmax layer for deep extraction of image features from frescos, and completes the network training and fresco classification tasks.

3.4.2. Introducing Cross Entropy Function to Stabilize Model Gradient

To address the vanishing gradient problem of the model, this study uses the cross entropy loss function together with the softmax function. This combination effectively mitigates slow or stagnant weight updates in hidden layers caused by vanishing gradients. Cross entropy measures the distance between the actual and desired outputs of the model: the smaller its value, the closer the actual output is to the expected result, and the better the model performs. During backpropagation, the cross-entropy value of the training process is recorded, which can be used to judge whether the model is overfitting.
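A short NumPy sketch shows why this pairing keeps the output-layer gradient well behaved: with softmax followed by cross entropy, the gradient of the loss with respect to the logits is simply the predicted probabilities minus the one-hot labels. The function names and the toy logits below are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean negative log-probability of the true class."""
    probs = softmax(logits)
    n = logits.shape[0]
    return -np.log(probs[np.arange(n), labels]).mean()

def grad_wrt_logits(logits, labels):
    """Gradient of the mean loss: (softmax - one_hot) / n."""
    probs = softmax(logits)
    n = logits.shape[0]
    probs[np.arange(n), labels] -= 1.0
    return probs / n

logits = np.array([[2.0, 0.0], [0.0, 0.0]])   # two samples, two classes
labels = np.array([0, 1])                     # true class indices
loss = cross_entropy(logits, labels)
grad = grad_wrt_logits(logits, labels)
```

The gradient is proportional to the prediction error itself, so a confidently wrong prediction produces a large update rather than a vanishing one.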

3.4.3. Increasing the Number of Fully Connected Layers to Enhance Image Feature Expression

When using the original network model directly to extract fresco images of overseas Chinese, there is often a problem of insufficient image feature extraction. On the pretrained ResNet-34 model during this experiment, after fine-tuning the parameters of all layers, in order to better learn and express the high-dimensional image features extracted by the front-end network, three consecutive fully connected layers were constructed after the bottleneck layer of the network model. To avoid gradient dispersion issues, the softmax layer is selected to classify image features.

3.5. Classification Process of Overseas Chinese Architectural Frescoes

The framework for regional classification of overseas Chinese architectural frescoes by the ResNet-34 model integrating transfer learning is shown in Figure 5, which is mainly divided into the following six stages.

Stage 1: Fresco image preprocessing stage. The input data for this stage comprise the original image dataset of overseas Chinese architectural frescos, and the output data comprise the training set, testing set, and validation set of the frescos. The specific steps are as follows: (1) Modify the size of each fresco image in the original dataset, and the unified format is 224 × 224 pixels, eliminating duplicates, errors, and poor quality images. (2) Expand the image dataset using preprocessing methods such as Gaussian noise, salt and pepper noise, histogram equalization, and rotations of the image to obtain the JHD fresco dataset. (3) Using a random function on the JHD dataset images, 80% of the fresco images were used as the training set, 10% of the fresco images were used as the test set, and 10% of the fresco images were used as the validation set.

Stage 2: Model pretraining stage. At this stage, the input data constitute the training set, and the output data constitute the transfer model. The specific steps are as follows: (1) Train on a large dataset ImageNet and pretrain the ResNet-34 model. (2) Adjust the parameters of the model slightly, and record the changes of learning rate and accuracy rate at different iterations. (3) Train the training set of fresco images to obtain the trained ResNet-34 model. (4) Obtain the migration model.

Stage 3: Image art feature extraction stage. At this stage, the input data comprise the training set, and the output data are the artistic features of fresco images. The specific steps are as follows: (1) Use the color histogram algorithm to extract the color features of fresco images. (2) Use the LBP texture histogram algorithm to extract texture features of fresco images. (3) Obtain the artistic features of the fresco.

Stage 4: Feature fusion stage. At this stage, the input data are high-level features and artistic features of frescos, while the output data are fusion features of fresco images. The specific steps are as follows: (1) Obtain the deep features of the frescos extracted from the pretrained model. (2) Obtain color and texture features of the frescos. (3) Integrate deep features, color features, and texture features to obtain the fusion features.

Stage 5: Model testing stage. At this stage, the input is the test set, and the output is the test accuracy. The steps are as follows: (1) Import the test set into the pretrained transfer model. (2) Use the statistical classification results to obtain the final accuracy rate.

Stage 6: Model validation stage. At this stage, the input is the validation set, and the output is the validation accuracy of fresco image classification. The steps are as follows: (1) Import the validation set into the pretrained transfer model. (2) Compile the classification results to obtain the validation accuracy.
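The 80/10/10 partition used in Stage 1 can be sketched with the standard library. The helper name `split_dataset` and the fixed seed are assumptions; the sketch splits a single list of image identifiers, whereas the text describes splitting each category separately, which would mean calling the function once per category.

```python
import random

def split_dataset(items, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle and split into train / test / validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]      # remainder goes to validation
    return train, test, val

image_ids = list(range(11380))             # JHD size reported in the text
train, test, val = split_dataset(image_ids)
```

Fixing the seed keeps the partition reproducible, so the test and validation images stay disjoint from the training images across runs.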

4. Results and Discussion

4.1. Experimental Environment

The hardware environment of this experiment was as follows: an Intel Core i7-9700 CPU, 16 GB of RAM, and an NVIDIA GeForce RTX 3090 GPU. The SPM12 and CAT12 toolkits in MATLAB R2016b were used for image preprocessing. The network model was implemented in Python 3.8 with the open-source deep learning framework PyTorch 1.9, with an input image size of 224 × 224 × 3 pixels. Hyperparameter tuning is a nested learning process in which one learning algorithm finds the optimal hyperparameters for another learning algorithm. In this article, 80% of the dataset was used for training, 10% for testing, and 10% for validation. Training set data were used for training, and test and validation set data were used for tuning hyperparameters. Based on the characteristics of our dataset and model, the hyperparameter settings were as follows: epochs = 100, Batch_size = 32, initial learning rate Learning_Rate = 0.01, Weight_Decay = 0.0001, Dropout = 0.5.

4.2. Evaluation Index

The accuracy was used to evaluate the test results. The calculation formula of the accuracy rate is shown in Formula (5):

Accuracy = \frac{TP + TN}{TP + FP + FN + TN} \qquad (5)

Here, TP (true positives) is the number of positive samples correctly classified as positive; FP (false positives) is the number of negative samples incorrectly classified as positive; FN (false negatives) is the number of positive samples incorrectly classified as negative; and TN (true negatives) is the number of negative samples correctly classified as negative.
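As a worked example of Formula (5), the sketch below computes accuracy from the four counts, together with recall in its standard form TP / (TP + FN), which the abstract reports but the text does not define. The counts are hypothetical values chosen for illustration and are not results from this study.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly (Formula (5))."""
    return (tp + tn) / (tp + fp + fn + tn)

def recall(tp, fn):
    """Fraction of actual positives that were recovered (standard definition)."""
    return tp / (tp + fn)

# Hypothetical counts for illustration only (not results from the paper).
tp, fp, fn, tn = 90, 5, 10, 95
acc = accuracy(tp, fp, fn, tn)
rec = recall(tp, fn)
```

With these counts, 185 of 200 samples are classified correctly and 90 of the 100 actual positives are recovered.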

4.3. Results and Analysis

4.3.1. Model Training and Validation

The number of iterations (epochs) is the number of times the model passes over the training set. Too few iterations lead to underfitting, while too many lead to overfitting. In this study, the number of iterations was adjusted adaptively based on the model’s performance on the validation set. After training and testing the model many times, the number of training epochs was set to 100, the batch_size was set to 32, the Adam optimizer was used to speed up the convergence of the model, the learning rate was set to 0.0001, and the cross entropy loss function was used for the feature extraction and classification of arcade patterns. As can be seen in Table 2, the model performed better with a learning rate of 0.0001. Figure 6 shows how accuracy and cross entropy changed during training. The training accuracy increased steadily and, after about 60 epochs, stabilized at around 98%. Likewise, the cross entropy kept decreasing as training progressed and stabilized after about 60 epochs. In summary, the model in this article performed well during training and was not prone to overfitting.

The learning rate is a scale factor that adjusts the weights during training. Too large a learning rate causes the model to oscillate and fail to converge, while too small a learning rate makes the model converge too slowly, wasting training time and computing resources. In general, the initial learning rate can be set to 0.01, and if training is unstable, the learning rate can be lowered. In this experiment, with the number of iteration steps held constant, the learning rate was set to 0.0001, 0.001, and 0.01, respectively, for multiple groups of experiments, and the results were then statistically analyzed. The comparison of accuracy under different learning rates is shown in Table 2. As can be seen from Table 2, when the learning rate (LR) was 0.0001, the model in this paper showed good performance, and the final accuracy was as high as 98.41%. Compared to the two groups of experiments with learning rates of 0.001 and 0.01, the accuracy of this model was improved by 8.61% and 4.04%, respectively.

4.3.2. Comparison of Different Fresco Features

In the experiment of identifying fresco areas where overseas Chinese reside, the color features, texture features, and painting style of fresco images will have a significant impact on the experimental results. The painting style of frescos is a reflection of the culture of the time, and it is also based on color characteristics. The recognition accuracy between regions with significant differences in fresco color and texture is higher. Based on the above research, this section conducts comparative experiments from two aspects.

(1) To verify the impact of the color features of fresco images on recognition performance, a portion of the fresco images were selected for color adjustment, including increasing grayscale values, increasing saturation, and inverting colors, before proceeding with fresco region recognition. Figure 7 shows an example of color adjustment for a fresco image. Figure 7a is the original 224 × 224 pixel fresco image used in the experiment; Figure 7b increases the grayscale value of Figure 7a; Figure 7c increases the saturation of Figure 7a; and Figure 7d applies an inverse color transformation to Figure 7a.

Table 3 reports the probability that fresco images of overseas Chinese residences are correctly recognized as their preset regional labels. After the color of the fresco images was adjusted, the final recognition accuracy decreased: by 61.53, 1.76, and 27.15 percentage points after increasing the grayscale value, increasing the saturation, and applying the inverse transformation, respectively. These experiments indicate that once some color features of the frescos are lost, the color histogram no longer captures the rich color features of the fresco image, which leads to poor feature learning and classification when recognizing regions.
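
The three kinds of color adjustment can be illustrated with a small standard-library sketch. The text does not give the exact operations or magnitudes used, so everything here is an assumption: "increasing the grayscale value" is modeled as blending each pixel toward its luminance gray, saturation is scaled in HSV space, and the function names are hypothetical.

```python
import colorsys


def invert(pixel):
    """Inverse color transformation: map each 8-bit channel v to 255 - v."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


def scale_saturation(pixel, factor):
    """Multiply the HSV saturation by `factor`, clamping it to at most 1."""
    r, g, b = (c / 255.0 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, s * factor)
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


def blend_to_gray(pixel, amount):
    """Blend a pixel toward its luminance gray (assumed reading of
    'increase grayscale value'); amount=1.0 removes all color."""
    r, g, b = pixel
    gray = 0.299 * r + 0.587 * g + 0.114 * b  # ITU-R BT.601 luma weights
    return tuple(round((1 - amount) * c + amount * gray) for c in pixel)


def apply_to_image(image, fn, *args):
    """Apply a per-pixel function to a 2-D list of (R, G, B) tuples."""
    return [[fn(p, *args) for p in row] for row in image]


# A hypothetical 1 x 2 "image" with one warm pixel and one neutral pixel.
image = [[(200, 120, 40), (128, 128, 128)]]
inverted = apply_to_image(image, invert)
grayed = apply_to_image(image, blend_to_gray, 1.0)
saturated = apply_to_image(image, scale_saturation, 1.5)
```

Under these assumptions, full gray blending discards hue and saturation entirely, while inversion changes every hue but keeps the contrast structure, which is at least consistent with the ordering of the accuracy drops in Table 3.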

(2) To verify the impact of texture features on recognition, and because a change in image resolution directly affects the calculation of texture features, a portion of the fresco images underwent resolution adjustment before region recognition. The resolution of the original image was enlarged to 2 and 4 times its original size, and the enlarged images were fed to the network model in this paper to determine the accuracy with which each image was recognized as the correct region.

Table 4 reports the probability that overseas Chinese architectural fresco images are correctly recognized as their preset region labels. As the image resolution increases, the texture features of the image become blurrier, and the final recognition accuracy decreases.
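
Why enlarging an image does not sharpen its texture can be shown with a toy example. The text does not state which interpolation method was used, so the sketch below assumes the simplest one, nearest-neighbor repetition; the function names and the 2 × 2 grid are hypothetical.

```python
def upscale_nearest(image, factor):
    """Enlarge a 2-D grid of pixels by an integer factor by repeating
    each pixel `factor` times horizontally and vertically."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def count_distinct(image):
    """Number of distinct pixel values in the grid."""
    return len({p for row in image for p in row})


tiny = [[1, 2], [3, 4]]          # stand-in for a 224 x 224 original
up2 = upscale_nearest(tiny, 2)   # analogous to 448 x 448
up4 = upscale_nearest(tiny, 4)   # analogous to 896 x 896

for name, img in (("1x", tiny), ("2x", up2), ("4x", up4)):
    print(name, len(img), "x", len(img[0]), "distinct values:", count_distinct(img))
```

The pixel count grows by 4 and 16 times while the amount of distinct information stays the same, so each original texture element is spread over a larger area. Smoother interpolation methods would blur the edges further instead of repeating them.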

From Table 3 and Table 4, it can be seen that the color features of the image have a significant impact on the accuracy of the final image recognition, while the texture features of the image have a relatively small impact. From this, it can be concluded that color features play a decisive role in the recognition and classification experiments of fresco regions in this article.

4.3.3. Comparison of Performance between Different Models

To better reflect the advantages of fusing the ResNet model with transfer learning in arcade area recognition and classification, this paper compared the accuracy of this model with classical deep learning network models, namely AlexNet [37], GoogLeNet [38], and VGGNet [39]. The comparative experiments used the same software and hardware environment, the same hyperparameter settings, and the same image preprocessing method during training. The accuracy and recall rates were then used to analyze the classification results of the different models. The accuracy rate is defined with respect to the prediction results and indicates how many of the samples predicted as positive are truly positive. The recall rate is defined with respect to the original samples and indicates how many of the positive examples in the sample were predicted correctly. As shown in Table 5, the accuracy rate of ResNet-34 with transfer learning in this paper was 98.41%, and the recall rate was 98.53%. For AlexNet-10 and AlexNet-S6, the accuracy rates were 85.41% and 86.68%, and the recall rates were 83.79% and 84.55%. The accuracy rate of the GoogLeNet network was 90.29%, and the recall rate was 89.61%. The accuracy rate of the VGGNet-16 network was 89.26%, and the recall rate was 88.15%. Table 5 shows that the model used in this paper performed well in the classification of arcade decoration images, which supports the validity of the idea and the effectiveness of the method.
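
The two rates as defined above can be computed from predicted and true labels as in the following sketch. It follows the wording of the definitions in this section (the share of predicted positives that are truly positive, and the share of true positives that are found); the label names and toy predictions are hypothetical and are not taken from the JHD dataset.

```python
def accuracy_and_recall_rates(y_true, y_pred, positive):
    """Return (accuracy rate, recall rate) for one class, following the
    definitions in the text: TP / (TP + FP) and TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy_rate = tp / (tp + fp) if (tp + fp) else 0.0
    recall_rate = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy_rate, recall_rate


# Hypothetical toy labels: four frescoes per region.
y_true = ["Jiangmen"] * 4 + ["Haikou"] * 4
y_pred = ["Jiangmen", "Jiangmen", "Jiangmen", "Haikou",
          "Jiangmen", "Jiangmen", "Haikou", "Haikou"]

acc_rate, rec_rate = accuracy_and_recall_rates(y_true, y_pred, "Jiangmen")
print(acc_rate, rec_rate)
```

In this toy case three of the five images predicted as Jiangmen are correct, and three of the four true Jiangmen images are found, so the two rates differ even though they are computed from the same predictions.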

As the results above show, although the AlexNet, GoogLeNet, and VGGNet models achieve high accuracy and offer clear advantages in deep learning image processing, their results on a small data sample without transfer learning are clearly inferior to those of the model in this paper. The ResNet-34 model integrated with transfer learning has the following two advantages: (1) the network is deeper, can extract more image features, and performs better in image recognition; and (2) good results can be obtained even when the dataset is small.

5. Conclusions

To address the small number, poor quality, difficult feature extraction, and similar pattern text and painting style of fresco images on overseas Chinese buildings, this paper proposed a ResNet-34 model and method integrating transfer learning and applied it to the recognition and classification of overseas Chinese architectural frescos. Following the transfer learning approach, the model was first trained on the large ImageNet dataset to obtain the transfer model and was then trained on the training set of fresco images to obtain the final ResNet-34 model. This addressed the training problem caused by the small amount of data in the overseas Chinese architectural fresco dataset. The dataset was also expanded by data enhancement and an expansion algorithm. The final classification accuracy on the test set was 98.41%; the approach shortened the data operation time and could extract the features of architectural images and classify them. Every evaluation index was better than those of classical convolutional neural networks such as AlexNet, GoogLeNet, and VGGNet. Compared with the AlexNet-10, AlexNet-S6, GoogLeNet, and VGGNet-16 network models, the accuracy of our model for overseas Chinese hometown building recognition improved by 13.00, 11.73, 8.12, and 9.15 percentage points, respectively. The experimental results showed that the proposed model has stable recognition and classification performance, higher accuracy, and faster convergence. This research is applied research that brings advanced computer methods to the architectural frescos of overseas Chinese residences. Architectural ornamentation containing cultural information in the Wuyi Overseas Chinese cultural area and the Qionglei cultural area is classified so that the sources and cultural areas of other ornamentation images can be located accurately.
Finally, the research results of this paper can provide technical support for the digital storage, protection, and dissemination of overseas Chinese architectural culture, as well as for tracing its origins and pattern evolution. More importantly, the network model and database can provide data support on cultural and historical background for overseas Chinese who are searching for their roots and repairing ancestral homes. In the experiment, owing to the hardware environment and the small differences in painting style and color among some overseas Chinese architectural fresco images, the model in this paper could not extract good color features for the color gradients and cliffs of frescos. In future research, in response to these challenges and the current small amount of data, we will continue to expand the JHD image dataset, conduct further research based on the characteristics of the frescos themselves, and improve classification accuracy, making the regional classification of overseas architectural frescos more rapid and effective.

Author Contributions

Conceptualization, L.G. and T.Y.; methodology, L.G. and B.W.; software, X.Z.; validation, L.G., T.Y. and B.W.; formal analysis, L.G.; investigation, T.Y.; resources, L.G.; data curation, T.Y. and J.L.; writing—original draft preparation, L.G.; writing—review and editing, T.Y.; visualization, L.G.; supervision, L.G.; project administration, L.G.; funding acquisition, L.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (2022YFC3303200), the teaching reform project of Guangdong province (GDJX2020009), and the Guangdong province philosophy and social science planning discipline joint project (GD20XSH06).

Data Availability Statement

In this study, the authors used a publicly available dataset for analysis, ImageNet, which has been deposited on the website http://image-net.org/ (accessed on 27 August 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gao, L.; Wu, Y.; Yang, T.; Zhang, X.; Zeng, Z.; Chan, C.K.D.; Chen, W. Research on Image Classification and Retrieval Using Deep Learning with Attention Mechanism on Diaspora Chinese Architectural Heritage in Jiangmen, China. Buildings 2023, 13, 275.
2. Volpi, F.; Vagnini, M.; Vivani, R.; Malagodi, M.; Fiocco, G. Non-invasive identification of red and yellow oxide and sulfide pigments in wall-paintings with portable ER-FTIR spectroscopy. J. Cult. Herit. 2023, 63, 158–168.
3. Saez-Hernandez, R.; Antela, K.U.; Gallello, G.; Cervera, M.; Mauri-Aucejo, A.R. A smartphone-based innovative approach to discriminate red pigments in roman frescoes mock-ups. J. Cult. Herit. 2022, 58, 156–166.
4. Priego, E.; Herraez, J.; Denia, J.L.; Navarro, P. Technical study for restoration of mural paintings through the transfer of a photographic image to the vault of a church. J. Cult. Herit. 2022, 58, 112–121.
5. Liu, Z.; Yang, R.; Wang, W.; Xu, W.; Zhang, M. Multi-analytical approach to the mural painting from an ancient tomb of Ming Dynasty in Jiyuan, China: Characterization of materials and techniques. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2022, 279, 121419.
6. Lerme, N.; Hegarat-Mascle, S.L.; Zhang, B.; Aldea, E. Fast and efficient reconstruction of digitized frescoes. Pattern Recognit. Lett. 2020, 138, 417–423.
7. Jiang, C.; Jiang, Z.; Shi, D. Computer-Aided Virtual Restoration of Frescoes Based on Intelligent Generation of Line Drawings. Math. Probl. Eng. 2022, 1, 9092765.
8. Dondi, P.; Lombardi, L.; Setti, A. DAFNE: A dataset of fresco fragments for digital anastlylosis. Pattern Recognit. Lett. 2020, 138, 631–637.
9. Cao, J.; Yan, M.; Jia, Y.; Tian, X. Application of inception-v3 model integrated with transfer learning in dynasty identification of ancient murals. J. Comput. Appl. 2021, 11, 3219–3227.
10. Tang, D.; Lu, D.; Yang, B.; Xu, D. Similarity metrics between mural images with constraints of the overall structure of contours. J. Image Graph. 2013, 8, 968–975.
11. Teixeira, T.S.; Andrade, M.L.S.C.; Luz, M.R. Reconstruction of frescoes by sequential layers of feature extraction. Pattern Recognit. Lett. 2021, 147, 172–178.
12. Su, H.; Gao, L.; Lu, Y.; Jing, H.; Hong, J.; Huang, L.; Chen, Z. Attention-guided cascaded network with pixel-importance-balance loss for retinal vessel segmentation. Front. Cell Dev. Biol. 2023, 11, 1196191.
13. Gao, L.; Wang, K.; Zhang, X.; Wang, C. Intelligent identification and prediction mineral resources deposit based on deep learning. Sustainability 2023, 15, 10269.
14. Najeeb, R.M.; Syed, A.R.A.; Usman, U.S.; Asma, C.; Nirvana, P. Cascading pose features with CNN-LSTM for multiview human action recognition. Signals 2023, 4, 40–55.
15. Cedric, A.; Lionel, H.; Chiara, P.; Ondrej, M.; Olivier, C.; William, P.; Francesca, A.; Kiran, P.; Sophie, M. CNN-Based cell analysis: From image to quantitative representation. Front. Phys. 2022, 9, 776805.
16. Lawrence, S.; Giles, C.L.; Tsoi, A.C.; Back, A.D. Face recognition: A convolutional neural-network approach. IEEE Trans. Neural Netw. 1997, 1, 98–113.
17. Xu, X.B.; Ma, F.; Zhou, J.M.; Du, C.W. Applying convolutional neural networks (CNN) for end-to-end soil analysis based on laser-induced breakdown spectroscopy (LIBS) with less spectral preprocessing. Comput. Electron. Agric. 2022, 199, 107171.
18. Murugan, G.; Moyal, V.; Nandankar, P.; Pandithurai, O.; Pimo, E.S. A novel CNN method for the accurate spatial data recovery from digital images. Mater. Proc. 2021, 80, 1706–1712.
19. Parrage-Alava, J.; Alcivar-Cevallos, R.; Riascos, J.A.; Becerra, M.A. Aphids detection on lemons leaf image using convolutional neural networks. Systems and Information Sciences. Adv. Intell. Syst. Comput. 2020, 1273, 16–27.
20. Raki, H.; Gonzalez-Vergara, J.; Aalaila, Y.; Elhamdi, M.; Bamansour, S.; Guachi-Guachi, L.; Peluffo-Ordonez, D.H. Crop classification using deep learning: A quick comparative study of modern approaches. Applied Informatics. Commun. Comput. Inf. Sci. 2022, 1643, 31–44.
21. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 11, 2278–2324.
22. Alex, K.; Ilya, S.; Geoffrey, E.H. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
23. Szegedy, C. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
25. Smith, Y.; Zajicek, G.; Werman, M.; Pizov, G.; Sherman, Y. Similarity measurement method for the classification of architecturally differentiated images. Comput. Biomed. Res. Int. J. 1999, 32, 1–12.
26. Akbarimajd, A.; Hoertel, N.; Hussain, M.A.; Neshat, A.A.; Marhamati, M.; Bakhtoor, M.; Momeny, M. Learning-to-augment incorporated noise-robust deep CNN for detection of COVID-19 in noisy X-ray images. J. Comput. Sci. 2022, 63, 101763.
27. He, H.J.; Xu, H.Z.; Zhang, Y.; Gao, K.; Li, H.X.; Ma, L.F.; Li, J. Mask R-CNN based automated identification and extraction of oil well sites. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102875.
28. Kim, Y.H.; Park, K.R. MTS-CNN: Multi-task semantic segmentation-convolutional neural network for detecting crops and weeds. Comput. Electron. Agric. 2022, 199, 107146.
29. Polsinelli, M.; Cinque, L.; Placidig, G. A light CNN for detecting COVID-19 from CT scans of the chest. Pattern Recognit. Lett. 2020, 140, 95–100.
30. Kabir, S.; Patidar, S.; Xia, X.; Liang, Q.H.; Neal, J.; Pender, G. A deep convolutional neural network model for rapid prediction of fluvial flood inundation. J. Hydrol. 2020, 590, 125481.
31. He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Identity Mappings in Deep Residual Networks. Comput. Vis. ECCV 2016, 9908, 630–645.
32. Lv, Y.Z.; Xue, J.N.; Duan, F.; Sun, Z.; Li, J.H. An exploratory study of transfer learning frameworks in the context of few available shots of neurophysiological signals. Comput. Electr. Eng. 2022, 101, 108091.
33. Rafael, L.J.; Jose, M.M.; Casilari, E. Analytical and empirical evaluation of the impact of Gaussian noise on the modulations employed by Bluetooth Enhanced Data Rates. EURASIP J. Wirel. Commun. Netw. 2012, 2012, 1–11.
34. Zhou, Z.; Shi, Z.; Ren, W. Linear contrast enhancement network for low-illumination image enhancement. IEEE Trans. Instrum. Meas. 2023, 72, 1–16.
35. Pham, T.D. Kriging-weighted laplacian kernels for grayscale image sharpening. IEEE Access 2022, 10, 57094–57106.
36. Liu, K.; Tian, Y.Z. Research and analysis of deep learning image enhancement algorithm based on fractional differential. Chaos Solitons Fractals 2020, 131, 109507.
37. Singh, I.; Goyal, G.; Chandel, A. AlexNet architecture based convolutional neural network for toxic comments classification. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 7547–7558.
38. Ak, A.; Topuz, V.; Midi, I. Motor imagery EEG signal classification using image processing technique over GoogLeNet deep learning algorithm for controlling the robot manipulator. Biomed. Signal Process. Control 2022, 72, 103295.
39. Feng, S.; Zhao, L.; Shi, H.; Wang, M.; Shen, S.; Wang, W. One-dimensional VGGNet for high-dimensional data. Appl. Soft Comput. 2023, 135, 110035.

Figure 1. Pool operation.

Figure 2. Geographical locations and sampling points of the research area.

Figure 3. Examples of image data enhancement.

Figure 4. Classification model of overseas Chinese frescoes integrating transfer learning.

Figure 5. The framework of the ResNet-34 model integrated with transfer learning to classify the expatriate frescoes.

Figure 6. Changes in accuracy and cross entropy during training.

Figure 7. Color adjustment of fresco images.

Table 1. ResNet34 parameter table.

Layer Name	Output Size	34-Layer
conv1	112 × 112	7 × 7, 64, stride 2
Max pooling	56 × 56	3 × 3, stride 2
conv2_x	56 × 56	$\left[\begin{matrix} 3 \times 3, & 64 \\ 3 \times 3, & 64 \end{matrix}\right] \times 3$
conv3_x	28 × 28	$\left[\begin{matrix} 3 \times 3, & 128 \\ 3 \times 3, & 128 \end{matrix}\right] \times 4$
conv4_x	14 × 14	$\left[\begin{matrix} 3 \times 3, & 256 \\ 3 \times 3, & 256 \end{matrix}\right] \times 6$
conv5_x	7 × 7	$\left[\begin{matrix} 3 \times 3, & 512 \\ 3 \times 3, & 512 \end{matrix}\right] \times 3$
Global Average Pooling
Fully Connected Layer
Softmax
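
The layer count and output sizes in Table 1 can be cross-checked with a few lines of arithmetic. This is a minimal sketch that assumes the standard ResNet-34 layout (one initial convolution, two 3 × 3 convolutions per residual block, and a single fully connected layer) and a 224 × 224 input; the function names are hypothetical, and a variant with additional fully connected layers would have a higher count.

```python
# Number of two-layer residual blocks per stage, as listed in Table 1.
blocks_per_stage = {"conv2_x": 3, "conv3_x": 4, "conv4_x": 6, "conv5_x": 3}


def weighted_layer_count(blocks, convs_per_block=2):
    """conv1 + every 3x3 convolution in the residual stages + one FC layer."""
    return 1 + convs_per_block * sum(blocks.values()) + 1


def stage_output_sizes(input_size=224):
    """Spatial side length after each stage, assuming each stride-2 step
    halves the side length exactly."""
    sizes = {}
    size = input_size // 2   # conv1: 7x7 convolution, stride 2
    sizes["conv1"] = size
    size //= 2               # 3x3 max pooling, stride 2
    sizes["conv2_x"] = size
    for name in ("conv3_x", "conv4_x", "conv5_x"):
        size //= 2           # first block of each later stage uses stride 2
        sizes[name] = size
    return sizes


layers = weighted_layer_count(blocks_per_stage)
sizes = stage_output_sizes(224)
print(layers, sizes)
```

Under these assumptions the count comes to 34, and the side lengths match the Output Size column of Table 1.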

Table 2. Comparison of accuracy at different learning rates.

Number	LR	Accuracy
1	0.01	89.80%
2	0.001	94.37%
3	0.0001	98.41%

Table 3. Comparison of regional recognition accuracy for fresco images with different color features.

Color Feature	Accuracy
Original Image	98.41%
Increase Grayscale Value	36.88%
Saturation Add	96.65%
Inverse Transformation	71.26%

Table 4. Comparison of regional recognition accuracy of different resolutions of overseas Chinese fresco images.

Resolution Ratio	Accuracy
224 × 224	98.41%
448 × 448	91.58%
896 × 896	89.79%

Table 5. Performance comparison of different models.

Model	Accuracy	Recall Rate
AlexNet-10	85.41%	83.79%
AlexNet-S6	86.68%	84.55%
GoogLeNet	90.29%	89.61%
VGGNet-16	89.26%	88.15%
Ours	98.41%	98.53%

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
