Special Issue "Advances in Computer Vision, Pattern Recognition, Machine Learning and Symmetry"

A special issue of Symmetry (ISSN 2073-8994). This special issue belongs to the section "Computer".

Deadline for manuscript submissions: 31 December 2023 | Viewed by 28695

Special Issue Editors

Dr. João Ruivo Paulo
Institute of Systems and Robotics, University of Coimbra, 3004-531 Coimbra, Portugal
Interests: rehabilitation robotics; assistive robotics; medical engineering; applied machine learning

Dr. Cristina P. Santos
Center for MicroElectroMechanical Systems, University of Minho, 4800-058 Guimarães, Portugal
Interests: human motion; human locomotion; human–robot interactions and collaboration; medical devices; neuro-rehabilitation of patients suffering from motor problems by means of bio-inspired robotics and neuroscience technologies

Dr. Gabriel Pires
1. Institute of Systems and Robotics, University of Coimbra, 3030-290 Coimbra, Portugal
2. Polytechnic Institute of Tomar, 2300-313 Tomar, Portugal
Interests: human-machine interface; brain-computer interface; biosignal processing; assistive robotics

Special Issue Information

Dear Colleagues,

Machine intelligence is no longer a science fiction utopia but rather a very present reality. It has been evolving rapidly within the fields of computer vision, pattern recognition, machine learning, and symmetry. It is a daunting task to try and keep up with the abundance of new publications that present the most recent advancements within each field. As such, this Special Issue is dedicated to presenting and aggregating recent advancements in these research fields, spread across a universe of applications, such as industry, medicine, robotics, biotechnology, mechanical engineering, and others, as well as in fundamental and theoretical forms.

Please note that all submitted papers must be within the general scope of the Symmetry journal.

Dr. João Ruivo Paulo
Dr. Cristina P. Santos
Dr. Gabriel Pires
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Symmetry is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • pattern recognition
  • machine learning
  • symmetry
  • machine intelligence applied in industry, medicine, and biotechnology
  • intelligence in biomedical engineering
  • intelligent robotic systems
  • autonomous driving systems
  • data mining

Published Papers (25 papers)

Research

Article
Symmetric Graph-Based Visual Question Answering Using Neuro-Symbolic Approach
Symmetry 2023, 15(9), 1713; https://doi.org/10.3390/sym15091713 - 07 Sep 2023
Viewed by 300
Abstract
As the applications of robots expand across a wide variety of areas, high-level task planning considering human–robot interactions is emerging as a critical issue. Various elements that facilitate flexible responses to humans in an ever-changing environment, such as scene understanding, natural language processing, and task planning, are thus being researched extensively. In this study, a visual question answering (VQA) task was examined in detail from among an array of technologies. By further developing conventional neuro-symbolic approaches, environmental information is stored and utilized in a symmetric graph format, which enables more flexible and complex high-level task planning. We construct a symmetric graph composed of information such as color, size, and position for the objects constituting the environmental scene. VQA using graphs largely consists of three parts: expressing a scene as a graph, converting a question into SPARQL, and reasoning out the answer. The proposed method was verified using a public dataset, CLEVR, with which it successfully performed VQA. We were able to directly confirm the process of inferring answers using SPARQL queries converted from the original queries and environmental symmetric graph information, which is distinct from existing methods that make it difficult to trace the path to finding answers.

Article
Efficient DCNN-LSTM Model for Fault Diagnosis of Raw Vibration Signals: Applications to Variable Speed Rotating Machines and Diverse Fault Depths Datasets
Symmetry 2023, 15(7), 1413; https://doi.org/10.3390/sym15071413 - 14 Jul 2023
Viewed by 473
Abstract
Bearings are the backbone of industrial machines, and a fault in them can shut down or damage the whole process. Therefore, health diagnosis and fault identification in the bearings are essential to avoid a sudden shutdown. Vibration signals from the rotating bearings are extensively used to diagnose the health of industrial machines as well as to analyze their symmetrical behavior. When a fault occurs in the bearings, deviations from their symmetrical behavior can be indicative of potential faults. However, fault identification is challenging when (1) the vibration signals are recorded at variable speeds rather than at a constant speed and (2) the vibration signals have diverse fault depths. In this work, we propose a highly accurate Deep Convolution Neural Network (DCNN)–Long Short-Term Memory (LSTM) model with a SoftMax classifier. The proposed model offers an innovative approach to fault diagnosis, as it obviates the need for preprocessing and digital signal processing techniques for feature computation. It demonstrates remarkable efficiency in accurately diagnosing fault conditions across variable speed vibration datasets encompassing diverse fault conditions, including but not limited to outer race fault, inner race fault, ball fault, and mixed faults, as well as constant speed datasets with varying fault depths. The proposed method extracts features automatically from these vibration signals and is therefore well suited to improving the performance and efficiency of machine health diagnosis. For the experimental study, two different datasets—the constant speed with different fault depths and variable speed rotating machines—are considered to validate the performance of the proposed method. The accuracy achieved for the variable speed rotating machine dataset is 99.40%, while for the diverse fault dataset, the accuracy reaches 99.87%. Furthermore, the experimental results of the proposed method are compared with the existing methods in the literature as well as the artificial neural network (ANN) model.

Article
A Convolutional Recurrent Neural-Network-Based Machine Learning for Scene Text Recognition Application
Symmetry 2023, 15(4), 849; https://doi.org/10.3390/sym15040849 - 02 Apr 2023
Cited by 2 | Viewed by 1368
Abstract
Optical character recognition (OCR) is the process of acquiring text and layout information through analysis and recognition of text data image files. It is also a process to identify the geometric location and orientation of the texts and their symmetrical behavior. It usually consists of two steps: text detection and text recognition. Scene text recognition is a subfield of OCR that focuses on processing text in natural scenes, such as streets, billboards, license plates, etc. Unlike reading traditional document photographs, using computer technology to locate and read text information in natural scenes is a challenging task. Imaging sequence recognition is a longstanding subject of research in the field of computer vision. Great progress has been made in this field; however, most models struggled to recognize text in images of complex scenes with high accuracy. This paper proposes a new pattern of text recognition based on the convolutional recurrent neural network (CRNN) as a solution to address this issue. It combines real-time scene text detection with differentiable binarization (DBNet) for text detection and segmentation, text direction classifier, and the Retinex algorithm for image enhancement. To evaluate the effectiveness of the proposed method, we performed experimental analysis of the proposed algorithm, and carried out simulation on complex scene image data based on existing literature data and also on several real datasets designed for a variety of nonstationary environments. Experimental results demonstrated that our proposed model performed better than the baseline methods on three benchmark datasets and achieved on-par performance with other approaches on existing datasets. This model can solve the problem that CRNN cannot identify text in complex and multi-oriented text scenes. Furthermore, it outperforms the original CRNN model with higher accuracy across a wider variety of application scenarios.

Article
Cross-Correlation Fusion Graph Convolution-Based Object Tracking
Symmetry 2023, 15(3), 771; https://doi.org/10.3390/sym15030771 - 21 Mar 2023
Viewed by 850
Abstract
Most popular graph attention networks treat pixels of a feature map as individual nodes, which makes the feature embedding extracted by the graph convolution lack the integrity of the object. Moreover, matching between a template graph and a search graph using only part-level information usually causes tracking errors, especially in occlusion and similarity situations. To address these problems, we propose a novel end-to-end graph attention tracking framework that has high symmetry, combining traditional cross-correlation operations directly. By utilizing cross-correlation operations, we effectively compensate for the dispersion of graph nodes and enhance the representation of features. Additionally, our graph attention fusion model performs both part-to-part matching and global matching, allowing for more accurate information embedding in the template and search regions. Furthermore, we optimize the information embedding between the template and search branches to achieve better single-object tracking results, particularly in occlusion and similarity scenarios. The flexibility of graph nodes and the comprehensiveness of information embedding have brought significant performance improvements in our framework. Extensive experiments on three challenging public datasets (LaSOT, GOT-10k, and VOT2016) show that our tracker outperforms other state-of-the-art trackers.

Article
MSG-Point-GAN: Multi-Scale Gradient Point GAN for Point Cloud Generation
Symmetry 2023, 15(3), 730; https://doi.org/10.3390/sym15030730 - 15 Mar 2023
Cited by 1 | Viewed by 938
Abstract
The generative adversarial network (GAN) has recently emerged as a promising generative model. Its application in the image field has been extensive, but there has been little research concerning point clouds. The combination of a GAN and a graph convolutional network has been the state-of-the-art method for generating point clouds. However, there is a significant gap between the generated point cloud and the point cloud used for training. In order to improve the quality of the generated point cloud, this study proposed multi-scale gradient point GAN (MSG-Point-GAN). The training of the GAN is a dynamic game process, and we expected the generation and discrimination capabilities to be symmetric, so that the network training would be more stable. Based on the concept of progressive growth, this method used the network structure of a multi-scale gradient GAN (MSG-GAN) to stabilize the training process. The discriminator of this method used part of the PointNet structure to resolve the problem of the disorder and rotation of the point cloud. The discriminator could effectively determine the authenticity of the generated point cloud. This study also analyzed the optimization process of the objective function of the MSG-Point-GAN. The experimental results showed that the training process of the MSG-Point-GAN was stable, and the point cloud quality was superior to other methods in subjective vision. From the perspective of performance metrics, the gap between the point cloud generated by the proposed method and the real point cloud was significantly smaller than that generated by other methods. Based on the practical analysis, the point cloud generated by the proposed method for training the point-cloud classification network was improved by about 0.2%, as compared to the original network. The proposed method provided a stable training framework for point cloud generation. It can effectively promote the development of point-cloud-generation technology.

Article
Breast Cancer Diagnosis in Thermography Using Pre-Trained VGG16 with Deep Attention Mechanisms
Symmetry 2023, 15(3), 582; https://doi.org/10.3390/sym15030582 - 23 Feb 2023
Viewed by 1339
Abstract
One of the most prevalent cancers in women is breast cancer. The mortality rate related to this disease can be decreased by early, accurate diagnosis to increase the chance of survival. Infrared thermal imaging is one of the breast imaging modalities in which the temperature of the breast tissue is measured using a screening tool. Previous studies did not use pre-trained deep learning (DL) with deep attention mechanisms (AMs) on thermographic images for breast cancer diagnosis. Using thermal images from the Database for Research Mastology with Infrared Image (DMR-IR), the study investigates the use of a pre-trained Visual Geometry Group with 16 layers (VGG16) with AMs to produce good diagnostic performance on thermal images of breast cancer. The symmetry of the three models resulting from the combination of VGG16 with three types of AMs is evident in all stages of the methodology. The models were compared to state-of-the-art breast cancer diagnosis approaches and tested for accuracy, sensitivity, specificity, precision, F1-score, AUC score, and Cohen’s kappa. The test accuracy rates for the three AMs using the VGG16 model on the breast thermal dataset were encouraging, at 99.80%, 99.49%, and 99.32%. Test accuracy for VGG16 without AMs was 99.18%, so adding AMs improved test accuracy by up to 0.62 percentage points. The proposed approaches also performed better than previous approaches examined in the related studies.

Article
MODeLING.Vis: A Graphical User Interface Toolbox Developed for Machine Learning and Pattern Recognition of Biomolecular Data
Symmetry 2023, 15(1), 42; https://doi.org/10.3390/sym15010042 - 23 Dec 2022
Viewed by 1084
Abstract
Many scientific publications that affect machine learning have set the basis for pattern recognition and symmetry. In this paper, we revisit the concept of “Mind-life continuity” published by the authors, testing the symmetry between cognitive and electrophoretic strata. We opted for machine learning to analyze and understand the total protein profile of neurotypical subjects acquired by capillary electrophoresis. Capillary electrophoresis offers a cost-effective solution but lacks the discriminative and quantification power of modern proteomic techniques. To compensate for this problem, we developed tools for better data visualization and exploration in this work. These tools permitted us to better examine the total protein profile of 92 young adults, from 19 to 25 years old, healthy university students at the University of Lisbon, with no serious, uncontrolled, or chronic diseases affecting the nervous system. As a result, we created a graphical user interface toolbox named MODeLING.Vis, which showed specific expected protein profiles present in saliva in our neurotypical sample. The developed toolbox permitted data exploration and hypothesis testing of the biomolecular data. In conclusion, this analysis offered the data mining of the acquired neuroproteomics data in the molecular weight range from 9.1 to 30 kDa. This molecular weight range, obtained by pattern recognition of our dataset, is characteristic of the small neuroimmune molecules and neuropeptides. Consequently, MODeLING.Vis offers a machine-learning solution for probing into the neurocognitive response.

Article
Image Virtual Viewpoint Generation Method under Hole Pixel Information Update
Symmetry 2023, 15(1), 34; https://doi.org/10.3390/sym15010034 - 23 Dec 2022
Viewed by 858
Abstract
A virtual viewpoint generation method is proposed to address the problem of low fidelity in the generation of virtual viewpoints for images with overlapping pixel points. Virtual viewpoint generation factors such as overlaps, holes, cracks, and artifacts are analyzed and preprocessed. When the background of the hole is a simple texture, pheromone information around the hole is used as the support, a pixel at the edge of the hole is detected, and the hole is predicted at the same time, so that the hole area is filled in blocks. When the hole background has a relatively complex texture, the depth information of the hole pixels is updated with the inverse 3D transformation method, and the updated area pheromone is projected onto the auxiliary plane and compared with the known plane pixel auxiliary parameters. The hole filling is performed according to the symmetry of the pixel position of the auxiliary reference viewpoint plane to obtain the virtual viewpoint after optimization. The proposed method was validated using image quality metrics and objective evaluation metrics such as PSNR. The experimental results show that the proposed method could generate virtual viewpoints with high fidelity, excellent quality, and a short image-processing time, which effectively enhanced the virtual viewpoint generation performance.

Article
An Augmented Model of Rutting Data Based on Radial Basis Neural Network
Symmetry 2023, 15(1), 33; https://doi.org/10.3390/sym15010033 - 23 Dec 2022
Cited by 1 | Viewed by 957
Abstract
The rutting depth is an important index to evaluate the damage degree of the pavement. Therefore, establishing an accurate rutting depth prediction model can guide pavement design and provide the necessary basis for pavement maintenance. However, the sample size of pavement rutting depth data is small, and the sampling is not standardized, which makes it hard to establish a prediction model with high accuracy. Based on the data of RIOHTrack’s asphalt pavement structure, this study builds a reliable data-augmented model. In this paper, different asphalt rutting data augmented models based on Gaussian radial basis neural networks are constructed with the temperature and loading of asphalt pavements as the main features. Experimental results show that the method outperforms classical machine learning methods in data augmentation, with an average root mean square error of 3.95 and an average R-square of 0.957. Finally, the augmented data of rutting depth is constructed for training, and multiple neural network models are used for prediction. Compared with unaugmented data, the prediction accuracy is increased by 50%.

Article
Big Data Clustering Using Chemical Reaction Optimization Technique: A Computational Symmetry Paradigm for Location-Aware Decision Support in Geospatial Query Processing
Symmetry 2022, 14(12), 2637; https://doi.org/10.3390/sym14122637 - 13 Dec 2022
Viewed by 1019
Abstract
The emergence of geospatial big data has opened up new avenues for identifying urban environments. Although both geographic information systems (GIS) and expert systems (ES) have been useful in resolving geographical decision issues, they are not without their own shortcomings. The combination of GIS and ES has gained popularity due to the necessity of boosting the effectiveness of these tools in resolving very difficult spatial decision-making problems. The clustering method generates the functional effects necessary to apply spatial analysis techniques. In a symmetric clustering system, two or more nodes run applications and monitor each other simultaneously. This system is more efficient than an asymmetric system since it utilizes all available hardware and does not maintain a node in a hot standby state. However, it is still a major issue to figure out how to expand and speed up clustering algorithms without sacrificing efficiency. The work presented in this paper introduces an optimized hierarchical distributed k-medoid symmetric clustering algorithm for big data spatial query processing. To increase the k-medoid method’s efficiency and create more precise clusters, a hybrid approach combining the k-medoid and Chemical Reaction Optimization (CRO) techniques is presented. CRO is used in this approach to broaden the scope of the optimal medoid and improve clustering by obtaining more accurate data. The suggested paradigm solves the current technique’s issue of predicting the correct number of clusters. The suggested approach includes two phases: in the first phase, the local clusters are built using Apache Spark’s parallelism paradigm based on their portion of the whole dataset. In the second phase, the local clusters are merged to create condensed and reliable final clusters. The suggested approach condenses the data provided during aggregation and determines the ideal number of clusters automatically based on the dataset’s structures. The suggested approach is robust and delivers high-quality results for spatial query analysis, as shown by experimental results. The proposed model reduces average query latency by 23%.

Article
Hypernetwork Representation Learning Based on Hyperedge Modeling
Symmetry 2022, 14(12), 2584; https://doi.org/10.3390/sym14122584 - 07 Dec 2022
Viewed by 809
Abstract
Most network representation learning approaches only consider the pairwise relationships between the nodes in ordinary networks but do not consider the tuple relationships, namely the hyperedges, among the nodes in the hypernetworks. To solve this issue, a hypernetwork representation learning approach based on hyperedge modeling, abbreviated as HRHM, is proposed, which fully considers the hyperedges to obtain ideal node representation vectors that are applied to downstream machine learning tasks such as node classification, link prediction, and community detection. Experimental results on the hypernetwork datasets show that, for the node classification task, the mean node classification accuracy of the HRHM approach exceeds that of the best baseline approach by about 1% on the MovieLens and wordnet datasets. For the link prediction task, except for the HPHG approach, the mean AUC value of the HRHM approach surpasses that of the other baseline approaches by about 17%, 18%, and 6% on the GPS, drug, and wordnet datasets, respectively. On the MovieLens dataset, the mean AUC value of the HRHM approach is very close to that of the best baseline approach.

Article
Interactive Image Segmentation Based on Feature-Aware Attention
Symmetry 2022, 14(11), 2396; https://doi.org/10.3390/sym14112396 - 12 Nov 2022
Viewed by 1245
Abstract
Interactive segmentation is a technique for picking objects of interest in images according to users’ input interactions. Some recent works take the users’ interactive input to guide the deep neural network training, where the users’ click information is utilized as weak-supervised information. However, limited by the learning capability of the model, this structure does not accurately represent the user’s interaction intention. In this work, we propose a multi-click interactive segmentation solution for employing human intention to refine the segmentation results. We propose a coarse segmentation network to extract semantic information and generate rough results. Then, we designed a feature-aware attention module according to the symmetry of user intention and image semantic information. Finally, we establish a refinement module to combine the feature-aware results with coarse masks to generate precise intentional segmentation. Furthermore, the feature-aware module is trained as a plug-and-play tool, which can be embedded into most deep image segmentation models for exploiting users’ click information in the training process. We conduct experiments on five common datasets (SBD, GrabCut, DAVIS, Berkeley, MS COCO) and the results prove our attention module can improve the performance of image segmentation networks.

Article
Fabric Surface Defect Detection Using SE-SSDNet
Symmetry 2022, 14(11), 2373; https://doi.org/10.3390/sym14112373 - 10 Nov 2022
Cited by 1 | Viewed by 1338
Abstract
For fabric defect detection, the crucial issue is that large defects can be detected but not small ones, and vice versa, and this symmetric contradiction cannot be solved by a single method, especially for colored fabrics. In this paper, we propose a method based on a combination of two networks, SE and SSD, namely the SE-SSD Net method. The model is based on the SSD network and adds a squeeze-and-excitation (SE) module after its convolution operation, which increases the weight the model gives to the feature channels containing defect information while preserving the original network's extraction of feature maps at different scales for detection. The global features are then subjected to the Excitation operation to obtain the weights of different channels, which are multiplied by the original features to form the final features so that the model can pay more attention to the channel features with a large amount of information. In this way, large-scale feature maps can be used to detect small defects, while small-scale feature maps are used to detect relatively large defects, thus solving the asymmetry problem in detection. The experimental results show that our proposed algorithm can detect six different defects in colored fabrics, which basically meets the practical needs.

Article
PSG-Yolov5: A Paradigm for Traffic Sign Detection and Recognition Algorithm Based on Deep Learning
Symmetry 2022, 14(11), 2262; https://doi.org/10.3390/sym14112262 - 28 Oct 2022
Cited by 6 | Viewed by 2055
Abstract
With the gradual popularization of autonomous driving technology, obtaining traffic sign information efficiently and accurately is very important for subsequent decision-making and planning tasks. Traffic sign detection and recognition (TSDR) algorithms include color-based, shape-based, and machine-learning-based approaches. However, these algorithms are insufficient for traffic sign detection tasks in complex environments. In this paper, we propose a traffic sign detection and recognition paradigm based on deep learning algorithms. First, to solve the problem of insufficient spatial information in high-level features of small traffic signs, the parallel deformable convolution module (PDCM) is proposed. PDCM adaptively acquires the corresponding receptive field, preserving the integrity of the abstract information through symmetrical branches and thereby improving the feature extraction capability. In addition, we propose a sub-pixel convolution attention module (SCAM) based on the attention mechanism to alleviate the influence of scale distribution. Unlike other feature fusion methods, our proposed method can better focus on the information of scale distribution through the attention module. Finally, we introduce GSConv to further reduce the computational complexity of our proposed algorithm so that it better suits industrial applications. Experimental results demonstrate that our proposed methods can effectively improve performance, both in detection accuracy and mAP@0.5. Specifically, when the proposed PDCM, SCAM, and GSConv are applied to Yolov5, it achieves 89.2% mAP@0.5 on TT100K, which exceeds the benchmark network by 4.9%.

Article
Remaining Useful Life Prediction of Milling Cutters Based on CNN-BiLSTM and Attention Mechanism
Symmetry 2022, 14(11), 2243; https://doi.org/10.3390/sym14112243 - 25 Oct 2022
Cited by 1 | Viewed by 915
Abstract
Machining tools are a critical component in machine manufacturing, and their life cycle is an asymmetrical process. Extracting and modeling the features of tool life variation is essential for accurately predicting a tool’s remaining useful life (RUL) and for ensuring product reliability. In this study, a tool wear evolution and RUL prediction method is proposed that combines a convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), and an attention mechanism. The CNN directly processes the sensor-monitored data and extracts local feature information; the BiLSTM network adaptively extracts temporal features; and the attention mechanism selectively learns the important degradation features and extracts the tool wear status information. To evaluate the performance and generalization ability of the proposed method under different working conditions, experiments were run on two datasets, and the proposed method outperformed the traditional method in terms of prediction accuracy.

Article
Crowd Density Estimation in Spatial and Temporal Distortion Environment Using Parallel Multi-Size Receptive Fields and Stack Ensemble Meta-Learning
Symmetry 2022, 14(10), 2159; https://doi.org/10.3390/sym14102159 - 15 Oct 2022
Cited by 1 | Viewed by 1209
Abstract
The estimation of crowd density is crucial for applications such as autonomous driving, visual surveillance, crowd control, public space planning, and warning visually distracted drivers prior to an accident. Having strong translational, reflective, and scale symmetry, models for estimating the density of a crowd yield encouraging results. However, dynamic scenes with perspective distortions and rapidly changing spatial and temporal domains still present obstacles. The main reasons for this are the dynamic nature of a scene and the difficulty of representing and incorporating the feature space of objects of varying sizes into a prediction model. To overcome these issues, this paper proposes a parallel multi-size receptive field units framework that leverages the majority of the CNN layers’ features, so that the features of objects of all sizes are represented and participate in the model prediction. The proposed method utilizes features generated from lower to higher layers. As a result, different object scales can be handled at different framework depths, and various environmental densities can be estimated. However, including the vast majority of layer features in the prediction model has a number of negative effects on the prediction’s outcome. An asymmetric non-local attention module and a feature-map channel weighting module are proposed, respectively, to handle noise and background details and to re-weight each channel so that it is more sensitive to important features while ignoring irrelevant ones. While the output predictions of some layers have high bias and low variance, those of other layers have low bias and high variance. Using stack ensemble meta-learning, we combine individual predictions made with lower-layer features and higher-layer features to improve prediction while balancing the tradeoff between bias and variance. Extensive tests were conducted on the UCF CC 50 and ShanghaiTech datasets. The experimental results indicate that the proposed method is effective for dense distributions and objects of various sizes.

Article
Prediction of COVID-19 Cases Using Constructed Features by Grammatical Evolution
Symmetry 2022, 14(10), 2149; https://doi.org/10.3390/sym14102149 - 14 Oct 2022
Viewed by 807
Abstract
A widely used feature construction method based on grammatical evolution is used here to predict COVID-19 cases as well as the mortality rate. The method creates new artificial features from the original ones using a genetic algorithm guided by a BNF grammar. After the artificial features are generated, the original data set is modified based on these features, an artificial neural network is applied to the modified data, and the results are reported. The comparative experiments indicate that feature construction has an advantage over other machine-learning methods for predicting pandemic elements.

Article
A Novel Driver Abnormal Behavior Recognition and Analysis Strategy and Its Application in a Practical Vehicle
Symmetry 2022, 14(10), 1956; https://doi.org/10.3390/sym14101956 - 20 Sep 2022
Cited by 1 | Viewed by 1032
Abstract
In this work, a novel driver abnormal behavior analysis system based on practical facial landmark detection (PFLD) and you only look once version 5 (YOLOv5) was developed to recognize and analyze abnormal driver behaviors. First, a library for analyzing the abnormal behavior of vehicle drivers was designed, in which the factors that cause abnormal driver behavior were divided into three categories according to their behavioral characteristics: natural behavioral factors, unnatural behavioral factors, and passive behavioral factors. Then, different neural network models were established by representing the actual scenes of the three behaviors. Specifically, abnormal driver behavior caused by natural behavioral factors was identified by a PFLD neural network model based on facial key point detection, and abnormal driver behavior caused by unnatural and passive behavioral factors was identified by a YOLOv5 neural network model based on target detection. In addition, in a test of the driver abnormal behavior analysis system in an actual vehicle, the precision rate was greater than 95%, which meets the requirements of practical application.

Article
An Intelligent Vision-Based Tracking Method for Underground Human Using Infrared Videos
Symmetry 2022, 14(8), 1750; https://doi.org/10.3390/sym14081750 - 22 Aug 2022
Viewed by 903
Abstract
The underground mine environment is dangerous and harsh. Tracking and detecting humans based on computer vision is therefore of great significance for mine safety monitoring, and it also greatly facilitates the identification of humans using the symmetrical image features of human organs. However, existing methods have difficulty distinguishing humans accurately from the background, coping with unstable human appearance characteristics, and handling humans that are occluded or lost. For these reasons, an improved aberrance repressed correlation filter (IARCF) tracker for human tracking in underground mines based on infrared videos is proposed. First, the preprocessing operations of edge sharpening, contrast adjustment, and denoising are used to enhance the image features of the original videos. Second, the response map characteristics of peak shape and peak to side lobe ratio (PSLR) are analyzed to identify abnormal human locations in each frame, and image similarity is calculated over generated virtual tracking boxes to accurately relocate the human. Finally, using the value of PSLR and the highest peak point of the response map, the appearance model is adaptively updated to further improve the robustness of the tracker. Experimental results show that the average precision and success rate of the IARCF tracker in the five underground scenarios reach 0.8985 and 0.7183, respectively, with a marked improvement in human tracking in difficult scenes. The IARCF tracker can effectively track underground human targets, especially occluded humans in complex scenes.
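
As a rough illustration of the PSLR check mentioned in the abstract above, the sketch below computes a peak to side lobe ratio for a small response map. The abstract does not give the formula, so this uses the definition common in correlation-filter tracking (peak value minus side lobe mean, divided by side lobe standard deviation). The function name, the `exclude` window parameter, and the list-of-rows input are assumptions made for illustration and are not taken from the IARCF tracker.

```python
from statistics import mean, pstdev


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSLR of a 2-D response map given as a list of equal-length rows.

    `exclude` is a hypothetical parameter: cells within this many rows and
    columns of the peak are treated as the main lobe and left out of the
    side lobe statistics.
    """
    # Locate the highest peak of the response map.
    peak, peak_row, peak_col = max(
        (value, r, c)
        for r, row in enumerate(response)
        for c, value in enumerate(row)
    )
    # Side lobe: every cell outside the square window around the peak.
    sidelobe = [
        value
        for r, row in enumerate(response)
        for c, value in enumerate(row)
        if abs(r - peak_row) > exclude or abs(c - peak_col) > exclude
    ]
    if not sidelobe:
        raise ValueError("response map is too small for this exclusion window")
    sigma = pstdev(sidelobe)
    if sigma == 0:
        # A perfectly flat side lobe makes the ratio unbounded.
        return float("inf")
    return (peak - mean(sidelobe)) / sigma
```

A sharp, isolated peak yields a high ratio, while a low ratio suggests an unreliable frame (for example, occlusion), which a tracker could use as a cue to relocate the target or to skip the appearance model update.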

Article
Hypernetwork Representation Learning with Common Constraints of the Set and Translation
Symmetry 2022, 14(8), 1745; https://doi.org/10.3390/sym14081745 - 22 Aug 2022
Viewed by 747
Abstract
Unlike conventional networks, which have only pairwise relationships among the nodes, hypernetworks also contain complex tuple relationships, namely the hyperedges among the nodes. However, most existing network representation learning methods cannot effectively capture these complex tuple relationships. To resolve this challenge, this paper proposes a hypernetwork representation learning method with common constraints of the set and translation, abbreviated as HRST. The method incorporates both the hyperedge set associated with the nodes and, through the translation mechanism, the hyperedge regarded as the interaction relation among the nodes into the process of hypernetwork representation learning, in order to obtain node representation vectors rich in hypernetwork topology structure and hyperedge information. Experimental results on four hypernetwork datasets demonstrate that, for the node classification task, our method outperforms the other best baseline methods by about 1%. As for the link prediction task, our method is almost entirely superior to other baseline methods.

Article
Multi-Type Object Tracking Based on Residual Neural Network Model
Symmetry 2022, 14(8), 1689; https://doi.org/10.3390/sym14081689 - 15 Aug 2022
Cited by 7 | Viewed by 1072
Abstract
In this paper, a tracking algorithm based on the residual neural network model and machine learning is proposed. Compared with the widely used VGG network, the residual neural network has deeper feature layers and a special additional layer structure, which break the symmetry of the network and reduce the degradation of the neural network. The additional layer and convolution layer are used for feature fusion to represent the target. The developed algorithm can capture multiple features of the object, so that tracking accuracy can be improved in some complex scenarios. In addition, we define a new measure to calculate the similarity of different image regions and find the optimal matched region. The search area is delimited according to the continuity of the target motion, which improves the real-time performance of tracking. The experimental results illustrate that, in comparison with the Multi-Domain Network (MDNet) algorithm based on a convolutional neural network, the proposed algorithm achieves a higher accuracy while maintaining real-time performance, especially in complex scenarios such as deformation, rotation changes, and background clutter.

Article
Internal Similarity Network for Rejoining Oracle Bone Fragment Images
Symmetry 2022, 14(7), 1464; https://doi.org/10.3390/sym14071464 - 18 Jul 2022
Cited by 2 | Viewed by 934
Abstract
Rejoining oracle bone fragments plays an important role in studying the history and culture of the Shang dynasty through the characters inscribed on them. However, current computer vision technology has a low accuracy in judging whether the textures of oracle bone fragment image pairs can be put back together. When rejoining fragment images, the coordinate sequence and texture features of edge pixels from the original and target fragment images form a continuous symmetrical structure, so we put forward an internal similarity network (ISN) to rejoin fragment images automatically. First, an edge equidistant matching (EEM) algorithm is given to search for similar coordinate sequences of edge segment pairs on the fragment image contours and to locally match the edge coordinate sequence of an oracle bone fragment image. Then, a target mask-based method is designed to merge two images into a whole and to cut out a local region image along the locally matched edge. Next, we calculate a convolution feature gradient map (CFGM) of the local region image texture, and an internal similarity pooling (ISP) layer is proposed to compute the internal similarity of the convolution feature gradient map. Finally, ISN is used to evaluate a similarity score for the local region image texture and to determine whether two fragment images form a coherent whole. The experiments show that the correct judgement probability of ISN is higher than 90% in actual rejoining work and that our method found 37 pairs of correctly rejoined oracle bone fragment images that had not been discovered by archaeologists.

Article
CLHF-Net: A Channel-Level Hierarchical Feature Fusion Network for Remote Sensing Image Change Detection
Symmetry 2022, 14(6), 1138; https://doi.org/10.3390/sym14061138 - 01 Jun 2022
Cited by 3 | Viewed by 1265
Abstract
Remote sensing (RS) image change detection (CD) is the procedure of detecting the change regions that occur in the same area in different time periods. Many studies have extracted deep features and fused multi-scale features using convolutional neural networks and attention mechanisms to achieve better CD performance, but these methods do not adequately fuse feature pairs of the same scale or features of different layers. To solve this problem, a novel CD network with a symmetric structure, called the channel-level hierarchical feature fusion network (CLHF-Net), is proposed. First, a channel-split feature fusion module (CSFM) with a symmetric structure is proposed, which consists of three branches. The CSFM integrates the feature information of same-scale feature pairs more adequately and effectively solves the problem of insufficient communication between feature pairs. Second, an interaction guidance fusion module (IGFM) is designed to fuse the feature information of different layers more effectively. IGFM introduces the detailed information from shallow features into deep features and deep semantic information into shallow features, and the fused features have more complete feature information of change regions and clearer edge information. Compared with other methods, CLHF-Net improves the F1 scores by 1.03%, 2.50%, and 3.03% on three publicly available benchmark datasets: the season-varying, WHU-CD, and LEVIR-CD datasets, respectively. Experimental results show that the proposed CLHF-Net performs better than the other methods compared.

Article
Research on Prediction Method of Gear Pump Remaining Useful Life Based on DCAE and Bi-LSTM
Symmetry 2022, 14(6), 1111; https://doi.org/10.3390/sym14061111 - 28 May 2022
Cited by 6 | Viewed by 1443
Abstract
As a hydraulic pump is the power source of a hydraulic system, predicting its remaining useful life (RUL) can effectively improve the operating efficiency of the hydraulic system and reduce the incidence of failure. This paper presents a scheme for predicting the RUL of a hydraulic pump (gear pump) through a combination of a deep convolutional autoencoder (DCAE) and a bidirectional long short-term memory (Bi-LSTM) network. The vibration data were characterized by the DCAE, and a health indicator (HI) was constructed and modeled to determine the degradation state of the gear pump. The DCAE is a typical symmetric neural network, which can effectively extract characteristics from the data by using the symmetry of the encoding network and decoding network. After processing the original vibration data segment, health indicators were entered as labels into the RUL prediction model based on the Bi-LSTM network, and model training was carried out to achieve the RUL prediction of the gear pump. To verify the validity of the methodology, a gear pump accelerated life experiment was carried out, and whole life cycle data were obtained for method validation. The results show that the constructed HI can effectively characterize the degradation state of the gear pump, and the proposed RUL prediction method can effectively predict the degradation trend of the gear pump.

Article
A Semi-Supervised Semantic Segmentation Method for Blast-Hole Detection
Symmetry 2022, 14(4), 653; https://doi.org/10.3390/sym14040653 - 23 Mar 2022
Cited by 4 | Viewed by 1593
Abstract
The goal of blast-hole detection is to help place explosive charges into blast-holes. This process is full of challenges, because it requires the ability to extract sample features in complex environments and to detect a wide variety of blast-holes. Detection techniques based on deep learning with RGB-D semantic segmentation have emerged in recent years and achieved good results. However, implementing semantic segmentation based on deep learning usually requires a large amount of labeled data, which makes producing the dataset a large burden. To address the problem that very little training data is available for explosive charging equipment to detect blast-holes, this paper extends the core idea of semi-supervised learning to RGB-D semantic segmentation and devises an ERF-AC-PSPNet model based on a symmetric encoder–decoder structure. The model adds a residual connection layer and a dilated convolution layer for down-sampling, followed by an attention complementary module to acquire the feature maps, and uses a pyramid scene parsing network to achieve hole segmentation during decoding. A new semi-supervised learning method, based on pseudo-labeling and self-training, is proposed to train the model for intelligent detection of blast-holes. The designed pseudo-labeling is based on the HOG algorithm and depth data, and it proved effective in experiments. To verify the validity of the method, we carried out experiments on images of blast-holes collected at a mine site. Compared to previous segmentation methods, our method is less dependent on labeled data and achieved IoU values of 0.810, 0.867, 0.923, and 0.945 at labeling ratios of 1/8, 1/4, 1/2, and 1, respectively.
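
For readers unfamiliar with the IoU metric reported in the abstract above, the following minimal sketch shows how intersection over union is typically computed for a pair of binary segmentation masks. The function name and the list-of-rows mask representation are assumptions for illustration, and returning 1.0 when both masks are empty is one convention among several; none of this is taken from the paper's evaluation code.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks.

    Both masks are equal-shaped lists of rows holding 0/1 (or False/True).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # A pixel counts toward the intersection if both masks mark it,
            # and toward the union if at least one mask marks it.
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0
```

For example, a predicted mask that overlaps the ground truth on two pixels, with four pixels marked in at least one of the two masks, gives an IoU of 0.5.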