AI for Computational Vision, Natural Language Processing, and Geoinformatics

Zheng, Wenfeng; Liu, Mingzhe; Li, Kenan; Liu, Xuan

doi:10.3390/app132413276

Open Access Editorial

¹ School of Automation, University of Electronic Science and Technology of China, Chengdu 610054, China

² College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu 610059, China

³ Department of Epidemiology and Biostatistics, College for Public Health and Social Justice, Saint Louis University, St. Louis, MO 63103, USA

⁴ School of Public Affairs and Administration, University of Electronic Science and Technology of China, Chengdu 610054, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(24), 13276; https://doi.org/10.3390/app132413276

Submission received: 28 November 2023 / Accepted: 14 December 2023 / Published: 15 December 2023

(This article belongs to the Special Issue AI for Computational Vision, Natural Language Processing, and Geoinformatics)

The rapid development of artificial intelligence technology has had a major impact on computer vision, natural language processing, and geographic information applications. Semantic reasoning enables machines to perform tasks akin to human intelligence, thereby enhancing human–machine interactions and decision-making processes. This progress spans various tasks, including text classification [1], named entity recognition [2], machine translation [3], and machine reading comprehension [4]. Language modeling is one of the important research tasks in natural language processing and has been studied extensively. Google’s BERT model [5] has achieved excellent performance in multiple downstream tasks such as machine reading comprehension, and a large number of natural language processing models based on pre-trained models have subsequently been proposed. However, given the complexity of machine reading comprehension problems that require logical reasoning, the performance of pre-trained language models on these tasks remains suboptimal.

Numerous studies have shown that modeling logical structure on top of pre-trained models is one of the most effective ways to enhance logical reasoning ability. These methods include introducing symbolic logic into neural network models [6] and using graph structures for logical reasoning [4]. Article [7] proposed a discourse graph construction method that uses punctuation and explicit connectives for node segmentation and utilizes positional encoding. It also presents a discourse graph attention network based on a multi-head attention mechanism. This network adaptively gathers information from adjacent nodes using attention weight coefficients and simulates varying levels of attention to each condition during the inference process.
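The idea of gathering information from adjacent nodes through attention weight coefficients can be illustrated with a minimal single-head sketch. This is not the network from article [7]: the scaled dot-product scoring, the function names, and the toy node features are all assumptions chosen for brevity, and a real graph attention layer would add learned projections and several heads.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(features, neighbors, node):
    """Return (weights, aggregated vector) for one node.

    features:  dict mapping node name -> feature vector (list of floats)
    neighbors: dict mapping node name -> list of adjacent node names
    Scoring is a scaled dot product (an assumption, not the method of [7]).
    """
    query = features[node]
    nbrs = neighbors[node]
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, features[n])) / scale
        for n in nbrs
    ]
    weights = softmax(scores)  # attention weight coefficients, sum to 1
    aggregated = [
        sum(w * features[n][d] for w, n in zip(weights, nbrs))
        for d in range(len(query))
    ]
    return weights, aggregated

# Hypothetical toy graph: node "a" is adjacent to "b" and "c".
toy_features = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
toy_neighbors = {"a": ["b", "c"]}
toy_weights, toy_output = attention_aggregate(toy_features, toy_neighbors, "a")
```

In this toy graph, neighbor "b" has the same features as "a" and therefore receives the larger weight, which is the sense in which the aggregation pays varying levels of attention to each neighbor.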

Another popular task in natural language processing is text classification. After Google proposed the BERT model in 2018, a large number of novel text feature extraction methods were proposed. These include disease diagnosis methods based on crop electronic medical records [8], pre-trained models using Chinese whole-word masking strategies [9], the recognition of protein–protein interactions from biomedical texts [10], customer comment analysis models based on BERTopic [11,12,13], and Arabic satirical article classification using artificial intelligence methods [14,15]. Article [16] developed an event coreference resolution system for Arabic and proposed a scheme for annotating Arabic event coreferences, providing key support for developing advanced coreference resolution systems. In article [17], the authors compiled a large dataset of Arabic satirical news articles and constructed satirical article classification models using machine learning (ML), deep learning (DL), and transformer models. In article [18], the authors proposed the HTMC-PGT framework for the single-path hierarchical multi-label classification problem in poverty governance. This framework reduces the HMTC problem to training and combining multi-class classifiers in a classifier tree, providing an alternative to traditional methods. Article [19] utilized clustering algorithms and topic modeling techniques to automatically extract consumer intentions from comment data and compared their performance with traditional methods. This study helps to understand consumer sentiment more accurately.
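The reduction of single-path hierarchical classification to multi-class classifiers arranged in a tree can be sketched as follows. This is only an illustration of the general idea, not the framework of article [18]: the label names, the keyword rules standing in for trained classifiers, and the helper `predict_path` are all hypothetical.

```python
from typing import Callable, Dict, List

# Each internal node of the tree owns one multi-class classifier
# that maps a text to exactly one of the node's child labels.
Classifier = Callable[[str], str]

def predict_path(text: str,
                 classifiers: Dict[str, Classifier],
                 root: str = "root") -> List[str]:
    """Walk from the root to a leaf, choosing one child per level."""
    path: List[str] = []
    node = root
    while node in classifiers:  # leaves have no classifier
        node = classifiers[node](text)
        path.append(node)
    return path

# Hypothetical keyword rules used in place of trained models.
def root_classifier(text: str) -> str:
    return "education" if "school" in text else "health"

def education_classifier(text: str) -> str:
    return "funding" if "subsidy" in text else "facilities"

toy_classifiers: Dict[str, Classifier] = {
    "root": root_classifier,
    "education": education_classifier,
}

example_path = predict_path("school subsidy for rural students", toy_classifiers)
```

Because each level commits to a single child, the prediction is always one root-to-leaf path, which matches the single-path setting described above.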

Artificial intelligence techniques have been extensively integrated into computer vision tasks, with key areas including object detection [20,21,22,23], image classification [24,25], and medical image processing [26,27,28,29]. Many recent object detection algorithms are based on YOLO and have been applied in various settings, such as citrus orchards [30], driver distraction [31], ship detection [32], and steel plate defect detection [33]. Article [34] introduces a lightweight mask detection algorithm called ECGYOLO, based on an improved YOLOv7-tiny. This algorithm replaces the ELAN module with an ECG module and introduces an ECA mechanism in the neck section, which allows it to meet the real-time and lightweight requirements of mask detection. Article [35] proposes an improved method based on YOLOv8 for accurate recognition of small targets in remote sensing images. The strided convolution module in YOLOv8s is replaced with the SPD-Conv module, and the path aggregation network is replaced with the SPANet structure. The results show that the algorithm significantly improves recognition accuracy.
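To give a sense of why replacing strided convolution can help with small targets, the sketch below shows a space-to-depth rearrangement, assuming that SPD-Conv refers to a space-to-depth step followed by a non-strided convolution. Only the rearrangement is shown; the function name and the nested-list image layout (height x width x channels) are assumptions for illustration and are not taken from article [35].

```python
def space_to_depth(image, block=2):
    """Move each block x block spatial patch into the channel dimension.

    image: nested list of shape H x W x C
    returns: nested list of shape (H/block) x (W/block) x (C*block*block)
    No pixel is discarded, unlike a stride-2 convolution or pooling step.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for i in range(0, height, block):
        row = []
        for j in range(0, width, block):
            cell = []
            for di in range(block):
                for dj in range(block):
                    cell.extend(image[i + di][j + dj])
            row.append(cell)
        out.append(row)
    return out

# Toy 4 x 4 single-channel image holding the values 0..15.
toy_image = [[[r * 4 + c] for c in range(4)] for r in range(4)]
toy_downsampled = space_to_depth(toy_image)
```

The output has half the spatial resolution but four times as many channels, so fine detail is kept for the following convolution instead of being skipped over.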

In the field of image classification, article [36] proposes a new pooling operation, integrates it into attention blocks, and applies extension operations and pointwise convolution in the channel direction. This method significantly improved the accuracy of image classification on ImageNet. In the field of medical image processing, article [37] applied the ViT model to the recognition and localization of malignant tumors. The authors proposed an improved ViT architecture (ViT-patch) by adding a shared MLP head to the output of each patch token. This method provided more task-related supervisory information, improved the generalization ability of the ViT model, and optimized feature learning at a deep level.

The field of sports also presents intriguing applications. In article [38], the authors applied graph theory from computer networks to propose a passing network for evaluating football team performance. The authors used the ratio of the average clustering coefficient to the average centrality as an overall network indicator of team coordination. Their results show that this indicator helps to explain a team’s coordination level and has reference value for evaluating the competitiveness of football teams.
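The ratio of average clustering coefficient to average centrality can be computed on a small example as follows. This is a simplified sketch, not the procedure of article [38]: it assumes an undirected, unweighted passing graph and uses degree centrality, whereas the article may use directed or weighted passes and a different centrality measure. The function names and the four-player graph are hypothetical.

```python
from itertools import combinations

def average_clustering(adj):
    """Mean local clustering coefficient; adj maps node -> set of neighbors."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # coefficient is 0 with fewer than two neighbors
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_degree_centrality(adj):
    """Mean of degree / (n - 1) over all nodes (assumed centrality measure)."""
    n = len(adj)
    return sum(len(nbrs) / (n - 1) for nbrs in adj.values()) / n

def coordination_indicator(adj):
    """Ratio of average clustering coefficient to average centrality."""
    return average_clustering(adj) / average_degree_centrality(adj)

# Hypothetical passing graph: A, B, C pass among themselves; D only with C.
toy_passes = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
toy_indicator = coordination_indicator(toy_passes)
```

For this graph the average clustering coefficient is 7/12 and the average degree centrality is 2/3, so the indicator evaluates to 0.875.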

This Special Issue also includes an article in the field of data security. There have been many studies on image steganography based on deep learning [39,40]. Article [41] proposed several ideas for implementing data hiding in WebP images, including format-based methods and data-based methods. The authors also proposed a container selection technique that takes advantage of the available WebP compression parameters. They tested three applications based on these methods, demonstrating their effectiveness.

In summary, the emergence of artificial intelligence has profoundly impacted various fields such as computer vision and natural language processing. As researchers continue to delve deeper into artificial intelligence, new opportunities and applications will emerge, and the ability of intelligent systems to solve complex problems will be further enhanced.

Author Contributions

Conceptualization, W.Z., M.L., K.L. and X.L.; writing—original draft preparation, W.Z. and K.L.; writing—review and editing, W.Z., M.L., K.L. and X.L. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Sajid, N.A.; Rahman, A.; Ahmad, M.; Musleh, D.; Basheer Ahmed, M.I.; Alassaf, R.; Chabani, S.; Ahmed, M.S.; Salam, A.A.; AlKhulaifi, D. Single vs. Multi-Label: The Issues, Challenges and Insights of Contemporary Classification Schemes. Appl. Sci. 2023, 13, 6804.
2. Zhang, J.; Liu, L.; Gao, K.; Hu, D. Few-shot Class-incremental Pill Recognition. arXiv 2023, arXiv:2304.11959.
3. Chen, G.; Ma, S.; Chen, Y.; Zhang, D.; Pan, J.; Wang, W.; Wei, F. Towards making the most of cross-lingual transfer for zero-shot neural machine translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 142–157.
4. Li, X.; Cheng, G.; Chen, Z.; Sun, W.; Qu, Y. AdaLoGN: Adaptive logic graph network for reasoning-based machine reading comprehension. arXiv 2022, arXiv:2203.08992.
5. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
6. Jiao, F.; Guo, Y.; Song, X.; Nie, L. MERIt: Meta-path guided contrastive learning for logical reasoning. arXiv 2022, arXiv:2203.00357.
7. Wu, M.; Sun, T.; Wang, Z.; Duan, J. DaGATN: A Type of Machine Reading Comprehension Based on Discourse-Apperceptive Graph Attention Networks. Appl. Sci. 2023, 13, 12156.
8. Ding, J.; Li, B.; Xu, C.; Qiao, Y.; Zhang, L. Diagnosing crop diseases based on domain-adaptive pre-training BERT of electronic medical records. Appl. Intell. 2023, 53, 15979–15992.
9. Cui, Y.; Che, W.; Liu, T.; Qin, B.; Yang, Z. Pre-training with whole word masking for Chinese BERT. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 3504–3514.
10. Rehana, H.; Çam, N.B.; Basmaci, M.; Zheng, J.; Jemiyo, C.; He, Y.; Özgür, A.; Hur, J. Evaluation of GPT and BERT-based models on identifying protein-protein interactions in biomedical text. arXiv 2023, arXiv:2303.17728.
11. Uncovska, M.; Freitag, B.; Meister, S.; Fehring, L. Rating analysis and BERTopic modeling of consumer versus regulated mHealth app reviews in Germany. NPJ Digit. Med. 2023, 6, 115.
12. Alhaj, F.; Al-Haj, A.; Sharieh, A.; Jabri, R. Improving Arabic cognitive distortion classification in Twitter using BERTopic. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 854–860.
13. Ji, Y.; Ma, Y. The robust maximum expert consensus model with risk aversion. Inf. Fusion 2023, 99, 101866.
14. Rahma, A.; Azab, S.S.; Mohammed, A. A Comprehensive Review on Arabic Sarcasm Detection: Approaches, Challenges and Future Trends. IEEE Access 2023, 11, 18261–18280.
15. Himdi, H.; Weir, G.; Assiri, F.; Al-Barhamtoshy, H. Arabic fake news detection based on textual analysis. Arab. J. Sci. Eng. 2022, 47, 10453–10469.
16. Aldawsari, M.; Kolhar, M.; Dawood Omer, O.S. Within-Document Arabic Event Coreference: Challenges, Datasets, Approaches and Future Direction. Appl. Sci. 2023, 13, 11004.
17. Assiri, F.; Himdi, H. Comprehensive Study of Arabic Satirical Article Classification. Appl. Sci. 2023, 13, 10616.
18. Wang, X.; Guo, L. Multi-Label Classification of Chinese Rural Poverty Governance Texts Based on XLNet and Bi-LSTM Fused Hierarchical Attention Mechanism. Appl. Sci. 2023, 13, 7377.
19. An, Y.; Oh, H.; Lee, J. Marketing Insights from Reviews Using Topic Modeling with BERTopic and Deep Clustering Network. Appl. Sci. 2023, 13, 9443.
20. Fan, D.P.; Ji, G.P.; Cheng, M.M.; Shao, L. Concealed object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6024–6042.
21. Qi, G.; Zhang, Y.; Wang, K.; Mazur, N.; Liu, Y.; Malaviya, D. Small object detection method based on adaptive spatial parallel convolution and fast multi-scale fusion. Remote Sens. 2022, 14, 420.
22. Liu, Y.; Li, Q.; Yuan, Y.; Du, Q.; Wang, Q. ABNet: Adaptive balanced network for multiscale object detection in remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5614914.
23. Zhang, C.; Lam, K.M.; Wang, Q. Cof-net: A progressive coarse-to-fine framework for object detection in remote-sensing imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5600617.
24. Yuan, L.; Hou, Q.; Jiang, Z.; Feng, J.; Yan, S. Volo: Vision outlooker for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 6575–6586.
25. Dai, Z.; Liu, H.; Le, Q.V.; Tan, M. Coatnet: Marrying convolution and attention for all data sizes. Adv. Neural Inf. Process. Syst. 2021, 34, 3965–3977.
26. Zheng, W.; Yang, B.; Xiao, Y.; Tian, J.; Liu, S.; Yin, L. Low-Dose CT Image Post-Processing Based on Learn-Type Sparse Transform. Sensors 2022, 22, 2883.
27. Xu, S.; Yang, B.; Xu, C.; Tian, J.; Liu, Y.; Yin, L.; Liu, S.; Zheng, W.; Liu, C. Sparse Angle CBCT Reconstruction Based on Guided Image Filtering. Front. Oncol. 2022, 12, 832037.
28. Hatamizadeh, A.; Tang, Y.; Nath, V.; Yang, D.; Myronenko, A.; Landman, B.; Roth, H.R.; Xu, D. UNETR: Transformers for 3D Medical Image Segmentation. In Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 4–8 January 2022; pp. 1748–1758.
29. Song, L.; Liu, G.; Ma, M. TD-Net: Unsupervised medical image registration network based on Transformer and CNN. Appl. Intell. 2022, 52, 18201–18209.
30. Chen, J.; Liu, H.; Zhang, Y.; Zhang, D.; Ouyang, H.; Chen, X. A Multiscale Lightweight and Efficient Model Based on YOLOv7: Applied to Citrus Orchard. Plants 2022, 11, 3260.
31. Liu, S.; Wang, Y.; Yu, Q.; Liu, H.; Peng, Z. CEAM-YOLOv7: Improved YOLOv7 Based on Channel Expansion and Attention Mechanism for Driver Distraction Behavior Detection. IEEE Access 2022, 10, 129116–129124.
32. Liu, Y.; Wang, X. SAR Ship Detection Based on Improved YOLOv7-Tiny. In Proceedings of the 2022 IEEE 8th International Conference on Computer and Communications (ICCC), Chengdu, China, 9–12 December 2022; pp. 2166–2170.
33. Wang, C.; Sun, M.; Cao, Y.; He, K.; Zhang, B.; Cao, Z.; Wang, M. Lightweight Network-Based Surface Defect Detection Method for Steel Plates. Sustainability 2023, 15, 3733.
34. Hu, W.; Zou, J.; Huang, Y.; Wang, H.; Zhao, K.; Liu, M.; Liu, S. ECGYOLO: Mask Detection Algorithm. Appl. Sci. 2023, 13, 7501.
35. Ma, M.; Pang, H. SP-YOLOv8s: An Improved YOLOv8s Model for Remote Sensing Image Tiny Object Detection. Appl. Sci. 2023, 13, 8161.
36. Chen, C.; Zhang, H. Attention Block Based on Binary Pooling. Appl. Sci. 2023, 13, 10012.
37. Feng, H.; Yang, B.; Wang, J.; Liu, M.; Yin, L.; Zheng, W.; Yin, Z.; Liu, C. Identifying malignant breast ultrasound images using ViT-patch. Appl. Sci. 2023, 13, 3489.
38. Zhou, W.; Yu, G.; You, S.; Wang, Z. An Improved Passing Network for Evaluating Football Team Performance. Appl. Sci. 2023, 13, 845.
39. Xintao, D.; Jia, K.; Li, B.; Guo, D.; Zhang, E.; Qin, C. Reversible Image Steganography Scheme Based on a U-Net Structure. IEEE Access 2019, 7, 9314–9323.
40. Bi, X.; Yang, X.; Wang, C.; Liu, J. High-Capacity Image Steganography Algorithm Based on Image Style Transfer. Secur. Commun. Netw. 2021, 2021, 4179340.
41. Koptyra, K.; Ogiela, M.R. An Efficient Steganographic Protocol for WebP Files. Appl. Sci. 2023, 13, 12404.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
