Article

A Practical Hybrid IoT Architecture with Deep Learning Technique for Healthcare and Security Applications

1 Faculty of International Training, Thai Nguyen University of Technology, 3/2 Street, Tich Luong Ward, Thai Nguyen 250000, Vietnam
2 Department of Mechanical Engineering, TUETECH University, 1B Street Dong Bam Ward, Thai Nguyen 250000, Vietnam
3 Industry 4.0 Implementation Center, National Taiwan University of Science and Technology, Taipei 106335, Taiwan
4 Department of Mechanical Engineering, Palestine Technical University—Kadoorie, Tulkarm P.O. Box 7, Palestine
5 Department of Mechanical Engineering, National Yang-Ming Chiao Tung University, Hsinchu 30010, Taiwan
6 Department of Electrical Engineering, College of Engineering, Taif University, Taif 21944, Saudi Arabia
7 Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 807618, Taiwan
8 Department of Electrical Engineering, Faculty of Engineering at Shoubra, Benha University, Cairo 11629, Egypt
* Authors to whom correspondence should be addressed.
Information 2023, 14(7), 379; https://doi.org/10.3390/info14070379
Submission received: 14 May 2023 / Revised: 20 June 2023 / Accepted: 27 June 2023 / Published: 3 July 2023
(This article belongs to the Special Issue Systems Engineering and Knowledge Management)

Abstract

Facial mask detection technology has become increasingly important even beyond the context of the COVID-19 pandemic. Along with advances in facial recognition technology, face mask detection has become a crucial feature for various applications. This paper introduces an Internet of Things (IoT) architecture based on the You Only Look Once (YOLO) deep learning algorithm to help keep society healthy and secure and to collect data for future research. The proposed design is built with economy in mind and is easy to implement, and the YOLOv4-tiny model it uses is one of the fastest object detection models available. A mask detection camera (MaskCam) that leverages the computing power of NVIDIA’s Jetson Nano edge device was built together with a smart camera application to detect a mask on the face of an individual. MaskCam distinguishes between people wearing masks, people not wearing masks, and people not wearing masks properly, and reports its results over the MQTT protocol. Furthermore, a self-developed web application accompanies the MaskCam system to collect and visualize statistics for qualitative and quantitative analysis. The practical results demonstrate the effectiveness of the proposed smart mask detection system. On the one hand, YOLOv4-full obtained the best results even at smaller resolutions, although its frame rate is too low for real-time use. On the other hand, YOLOv4-tiny is twice as fast as the other detection models while offering detection quality similar to MobileNetV2. Consequently, inferences may be run more frequently over the entire video sequence, resulting in more accurate output.

1. Introduction

Facial mask detection technology has become increasingly important. It is widely known and used as an effective tool for dealing with the COVID-19 pandemic. However, with advances in facial recognition technology, face mask detection has many other applications that extend far beyond the pandemic, such as preventing the spread of other contagious diseases, containing healthcare-associated infections (HAIs), ensuring personal protective equipment (PPE) compliance, protecting individuals from air pollution, enhancing security protocols, and reducing criminal activities. These applications are explored in more detail below.
Preventing the spread of various kinds of contagious diseases, including but not limited to COVID-19: Contagious diseases such as COVID-19, influenza, tuberculosis, and other respiratory infections can be transmitted through airborne droplets released when coughing or sneezing [1]. Facial mask detection technology can help identify individuals who are not following the mask-wearing protocol and advise them to wear masks in certain designated places, such as hospitals, and other areas, such as residential buildings, transportation hubs, and schools, that are currently at a high risk of contagious diseases [2]. This proactive approach helps to minimize the spread of contagious diseases, reducing public health risks and protecting vulnerable populations.
Containing healthcare-associated infections (HAIs): HAIs are a persistent challenge in clinical settings, posing significant risks to patient safety and care quality. Infection control measures, such as proper hand hygiene and appropriate personal protective equipment (PPE), are crucial in preventing cross-contamination and the spread of HAIs. By leveraging facial mask detection technology, healthcare facilities can monitor mask compliance among healthcare professionals, reinforcing a culture of compliance and a commitment to patient safety [3]. In addition, by combining facial recognition with mask detection, access can be granted only to those individuals who are both authorized and complying with safety regulations. Therefore, this technology can provide a more seamless and contactless entry process, reducing the need for manual verification and, at the same time, reducing the risk of transmitting germs.
Ensuring personal protective equipment (PPE) compliance: Apart from the case of infectious diseases, facial mask detection technology is valuable in ensuring PPE compliance in many other industries in which masks are mandatory for work safety. In sectors such as construction, mining, and chemical plants, where workers are exposed to hazardous conditions, masks and protective gear are essential for their protection [4]. Facial mask detection technology can be used to ensure that all employees adhere to proper PPE protocols and to minimize the risk of workplace-related illnesses or accidents. Cameras fitted in workspaces enable real-time monitoring, and alert systems can notify supervisors of PPE violations and help maintain a safe work environment [5].
Protecting individuals from air pollution: Air pollution is a severe global problem that affects the health of millions of people. In areas with high levels of pollution, such as industrial zones, densely populated cities, and construction sites, wearing a mask serves as a protective barrier against harmful airborne particles. Facial mask detection technology can thus be used to identify individuals who are not wearing masks and advise them to put one on once air quality reaches a hazardous threshold.
Enhancing security protocols: By integrating facial mask detection technology into existing security systems, businesses and organizations can strengthen their overall security protocols. The technology can help ensure that individuals entering their premises are not wearing masks, adding a layer of protection against potential security threats [6]. For example, in sensitive facilities such as power plants, military bases, and research laboratories, facial mask detection technology can enhance the security protocol by ensuring that individuals entering the area are not wearing masks or other disguises to conceal their identity. This can help protect sensitive information and assets from espionage or sabotage.
Reducing criminal activities: Facial mask detection technology can be a valuable deterrent against criminal activity. Because the system can detect individuals wearing masks, criminals may be more hesitant to use masks while committing crimes [7]. For example, facial mask detection technology can help boost security in financial institutions and at ATMs by preventing criminals from hiding their identity with masks.
The versatility of facial mask detection technology stems from its ability to analyze and monitor people and alert them when they should wear or not wear a mask in the designated places and situations mentioned above. Fortunately, artificial intelligence (AI) techniques, particularly deep learning algorithms, have flourished in recent decades. To improve throughput, efficiency, and accuracy, libraries such as Keras, OpenCV, and TensorFlow are commonly used together with the Python programming language. This gives researchers and engineers a powerful toolset for tackling the mask detection problem. However, in mobile environments where many people are walking and moving around, deep learning models can consume a tremendous amount of time and processing power. Therefore, a new technique is needed to address such problems.
Sethi et al. [8] present a face mask detector based on the MAFA (MAsked FAces) dataset and deep learning algorithms, achieving a relatively high mask detection accuracy of about 98.2%. Asif et al. [9] offer an alternative solution that combines a machine learning method with transfer learning. Other researchers have addressed the problem using computer vision, AI, or other methods. These studies are similar in that they all achieve high mask detection accuracy. However, they focus only on techniques that maximize accuracy. With the Internet of Things (IoT), the applicability of a product is determined not only by the accuracy of its operation but also by its connectivity across many different locations [10]. To facilitate research and analysis, collected data should be easy to extract, retrieve, and store. For these reasons, this paper proposes a MaskCam system that uses the You Only Look Once (YOLO) deep learning algorithm as its object detection method [11]. YOLO employs neural networks to detect objects in real time, and its main advantages over other algorithms are its speed and accuracy. These advantages make it applicable to a wide variety of applications, such as traffic signals and parking, among others. The object detection function of YOLO is implemented with convolutional neural networks (CNNs). Its distinguishing characteristic is that the prediction is made in a single run of the network. Furthermore, the algorithm has extensive learning capabilities, allowing it to learn generalizable representations of objects and then apply the trained model to object detection. The YOLO algorithm utilizes residual blocks, bounding box regression, and intersection over union (IOU) techniques.
In the residual blocks step, the image is first divided into a grid of cells, and objects that appear in each cell are detected. Bounding box regression identifies an object in the image by drawing an outline around it in each cell. Through a single bounding box regression, YOLO predicts the object’s height, width, and center, as well as its class. Intersection over union (IOU) measures how much two boxes overlap, and YOLO uses it to select the output box that best fits each object. An IOU value of 1 indicates that the predicted box is identical to the real (ground-truth) box. During this process, candidate boxes that overlap poorly with the real box are eliminated.
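The overlap measure described above can be made concrete with a short Python sketch. It is only an illustration and not the authors' code: the corner-based box format, the function names, and the 0.5 threshold are assumptions, and the greedy non-maximum suppression shown here is the common way of discarding duplicate boxes rather than a detail confirmed by the paper.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (0.0 to 1.0)."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Box], scores: List[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept
```

Two identical boxes give an IOU of 1, disjoint boxes give 0, and a lower-scoring box that almost coincides with a kept box is dropped.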
The YOLOv4-tiny model was used in this study. Among comparable object detectors, YOLO has long been the most popular option. In contrast to region-based detectors, which generate region proposals that are then sent to a classifier, it takes the complete image as input. This makes it significantly faster than traditional detectors. YOLOv4-tiny is a compressed version of YOLOv4 designed for less powerful machines [12,13]. The MQTT (MQ Telemetry Transport) protocol is used for communication between the data server and the detection device [14]. A detailed description of the hardware installation and a system overview are given in Section 2. A web-based GUI front end, built with the Streamlit framework, displays statistics and their relationships and sends MQTT messages to devices. Offices, schools, hospitals, or any place that requires people to wear a mask as part of an epidemic prevention program can benefit from the developed smart mask detection system. Launching the system is easy and inexpensive. Several measures were taken in this study to analyze the problem and identify infractions:
  • YOLOv4-tiny was used to measure masked-face detection accuracy on a custom-built dataset that blends several datasets.
  • Mask detection algorithms that provide high accuracy and frame rates were analyzed so that the system can operate in real time.
  • MQTT communication for IoT applications makes connecting devices to servers and data storage simpler and more convenient.
The remainder of the paper is organized as follows. Section 2 provides detailed explanations of the methodology. Section 3 describes the study results and discussion. Finally, Section 4 lays out the conclusions.

2. Methodology

This section presents the system overview, hardware installation, and software development.

2.1. System Overview

An intelligent mask-wearing surveillance system was developed using the NVIDIA Jetson Nano. The overall system is depicted in Figure 1. The system consists of two main parts:
(1) Device side: The intelligent camera is powered by an NVIDIA Jetson Nano, which acts as the brain of the camera. Using an optimized deep learning detection model, it captures video and detects whether or not people are wearing masks. The device side can comprise many devices installed at different surveillance locations.
(2) Server side: Data received from the device side are stored on the server, which acts as a data warehouse. The detection data are then presented on a dashboard for analysis. The server also handles the user’s commands and feedback and transmits them to the devices.

2.2. Hardware Installation

An NVIDIA Jetson Nano, a Logitech C270 HD Webcam, and an Intel Dual Band Wireless AC 8265 adapter were used in the hardware setup of the intelligent camera device. The total cost of these components is about 350 USD. Figure 2 illustrates the device in its fully assembled state. In addition, a cooling fan was installed on the NVIDIA board to prevent thermal throttling and maintain performance over long periods.

2.3. Software Development

2.3.1. Deep Mask Detection Model and Optimization

On the Jetson Nano device, YOLOv4-tiny object detection is applied to locate face bounding boxes and determine whether or not each face has a mask. The deep learning algorithm is implemented using OpenCV and the Python programming language. The detection model recognizes four categories: faces wearing masks, faces without masks, faces not visible, and faces with misplaced masks, as shown in Figure 3. Four public datasets, with approximately 6000 labels for each class, are used to train the model: Kaggle Medical Masks [15], MAFA [16], FDDB [17], and WIDER FACE [18]. Furthermore, the detections are tracked across the scene using an open-source object tracker named Norfair [19]. Every time a person walks in front of the camera, the algorithm follows their face’s bounding box as it moves within the scene. Rather than counting the same individual again in every frame, the algorithm counts each individual once. Once the detection confidence for a face has exceeded a certain threshold in several frames, a voting process determines whether or not the individual is wearing a mask. The same process applies when the face cannot be clearly seen. The final output is a count of how many individuals passed in front of the camera, together with the percentage of those individuals who were wearing a mask.
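The per-person voting step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the MaskCam implementation: the class name `TrackVoter`, the label strings, the confidence threshold, and the minimum number of votes are all hypothetical, and the track IDs are assumed to come from an external tracker such as Norfair.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional

# Hypothetical label names for the four classes described in the text.
LABELS = ("mask", "no_mask", "not_visible", "misplaced")


class TrackVoter:
    """Collects per-frame detections for each tracked person and decides
    one label per person by majority vote."""

    def __init__(self, min_confidence: float = 0.5, min_votes: int = 3) -> None:
        self.min_confidence = min_confidence  # assumed detection threshold
        self.min_votes = min_votes            # assumed frames needed to decide
        self._votes: Dict[int, Counter] = defaultdict(Counter)

    def add(self, track_id: int, label: str, confidence: float) -> None:
        """Register one frame's detection; low-confidence detections are ignored."""
        if confidence >= self.min_confidence:
            self._votes[track_id][label] += 1

    def decide(self, track_id: int) -> Optional[str]:
        """Return the most voted label, or None while evidence is insufficient."""
        votes = self._votes.get(track_id)
        if not votes or sum(votes.values()) < self.min_votes:
            return None
        return votes.most_common(1)[0][0]

    def summary(self) -> Dict[str, float]:
        """Count each decided person once and report the share wearing a mask."""
        decided = [d for d in (self.decide(t) for t in self._votes) if d is not None]
        total = len(decided)
        masked = decided.count("mask")
        pct = 100.0 * masked / total if total else 0.0
        return {"people": total, "with_mask": masked, "mask_percentage": pct}
```

Counting by track ID rather than by frame is what keeps a person who stays in view for many frames from being counted more than once.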
For the trained model to run as efficiently as possible on the resource-constrained device, it is converted to an optimized format, producing a TensorRT engine [20]. The model weights are reduced to 16-bit floating point (FP16) precision, which runs well on the Jetson Nano while maintaining reasonable accuracy. Additionally, the NVIDIA DeepStream SDK was used to accelerate the inference process on the NVIDIA GPU [21]. The processing pipeline is presented in Figure 4. A Python multiprocessing module can be used to handle the detection, video streaming, and MQTT communication processes required to stream the rendered video.
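To give a feel for what storing weights in 16-bit floating point costs in precision, the following sketch rounds Python floats to IEEE 754 half precision using only the standard library. It does not reproduce the TensorRT conversion itself; the helper names and the sample weights are illustrative assumptions.

```python
import struct
from typing import Iterable


def to_fp16(value: float) -> float:
    """Round a float to the nearest IEEE 754 half-precision value.

    struct's "e" format is binary16; values beyond the FP16 range
    (about 65504 in magnitude) raise OverflowError.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


def max_rounding_error(weights: Iterable[float]) -> float:
    """Largest absolute change introduced by storing the weights as FP16."""
    return max(abs(w - to_fp16(w)) for w in weights)


if __name__ == "__main__":
    sample = [0.1, -0.73, 0.005, 1.5]  # made-up weights for illustration
    print([to_fp16(w) for w in sample])
    print(max_rounding_error(sample))
```

Values that are exact in binary, such as 0.5, survive unchanged, while values such as 0.1 shift slightly, which is why accuracy usually degrades only a little at half precision.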

2.3.2. MQTT Broker and Webserver

To collect and visualize statistics, a separate server was implemented. This server can run on a separate machine that works alongside the Jetson Nano devices (for example, an AWS EC2 instance) [22]. The web-based GUI system receives statistics from the MaskCam system, stores them in a database, and displays them. It can also send MQTT commands directly to devices via its web interface.
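A device-to-server statistics message of the kind described here could be assembled as in the sketch below. The topic layout, field names, and function names are hypothetical, since the paper does not specify the message format. The actual network send is left to a `publish` callable supplied by the caller, which in practice might wrap the publish method of an MQTT client library.

```python
import json
from typing import Callable, Dict, Tuple


def build_stats_message(device_id: str, counts: Dict[str, int],
                        timestamp: float) -> Tuple[str, str]:
    """Return (topic, JSON payload) for one statistics report."""
    topic = f"maskcam/{device_id}/stats"  # hypothetical topic layout
    payload = json.dumps(
        {"device_id": device_id, "timestamp": timestamp, "counts": counts},
        sort_keys=True,
    )
    return topic, payload


def send_stats(publish: Callable[[str, str], None], device_id: str,
               counts: Dict[str, int], timestamp: float) -> None:
    """Build the message and hand it to the injected publish function."""
    topic, payload = build_stats_message(device_id, counts, timestamp)
    publish(topic, payload)
```

Injecting `publish` keeps the message format testable without a running broker.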
A web application was designed and developed using the Streamlit framework for the web-based GUI frontend. Streamlit shortens the development time of IoT dashboards by providing an easy-to-use GUI. The frontend web application displays statistics and their relationships and sends MQTT messages to devices. The dashboard interface consists of three main components:
(1) Device selection: A device is selected to view its recorded data.
(2) Filters: A date/time range is selected to determine which data are visualized.
(3) Reported statistics: The statistical data gathered from the selected device during the specified period are analyzed and visualized.
On the backend, PostgreSQL was selected to store statistical and device information. Compared with MySQL, PostgreSQL is better suited to systems that execute complex queries or that perform data warehousing and analysis. The RESTful APIs for the web application were developed in Python using the FastAPI framework. FastAPI supports asynchronous programming and can be served with Uvicorn and Gunicorn. Furthermore, the backend module runs an MQTT subscriber task that reads all messages from the devices and records them in the database.
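The store-then-filter flow behind the dashboard can be sketched as below. To keep the example self-contained it uses an in-memory SQLite database from the Python standard library as a stand-in for PostgreSQL; the table layout, column names, and function names are assumptions for illustration only.

```python
import sqlite3
from typing import Optional


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical statistics table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stats ("
        "device_id TEXT, ts TEXT, with_mask INTEGER, without_mask INTEGER)"
    )
    return conn


def record(conn: sqlite3.Connection, device_id: str, ts: str,
           with_mask: int, without_mask: int) -> None:
    """Store one report; ts is an ISO 8601 string so text order is time order."""
    conn.execute("INSERT INTO stats VALUES (?, ?, ?, ?)",
                 (device_id, ts, with_mask, without_mask))


def mask_percentage(conn: sqlite3.Connection, device_id: str,
                    start: str, end: str) -> Optional[float]:
    """Percentage of mask wearers for one device within [start, end]."""
    masked, total = conn.execute(
        "SELECT SUM(with_mask), SUM(with_mask + without_mask) FROM stats "
        "WHERE device_id = ? AND ts BETWEEN ? AND ?",
        (device_id, start, end),
    ).fetchone()
    if not total:  # no rows in range (SUM returns NULL) or zero people
        return None
    return 100.0 * masked / total
```

The device and date/time arguments mirror the device selection and filter components of the dashboard.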

3. Results and Discussions

This section presents the implementation of the IoT architecture with the deep learning technique, which relies on a YOLOv4-tiny face mask detection model implemented in TensorRT and optimized with DeepStream. The model was trained on a combination of four public datasets (Kaggle Medical Masks, MAFA, FDDB, and WIDER FACE) with approximately 6000 labels for each object class. The dataset includes four object classes: face with mask, face without mask, face not visible, and misplaced mask. With the integrated C270 HD Webcam, video can be captured at 30 FPS with a resolution of 1280 × 720. Various object detection models were compared, including MobileNetV2, the full version of YOLOv4, and the tiny variant of YOLOv4. The models were trained at different input resolutions and optimized with TensorRT before being benchmarked on the same reference videos. The comparison between these models can be found in Table 1. Although YOLOv4-full obtained the best results even at smaller resolutions, its frame rate is too low for real-time use. The quality of detection is similar between YOLOv4-tiny and MobileNetV2, but YOLOv4-tiny is significantly faster, i.e., twice as fast as the other detection models. Consequently, inferences can be run more frequently over the whole video sequence, leading to better results.
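The link between model speed and how often inference can run on the video stream can be illustrated with some simple arithmetic. The sketch below is not taken from the paper: the function names are hypothetical, and the frame rates used with it are placeholders rather than the values reported in Table 1.

```python
import math


def inference_stride(camera_fps: float, model_fps: float) -> int:
    """Run the detector on every stride-th frame so it keeps up with the camera."""
    if camera_fps <= 0 or model_fps <= 0:
        raise ValueError("frame rates must be positive")
    return max(1, math.ceil(camera_fps / model_fps))


def inferences_per_person(seconds_visible: float, camera_fps: float,
                          model_fps: float) -> int:
    """How many detections a person in view for seconds_visible receives."""
    stride = inference_stride(camera_fps, model_fps)
    return int(seconds_visible * camera_fps) // stride
```

With a 30 FPS camera, a placeholder model running at 15 FPS sees every second frame, whereas one running at 7 FPS sees only every fifth frame. A person visible for two seconds would therefore receive 30 versus 12 detections, which gives any per-person voting step more evidence to work with.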
As shown in Figure 5, MaskCam detected the people in the picture regardless of their movements. Each detected object is given a number for clear presentation and analysis. Objects 9 and 11 correspond to people who were not wearing masks or not wearing them properly, while object 8 is classified as not visible. Figure 6 and Figure 7 demonstrate how statistical detection data can be plotted using the web-based dashboard. In addition to reporting the number of people who pass through the surveillance area and whether or not they are wearing masks, the dashboard reports the percentage of people wearing masks.
In comparison with other studies, Goyal et al. [23] compared the results for different models and found different levels of accuracy, as illustrated in Figure 8. Because those authors used different hardware from that used in the current study, a significant difference in accuracy is to be expected. As shown in Figure 9, the proposed system showed outstanding accuracy, a shorter processing time, and the smallest model size, at about 98%, 8.95 s, and 33 MB, respectively. The results of the study by Das et al. [24] indicated 96.96% accuracy for one dataset and 94.58% accuracy for a second dataset when using different models to determine whether or not an image contained a mask, as depicted in Figure 9. According to the authors, the variety of masks and the presence of multiple faces in an image are believed to be the main reasons for the difference in accuracy. Therefore, when the current model is used with masks that differ from those in its training data, the accuracy would differ slightly. Upscaling can be carried out on low-quality images for detection and classification [25,26]. Furthermore, CNNs can achieve higher accuracy when image quality is improved [27]. The CNN-based model proposed by Kaur et al. [28] proceeds by first recognizing the face and then evaluating whether or not the face has been covered. Figure 10 shows that our model functions somewhat similarly.
A similar algorithm was developed by Bhuiyan et al. [29] to identify whether or not the individual being monitored is wearing a mask. The performance of the custom-trained model was enhanced with data augmentation [30]. The performance of different models is compared using the model metrics defined in Table 2. Figure 10 illustrates the comparison of different models studied by Ullah et al. [31]; such algorithms could be used to improve the current model. Compared with other models, the current model has demonstrated impressive results and can be implemented in real-world scenarios.
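Comparisons of this kind usually rest on a handful of counts per class. Since Table 2 is not reproduced in this text, the sketch below assumes the conventional definitions of accuracy, precision, recall, and F1 score; the class and function names are hypothetical, and the paper's own metric definitions may differ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Confusion:
    """Counts for one class, e.g. 'face with mask' versus everything else."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def metrics(c: Confusion) -> Dict[str, float]:
    """Standard classification metrics; ratios with a zero denominator are 0.0."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (c.tp + c.tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Reporting precision and recall alongside accuracy matters for mask detection because the classes are rarely balanced in real footage.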
The algorithm was able to detect people wearing masks, people not wearing masks, people not wearing masks properly, and people whose faces are not visible in the image frame. Therefore, this facial mask detection technology is very versatile and can be employed for different purposes. For example, in places such as hospitals and transportation hubs, or during a pandemic when there is a high risk of spreading contagious diseases and wearing masks is mandatory, this technology helps to identify individuals who are not following the mask-wearing protocol and then advise them to wear masks. Conversely, in restricted-access and sensitive places such as banks, military bases, and power plants, where people are asked not to wear masks when entering, this technology helps to identify those who are wearing masks and inform them to take off their masks so that their identity can be checked, thus enhancing security. Generally, there are instructions for individuals to follow on whether or not they need to wear masks in particular situations, and when they do not follow those instructions, the facial mask detection technology in our study helps inform them to comply with those instructions.

4. Conclusions

This study proposes a new cost-effective mask detection solution based on the Internet of Things and deep learning to support the many applications of facial mask detection technology. Indoor measurement was the primary focus of this study. In addition to running on inexpensive hardware, the pipeline detects masks with high efficiency. Further, the Jetson Nano was successfully used in practice to deploy deep learning models using the proposed method. Nevertheless, there is still room to improve the accuracy of such a neural-network-based product and the implementation of its deep learning algorithms. Lastly, a web application was developed for data visualization and analysis. The present smart mask detection system showed outstanding accuracy, a shorter processing time, and the smallest model size, at about 98%, 8.95 s, and 33 MB, respectively, compared with the other models considered.
Future work will incorporate state-of-the-art detection techniques (YOLOv5, motion detection) to improve accuracy for individuals at a distance while still running well on edge devices. A potential benefit of this work is that it can be extended to other applications by building more accurate models. These include self-driving cars, traffic signals, parking, facial recognition, robotics, and medical industries that use these technologies to detect objects in a variety of scenarios.

Author Contributions

Conceptualization, M.E. and M.-Q.T.; Data curation, M.E. and M.-Q.T.; Formal analysis, M.E., M.-Q.T., V.Q.V., M.A. and M.K.; Investigation, M.-Q.T., V.Q.V., M.A. and M.K.; Methodology, M.E. and M.-Q.T.; Resources, M.E.; Software, M.-Q.T.; Supervision, M.E., M.A. and S.S.M.G.; Validation, V.Q.V., M.A., M.K. and S.S.M.G.; Visualization, V.Q.V., M.A. and M.K.; Writing—original draft, V.Q.V., M.A., M.K., M.E. and M.-Q.T.; Writing—review and editing, M.-Q.T., M.A., M.E., V.Q.V. and S.S.M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Thai Nguyen University of Technology (TNUT), Vietnam.

Data Availability Statement

Not applicable.

Acknowledgments

This work was financially supported by Thai Nguyen University of Technology (TNUT). The researchers would like to acknowledge the Deanship of Scientific Research, Taif University, Palestine Technical University, and TUETECH University for supporting this work, and also acknowledge Nguyen T. T. Phuc, Dai-Dong Nguyen, and Thanh-Tung Vo from Taiwan Tech for their support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liang, M.; Gao, L.; Cheng, C.; Zhou, Q.; Uy, J.P.; Heiner, K.; Sun, C. Efficacy of face mask in preventing respiratory virus transmission: A systematic review and meta-analysis. Travel Med. Infect. Dis. 2020, 36, 101751. [Google Scholar] [CrossRef] [PubMed]
  2. Leung, N.H.; Chu, D.K.; Shiu, E.Y.; Chan, K.H.; McDevitt, J.J.; Hau, B.J.; Yen, H.L.; Li, Y.; Ip, D.K.; Peiris, J.S.; et al. Respiratory virus shedding in exhaled breath and efficacy of face masks. Nat. Med. 2020, 26, 676–680. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Rong, R.; Lin, L.; Yang, Y.; Zhao, S.; Guo, R.; Ye, J.; Zhu, X.; Wen, Q.; Liu, D. Trending prevalence of healthcare-associated infections in a tertiary hospital in China during the COVID-19 pandemic. BMC Infect. Dis. 2023, 23, 41. [Google Scholar] [CrossRef]
  4. Chen, S.; Demachi, K. A Vision-Based Approach for Ensuring Proper Use of Personal Protective Equipment (PPE) in Decommissioning of Fukushima Daiichi Nuclear Power Station. Appl. Sci. 2020, 10, 5129. [Google Scholar] [CrossRef]
  5. Bhing, N.W.; Sebastian, P. Personal Protective Equipment Detection with Live Camera. In Proceedings of the 2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), Kuala Terengganu, Malaysia, 13–15 September 2021; pp. 221–226. [Google Scholar]
  6. Mahmoud, H.A.H.; Mengash, H.A. A novel technique for automated concealed face detection in surveillance videos. Pers. Ubiquitous Comput. 2021, 25, 129–140. [Google Scholar] [CrossRef]
  7. Kumar, A. A cascaded deep-learning-based model for face mask detection. Data Technol. Appl. 2023, 57, 84–107. [Google Scholar] [CrossRef]
  8. Sethi, S.; Kathuria, M.; Kaushik, T. Face mask detection using deep learning: An approach to reduce risk of corona-virus spread. J. Biomed. Inform. 2021, 120, 103848. [Google Scholar] [CrossRef] [PubMed]
  9. Asif, S.; Yi, W.; Tao, Y.; Si, J.; Amjad, K. Real time face mask detection system using transfer learning with machine learning method in the era of COVID-19 pandemic. In Proceedings of the 2021 4th International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 28–31 May 2021. [Google Scholar]
  10. Petrović, N.; Kocić, Đ. Iot-Based System for COVID-19 Indoor Safety Monitoring; IcETRAN Belgrade: Belgrade, Serbia, 2020. [Google Scholar]
  11. Liu, C.; Tao, Y.; Liang, J.; Li, K.; Chen, Y. Object detection based on YOLO network. In Proceedings of the 2018 IEEE 4th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 14–16 December 2018. [Google Scholar]
  12. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. Available online: https://towardsdatascience.com/yolo-you-only-look-once-real-time-object-detection-explained-492dc9230006 (accessed on 8 April 2022).
  13. Jiang, Z.; Zhao, L.; Li, S.; Jia, Y. Real-time object detection method for embedded devices. Comput. Vis. Pattern Recognit. 2020, 14, 4244. [Google Scholar]
  14. Benitez Baltazar, V.H.; Pacheco, J.; Moreno, J.; De Nuñez, C. Autonomic Face Mask Detection with Deep Learning: An IoT Application. Revista Mexicana de Ingenieria Biomedica 2021, 42, 160–170. [Google Scholar]
  15. Evan Danilovich. (2020 March). Medical Masks Dataset. Version 1. Available online: https://www.kaggle.com/ivandanilovich/medical-masks-dataset (accessed on 14 May 2020).
  16. Ge, S.; Li, J.; Ye, Q.; Luo, Z. Detecting Masked Faces in the Wild with LLE-CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2682–2690. [Google Scholar]
  17. Jain, V.; Learned-Miller, E. FDDB: A Benchmark for Face Detection in Unconstrained Settings; Technical Report UM-CS-2010-009; Dept. of Computer Science, University of Massachusetts: Amherst, MA, USA, 2010. [Google Scholar]
  18. Yang, S.; Luo, P.; Loy, C.C.; Tang, X. WIDER FACE: A Face Detection Benchmark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
  19. Alori, J.; Descoins, A.; Ríos, B.; Castro, A. Tryolabs/norfair: V0.3.1. Available online: https://zenodo.org/record/5146254 (accessed on 29 July 2021).
  20. Al Ghadani, A.K.A.; Mateen, W.; Ramaswamy, R.G. Tensor-based cuda optimization for ann inferencing using parallel acceleration on embedded gpu. Artif. Intell. Appl. Innov. 2020, 583, 291. [Google Scholar]
  21. Stepanenko, S.; Yakimov, P. Using high-performance deep learning platform to accelerate object detection. In Proceedings of the International Conference on Information Technology and Nanotechnology, Samara, Russia, 21–24 May 2019. [Google Scholar]
  22. Rizvi, S.R.; Killough, B.; Cherry, A.; Gowda, S. Lessons learned and cost analysis of hosting a full stack Open Data Cube (ODC) application on the Amazon Web Services (AWS). In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018. [Google Scholar]
  23. Goyal, H.; Sidana, K.; Singh, C.; Jain, A.; Jindal, S. A real time face mask detection system using convolutional neural network. Multimed. Tool Appl. 2022, 81, 14999–15015. [Google Scholar] [CrossRef] [PubMed]
  24. Das, A.; Ansari, M.W.; Basak, R. COVID-19 Face Mask Detection Using TensorFlow, Keras and OpenCV. In Proceedings of the 2020 IEEE 17th India Council International Conference, New Delhi, India, 10–13 December 2020. [Google Scholar]
  25. Zhang, F.; Yang, F.; Li, C.; Yuan, G. CMNet: A Connect-and-Merge Convolutional Neural Network for Fast Vehicle Detection in Urban Traffic Surveillance. IEEE Access 2019, 7, 72660–72671. [Google Scholar] [CrossRef]
  26. Hao, S.; Wang, W.; Ye, Y.; Li, E.; Bruzzone, L. A Deep Network Architecture for Super-Resolution-Aided Hyperspectral Image Classification with Classwise Loss. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4650–4663. [Google Scholar] [CrossRef]
  27. Qin, B.; Li, D. Identifying Facemask-Wearing Condition Using Super-Resolution with Classification Network to Prevent COVID-19. Sensors 2020, 20, 5236. [Google Scholar] [CrossRef] [PubMed]
  28. Kaur, G.; Sinha, R.; Tiwari, P.K.; Yadav, S.K.; Pandey, P.; Raj, R.; Vashisth, A.; Rakhra, M. Face mask recognition system using CNN model. Neurosci. Inform. 2021, 2, 100035. [Google Scholar] [CrossRef] [PubMed]
  29. Bhuiyan, M.R.; Khushbu, S.A.; Islam, M.S. A deep learning-based assistive system to classify COVID-19 face mask for human safety with YOLOv3. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 1–3 July 2020; pp. 1–5. [Google Scholar]
  30. Ullah, N.; Javed, A.; Ghazanfar, M.A.; Alsufyani, A.; Bourouis, S. A novel DeepMaskNet model for face mask detection and masked facial recognition. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 9905–9914. [Google Scholar] [CrossRef]
  31. Mata, B.U.; Bhavya, S.; Ashitha, S. Face mask detection using convolutional neural network. J. Nat. Rem. 2021, 12, 14–19. [Google Scholar]
Figure 1. The pipeline of the intelligent mask-wearing surveillance system.
Figure 2. Assembled Jetson-Nano-based smart camera device.
Figure 3. Detection strategy of the proposed system.
Figure 4. The mask face detector and tracker run as a DeepStream pipeline.
Figure 5. Real-time results of the system.
Figure 6. Dashboard interface.
Figure 7. An example of a dashboard statistic.
Figure 8. Comparison to other models in terms of accuracy, times, and size.
Figure 9. Epochs vs. accuracy.
Figure 10. Comparison to other models in terms of accuracy, precision, recall, and F1 score.
Table 1. Performance of different detection models on the Jetson Nano with TensorRT optimization.
Model | YOLOv4-Full | YOLOv4-Tiny | MobileNetV2
Input resolution | 608 × 608 | 1024 × 608 | 1024 × 608
FPS | 2.5 | 14 | 6
Accuracy (mAP) | 89.01% | 84.5% | 86.12%
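To make the speed figures in Table 1 easier to compare, the short Python sketch below converts frames per second into per-frame latency and into relative speed-ups. It assumes the FPS row reads 2.5, 14, and 6 for the three models; the dictionary and function names are illustrative only and are not part of the MaskCam software.

```python
# Illustrative arithmetic on the FPS row of Table 1.
# Assumption: the FPS values are 2.5, 14 and 6 (YOLOv4-full, YOLOv4-tiny, MobileNetV2).
# All names below are hypothetical helpers, not part of the MaskCam code base.

FPS = {
    "YOLOv4-full": 2.5,
    "YOLOv4-tiny": 14.0,
    "MobileNetV2": 6.0,
}


def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def speedup(fps_a: float, fps_b: float) -> float:
    """How many times faster model A is than model B."""
    if fps_b <= 0:
        raise ValueError("fps_b must be positive")
    return fps_a / fps_b


if __name__ == "__main__":
    for name, fps in FPS.items():
        print(f"{name}: {frame_time_ms(fps):.1f} ms per frame")
    tiny = FPS["YOLOv4-tiny"]
    print(f"tiny vs MobileNetV2: {speedup(tiny, FPS['MobileNetV2']):.2f}x")
    print(f"tiny vs YOLOv4-full: {speedup(tiny, FPS['YOLOv4-full']):.2f}x")
```

Under these assumed values, YOLOv4-tiny needs roughly 71 ms per frame, against about 167 ms for MobileNetV2 and 400 ms for YOLOv4-full, which is consistent with the statement that it is at least twice as fast as the other models.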
Table 2. Model evaluation metrics.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$\text{F1\_score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$
TP: true positive; TN: true negative; FP: false positive; FN: false negative.
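As a worked illustration of Equations (1) to (4), the following Python sketch computes the four metrics from raw confusion counts. It is a minimal example rather than the evaluation code used in the paper: the class name, the function names, and the sample counts are hypothetical, and returning 0.0 for an empty denominator is a convention chosen here, not one stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion counts for one class (hypothetical container for this example)."""
    tp: int  # true positives
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    # Assumed convention: an empty denominator yields 0.0 instead of an error.
    return numerator / denominator if denominator else 0.0


def precision(c: ConfusionCounts) -> float:
    """Equation (1): TP / (TP + FP)."""
    return _ratio(c.tp, c.tp + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Equation (2): (TP + TN) / (TP + FN + TN + FP)."""
    return _ratio(c.tp + c.tn, c.tp + c.fn + c.tn + c.fp)


def recall(c: ConfusionCounts) -> float:
    """Equation (3): TP / (TP + FN)."""
    return _ratio(c.tp, c.tp + c.fn)


def f1_score(c: ConfusionCounts) -> float:
    """Equation (4): 2 * precision * recall / (precision + recall)."""
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)


if __name__ == "__main__":
    # Made-up counts, for illustration only (not results from the paper).
    counts = ConfusionCounts(tp=80, tn=90, fp=10, fn=20)
    print(f"precision = {precision(counts):.4f}")
    print(f"accuracy  = {accuracy(counts):.4f}")
    print(f"recall    = {recall(counts):.4f}")
    print(f"f1_score  = {f1_score(counts):.4f}")
```

For a three-class detector such as the one described here (mask, no mask, mask worn incorrectly), these counts would typically be gathered per class in a one-vs-rest fashion and then averaged; whether the authors did so is not stated in this section.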
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Vu, V.Q.; Tran, M.-Q.; Amer, M.; Khatiwada, M.; Ghoneim, S.S.M.; Elsisi, M. A Practical Hybrid IoT Architecture with Deep Learning Technique for Healthcare and Security Applications. Information 2023, 14, 379. https://doi.org/10.3390/info14070379

