End-to-End Dataset Collection System for Sport Activities

Fresta, Matteo; Bellotti, Francesco; Capello, Alessio; Dabbous, Ali; Lazzaroni, Luca; Ansovini, Flavio; Berta, Riccardo

doi:10.3390/electronics13071286

by Matteo Fresta *, Francesco Bellotti, Alessio Capello, Ali Dabbous, Luca Lazzaroni, Flavio Ansovini and Riccardo Berta

Department of Electrical, Electronic and Telecommunication Engineering (DITEN), University of Genoa, Via Opera Pia 11A, 16145 Genoa, Italy

* Author to whom correspondence should be addressed.

Electronics 2024, 13(7), 1286; https://doi.org/10.3390/electronics13071286

Submission received: 15 February 2024 / Revised: 19 March 2024 / Accepted: 28 March 2024 / Published: 29 March 2024

(This article belongs to the Special Issue Recent Advancements in Embedded Computing)

Abstract

Datasets are key to developing new machine learning-based applications but are very costly to prepare, which hinders research and development in the field. We propose an edge-to-cloud end-to-end system architecture optimized for sport activity recognition dataset collection and application deployment. Tests in authentic contexts of use in four different sports have revealed the system’s ability to effectively collect machine learning-usable data, with an energy consumption compatible with the timeframe of most of the sport types. The proposed architecture relies on a key feature of the Measurify internet of things framework for the management of measurement data (i.e., .csv dataset management) and supports a workflow designed for efficient data labeling of signal timeseries. The architecture is independent of any specific sport, and a new dataset generation application can be set up in a few days, even by novice developers. With a view to concretely supporting the R&D community, our work is released open-source.

Keywords:

wearable devices; edge-to-cloud architecture; IoT; sport activity recognition (SAR); embedded systems; microcontrollers; inertial measurement units (IMUs); convolutional neural networks (CNNs)

1. Introduction

The Internet of Things (IoT) is also spreading in the world of sports, influencing how people engage in physical activities and interact with the environment [1]. IoT devices, in fact, are able to take accurate measurements, perform real-time computations, and send information to the fog (i.e., personal devices) and/or the cloud to enhance people’s performance and fitness.

Particularly, inertial sensor and fog/cloud-based data technologies have enabled a comprehensive evaluation of athlete performance, ever more frequently leveraging machine learning (ML) techniques for efficient extraction of information in real-time and/or processing historical data [2].

As significant modules and applications are being developed for sport activity recognition (SAR), the literature shows that a lot of development resources are needed to deploy a complete, usable solution (e.g., baseball [3], golf [4], running [5]). This, in fact, requires understanding, selecting, and integrating different technologies concerning sensors, embedded hardware and software, and communication protocols.

However, we argue that these should be treated as commodities easily tweakable to meet critical requirements (typically in terms of power efficiency, battery life, computing capability, memory footprint, overall performance, low cost, versatility, and user simplicity), while the application designer should focus on the added value provided by the SAR-specific ML modules.

Thus, we propose an IoT system architecture designed and optimized for end-to-end collection of SAR data, specifically supporting supervised ML dataset creation. Our focus stems from the common observation that datasets are key to research and development (R&D) (e.g., [6]) but are very costly to prepare, especially for new applications [7]. The architecture is generic (i.e., sport-agnostic), and its deployment in a sport-specific application should not require a high level of technical expertise on the developer’s part.

With a view to concretely supporting the R&D community, both in building SAR datasets and deploying relevant applications, our work is released open-source. In this paper, we focus on addressing some crucial research questions that have been raised by the above-stated aims and requirements:

RQ1: Is it possible to develop an end-to-end versatile system architecture for efficiently and easily collecting data on activities from different sports and supporting efficient ML dataset creation?
RQ2: Can devices with limited resources and low cost, such as an Arduino board, provide reliable results in ultra-low-power and low-energy contexts?
RQ3: Is it possible to develop a compact system that can be easily integrated in various sport contexts, either on the human body or on sport instruments, without compromising the quality of the collected data or disturbing regular sport activity?

The remainder of the paper is organized as follows: Section 2 provides an overview of the related work, while Section 3 presents the proposed system architecture. Section 4 studies some real-world applications of the system, highlighting its versatility, and Section 5 draws the final conclusions and outlines some possible directions for future research.

2. Related Works

Recent advancements in wearable technology and computational intelligence have significantly transformed performance analysis in sports science. SAR systems have been proposed in the literature (e.g., [8,9,10]) to efficiently collect data and analyze sport sessions, providing valuable tools to measure and improve athlete performance.

One of the approaches highlighted in the literature is computer vision (CV) for athlete detection and movement analysis. Nadeem et al. [11] proposed a human posture estimation system that identifies human behaviors through silhouette detection, using a maximum entropy Markov model to recognize actions with accuracy rates of around 90%. Tabish et al. [12] utilized a convolutional neural network (CNN)-based approach to recognize 20 distinct actions performed during cricket matches, obtaining 99.85% accuracy using VGG19 [13] and K-Means clustering [14]. Similarly, in [15], a VGG16-based [13] hockey activity recognition system is presented for recognizing four possible classes, namely free hit, goal, penalty corner, and long corner. The resulting accuracy is 98%.

While CV-based methods provide coaches and athletes with useful analyses, they are constrained by their operational environments. Quality cameras are costly, require monitoring by other users, and must capture all participants within their field of view. These factors can affect the accuracy and efficacy of the measurements and analyses. In addition, performing ML or deep learning (DL) on field devices (e.g., microcontrollers) is problematic because of the limited resources available (computational capabilities, memory size, battery duration).

An emerging alternative to CV is the use of inertial sensing technology, which involves wearable devices equipped with accelerometers, gyroscopes, and magnetometers. This approach overcomes the limitations of CV-based systems and allows for working with smaller data sizes [16]. By processing inertial signals, it is possible to determine the orientation, speed, and path of movement of an athlete. Literature proposes systems for monitoring and analyzing sports activities using wearable inertial-sensing technology in a variety of sports, including running [5], football [17], tennis [18,19], table tennis [20], baseball [3], golf [4], basketball [21], and volleyball [22]. The key benefits of such inertial-sensing-based SAR systems are their affordability, portability, compactness, and efficiency in power usage [23].

We argue that the research landscape in the area is now mature for addressing the challenge of designing a versatile system architecture able to generically handle multiple sports by gathering sensor outputs and storing them in the cloud for a posteriori analysis or ML model training. Hsu et al. [9] propose a SAR system that can recognize eleven different sports with 99.55% accuracy. However, the system does not target the recognition of different types of actions within a single sport. Additionally, multiple sensors are used, which can cause discomfort for the athlete and increase the risk of device damage, especially in contact sports.

To promote athlete comfort and reduce the risk of breakage, we propose a system that utilizes a single wearable device. To effectively classify movements across various sports, the position of the wearable device has to be adaptable based on the specific body part most relevant to the sport being considered, as failure to do so may result in poor recognition [24]. Since embedded edge devices are at risk of breakage, they are not suited for local storage, and it is thus useful to have an auxiliary personal device (i.e., a fog device) close to the data source and a cloud infrastructure to which data can be shipped for later analysis. Perumal et al. [25] propose a system that can collect real-time data using IMU sensors and transmit it to an IoT Edge Server via Wi-Fi in a smart home setting. A web-based dashboard is also made available for data inspection. The authors particularly emphasize the importance of data collection, standalone systems, and dashboards to monitor results.

Testoni et al. [26] propose an end-to-end architecture for activity recognition based on smartphone device signals that relies on a MongoDB [27] cloud database. Balkhi et al. [28] propose a multipurpose wearable device for automatic weight detection, activity type recognition, and repetition count in weightlifting training. Other state-of-the-art end-to-end architectures for multi-sport activity dataset collection using wearable devices that directly communicate with the computer are presented in [29,30,31]. Our previous work, Edgine [32], is a generic end-to-end open-source architecture to collect IoT data [19], but it does not specifically facilitate ML as it lacks the possibility of labeling and visualizing recorded data.

3. System Architecture

In order to address the challenge of building a framework for efficient collection of multiple-sport activity data, we propose a system architecture consisting of three main layers: a wearable edge device (ED), a personal fog device (FD), and the cloud. The ED is a microcontroller-based device that collects the raw sensor signals, while the FD serves as the primary storage interface. The FD is typically connected to the ED through a short-range wireless connection (e.g., Bluetooth, Wi-Fi) to avoid latency issues and any line or Internet connection failures. The FD can also be used to configure the system (e.g., in terms of sampling rate and labels of the sport activities to be recognized). During data collection, the ED continuously sends raw data to the FD, which also processes it for subsequent transmission to the cloud. This data delivery step, which is typically performed after the sports session, requires an Internet connection and uses RESTful (Representational State Transfer) APIs over HTTPS. Finally, the cloud database stores and organizes the data for easy retrieval and analysis. The proposed workflow is summarized in Figure 1.

In the following subsections, we explain and discuss the design choices and the challenges we faced to meet the requirements for the ED in terms of performance, power, affordability, versatility, and compactness of size. Such requirements, collected by informally interviewing relevant sports experts, are synthesized as follows:

Operative range: >150 m, based on the standard soccer field’s diagonal length;
Battery duration: >4 h, longer than most match/session durations;
Size: <50 mm × 50 mm × 30 mm, to allow the device to be placed without hindering the sport activity;
Wearable device: the device should be easily attachable to body/equipment parts such as the wrist, waist, leg, handle, etc.;
Ease of use: users who are deeply engaged in a sport activity cannot be distracted nor are they supposed to have particular knowledge about devices and systems.
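As a quick sanity check of the operative-range requirement, the sketch below computes the diagonal of a soccer field. The dimensions are assumptions that are not stated in the text: 120 m × 90 m is taken as the largest pitch size allowed by the Laws of the Game, and 105 m × 68 m as a commonly used size.

```python
import math


def field_diagonal_m(length_m: float, width_m: float) -> float:
    """Diagonal of a rectangular field, i.e., the longest straight distance on it."""
    return math.hypot(length_m, width_m)


# Assumed dimensions in meters (not taken from the paper).
LARGEST_PITCH = (120.0, 90.0)   # assumed maximum touchline and goal-line lengths
COMMON_PITCH = (105.0, 68.0)    # assumed commonly used size

largest_diagonal = field_diagonal_m(*LARGEST_PITCH)
common_diagonal = field_diagonal_m(*COMMON_PITCH)
print(round(largest_diagonal, 1), round(common_diagonal, 1))
```

Under these assumptions, the diagonals are about 150 m and 125 m, so a range above 150 m covers the worst case in which the ED and the FD sit at opposite corners.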

3.1. Edge

The edge layer incorporates the device that records and preprocesses physical signals before transmitting them to the fog layer. In the system design, we evaluated different options for communication, sensor, microcontroller, and power supply, considering the target application, as discussed in the following.

3.1.1. Communication

For the choice of the most suitable communication technology for our architecture, we identified three options: GSM, Wi-Fi, and BLE. Each one presents specific advantages and trade-offs, particularly in terms of power efficiency and operational range. Table 1 provides an overview with respect to our aims. We assessed the Sim900 as a GSM module [33], the Nina W102 as a Wi-Fi module [34], and the Nina B306 as a Bluetooth module [35], as they are commonly used in IoT applications due to their affordability and reliability.

The comparison highlights the strengths and weaknesses of each communication type, leading to different possible applications for each one:

GSM: This option is more expensive and demands a higher power supply. It is best suited for environmental data collection applications with a powerful edge carrying out the activities of the fog layer as well, thus also transmitting data to the cloud. It offers wide coverage, making it ideal for utilization in remote areas. For these reasons, GSM in wearables is typically used for safety and health systems [36,37], which require prompt action in cases of serious need, thus bypassing the fog layer;
Wi-Fi: This communication system is preferred for applications where the size of the board is relevant and a power source is available. It utilizes a fog layer as a bridge between the device and the database, without the possibility of controlling data recording. Wi-Fi is suitable for indoor environments or areas where connectivity is stable and reliable [38].
BLE: This solution fits applications that require both a compact device size and low power consumption. It needs a fog layer to relay data to the cloud server. This layer can also serve as a human–computer interface with the ED, enabling dynamic functionality changes remotely. BLE is ideal for wearable devices or scenarios where energy efficiency and user inputs are priorities [39].

In the process of collecting sports activity data, the requirements for power efficiency, affordability, and compactness make BLE the optimal connectivity option [40]. This solution ensures an operating range that can entirely cover a sports field while maintaining low power consumption, which is essential for continuous operation throughout a whole match. Moreover, BLE is particularly suited for applications requiring the ongoing transmission of numerous small data packets, so this choice optimizes both performance and reliability.

3.1.2. Sensor and Microcontroller

The type of deployable sensors strictly depends on the objective of data collection. For our SAR task, we consider inertial sensors, which have proven to be reliable and efficient for sports activities [41,42]. Given the size constraint of the wearable device, it is necessary to have sensors and a connectivity module already present on board. We discarded the option of creating a custom board, which could have optimized efficiency and compactness but would have raised issues in terms of cost and open-source hardware architecture replicability. Thus, we chose to use a commercially available board and adapt it to our use case, with some modifications in terms of power supply, as discussed in the next subsection.

Particularly, we opted for an Arduino Nano family board [43], given its large developer community, online support, comprehensive documentation, and compact size. Considering the choices related to sensors and the BLE module, we could have selected either the Arduino Nano 33 BLE or the Arduino Nano 33 BLE Sense. The only difference between these two boards is the wider variety of sensors (e.g., light, humidity, temperature, and pressure) on the latter. We built our prototype using the Arduino Nano 33 BLE Sense, but the developed architecture supports both models.

The firmware running on the board has been designed to collect and transmit data from the sensors to the fog layer via its built-in BLE connectivity module. This system provides 9-axis IMU readings (i.e., accelerometer, gyroscope, and magnetometer) over BLE. The BLE protocol limits messages to 20 bytes, so to transmit the array of IMU values, each reading is sent as a 16-bit signed integer (int16_t). This conversion preserves the full resolution of the 16-bit sensors and ensures no data loss.
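The packet size argument can be illustrated with a minimal sketch: nine 16-bit signed integers occupy 18 bytes, which fits within the 20-byte message limit mentioned above. The byte order (little-endian) and the axis ordering are assumptions made for illustration, not details taken from the firmware, and the sample values are made up.

```python
import struct

BLE_PAYLOAD_LIMIT = 20   # bytes per message, as stated in the text
PACKET_FORMAT = "<9h"    # nine int16 values; little-endian is an assumption


def pack_imu_sample(values):
    """Serialize 9 raw IMU readings (3 sensors x 3 axes) into a byte string."""
    if len(values) != 9:
        raise ValueError("expected 9 readings (3 sensors x 3 axes)")
    # struct.pack raises struct.error if a value does not fit in an int16.
    payload = struct.pack(PACKET_FORMAT, *values)
    if len(payload) > BLE_PAYLOAD_LIMIT:
        raise ValueError("payload exceeds the BLE message limit")
    return payload


def unpack_imu_sample(payload):
    """Inverse of pack_imu_sample, as a receiving device could decode a packet."""
    return list(struct.unpack(PACKET_FORMAT, payload))


# Made-up raw counts: accelerometer, gyroscope, magnetometer (x, y, z each).
sample = [120, -340, 16384, 15, -7, 3, -2100, 455, 32767]
packet = pack_imu_sample(sample)
print(len(packet))                           # 18
print(unpack_imu_sample(packet) == sample)   # True
```

Because the readings are sent as the same integer type in which they are produced, the round trip returns exactly the input values.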

3.1.3. Power Supply

The type of power supply on the board is strictly dependent on the application:

Battery-less: it requires additional hardware to harvest energy from the environment [44] (i.e., antennas, solar panels, etc.). This solution would add significant complexity to the design and is not the focus of our research but could be considered at a later stage.
External battery: this option exploits the USB connection of the Arduino board, allowing the user to connect an external battery. It adds weight to the device and could increase its size.
Embedded battery: this option requires the development of a dedicated board to connect a small battery to the device. While this solution does result in a slight increase in the device’s size and weight, it retains its overall compact design.

Considering the SAR task, we opted for an embedded battery. Given the lack on the market of battery shields tailored to the Arduino Nano’s compactness, we designed and developed a dedicated board to equip the Arduino Nano 33 BLE Sense with a battery, enabling complete standalone operation.

The battery shield is designed to deliver power to the Arduino board through a rechargeable battery and enables the battery’s recharging when the Arduino board is connected to an external power source. The specific battery employed is a commonly available button-cell Lithium-Ion (LIR2450) with a nominal capacity of 120 mAh.

The battery shield circuit consists of the battery charger, a step-up converter that raises the battery voltage from 3.6 V to 5 V (the level indicated by the Arduino specifications), and the connection to the Arduino board. The step-up circuit is represented in Figure 2. S1 (top of the picture) is the switch that turns the ED connection to the battery on and off. IC1 (TLV6112) is the integrated circuit implementing the step-up. IC2 (74LVC1GD4Z) is an inverter used to disable IC1 and enable the battery charger IC3 (MCP73831) (Figure 3). In this last configuration, the battery is recharging and the Arduino is powered from its built-in USB port. Two LEDs notify the user about the current status: D3 indicates that the battery is recharging, and D1 shows that 5 V is supplied by the step-up converter.

Figure 4 finally shows the connections between the Arduino pins and the battery shield. Pin 12 (VUSB) forwards the power from the Arduino’s USB port to IC3 to re-charge the battery. When the USB is not connected, on the other hand, pin 15 (VIN) is connected to the output of the IC1 to supply the microcontroller with 5 V. The D2 Zener diode has been placed to block any possible current flow back from the board (VIN) to IC1 during a re-charge. The picture clearly shows that several Arduino pins are left unused and freely employable to connect sensors or other outboard devices (e.g., Real Time Clock), depending on the specific application needs.

Our final ED prototype is sized 48 mm × 26 mm × 12 mm, maintaining the compact size of the original Arduino board, which is 45 mm × 18 mm × 3 mm. A render of the ED is shown in Figure 5.

3.2. Fog

The fog layer serves as the intermediary system tier, typically utilizing personal devices like smartphones and tablets to receive BLE communications and forward data to the cloud through HTTPS REST APIs. To support data collection operations, we designed and developed a smartphone app we called “Smart Collector”. We avoided limitations related to the operating system in use (e.g., iOS, Android, Windows) by developing the app with Flutter [45], a well-established cross-platform development framework. Flutter allows developers to build responsive and reliable applications with essential device-level functionalities, such as Bluetooth and Internet connectivity, through dedicated packages, namely quick_blue and connectivity_plus.

The Smart Collector allows users to configure the ED by setting the sampling rate. Moreover, it allows users to define the labels of the specific sport actions to be recognized. Finally, it is used to start and stop the recording process. Upon stopping the recording, the app transmits the data (i.e., signal timeseries and their corresponding labels) to a cloud database. Data transmission to the cloud occurs over an HTTPS connection using a RESTful API POST method. Typically, data are sent in either JSON or CSV format [46]. To reduce the amount of data transferred over the internet, the Smart Collector exploits a compact CSV format [47]. CSV is particularly effective when the structure and size of the data remain consistent across samples, which is the typical case of datasets.
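The size advantage of CSV for fixed-structure samples can be seen with a small comparison. This is only an illustrative sketch: the column names and the sample values are hypothetical, and the actual payload layout used by the Smart Collector is not reproduced here.

```python
import csv
import io
import json

# Hypothetical column names for one 9-axis sample.
COLUMNS = ["ax", "ay", "az", "gx", "gy", "gz", "mx", "my", "mz"]


def to_json(samples):
    """One JSON object per sample: the field names are repeated every time."""
    return json.dumps([dict(zip(COLUMNS, s)) for s in samples])


def to_csv(samples):
    """One header row, then one comma-separated row per sample."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(samples)
    return buffer.getvalue()


# 100 made-up samples (10 s of recording at 10 Hz).
samples = [[i, -i, 2 * i, i + 1, i + 2, i + 3, 100 - i, 7, -7] for i in range(100)]

json_size = len(to_json(samples).encode("utf-8"))
csv_size = len(to_csv(samples).encode("utf-8"))
print(json_size, csv_size)
```

In the JSON encoding, every sample carries its own copy of the field names, whereas the CSV encoding states them once in the header, so the CSV payload is smaller whenever all samples share the same fields.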

Figure 6 shows the application page in which users can select the sport activity and the type of sport action that is going to be recorded (i.e., the class label of the sample in the dataset) and optionally specify the name of the time series.

During the recording, the user can check the number of collected samples in the bottom-left corner of the app (Figure 6). The Smart Collector will try to send data to the database as soon as the recording is stopped; in the case of a missing or unstable internet connection, the data are stored locally, and automatic upload attempts are executed once a stable internet connection is re-established.
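The store-and-forward behavior described above can be sketched as a small queue that keeps recordings locally until an upload succeeds. This is not the Smart Collector implementation (which is a Flutter app); the class name, the `send` callback, and the file names are hypothetical and serve only to illustrate the logic.

```python
from collections import deque


class UploadQueue:
    """Keeps recordings locally until the uploader reports success."""

    def __init__(self, send):
        self._send = send          # callable(recording) -> bool, True on success
        self._pending = deque()    # recordings waiting for a working connection

    def submit(self, recording):
        """Called when a recording stops: try to upload, else keep it locally."""
        self._pending.append(recording)
        return self.flush()

    def flush(self):
        """Upload pending recordings in order; stop at the first failure."""
        while self._pending:
            if not self._send(self._pending[0]):
                break              # still offline: retry on the next flush
            self._pending.popleft()
        return len(self._pending)  # number of recordings still waiting


# Usage with a fake uploader that is offline at first.
link = {"online": False}
uploaded = []


def fake_send(recording):
    if link["online"]:
        uploaded.append(recording)
    return link["online"]


queue = UploadQueue(fake_send)
waiting_after_first = queue.submit("forehand_01.csv")    # stays queued
waiting_after_second = queue.submit("backhand_01.csv")   # stays queued
link["online"] = True                                    # connectivity is back
remaining = queue.flush()
print(waiting_after_second, remaining, uploaded)
```

Removing a recording from the queue only after a successful send means that a failed attempt never discards data; the trade-off is that one recording that keeps failing blocks the ones behind it.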

While dataset creation is a distinctive functionality of the overall system, the FD can also work in a pure data collection mode, sending the collected samples to an ML inference service that can run in the cloud or directly on the FD.

3.3. Cloud

The cloud layer is the framework to store data received from the fog layer. For generic data collection, the framework should support time series, be efficient, and be capable of ingesting a large amount of data from the FD. A user management system is also required, to guarantee data security and user privacy.

The framework we chose is Measurify [48], an open-source, cloud-based, abstract, measurement-oriented framework designed to manage intelligent objects in IoT ecosystems [49,50].

Measurify exposes a RESTful API interface and supports machine-to-machine and user-to-machine communication, facilitating the exchange of large quantities of data in JSON and CSV formats [51].

Users are authenticated through a JSON Web Token (JWT) that Measurify issues after a successful login; EDs are authenticated by a distinct token provided upon device definition. The device token offers limited access to database resources but has an unlimited lifetime, which eliminates the need for repeated login requests to renew the token at the beginning of each data collection session. Furthermore, Measurify aggregates data sent from various sources and provides a time series visualization dashboard (Figure 7).
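A long-lived device token means that an upload can be prepared without any preliminary login step. The sketch below only builds such a request and does not send it. The host, the route, the header convention, and the token value are placeholders chosen for illustration; they are assumptions and do not document the actual Measurify API.

```python
import urllib.request


def build_upload_request(base_url, token, csv_body):
    """Prepare (but do not send) an HTTPS POST carrying a CSV payload."""
    return urllib.request.Request(
        url=f"{base_url}/measurements",       # hypothetical route
        data=csv_body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": token,           # assumed header convention
            "Content-Type": "text/csv",
        },
    )


request = build_upload_request(
    "https://cloud.example.org/v1",           # placeholder host
    "device-token-placeholder",               # placeholder device token
    "ax,ay,az\n1,2,3\n",
)
print(request.get_method(), request.full_url)
```

Sending the request would be a separate step (for instance with `urllib.request.urlopen`), which is left out so that the sketch runs without network access.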

3.4. Resulting Architecture

Figure 8 synthesizes our system architecture as described in the previous subsections.

The edge layer is composed of an Arduino Nano 33 BLE Sense connected to a battery shield. It collects data from the LSM9DS1 IMU sensor present onboard and transmits packets of 9 int16_t to the fog layer via BLE.

The fog layer serves as a user interface to configure and manage the recording process. Received data are then formatted as a CSV file and sent to Measurify through dedicated RESTful APIs. Measurify needs to be configured only once for each new sport activity dataset to be collected; after that, no additional effort is required from the user.

4. Results

4.1. Applications

We assessed the proposed system by setting up four end-to-end dataset collection use cases. We selected various sports (namely: fitwalking, tennis, soccer, and boxing), each one with a different set of recognizable actions, as reported in Table 2. It is important to highlight that the goal of this assessment is to verify the overall workflow of a versatile architecture, not to implement a single ML application that outperforms the state of the art. For this reason, the size of the collected datasets may be smaller than that of similar ML datasets in the literature.

Data were collected directly on the field while the athletes were training. EDs were placed on the player’s body or installed in a playing tool without interfering with the user’s actions (Figure 9: ED prototype embedding for (a) fitwalking, (b) tennis, (c) soccer, and (d) boxing). The sampling frequency depends on the specific activity carried out. In all our test applications, 10 Hz was sufficient, as shown in Section 4.2. Additionally, we verified that the system operates correctly at a sampling frequency of 30 Hz.
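To give an order of magnitude for the data volumes involved at the two sampling frequencies, the following sketch computes the raw IMU payload produced by packets of nine int16 values. The 90-minute session length is a hypothetical example, and protocol overhead and CSV formatting are not included.

```python
BYTES_PER_SAMPLE = 9 * 2   # nine int16 readings per packet


def payload_rate_bytes_per_s(sampling_hz):
    """Raw IMU payload produced per second (protocol overhead not included)."""
    return BYTES_PER_SAMPLE * sampling_hz


def session_payload_bytes(sampling_hz, minutes):
    """Raw IMU payload accumulated over a recording session."""
    return payload_rate_bytes_per_s(sampling_hz) * minutes * 60


for hz in (10, 30):
    # 90 minutes is a hypothetical session length.
    print(hz, payload_rate_bytes_per_s(hz), session_payload_bytes(hz, 90))
```

At 10 Hz this gives 180 bytes per second, i.e., slightly less than 1 MB of raw payload for a 90-minute session; at 30 Hz the figures are three times larger.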

In our typical application workflow, for data collection convenience, the athlete is asked to perform a series of actions of the same class (e.g., the forehand shot in tennis), whose label is selected on the Smart Collector. All the signal samples collected are stored in a single time series, so the result of the data collection is a number of labeled time series in each of which a single action is repeated several times.

4.2. Supervised ML Dataset Assessment

To assess the validity of the proposed end-to-end system, including the ED’s correct positioning in the different SAR use cases and capability to correctly capture the physical signals’ features, we tested the four datasets stored in the cloud with simple DL neural models.

For each dataset, its time series are split and trimmed into time windows of different sizes, each one exactly representing a single instance of the labeled action. Details on the Dynamic Time Warping (DTW) techniques we employed for aligning the different-length samples are provided in [18]. Each dataset was finally divided into a 70% training set, a 10% validation set, and a 20% test set.
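A 70/10/20 split such as the one described above could be obtained as in the following sketch. The shuffling strategy and the fixed seed are assumptions: the text does not state whether the split was random, stratified by class, or grouped by athlete, and the window names are placeholders.

```python
import random


def split_dataset(windows, train=0.7, val=0.1, seed=0):
    """Shuffle and split into train/validation/test (the remainder goes to test)."""
    shuffled = list(windows)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


windows = [f"window_{i}" for i in range(200)]   # placeholders for labeled windows
train_set, val_set, test_set = split_dataset(windows)
print(len(train_set), len(val_set), len(test_set))   # 140 20 40
```

Assigning the remainder to the test set guarantees that every window ends up in exactly one of the three subsets, even when the dataset size is not a multiple of ten.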

For classifying the time windows, we used 1D CNNs, a class of DL models commonly used to classify time series [52]. CNN classifiers typically include several convolutional layers aimed at extracting the features, followed by a few dense layers for the final classification (Figure 10). Although deeper models are typically more effective, at the cost of longer training [53], we decided to use models with a simple structure (stacking only convolutional and dense layers), few layers, and small kernel sizes (Table 3), so that the results better reflect the quality of the dataset. Results in Table 3 show the high action recognition accuracy levels achieved in all the use cases.
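The structure of such a classifier (convolution for feature extraction, then a dense layer for classification) can be illustrated with a dependency-free forward pass. This is a toy sketch, not one of the models in Table 3: the weights are hand-set rather than trained, the pooling step is an added assumption, and a real model would be built and trained with a DL framework.

```python
def conv1d_relu(signal, kernels, biases):
    """'Valid' 1D convolution followed by ReLU.

    signal: [time][channel]; kernels: [filter][tap][channel].
    """
    taps = len(kernels[0])
    output = []
    for t in range(len(signal) - taps + 1):
        row = []
        for kernel, bias in zip(kernels, biases):
            acc = bias
            for k in range(taps):
                for c, weight in enumerate(kernel[k]):
                    acc += weight * signal[t + k][c]
            row.append(max(0.0, acc))   # ReLU activation
        output.append(row)
    return output


def global_max_pool(features):
    """Collapse the time axis, keeping the strongest response of each filter."""
    return [max(column) for column in zip(*features)]


def dense(vector, weights, biases):
    """Fully connected layer: one score per class."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def classify(signal, kernels, conv_biases, dense_weights, dense_biases):
    """Index of the class with the highest score."""
    features = global_max_pool(conv1d_relu(signal, kernels, conv_biases))
    scores = dense(features, dense_weights, dense_biases)
    return scores.index(max(scores))


# Hand-set toy weights: filter 0 responds to a rise on channel 0, filter 1 to a fall.
KERNELS = [[[-1.0, 0.0], [1.0, 0.0]],
           [[1.0, 0.0], [-1.0, 0.0]]]
CONV_BIASES = [0.0, 0.0]
DENSE_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]   # class 0 = "rising", class 1 = "falling"
DENSE_BIASES = [0.0, 0.0]

rising = [[0, 0], [0, 0], [1, 0], [2, 0]]
falling = [[2, 0], [1, 0], [0, 0], [0, 0]]
print(classify(rising, KERNELS, CONV_BIASES, DENSE_WEIGHTS, DENSE_BIASES))    # 0
print(classify(falling, KERNELS, CONV_BIASES, DENSE_WEIGHTS, DENSE_BIASES))   # 1
```

Each filter slides along the time axis and reacts to a local pattern, which is why small kernels and few layers can already separate actions whose signals differ in short, characteristic segments.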

4.3. Power Consumption Analysis

To assess the power consumption in relation to the firmware running on the ED, we tested the ED under four different working conditions. In each condition, the battery is first fully charged (45 min) and then fully depleted. The first condition, indicated as Idle, is the baseline test in which the data collection firmware has been replaced with an empty sketch. In the second one, named Standalone, our firmware collects data without transmitting it to the FD. The third and fourth conditions involve continuous data collection and transmission to the FD, but at different rates: 10 Hz and 30 Hz, respectively. Reported values are averages over five systems running under the same battery and working conditions. The power consumption from the battery is calculated as Power = Voltage × Current, measuring the current drawn by the board through a digital multimeter (Agilent 34401A, SGLabs, Perugia (PG), Italy), with the board powered by a 3.6 V DC power supply (Agilent E3631A, SGLabs, Perugia (PG), Italy) simulating a fully charged battery. The current and power consumption from USB are instead measured using a USB multimeter (JT-UM120 [54]) connected to a laptop PC.

Results in Table 4 highlight the energy efficiency of the proposed system, allowing a battery duration of more than four hours, which makes it usable for several sport types. Standalone operation is the main consumption factor, while transmission impact is relatively lower, increasing with the sampling frequency. Table 4 also reports, for each ED condition, the average current and power consumptions in both the ED power modes: 3.6 V coin battery (columns 3 and 4, respectively) and 5 V USB cable (columns 5 and 6, respectively). When powered by a battery, the ED drains more current (hence power) than when it is powered by USB. This is caused by the additional energy absorbed by the battery control circuit and by the operative LED D1 (Figure 2), which signals the functioning of the battery and could be removed to reduce energy consumption.
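The relation Power = Voltage × Current and the battery duration target can be connected through a simple idealized calculation. The 120 mAh capacity and the 3.6 V level come from Section 3.1.3, while the 25 mA current is a hypothetical value used only for illustration, not a measurement from Table 4. The model ignores converter losses and the dependence of the usable capacity on the discharge rate.

```python
BATTERY_CAPACITY_MAH = 120.0   # nominal LIR2450 capacity (Section 3.1.3)
BATTERY_VOLTAGE_V = 3.6


def power_mw(voltage_v, current_ma):
    """Power = Voltage x Current, in milliwatts."""
    return voltage_v * current_ma


def runtime_hours(capacity_mah, current_ma):
    """Idealized runtime: ignores converter losses and capacity derating."""
    return capacity_mah / current_ma


def max_average_current_ma(capacity_mah, target_hours):
    """Largest average current draw that still meets a target battery duration."""
    return capacity_mah / target_hours


example_current_ma = 25.0      # hypothetical draw, not a value from Table 4
print(power_mw(BATTERY_VOLTAGE_V, example_current_ma))
print(runtime_hours(BATTERY_CAPACITY_MAH, example_current_ma))
print(max_average_current_ma(BATTERY_CAPACITY_MAH, 4.0))
```

Under these assumptions, a 25 mA draw corresponds to 90 mW and about 4.8 h of operation, and a four-hour duration requires the average current to stay below 30 mA.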

4.4. Comparison with State-of-the-Art Solutions

As a final summary, Table 5 presents a comparison between our system and a selection of state-of-the-art solutions. The comparison highlights that while all the analyzed solutions are, at least in principle, applicable to different types of sports, our system has been defined as multi-sport by design. This is particularly due to the high configurability of all the components, such as the ED/FD (where the sport action classes can be selected from a list or custom-defined) and the cloud (where Measurify’s resources, such as a dataset, can be configured in a matter of a few minutes). This makes the overall system usable by anyone with basic computing skills, overcoming the technical difficulties involved in creating the system from scratch. This allowed our BSc Electronic Engineering students to deploy several end-to-end dataset creation applications in a few days of work, including system understanding and installation, while an expert developer can learn and deploy an instance of the system in about one working day. Moreover, our system allows users to inspect and manage uploaded datasets through a powerful GUI. On the other hand, it does not offer the possibility of using more than one ED and requires an FD to send data to the cloud.

5. Conclusions

This work has developed and studied an edge-to-cloud, end-to-end system architecture for SAR dataset collection. On the edge, the system relies on a battery-operated board based on a popular Arduino IoT device. Tests in authentic contexts of use in four different sports (40 athletes in fitwalking, tennis, soccer, and boxing, for a total of 650 recorded minutes) have shown the system’s ability to effectively collect ML-usable data with an energy consumption compatible with most of the sports, using a commercially available rechargeable battery.

Compared to other SAR data collection systems, our proposal has the peculiarity of targeting the collection of datasets, exploiting a key feature of the Measurify IoT framework for management of measurement data (i.e., .csv dataset management [51]) in a versatile workflow supporting multi-frequency sampling and efficient labeling of signal timeseries. The architecture is ready to deploy, independent of any specific sport, and a new dataset generation application can be set up in a few days, even by novice developers.

In order to support further research in the field, the proposed framework is published as open source and available here: ED: https://github.com/measurify/edge-meter (accessed on 18 March 2024); FD: https://github.com/measurify/smart-collector (accessed on 18 March 2024); Cloud: https://github.com/measurify/server (accessed on 18 March 2024).

The experience and results presented in this paper open significant perspectives for the R&D community in the IoT area. First of all, we highlight the extension to other sports and the increase in the number of activities within each sport. Extensive user testing is also needed to assess the usability and impact of the new system.

A limitation to be addressed concerns the FD, which temporarily stores collected samples from the ED and forwards them to the cloud. Since the ED lacks any user interface, a user could lose hours of data collection if the Smart Collector application and, more generally, the FD are not frequently monitored (e.g., to detect an ED–FD connection loss). The power consumption analysis has shown that the ED firmware could be optimized as well, particularly for standalone operation.

As further extensions, signals from non-inertial sensors (e.g., light and temperature) may be considered to increase the action recognition capabilities and expand them for more comprehensive sport session evaluation. The collected datasets could be employed for more advanced ML SAR studies as well [55]. Other significant technological challenges concern the miniaturization of the hardware and the development or integration of a suitable energy-scavenging sub-system (possibly adapted by sport type) for battery-less operation.

Author Contributions

Conceptualization, R.B., F.B. and M.F.; methodology, R.B., F.B., M.F. and F.A.; hardware, F.A., F.B., R.B., M.F., A.C., A.D. and L.L.; software, R.B., M.F., A.D. and A.C.; validation, A.D., L.L. and A.C.; data curation, M.F.; writing—original draft preparation, F.B., M.F., A.C., A.D. and L.L.; writing—review and editing, F.B., M.F., A.C., R.B., A.D. and L.L.; supervision, R.B. and F.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The edge documentation and firmware, along with the “Smart Collector” application, are available at: ED: https://github.com/measurify/edge-meter (accessed on 18 March 2024); FD: https://github.com/measurify/smart-collector (accessed on 18 March 2024); Cloud: https://github.com/measurify/server (accessed on 18 March 2024).

Acknowledgments

The authors would like to thank all the engineers (at the time, BSc students) who contributed to the application development and testing: Ludovico Lozza, Marco Vercellino, and Lorenzo Giampietro.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. McDevitt, S.; Hernandez, H.; Hicks, J.; Lowell, R.; Bentahaikt, H.; Burch, R.; Ball, J.; Chander, H.; Freeman, C.; Taylor, C.; et al. Wearables for Biomechanical Performance Optimization and Risk Assessment in Industrial and Sports Applications. Bioengineering 2022, 9, 33.
2. Meng, Z.; Zhang, M.; Guo, C.; Fan, Q.; Zhang, H.; Gao, N.; Zhang, Z. Recent Progress in Sensing and Computing Techniques for Human Activity Recognition and Motion Analysis. Electronics 2020, 9, 1357.
3. Ghasemzadeh, H.; Jafari, R. Coordination Analysis of Human Movements With Body Sensor Networks: A Signal Processing Model to Evaluate Baseball Swings. IEEE Sens. J. 2011, 11, 603–610.
4. Ueda, M.; Negoro, H.; Kurihara, Y.; Watanabe, K. Measurement of Angular Motion in Golf Swing by a Local Sensor at the Grip End of a Golf Club. IEEE Trans. Hum.-Mach. Syst. 2013, 43, 398–404.
5. Aubol, K.G.; Milner, C.E. Foot Contact Identification Using a Single Triaxial Accelerometer during Running. J. Biomech. 2020, 105, 109768.
6. Banko, M.; Brill, E. Scaling to Very Very Large Corpora for Natural Language Disambiguation. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, 6–11 July 2001; Association for Computational Linguistics: Toulouse, France; pp. 26–33.
7. Machine Learning Costs: Price Factors and Real-World Estimates | HackerNoon. Available online: https://hackernoon.com/machine-learning-costs-price-factors-and-real-world-estimates (accessed on 8 March 2024).
8. Santos-Gago, J.M.; Ramos-Merino, M.; Vallarades-Rodriguez, S.; Álvarez-Sabucedo, L.M.; Fernández-Iglesias, M.J.; García-Soidán, J.L. Innovative Use of Wrist-Worn Wearable Devices in the Sports Domain: A Systematic Review. Electronics 2019, 8, 1257.
9. Hsu, Y.-L.; Yang, S.-C.; Chang, H.-C.; Lai, H.-C. Human Daily and Sport Activity Recognition Using a Wearable Inertial Sensor Network. IEEE Access 2018, 6, 31715–31728.
10. Cust, E.E.; Sweeting, A.J.; Ball, K.; Robertson, S. Machine and Deep Learning for Sport-Specific Movement Recognition: A Systematic Review of Model Development and Performance. J. Sports Sci. 2019, 37, 568–600.
11. Nadeem, A.; Jalal, A.; Kim, K. Automatic Human Posture Estimation for Sport Activity Recognition with Robust Body Parts Detection and Entropy Markov Model. Multimed. Tools Appl. 2021, 80, 21465–21498.
12. Tabish, M.; Tanooli, Z.-R.; Shaheen, M. Activity Recognition Framework in Sports Videos. Multimed. Tools Appl. 2024, 83, 15101–15123.
13. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
14. Jain, A.K.; Dubes, R.C. Algorithms for Clustering Data; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 1988; ISBN 978-0-13-022278-7.
15. Rangasamy, K.; As’ari, M.A.; Rahmad, N.A.; Ghazali, N.F. Hockey Activity Recognition Using Pre-Trained Deep Learning Model. ICT Express 2020, 6, 170–174.
16. Fresta, M.; Dabbous, A.; Bellotti, F.; Capello, A.; Lazzaroni, L.; Pighetti, A.; Berta, R. Low-Cost, Edge-Cloud, End-to-End System Architecture for Human Activity Data Collection. In Proceedings of the Applications in Electronics Pervading Industry, Environment and Society; Bellotti, F., Grammatikakis, M.D., Mansour, A., Ruo Roch, M., Seepold, R., Solanas, A., Berta, R., Eds.; Springer Nature Switzerland: Cham, Switzerland, 2024; pp. 444–449.
17. Cuperman, R.; Jansen, K.M.B.; Ciszewski, M.G. An End-to-End Deep Learning Pipeline for Football Activity Recognition Based on Wearable Acceleration Sensors. Sensors 2022, 22, 1347.
18. Dabbous, A.; Fresta, M.; Bellotti, F.; Berta, R. Neural Architecture for Tennis Shot Classification on Embedded System. In Proceedings of the Applications in Electronics Pervading Industry, Environment and Society; Bellotti, F., Grammatikakis, M.D., Mansour, A., Ruo Roch, M., Seepold, R., Solanas, A., Berta, R., Eds.; Springer Nature Switzerland: Cham, Switzerland, 2024; pp. 97–102.
19. Berta, R.; Bellotti, F.; De Gloria, A.; Lazzaroni, L. Assessing Versatility of a Generic End-to-End Platform for IoT Ecosystem Applications. Sensors 2022, 22, 713.
20. Liu, R.; Wang, Z.; Shi, X.; Zhao, H.; Qiu, S.; Li, J.; Yang, N. Table Tennis Stroke Recognition Based on Body Sensor Network. In Proceedings of the Internet and Distributed Computing Systems; Montella, R., Ciaramella, A., Fortino, G., Guerrieri, A., Liotta, A., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 1–10.
21. Nguyen, L.N.N.; Rodríguez-Martín, D.; Català, A.; Pérez-López, C.; Samà, A.; Cavallaro, A. Basketball Activity Recognition Using Wearable Inertial Measurement Units. In Proceedings of the XVI International Conference on Human Computer Interaction; Association for Computing Machinery: New York, NY, USA, 2015; pp. 1–6.
22. Haider, F.; Salim, F.A.; Postma, D.B.W.; van Delden, R.; Reidsma, D.; van Beijnum, B.-J.; Luz, S. A Super-Bagging Method for Volleyball Action Recognition Using Wearable Sensors. Multimodal Technol. Interact. 2020, 4, 33.
23. Cornacchia, M.; Ozcan, K.; Zheng, Y.; Velipasalar, S. A Survey on Activity Detection and Classification Using Wearable Sensors. IEEE Sens. J. 2017, 17, 386–403.
24. Kunze, K.; Lukowicz, P. Sensor Placement Variations in Wearable Activity Recognition. IEEE Pervasive Comput. 2014, 13, 32–41.
25. Perumal, T.; Ramanujam, E.; Suman, S.; Sharma, A.; Singhal, H. Internet of Things Centric-Based Multiactivity Recognition in Smart Home Environment. IEEE Internet Things J. 2023, 10, 1724–1732.
26. Testoni, A.; Di Felice, M. A Software Architecture for Generic Human Activity Recognition from Smartphone Sensor Data. In Proceedings of the 2017 IEEE International Workshop on Measurement and Networking (M&N), Naples, Italy, 27–29 September 2017; IEEE: Naples, Italy; pp. 1–6.
27. MongoDB: The Developer Data Platform. Available online: https://www.mongodb.com (accessed on 17 July 2023).
28. Balkhi, P.; Moallem, M. A Multipurpose Wearable Sensor-Based System for Weight Training. Automation 2022, 3, 132–152.
29. Ghibellini, A.; Bononi, L.; Di Felice, M. Intelligence at the IoT Edge: Activity Recognition with Low-Power Microcontrollers and Convolutional Neural Networks. In Proceedings of the 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2022; IEEE: Las Vegas, NV, USA; pp. 707–710.
30. Liu, Y. Research and Development of GNSS Wearable Device for Sports Performance Monitoring by Example of Soccer Player Analysis∗. In Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering, Xiamen, China, 21–23 October 2022; ACM: Xiamen, China; pp. 901–906.
31. Zhang, H.; Zhang, Z.; Gao, N.; Xiao, Y.; Meng, Z.; Li, Z. Cost-Effective Wearable Indoor Localization and Motion Analysis via the Integration of UWB and IMU. Sensors 2020, 20, 344.
32. Berta, R.; Mazzara, A.; Bellotti, F.; De Gloria, A.; Lazzaroni, L. Edgine, A Runtime System for IoT Edge Applications. In Applications in Electronics Pervading Industry, Environment and Society; Saponara, S., De Gloria, A., Eds.; Lecture Notes in Electrical Engineering; Springer International Publishing: Cham, Switzerland, 2021; Volume 738, pp. 261–266. ISBN 978-3-030-66728-3.
33. Farjana, M.; Fahad, A.B.; Alam, S.E.; Islam, M.M. An IoT- and Cloud-Based E-Waste Management System for Resource Reclamation with a Data-Driven Decision-Making Process. IoT 2023, 4, 202–220.
34. Arcobelli, V.A.; Zauli, M.; Galteri, G.; Cristofolini, L.; Chiari, L.; Cappello, A.; De Marchi, L.; Mellone, S. mCrutch: A Novel m-Health Approach Supporting Continuity of Care. Sensors 2023, 23, 4151.
35. Sudha Kumari, L.; Kouzani, A.Z. A Miniaturized Closed-Loop Optogenetic Brain Stimulation Device. Electronics 2022, 11, 1591.
36. Sunehra, D.; Sreshta, V.S.; Shashank, V.; Kumar Goud, B.U. Raspberry Pi Based Smart Wearable Device for Women Safety Using GPS and GSM Technology. In Proceedings of the 2020 IEEE International Conference for Innovation in Technology (INOCON), Bangaluru, India, 6–8 November 2020; pp. 1–5.
37. Rihana, S.; Mondalak, J. Wearable Fall Detection System. In Proceedings of the 2016 3rd Middle East Conference on Biomedical Engineering (MECBME), Beirut, Lebanon, 6–7 October 2016; pp. 84–87.
38. Rodrigues, M.J.; Postolache, O.; Cercas, F. Wearable Smart Sensing and UWB System for Fall Detection in AAL Environments. In Proceedings of the 2023 IEEE Sensors Applications Symposium (SAS), Ottawa, ON, Canada, 18–20 July 2023; pp. 1–6.
39. Dabbous, A.; Fresta, M.; Bellotti, F.; Berta, R. Arduino Nano-Based System for Tennis Shot Classification. Lect. Notes Electr. Eng. 2024, 1113, 357–362.
40. Soro, A.; Brunner, G.; Tanner, S.; Wattenhofer, R. Recognition and Repetition Counting for Complex Physical Exercises with Deep Learning. Sensors 2019, 19, 714.
41. Xia, K.; Wang, H.; Xu, M.; Li, Z.; He, S.; Tang, Y. Racquet Sports Recognition Using a Hybrid Clustering Model Learned from Integrated Wearable Sensor. Sensors 2020, 20, 1638.
42. Ghazali, N.F.; Shahar, N.; Rahmad, N.A.; Sufri, N.A.J.; As’ari, M.A.; Latif, H.F.M. Common Sport Activity Recognition Using Inertial Sensor. In Proceedings of the 2018 IEEE 14th International Colloquium on Signal Processing & Its Applications (CSPA), Penang, Malaysia, 9–10 March 2018; IEEE: Batu Feringghi, Malaysia; pp. 67–71.
43. Nano Family. Available online: https://store.arduino.cc/pages/nano-family (accessed on 13 February 2024).
44. Yang, C.-C.; Pandey, R.; Tu, T.-Y.; Cheng, Y.-P.; Chao, P.C.-P. An Efficient Energy Harvesting Circuit for Batteryless IoT Devices. Microsyst. Technol. 2020, 26, 195–207.
45. Flutter-Build Apps for Any Screen. Available online: https://flutter.dev/ (accessed on 14 February 2024).
46. Thambawita, V.; Hicks, S.A.; Borgli, H.; Stensland, H.K.; Jha, D.; Svensen, M.K.; Pettersen, S.-A.; Johansen, D.; Johansen, H.D.; Pettersen, S.D.; et al. PMData: A Sports Logging Dataset. In Proceedings of the 11th ACM Multimedia Systems Conference, New York, NY, USA, 27 May 2020; ACM: Istanbul, Turkey; pp. 231–236.
47. Fresta, M.; Bellotti, F.; Capello, A.; Cossu, M.; Lazzaroni, L.; De Gloria, A.; Berta, R. Efficient Uploading of .Csv Datasets into a Non-Relational Database Management System. In Applications in Electronics Pervading Industry, Environment and Society; Berta, R., De Gloria, A., Eds.; Lecture Notes in Electrical Engineering; Springer Nature Switzerland: Cham, Switzerland, 2023; Volume 1036, pp. 9–15. ISBN 978-3-031-30332-6.
48. Berta, R.; Kobeissi, A.; Bellotti, F.; De Gloria, A. Atmosphere, an Open Source Measurement-Oriented Data Framework for IoT. IEEE Trans. Ind. Inform. 2021, 17, 1927–1936.
49. Capello, A.; Fresta, M.; Bellotti, F.; Haghighi, H.; Hiller, J.; Mozaffari, S.; Berta, R. Exploiting Big Data for Experiment Reporting: The Hi-Drive Collaborative Research Project Case. Sensors 2023, 23, 7866.
50. Lazzaroni, L.; Mazzara, A.; Bellotti, F.; De Gloria, A.; Berta, R. Employing an IoT Framework as a Generic Serious Games Analytics Engine. In Proceedings of the Games and Learning Alliance; Marfisi-Schottman, I., Bellotti, F., Hamon, L., Klemke, R., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 79–88.
51. Fresta, M.; Capello, A.; Bellotti, F.; Lazzaroni, L.; Cossu, M.; Berta, R. Supporting a .Csv-Based Workflow in MongoDB for Data Analysts. In Proceedings of the 2023 IEEE 32nd International Symposium on Industrial Electronics (ISIE), Helsinki-Espoo, Finland, 19–21 June 2023; IEEE: Helsinki, Finland; pp. 1–4.
52. Khalife, R.; Mrad, R.; Dabbous, A.; Ibrahim, A. Real-Time Implementation of Tiny Machine Learning Models for Hand Motion Classification. In Proceedings of the Applications in Electronics Pervading Industry, Environment and Society; Bellotti, F., Grammatikakis, M.D., Mansour, A., Ruo Roch, M., Seepold, R., Solanas, A., Berta, R., Eds.; Springer Nature Switzerland: Cham, Switzerland, 2024; pp. 487–492.
53. Vinayakumar, R.; Soman, K.P.; Poornachandran, P. Evaluating Effectiveness of Shallow and Deep Networks to Intrusion Detection System. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Manipal, India, 13–16 September 2017; pp. 1282–1289.
54. Products | Joy-IT. Available online: https://joy-it.net/en/products/JT-UM120 (accessed on 13 March 2024).
55. Xu, C.; Liu, Z.; Li, P.; Yan, J.; Yao, L. Bifurcation Mechanism for Fractional-Order Three-Triangle Multi-Delayed Neural Networks. Neural Process. Lett. 2023, 55, 6125–6151.

Figure 1. Data collection system workflow.

Figure 2. Step-up circuit of the battery shield.

Figure 3. Battery charger circuit of the battery shield.

Figure 4. Arduino power connections.

Figure 5. Render of the ED. (a) Bottom view of the ED, with indication of the components on the battery shield; (b) Top view of the ED, showing the Arduino board placed on the battery shield.

Figure 6. Snapshot of the Flutter application.

Figure 7. Snapshot of the time series visualization on Measurify.

Figure 8. Developed architecture.

Figure 9. ED prototype embedding: (a) fitwalking; (b) tennis; (c) soccer; (d) boxing.

Figure 10. Example of CNN architecture.

Table 1. Comparison between GSM, Wi-Fi, and BLE.

Module	GSM	Wi-Fi	BLE
Peak current	~600 mA	~190 mA	~14 mA
Distance	NA ¹	<50 m	<400 m
Bytes per message	NA ²	NA ²	20
Additional fog requirements	NA ¹	Wi-Fi Internet connection	Device with Bluetooth and Internet connection
Additional shield	Yes	Often onboard	Often onboard
Additional costs	SIM contract	No	No

¹ GSM does not require near-range devices. ² HTTP protocol has no limitation on body size.
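
The 20-byte BLE message size in Table 1 constrains how an inertial sample can be serialized. The following is a minimal sketch, not the actual ED firmware format: it assumes a hypothetical layout with a 32-bit millisecond timestamp followed by six 16-bit raw readings (3-axis accelerometer, 3-axis gyroscope), which occupies 16 bytes and therefore fits in a single message. The names `SAMPLE_FORMAT`, `pack_sample`, and `unpack_sample` are illustrative.

```python
import struct

BLE_MAX_PAYLOAD = 20  # bytes per message, as reported in Table 1

# Hypothetical layout (an assumption, not the ED firmware format):
# uint32 timestamp in ms + six int16 raw readings (ax, ay, az, gx, gy, gz),
# little-endian, no padding.
SAMPLE_FORMAT = "<I6h"
SAMPLE_SIZE = struct.calcsize(SAMPLE_FORMAT)  # 4 + 6 * 2 bytes


def pack_sample(timestamp_ms, accel, gyro):
    """Serialize one inertial sample into a single BLE-sized payload."""
    payload = struct.pack(SAMPLE_FORMAT, timestamp_ms, *accel, *gyro)
    if len(payload) > BLE_MAX_PAYLOAD:
        raise ValueError("sample does not fit in one BLE message")
    return payload


def unpack_sample(payload):
    """Inverse of pack_sample, as a receiving device might apply it."""
    ts, ax, ay, az, gx, gy, gz = struct.unpack(SAMPLE_FORMAT, payload)
    return ts, (ax, ay, az), (gx, gy, gz)
```

With this layout, one sample per message leaves 4 spare bytes, which could carry, for example, a sequence counter to detect lost messages.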

Table 2. Tested applications.

Sport	Actions	ED Position	Number of Athletes	Recording Duration	Sampling Frequency
Fitwalking	Walking, Running, Climbing Stairs	Chest	10	150 min	10 Hz
Tennis	Forehand, Backhand, Serve	Racket	10	190 min	10 Hz
Soccer	Shot, Pass, Stop	Shin guard	10	180 min	10 Hz
Boxing	Cross, Left Hook, Right Hook	Belt	10	130 min	10 Hz
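
As a quick consistency check, the per-sport figures in Table 2 can be summed and compared with the totals quoted in the Conclusions (40 athletes, 650 recorded minutes). The sketch below also derives an upper bound on the number of samples per signal channel, assuming (our assumption) that every recorded minute was sampled continuously at the 10 Hz rate listed in the table.

```python
SAMPLING_HZ = 10  # sampling frequency listed for all four sports in Table 2

# sport -> (number of athletes, recording duration in minutes), from Table 2
SESSIONS = {
    "Fitwalking": (10, 150),
    "Tennis": (10, 190),
    "Soccer": (10, 180),
    "Boxing": (10, 130),
}

total_athletes = sum(athletes for athletes, _ in SESSIONS.values())
total_minutes = sum(minutes for _, minutes in SESSIONS.values())

# Upper bound on samples per channel, assuming continuous 10 Hz sampling
# for the whole recording duration (an assumption, not a reported figure).
max_samples_per_channel = total_minutes * 60 * SAMPLING_HZ

print(total_athletes, total_minutes, max_samples_per_channel)
```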

Table 3. Details on training and testing of the CNN with collected datasets.

Sport	Input Size ¹	Kernel Size	NN Layers	Test Accuracy
Fitwalking	[25, 6]	3	[Conv1D, Conv1D, Conv1D, Dense, Dense]	93.81%
Tennis	[15, 6]	3	[Conv1D, Conv1D, Dense, Dense]	97.54%
Soccer	[15, 6]	3	[Conv1D, Conv1D, Dense, Dense]	96.35%
Boxing	[15, 6]	3	[Conv1D, Conv1D, Dense, Dense]	95.97%

¹ The first axis is the length of the time window, the second one is the signal dimensionality (i.e., 3-axis accelerometer, 3-axis gyroscope).
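
At the 10 Hz sampling frequency of Table 2, the input sizes in Table 3 correspond to time windows of 2.5 s (25 samples) and 1.5 s (15 samples). The sketch below illustrates how a recording could be segmented into such fixed-length windows before being fed to a network. It is not the authors' preprocessing code: the function names and the choice of stride are assumptions made for illustration.

```python
def window_seconds(window_len, sampling_hz=10):
    """Duration in seconds covered by a window of window_len samples."""
    return window_len / sampling_hz


def make_windows(samples, window_len, stride):
    """Split a recording into fixed-length windows.

    samples: list of per-timestep readings, each a list of 6 values
             (3-axis accelerometer followed by 3-axis gyroscope).
    stride:  step between window starts; stride == window_len gives
             non-overlapping windows (the stride is an assumption here).
    """
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    last_start = len(samples) - window_len
    return [samples[s:s + window_len] for s in range(0, last_start + 1, stride)]


# Usage: 10 s of dummy data at 10 Hz, cut into [15, 6] windows.
recording = [[0.0] * 6 for _ in range(100)]
windows = make_windows(recording, window_len=15, stride=15)
```

Each resulting window has shape [15, 6], matching the input size listed for tennis, soccer, and boxing; trailing samples that do not fill a whole window are dropped.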

Table 4. Power consumption analysis results.

ED Condition	Battery Duration [min]	Battery-Powered Average Current [mA]	Battery-Powered Average Power Consumption [mW]	USB-Powered Average Current [mA]	USB-Powered Average Power Consumption [mW]
Idle	355	44.41	159.88	21.33	106.67
Standalone	301	52.22	187.99	25.02	125.10
10 Hz Continuous	293	52.65	189.54	25.14	125.69
30 Hz Continuous	286	53.02	190.87	25.19	125.97
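
The figures in Table 4 can be cross-checked with two simple relations: the supply voltage implied by each row is P/I, and the charge delivered over a battery run is the average current multiplied by the duration. The sketch below applies both. It assumes that the average current held for the whole battery duration; the resulting voltages (about 3.6 V on battery and about 5 V on USB) and delivered charge (roughly 253 to 263 mAh) are our own inferences from the table, not values stated in this section.

```python
# condition -> (battery duration [min], battery current [mA], battery power [mW],
#               USB current [mA], USB power [mW]), copied from Table 4
TABLE_4 = {
    "Idle": (355, 44.41, 159.88, 21.33, 106.67),
    "Standalone": (301, 52.22, 187.99, 25.02, 125.10),
    "10 Hz Continuous": (293, 52.65, 189.54, 25.14, 125.69),
    "30 Hz Continuous": (286, 53.02, 190.87, 25.19, 125.97),
}


def implied_voltage(power_mw, current_ma):
    """Supply voltage in volts implied by P = V * I."""
    return power_mw / current_ma


def delivered_charge_mah(current_ma, minutes):
    """Charge in mAh drawn at a constant average current over the run."""
    return current_ma * minutes / 60.0


def summarize(table):
    """Per-condition implied voltages and delivered charge."""
    summary = {}
    for condition, (minutes, i_bat, p_bat, i_usb, p_usb) in table.items():
        summary[condition] = {
            "v_battery": implied_voltage(p_bat, i_bat),
            "v_usb": implied_voltage(p_usb, i_usb),
            "charge_mah": delivered_charge_mah(i_bat, minutes),
        }
    return summary


for name, values in summarize(TABLE_4).items():
    print(name, {k: round(v, 2) for k, v in values.items()})
```

The near-constant delivered charge across the four conditions suggests that the differences in battery duration are explained by the differences in average current alone.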

Table 5. Functional feature comparison between data collection systems.

Solution	Wearable	End-to-End Architecture	Open Source	Fog Device	Multiple EDs	GUI Dashboard	Cloud Connection Required	Multi-Sport Ready Deployability
[26]		✔					✔
[29]	✔	✔
[28]	✔
[30]	✔	✔			✔		✔
[31]	✔	✔
[9]	✔	✔		✔	✔	✔
[25]	✔	✔		✔	✔	✔	✔
[32]	✔	✔	✔	✔			✔
Ours	✔	✔	✔	✔		✔	¹	✔

¹ An internet connection is not required while recording data; data are stored locally until a (stable) internet connection is available.
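
The behavior described in the footnote, storing data locally and uploading once a connection is available, is a store-and-forward pattern. The following is a minimal in-memory sketch of that pattern, not the Smart Collector implementation: the class name, the `send` and `is_online` callbacks, and the in-memory queue are all assumptions, and a real FD application would persist pending records to local storage.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Queue records locally and upload them in order when online.

    send:      callable(record) -> bool, True if the upload succeeded
               (hypothetical stand-in for a cloud upload call).
    is_online: callable() -> bool, True if a connection is available.
    """

    def __init__(self, send, is_online):
        self._send = send
        self._is_online = is_online
        self._pending = deque()

    def add(self, record):
        """Store a record, then try to forward everything pending."""
        self._pending.append(record)
        return self.flush()

    def flush(self):
        """Upload pending records in order; return how many were sent."""
        sent = 0
        while self._pending and self._is_online():
            if not self._send(self._pending[0]):
                break  # keep the record and retry on a later flush
            self._pending.popleft()
            sent += 1
        return sent

    def pending(self):
        """Number of records still waiting to be uploaded."""
        return len(self._pending)
```

A record is removed from the queue only after `send` reports success, so a connection drop in the middle of an upload does not lose data in this sketch.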

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

