
An ROS Architecture for Autonomous Mobile Robots with UCAR Platforms in Smart Restaurants

The Key Laboratory of Advanced Manufacturing Technology, Ministry of Education, Guiyang 550025, China
School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
School of Economics and Management, Ningbo University of Technology, Ningbo 315211, China
Author to whom correspondence should be addressed.
Machines 2022, 10(10), 844
Submission received: 22 August 2022 / Revised: 19 September 2022 / Accepted: 20 September 2022 / Published: 23 September 2022
(This article belongs to the Section Automation and Control Systems)


To lessen the spread of COVID-19 and other dangerous bacteria and viruses, contactless distribution of different items has gained widespread popularity. This paper explores the development of an autonomous mobile robot that completes delivery tasks at a catering facility. In particular, the robot plans its path and maintains smooth and flexible mobility using a Timed Elastic Band (TEB) motion control method and an upgraded Dijkstra algorithm. On the open-source AI platform of iFLYTEK, a voice recognition module was trained to recognize voice signals of different tones and loudness, and an image recognition capability was attained using YOLOv4 and SIFT. The UCAR intelligent vehicle platform, made available by iFLYTEK, served as the foundation for the development of the mobile robot system. The robot took part in China’s 16th National University Student Intelligent Car Race, which served as an experimental demonstration test of the developed mobile robot. The results of the experiments and task tests demonstrated that the proposed robot architecture was workable. In addition, we designed and assembled a mobile robot using components from the Taobao website. Compared to UCAR, this robot is less expensive and has the flexibility to be used in a variety of real-world settings.

1. Introduction

With the recent advancements in autonomous mobility technology for mobile robots, multiple service robots have been proposed to give assistance in a variety of industries, including manufacturing, warehousing, health care, agriculture, and restaurants [1]. The rapid development of various technologies facilitates people’s lives. However, the urgent requirements caused by the COVID-19 pandemic brought a lot of change to people’s lives. In response, contact tracing systems were proposed to curb the development of the epidemic [2]. In fact, people’s experience and acceptance of artificial intelligence (AI) are constantly improving with the emergence and application of AI products such as robotic entities [3]. In addition, staff shortages, the ongoing demand for productivity growth, and changes in the service sector have pushed the restaurant business to adopt robotics. In an effort to increase labor productivity, service robots, such as the automated guided vehicle (AGV) operation system, have been implemented in Japanese restaurants [4,5]. AGV systems have evolved into autonomous mobile robots (AMRs) to provide more operational flexibility and improved productivity as a result of the development of AI technologies, powerful onboard processors, ubiquitous sensors, and simultaneous localization and mapping (SLAM) technology. Traditional low-cost service robots move by following a line. In recent years, radar-based autonomous navigation robots have been widely studied, but their prices can be as high as $17,000 [6]. Although the use of artificial intelligence and robots in restaurants is still in its infancy, restaurant managers are seeking assistance in using modern technologies to improve service [7]. COVID-19 has altered the restaurant industry because the contactless paradigm may provide customers and staff with comfort when dining and working [8].
Many restaurant managers are considering implementing service AMRs in order to achieve frictionless service and mitigate the staff shortage. Cheong et al. [6] developed a prototype of a Mecanum wheel-based mobile waiter robot for testing in a dining food outlet. Yu et al. [9] designed a restaurant service robot for ordering, fetching, and sending food. In response to the challenges presented by COVID-19, more and more restaurants utilizing robots have emerged in China and the U.S. [10]. Yang et al. [11] proposed a robotic delivery authentication system, which includes the client, the server, and the robot. Multiple robots for varied use cases, including hotpot, Chinese food, and coffee, have been developed to meet the needs of a range of restaurants [12]. Nonetheless, several institutions are attempting to build a universal, unified, and practical robot system that can be deployed to a variety of restaurant application situations. At the 2022 Winter Olympics in Beijing, one of the most successful applications of robots handling plates was implemented. This promising example is anticipated to raise demand for restaurant robots even more. However, the cost of robotics is essential for restaurant management to consider before adopting the technology.
As an alternative to the aforementioned development strategies, restaurant service robots could be built on a generic framework to save development expenses. Chitta et al. [13] provided a generic and straightforward control framework to implement and manage robot controllers to improve both real-time performance and sharing of controllers. Ladosz et al. [14] presented a robot operating system (ROS) to develop and test autonomous control functions. This ROS can support an extensive range of robotic systems and can be regarded as a low-cost, simple, highly flexible, and reconfigurable system. Chivarov et al. [15] aimed to develop a cost-oriented autonomous humanoid robot to assist humans in daily life tasks and studied the interaction and cooperation between robots and Internet of Things (IoT) devices. Tkáčik et al. [16] presented a prototype for a modular mobile robotic platform for universal use in both research and teaching activities. Noh et al. [17] introduced an autonomous navigation system for indoor mobile robots based on open-source software and implemented three main modules: mapping, localization, and planning. Based on the robot design, customer features, and service task characteristics, it is feasible to determine the optimal adaptation to different service tasks [18]. Silva et al. [19] proposed an embedded architecture to utilize cognitive agents in cooperation with the ROS. Wang et al. [20] integrated augmented reality in human–robot collaboration and suggested an advanced compensation mechanism for accurate robot control. Oliveira et al. [21] proposed a multi-modal ROS framework for multi-sensor fusion. They established a transformation model between multiple sensors and achieved interactive positioning of sensors and labeling of data to facilitate the calibration procedure. Fennel et al. [22] discussed a method for seamless and modular real-time control under the ROS.
In addition to the restaurant business, more and more industries are contemplating the implementation of AMRs. Fragapane et al. [23] summarized five application scenarios of AMRs in hospital logistics and found that AMRs can increase the value-added time of hospital personnel for patient care. To suit the trend of Industry 4.0, manufacturing systems have effectively adopted AMRs for material feeding [24]. However, it is difficult to integrate robots into the smart environment [25]. Furthermore, the service scenario affects the intention to adopt service robots [26]. da Rosa Tavares and Victória Barbosa [27] observed a trend of using robots to improve environmental and actionable intelligence. A general mobile robot system based on open-source robot platforms is still needed to handle a variety of difficult service scenarios. With such a system, developers can migrate the environment perception of a prior scenario in order to implement the related functionality.
In this study, we employ the iFLYTEK-provided UCAR intelligent car platform to create a ROS-based robot control architecture. In addition, we present the open-source platforms used to implement the robot’s functionalities, including path planning, image recognition, and voice interaction. In particular, we incorporate the feedback data of the lidar and acceleration sensors to adjust the settings of the Timed Elastic Band (TEB) algorithm and ensure positioning accuracy. We examine the viability of the proposed mobile robotic system by analyzing the outcomes of demonstration scenario tests conducted at the 16th Chinese University Student Intelligent Car Race. In addition, we assemble a mobile robot as an alternative to iFLYTEK’s UCAR, and our robot offers a significant cost advantage.
The main contributions of our study are as follows:
  • We present a novel framework for the hardware and software development of mobile robotic systems. To achieve the required functionalities, we incorporate a Mecanum-wheeled chassis, mechanical damping suspension, sensors, camera, and microphone. In addition, we implement autonomous navigation, voice interaction, and image recognition using open-source platforms such as YOLOv4 and SIFT.
  • We assembled a mobile robot in accordance with the functional specifications outlined by the Chinese University Student Intelligent Car competition. The components are acquired from the website Taobao. We closely regulate the pricing of these components and compile a consumption list for other developers’ reference.
  • We test the viability of the introduced mobile robotic system using the demonstration test scenario. We also present the test results of the robot assembled in-house and confirm that it is capable of performing the necessary activities at a lower cost.
  • This paper’s intended scientific contribution focuses mostly on how to resolve waste and expense issues generated by new technology in the smart restaurant industry. We replace the acquired full-featured robot with the assembly robot with core functions in order to construct a digital twin system and alleviate the resource waste-related cost issue.
The rest of the paper is organized as follows: The following section provides the hardware architecture of the developed robot and introduces the implementation details of our self-assembled mobile robot. Section 3 provides a systematic framework for the developed robot in the software design. After that, Section 4 presents the experimental results for our framework. Finally, conclusions and future research directions are given in Section 5.

2. Robots Design

2.1. Hardware and Software Design of UCAR

We designed an autonomous mobile robot for accomplishing delivery jobs in smart restaurants. The mobile robot recognizes its surroundings in real time while traveling and autonomously avoids obstructions. According to its open-source schematic diagram, its hardware structure includes a master computer, used to deploy the model and analyze the comprehensive data, and a slave computer, used to operate the external circuit directly. The two share information through serial communication. The master computer is a control board with an x86 architecture running the Ubuntu development environment, which is suited for development using C++, C, Python, and other programming languages. The slave computer is an STM32F407 control board, which controls the base of the car through the I/O circuit in interrupt response mode, collects pertinent information from sensors, and is responsible for motion control and environment perception, as shown in Table 1.
The structure of the UCAR platform upon which the robot is based is shown in Figure 1. The chassis of the mobile robot uses Mecanum wheels. A Mecanum wheel carries rollers mounted at an angle around its rim, which allows omnidirectional movement and thus enables many forms of motion, including translation and rotation. In our project, the posture estimation approach is based on the motion form of the Mecanum wheel. The current posture state is obtained by continuously iterating from the initial pose conditions and is employed for attitude correction during autonomous navigation.
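As an illustration of how a pose can be iterated from wheel motion, the following minimal sketch uses the commonly cited Mecanum forward kinematics. It is not the UCAR firmware: the sign convention (front-left, front-right, rear-left, rear-right wheels with 45° rollers), the wheel radius `r`, and the half-distances `lx` and `ly` are all assumptions chosen for the example.

```python
import math

def wheel_speeds_to_body_velocity(w_fl, w_fr, w_rl, w_rr, r, lx, ly):
    """Forward kinematics: wheel angular speeds (rad/s) -> body velocity (vx, vy, wz).

    r is the wheel radius; lx and ly are half the wheelbase and half the track
    width (hypothetical values, one common sign convention for 45-degree rollers).
    """
    k = lx + ly
    vx = r * (w_fl + w_fr + w_rl + w_rr) / 4.0
    vy = r * (-w_fl + w_fr + w_rl - w_rr) / 4.0
    wz = r * (-w_fl + w_fr - w_rl + w_rr) / (4.0 * k)
    return vx, vy, wz

def update_pose(pose, body_velocity, dt):
    """One pose iteration: rotate the body velocity into the world frame and integrate."""
    x, y, theta = pose
    vx, vy, wz = body_velocity
    x += (vx * math.cos(theta) - vy * math.sin(theta)) * dt
    y += (vx * math.sin(theta) + vy * math.cos(theta)) * dt
    theta += wz * dt
    return x, y, theta

# Start from a known initial pose and iterate for 1 s with all wheels at 10 rad/s.
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    velocity = wheel_speeds_to_body_velocity(10.0, 10.0, 10.0, 10.0, r=0.03, lx=0.1, ly=0.1)
    pose = update_pose(pose, velocity, dt=0.01)
```

With equal wheel speeds the sketch produces pure forward motion; reversing the signs of the front-left and rear-right wheels produces pure sideways motion under this convention.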
We employed attitude sensors to measure the linear acceleration along the three coordinate axes and the rotational angular acceleration about the three axes in the Cartesian coordinate system. The collected dynamic forces and torques are converted into corresponding electrical quantities according to Newton’s laws through the force-to-electric conversion module, from which the acceleration values are obtained. Treating UCAR as a rigid-body kinematics model, the displacement and velocity values in six directions can be acquired by integrating the six accelerations of its inertial measurement unit (IMU), and the current pose information is gradually accumulated from the starting point.
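The integration step can be sketched for a single axis as follows. This is a simplified illustration using the rectangle rule, not the authors' implementation; the sampling interval and the bias value are hypothetical.

```python
def integrate_imu(accelerations, dt):
    """Integrate acceleration samples (m/s^2) twice with the rectangle rule.

    Returns the final velocity (m/s) and displacement (m) along one axis.
    """
    velocity = 0.0
    displacement = 0.0
    for a in accelerations:
        velocity += a * dt
        displacement += velocity * dt
    return velocity, displacement

# Constant 1 m/s^2 for 1 s sampled at 100 Hz.
v, d = integrate_imu([1.0] * 100, dt=0.01)

# A stationary robot with a small hypothetical sensor bias of 0.05 m/s^2:
# after 10 s the integrated displacement has already drifted by about 2.5 m.
_, drift = integrate_imu([0.05] * 1000, dt=0.01)
```

The second call shows why a pose obtained purely by integration needs periodic correction from another sensor: a constant bias grows quadratically in the displacement.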
UCAR is equipped with a triangulation ranging lidar, which scans the environment at a frequency of 5–12 Hz at a height of 30 cm from the ground. The lidar creates point cloud information from the returned values. It then uses the Simultaneous Localization and Mapping (SLAM) technique to build a real-time point cloud map of the surroundings, which is employed for dynamic local path planning. The specific parameters of the lidar are displayed in Table 2.
Limited by the mechanical structure of the car body, UCAR is equipped with a small wide-angle camera to deal with computer vision tasks in complex situations. The camera takes photographs over a wide angular range, and machine vision models based on neural networks are applied to them. This is of significant utility when robots need to assess real-world scenes. Table 3 provides the specifications of the wide-angle camera.
With the advancement of human–computer interaction capabilities, more and more application settings are supporting voice-based human–computer interactions. Voice input has led to faster task performance and fewer errors than keyboard typing. UCAR provides a developer interface based on the iFLYTEK AI platform. In this study, we trained a voice recognition model by registering the online voice module training tool, which can be used as an intelligent vehicle’s voice interface function package.

2.2. Robot Assembly

After establishing the basic functionalities of an autonomous navigation robot for restaurant delivery based on the UCAR platform provided by iFLYTEK, we wanted to construct an autonomous mobile robot with lower cost and guaranteed performance. The rationale for building a low-cost robot separately is that, as indicated before, developing on a ROS-based system increases the cost of maintaining and learning to operate the system function packages. Mastering another robot operating system is unnecessary and out of step with the reality of company production. When algorithms are altered, a high degree of encapsulation and modularity leads to problems with compatibility and version commonality. In addition, the underlying hardware interfaces may increase the equipment price, and many ROS functions are redundant for single-field researchers. Our team’s research focuses on cluster communication and path planning. Although we quickly learned the ROS development method during the UCAR-based development process, it is not particularly beneficial for non-ROS devices meant for real-world application. The robot we assembled in-house comprises the following parts, and the final built robot is illustrated in Figure 2.
  • Motion part
The motion part realizes the omnidirectional movement of the robot on flat terrain for various mobile tasks. The Mecanum wheel is used in this mobile robot, although it has a large power loss in forward motion. A low-cost wheel with a 60 cm diameter, a plastic hub, and a rubber roller surface is used. The wheel coupling is a hexagonal cylindrical structure with an internal thread made by 3D printing. In the current market, set screws are typically used to directly match the D-shaped motor shaft with the wheel boss hole. In contrast, the matching method adopted in this study is more modular and separates the bushing, which bears a large amount of torque, from the wheel. This change has little value for metal wheels and materials with greater tensile strength, but it significantly improves the service life of plastic wheels, which are less rigid. It improves the fatigue strength of plastic wheels subjected to alternating loads, and the damage is concentrated on the low-cost printed hexagonal plastic column. The motion chassis is supported by four hexagonal copper columns. The lower chassis carries the battery and drive circuit, and the motor brackets are installed symmetrically on both sides. The motor bracket and the D-shaft DC motor are fastened by screws, and a through-hole thread connects the DC motor and the hexagonal coupling. The coupling is externally connected to a Mecanum wheel with hexagonal connection holes. There are four Mecanum wheels for omnidirectional movement.
  • Electric drive part
The power supply for the robot is a multi-channel voltage stabilizer and voltage division module, which takes a 12 V, 2 A supply and divides it into 15 voltage channels to suit different needs, such as 3.3 V, 5 V, and 12 V (adjustable) for the sensors, main control board, and motor drive, respectively. The battery pack provides the required output voltages and currents by connecting multiple 3 V lithium batteries in series and in parallel; this type of battery is used in most rechargeable consumer electronics designs in China. The milliampere-level output of the main control board cannot directly drive a high-current component such as a DC motor. Therefore, a TB6612N proportional current power amplifier is employed to generate sufficient driving current. With this amplifier, steady speed control of four 12-0.6A DC reduction motors is achieved to fulfill the needs of autonomous robot movement.
  • Control part
Multi-thread control, consisting of a central decision-making main thread and many unit sub-threads, is adopted for the robot. Each sub-thread is responsible for an independent scanning task, which means that each output and input port can work separately. Each sub-thread reports its system state to the main thread by altering global variables. The main thread carries out sensor fusion and control by scanning the state table as it changes in real time. In motion control, navigation decisions are based on the real-time IMU data and camera image data provided by the sub-threads, as discussed in the sensing part below.
  • Sensing part
Most current navigation technologies are based on lidar or cameras. Because the sampling frequency of time-varying signals is limited, inertial components such as gyroscopes and accelerometers always accumulate errors between the real state change and the measured state change. The longer the operation period, the less precise the measured displacement and angle changes become. Scanning the surrounding environment in real time allows the actual position to be known so that sensor errors can be corrected frequently. The advantage of lidar calibration is that an accurate position value can be acquired thanks to the stability of the laser, so the correction is accurate. The downside is that the laser signal frequency is so high that it is particularly sensitive to tiny changes. When the lidar vibrates slightly and the scanning surface is briefly not parallel to the ground, substantial drift occurs. In the face of large drift, algorithms such as DWA or TEB often become unstable and crash the whole system. To further lower the development cost, we employed the camera for calibration and used a neural network approach to identify feature points in the image, which involves data set preparation, network design, model training, and model invocation. The position of the landmark pattern laid on the site is acquired from the camera image through convolution and deconvolution of the input image, and the cumulative error of the IMU is then estimated and corrected. Target detection is still carried out by the YOLOv4 model introduced in the previous phase; the only differences are the detected object and the training data set.
Figure 3 shows the architecture schematic of our built robot. Based on this architecture, the equipment assembled in the experimental phase keeps the cost under control and shortens the production cycle.
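The control scheme described in the control part above, in which sub-threads write their state into shared global variables and a main thread scans that state table, can be sketched as follows. This is a minimal illustration under stated assumptions: the table keys `imu_yaw` and `landmark_seen`, the worker functions, and the simulated readings are hypothetical and stand in for real sensor drivers.

```python
import threading
import time

# Shared state table written by sub-threads and scanned by the main thread.
state_table = {"imu_yaw": None, "landmark_seen": None}
state_lock = threading.Lock()
stop_event = threading.Event()

def imu_worker():
    """Sub-thread: publish a simulated yaw reading until asked to stop."""
    yaw = 0.0
    while not stop_event.is_set():
        yaw += 0.1
        with state_lock:
            state_table["imu_yaw"] = yaw
        time.sleep(0.001)

def camera_worker():
    """Sub-thread: publish a simulated landmark detection flag until asked to stop."""
    while not stop_event.is_set():
        with state_lock:
            state_table["landmark_seen"] = True
        time.sleep(0.001)

def main_loop(max_scans=1000):
    """Main thread: scan the table until every sub-thread has reported once."""
    for _ in range(max_scans):
        with state_lock:
            snapshot = dict(state_table)
        if all(value is not None for value in snapshot.values()):
            return snapshot  # a real controller would fuse and act on this
        time.sleep(0.001)
    return None

workers = [threading.Thread(target=imu_worker), threading.Thread(target=camera_worker)]
for worker in workers:
    worker.start()
snapshot = main_loop()
stop_event.set()
for worker in workers:
    worker.join()
```

A lock is used here so that the main thread never reads a half-updated table; whether the original robot protects its global variables in this way is not stated in the text.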

3. UCAR System

The architecture for this robot is based on the UCAR, an open-source programmable intelligent vehicle device developed by iFLYTEK based on the Linux environment under the Ubuntu 18.04 system. UCAR is a compact apparatus dedicated to scientific research launched by iFLYTEK. It is provided with a complete variety of sensing and execution equipment, which can meet the functional implementation needs included in the paper. We designed this framework on the basis of UCAR to fulfill the functions of path planning, autonomous navigation, two-dimensional code recognition, voice interaction, and visual detection.

3.1. Module Development

Figure 4 depicts the task flowchart of the full voice interaction procedure. Once the system starts, the mobile robot is woken by a spoken command and moves until it reaches the first detection spot. There, it identifies the ArUco code and announces the dish name by voice. The task is classified as a failure if the robot misses the ArUco code location or cannot complete the identification. The robot then continues advancing until it reaches the second recognition position, where the feature matching operation determines the characteristics of the diners, such as glasses and long hair. The number of correctly recognized characteristics counts toward the final score. Similarly, if the robot misses the position or fails to recognize the features, the diner feature identification is deemed a failure. After that, the robot continues to travel until it reaches the endpoint, where it parks in the final position and announces the details of the completed task by voice. While fulfilling these duties, the robot needs to achieve five essential functions: path generation, autonomous navigation, voice interaction, two-dimensional code identification, and feature detection.

3.1.1. TEB Local Path Planner

Path planning is divided into global path planning and local path planning. Global path planning refers to determining the shortest path between two points. In this investigation, we used the Dijkstra method. The main principle of the Dijkstra algorithm is to identify the shortest path from the initial location to each node of the map. The method repeatedly takes the unvisited point with the shortest known distance as a bridge to update the shortest distances from the initial location to the remaining unvisited points. The algorithm iterates until the shortest global distance from the initial location to the goal point is obtained. By coupling this Dijkstra algorithm with the robot control strategy, the program implements the movement technique under a defined starting pose and with no restriction on backing up. Using odometry, the mobile robot determines its exact position and posture in the map. After traversing all of the essential spots, it ultimately stops at the end position.
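The principle described above can be sketched on a small occupancy grid. This is the textbook Dijkstra algorithm with unit step costs and 4-connected moves, not the upgraded variant used on the robot; the grid, start, and goal are hypothetical.

```python
import heapq

def dijkstra_grid(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid (0 = free, 1 = obstacle).

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    queue = [(0, start)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale queue entry, a shorter route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nd = d + 1
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(queue, (nd, (nr, nc)))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = dijkstra_grid(grid, (0, 0), (2, 0))
```

In this example the only gap in the middle row forces the path through cell (1, 2), so the route is six steps long even though start and goal are only two rows apart.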
The two predominant approaches to local path planning are the dynamic window approach (DWA) and the TEB local path planner. DWA samples different speeds in the speed space, simulates the motion trajectories of these speeds over a certain time, scores these trajectories with an evaluation function, selects the optimal speed, and sends it to the slave computer. The TEB algorithm, in contrast, makes subsequent changes to the initial trajectory created by the global path planner to optimize the mobility trajectory of the robot. During the navigation process, the TEB algorithm plans the local path by following the global path and the step size parameter settings. It continuously updates the obstacle information to ensure that the trajectory stays away from obstacles, with the expansion distance as the minimum clearance. To achieve multi-objective optimization, the TEB algorithm uses obstacle avoidance constraints, speed and acceleration dynamic constraints, and global path constraints. The output of the TEB algorithm is used to navigate the car to the prespecified points. The optimization goal of this study is to ensure that the autonomous mobile robot can avoid obstacles in real time, travel to numerous target points, cross U-shaped curves, and accomplish the mission in the least time.
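The sample, simulate, score, and select loop mentioned above can be sketched as follows. This velocity-sampling scheme is the one usually associated with the dynamic window approach; the sketch is not the TEB optimizer and not the authors' code. The candidate speeds, the clearance threshold, and the scoring function are assumptions made for the example.

```python
import math

def simulate(v, w, horizon=1.0, dt=0.1):
    """Roll a unicycle model forward from the origin; return the visited points."""
    x = y = theta = 0.0
    trajectory = []
    for _ in range(int(round(horizon / dt))):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
        trajectory.append((x, y))
    return trajectory

def score(trajectory, goal, obstacles, min_clearance):
    """Reject trajectories that come too close to an obstacle; otherwise
    prefer the one whose end point is nearest to the goal."""
    clearance = min(
        (math.hypot(px - ox, py - oy) for px, py in trajectory for ox, oy in obstacles),
        default=math.inf,
    )
    if clearance < min_clearance:
        return -math.inf
    end_x, end_y = trajectory[-1]
    return -math.hypot(goal[0] - end_x, goal[1] - end_y)

def best_velocity(goal, obstacles, min_clearance=0.15):
    """Sample (linear, angular) speed pairs and return the best-scoring pair."""
    candidates = [(v, w) for v in (0.2, 0.4, 0.6) for w in (-1.0, -0.5, 0.0, 0.5, 1.0)]
    return max(candidates, key=lambda c: score(simulate(*c), goal, obstacles, min_clearance))

# Goal straight ahead, with one obstacle slightly to the left of the direct route.
command = best_velocity(goal=(1.0, 0.0), obstacles=[(0.5, 0.1)])
```

With the obstacle present, the straight-ahead candidates at higher speed are rejected and the selected command curves to the right; without obstacles the fastest straight command wins.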

3.1.2. Autonomous Navigation

Multi-point navigation is the basis for achieving the autonomous movement of the developed robot. The robot uses nodes to publish target points in order, moves to each target point in turn based on global path planning, and avoids obstacles in real time. Starting from the map produced in advance, the SLAM algorithm is used to localize the robot and draw the real-time map of the surrounding environment simultaneously so that the robot can plan its path according to the map. The IMU posture sensor gathers inertial information to correct position and pose inaccuracies. Combined with the lidar-based distance sensor, obstacles such as walls and objects are identified, and the ideal path between two points and the chassis motion mode are computed. The adaptive Monte Carlo localization (AMCL) approach is utilized to aid the real-time estimation of the robot’s position, and the built-in encoder of the robot is employed to increase the precision of motion planning.

3.1.3. Voice Interaction

Voice interaction is a service that uses voice and text to establish a dialogue interface in an application. Currently, many robots can perform advanced automatic voice recognition based on deep learning [28], which can transform speech into text and provide natural language processing. The development framework proposed in this study uses the microphone array to sample an external voice and convert it into a digital signal. During feature extraction, the signal is transformed from the time domain to the frequency domain. Next, the extracted feature vector is translated into text by pattern matching. The acoustic and language models in the pattern matching stage determine the ultimate recognition quality. The labeled training data are used to generate an acoustic model and a language model. Because the iFLYTEK AI platform offers a pre-trained model, the training workload in this study (e.g., model training and parameter debugging) is significantly decreased.
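The time-domain to frequency-domain step can be illustrated with a naive discrete Fourier transform of one audio frame. This is only a stand-in for the spectral front end of a real recognizer; the feature extraction actually used by the iFLYTEK platform is not described here, and the sample rate, frame length, and test tone are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitude of the discrete Fourier transform for bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc))
    return spectrum

sample_rate = 8000  # Hz, hypothetical microphone sampling rate
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(80)]

spectrum = magnitude_spectrum(frame)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * sample_rate / len(frame)  # frequency of the strongest bin
```

For this 1 kHz test tone, the strongest bin corresponds to 1000 Hz. A practical system would use a fast Fourier transform and derive compact features from many overlapping frames.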

3.1.4. Two-Dimensional Code Identification

Mobile robots are often required to analyze QR codes to obtain specific information, such as the characteristics of dishes and the delivery positions in the smart restaurant environment, workpiece information and equipment status in the smart factory environment, and cargo information and distribution requirements in the smart storage environment. Two-dimensional codes are currently the most widely used form of graphical coding. Here, we used the ArUco code. According to the assumed task requirements, each ArUco tag in this project is a 4 × 4 bit binary pattern, whose binary reference mark can be used for camera pose estimation. The ArUco code is a composite square marker composed of a wide black border and an internal binary matrix that determines its identifier. The black border facilitates its rapid detection in an image, and the binary coding allows for its recognition and the application of error detection and correction techniques. We implemented the ArUco module based on OpenCV to enable pose estimation and camera correction features.
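To make the role of the internal binary matrix concrete, the sketch below identifies a 4 × 4 bit pattern by comparing it, under all four rotations, with a small dictionary. The two patterns and their identifiers are invented for the example and are not entries of any OpenCV dictionary; border detection, perspective correction, and the error correction mentioned above are left out.

```python
def rotate90(bits):
    """Rotate a square bit matrix by 90 degrees clockwise."""
    return [list(row) for row in zip(*bits[::-1])]

def identify_marker(bits, dictionary):
    """Return (marker_id, clockwise_quarter_turns_needed) or None if unknown."""
    candidate = [list(row) for row in bits]
    for turns in range(4):
        for marker_id, pattern in dictionary.items():
            if candidate == pattern:
                return marker_id, turns
        candidate = rotate90(candidate)
    return None

# Hypothetical two-entry dictionary of 4 x 4 inner bit patterns.
TOY_DICTIONARY = {
    7: [[1, 0, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 0]],
    12: [[0, 1, 1, 0],
         [1, 0, 0, 0],
         [1, 1, 0, 1],
         [0, 0, 1, 1]],
}

# A marker seen after the camera image was rotated by a quarter turn.
observed = rotate90(TOY_DICTIONARY[7])
result = identify_marker(observed, TOY_DICTIONARY)
```

The number of quarter turns needed to match the stored pattern tells the detector how the marker is oriented in the image, which is part of what makes the marker usable for pose estimation.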

3.1.5. Feature Detection

Feature detection is one of the basic tasks in computer vision and also a basic task of video surveillance technology. It is complex because the targets in a video may appear in multiple positions, may be occluded, and may display erratic motion. Furthermore, feature detection is affected by the depth of field, resolution, weather, illumination, and other variables of the surveillance video, as well as the scene’s diversity. The results of the feature detection method directly affect the subsequent tracking, action recognition, and behavior description. Typical feature detection algorithms can be split into two categories. The first is the R-CNN family based on region proposals, which generates target candidate boxes and then performs classification and regression on them. The second is one-stage algorithms, such as YOLO and SSD, which use only a convolutional neural network (CNN) to predict the categories and positions of distinct targets directly. Algorithms in the first category have higher accuracy but lower speed, whereas algorithms in the second category are faster but less accurate. In this study, YOLOv4 target detection and SIFT feature matching are both employed; each can achieve image recognition, but their outcomes differ.

4. Implementation and System Performance Evaluation

This section describes the experimental simulation scenarios and then analyzes the implementation method and results of path planning, feature detection, and voice recognition under different parameter settings. To lower the experimental cost while still attaining satisfactory results, a self-assembled robot that provides the same functions is designed and built, and its assembly modules are introduced.

4.1. Test Scenario

The test environment for our robot is a simulated smart restaurant. The site is built with 5 mm grey PP boards and board connectors. After the robot is placed on the site, it is triggered by an operator’s voice, and the machine vision function is then checked by placing identifiers at certain areas. According to the identification results, the actual position of the endpoint and the voice broadcast results are determined to imitate the working scenario in the smart restaurant environment. The test site is shown in Figure 5. The starting point A of the robot is in the green round box in the upper right corner. The red circle on the left is the end position of the robot. The QR code is inside the yellow circle on the lower left.

4.2. Performance Evaluation

4.2.1. Motion Adjustment Based on TEB Algorithm

We mainly tuned the parameters of the TEB local planner using the graphical user interface (GUI) of rqt_reconfigure. The system timer records the running time during debugging. The tests showed that the parameters are correlated, so a particular parameter cannot be altered independently. Table 4 outlines the roles of some of the important parameters. We first identified the approximate range of each parameter and then collected data through experiments. We then employed the range analysis method to assess the parameters’ preselected values. From this analysis, the influence weight of each parameter on the job completion time and its optimal value are derived, debugging suggestions are given, and a set of optimal parameter combinations is obtained. Taking the four parameters in Table 5 as an example, we used an online version of SPSS for range analysis. Each parameter has three levels, and data are collected through an orthogonal test to obtain the ideal value of each parameter. The R value, called the extreme difference of a factor, is the difference between the maximum and minimum of its K values. Comparing the R values shows that max_vel_x is the most influential factor, followed by dt_ref. Figure 6 illustrates the average value of each parameter obtained through experiments, and a set of optimal motion parameters after long-term debugging is shown in Table 5. The best complete running time achieved on the site is 20.58 s. The ideal parameters differ between devices because of varying hardware. Here, we report the optimal parameters obtained with the debugging method described above. Although they will vary in different contexts, the differences should stay within a limited range.
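The range analysis described above can be sketched for a nine-run orthogonal test with four three-level factors. The completion times below are invented purely to show the calculation and are not the measured data; the factor names `max_vel_x` and `dt_ref` follow the parameters named above, while `factor_b` and `factor_d` are placeholders. K is taken here as the mean response per level.

```python
def range_analysis(levels, results):
    """Return (mean response per level, R = max mean - min mean) for one factor."""
    sums, counts = {}, {}
    for level, value in zip(levels, results):
        sums[level] = sums.get(level, 0.0) + value
        counts[level] = counts.get(level, 0) + 1
    k_means = {level: sums[level] / counts[level] for level in sums}
    r_value = max(k_means.values()) - min(k_means.values())
    return k_means, r_value

# Standard L9 orthogonal array: one column of level indices per factor.
design = {
    "max_vel_x": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "factor_b":  [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "dt_ref":    [1, 2, 3, 2, 3, 1, 3, 1, 2],
    "factor_d":  [1, 2, 3, 3, 1, 2, 2, 3, 1],
}
# Hypothetical completion times in seconds for the nine runs.
times = [26.0, 25.0, 24.5, 23.0, 22.5, 23.5, 21.0, 21.5, 22.0]

r_values = {name: range_analysis(levels, times)[1] for name, levels in design.items()}
ranking = sorted(r_values, key=r_values.get, reverse=True)

# Since a shorter time is better, the preferred level of a factor is the one
# with the smallest mean completion time.
k_means, _ = range_analysis(design["max_vel_x"], times)
best_level = min(k_means, key=k_means.get)
```

A larger R means that changing the factor's level moves the completion time more, so the ranking orders the factors by influence.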

4.2.2. Feature Recognition Based on Machine Vision

The smart restaurant setting depicted in this paper is based on the 16th China University Student Smart Car Competition. The competition challenges players to use cameras to identify foam boards with human images placed randomly on the simulated track and to classify the images by two traits, namely long hair and spectacles. Finally, the robot announces by voice the number of characters with long hair and the number wearing spectacles.
The competition officially provided eight reference portrait photos as the standard. Based on the UCAR robot, we implemented both target detection and image matching. Target detection relies on the long-hair and glasses features, for which 600 labels were annotated. A target detection neural network model is trained on the image data, and the features in the image to be recognized are then marked and selected directly. Image matching is based on similarity calculation. Provided that the image features are essentially unchanged, the similarity between the image to be recognized and each of the eight reference images is calculated, and the image is assigned to the reference with the highest similarity to determine whether it belongs to the "long hair" or "glasses" category. Considering the performance of several common target detection frameworks shown in Table 6, based on the data in the literature [29], we selected YOLOv4 as the target detection framework by balancing mean average precision (mAP) against detection speed.
We trained YOLOv4 on a personal computer with an Intel Core i5 9300 processor through the Darknet framework and then deployed the trained model on the UCAR. The training set and the test set together contain about 1000 pictures, divided at a ratio of nine to one, and the training pictures are all full-body photos of people. The training data set comes from pictures collected from the Internet and is annotated with the labelme tool. After 3 h of training on the personal computer, the recognition rate of the resulting model stabilized at 95.8%. Figure 7 shows the accuracy (mAP) at the end of the training, where the blue line is the change in average loss, and the command-line window shows the estimated training time and real-time parameters such as the average loss and batch size. The model accuracy obtained with different learning rates is shown in Table 7. Figure 8 depicts the recognition process on the test set on the personal computer. The left side is the output of the running model, and the right side shows the detected long hair, glasses, and humans with these features, together with the confidence level of each detection.
The model requires so much computation that general embedded devices cannot support the recognition model and high-rate image acquisition at the same time. Running the YOLOv4 model on the UCAR causes the frame rate to drop. If the frame rate falls to 1 fps, only one photo can be taken per second, and such a low frame rate leads directly to blurred images that are unsuitable for recognition. Therefore, we adopted another method, based on image processing, to meet the competition's requirements. The SIFT feature matching algorithm is implemented by calling the OpenCV toolkit. The result of the algorithm is shown in Figure 9. The female portrait in the picture is one of the pictures to be recognized as required by the task. The green lines in Figure 9 connect feature points in the template picture with the corresponding feature points in the real-time image, so the effect of feature matching is displayed intuitively.
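The matching-and-classification idea can be sketched without OpenCV as follows. This is an illustration under stated assumptions rather than the implementation used on the UCAR: it assumes each image has already been reduced to a list of descriptor vectors (in practice these could come from OpenCV's `cv2.SIFT_create().detectAndCompute`), it scores similarity as the number of matches passing Lowe's ratio test (a common choice that the paper does not confirm), and the labels `ref_A` and `ref_B` and all descriptor values are made up.

```python
import math


def ratio_test_matches(query, template, ratio=0.75):
    """Count query descriptors whose nearest template descriptor passes the ratio test."""
    good = 0
    for q in query:
        # Distances from this query descriptor to every template descriptor, nearest first.
        dists = sorted(math.dist(q, t) for t in template)
        # Keep the match only if the nearest neighbour is clearly better than the second.
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            good += 1
    return good


def classify(query, references):
    """Return (label, scores), where label is the reference with the most good matches."""
    scores = {label: ratio_test_matches(query, desc) for label, desc in references.items()}
    return max(scores, key=scores.get), scores


# Toy 3-D descriptors standing in for 128-D SIFT descriptors (hypothetical values).
references = {
    "ref_A": [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    "ref_B": [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)],
}
query = [(0.05, 0.0, 0.98), (0.97, 0.02, 0.0), (5.0, 5.1, 5.0)]

label, scores = classify(query, references)
print(label, scores)
```

With eight reference portraits, `references` would hold eight entries, and the winning label would then be mapped to its known long-hair or glasses attributes.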

4.2.3. Voice Interaction Based on iFLYTEK AI

The voice-related modules of the UCAR can be started through the microphone module at the top of the UCAR. The voice functions of the UCAR are provided by the xf_mic_asr_offline and xf_mic_tts_offline function packages. Voice wake-up is implemented by using the xf_mic_asr_offline client in the iFLYTEK SDK. After a series of microphone initialization operations are performed on the UCAR, we can speak keywords to wake up the car and then call the offline command word recognition interface through the get_offline_recognise_result_srv service. The speech recognition function can be used simply by setting the recognition keywords and downloading the files generated by platform training to the UCAR.

4.2.4. Experimental Lessons

Blindly attempting many remedies during the experiment is not only very inefficient but also makes it difficult to determine the main cause of a problem. For example, when debugging the parameters of the TEB algorithm, the parameters affect each other, and blind adjustment is very unlikely to have a beneficial effect. Reasoning about the theory behind the experimental phenomenon is the more effective approach.
At the same time, the devices installed on the equipment should be used to the full. For example, instead of relying on radar alone for detection, information from the radar and the IMU sensor should be combined. Making full use of the equipment yields better outcomes.

4.3. Assembled Robot

We used SolidWorks to build a 3D model of the robot chassis and plan the robot's layout before assembly. The 3D model helps reduce wiring difficulty and avoid physical interference between modules, short circuits caused by metal contact, local overheating, and other defects, as demonstrated in Figure 10. The complete machine uses a low-deformation aluminum plate as the basic frame of the chassis. To keep the structure compact, the components are arranged in two layers, which are joined by threaded hexagonal copper standoffs.
The upper layer carries the cameras, two motor drivers, one voltage regulator module, and a touch display, which are used for vision, drive, and status display. The lower layer carries a 12 V battery, a Raspberry Pi control board, eight light sensors, and a wire collector, which are used for power supply, control, and cable fixing. The four Mecanum wheels are fixed to the motors via separate hexagonal couplings, and each motor is attached to the chassis through a sheet-metal motor bracket. Considering that the robot may be used in intelligent work scenarios other than a smart restaurant, a four-DOF mechanical arm is attached to extend its interaction capabilities. Moreover, a gray-scale sensor module is used to realize line tracking based on the detection of landmark and boundary lines. This mobile robot is built economically from parts obtained from the Taobao website. The prices of the key parts of the robot are listed in Table 8, and the overall price of the robot is only 1635 RMB.
In this paper, the radar and API of a Chinese company (called Silan) are used to realize fuzzy state estimation of the surrounding obstacles. The method was debugged on the same field and eventually produced an effect similar to that of the UCAR. The specific experimental comparison data are presented in Table 9. The similarity of basic motion parameters is the average, over three indicators (maximum speed, maximum acceleration, and minimum turning radius), of the similarity between the two robots, with UCAR used as the reference.
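One way to read the similarity measure is sketched below. The paper does not give the per-indicator formula, so this sketch assumes each indicator's similarity is the ratio of the smaller value to the larger one, and the three ratios are then averaged. All numeric values are hypothetical and are not measurements from either robot.

```python
def motion_similarity(reference, candidate):
    """Average per-indicator similarity between two robots (1.0 means identical)."""
    ratios = []
    for key, ref_value in reference.items():
        cand_value = candidate[key]
        # Assumed definition: smaller value over larger value, so each ratio lies in (0, 1].
        ratios.append(min(ref_value, cand_value) / max(ref_value, cand_value))
    return sum(ratios) / len(ratios)


# Hypothetical indicator values (m/s, m/s^2, m); UCAR is the reference.
ucar = {"max_speed": 1.0, "max_accel": 2.0, "min_turn_radius": 0.5}
assembled = {"max_speed": 0.8, "max_accel": 1.8, "min_turn_radius": 0.5}

similarity = motion_similarity(ucar, assembled)
print(f"similarity = {similarity:.0%}")
```

Under this definition the reference robot scores 1 against itself, which matches the way Table 9 reports UCAR as the baseline.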
To further simulate an actual restaurant scene, we used multiple assembled robots to mimic the food delivery process on a grid map. Figure 11 shows the test scene of the assembled robots: a 6 × 6 m grid map simulating a simplified restaurant. The three marked points on the lower side of the picture are the meal pickup points, numbered 1 to 3; the three meal delivery target points are marked on the upper side. The three assembled robots start from different meal pickup points, move to the positions corresponding to their delivery tasks, and arrive at the three target points in Figure 11 to complete the simulated handling. The test used 30 randomly assigned food delivery tasks, with each AGV completing 10 deliveries; the average total time over 20 repeated tests was 912 s.
Compared with UCAR, our assembled robot has a large cost advantage, which benefits researchers focused on robot theory and specialized function algorithms. Our design can also be readily assembled in-house. Moreover, our robot can be easily adapted to diverse settings with different functional requirements.

5. Conclusions and Future Work

In this study, we participated in an intelligent car competition to test the performance of the open-source robot platform built by iFLYTEK. We deployed the robot function packages through the upper computer within the general ROS framework and implemented path planning, autonomous navigation, voice interaction, QR code identification, feature detection, and other functions on this new device. The results indicate that the UCAR mobile robot platform can support robot research and project development. Through experiments, we obtained the optimal parameters for deploying the TEB algorithm on the device, determined the actual influence of the various parameters on the robot's operation through practical debugging, and achieved a running time of 20.58 s on the 16 square meter test site. In addition, we deployed two machine vision approaches, one based on deep learning and one based on image processing, to test the feasibility and practical benefits of image detection on the UCAR as a low-cost open robot platform based on ROS. We also tested the AI voice tool built by iFLYTEK on the UCAR in terms of two functions: speech recognition and text-to-speech conversion. Furthermore, we designed and assembled a mobile robot using parts acquired from the Taobao website; this robot was cheaper than the UCAR offered by iFLYTEK.
In the future, we will improve our mobile robot design with respect to environmental sensing and aesthetics. In addition, more realistic tests will be developed to evaluate the performance of our assembled robots. Developing the mobile robot control software will also be an important component of our future effort. We will further examine the use of context histories for recording the data generated during the operation of robots in smart settings. Context prediction can be used to forecast better routes and future situations in which the robot can act. For example, it could predict which requests to prioritize in a smart restaurant and anticipate critical situations in the restaurant.

Author Contributions

Conceptualization, P.G., H.S. and S.W.; methodology, H.S. and S.W.; software, P.G. and H.S.; validation, H.S., S.W., L.T. and Z.W.; formal analysis, S.W.; investigation, H.S. and Z.W.; resources, P.G.; writing—original draft preparation, H.S. and P.G.; writing—review and editing, P.G. and L.T.; visualization, H.S.; supervision, P.G.; project administration, P.G.; funding acquisition, P.G. All authors have read and agreed to the published version of the manuscript.


Funding

This research was funded by Sichuan Science and Technology Program Grant No. X2022NSFSC0459 and the Open Research Fund of Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education in China Grant No. GZUAMT2021KF05.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Fragapane, G.; de Koster, R.; Sgarbossa, F.; Strandhagen, J.O. Planning and control of autonomous mobile robots for intralogistics: Literature review and research agenda. Eur. J. Oper. Res. 2021, 292, 405–426.
  2. Guazzini, A.; Fiorenza, M.; Panerai, G.; Duradoni, M. What went wrong? Predictors of contact tracing adoption in Italy during COVID-19 pandemic. Future Internet 2021, 13, 286.
  3. Sindermann, C.; Sha, P.; Zhou, M.; Wernicke, J.; Schmitt, H.S.; Sariyska, R.; Stavrou, M.; Becker, B.; Montag, C. Assessing the attitude towards artificial intelligence: Introduction of a short measure in German, Chinese, and English language. KI—Künstliche Intell. 2021, 35, 109–118.
  4. Shimmura, T.; Ichikari, R.; Okuma, T. Human-Robot Hybrid Service System Introduction for Enhancing Labor and Robot Productivity. In Advances in Production Management Systems. Towards Smart and Digital Manufacturing; Lalic, B., Majstorovic, V., Marjanovic, U., von Cieminski, G., Romero, D., Eds.; IFIP Advances in Information and Communication Technology; Springer International Publishing: Cham, Switzerland, 2020; pp. 661–669.
  5. Shimmura, T.; Ichikari, R.; Okuma, T.; Ito, H.; Okada, K.; Nonaka, T. Service robot introduction to a restaurant enhances both labor productivity and service quality. Procedia CIRP 2020, 88, 589–594.
  6. Cheong, A.; Lau, M.; Foo, E.; Hedley, J.; Bo, J.W. Development of a Robotic Waiter System. IFAC-PapersOnLine 2016, 49, 681–686.
  7. Blöcher, K.; Alt, R. AI and robotics in the European restaurant sector: Assessing potentials for process innovation in a high-contact service industry. Electron. Mark. 2021, 31, 529–551.
  8. Jeong, M.; Kim, K.; Ma, F.; DiPietro, R. Key factors driving customers’ restaurant dining behavior during the COVID-19 pandemic. Int. J. Contemp. Hosp. Manag. 2022, 34, 836–858.
  9. Yu, Q.; Yuan, C.; Fu, Z.; Zhao, Y. An autonomous restaurant service robot with high positioning accuracy. Ind. Robot. 2012, 39, 271–281.
  10. Ma, E.; Bao, Y.; Huang, L.; Wang, D.; Kim, M.S. When a Robot Makes Your Dinner: A Comparative Analysis of Product Level and Customer Experience between the U.S. and Chinese Robotic Restaurants. Cornell Hosp. Q. 2021.
  11. Yang, J.; Gope, P.; Cheng, Y.; Sun, L. Design, analysis and implementation of a smart next generation secure shipping infrastructure using autonomous robot. Comput. Netw. 2021, 187, 107779.
  12. Pudu. Smart Delivery Robot-Pudu Robotics. 2021. Available online: (accessed on 20 December 2021).
  13. Chitta, S.; Marder-Eppstein, E.; Meeussen, W.; Pradeep, V.; Rodríguez Tsouroukdissian, A.; Bohren, J.; Coleman, D.; Magyar, B.; Raiola, G.; Lüdtke, M.; et al. ros_control: A generic and simple control framework for ROS. J. Open Source Softw. 2017, 2, 456.
  14. Ladosz, P.; Coombes, M.; Smith, J.; Hutchinson, M. A Generic ROS Based System for Rapid Development and Testing of Algorithms for Autonomous Ground and Aerial Vehicles. In Robot Operating System (ROS): The Complete Reference; Studies in Computational Intelligence; Springer International Publishing: Cham, Switzerland, 2019; Volume 3, pp. 113–153.
  15. Chivarov, S.; Kopacek, P.; Chivarov, N. Cost oriented humanoid robot communication with iot devices via mqtt and interaction with a smart home hub connected devices. IFAC-PapersOnLine 2019, 52, 104–109.
  16. Tkáčik, M.; Březina, A.; Jadlovská, S. Design of a Prototype for a Modular Mobile Robotic Platform. IFAC-PapersOnLine 2019, 52, 192–197.
  17. Noh, S.; Park, J.; Park, J. Autonomous Mobile Robot Navigation in Indoor Environments: Mapping, Localization, and Planning. In Proceedings of the 2020 International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Korea, 21–23 October 2020; pp. 908–913.
  18. Belanche, D.; Casaló, L.V.; Flavián, C.; Schepers, J. Service robot implementation: A theoretical framework and research agenda. Serv. Ind. J. 2020, 40, 203–225.
  19. Silva, G.R.; Becker, L.B.; Hübner, J.F. Embedded architecture composed of cognitive agents and ros for programming intelligent robots. IFAC-PapersOnLine 2020, 53, 10000–10005.
  20. Wang, X.V.; Wang, L.; Lei, M.; Zhao, Y. Closed-loop augmented reality towards accurate human–robot collaboration. CIRP Ann. 2020, 69, 425–428.
  21. Oliveira, M.; Castro, A.; Madeira, T.; Pedrosa, E.; Dias, P.; Santos, V. A ros framework for the extrinsic calibration of intelligent vehicles: A multi-sensor, multi-modal approach. Robot. Auton. Syst. 2020, 131, 103558.
  22. Fennel, M.; Geyer, S.; Hanebeck, U.D. Rtcf: A framework for seamless and modular real-time control with ros. Softw. Impacts 2021, 9, 100109.
  23. Fragapane, G.; Hvolby, H.-H.; Sgarbossa, F.; Strandhagen, J.O. Autonomous Mobile Robots in Hospital Logistics. In Advances in Production Management Systems. The Path to Digital Transformation and Innovation of Production Management Systems; Lalic, B., Majstorovic, V., Marjanovic, U., von Cieminski, G., Romero, D., Eds.; IFIP Advances in Information and Communication Technology; Springer International Publishing: Cham, Switzerland, 2020; pp. 672–679.
  24. Simonetto, M.; Sgarbossa, F. Introduction to Material Feeding 4.0: Strategic, Tactical, and Operational Impact. In Advances in Production Management Systems. The Path to Digital Transformation and Innovation of Production Management Systems; Lalic, B., Majstorovic, V., Marjanovic, U., von Cieminski, G., Romero, D., Eds.; IFIP Advances in Information and Communication Technology; Springer International Publishing: Cham, Switzerland, 2020; Volume 591, pp. 158–166.
  25. Santos, N.B.; Bavaresco, R.S.; Tavares, J.E.R.; Ramos, G.d.O.; Barbosa, J.L.V. A systematic mapping study of robotics in human care. Robot. Auton. Syst. 2021, 144, 103833.
  26. Liu, Y.; Wang, X.; Wang, S. Research on service robot adoption under different service scenarios. Technol. Soc. 2022, 68, 101810.
  27. da Rosa Tavares, J.; Victória Barbosa, J. Ubiquitous healthcare on smart environments: A systematic mapping study. J. Ambient. Intell. Smart Environ. 2020, 12, 513–529.
  28. Zhang, Z.; Geiger, J.; Pohjalainen, J.; Mousa, A.E.-D.; Jin, W.; Schuller, B. Deep learning for environmentally robust speech recognition: An overview of recent developments. ACM Trans. Intell. Syst. Technol. 2018, 9, 1–28.
  29. Yu, Z.; Shen, Y.; Shen, C. A real-time detection approach for bridge cracks based on yolov4-fpm. Autom. Constr. 2021, 122, 103514.
Figure 1. UCAR structure diagram.
Figure 2. Photograph of the assembled robot.
Figure 3. Architecture diagram of robot assembled in house.
Figure 4. Task flowchart.
Figure 5. Test site.
Figure 6. The means of each level of the factors.
Figure 7. Training process.
Figure 8. Recognition result.
Figure 9. Feature matching.
Figure 10. Low-cost design scheme of robot.
Figure 11. The actual test scene.
Table 1. Overall hardware composition.

| Item | Specification |
| --- | --- |
| Master computer | Jetson Nano |
| CPU | Quad-core A57 (dominant frequency: 1.43 GHz) |
| Storage | 4 GB LPDDR4, 16 GB onboard eMMC |
| Signal communication | WIFI 802.11a/b/g/n/ac, Bluetooth4.0 BT3.0 |
| Hardware interface | 1 interface, 2 USB, 1 network interface |
| Slave computer | STM32F4 series MCU |
Table 2. Values of lidar parameters.

| Parameter | Value |
| --- | --- |
| Ranging frequency | 4000–9000 Hz |
| Scanning frequency | 5–12 Hz |
| Ranging range | 0.1–16 m |
| Maximum scanning angle | 360° |
| Angular resolution | 0.26–0.3 |
Table 3. Parameters of a wide-angle camera.

| Parameter | Value |
| --- | --- |
| Horizontal viewing angle | 124.8° |
| Vertical viewing angle | 67° |
| Frame rate | 30 fps (1920 × 1080) |
| Transmission interface | USB2.0 |
Table 4. Parameter results of the TEB algorithm (part).

| Main Parameters | Effect | Optimal Value |
| --- | --- | --- |
| planner_frequency | Execution frequency of global planning; affects how quickly the global plan is updated in real time | 5 |
| cost_factor | Affects the performance of global path planning; values that are too high or too low reduce the path quality | 0.55 |
| inflation_radius | Determines the size of the inflation layer, i.e., how far the zero-cost point is from an obstacle | 0.3 |
| cost_scaling_factor | A high value makes the cost attenuation steeper; when the slope of the curve is high, the robot tends to approach obstacles | 5 |
| update_frequency | Cost map update frequency; can be used for dynamic obstacle avoidance in the local cost map | 10 |
| xy_goal_tolerance | Distance tolerance to the target point in the X/Y direction; too small a value may cause robot oscillation | 0.06 |
| yaw_goal_tolerance | Angular tolerance (in radians) to the target point; too small a value may cause robot oscillation | 0.06 |
| enable_multithreading | Whether to allow multi-threaded parallel processing | False |
| min_obstacle_dist | Minimum distance from obstacles; too large a value may prevent the robot from moving | 0.14 |
| weight_kinematics_nh | Optimization weight for satisfying nonholonomic kinematics; used in combination with the Y-direction speed to realize the Mecanum wheel characteristics | 10 |
| weight_optimaltime | The greater the time weight, the faster the car accelerates to the highest speed to realize a straight-line sprint | 20 |
| max_vel_y | Maximum speed in the Y direction; can be used for oblique driving with Mecanum wheels | 0.7 |
| max_vel_x | Maximum speed in the X direction; can be used for omnidirectional movement with Mecanum wheels | 1.1 |
| dt_ref | Step size of the local path; the shorter the step size, the higher the accuracy | 0.35 |
| Best running time | | 20.58 s |
Table 5. Range analysis.

| K values | 0.3---23.30 |
| Best level | 1.1 | 10 | 20 | 0.35 |
Table 6. Metrics comparison.

| Models | mAP | Detection Speed |
| --- | --- | --- |
| Faster-RCNN | 0.909 | 277.8 ms |
| SSD300 | 0.882 | 178.6 ms |
| YOLOv4 | 0.912 | 16.4 ms |
| CrackDN | 0.964 | 161.3 ms |
Table 7. Learning rate selection.

| Learning Rate | mAP |
| --- | --- |
Table 8. The prices of the main parts of the assembled robot.

| Part | Unit Price (RMB) | Quantity | Sum (RMB) |
| --- | --- | --- | --- |
| 12 V geared motor of Chihai 370 | 50 | 4 | 200 |
| RPLIDAR A1M8 | 498 | 1 | 498 |
| TB6612FNG motor driver | 16 | 2 | 32 |
| Yabo intelligent four-way photoelectric sensor | 20 | 2 | 40 |
| Risym MPU6050 inertial sensor | 90 | 1 | 90 |
| Longqiu 12/5/3.3 multi-power module | 20 | 1 | 20 |
| Aluminum Alloy Metal Mecanum Wheel Chassis from Hummingbird Labs | 213 | 1 | 213 |
| Guanyuetang Raspberry Pi 4b + 16 G SD card | 431 | 1 | 431 |
| Metal fasteners of Jinchao flagship store | 51 | 1 | 51 |
| 12 V battery for QISUO | 60 | 1 | 60 |
| Total | | | 1635 |
Table 9. Performance comparison of self-assembling robot and UCAR.

| Compare Items | UCAR | Self-Assembling Robot |
| --- | --- | --- |
| The shortest completion time of the TEB algorithm/s | 25.4 | 37.2 |
| The shortest completion time of the DWA algorithm/s | 30.5 | 46.4 |
| The task success rate of the TEB algorithm | 85% | 84% |
| The task success rate of the DWA algorithm | 94% | 96% |
| Average crash rate | 15% | 10% |
| Character recognition rate | 68% | 95% |
| Total cost | 20,000 | 1635 |
| Algorithm debugging time (Day) | 30 | 10 |
| Similarity of basic motion parameters | 1 | 87% |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
