Article

A Novel Approach for Target Attraction and Obstacle Avoidance of a Mobile Robot in Unknown Environments Using a Customized Spiking Neural Network

by Brwa Abdulrahman Abubaker, Jafar Razmara * and Jaber Karimpour
Department of Computer Science, University of Tabriz, Tabriz 51666-16471, Iran
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(24), 13145; https://doi.org/10.3390/app132413145
Submission received: 4 November 2023 / Revised: 27 November 2023 / Accepted: 29 November 2023 / Published: 11 December 2023

Abstract

In recent years, implementing reinforcement learning in autonomous mobile robots (AMRs) has become challenging. Traditional methods face complex trials, long convergence times, and high computational requirements. This paper introduces an innovative strategy that uses a customized spiking neural network (SNN) for the autonomous learning and control of mobile robots in unknown environments. The model combines spike-timing-dependent plasticity (STDP) with dopamine modulation for learning and employs the Izhikevich neuron model, yielding a biologically inspired and computationally efficient control system that adapts to changing environments. The performance of the model is evaluated in a simulated environment that replicates real-world scenarios with obstacles. In the initial training phase, the model faces significant challenges: integrating brain-inspired learning, dopamine modulation, and the Izhikevich neuron model adds complexity, and the model reaches its target in only 33% of trials while colliding with obstacles 67% of the time, indicating its struggle to adapt to complex obstacles. However, performance improves markedly in the testing phase, after the robot has learned: accuracy in reaching the target rises to 94%, and collisions with obstacles fall to 6%. This shift demonstrates the adaptability and problem-solving capabilities of the model in the simulated environment, making it more suitable for real-world applications.

1. Introduction

Mobile robots, which can move independently, are of utmost importance in robotics. These intricate systems are equipped with sensors to perceive their surroundings, a control system to plan their movements, and actuators to execute those plans. Various types of mobile robots exist, including humanoid robots [1], unmanned vehicles [2], entertainment companions [3], and drones [4]. What sets mobile robots apart is their capacity to make decisions based on their perception of the environment. To navigate unfamiliar surroundings, they require a robust cognitive system that processes information from diverse sources and adapts to changing circumstances [5]. Mobile robots require a steady flow of input data, decoding methods, and action plans to respond effectively to an ever-changing environment. They can be classified into different categories based on their design, operating setting, and mode of operation. The most prevalent class is autonomous robots, which can navigate unrestricted environments without physical or electromechanical guidance devices [6].
Ordinary differential equations frequently serve as models for neurons. Numerous mathematical descriptions of spiking neural models can be found in the literature, including the Hodgkin–Huxley model [7], the Integrate-and-Fire (IF) model [8], and the Izhikevich model [9]. The present study employs the Izhikevich model [10], which combines the biological plausibility of Hodgkin–Huxley dynamics with the computational efficiency of the IF neuron. In SNNs, as in biological neural networks, learning is predicated on alterations in the strength of the synapses between neurons. If a presynaptic neuron fires before a postsynaptic neuron, the synaptic strength is enhanced; conversely, if the order is reversed, a decrease occurs. This dependence of learning on the relative timing of neuronal impulses is called spike-timing-dependent plasticity (STDP). The study utilizes the STDP rule modulated by dopamine as its learning algorithm [11].
Mobile robots rely significantly on sensor input to gather information about their surroundings, enabling them to execute actions like moving, stopping, or rotating. These sensors encompass ultrasonic, laser, torque, and vision sensors, transmitting data to the robot’s controller. Research conducted by Ko and Kuc (2015) and Müller (2017) has put forward algorithms to interpret this incoming data, typically transmitted as reflected light beams or sound signals, thus aiding the robot in gaining insights about its environment [12]. Nevertheless, object signal obstructions can lead to occasional inaccuracies in the data these sensors receive [13].
Several studies have investigated the use of SNNs in designing robot control systems. For example, Azimirad and Sani [14] analyzed SNNs inspired by the brain’s circuitry for robot behavioral learning and found that teaching the spiking architecture of these circuits can lead to successful target attraction. Similarly, Lu et al. [15] used modified neuron models to create an autonomous learning paradigm for robots, allowing them to explore complicated settings independently. Liu et al. [16] developed a mobile robot control system based on SNNs and a reward-modulated algorithm, enabling multitask autonomous learning. Wang et al. [17] proposed a new decision-making approach for swarm robots, ESB-MADDPG, which combines an expert system with multi-agent deep deterministic policy gradient reinforcement learning; the expert system speeds up training and allows the robots to reuse pre-trained paths without frequent retraining.
On the other hand, Lobov et al. [18] investigated how the spatial features of STDP (spike-timing-dependent plasticity) can facilitate associative learning in small SNNs. Jiang et al. [19] proposed Hough-transform-based SNNs for target recognition in asynchronous event streams from non-visual sensors (NVS), demonstrating their potential for identifying targets in real-world settings. Harandi et al. [20] introduced a technique that uses photographs to provide environmental data for robot control, reducing dimensionality while maintaining performance. SNNs have also been explored for obstacle avoidance and navigation. Wang et al. [21] presented an SNN-based algorithm for mobile robot obstacle avoidance, showing its effectiveness in navigating complex environments. Mnih et al. [22] used deep reinforcement learning (DRL) methods to successfully teach a mobile robot to navigate crowded indoor environments. Ge et al. [23] proposed an SNN-based algorithm for obstacle avoidance using ultrasonic sensors and demonstrated its success in avoiding obstacles in a simulated environment. These studies highlight the potential of SNNs and DRL methods for developing effective robot control systems for tasks such as target attraction, navigation, and obstacle avoidance in real-world settings. Reinforcement learning (RL) in the real world is challenging because it requires numerous interactions with the environment, which can be expensive and potentially dangerous. To address this, Yu et al. [24] proposed an offline RL algorithm that combines hindsight relabeling and supervised regression to predict actions autonomously, eliminating the need for oracle information. To enhance the sparsity of the standard broad learning system for pattern classification and regression, Yu, Kang et al. [25] introduced the smoothing group L1/2 regularized discriminative broad learning system, which improves recognition and generalization performance by incorporating the ε-dragging technique and group L1/2 regularization. To address the non-convex and non-smooth nature of the original regularization objective, the authors proposed an effective smoothing technique; simulation results validate the algorithm’s success in redundancy control and its improvements in recognition and generalization [25].
Using Q-learning and deep Q-networks (DQNs) for robot control in unknown environments presents several noteworthy challenges. While effective in certain applications, both methods suffer from inefficiencies related to trial-and-error learning, resulting in significant time consumption. Achieving real-time control remains challenging, particularly for Q-learning, limiting its practical applicability in dynamic and rapidly changing environments. The complex control structures required by these methods can be intricate, posing difficulties in designing strategies that seamlessly guide robots through various dynamic scenarios. Additionally, extended convergence times and high computational demands constrain the practical implementation of DQNs, especially in resource-constrained settings. Furthermore, the limited adaptability of these methods to evolving environments and the dependence on extensive training data present significant hurdles. Addressing these gaps is crucial for enhancing the robustness and applicability of Q-learning and DQNs in robot control in unknown and dynamic environments [26].
The SNN offers a different approach whose learning mechanisms and characteristics address these challenges. The proposed AMR learning and control system aims to assist robots in effectively navigating and operating within unfamiliar surroundings. It utilizes a customized SNN based on the biologically accurate Izhikevich neuron model. The SNN incorporates STDP and dopamine modulation to enhance learning and adaptation: STDP adjusts the connections between neurons based on the timing of their spikes, while dopamine-based reinforcement encourages desired behaviors. These mechanisms improve the adaptability and performance of the robot in new environments, enabling it to process sensory information and make decisions in a biologically realistic manner [27]. By optimizing the performance of the robotic system through fine-tuning hyperparameters and introducing a novel algorithm, this study presents a significant advancement towards more autonomous and intelligent robots capable of operating effectively in real-world scenarios.
Our article comprises five main sections. Section 1 introduces the problem and examines relevant studies on robot behavioral learning. Section 2 offers a detailed explanation of our methodology. Section 3 focuses on the tests conducted. To showcase the effectiveness of our proposal, Section 4 presents a comparative analysis with related works. Lastly, in Section 5, we conclude the article and provide insights for future research. The structure and content of the article are also represented visually.

2. Methodology

This section provides a detailed explanation of the SNN architecture and introduces our new approach for AMR control, which is based on SNNs.

2.1. The Mobile Robot System

This paper describes a mobile robot system that uses a learning algorithm based on SNN. The algorithm employs dopamine-modulated STDP for robot navigation. The system also includes two ultrasonic sensors and two color detection sensors for obstacle detection and target identification. Sensory signals are encoded into sensory neurons, and motor neurons control the movement of the robot. Through training, the synaptic weights between the sensory and motor neurons are modified to enable the robot to navigate and avoid obstacles in unknown environments. Previous studies have also explored using neural networks for mobile robot navigation and obstacle avoidance. For example, Arena [28] used a multi-layer neural network to map sensory inputs to motor commands for navigation.
Similarly, Pandey et al. [29] developed a neural network-based controller using sonar sensors. Shamsfakhr and Bigham [30] used a neural network approach with a laser rangefinder sensor for obstacle detection and avoidance. Mobile robot systems commonly use various sensors, such as rangefinders, sonar, and visual sensors, to perceive the environment and avoid obstacles. Using multiple sensors allows the robot to gather more comprehensive data and make better decisions about its movement. Incorporating neural networks into mobile robot navigation enables more precise and efficient control by adapting the behavior of the robot based on its experiences.
Our work focuses on a two-dimensional (2D) robot design in this initial phase. Figure 1 illustrates this design, showcasing a two-wheeled mobile robot with separate velocities for its right (VR) and left (VL) wheels. The orientation of the robot is represented as θ, and the distance between its wheels is denoted as ‘b’. This visual representation effectively conveys the kinematic behavior of the robot. Within the 2D environment, static rectangular obstacles are introduced randomly, adding complexity to the pathfinding task of the robot. The primary objective of the robot is to autonomously navigate to a specific destination, called the “Target” or “Goal” location. The robot commences its mission from a predetermined “Start” location. An essential performance metric is the navigation time, which quantifies the duration required for the robot to successfully traverse from the initial “Start” point to its intended “Goal” destination.
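For readers who wish to reproduce the kinematics implied by Figure 1, a minimal sketch of the differential-drive pose update is given below. The symbols follow the figure (right and left wheel velocities, wheel separation b, heading θ), while the Euler integration scheme and the time step dt are assumptions of this example rather than details taken from the paper.

import math

def differential_drive_step(x, y, theta, v_r, v_l, b, dt):
    # Forward velocity of the robot centre and angular velocity about it.
    v = (v_r + v_l) / 2.0
    omega = (v_r - v_l) / b
    # Integrate the pose over one time step (simple Euler step; dt is an assumption).
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta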

2.2. Sensor Types and Specifications

Our SNN-based mobile robot navigation and obstacle avoidance system utilizes two main types of sensors: ultrasonic sensors and color detection sensors.
Ultrasonic sensors: These sensors emit sound waves and calculate the time it takes for the waves to bounce back, enabling them to detect obstacles and measure distances. The signals from these sensors are encoded in the sensory neurons of our SNN-based system, which control the movements of the robot. Ultrasonic sensors have been proven effective in obstacle detection and avoidance through previous studies [31].
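As a small illustration of the time-of-flight principle just described, the measured distance can be recovered from the echo delay as follows; the speed of sound of roughly 343 m/s in air is a textbook value, and the function name is ours.

def ultrasonic_distance(echo_time_s, speed_of_sound_m_s=343.0):
    # The pulse travels out and back, so the one-way distance is half the path.
    return speed_of_sound_m_s * echo_time_s / 2.0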
The main specifications of ultrasonic sensors include the frequency at which they operate, the detection range they can cover, the beam angle that determines the scanning area, the accuracy of distance measurements, the response time for detection, the output format of distance measurements, the power requirements, environmental considerations, and integration and connectivity options. These specifications can vary depending on the specific model and manufacturer of the ultrasonic sensor.
Color detection sensors: These sensors analyze the wavelengths of light reflected from objects to identify targets and obstacles. The main specifications of this sensor include the wavelength range they operate within, their resolution for distinguishing colors, accuracy in color identification, detection speed, output formats for color information, integration, connectivity options, considerations for ambient lighting and environmental factors, and power requirements. Like ultrasonic sensors, the signals from these sensors are encoded in the sensory neurons of our network, guiding the movements of the robot. Research has also demonstrated the effectiveness of color detection sensors in obstacle detection and avoidance [32]. Together, these sensors and our SNN model form the basis of our methodology. Our system can make informed decisions by processing data from these sensors, enabling effective navigation and obstacle avoidance in real-world environments.
Additionally, our system includes a mapping component that allows the robot to build a representation of its environment. By using the data gathered from the sensors, the robot can create a map that includes information about obstacles, landmarks, and other relevant features. This map is a reference for the robot’s navigation, enabling it to plan efficient paths and avoid potential obstacles. To ensure real-time operation, our SNN-based mobile robot navigation system is implemented on a powerful embedded platform capable of handling the computational demands of the neural network and sensor data processing. This platform also provides the necessary interfaces to control the robot’s actuators and communicate with external devices.

2.3. Network Architecture and Training Algorithm

An SNN is a network of interconnected neurons communicating through electrical activity spikes. It initiates with the input layer receiving signals from the surrounding environment, which are then converted into electrical impulses by the input neurons. These impulses traverse the network and ultimately lead to the generation of a signal in the output layer, predicated on the spikes produced by the output neurons. The process involves neurons emitting pulses to downstream neurons when they reach a specific threshold, establishing a communication chain [33]. Training SNNs involves adjusting synaptic weights connecting neurons to enhance the network’s overall performance. STDP is a prevalent training algorithm for SNNs. STDP operates by modifying the synaptic weights based on the relative timing of spikes. This learning mechanism takes inspiration from biological learning principles, reinforcing connections between neurons that fire synchronously while weakening connections between neurons that fire at differing times. This intricate process allows the network to comprehend the temporal relationships between input and output signals, a critical aspect for practical applications [34].
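In its common pair-based form, the STDP rule described above can be written as follows; the amplitudes and time constants are generic symbols of this standard formulation, not values taken from the present study.

\Delta w(\Delta t) =
\begin{cases}
A_{+}\, e^{-\Delta t/\tau_{+}}, & \Delta t > 0 \ \ \text{(pre fires before post: potentiation)}\\
-A_{-}\, e^{\Delta t/\tau_{-}}, & \Delta t < 0 \ \ \text{(post fires before pre: depression)}
\end{cases}
\qquad \Delta t = t_{\text{post}} - t_{\text{pre}}.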
Figure 2 illustrates the proposed framework, which utilizes a spiking neural network (SNN) architecture incorporating spike-timing-dependent plasticity (STDP), the Izhikevich model, and reinforcement learning (RL) to enable a robot to navigate towards a target while avoiding obstacles. The SNN consists of an input layer, hidden layer(s), and an output layer. The synaptic weights, representing the connections between neurons, are updated using STDP, which adjusts the weights based on the timing of spikes between connected neurons. Spike generation and propagation functions facilitate signal processing and information flow within the network. Neuron activation levels are determined by an activation function, while the Izhikevich model captures the spiking behavior of individual neurons based on their inputs and activation levels. The Izhikevich model plays a crucial role in simulating neuronal dynamics through four parameters: a, b, c, and d. Parameter ‘a’ describes the time scale of the membrane recovery variable, ‘b’ determines the sensitivity of the recovery variable to subthreshold fluctuations of the membrane potential, ‘c’ represents the after-spike reset value of the membrane potential, and ‘d’ governs the after-spike increment of the recovery variable.
The “Izhikevich model” and Table 1 and Table 2 demonstrate how parameters influence the spiking patterns and excitability of neurons, enabling the representation of a wide range of neuronal behaviors. In the context of robot control, reinforcement learning incorporates feedback signals, such as rewards based on performance criteria (e.g., distance to the target, angle to the target, and distance to the nearest obstacle), to optimize the network’s performance. By integrating these components and considering the parameters of the Izhikevich model, the SNN framework can adapt its behavior by adjusting synaptic weights. This adaptation allows the robot to effectively navigate towards the target while avoiding collisions. With this comprehensive approach, the SNN framework offers a promising solution for achieving efficient and safe robot control in unknown environments. Refer to Figure 2 for an illustration of the general SNN framework with STDP and our model.

2.4. The Izhikevich Model Used for Neuron Modeling

In computational neuroscience, the computational model used in this study is known for its simplicity and effectiveness in capturing essential neuron dynamics. This model, introduced by Eugene Izhikevich in 2003, is widely used because it can emulate essential characteristics of different neuron types while being computationally efficient. The model is described by two related differential equations that provide insights into the behavior of neurons. The first equation represents the neuron’s membrane potential, which is the electric potential across the neuronal membrane. Using a recovery variable, the second equation describes how quickly the membrane potential returns to its resting state after a spike. Four parameters, ‘a’, ‘b’, ‘c’, and ‘d’, govern these equations. These parameters play a crucial role in shaping the characteristics of a neuron’s membrane potential and influencing its recovery rate after a spike. It is worth noting that these parameters can be adjusted dynamically, allowing the model to replicate various firing patterns exhibited by different types of neurons. This adaptability enables the model to emulate firing patterns such as regular spiking, rapid spiking, and bursting, as described in Table 1. This flexibility is essential in our study, where we use the model to simulate neuron behavior and support our SNN system.
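For reference, the two coupled equations of the Izhikevich model [10] referred to here are, in their standard published form,

\frac{dv}{dt} = 0.04v^{2} + 5v + 140 - u + I, \qquad \frac{du}{dt} = a\,(b\,v - u),

with the after-spike reset: if v \geq 30\ \text{mV}, then v \leftarrow c and u \leftarrow u + d.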
The Izhikevich model has been extensively used in neuroscience research, including studying synaptic plasticity, neural coding, and network dynamics. Researchers have employed the model to investigate how neurons encode and decode information, how synaptic connections between neurons change during learning, and how neural networks generate emergent behaviors [35]. It is important to note that adjusting the parameter values significantly impacts the firing patterns of neurons. Consequently, based on these parameter values, neocortical and thalamic neurons exhibit diverse firing behaviors. The mathematical equations mentioned earlier are just a small part of a larger mathematical model depicted in the flowchart. Figure 3 illustrates the sequence of operations, computations, and decisions involving these equations as the system evolves. These equations describe how specific variables change over time, while the flowchart visually represents the steps and dependencies within the mathematical model.
The mathematical expressions in Figure 3A–C describe how the computational model operates. The first equations compute the positions of objects in two-dimensional space using distance calculations. The subsequent equations, shown in Figure 3B, represent the electrical activity of a neuron through the membrane potential (v) and the adaptive recovery current (u); these variables evolve over time according to the model dynamics and an external input current (I), which depends on the positions of the objects. The neuron emits a spike when its membrane potential reaches the firing threshold (v_thresh), listed in Table 2 as the “Membrane potential threshold to spike”. The target error measures the difference between the current and desired positions. The STDP rule adjusts the connections between neurons based on the relative timing of their spikes. Finally, the equations for robot movement update the orientation and position of the robot using small incremental changes. Together, these equations describe how the neural network responds to sensory input and controls the movement of the robot.

2.5. Performance Evaluation Metrics

The detection rate is a crucial metric for assessing the accuracy of target and obstacle detection in SNNs trained with STDP. It measures the percentage of accurately detected targets and obstacles out of the total number presented at the input [36]. Similarly, the false positive rate is an important metric that calculates the percentage of incorrectly detected targets and obstacles relative to the total number of non-targets and non-obstacles in the input [37]. When evaluating how accurately the network detects targets and obstacles, the false positive rate is therefore a key consideration.
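Written as formulas, these two standard metrics read

\text{Detection rate} = \frac{\text{correctly detected targets and obstacles}}{\text{total targets and obstacles presented}} \times 100\%, \qquad
\text{False positive rate} = \frac{\text{incorrect detections}}{\text{total non-targets and non-obstacles}} \times 100\%.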
Algorithm 1 illustrates the initial algorithm for the simulation of an SNN controlling a mobile robot in an RL task. This algorithm outlines the structure of the simulation, allowing the SNN to learn how to control the mobile robot through trial and error. The SNN improves its performance over time by adjusting its synaptic weights based on spike timing. Within this algorithm, the line “for j in fired neurons” plays a crucial role: it signifies a loop that iterates over the set of neurons that have fired during a specific iteration or time step. By means of the loop variable “j”, each fired neuron in the set can be accessed and processed individually. The specific operations performed within the loop, such as updating firing thresholds or adjusting synaptic weights, depend on the requirements and objectives of the algorithm (see Step 1: Time Frame and Figure 4).
Algorithm 1. SNN Algorithm with STDP
1:  Determine fired neurons.
2:  Determine motor torques.
3:  Update robot location.
4:  Update target and obstacle position in the robot view.
5:  Give currents to sensory neurons based on robot location.
6:  Update STDP (spike-timing-dependent plasticity) based on fired neurons.
7:  if Target Error < specific threshold then
8:      Set dopamine to 1.
9:  if Obstacle Distance < specific threshold then
10:     Set dopamine to −1.
11: for j in fired neurons do
12:     Update synaptic weight derivation based on LTP (long-term potentiation) and LTD (long-term depression) rules.
13: Update membrane potentials based on currents.
14: Update synaptic weights based on the values of the synaptic weight derivation and dopamine.
15: Decrease STDP, dopamine, and the synaptic weight derivation exponentially.
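To make the control loop concrete, the following minimal Python sketch shows one way Algorithm 1 can be organized around Izhikevich neurons and dopamine-modulated STDP. It is an illustrative reconstruction, not the authors’ implementation: the layer sizes and the maximal weight of 4 follow Table 2, while the placeholder sensor readings, the learning rate, the decay constants, and the way synaptic drive is injected are assumptions of this example.

import numpy as np

# Illustrative network sizes and Izhikevich parameters (regular spiking)
N_SENS, N_MOT = 100, 200
N = N_SENS + N_MOT
a, b, c, d = 0.02, 0.2, -65.0, 8.0
V_THRESH = 30.0                      # spike threshold (mV)

v = np.full(N, c)                    # membrane potentials
u = b * v                            # recovery variables
w = np.ones((N_SENS, N_MOT))         # sensory-to-motor synaptic weights (start at 1)
elig = np.zeros_like(w)              # "synaptic weight derivation" (eligibility trace)
stdp = np.zeros(N)                   # per-neuron STDP trace
dopamine = 0.0

def sensory_currents():
    # Placeholder: encode distances and colours into input currents (assumption).
    return np.random.uniform(0.0, 10.0, N_SENS)

for _ in range(1000):                            # one iteration = 1 ms
    fired = np.where(v >= V_THRESH)[0]           # step 1: fired neurons
    # Steps 2-4 (motor torques, robot pose, target/obstacle update) are abstracted away.
    I = np.zeros(N)
    I[:N_SENS] = sensory_currents()              # step 5: sensory input currents
    stdp[fired] += 1.0                           # step 6: refresh STDP traces

    target_error, obstacle_dist = 1.0, 1.0       # placeholders for sensor readings
    if target_error < 0.1:                       # steps 7-8: reward
        dopamine = 1.0
    if obstacle_dist < 0.1:                      # steps 9-10: punishment
        dopamine = -1.0

    for j in fired:                              # steps 11-12: LTP/LTD on traces
        if j < N_SENS:                           # presynaptic (sensory) neuron fired
            elig[j, :] -= stdp[N_SENS:]          # LTD: post fired before pre
        else:                                    # postsynaptic (motor) neuron fired
            elig[:, j - N_SENS] += stdp[:N_SENS] # LTP: pre fired before post

    # Step 13: Izhikevich membrane potential update (after resetting fired neurons)
    v[fired] = c
    u[fired] += d
    I[N_SENS:] += w.T @ stdp[:N_SENS]            # crude synaptic drive (assumption)
    for _ in range(2):                           # two 0.5 ms half-steps for stability
        v += 0.5 * (0.04 * v**2 + 5 * v + 140 - u + I)
    u += a * (b * v - u)

    # Step 14: dopamine-gated weight update, clipped to the maximal weight of 4
    w = np.clip(w + 0.01 * dopamine * elig, 0.0, 4.0)

    # Step 15: exponential decay of traces and dopamine
    stdp *= 0.95
    dopamine *= 0.995
    elig *= 0.99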

3. Results and Discussion

The performance of our proposed SNN-based approach has been extensively evaluated to determine its effectiveness. This section provides unique details about our experiments and engages in in-depth discussions regarding the results obtained.

3.1. Parameters in SNN Learning Algorithm

The SNN learning algorithm relies on adjustable parameters that significantly impact the behavior and performance of the SNN during the learning process. These parameters include the learning rate, number of neurons in each layer, initial network weights, activation function, synapse type, and STDP learning rule parameters. Modifying these parameters can improve the SNN’s learning capabilities for specific tasks or enhance its overall performance. Table 2 presents a comprehensive list of these parameters and their corresponding values. Among the parameters, the number of sensory neurons (100) and motor neurons (200) are particularly crucial as they form the basic building blocks of the SNN model and determine its complexity. The maximal synaptic weight (4) and the recovery variable’s time scale (0.02) are also essential in defining the network’s connection strength and the time it takes for a neuron to recover after firing.
Table 2 also includes essential learning parameters such as initial synaptic weights (1), reward constants (1), and punishment constants (−1). These parameters influence the rewards and punishments given to the network during learning and the time steps for various adjustments. They play a critical role in determining the network’s learning rate and ability to adapt to the environment. Additionally, the time step for encoding sensor signals into the network and the time steps for decreasing reward/punishment values or synaptic weight derivation values are important parameters that significantly impact the network’s learning performance.
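For illustration, the key values discussed above and listed in Table 2 can be collected into a single configuration structure. The dictionary layout and key names below are ours; only the numerical values come from the table.

snn_params = {
    "n_sensory_neurons": 100,      # size of the sensory layer
    "n_motor_neurons": 200,        # size of the motor layer
    "max_synaptic_weight": 4,      # upper bound on any synaptic weight
    "izhikevich_a": 0.02,          # time scale of the recovery variable
    "initial_synaptic_weight": 1,  # all weights start at 1
    "reward_constant": 1,          # dopamine value on reaching the target
    "punishment_constant": -1,     # dopamine value on collision
}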

3.2. Optimizing Control Systems for Mobile Robots

This section will delve into a flowchart thoughtfully designed to investigate the combined influence of proprioceptive feedback and learning algorithms on SNN-based mobile robot control systems operating in unfamiliar environments. Our exploration begins with an experiment in which robot, target, and obstacle positions are randomly generated and systematically assessed for effectiveness. Figure 4, presented below, showcases the new flowchart, which introduces a structured and comprehensive approach to conducting this study. This flowchart underscores the significance of using structured visual representations to explore complex systems like mobile robot control systems. It is essential to highlight that the findings of this study hold the potential for replication and further development by other researchers.
Figure 4 illustrates the steps to achieve the system’s objectives. The distance between the robot, target, and obstacles is calculated using sensors. Accurate sensor readings are crucial for determining the experiment’s success, as they trigger dopamine or adverse dopamine into the system, acting as a reward or punishment. This helps the system learn and adapt its behavior based on sensory inputs. The diagram utilizes sensor input to calculate target and obstacle errors, which modify synaptic weights. The firing patterns of sensor and motor neurons are crucial in this process. The branch that calculates target error determines if the robot has reached its target, rewarding dopamine if the error is below a certain threshold. If the error is higher than the threshold and the robot collides with an obstacle, adverse dopamine is given as punishment. Both reward and punishment contribute to modifying synaptic weights.
Figure 4 shows a flowchart that explains the process clearly. It highlights important steps such as time management, sensory data calculation, reward–punishment mechanisms, motor skill exploration, synaptic weight adjustment, and iterative learning. This research aims to provide insight into the complex process that allows robots to achieve their objectives while skillfully avoiding obstacles, with a specific focus on our system.
Step 1: Time Frame
This simulation divides time into milliseconds (ms) and seconds (s). Milliseconds are used to update neuron firing (as mentioned in Algorithm 1) and robot location for precise and timely updates. Seconds are used to average synaptic weights, providing smoother representation for plotting and analysis. This approach allows for precise and efficient simulation of the navigation process of the robot.
Step 2: Experiment Initialization
Each experiment randomly defines three crucial elements: the initial position of the robot, the target position, and the obstacle position. These positions determine the starting conditions for the navigation task of the robot. Randomizing these positions creates various scenarios to evaluate the efficiency of the SNN-based learning algorithm. This variability allows the robot to learn and adapt to different navigation challenges.
Step 3: Sensor Calculations
Sensors, such as ultrasonic and color detection sensors, are essential for the navigation process of the robot. Ultrasonic sensors emit sound waves and measure their reflections to calculate distances between the robot, the target, and obstacles. Color detection sensors analyze object colors to classify the color of the obstacles and the color of the target, providing additional information for navigation. Sensor data are crucial for decision making, including obstacle detection and proximity assessment of the target. The system calculates two error distances: one between the robot and the target, and another between the robot and obstacles. Spatial or ultrasonic sensors are used for this purpose. If the distance to an obstacle falls below a predefined threshold (0.1 cm), the system treats it as a potential collision and adjusts the navigation actions to avoid it. By integrating sensor data and using threshold-based evaluations, the framework enhances the navigation capabilities of the robot. This sensor-driven decision-making process is critical for successful navigation in dynamic environments.
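A minimal sketch of the distance checks described in this step is given below. The Euclidean distance computation and the 0.1 collision threshold follow the text and Algorithm 1, while the coordinate representation, the default target threshold, and the function names are assumptions of this example.

import math

OBSTACLE_THRESHOLD = 0.1   # collision threshold quoted in the text

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def sensor_checks(robot_pos, target_pos, obstacle_pos, target_threshold=0.1):
    # Two error distances: robot-to-target and robot-to-nearest-obstacle.
    target_error = euclidean(robot_pos, target_pos)
    obstacle_dist = euclidean(robot_pos, obstacle_pos)
    reached_target = target_error < target_threshold
    collided = obstacle_dist < OBSTACLE_THRESHOLD
    return target_error, obstacle_dist, reached_target, collided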
Step 4: Reward–Punishment Mechanism
The performance of the robot depends on sensory data, particularly the target error and the obstacle distance. The target error reflects the difference between the position of the robot and the desired target position, while the obstacle distance measures the proximity of the robot to obstacles. If the target error is below a specified limit, the robot receives a dopamine reward, strengthening the synaptic connections related to successful navigation. Conversely, negative dopamine is given as punishment if the obstacle distance falls below a specific threshold, indicating a collision. This punishment weakens the synaptic connections associated with collisions, discouraging the robot from repeating the same behavior. These feedback signals adjust the weights between neurons in the learning algorithm, guiding the navigation of the robot.
Step 5: Motor Babbling for Exploration
If the robot fails to reach the target due to not meeting the specified threshold, it utilizes “motor babbling”. This technique involves introducing random motor actions to the movements of the robot, allowing it to explore different motor actions and potentially discover new navigation strategies. By introducing variability, the robot is encouraged to explore its capabilities and the environment, enhancing its navigation paths and overall performance.
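As an illustration, motor babbling can be sketched as adding a small random perturbation to the wheel commands; the uniform noise and its amplitude below are assumptions made for this example, not values from the paper.

import random

def motor_babble(v_r, v_l, amplitude=0.2):
    # Inject random exploration into the right and left wheel velocities.
    return (v_r + random.uniform(-amplitude, amplitude),
            v_l + random.uniform(-amplitude, amplitude))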
Step 6: Firing Patterns and Synaptic Weight Adjustment
The firing patterns of sensor and motor neurons are essential for modifying synaptic weights. During training, the weights between sensory and motor neurons change based on firing patterns and reward or punishment signals. Successful firing patterns leading to successful navigation strengthen corresponding connections, while patterns resulting in collisions weaken those connections. This process helps the robot learn navigation strategies, reinforcing success and inhibiting detrimental patterns.
Step 7: Motor Neurons and Location Calculation
Firing motor neurons is crucial for navigation. These neurons activate and control the movements of the robot. Analyzing neuron firing patterns allows us to determine the location of the robot using inverse kinematic equations. This information helps the robot make decisions and adjust its movements based on obstacles and the target. The direct link between neural activity and physical movements enables precise control over navigation.
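One simple way to realize the mapping from motor-neuron firing to wheel motion described in this step is a population-rate decoder, whose output can then be fed into the forward-kinematics update sketched in Section 2.1. The function below is an illustrative assumption (the names, gain, and decoding window are ours), not the decoding scheme used in the paper.

def decode_wheel_velocities(left_spike_counts, right_spike_counts,
                            window_ms=100.0, gain=0.01):
    # Convert spike counts of the left/right motor populations into wheel
    # velocities by rate decoding over a fixed time window.
    v_l = gain * sum(left_spike_counts) / (window_ms / 1000.0)
    v_r = gain * sum(right_spike_counts) / (window_ms / 1000.0)
    return v_r, v_l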
Step 8: Iterative Experimentation
The process involves multiple iterations of experiments, where the robot receives reward or punishment signals based on its task performance. The synaptic weights are adjusted accordingly, and the modified weights are used in subsequent experiments. The number of experiments (or “time”) increases with each iteration until the robot successfully reaches the target or collides with an obstacle. The experiment concludes when the number of iterations exceeds a predetermined threshold, indicating successful navigation. The navigation skills of the robot improve by continually refining synaptic weights, increasing its chances of reaching the target and avoiding obstacles. It is worth noting that the experiment duration was 1000 simulated seconds. The process involves procedures enabling a robot to move towards a specific destination while avoiding obstacles. It involves calculating sensor information, adjusting weights, exploring different movements, and learning through trial and error. Following these steps improves the navigation abilities of the robot until it successfully reaches the target while effectively avoiding obstacles. The thresholds for target accuracy and obstacle distance may vary based on system implementation and requirements.
One approach is to use predefined thresholds based on prior knowledge or expertise. These thresholds can be set to achieve the desired level of accuracy and safety. For example, a small target error threshold can be set for precise target position information. Similarly, the obstacle distance threshold can be determined based on the physical dimensions of the robot and the operating environment.
Another approach involves adaptive thresholds that change dynamically during the learning process. Initially, high thresholds promote exploration and learning. As the robot gains experience and improves its navigation, the thresholds gradually tighten to encourage more accurate target reaching and obstacle avoidance. Adaptive thresholds can be determined by analyzing the performance of the robot over multiple experiments or by using reinforcement learning to optimize their values. In some cases, thresholds can be determined through trial-and-error experimentation: the robot undergoes multiple iterations of the task with different threshold values, and the results are used to adjust the thresholds iteratively for optimal navigation behavior.
When setting thresholds, it is important to balance accuracy and robustness. Very strict thresholds can make the robot overly cautious and hinder navigation in complex environments. On the other hand, lenient thresholds can lead to imprecise navigation and frequent collisions. The choice of threshold values depends on the navigation requirements and the desired balance between accuracy and efficiency.
Significant advancements have been made in optimizing our robotic system, resulting in enhanced adaptability and performance. One key aspect of this optimization involved fine-tuning hyperparameters, which are crucial coefficients governing neuronal currents. Through experimentation and triangulation, an optimal balance was achieved that ensures stable firing patterns without the issues of non-firing or excessive bursting. Additionally, a novel algorithm was introduced to address a challenge during the learning phases. Unlike previous setups, this algorithm maintains stability, preserves firing patterns, and enhances system reliability. These strategic modifications have had a positive impact on the overall performance of the robotic system, as demonstrated in Figure 4.

3.3. The Robot’s Navigation and Obstacle Avoidance Performance

In this study, a mobile robot is trained to navigate an unfamiliar environment with obstacles and reach a specific destination. The proposed approach is evaluated through testing in which data are collected from various viewpoints to understand the behavior of the robot in the simulated environment. This allows for a detailed analysis of the performance of the robot and can help improve its control system. Figure 5A,B depict the training and testing stages used to teach the robot how to navigate and reach the desired destination. The Webots Simulator 3D is used for testing, providing a realistic simulation environment, and multiple runs of data are collected to observe the actions of the robot in the simulated environment.
During the training process, we initiate by initializing all synaptic weights to one. However, as time progresses, there is a notable transformation depicted in Figure 6B, illustrating a significant and meaningful distinction between the left and right motor neurons with sensors. These alterations indicate that the system is functioning effectively. Consequently, the robot exhibits specific reactions to targets or obstacles. The firing pattern, illustrated in Figure 6C, elucidates how motor neurons adhere to a rule during training, selectively activating only a specific group of motor neurons. Simultaneously, the sequence of sensor firings, detecting targets and obstacles at specific intervals, demonstrates the initial random movements of the robot towards the target and responses to obstacles. Figure 6A provides a visual representation of the location of the robot throughout this training phase.
After training, we tested the system to evaluate its behavior. During the test, random inputs were applied to the motor neurons on both sides. However, the targets and obstacles were not always visible because the range of the sensors is limited. To address this, we added random background currents to the neurons, which influenced movement without favoring either the right or the left motor (see Figure 7C). Our goal was to observe how the firing of the sensory neurons controls the direction of the robot. As indicated for Figure 7B, the synaptic weights remained unchanged during testing because no learning updates were applied, so no weight-change diagram is shown for this part. We then studied how the firing patterns of the sensory neurons during testing affected the activation of the motor neurons and, in turn, controlled the movement of the robot and directed it to the target. Figure 7A shows the location of the robot during the testing phase.

3.4. Synaptic Weight at Training Time

This robot uses a type of learning called STDP, a form of Hebbian learning. It adjusts the connections between the sensors of the robot and motors based on the timing of their signals. When a sensor signal comes before a motor signal, the connection is strengthened, while the opposite weakens it. This allows the robot to adapt to its environment in real-time, improving its ability to navigate complex and changing surroundings. In the training phase of the mobile robot’s SNN, Figure 8 illustrates the synaptic changes triggered by the first release of dopamine when the robot reaches the target. This is the initial phase of the experiment. Examining the synaptic changes within this diagram reveals that most synaptic weights undergo modifications. However, it is important to note that these changes occur in the early stages, and the differences are relatively small, remaining close to their initial values, which are set to 1.0. While these initial changes are promising, further experiments are needed to achieve more substantial alterations and fine-tune the system.
Figure 9 clearly illustrates how the synaptic weights are adjusted. These weights reflect the connections between the sensors and motors of the robot and are constantly tweaked to help the robot learn and adjust to its surroundings. Overall, there are distinct patterns in the synaptic weight adjustments that can be summarized as follows:
The synaptic connection between the left target sensor and the right motor begins at train time 0 and gradually increases, ultimately reaching 3.45 over time.
The connection between the left motor and the left target sensor, shown from train time 200, remains relatively stable, varying from 1 to 2.2.
Similarly, the connection between the left motor and the right target sensor, shown from train time 400, fluctuates from 1.02 to 2.09.
The synaptic connection between the left motor and the right obstacle sensor, shown from train time 600, varies between 1 and 3.52 over time.
Conversely, the synaptic connection between the right motor and the left obstacle sensor, shown from train time 800, gradually decreases, reaching 0.2.
The connection between the left motor and the left obstacle sensor, shown from train time 1000, remains relatively stable, fluctuating within the range of 1 to 1.2.
Finally, the connection between the right obstacle sensor and the right motor, shown from train time 0, increases to 1.8 over time.
These dynamic adjustments in synaptic weights illustrate the capacity of the SNN to continually refine its connections, enabling the robot to adapt and learn in response to sensory input and motor actions as it navigates its complex environment.
The arrangement of synaptic weights plays a crucial role in shaping the behavior of a robot and decision-making process. These weights dictate how sensory inputs impact motor activation by strengthening or weakening specific sensor-motor connections. This enables the robot to learn and adapt its responses, allowing it to navigate and overcome challenges in ever-changing environments. During training, the synaptic weights are adjusted based on input and output signals, enhancing the ability of the robot to handle dynamic situations effectively.
Figure 9, together with Table 3, summarizes the connections between the sensors and motors of the robot. These linkages rely on synaptic weights that strongly affect the decision-making abilities and behavior of the robot. The weights are adjusted during training to improve the performance of the robot and determine the weights of the neural network. When a synaptic weight increases, the network gives greater priority to the corresponding sensor input for that motor, and a higher weight more strongly correlates the sensor input with the intended output. In this way, the mobile robot learns to prioritize sensor input when making decisions or directing its behavior. As a result, the sensors significantly influence the motor responses of the robot, and the synaptic weights play a crucial role in enabling the robot to adapt and learn from its environment.

4. Verification with Counterparts

In this section, we compare our final results with six other articles that explored similar approaches to training and testing autonomous robots. The accuracy of our model is an impressive 94%, outperforming the results presented in all of the other articles. This highlights the effectiveness of our training methodology and the robustness of our model in accurately reaching the target. Additionally, our model demonstrates a considerably reduced collision rate, with only 6% of runs resulting in collisions with obstacles. It is worth noting that the average experiment duration was 1000 simulated seconds. This achievement is notable compared to the collision rates reported in the other articles, which range from 6.79% to 19.98%. These results indicate that our model can navigate through obstacles and avoid collisions, further highlighting its practical applicability in real-world scenarios.
The comparison presented in Table 4 and Figure 10 demonstrates the advancements made in our study, positioning it as a significant contribution to the field. The combination of high accuracy and low collision rates places our model at the forefront of autonomous robot navigation research, paving the way for safer and more efficient robotic systems.
In Table 4, we compared the effectiveness of our procedure with others by conducting an experiment that involved two different types of neural networks: spiking and non-spiking. The SNN used the LIF model, while the non-SNN utilized the McCulloch–Pitts neuron model [38]. Ramne’s study tested the SNN, which had 100 neurons, in 1000 scenarios. The SNN reached the target in 842 cases, achieving an 84% accuracy rate. However, it also collided with obstacles 16% of the time, resulting in 158 collisions. On the other hand, the non-spiking network, also with 100 neurons, outperformed the spiking network by reaching the target successfully in 914 cases, achieving a 91% accuracy rate with only 86 collisions (9% of cases). Both networks had sensory inputs from IR sensors, an odometer, and a compass, and they shared a single challenge and objective [38].
Deep learning has transformed computer vision, which uses hierarchical models inspired by the human brain to improve recognition and cognitive tasks. However, its application in decision-making and control has been limited. To demonstrate the effectiveness of a hierarchical structure that combines a convolutional neural network (CNN) with a decision process, Tai et al.’s study [39] focused on indoor obstacle avoidance. The study proposed a compact network that takes raw depth images as input and generates control commands as output, enabling model-less obstacle avoidance behavior. Real-world indoor environments were used to test the approach, which showed effective results.
To demonstrate the effectiveness of the proposed strategy, we compared its efficiency with other state-of-the-art methods. According to the results presented in Table 4, Lei Tai et al. [39] obtained an overall accuracy of 80.2% after testing a deep-network solution to avoid obstacles in an indoor setting. In this regard, an experiment was conducted indoors, first with a mark and then without one, and the results were compared. The final test result demonstrates that our model can achieve high accuracy when avoiding obstacles for mobile robots. Furthermore, this level of accuracy may be improved by increasing the number of markings in the environment.
In this article, authored by Yang et al., a novel learning strategy for Autonomous Mobile Robots (AMRs) is proposed in Table 4. The strategy addresses obstacle avoidance in uncertain environments by utilizing a Reinforcement Q-Learning (RNQL) algorithm. A Single Hidden Layer Feedforward Network (SLFN) is employed as a Q-function approximator, with the SLFN’s parameters being tuned by the OS-ELM algorithm. Initial output weights are estimated under a quadratic inequality constraint. Compared to the BPQL algorithm, RNQL demonstrates superior learning performance, evidenced by faster convergence, reduced training time, and enhanced generalization ability [41].
We compare the results with several other outcomes to illustrate that this tactic is more successful than others. For example, Liu et al. [40] performed better than Tai et al. [39]. As shown in Table 4, the findings can be broken down as follows: Liu et al. [40] achieved a higher accuracy of 93.21% in reaching the target, while Tai et al. [39] reached an accuracy of 81.72%. Additionally, Liu et al. [40] experienced fewer collisions, with only 6.79% of attempts resulting in collisions, compared to 18.28% for Tai et al. [39]. These results suggest that the approach employed by Liu et al. [40] was more effective in reaching the target and avoiding collisions.
It is essential to mention that Yang et al.’s research centers on how to help a mobile robot avoid obstacles using the backpropagation Q-learning algorithm [41]. In Table 4, the performance of two learning algorithms, backpropagation and Q-learning, is shown in terms of accuracy. The accuracy is measured by “reaching the target” and “collisions”. As per the table, the backpropagation algorithm achieved a 91.8% accuracy rate for reaching the target but had 7.4% collisions. These numbers demonstrate the effectiveness of backpropagation in navigating to the target accurately while minimizing collisions. However, the Q-learning algorithm does not provide specific accuracy percentages for reaching the target and collisions in the table.
We conducted various comparisons to determine whether our findings could enhance obstacle avoidance, further supporting the use of SNNs in robots. SNNs can improve robotic intelligence by mimicking the mechanisms of the brain, offering faster speeds, energy efficiency, and strong computational capabilities. The biological evidence and motivations behind SNNs in robotics, along with modeling and training approaches, including common modeling methods for neurons, synaptic connections, and networks, are discussed by Bing et al. [42], who describe two types of SNN learning solutions based on the Hebbian rule and reinforcement learning [40].
Table 4 presents a comparative analysis of various algorithms utilized for the control and navigation of an Autonomous Mobile Robot (AMR) in the presence of obstacles. Each entry in the table corresponds to a distinct reference, algorithm, and associated performance metrics. Notably, the proposed method, employing an SNN and a sensor setup consisting of ultrasonic and color detection, demonstrated remarkable efficacy. It achieved a 94% success rate in reaching the target, coupled with a low collision rate of 6%. Comparatively, Ramne’s SNN, relying on a camera sensor, achieved an 84% success rate and a 16% collision rate. The non-SNN algorithm by Ramne performed with 91% accuracy in reaching the target and a 9% collision rate. Additionally, Convolutional Neural Network (CNN) algorithms, such as those by Tai et al. [39], Liu et al. [40], and Yang et al. [41] showcased varying degrees of accuracy and collision rates, each employing camera sensors for perception. These findings offer valuable insights into the strengths and limitations of different algorithms in the specific context of AMR navigation amidst obstacles.

5. Conclusions

This study presents an innovative approach for training and controlling AMRs in unknown environments. The strategy employs a customized SNN that integrates biologically inspired principles, such as STDP with dopamine modulation and the Izhikevich neuron model, for autonomous learning and control. The proposed system is extensively simulated in the Webots robotic environment to test the algorithm. The results demonstrate the effectiveness of the approach in achieving high accuracy and minimizing collisions. Specifically, the SNN algorithm achieves an accuracy of 94% in reaching the target with only a 6% collision rate when faced with three obstacles after 1000 simulated seconds. These findings highlight the method’s superior navigational capabilities for AMRs. The proposed approach is notable for incorporating ultrasonic and color detection sensors, which contribute to a sophisticated sensor fusion strategy. This sensor fusion enhances the algorithm’s effectiveness in complex environments. Furthermore, the study reveals that the SNN model outperforms previous models, achieving heightened efficacy while utilizing minimal energy and time resources. This is a significant step towards revolutionizing various industries and domains where autonomous robots are crucial. In future studies, the deployment of the model on real-world autonomous robots is the next phase and will involve addressing additional challenges such as sensor noise and dynamic obstacles. Evaluating the model’s performance in practical conditions will provide valuable insights and help integrate it into real-world autonomous systems.

Author Contributions

Writing—original draft, B.A.A.; Writing—review & editing, J.R. and J.K.; Supervision, J.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. García, J.; Shafie, D. Teaching a humanoid robot to walk faster through Safe Reinforcement Learning. Eng. Appl. Artif. Intell. 2020, 88, 103360.
2. Wang, H.; Yuan, S.; Guo, M.; Chan, C.Y.; Li, X.; Lan, W. Tactical driving decisions of unmanned ground vehicles in complex highway environments: A deep reinforcement learning approach. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 235, 1113–1127.
3. Adams, C.S.; Rahman, S.M. Design and Development of an Autonomous Feline Entertainment Robot (AFER) for Studying Animal-Robot Interactions. In Proceedings of the SoutheastCon 2021, Atlanta, GA, USA, 10–13 March 2021; pp. 1–8.
4. Dooraki, A.R.; Lee, D.J. An innovative bio-inspired flight controller for quad-rotor drones: Quad-rotor drone learning to fly using reinforcement learning. Robot. Auton. Syst. 2021, 135, 103671.
5. Randazzo, M.; Ruzzenenti, A.; Natale, L. Yarp-ros inter-operation in a 2d navigation task. Front. Robot. AI 2018, 5, 5.
6. Panigrahi, P.K.; Bisoy, S.K. Localization strategies for autonomous mobile robots: A review. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6019–6039.
7. y Arcas, B.A.; Fairhall, A.L.; Bialek, W. Computation in a single neuron: Hodgkin and Huxley revisited. Neural Comput. 2003, 15, 1715–1749.
8. Burkitt, A.N. A review of the integrate-and-fire neuron model: I. Homogeneous synaptic input. Biol. Cybern. 2006, 95, 1–19.
9. Izhikevich, E.M. Which model to use for cortical spiking neurons? IEEE Trans. Neural Netw. 2004, 15, 1063–1070.
10. Izhikevich, E.M. Simple Model of Spiking Neurons. IEEE Trans. Neural Netw. 2003, 14, 1569–1572.
11. Gerstner, W.; Kistler, W.M.; Naud, R.; Paninski, L. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition; Cambridge University Press: Cambridge, UK, 2014.
12. de Ponte Müller, F. Survey on ranging sensors and cooperative techniques for relative positioning of vehicles. Sensors 2017, 17, 271.
13. Ko, N.Y.; Kuc, T.Y. Fusing range measurements from ultrasonic beacons and a laser range finder for localization of a mobile robot. Sensors 2015, 15, 11050–11075.
14. Azimirad, V.; Sani, M.F. Experimental study of reinforcement learning in mobile robots through spiking architecture of thalamo-cortico-thalamic circuitry of mammalian brain. Robotica 2020, 38, 1558–1575.
15. Lu, H.; Liu, J.; Luo, Y.; Hua, Y.; Qiu, S.; Huang, Y. An autonomous learning mobile robot using biological reward modulate STDP. Neurocomputing 2021, 458, 308–318.
16. Liu, J.; Lu, H.; Luo, Y.; Yang, S. Spiking neural network-based multitask autonomous learning for mobile robots. Eng. Appl. Artif. Intell. 2021, 104, 104362.
17. Wang, Z.; Jin, X.; Zhang, T.; Li, J.; Yu, D.; Cheong, K.H.; Chen, C.P. Expert system-based multiagent deep deterministic policy gradient for swarm robot decision making. IEEE Trans. Cybern. 2022.
18. Lobov, S.A.; Mikhaylov, A.N.; Shamshin, M.; Makarov, V.A.; Kazantsev, V.B. Spatial properties of STDP in a self-learning spiking neural network enable controlling a mobile robot. Front. Neurosci. 2020, 14, 88.
19. Jiang, Z.; Bing, Z.; Huang, K.; Knoll, A. Retina-based pipe-like object tracking implemented through spiking neural network on a snake robot. Front. Neurorobot. 2019, 13, 29.
20. Harandi, F.A.; Derhami, V.; Jamshidi, F. A new feature selection method based on task environments for controlling robots. Appl. Soft Comput. 2019, 85, 105812.
21. Wang, X.; Hou, Z.G.; Lv, F.; Tan, M.; Wang, Y. Mobile robots' modular navigation controller using spiking neural networks. Neurocomputing 2014, 134, 230–238.
22. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Hassabis, D. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
23. Ge, C.; Kasabov, N.; Liu, Z.; Yang, J. A spiking neural network model for obstacle avoidance in simulated prosthetic vision. Inf. Sci. 2017, 399, 30–42.
24. Yu, X.; Bai, C.; Wang, C.; Yu, D.; Chen, C.P.; Wang, Z. Self-Supervised Imitation for Offline Reinforcement Learning with Hindsight Relabeling. IEEE Trans. Syst. Man Cybern. Syst. 2023, 53, 7732–7743.
25. Yu, D.; Kang, Q.; Jin, J.; Wang, Z.; Li, X. Smoothing group L1/2 regularized discriminative broad learning system for classification and regression. Pattern Recognit. 2023, 141, 109656.
26. Yang, Y.; Juntao, L.; Lingling, P. Multi-robot path planning based on a deep reinforcement learning DQN algorithm. CAAI Trans. Intell. Technol. 2020, 5, 177–183.
27. Lobo, J.L.; Del Ser, J.; Bifet, A.; Kasabov, N. Spiking neural networks and online learning: An overview and perspectives. Neural Netw. 2020, 121, 88–100.
28. Arena, P.; Fortuna, L.; Frasca, M.; Patané, L. Learning anticipation via spiking networks: Application to navigation control. IEEE Trans. Neural Netw. 2009, 20, 202–216.
29. Pandey, A.; Pandey, S.; Parhi, D.R. Mobile robot navigation and obstacle avoidance techniques: A review. Int. Rob. Auto J. 2017, 2, 22.
30. Shamsfakhr, F.; Bigham, B.S. A neural network approach to navigation of a mobile robot and obstacle avoidance in dynamic and unknown environments. Turk. J. Electr. Eng. Comput. Sci. 2017, 25, 1629–1642.
31. Zheng, Y.; Yan, B.; Ma, C.; Wang, X.; Xue, H. Research on obstacle detection and path planning based on visual navigation for mobile robot. J. Phys. Conf. Ser. 2020, 1601, 062044.
32. Benavidez, P.; Jamshidi, M. Mobile robot navigation and target tracking system. In Proceedings of the 2011 6th International Conference on System of Systems Engineering, Albuquerque, NM, USA, 27–30 June 2011; pp. 299–304.
33. Diehl, P.U.; Cook, M. Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Front. Comput. Neurosci. 2015, 9, 99.
34. Wu, Y.; Deng, L.; Li, G.; Zhu, J.; Shi, L. Spatio-temporal backpropagation for training high-performance spiking neural networks. Front. Neurosci. 2018, 12, 331.
35. Izhikevich, E.M. Dynamical Systems in Neuroscience; MIT Press: Cambridge, MA, USA, 2007.
36. Subbulakshmi Radhakrishnan, S.; Sebastian, A.; Oberoi, A.; Das, S.; Das, S. A biomimetic neural encoder for spiking neural network. Nat. Commun. 2021, 12, 2143.
37. Bing, Z.; Baumann, I.; Jiang, Z.; Huang, K.; Cai, C.; Knoll, A. Supervised learning in SNN via reward-modulated spike-timing-dependent plasticity for a target reaching vehicle. Front. Neurorobot. 2019, 13, 18.
38. Ramne, M. Spiking Neural Network for Targeted Navigation and Collision Avoidance in an Autonomous Robot. Master's Thesis, Chalmers University of Technology, Göteborg, Sweden, 2020.
39. Tai, L.; Li, S.; Liu, M. A deep-network solution towards model-less obstacle avoidance. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 9–14 October 2016; pp. 2759–2764.
40. Liu, C.; Zheng, B.; Wang, C.; Zhao, Y.; Fu, S.; Li, H. CNN-based vision model for obstacle avoidance of mobile robot. MATEC Web Conf. 2017, 139, 7.
41. Yang, J.; Shi, Y.; Rong, H.J. Random neural Q-learning for obstacle avoidance of a mobile robot in unknown environments. Adv. Mech. Eng. 2016, 8, 1687814016656591.
42. Bing, Z.; Meschede, C.; Röhrbein, F.; Huang, K.; Knoll, A.C. A survey of robotics control based on learning-inspired spiking neural networks. Front. Neurorobot. 2018, 12, 35.
Figure 1. A schematic model of a two-wheeled mobile robot.
Figure 2. The general SNN framework with STDP and our model.
Figure 3. The three main Izhikevich models are the mathematical model (A), the basic Izhikevich model of spiking neurons (B), and the flowchart of the system’s Izhikevich model activities (C).
Figure 4. The flowchart of the proposed system.
Figure 5. (A) The robot during the training phase; (B) the robot during the testing phase, reaching the target.
Figure 6. The robot in the training phase, learning about its surroundings.
Figure 7. A mobile robot undergoing testing in an environment with three obstacles and one target.
Figure 8. Synaptic weight changes in initial steps.
Figure 9. Synaptic weights between the sensors and motors in the SNN of a mobile robot during the training phase.
Figure 10. Accuracy and collision comparison [38,39,40,41].
Table 1. Values of parameters of the Izhikevich model.
Parameter | a    | b   | c   | d
Value     | 0.02 | 0.2 | −65 | 8
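
As a rough illustration of how these parameters enter the neuron dynamics, the following Python sketch integrates the standard Izhikevich equations (v' = 0.04v² + 5v + 140 − u + I and u' = a(bv − u), with reset v ← c and u ← u + d at the 30 mV threshold) using the Table 1 values. The 1 ms Euler step and the constant input current are assumptions made for illustration, not the paper's simulation settings.

```python
# Izhikevich neuron with the Table 1 parameters; the 1 ms Euler step and the
# constant input current are illustrative assumptions.
a, b, c, d = 0.02, 0.2, -65.0, 8.0   # regular-spiking parameter set (Table 1)
V_THRESH = 30.0                      # spike threshold in mV (Table 2)

def izhikevich_step(v: float, u: float, i_in: float, dt: float = 1.0):
    """Advance membrane potential v (mV) and recovery variable u by dt ms."""
    v = v + dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_in)
    u = u + dt * (a * (b * v - u))
    fired = v >= V_THRESH
    if fired:
        v, u = c, u + d              # reset after a spike
    return v, u, fired

# Example: a constant input current produces regular spiking over 1000 ms.
v, u, spikes = -65.0, 0.2 * -65.0, 0
for _ in range(1000):
    v, u, fired = izhikevich_step(v, u, i_in=10.0)
    spikes += int(fired)
```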
Table 2. The parameters used in SNN models and learning algorithms.
Parameter Description | Value
Number of sensory neurons | 100
Number of motor neurons | 200
Number of steps to send signals to postsynaptic neurons | 1
Maximum synaptic weight | 4
Time scale of the recovery variable | 0.02
Sensitivity of the recovery variable to subthreshold fluctuations of the membrane potential | 0.2
Reset value of the membrane potential | −65 mV
Reset value of the recovery variable | 8
Initial synaptic weights | 1
Membrane potential threshold to spike | 30 mV
Number of experiments | 100
Constant of reward | 1
Constant of punishment | −1
Reward/punishment decreasing factor in each time step | 0.995
Time step to decrease reward/punishment value | 1 ms
Synaptic weight derivation decreasing factor in each time step | 0.95
STDP values decreasing factor in each time step | 0.99
STDP factor in LTP part | 1
STDP factor in LTD part | −1.1
Time step to decrease synaptic weight derivation value | 5 ms
Time step to encode the sensor signals to the network | 10 ms
Time step to decrease STDP values | 1 ms
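
The learning constants above suggest a reward-modulated (dopamine-gated) STDP rule. The Python sketch below shows one common form of such a rule wired to those constants; the exact update used in the paper is not reproduced here, so the pairing scheme and the per-step application of the eligibility decay (Table 2 applies it every 5 ms) are simplifying assumptions.

```python
# One possible dopamine-modulated STDP update using the Table 2 constants.
# The pairing scheme and per-step eligibility decay are simplifying
# assumptions, not the authors' exact rule.
A_PLUS, A_MINUS = 1.0, -1.1   # STDP factors for the LTP and LTD parts
STDP_DECAY = 0.99             # decay of the pre/post STDP traces per ms
ELIG_DECAY = 0.95             # decay of the synaptic eligibility trace
REWARD_DECAY = 0.995          # decay of the dopamine signal per ms
W_MAX = 4.0                   # maximum synaptic weight

def update_synapse(w, elig, pre_trace, post_trace,
                   pre_spike, post_spike, dopamine):
    """One time-step update of a single synapse (sketch)."""
    if pre_spike:
        pre_trace = 1.0
    if post_spike:
        post_trace = 1.0
        elig += A_PLUS * pre_trace    # post after pre: potentiation (LTP)
    if pre_spike:
        elig += A_MINUS * post_trace  # pre after post: depression (LTD)
    # Dopamine gates how much of the eligibility trace becomes a weight change,
    # and the weight stays within [0, W_MAX].
    w = min(max(w + dopamine * elig, 0.0), W_MAX)
    # Traces relax toward zero between spikes.
    pre_trace *= STDP_DECAY
    post_trace *= STDP_DECAY
    elig *= ELIG_DECAY
    return w, elig, pre_trace, post_trace

# The dopamine signal is set to +1 on reward or -1 on punishment and then
# decays each millisecond: dopamine *= REWARD_DECAY.
```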
Table 3. Analysis of the relationship between Figure 9 and the sensor-motor connections of the robot.
Connection Words | Connection (Sensor, Motor) | Line Color in Figure 9
tarL->MR | Left target sensor to the right motor | Blue
tarL->ML | Left target sensor to the left motor | Orange
tarR->MR | Right target sensor to the right motor | Green
tarR->ML | Right target sensor to the left motor | Red
obsL->MR | Left obstacle sensor to the right motor | Purple
obsL->ML | Left obstacle sensor to the left motor | Brown
obsR->MR | Right obstacle sensor to the right motor | Pink
obsR->ML | Right obstacle sensor to the left motor | Gray
Table 4. The final results of the current study compared with the obstacle-avoidance accuracy of related approaches.
No. | Reference | Algorithm | Reached the Target (%) | Collisions (%)
1 | Proposed method (3 obstacles) | SNN | 94 | 6
2 | Ramne [38] | SNN | 84 | 16
3 | Ramne [38] | Non-SNN | 91 | 9
4 | Tai et al. [39] | CNN | 80.2 | 19.98
5 | Liu et al. [40] | CNN-based vision model | 81.72 | 18.28
6 | Liu et al. [40] | CNN-based vision model | 93.21 | 6.79
7 | Yang et al. [41] | Backpropagation Q-learning | 91.8 | 7.4
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
