Exploring Saliency for Learning Sensory-Motor Contingencies in Loco-Manipulation Tasks
Abstract
1. Introduction
 We devise an algorithm to extract the contingent space-time relations occurring within intertwined streams of actions and perceptions, in the form of a tree of sensorimotor contingencies (SMCs).
 The algorithm is based on the introduction of an attention mechanism that helps in identifying salient features within Sensory-Motor Traces (SMTs). This mechanism aids in segmenting continuous streams of sensory and motor data into meaningful fragments, thereby enhancing learning and decision-making processes.
 Moreover, the algorithm leverages the introduction of suitable metrics to assess the relationship between different SMTs and between an SMT and the current context. These metrics are crucial for understanding how actions relate to sensory feedback and how these relationships adapt to new contexts.
 The underlying implicit representation is encoded in a tree structure, which organizes the SMCs in a manner that reflects their contingent relationships. This structured representation enables robots to navigate through the tree, identifying the most relevant SMTs based on the current context, thereby facilitating decision-making across diverse scenarios.
 The versatility and adaptability of the framework are demonstrated through its integration into various robotic platforms, including a 7-degree-of-freedom collaborative robotic arm and a mobile manipulator. This adaptability underscores the potential for applying the proposed methods across a wide spectrum of robotic applications.
2. Problem Statement
3. Proposed Solution
 A metric to measure the distance between perceptions ${d}_{P}({P}_{1}\left(t\right),{P}_{2}\left(t\right))$ and actions ${d}_{A}({A}_{1}\left(t\right),{A}_{2}\left(t\right))$;
 Contingency relations between SMTs, denoted as ${C}_{T}({T}^{1},{T}^{2})$, and between a context and an SMT, denoted as ${C}_{C}(c,T)$;
 An operation to adapt an action to a different context: $${A}_{\ast}\left(t\right)=M({T}_{0},{c}_{\ast},t)=M(({P}_{0}\left(t\right),{A}_{0}\left(t\right)),{c}_{\ast},t).$$
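The ingredients above can be sketched minimally in Python. The Euclidean choice for $d_P$ and $d_A$ and the averaging rule in `trace_contingency` are illustrative assumptions, not the paper's actual definitions:

```python
import numpy as np

def perception_distance(p1, p2):
    """d_P: distance between two perception feature vectors.
    Euclidean metric chosen only for illustration."""
    return float(np.linalg.norm(np.asarray(p1) - np.asarray(p2)))

def action_distance(a1, a2):
    """d_A: distance between two action parameter vectors."""
    return float(np.linalg.norm(np.asarray(a1) - np.asarray(a2)))

def trace_contingency(trace1, trace2, threshold=0.5):
    """C_T: a hypothetical aggregation rule declaring two SMTs
    contingent when their (P, A) samples stay close on average.
    Each trace is a sequence of (perception, action) pairs."""
    d = np.mean([perception_distance(p1, p2) + action_distance(a1, a2)
                 for (p1, a1), (p2, a2) in zip(trace1, trace2)])
    return bool(d < threshold)
```

Any pair of metrics with the usual properties could be substituted; the point is only that the framework requires *some* $d_P$, $d_A$, and a contingency test built on top of them.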
3.1. Temporal Saliency
3.2. Spatial Saliency
3.3. Perception Saliency and Inter-Perception Distance
 (a)
 Given a simple intrinsic perception ${P}^{i,j}={}_{b}\tilde{p}^{i,j}$ or a set of intrinsic perceptions ${P}^{i,j}=\left({}_{b}\tilde{p}^{i,j}\right)$, all the intrinsic perceptions are considered candidates:$${}_{b}\tilde{p}^{i,j}\in {}^{S}P^{i,j}$$
 (b)
 Given a simple localized perception ${P}^{i,j}=\left({}_{1}\overline{p}^{i,j}\right)$:$${}_{1}\overline{p}^{i,j}\in {}^{S}P^{i,j}\iff {}_{1}d_{X}({}_{1}x^{i,j},{x}_{POI})<{}_{1}\epsilon$$
 (c)
 Given a set of simple localized perceptions ${P}^{i,j}=({}_{1}\overline{p}^{i,j},\dots ,{}_{n}\overline{p}^{i,j})$:$${}_{k}\overline{p}^{i,j}\in {}^{S}P^{i,j}\iff {}_{k}d_{X}({}_{k}x^{i,j},{x}_{POI})<{}_{k}\epsilon$$$$k=arg\underset{k=1,\dots ,n}{min}\left({}_{k}d_{X}({}_{k}x^{i,j},{x}_{POI})\right)$$
 (d)
 Given a multiple localized perception ${P}^{i,j}=\left({}_{1}\overline{p}_{\circ}^{i,j}\right)=\left(\{{}_{1}\overline{p}_{1}^{i,j},\dots ,{}_{1}\overline{p}_{m}^{i,j}\}\right)$:$${}_{1}\overline{p}_{l}^{i,j}\in {}^{S}P^{i,j}\iff {}_{1}d_{X}({}_{1}x_{l}^{i,j},{x}_{POI})<{}_{1}\epsilon$$$$l=arg\underset{l=1,\dots ,m}{min}\left({}_{1}d_{X}({}_{1}x_{l}^{i,j},{x}_{POI})\right)$$
 (e)
 Given a set of multiple localized perceptions ${P}^{i,j}=({}_{1}\overline{p}_{\circ}^{i,j},\dots ,{}_{h}\overline{p}_{\circ}^{i,j})$:$${}_{k}\overline{p}_{l}^{i,j}\in {}^{S}P^{i,j}\iff {}_{k}d_{X}({}_{k}x_{l}^{i,j},{x}_{POI})<{}_{k}\epsilon$$$$l=arg\underset{l=1,\dots ,r}{min}\left({}_{k}d_{X}({}_{k}x_{l}^{i,j},{x}_{POI})\right),k=1,\dots ,h$$$$k=arg\underset{k=1,\dots ,h}{min}\left({}_{k}d_{X}({}_{k}x_{l}^{i,j},{x}_{POI})\right)$$where $r$ denotes the number of elements of ${}_{k}\overline{p}_{\circ}^{i,j}$, which may differ for each multiple perception.
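The selection rule shared by criteria (b)-(e) — take the localized perception whose location is closest to $x_{POI}$, and accept it only if that distance falls below the threshold $\epsilon$ — can be sketched as follows. The `(feature, location)` tuple layout is an assumption made for illustration:

```python
import numpy as np

def select_salient(perceptions, x_poi, eps):
    """Pick the salient localized perception among the candidates:
    the one minimizing d_X(x_k, x_POI), retained only if the
    minimum distance is below epsilon; otherwise none is salient.
    `perceptions` is a list of (feature, location) pairs."""
    dists = [np.linalg.norm(np.asarray(x) - np.asarray(x_poi))
             for _, x in perceptions]
    k = int(np.argmin(dists))      # k = arg min_k d_X(x_k, x_POI)
    if dists[k] < eps:             # saliency test against epsilon
        return perceptions[k]
    return None
```

For the nested case (e), the same function would be applied twice: once over the elements of each multiple perception, and once over the resulting per-perception minima.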
3.4. Action Saliency and Inter-Action Distance
 ${\omega}^{i,j}$ is a parameterization of the salient action:$${\omega}^{i,j}={f}_{\omega}\left({A}^{i,j}\left(t\right)\right):[{t}_{j-1},{t}_{j}]\to \mathbb{W}$$
 ${x}_{s}$ is an empty variable which will be replaced by ${x}_{POI}$ in the starting configuration during the autonomous execution, e.g., the end-effector pose ${x}_{EE}$ or the state x, before executing the learned action.
 ${x}_{f}$ stands for the final value that ${x}_{POI}$ will assume at the end of the subtask. This value depends on the type of perceptions involved. If only salient intrinsic perceptions are at play, then ${x}_{f}={x}_{POI}\left({t}_{j}\right)$, with ${x}_{POI}\left({t}_{j}\right)$ denoting the final value assumed by ${x}_{POI}$ at the end of the j-th subtask during the SMT registration. Conversely, if at least one localized perception is involved, ${x}_{f}$ becomes parameterized with the perception’s location, as explained in the Contingent Action Mapping section (Section 3.6).
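A minimal container for the triple $(\omega^{i,j}, x_s, x_f)$ might look as follows; the class name and fields are hypothetical, chosen only to mirror the text:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SalientAction:
    """Hypothetical record of a learned salient action A^{i,j}:
    - omega parameterizes the demonstrated motion over [t_{j-1}, t_j];
    - x_f is the final value of x_POI for the subtask (or is
      re-parameterized on a localized perception, Section 3.6);
    - x_s starts empty and is bound to x_POI (e.g. x_EE) just
      before autonomous execution."""
    omega: np.ndarray
    x_f: np.ndarray
    x_s: "np.ndarray | None" = None

    def bind_start(self, x_poi_now):
        """Fill the empty start variable from the current context."""
        self.x_s = np.asarray(x_poi_now)
        return self
```

The important behavioural point is that `x_s` is deliberately left unset at learning time and only bound at execution time, which is what lets the same action transfer to a new context.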
3.5. Contingency Relation
3.6. Contingent Action Mapping
3.7. Sensory-Motor Contingency
3.8. Sensory-Motor Contingencies Tree
Algorithm 1 SMCs Tree 

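The tree construction described in the text — open a new branch for a novel salient perception, merge into an existing branch when the new perception matches one already stored — can be sketched as follows. The node structure and the `similar` predicate are assumptions, not the paper's Algorithm 1:

```python
class SMCNode:
    """Minimal sketch of an SMCs-tree node: one sensorimotor
    contingency (salient perception, salient action) plus the
    branches to the contingencies that may follow it."""
    def __init__(self, salient_perception, salient_action):
        self.perception = salient_perception
        self.action = salient_action
        self.children = []

    def insert(self, child, similar):
        """Merge into an existing branch when the child's salient
        perception matches one already stored (the merging-branches
        behaviour of Section 4.2); otherwise open a new branch.
        `similar(p1, p2)` is a user-supplied predicate."""
        for c in self.children:
            if similar(c.perception, child.perception):
                return c          # reuse the matching branch
        self.children.append(child)
        return child
```

Inserting every fragment of every registered SMT through this rule yields a tree whose shared prefixes encode perceptions common to several demonstrations, while divergent perceptions split into separate branches.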
3.9. SMCs-Aware Control
Algorithm 2 Autonomous Execution 

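The autonomous execution loop can be sketched as follows: at each node, sense the current context, descend into the child whose salient perception is most contingent with it, and execute the stored action after adapting it through the mapping $M$. The `Node` class and all callback names are hypothetical, standing in for the paper's Algorithm 2:

```python
class Node:
    """Toy SMCs-tree node for the sketch below."""
    def __init__(self, perception=None, action=None, children=()):
        self.perception, self.action = perception, action
        self.children = list(children)

def execute(node, sense, match_score, adapt, act):
    """Walk the tree until a leaf: pick the child maximizing the
    context-SMT contingency C_C(c, T), adapt its action to the
    current context (A_* = M(T, c_*)), and execute it."""
    while node.children:
        context = sense()
        best = max(node.children,
                   key=lambda c: match_score(context, c.perception))
        act(adapt(best.action, context))
        node = best
    return node
```

This makes explicit the division of labour in the framework: the tree supplies candidate contingencies, the context metric chooses among them, and the mapping adapts the chosen action before execution.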
4. Example Scenarios
4.1. Multiple Localized Perception—ML
4.1.1. SMTs Registration
 During the first SMT ${T}^{1}$ registration, the camera perceives two clusters (an orange cube and a green cylinder) ${}_{1}\overline{p}_{1}^{1}$ and ${}_{1}\overline{p}_{2}^{1}$. The user provides instructions to the robot, demonstrating how to grasp the cube in one salient moment (as depicted in Figure 5a), and then showing where to place it, above the green cylinder, in another salient temporal moment (illustrated in Figure 5b).
 During the second SMT ${T}^{2}$ registration, the camera perceives two clusters (a grey pyramid and a yellow cylinder) ${}_{1}\overline{p}_{1}^{2}$ and ${}_{1}\overline{p}_{2}^{2}$. The user provides instructions to the robot, demonstrating how to grasp the pyramid in one salient moment (as depicted in Figure 5d), and then showing where to place it, above the yellow cylinder, in another salient temporal moment (illustrated in Figure 5e).
4.1.2. SMCs Extraction
 As the user provided two distinct salient temporal moments, the SMT is segmented in two fragments, ${T}^{1,1},{T}^{1,2}$. Each fragment contains associated perceptions, ${P}^{1,j}$, which are defined as ${P}^{1,j}=\left({}_{1}\overline{p}_{\circ}^{1,j}\right)=\left(\{{}_{1}\overline{p}_{1}^{1,j},{}_{1}\overline{p}_{2}^{1,j}\}\right):j=1,2$ where ${}_{1}\overline{p}_{1}^{1,j}$ represents the orange cube and ${}_{1}\overline{p}_{2}^{1,j}$ represents the green cylinder. For the two subtasks, the salient perceptions are determined by the criteria presented in (21) and (22). In this case, the robot only learns the color histogram of the object closest to the E.E. (i.e., the orange cube for ${T}^{1,1}$ and the green cylinder for ${T}^{1,2}$):$$l=arg\underset{l=1,2}{min}\left({}_{1}d_{X}({}_{1}x_{l}^{1,j},{x}_{EE}\left({t}_{j}\right))\right):j=1,2$$$${}_{1}d_{X}({}_{1}x_{l}^{1,j},{x}_{EE}\left({t}_{j}\right))<{}_{1}\epsilon\Rightarrow {}_{1}\overline{p}_{l}^{1,j}\in {}^{S}P^{1,j}:j=1,2$$$${\omega}^{1,j}={f}_{\omega}\left({A}^{1,j}\left(t\right)\right):t\in ({t}_{j-1},{t}_{j}):j=1,2$$
 As the user provided two distinct salient temporal moments, the SMT is segmented in two fragments, ${T}^{2,1},{T}^{2,2}$. Each fragment contains associated perceptions, ${P}^{2,j}$, which are defined as ${P}^{2,j}=\left({}_{1}\overline{p}_{\circ}^{2,j}\right)=\left(\{{}_{1}\overline{p}_{1}^{2,j},{}_{1}\overline{p}_{2}^{2,j}\}\right):j=1,2$ where ${}_{1}\overline{p}_{1}^{2,j}$ represents the grey pyramid and ${}_{1}\overline{p}_{2}^{2,j}$ represents the yellow cylinder. For the two subtasks, the salient perceptions are determined by the criteria presented in (21) and (22). In this case, the robot only learns the color histogram of the object closest to the E.E. (i.e., the grey pyramid for ${T}^{2,1}$ and the yellow cylinder for ${T}^{2,2}$):$$l=arg\underset{l=1,2}{min}\left({}_{1}d_{X}({}_{1}x_{l}^{2,j},{x}_{EE}\left({t}_{j}\right))\right):j=1,2$$$${}_{1}d_{X}({}_{1}x_{l}^{2,j},{x}_{EE}\left({t}_{j}\right))<{}_{1}\epsilon\Rightarrow {}_{1}\overline{p}_{l}^{2,j}\in {}^{S}P^{2,j}:j=1,2$$$${\omega}^{2,j}={f}_{\omega}\left({A}^{2,j}\left(t\right)\right):t\in ({t}_{j-1},{t}_{j}):j=1,2$$
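The per-subtask selection of the salient cluster by distance to the end-effector can be worked through numerically for this pick-and-place example. Object positions, the $\epsilon$ value, and the tuple layout are invented purely for illustration:

```python
import numpy as np

def salient_cluster(clusters, x_ee, eps=0.1):
    """For one subtask: among the perceived clusters
    (name, histogram, location), keep only the one closest to the
    end-effector at the salient moment, if it is within eps."""
    dists = [np.linalg.norm(np.asarray(x) - np.asarray(x_ee))
             for _, _, x in clusters]
    l = int(np.argmin(dists))      # l = arg min_l d_X(x_l, x_EE(t_j))
    return clusters[l][0] if dists[l] < eps else None

# Two perceived clusters with made-up planar positions (meters).
clusters = [("orange cube",    "hist_cube", [0.40,  0.10]),
            ("green cylinder", "hist_cyl",  [0.60, -0.20])]

# Grasp moment: the end-effector is at the cube, so the cube's
# histogram is the salient perception of subtask T^{1,1}.
grasp = salient_cluster(clusters, x_ee=[0.41, 0.11])

# Place moment: the end-effector is above the cylinder, so the
# cylinder's histogram is salient for T^{1,2}.
place = salient_cluster(clusters, x_ee=[0.60, -0.19])
```

The same computation with the pyramid and yellow cylinder reproduces the second trace's extraction.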
4.1.3. SMCs Tree
4.1.4. SMCs Control Execution
4.2. Robotic Mobile Base Merging Branches Example
4.2.1. SMTs Registration
 During the first SMT ${T}^{1}$ registration, in a first temporal salient moment, the camera perceives the initial color histogram of the room ${}_{1}\tilde{p}^{1,1}$ and the Lidar provides the room point cloud ${}_{2}\tilde{p}^{1,1}$ acquired in ${}_{2}x^{1,1}$. The user decides to move the robot from its initial configuration to ${\mathbf{x}}^{1,1}$ (Figure 6a). Then, in a second salient temporal moment, the camera perceives a green arrow ${}_{1}\tilde{p}^{1,2}$ and the Lidar acquires a new point cloud ${}_{2}\tilde{p}^{1,2}$ in ${}_{2}x^{1,2}$. The user decides to move the robot from its initial configuration to ${\mathbf{x}}^{1,2}$ (Figure 6b).
 During the second SMT ${T}^{2}$ registration, in a first temporal salient moment, the camera perceives the initial color histogram of the room ${}_{1}\tilde{p}^{2,1}$ ($\sim {}_{1}\tilde{p}^{1,1}$) and the Lidar provides the room point cloud ${}_{2}\tilde{p}^{2,1}$ ($\sim {}_{2}\tilde{p}^{1,1}$) in ${}_{2}x^{2,1}$. The user decides to move the robot from its initial configuration to ${\mathbf{x}}^{2,1}$ (Figure 6d). Then, in a second salient temporal moment, the camera perceives a yellow arrow ${}_{1}\tilde{p}^{2,2}$ and the Lidar acquires a new point cloud ${}_{2}\tilde{p}^{2,2}$ ($\sim {}_{2}\tilde{p}^{1,2}$) in ${}_{2}x^{2,2}$. The user decides to move the robot from its initial configuration to ${\mathbf{x}}^{2,2}$ (Figure 6e).
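The similarity relation $\sim$ between camera perceptions needs a concrete distance. A standard choice for comparing normalized color histograms is the Bhattacharyya distance; the threshold value below is an assumption, not the paper's tuned parameter:

```python
import numpy as np

def bhattacharyya_distance(h1, h2):
    """Distance between two color histograms, used as an
    illustrative perception distance for the ~ relation: 0 for
    identical normalized histograms, large for disjoint ones."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    h1, h2 = h1 / h1.sum(), h2 / h2.sum()
    bc = np.sum(np.sqrt(h1 * h2))        # Bhattacharyya coefficient
    return -np.log(max(bc, 1e-12))       # clamp to avoid log(0)

def similar(h1, h2, eps=0.05):
    """The ~ relation: two perceptions match when their
    histogram distance falls below a threshold (assumed value)."""
    return bhattacharyya_distance(h1, h2) < eps
```

With such a predicate, the first salient moments of $T^1$ and $T^2$ (same room histogram, same point cloud) compare as similar and the two traces merge into a shared branch, while the green and yellow arrows compare as dissimilar and split it.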
4.2.2. SMCs Extraction
 As the user provided two distinct salient temporal moments, the SMT is segmented in two fragments, ${T}^{1,1},{T}^{1,2}$. Each fragment contains associated perceptions, ${P}^{1,j}$, which are defined as ${P}^{1,j}=({}_{1}\tilde{p}^{1,j},{}_{2}\overline{p}^{1,j}):j=1,2$. For the two subtasks, the salient perceptions are determined by the criteria presented in (17) and (18):$${}_{1}\tilde{p}^{1,j}\in {}^{S}P^{1,j}:j=1,2$$$${}_{2}d_{X}({}_{2}x^{1,j},\mathbf{x})<{}_{2}\epsilon\Rightarrow {}_{2}\overline{p}^{1,j}\in {}^{S}P^{1,j}:j=1,2$$$${\omega}^{1,j}={f}_{\omega}\left({A}^{1,j}\left(t\right)\right):t\in ({t}_{j-1},{t}_{j})$$
 As the user provided two distinct salient temporal moments, the SMT is segmented in two fragments, ${T}^{2,1},{T}^{2,2}$. Each fragment contains associated perceptions, ${P}^{2,j}$, which are defined as ${P}^{2,j}=({}_{1}\tilde{p}^{2,j},{}_{2}\overline{p}^{2,j}):j=1,2$. For the two subtasks, the salient perceptions are determined by the criteria presented in (17) and (18):$${}_{1}\tilde{p}^{2,j}\in {}^{S}P^{2,j}:j=1,2$$$${}_{2}d_{X}({}_{2}x^{2,j},\mathbf{x})<{}_{2}\epsilon\Rightarrow {}_{2}\overline{p}^{2,j}\in {}^{S}P^{2,j}:j=1,2$$$${\omega}^{2,j}={f}_{\omega}\left({A}^{2,j}\left(t\right)\right):t\in ({t}_{j-1},{t}_{j})$$
4.2.3. SMCs Tree
4.2.4. SMCs Control Execution
4.3. Robotic Arm Merging Branches Example
4.3.1. SMTs Registration
 During the first SMT ${T}^{1}$ registration, the camera perceives one cluster (orange cube) ${}_{1}\overline{p}_{1}^{1,1}$ while the weight sensor detects a zero weight ${}_{2}\tilde{p}^{1,1}$. The user shows the robot how to grasp the cube, Figure 7a, and place it above the weight sensor, Figure 7b. Then, the camera still perceives the same cluster ${}_{1}\overline{p}_{1}^{1,3}$ and the weight sensor acquires the orange cube weight ${}_{2}\tilde{p}^{1,3}$. Finally, the user shows the robot where to put the object, Figure 7c.
 During the second SMT ${T}^{2}$ registration, the camera perceives one cluster (orange cube) ${}_{1}\overline{p}_{1}^{2,1}$ while the weight sensor detects a zero weight ${}_{2}\tilde{p}^{2,1}$. The user shows the robot how to grasp the cube, Figure 7e, and place it above the weight sensor, Figure 7f. Then, the camera still perceives the same cluster ${}_{1}\overline{p}_{1}^{2,3}$ and the weight sensor acquires the orange cube weight ${}_{2}\tilde{p}^{2,3}$ ($\ne {}_{2}\tilde{p}^{1,3}$). Finally, the user shows the robot where to put the object $({}_{1}x_{3}^{2,3}\ne {}_{1}x_{3}^{1,3})$, Figure 7f.
4.3.2. SMCs Extraction
 The salient perceptions for the subtasks are given by (21), (22) and (17):$$l=arg\underset{l=1}{min}\left({}_{1}d_{X}({}_{1}x_{l}^{1,j},{x}_{EE}\left({t}_{j}\right))\right):j=1,2,3$$$${}_{1}d_{X}({}_{1}x_{l}^{1,j},{x}_{EE}\left({t}_{j}\right))<{}_{1}\epsilon\Rightarrow {}_{1}\overline{p}_{l}^{1,j}\in {}^{S}P^{1,j}:j=1,2,3$$$${}_{2}\tilde{p}^{1,j}\in {}^{S}P^{1,j}:j=1,2,3$$$${\omega}^{1,j}={f}_{\omega}\left({A}^{1,j}\left(t\right)\right):t\in ({t}_{j-1},{t}_{j}):j=1,2,3$$
 The salient perceptions for the subtasks are given by (21), (22) and (17):$$l=arg\underset{l=1}{min}\left({}_{1}d_{X}({}_{1}x_{l}^{2,j},{x}_{EE}\left({t}_{j}\right))\right):j=1,2,3$$$${}_{1}d_{X}({}_{1}x_{l}^{2,j},{x}_{EE}\left({t}_{j}\right))<{}_{1}\epsilon\Rightarrow {}_{1}\overline{p}_{l}^{2,j}\in {}^{S}P^{2,j}:j=1,2,3$$$${}_{2}\tilde{p}^{2,j}\in {}^{S}P^{2,j}:j=1,2,3$$$${\omega}^{2,j}={f}_{\omega}\left({A}^{2,j}\left(t\right)\right):t\in ({t}_{j-1},{t}_{j}):j=1,2,3$$
4.3.3. SMCs Tree
4.3.4. SMCs Control Execution
5. Experimental Validation
5.1. Validation in Simulation
5.1.1. Experimental Setup
5.1.2. Task Description
5.1.3. Results
5.2. Validation in a Real Manipulation Task
5.2.1. Experimental Setup
5.2.2. Task Description
5.2.3. SMTs Registration
5.2.4. Execution Phase
5.3. Validation in a Real LocoManipulation Task
5.3.1. Experimental Setup
5.3.2. Task Description
5.3.3. SMTs Registration
5.3.4. Execution Phase
6. Discussion
Affordances in the SMCs Tree
7. Conclusions
8. Patents
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Stefanini, E.; Lentini, G.; Grioli, G.; Catalano, M.G.; Bicchi, A. Exploring Saliency for Learning Sensory-Motor Contingencies in Loco-Manipulation Tasks. Robotics 2024, 13, 58. https://doi.org/10.3390/robotics13040058