{"id":800,"date":"2023-10-29T11:09:48","date_gmt":"2023-10-29T11:09:48","guid":{"rendered":"https:\/\/todaysainews.com\/index.php\/2023\/10\/29\/from-motor-control-to-embodied-intelligence-2\/"},"modified":"2025-04-27T07:32:16","modified_gmt":"2025-04-27T07:32:16","slug":"from-motor-control-to-embodied-intelligence-2","status":"publish","type":"post","link":"https:\/\/todaysainews.com\/index.php\/2023\/10\/29\/from-motor-control-to-embodied-intelligence-2\/","title":{"rendered":"From motor control to embodied intelligence"},"content":{"rendered":"<p> [ad_1]<br \/>\n<\/p>\n<div>\n<div class=\"article-cover\">\n<div class=\"article-cover__header\">\n<p class=\"article-cover__eyebrow glue-label\">Research<\/p>\n<dl class=\"article-cover__meta\">\n<dt class=\"glue-visually-hidden\">Published<\/dt>\n<dd class=\"article-cover__date glue-label\">\n              <time datetime=\"2022-08-31\"><br \/>\n                31 August 2022<br \/>\n              <\/time>\n            <\/dd>\n<dt class=\"glue-visually-hidden\">Authors<\/dt>\n<dd class=\"article-cover__authors\">\n<p data-block-key=\"skhtg\">Siqi Liu, Leonard Hasenclever, Steven Bohez, Guy Lever, Zhe Wang, S. M. 
Ali Eslami, Nicolas Heess<\/p>\n<\/dd>\n<\/dl>\n<\/div>\n<\/div>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"ws9fp\">Using human and animal motions to teach robots to dribble a ball, and simulated humanoid characters to carry boxes and play football<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"s4lj0\">Humanoid character learning to traverse an obstacle course through trial and error, which can lead to idiosyncratic solutions. Heess, et al. &#8220;Emergence of locomotion behaviours in rich environments&#8221; (2017).<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"znasd\">Five years ago, we took on the challenge of teaching a fully articulated humanoid character to <a href=\"https:\/\/youtu.be\/hx_bgoTF7bs?t=88\" rel=\"noopener\" target=\"_blank\">traverse obstacle courses<\/a>. This demonstrated what reinforcement learning (RL) can achieve through trial and error but also highlighted two challenges in solving <i>embodied<\/i> intelligence:<\/p>\n<ol>\n<li data-block-key=\"yrjav\"><b>Reusing previously learned behaviours:<\/b> A significant amount of data was needed for the agent to \u201cget off the ground\u201d. Without any initial knowledge of what force to apply to each of its joints, the agent started with random body twitching and quickly falling to the ground. 
This problem could be alleviated by reusing previously learned behaviours.<\/li>\n<li data-block-key=\"6r7wa\"><b>Idiosyncratic behaviours:<\/b> When the agent finally learned to navigate obstacle courses, it did so with unnatural (<a href=\"https:\/\/www.youtube.com\/watch?v=EI3gcbDUNiM&amp;t=258s\" rel=\"noopener\" target=\"_blank\">albeit amusing<\/a>) movement patterns that would be impractical for applications such as robotics.<\/li>\n<\/ol>\n<p data-block-key=\"cw2fi\">Here, we describe a solution to both challenges called neural probabilistic motor primitives (NPMP), involving guided learning with movement patterns derived from humans and animals, and discuss how this approach is used in our <a href=\"https:\/\/www.science.org\/doi\/10.1126\/scirobotics.abo0235\" rel=\"noopener\" target=\"_blank\">Humanoid Football paper,<\/a> published today in Science Robotics.<\/p>\n<p data-block-key=\"1fm4u\">We also discuss how this same approach enables humanoid full-body manipulation from vision, such as a humanoid carrying an object, and robotic control in the real-world, such as a robot dribbling a ball.<\/p>\n<h2 data-block-key=\"8b3l7\">Distilling data into controllable motor primitives using NPMP<\/h2>\n<p data-block-key=\"0a0wj\">An NPMP is a general-purpose motor control module that translates short-horizon motor intentions to low-level control signals, and it\u2019s <a href=\"https:\/\/openreview.net\/forum?id=BJl6TjRcY7\" rel=\"noopener\" target=\"_blank\">trained offline<\/a> or <a href=\"https:\/\/proceedings.mlr.press\/v119\/hasenclever20a.html\" rel=\"noopener\" target=\"_blank\">via RL<\/a> by imitating motion capture (MoCap) data, recorded with trackers on humans or animals performing motions of interest.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"7kvem\">An agent learning to imitate a MoCap trajectory (shown in grey).<\/p>\n<\/figcaption><\/figure>\n<div 
class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"uyxip\"><b>The model has two parts:<\/b><\/p>\n<ol>\n<li data-block-key=\"qvvwl\">An encoder that takes a future trajectory and compresses it into a motor intention.<\/li>\n<li data-block-key=\"obxsr\">A low-level controller that produces the next action given the current state of the agent and this motor intention.<\/li>\n<\/ol>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"0vxvh\">Our NPMP model first distils reference data into a low-level controller (left). This low-level controller can then be used as a plug-and-play motor control module on a new task (right).<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"d2xep\">After training, the low-level controller can be reused to learn new tasks, where a high-level controller is optimised to output motor intentions directly. This enables efficient exploration \u2013 since coherent behaviours are produced, even with randomly sampled motor intentions \u2013 and constrains the final solution.<\/p>\n<h2 data-block-key=\"eacri\">Emergent team coordination in humanoid football<\/h2>\n<p data-block-key=\"y8dnd\">Football has been <a href=\"https:\/\/link.springer.com\/chapter\/10.1007\/3-540-64473-3_46\" rel=\"noopener\" target=\"_blank\">a long-standing challenge<\/a> for embodied intelligence research, requiring individual skills and coordinated team play. In our latest work, we used an NPMP as a prior to guide the learning of movement skills.<\/p>\n<p data-block-key=\"yi82w\">The result was a team of players which progressed from learning ball-chasing skills, to finally learning to coordinate. Previously, in a <a href=\"https:\/\/openreview.net\/forum?id=BkG8sjR5Km\" rel=\"noopener\" target=\"_blank\">study with simple embodiments<\/a>, we had shown that coordinated behaviour can emerge in teams competing with each other. 
The NPMP allowed us to observe a similar effect but in a scenario that required significantly more advanced motor control.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\">\n<\/figure>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"vmdqw\">Agents first mimic the movement of football players to learn an NPMP module (top). Using the NPMP, the agents then learn football-specific skills (bottom).<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"5ly0x\">Our agents acquired skills including agile locomotion, passing, and division of labour as demonstrated by a range of statistics, including metrics used in <a href=\"https:\/\/www.researchgate.net\/profile\/William-Spearman\/publication\/327139841_Beyond_Expected_Goals\/links\/5b7c3023a6fdcc5f8b5932f7\/Beyond-Expected-Goals.pdf\" rel=\"noopener\" target=\"_blank\">real-world sports analytics<\/a>. The players exhibit both agile high-frequency motor control and long-term decision-making that involves anticipation of teammates\u2019 behaviours, leading to coordinated team play.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"2z7ei\">An agent learning to play football competitively using multi-agent RL.<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<h2 data-block-key=\"933rv\">Whole-body manipulation and cognitive tasks using vision<\/h2>\n<p data-block-key=\"1axkx\">Learning to interact with objects using the arms is another difficult control challenge. The NPMP can also enable this type of whole-body manipulation. 
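<\/p>\n<p data-block-key=\"reusea\">At the level of control flow, reusing the module on a new task looks roughly like this: the low-level controller is frozen, and a high-level policy trained with RL on the new task outputs motor intentions instead of raw joint commands. Names and sizes below are again illustrative assumptions, not the published implementation.<\/p>

```python
import numpy as np

rng = np.random.default_rng(1)

STATE_DIM, INTENT_DIM, ACTION_DIM, TASK_DIM = 56, 60, 21, 8  # illustrative

# Frozen after distillation from MoCap; random weights as a stand-in.
W_low = rng.standard_normal((STATE_DIM + INTENT_DIM, ACTION_DIM)) * 0.1
# Trained with RL on the new task (e.g. carrying a box to a target).
W_high = rng.standard_normal((STATE_DIM + TASK_DIM, INTENT_DIM)) * 0.1

def low_level(state, intention):
    # Plug-and-play motor module: (state, intention) -> bounded action.
    return np.tanh(np.concatenate([state, intention]) @ W_low)

def high_level(state, task_obs):
    # Task policy emits a motor intention rather than raw joint commands.
    return np.concatenate([state, task_obs]) @ W_high

state = rng.standard_normal(STATE_DIM)
task_obs = rng.standard_normal(TASK_DIM)
action = low_level(state, high_level(state, task_obs))

# Exploration stays coherent: even a randomly sampled intention is decoded
# into a bounded, well-formed action by the frozen low-level controller.
explore_action = low_level(state, rng.standard_normal(INTENT_DIM))
```

<p data-block-key=\"reuseb\">Because exploration happens in the intention space rather than in raw joint commands, even early random behaviour is decoded into movements the low-level controller has already learned, which helps with sparse-reward tasks.<\/p>\n<p data-block-key=\"1axkxb\">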
With a small amount of MoCap data of interacting with boxes, we\u2019re able to <a href=\"https:\/\/www.youtube.com\/watch?v=2rQAW-8gQQk\" rel=\"noopener\" target=\"_blank\">train an agent to carry a box<\/a> from one location to another, using egocentric vision and with only a sparse reward signal:<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\">\n<\/figure>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"3m0oc\">With a small amount of MoCap data (top), our NPMP approach can solve a box carrying task (bottom).<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"11zvx\">Similarly, we can teach the agent to catch and throw balls:<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"c2t30\">Simulated humanoid catching and throwing a ball.<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"voyrj\">Using NPMP, we can also tackle <a href=\"https:\/\/openreview.net\/forum?id=BJfYvo09Y7\" rel=\"noopener\" target=\"_blank\">maze tasks involving locomotion, perception and memory<\/a>:<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"jkvg8\">Simulated humanoid collecting blue spheres in a maze.<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<h2 data-block-key=\"xtfr0\">Safe and efficient control of real-world robots<\/h2>\n<p data-block-key=\"f5yp5\">The NPMP can also help to control real robots. Having well-regularised behaviour is critical for activities like walking over rough terrain or handling fragile objects. Jittery motions can damage the robot itself or its surroundings, or at least drain its battery. 
Therefore, significant effort is often invested in designing learning objectives that make a robot do what we want while behaving safely and efficiently.<\/p>\n<p data-block-key=\"u5cuo\">As an alternative, we investigated whether using <a href=\"https:\/\/arxiv.org\/abs\/2203.17138\" rel=\"noopener\" target=\"_blank\">priors derived from biological motion<\/a> can give us well-regularised, natural-looking, and reusable movement skills for legged robots \u2013 such as walking, running, and turning \u2013 that are suitable for deployment in the real world.<\/p>\n<p data-block-key=\"ge3o8\">Starting with MoCap data from humans and dogs, we adapted the NPMP approach to train skills and controllers in simulation that can then be deployed on real humanoid (OP3) and quadruped (ANYmal B) robots, respectively. This allowed the robots to be steered around by a user via a joystick or to dribble a ball to a target location in a natural-looking and robust way.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"trztx\">Locomotion skills for the ANYmal robot are learned by imitating dog MoCap.<\/p>\n<\/figcaption><\/figure>\n<figure class=\"single-media single-media--inline\">\n<\/figure>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"uvzgc\">Locomotion skills can then be reused for controllable walking and ball dribbling.<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<h2 data-block-key=\"9xvtm\">Benefits of using neural probabilistic motor primitives<\/h2>\n<p data-block-key=\"iusys\">In summary, we\u2019ve used the NPMP skill model to learn complex tasks with humanoid characters in simulation and real-world robots. The NPMP packages low-level movement skills in a reusable fashion, making it easier to learn useful behaviours that would be difficult to discover by unstructured trial and error. 
Using motion capture as a source of prior information, it biases the learning of motor control toward naturalistic movements.<\/p>\n<p data-block-key=\"324hq\">The NPMP enables embodied agents to learn more quickly using RL; to learn more naturalistic behaviours; to learn safer, more efficient, and more stable behaviours suitable for real-world robotics; and to combine full-body motor control with longer-horizon cognitive skills, such as teamwork and coordination.<\/p>\n<p data-block-key=\"zqcap\">Learn more about our work:<\/p>\n<\/div>\n<aside class=\"related-posts\">\n<\/aside><\/div>\n<p><a href=\"https:\/\/deepmind.google\/discover\/blog\/from-motor-control-to-embodied-intelligence\/\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Research Published 31 August 2022 Authors Siqi Liu, Leonard Hasenclever, Steven Bohez, Guy Lever, Zhe Wang, S.<\/p>\n","protected":false},"author":2,"featured_media":765,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[],"class_list":["post-800","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-deepmind-ai"],"_links":{"self":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/800","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/comments?post=800"}],"version-history":[{"count":1,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/800\/revisions"}],"predecessor-version":[{"id":2671,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/p
osts\/800\/revisions\/2671"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media\/765"}],"wp:attachment":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media?parent=800"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/categories?post=800"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/tags?post=800"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}