{"id":880,"date":"2023-11-02T11:34:10","date_gmt":"2023-11-02T11:34:10","guid":{"rendered":"https:\/\/todaysainews.com\/index.php\/2023\/11\/02\/stacking-our-way-to-more-general-robots-2\/"},"modified":"2025-04-27T07:30:58","modified_gmt":"2025-04-27T07:30:58","slug":"stacking-our-way-to-more-general-robots-2","status":"publish","type":"post","link":"https:\/\/todaysainews.com\/index.php\/2023\/11\/02\/stacking-our-way-to-more-general-robots-2\/","title":{"rendered":"Stacking our way to more general robots"},"content":{"rendered":"<div>\n<div class=\"article-cover article-cover--centered\">\n<div class=\"article-cover__header\">\n<p class=\"article-cover__eyebrow glue-label\">Research<\/p>\n<dl class=\"article-cover__meta\">\n<dt class=\"glue-visually-hidden\">Published<\/dt>\n<dd class=\"article-cover__date glue-label\">\n              <time datetime=\"2021-10-11\"><br \/>\n                11 October 2021<br \/>\n              <\/time>\n            <\/dd>\n<dt class=\"glue-visually-hidden\">Authors<\/dt>\n<dd class=\"article-cover__authors\">\n<p data-block-key=\"s9q9u\">The Robotics Team<\/p>\n<\/dd>\n<\/dl>\n<\/div>\n<picture class=\"article-cover__image\"><source media=\"(min-width: 1024px)\" type=\"image\/webp\" width=\"1072\" height=\"603\" srcset=\"https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w1072-h603-n-nu-rw 1x, https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w2144-h1206-n-nu-rw 2x\"\/><source media=\"(min-width: 
600px)\" type=\"image\/webp\" width=\"928\" height=\"522\" srcset=\"https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w928-h522-n-nu-rw 1x, https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w1856-h1044-n-nu-rw 2x\"\/><source type=\"image\/webp\" width=\"528\" height=\"297\" srcset=\"https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w528-h297-n-nu-rw 1x, https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w1056-h594-n-nu-rw 2x\"\/><img loading=\"lazy\" decoding=\"async\" alt=\"A collection of blocks on a grey surface. They are a mixture of shapes, coloured red, green, and blue. Some of the blocks are stacked on top of one another.\" height=\"603\" src=\"https:\/\/lh3.googleusercontent.com\/ng9RUj9KvUb6rxPBWW38UqQhc9M7r8fNM05ejTEYuByQtMvYvwd69YTDTaNaMm-V64HkDaYVt-T18MBbEzBCf24QhjSdfxoAs2fHW2Vc5s43Ez7CLA=w1072-h603-n-nu\" width=\"1072\"\/>\n    <\/picture>\n<\/p><\/div>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"eeoy9\"><b>Introducing RGB-Stacking as a new benchmark for vision-based robotic manipulation<\/b><\/p>\n<p data-block-key=\"1kzwj\">Picking up a stick and balancing it atop a log or stacking a pebble on a stone may seem like simple \u2014 and quite similar \u2014 actions for a person. However, most robots struggle with handling more than one such task at a time. Manipulating a stick requires a different set of behaviours than stacking stones, never mind piling various dishes on top of one another or assembling furniture. 
Before we can teach robots how to perform these kinds of tasks, they first need to learn how to interact with a far greater range of objects. As part of <a href=\"https:\/\/deepmind.com\/about\" rel=\"noopener\" target=\"_blank\">DeepMind\u2019s mission<\/a> and as a step toward making more generalisable and useful robots, we\u2019re exploring how to enable robots to better understand the interactions of objects with diverse geometries.<\/p>\n<\/div>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"bo14p\">In a paper to be presented at <a href=\"https:\/\/www.robot-learning.org\/\" rel=\"noopener\" target=\"_blank\">CoRL 2021<\/a> (Conference on Robot Learning) and available now as a preprint on <a href=\"https:\/\/openreview.net\/forum?id=U0Q8CrtBJxJ\" rel=\"noopener\" target=\"_blank\">OpenReview<\/a>, we introduce RGB-Stacking as a new benchmark for vision-based robotic manipulation. In this benchmark, a robot has to learn how to grasp different objects and balance them on top of one another. What sets our research apart from prior work is the diversity of objects used and the large number of empirical evaluations performed to validate our findings. Our results demonstrate that a combination of simulation and real-world data can be used to learn complex multi-object manipulation and suggest a strong baseline for the open problem of generalising to novel objects. To support other researchers, we\u2019re <a href=\"https:\/\/github.com\/deepmind\/rgb_stacking\" rel=\"noopener\" target=\"_blank\">open-sourcing<\/a> a version of our simulated environment, and releasing the <a href=\"https:\/\/github.com\/deepmind\/rgb_stacking\/tree\/main\/real_cell_documentation\" rel=\"noopener\" target=\"_blank\">designs<\/a> for building our real-robot RGB-stacking environment, along with the RGB-object models and information for 3D printing them. 
We are also open-sourcing <a href=\"https:\/\/github.com\/deepmind\/dm_robotics\" rel=\"noopener\" target=\"_blank\">a collection of libraries and tools<\/a> used in our robotics research more broadly.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"ichuv\">RGB-Stacking benchmark<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"texv8\">With RGB-Stacking, our goal is to train a robotic arm via reinforcement learning to stack objects of different shapes. We place a parallel gripper attached to a robot arm above a basket, and three objects in the basket \u2014 one red, one green, and one blue, hence the name RGB. The task is simple: stack the red object on top of the blue object within 20 seconds, while the green object serves as an obstacle and distraction. The learning process ensures that the agent acquires generalised skills through training on multiple object sets. We intentionally vary the grasp and stack affordances \u2014 the qualities that define how the agent can grasp and stack each object. This design principle forces the agent to exhibit behaviours that go beyond a simple pick-and-place strategy.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"khayj\">Each triplet poses its own unique challenges to the agent: Triplet 1 requires a precise grasp of the top object; Triplet 2 often requires the top object to be used as a tool to flip the bottom object before stacking; Triplet 3 requires balancing; Triplet 4 requires precision stacking (i.e., the object centroids need to align); and the top object of Triplet 5 can easily roll off if not stacked gently. 
In assessing the challenges of this task, we found that our hand-coded scripted baseline had a 51% success rate at stacking.<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"sjbmu\">Our RGB-Stacking benchmark includes two task versions with different levels of difficulty. In \u201cSkill Mastery,\u201d our goal is to train a single agent that\u2019s skilled in stacking a predefined set of five triplets. In \u201cSkill Generalisation,\u201d we use the same triplets for evaluation, but train the agent on a large set of training objects \u2014 totalling more than a million possible triplets. To test for generalisation, these training objects exclude the family of objects from which the test triplets were chosen. In both versions, we decouple our learning pipeline into three stages:<\/p>\n<ul>\n<li data-block-key=\"5wfhf\">First, we train in simulation using an off-the-shelf RL algorithm: <a href=\"https:\/\/arxiv.org\/abs\/1806.06920\" rel=\"noopener\" target=\"_blank\">Maximum a Posteriori Policy Optimisation (MPO)<\/a>. At this stage, we use the simulator\u2019s state, allowing for fast training since the object positions are given directly to the agent instead of the agent needing to learn to find the objects in images. The resulting policy is not directly transferable to the real robot since this information is not available in the real world.<\/li>\n<li data-block-key=\"1kdm6\">Next, we train a new policy in simulation that uses only realistic observations: images and the robot\u2019s proprioceptive state. We use a domain-randomised simulation to improve transfer to real-world images and dynamics. 
The state policy serves as a teacher, providing the learning agent with corrections to its behaviours, and those corrections are distilled into the new policy.<\/li>\n<li data-block-key=\"vibs8\">Lastly, we collect data using this policy on real robots and train an improved policy from this data offline by upweighting good transitions based on a learned Q function, as done in <a href=\"https:\/\/arxiv.org\/abs\/2006.15134\" rel=\"noopener\" target=\"_blank\">Critic Regularised Regression (CRR)<\/a>. This allows us to use the data that\u2019s passively collected during the project instead of running a time-consuming online training algorithm on the real robots.<\/li>\n<\/ul>\n<p data-block-key=\"rcty6\">Decoupling our learning pipeline in such a way proves crucial for two main reasons. Firstly, it allows us to solve the problem at all, since training from scratch directly on the real robots would simply take too long. Secondly, it increases our research velocity, since different people in our team can work on different parts of the pipeline before we combine these changes for an overall improvement.<\/p>\n<\/div>\n<figure class=\"single-media single-media--inline\"><figcaption class=\"single-media__caption\">\n<p data-block-key=\"59vfa\">Our agent shows novel behaviours for stacking the 5 triplets. The strongest result with Skill Mastery was a vision-based agent that achieved 79% average success in simulation (Stage 2), 68% zero-shot success on real robots (Stage 2), and 82% after the one-step policy improvement from real data (Stage 3). The same pipeline for Skill Generalisation resulted in a final agent that achieved 54% success on real robots (Stage 3). 
Closing this gap between Skill Mastery and Skill Generalisation remains an open challenge.<\/p>\n<\/figcaption><\/figure>\n<div class=\"gdm-rich-text rich-text\">\n<p data-block-key=\"z7xlz\">In recent years, there has been much work on applying learning algorithms to difficult real-robot manipulation problems at scale, but the focus of such work has largely been on tasks such as grasping, pushing, or other forms of manipulating single objects. The approach to RGB-Stacking we describe in our paper, accompanied by <a href=\"https:\/\/github.com\/deepmind\/rgb_stacking\" rel=\"noopener\" target=\"_blank\">our robotics resources now available on GitHub<\/a>, results in surprising stacking strategies and mastery of stacking a subset of these objects. Still, this step only scratches the surface of what\u2019s possible, and the generalisation challenge is not yet fully solved. As researchers keep working to solve the open challenge of true generalisation in robotics, we hope this new benchmark, along with the environment, designs, and tools we have released, contributes to new ideas and methods that can make manipulation even easier and robots more capable.<\/p>\n<\/div>\n<\/div>\n<p><a href=\"https:\/\/deepmind.google\/discover\/blog\/stacking-our-way-to-more-general-robots\/\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Research Published 11 October 2021 Authors The Robotics Team Introducing RGB-Stacking as a new benchmark for 
vision-based<\/p>\n","protected":false},"author":2,"featured_media":881,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[],"class_list":["post-880","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-deepmind-ai"],"_links":{"self":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/880","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/comments?post=880"}],"version-history":[{"count":1,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/880\/revisions"}],"predecessor-version":[{"id":2614,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/880\/revisions\/2614"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media\/881"}],"wp:attachment":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media?parent=880"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/categories?post=880"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/tags?post=880"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}