A presenter enthusiastically explains the content of his research poster at a conference. Titled "Rapid Motor Adaptation for Robotic Manipulator Arms," the poster includes various sections detailing previous challenges, simulation training, and experiments related to the project. Diagrams, charts, and example images accompany the text, which is spread across three main phases of the study. The presenter, wearing a green shirt and a conference badge, engages with his audience, pointing towards specific details on the poster. The conference setting includes multiple poster displays under bright, structured ceiling lights, with the poster labeled as number 171.
Text transcribed from the image:
Rapid Motor Adaptation for Robotic Manipulator Arms
Yichao Liang¹², Kevin Ellis³, João F. Henriques²
¹Computational and Biological Learning Lab, University of Cambridge  ²Visual Geometry Group, University of Oxford  ³Cornell University
171
1. Motivation
Robot pick-and-place / manipulation of arbitrary objects, towards tool use and dexterous actions.
Handle variations in task configurations.
Cope with state estimation errors, external disturbances, model misspecification and drift.
Example objects and tasks: YCB objects dataset; ManiSkill2 faucet operation; EGAD objects (adversarial).
2. Prior work - challenges
Classic control: Inexact models, calibration.
Reinforcement learning: Sample-inefficient, damage.
Imitation learning: Requires human control.
3. Simulation training
• Photorealistic rendering is expensive.
• Difficult to calibrate.
• "Sim-to-real gap": there always remain unknown parameters of the real world (friction, masses, inertia tensors, ...).
4. Domain randomization
Randomize the simulation as much as possible - the model "expects anything".
→ Conservative policies (i.e. "walking on potentially shaky ground").
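As a concrete illustration of the domain-randomization idea above, here is a minimal Python sketch of per-episode parameter sampling. The parameter names, ranges, and the EnvParams/sample_env_params helpers are illustrative assumptions, not values from the poster.

```python
import random
from dataclasses import dataclass

@dataclass
class EnvParams:
    """Privileged environment parameters e, resampled every episode (illustrative fields)."""
    object_mass: float   # kg
    friction: float      # coefficient of friction
    object_scale: float  # uniform scaling of the object mesh
    object_id: int       # index into the object dataset (e.g. YCB)

def sample_env_params(num_objects: int, rng: random.Random) -> EnvParams:
    """Randomize the simulation 'as much as possible' so the policy expects anything."""
    return EnvParams(
        object_mass=rng.uniform(0.05, 2.0),
        friction=rng.uniform(0.3, 1.2),
        object_scale=rng.uniform(0.8, 1.2),
        object_id=rng.randrange(num_objects),
    )

# Example: draw parameters for a few training episodes.
rng = random.Random(0)
for episode in range(3):
    e = sample_env_params(num_objects=78, rng=rng)  # 78 is a placeholder count
    print(episode, e)
```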
5. Rapid Motor Adaptation for Robotic Manipulator Arms (RMA²)
Phase 1: Simulation
1. POLICY TRAINING PHASE
[Phase 1 training diagram] Random environment parameters e (mass, friction, object identity, object size, ...). The policy π(x, z, g) outputs action a_t; the simulation returns state s_t and the reward. Train policy with domain randomization, conditional on privileged info of environment.
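A minimal PyTorch sketch of the Phase 1 setup the diagram describes: an environment encoder maps the privileged parameters e to an embedding z, and the policy π consumes observations, goal, and z. Class names, layer sizes, and input dimensions are assumptions for illustration; the poster's encoder also appears to take the state, which is omitted here for brevity.

```python
import torch
import torch.nn as nn

class EnvEncoder(nn.Module):
    """mu: privileged environment parameters e -> environment embedding z (illustrative sizes)."""
    def __init__(self, e_dim: int, z_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(e_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))

    def forward(self, e: torch.Tensor) -> torch.Tensor:
        return self.net(e)

class Policy(nn.Module):
    """pi(x, z, g): agent/object observations x, goal g, env embedding z -> action a."""
    def __init__(self, x_dim: int, g_dim: int, z_dim: int, a_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + g_dim + z_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, a_dim), nn.Tanh(),  # actions squashed to [-1, 1]
        )

    def forward(self, x: torch.Tensor, g: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, g, z], dim=-1))

# Phase 1: train pi with RL under domain randomization, conditioned on the
# privileged embedding z = mu(e). Dimensions below are placeholders.
mu = EnvEncoder(e_dim=6)
pi = Policy(x_dim=32, g_dim=3, z_dim=8, a_dim=7)
x, g, e = torch.randn(1, 32), torch.randn(1, 3), torch.randn(1, 6)
action = pi(x, g, mu(e))
```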
Object-manipulation-specific components:
Learnable embeddings for object identity and category, as env. parameters.
Environment embedding predictor (adapter) conditioned on depth image.
→ Creates a strong proxy for object geometry, focused on visuomotor control.
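A sketch of how learnable object-identity and category embeddings could be appended to the privileged environment parameters, as the bullet above describes; the embedding sizes and the concatenation layout are assumptions.

```python
import torch
import torch.nn as nn

class ObjectEmbeddings(nn.Module):
    """Learnable embeddings for object identity and category, appended to the
    privileged environment parameters e during Phase 1 training (illustrative)."""
    def __init__(self, num_objects: int, num_categories: int, dim: int = 16):
        super().__init__()
        self.identity = nn.Embedding(num_objects, dim)
        self.category = nn.Embedding(num_categories, dim)

    def forward(self, obj_id: torch.Tensor, cat_id: torch.Tensor,
                physical_params: torch.Tensor) -> torch.Tensor:
        # e = [physical params | identity embedding | category embedding]
        return torch.cat(
            [physical_params, self.identity(obj_id), self.category(cat_id)], dim=-1)

emb = ObjectEmbeddings(num_objects=78, num_categories=10)
e = emb(torch.tensor([3]), torch.tensor([1]), torch.randn(1, 4))
print(e.shape)  # torch.Size([1, 36])
```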
6. Experiments
4 domains from the ManiSkill2 benchmark (SAPIEN simulator).
Increase environmental variations (and object shapes), observation noise, and external disturbances for generalization testing.
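A sketch of the kind of evaluation-time perturbations mentioned here (observation noise and external disturbances); the noise scale and the force-sampling helper are assumptions, since the poster does not give these details.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_observation(obs: np.ndarray, noise_std: float = 0.01) -> np.ndarray:
    """Simulate state-estimation error by adding Gaussian noise to observations."""
    return obs + rng.normal(0.0, noise_std, size=obs.shape)

def sample_disturbance_force(max_newtons: float = 2.0) -> np.ndarray:
    """Random external force (3D) to apply to the manipulated object."""
    direction = rng.normal(size=3)
    direction /= np.linalg.norm(direction)
    return direction * rng.uniform(0.0, max_newtons)

obs = np.zeros(32)            # placeholder observation vector
noisy_obs = perturb_observation(obs)
force = sample_disturbance_force()
```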
[Policy training diagram, continued] Agent obs. x, object obs. x, and goals g feed the policy, which outputs action a. The env. encoder μ(e, s_t) produces the environment embedding z; state s and the reward come from the simulation.
Pick and Place YCB Objects
2. ADAPTER TRAINING PHASE
[Phase 2 diagram] Learn to predict the privileged info from freely available observations: the adapter φ takes the obs. and action history together with features f_t from a depth CNN over the depth image d_t, and outputs the predicted embedding ẑ. It is trained with a regression loss against the target embedding, with the gradient stopped on the target. Agent obs. x, object obs. x, and goals g feed the policy, which outputs action a_t.
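A sketch of a Phase 2 adapter training step as the diagram suggests: regress the environment embedding from the observation-action history plus depth-image features, stopping the gradient on the target z. The network shapes, history length, and use of an MSE loss are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthCNN(nn.Module):
    """Small CNN over the depth image d_t -> feature vector f_t (illustrative)."""
    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, d: torch.Tensor) -> torch.Tensor:
        return self.fc(self.conv(d).flatten(1))

class Adapter(nn.Module):
    """phi: (flattened obs/action history, depth features) -> predicted embedding z_hat."""
    def __init__(self, hist_dim: int, feat_dim: int, z_dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hist_dim + feat_dim, 128), nn.ReLU(),
            nn.Linear(128, z_dim),
        )

    def forward(self, history: torch.Tensor, f: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([history, f], dim=-1))

# One supervised training step (Phase 2). Shapes are placeholders:
# history = flattened obs/action window, depth = 1x64x64 image.
depth_cnn, adapter = DepthCNN(), Adapter(hist_dim=50 * (32 + 7), feat_dim=32)
opt = torch.optim.Adam(list(depth_cnn.parameters()) + list(adapter.parameters()), lr=1e-3)

history = torch.randn(4, 50 * (32 + 7))
depth = torch.randn(4, 1, 64, 64)
z_target = torch.randn(4, 8)  # stands in for z from the frozen Phase 1 encoder

z_hat = adapter(history, depth_cnn(depth))
loss = F.mse_loss(z_hat, z_target.detach())  # stop gradient on the target
opt.zero_grad(); loss.backward(); opt.step()
```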
Depth Visualization
Faucet Turning
Peg Insertion
Extrapolation of Policies from YCB to EGAD Dataset