CITT Framework

Blind Assembly in Contact-Intensive Tight-Tolerance Tasks:

A Learning-based Two-Stage Framework

Bukun Son, Hyelim Choi, Jaemin Yoon, Dongun Lee

paper

Abstract

We present a two-stage framework that integrates a learning-based estimator and a controller, designed to address contact-intensive tasks. The estimator leverages a Bayesian particle filter with a mixture density network (MDN) structure, effectively handling non-injective issues arising from contact information. The controller combines a self-supervised and reinforcement learning (RL) approach, strategically dividing the low-level admittance controller's parameters into labelable and non-labelable categories, which are then trained accordingly. To further enhance accuracy and generalization performance, a transformer model is incorporated into the self-supervised learning component. The proposed framework is evaluated on the bolting task using an accurate real-time simulator and successfully transferred to an experimental environment.

Two-stage Assembly Framework

Data-driven contact pose estimator: mixture density network (MDN) based likelihood + particle filter
Learning-based assembly controller: labeled parameters → self-supervised learning / unlabeled parameters → RL

Contact Pose Estimator

Needs: Occlusion in visual feedback
Challenges: non-injectivity (multiple poses can produce the same contact wrench) + complex contact behaviors
Goal: estimates the pose of the target object (e.g., bolt) under a large initial error (both position and orientation) + calculates the estimation uncertainty
Methods: pose likelihood modeled by a mixture density network (MDN) + particle filter update
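
The MDN likelihood + particle filter update can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes the trained MDN has already mapped a measured contact wrench to the parameters of an isotropic Gaussian mixture over a 2D pose (`mix_weights`, `means`, `sigmas`), and all function names, the 2D pose, and the resampling threshold are hypothetical choices made for illustration.

```python
import numpy as np

def mixture_density(x, mix_weights, means, sigmas):
    """Evaluate an isotropic Gaussian mixture at points x of shape (N, D)."""
    x = np.atleast_2d(x)
    d = x.shape[1]
    diff = x[:, None, :] - means[None, :, :]            # (N, K, D)
    sq_dist = np.sum(diff ** 2, axis=2)                 # (N, K)
    norm = (2.0 * np.pi * sigmas ** 2) ** (-d / 2.0)    # (K,)
    comps = norm * np.exp(-0.5 * sq_dist / sigmas ** 2) # (N, K)
    return np.sum(mix_weights * comps, axis=1)          # (N,)

def particle_filter_update(particles, weights, mix_weights, means, sigmas, rng):
    """Reweight pose particles by the mixture likelihood, then resample if needed."""
    lik = mixture_density(particles, mix_weights, means, sigmas)
    w = weights * lik + 1e-300           # guard against all-zero weights
    w = w / w.sum()
    n = len(w)
    n_eff = 1.0 / np.sum(w ** 2)         # effective sample size
    if n_eff < 0.5 * n:                  # assumed threshold, not from the source
        idx = rng.choice(n, size=n, p=w) # multinomial resampling
        particles = particles[idx]
        w = np.full(n, 1.0 / n)
    return particles, w

# Two equally likely poses explain the same wrench (non-injectivity),
# so the mixture has two modes and the particles keep both hypotheses.
rng = np.random.default_rng(0)
particles = rng.uniform(-12.0, 12.0, size=(2000, 2))   # position error in mm
weights = np.full(2000, 1.0 / 2000)
mix_weights = np.array([0.5, 0.5])
means = np.array([[5.0, 0.0], [-5.0, 0.0]])
sigmas = np.array([1.0, 1.0])
particles, weights = particle_filter_update(
    particles, weights, mix_weights, means, sigmas, rng
)
```

A single-Gaussian likelihood would average the two hypotheses into a pose between them; the mixture keeps both until further contact measurements disambiguate, and the particle spread serves as an uncertainty estimate.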

Visual feedback occlusion

Example of multi-modality issue

Probabilistic prediction

Learning-based Assembly Controller

Goal: compensates for the residual error and completes the assembly tasks

⇒ optimizes the admittance controller parameters

Methods
- Self-supervised learning with transformer model → predicts the desired configuration
- Reinforcement learning → optimizes the variable admittance gain, modulation wrench, and rotation velocity
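
For orientation, the role of these parameters in a low-level admittance controller can be sketched as below. This is a minimal sketch assuming the textbook mass-damper-spring admittance law, M·a + D·v + K·(x − x_des) = f_ext + f_mod; the page does not state the exact control law, so the function name, the translational-only state, and the integration scheme are assumptions. Here `x_des` stands in for the desired configuration predicted by the self-supervised model, while `damping`, `stiffness`, and `f_mod` stand in for the admittance gain and modulation wrench that the RL policy would set (the rotation velocity is omitted).

```python
import numpy as np

def admittance_step(x, v, x_des, f_ext, f_mod, mass, damping, stiffness, dt):
    """One semi-implicit Euler step of M*a + D*v + K*(x - x_des) = f_ext + f_mod."""
    a = (f_ext + f_mod - damping * v - stiffness * (x - x_des)) / mass
    v_next = v + a * dt          # update velocity first
    x_next = x + v_next * dt     # then position, using the new velocity
    return x_next, v_next

# Free motion: with no contact force the pose settles at the desired configuration.
x, v = np.zeros(3), np.zeros(3)
x_des = np.array([0.01, 0.0, -0.02])     # metres, illustrative values
zero = np.zeros(3)
for _ in range(5000):                    # 5 s at dt = 1 ms
    x, v = admittance_step(x, v, x_des, zero, zero,
                           mass=1.0, damping=20.0, stiffness=100.0, dt=1e-3)
x_free = x.copy()

# Constant contact force: the pose yields by f_ext / K from the desired configuration.
x, v = np.zeros(3), np.zeros(3)
f_ext = np.array([10.0, 0.0, 0.0])       # newtons, illustrative value
for _ in range(5000):
    x, v = admittance_step(x, v, x_des, f_ext, zero,
                           mass=1.0, damping=20.0, stiffness=100.0, dt=1e-3)
x_contact = x.copy()
```

Lower stiffness makes the end effector yield more to contact forces, which helps avoid jamming, while the modulation wrench adds a commanded force on top of the measured one.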

Simulation (M48 bolt/nut)

Contact pose estimator
- Initial error: position [-12 mm, +12 mm], orientation [-10 deg, +10 deg]
- Mean estimation error: 1.30 mm, 2.38 deg (maximum: 3.95 mm, 4.98 deg)

Search motion

MDN-based particle filter example

Learning-based assembly controller
- Initial error: position [-5 mm, +5 mm], orientation [-5 deg, +5 deg] (chosen with a margin over the maximum estimation error)
- Baselines
  - Naive: nominal, random
  - Train desired pose: fully connected (FC), long short-term memory (LSTM)
  - Suggested method: transformer only, transformer + RL

Experiment

Contact pose estimator

Learning-based assembly controller
- Baseline (nominal trajectory) → succeeds when the initial error is small, but idles and jams when it is large

Suggested method
