Learning Gentle Grasping Using Vision, Sound, and Touch

Type of the data
datacite.resourceTypeGeneral

Dataset

Total size of the dataset
datacite.size

4784971381 bytes (approx. 4.46 GB)

Author
dc.contributor.author

Nakahara, Ken

Author
dc.contributor.author

Calandra, Roberto

Upload date
dc.date.accessioned

2025-03-11T07:45:20Z

Availability date
dc.date.available

2025-03-11T07:45:20Z

Date of data creation
dc.date.created

2024

Publication date
dc.date.issued

2025-03-11

Abstract of the dataset
dc.description.abstract

This dataset contains 1,500 robotic grasps collected for the paper "Learning Gentle Grasping Using Vision, Sound, and Touch". Additionally, we provide a description of this dataset and Python scripts to visualize the data and to process the raw data into a training dataset for a PyTorch model. The robotic system used consists of a 16-DoF multi-fingered robotic hand (Allegro Hand v4.0), a 7-DoF robotic arm (xArm7), DIGIT tactile sensors, an RGB-D camera (Intel RealSense D435i), and a commodity microphone. The target object is a toy that emits sound when grasped strongly.
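As a rough illustration of how the raw grasps could be wrapped as a PyTorch training dataset, the sketch below loads one trial at a time. The directory layout, file names (rgb.png, digit_0.png, label.json), and label keys are assumptions made for illustration; the actual structure is documented in the bundled README.md and Python scripts.

# Illustrative sketch only: the directory layout, file names, and label format
# below are assumptions, not the actual structure shipped with the dataset.
import os
import json
import torch
from torch.utils.data import Dataset
from torchvision.io import read_image

class GentleGraspDataset(Dataset):
    """Loads one grasp trial: RGB image, a tactile image, and outcome labels."""

    def __init__(self, root):
        self.root = root
        self.trials = sorted(os.listdir(root))  # one sub-folder per grasp trial (assumed)

    def __len__(self):
        return len(self.trials)

    def __getitem__(self, idx):
        trial_dir = os.path.join(self.root, self.trials[idx])
        rgb = read_image(os.path.join(trial_dir, "rgb.png")).float() / 255.0        # assumed file name
        digit = read_image(os.path.join(trial_dir, "digit_0.png")).float() / 255.0  # assumed file name
        with open(os.path.join(trial_dir, "label.json")) as f:                      # assumed label file
            label = json.load(f)
        # Stability and gentleness targets (gentleness derived from the audio signal).
        target = torch.tensor([label["stable"], label["gentle"]], dtype=torch.float32)
        return {"rgb": rgb, "tactile": digit}, target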

Public reference to this page
dc.identifier.uri

https://opara.zih.tu-dresden.de/handle/123456789/1361

Public reference to this page
dc.identifier.uri

https://doi.org/10.25532/OPARA-787

Publisher
dc.publisher

Technische Universität Dresden

Licence
dc.rights

Attribution-NonCommercial-NoDerivatives 4.0 International

URI of the licence text
dc.rights.uri

http://creativecommons.org/licenses/by-nc-nd/4.0/

Specification of the discipline(s)
dc.subject.classification

4::44::409::409-05

Title of the dataset
dc.title

Learning Gentle Grasping Using Vision, Sound, and Touch

Software
opara.descriptionSoftware.ResourceProcessing

Python

Software
opara.descriptionSoftware.ResourceViewing

Python

Project abstract
opara.project.description

In our daily life, we often encounter objects that are fragile and can be damaged by excessive grasping force, such as fruits. For these objects, it is paramount to grasp gently – not using the maximum amount of force possible, but rather the minimum amount of force necessary. This paper proposes using visual, tactile, and auditory signals to learn to grasp and regrasp fragile objects stably and gently. Specifically, we use audio signals as an indicator of gentleness during grasping, and then train end-to-end an action-conditional model from raw visuo-tactile inputs that predicts both the stability and the gentleness of future grasping candidates, thus allowing the selection and execution of the most promising action. Experimental results on a multi-fingered hand over 1,500 grasping trials demonstrated that our model is useful for gentle grasping by validating its predictive performance (3.27% higher accuracy than the vision-only variant) and by providing interpretations of its behavior. Finally, real-world experiments confirmed that the grasping performance with the trained multi-modal model outperformed other baselines (17% higher rate for stable and gentle grasps than vision-only). Our approach requires neither tactile sensor calibration nor analytical force modeling, drastically reducing the engineering effort to grasp fragile objects.
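The sketch below is a minimal illustration of an action-conditional visuo-tactile model with two prediction heads (stability and gentleness), in the spirit of the description above. All layer sizes, the action dimensionality, and module names are assumptions for illustration; they do not reproduce the exact architecture from the paper.

# Minimal sketch of an action-conditional visuo-tactile model; all layer sizes,
# the action dimensionality, and module names are illustrative assumptions.
import torch
import torch.nn as nn

def conv_encoder(out_dim=128):
    # Small CNN mapping a 3xHxW image to a feature vector.
    return nn.Sequential(
        nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
        nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, out_dim), nn.ReLU(),
    )

class ActionConditionalGraspModel(nn.Module):
    def __init__(self, action_dim=16):  # e.g. target joint configuration of the hand (assumed)
        super().__init__()
        self.rgb_enc = conv_encoder()
        self.tactile_enc = conv_encoder()
        self.action_enc = nn.Sequential(nn.Linear(action_dim, 64), nn.ReLU())
        self.trunk = nn.Sequential(nn.Linear(128 + 128 + 64, 128), nn.ReLU())
        self.stability = nn.Linear(128, 1)   # P(grasp is stable)
        self.gentleness = nn.Linear(128, 1)  # P(grasp is gentle), supervised via the audio signal

    def forward(self, rgb, tactile, action):
        z = torch.cat([self.rgb_enc(rgb), self.tactile_enc(tactile), self.action_enc(action)], dim=-1)
        h = self.trunk(z)
        return torch.sigmoid(self.stability(h)), torch.sigmoid(self.gentleness(h))

# At grasp time, candidate actions can be scored and the best one executed, e.g.:
# p_stable, p_gentle = model(rgb, tactile, candidate_action); score = p_stable * p_gentle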

Public project website(s)
opara.project.publicReference

https://lasr.org/research/gentle-grasping

Project title
opara.project.title

Learning Gentle Grasping Using Vision, Sound, and Touch
Files

Original bundle (2 files)

Name: data_gentle_grasping.zip
Size: 4.46 GB
Format: Unknown data format

Name: README.md
Size: 3.37 KB
Format: Unknown data format

License bundle (1 file)

Name: license.txt
Size: 4.66 KB
Format: Item-specific license agreed to upon submission
Description: Attribution-NonCommercial-NoDerivatives 4.0 International