Reproduced success rates are lower than reported on the RoboTwin benchmark #53

@ycsh15151

Description

I am currently trying to reproduce the evaluation results using the robotwin-posttrain checkpoint. I ran the evaluation using the provided launch_client.sh and launch_server.sh scripts. I have not modified any code or configuration files.

However, I noticed that the success rates I'm getting are consistently lower than those reported in the paper. Specifically, under the demo_randomized setting, tasks reported to achieve 90%+ success in the paper only reach around 70% in my local tests.

Since I am using the default configs and checkpoint, could you help me analyze the potential reasons for this performance drop?
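One way to make the report more actionable is to check whether the gap exceeds ordinary evaluation noise. Below is a minimal sketch (my own, not part of the repo) that computes a normal-approximation 95% confidence interval for a success rate; the trial counts are hypothetical stand-ins for an actual evaluation run:

```python
import math

def binomial_ci(successes: int, trials: int, z: float = 1.96):
    """Normal-approximation 95% confidence interval for a success rate."""
    p = successes / trials
    half = z * math.sqrt(p * (1 - p) / trials)
    return max(0.0, p - half), min(1.0, p + half)

# Hypothetical numbers: 70 successes out of 100 rollouts locally,
# versus the paper's reported 90%.
lo, hi = binomial_ci(70, 100)
print(f"local 95% CI: [{lo:.2f}, {hi:.2f}]")
# If 0.90 falls outside this interval, the drop is unlikely to be
# evaluation noise alone and points at an environment/config difference.
```

If the reported figure lies outside the interval for several tasks, suspects like simulator version, asset/seed differences, or checkpoint mismatch become worth ruling out one by one.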
