TyDiQA dataset

Thank you for proposing an interesting token-level data selection method. I ran into two issues while preparing the TyDiQA dataset for evaluation (for research purposes only):

1. **Dataset Download Issue**: I'm unable to download the TyDiQA dataset from the following links (see screenshot attached):
   - https://storage.googleapis.com/tydiqa/v1.1/tydiqa-goldp-v1.1-dev.json
   - https://storage.googleapis.com/tydiqa/v1.1/tydiqa-goldp-v1.1-train.json

<img width="554" height="56" alt="Image" src="https://github.com/user-attachments/assets/eddcbc4a-24ba-4a9a-aa1f-f5fba44ee77d" />
   
   Could you provide a working download link or share the dataset with me directly? My email: daishaojie96@gmail.com

2. **Metrics Clarification**: In Table 1 of the paper, are the results reported for HellaSwag, LogiQA, and ARC_challenge accuracy (`acc`) or normalized accuracy (`acc_norm`)?

Looking forward to your reply. Thank you!


