Skip to content

fix(download): use record IDs for all local data paths#167

Merged
tiborsimko merged 1 commit intocernopendata:masterfrom
tiborsimko:166-download-via-recid-and-xrootd
Dec 17, 2025
Merged

fix(download): use record IDs for all local data paths#167
tiborsimko merged 1 commit intocernopendata:masterfrom
tiborsimko:166-download-via-recid-and-xrootd

Conversation

@tiborsimko
Copy link
Member

When downloading files using --doi or --title parameters, the download directory was incorrectly named "None" instead of the actual record ID. This commit fixes the problem by dereferencing DOI/title into record ID and using that for storing downloaded files.

Closes #166

@tiborsimko tiborsimko self-assigned this Dec 5, 2025
tiborsimko added a commit to tiborsimko/cernopendata-client that referenced this pull request Dec 5, 2025
)

When downloading files using `--doi` or `--title` parameters, the
download directory was incorrectly named "None" instead of the actual
record ID. This commit fixes the problem by dereferencing DOI/title into
record ID and using that for storing downloaded files.

Closes cernopendata#166
@tiborsimko tiborsimko force-pushed the 166-download-via-recid-and-xrootd branch from 5b29910 to 1414b7a Compare December 5, 2025 13:36
@codecov
Copy link

codecov bot commented Dec 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.88%. Comparing base (7d14b38) to head (9b97f7c).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #167      +/-   ##
==========================================
+ Coverage   80.75%   80.88%   +0.13%     
==========================================
  Files          12       12              
  Lines         717      722       +5     
==========================================
+ Hits          579      584       +5     
  Misses        138      138              
Files with missing lines Coverage Δ
cernopendata_client/cli.py 92.97% <100.00%> (+0.19%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


total_files = len(download_file_locations)
path = str(recid)
path = str(record_recid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still needed to convert to str? I assumed that record_json["metadata"]["recid"] is already a string

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this should not be needed anymore. Amended.

@cernopendata_client.command()
@click.option("--recid", type=click.INT, help="Record ID")
@click.option("--doi", help="Digital Object Identifier")
@click.option("--title", help="Record title")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DOI and title, are there exact searches or wildcard? It might help to describe that in the option

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to all CLI options.

validate_recid(recid)

# Get record metadata and resolve recid from DOI/title if needed
record_json = get_record_as_json(server, recid, doi, title)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about checking that record_json is defined? Or does the function throw an exception if there are no matches?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the lookup does not allow to continue if there are no matches. It prints an error and sys-exits.

$ cernopendata-client download-files --recid 5500000
==> ERROR: The record ID number you supplied is not valid.

if os.path.isfile(test_file):
os.remove(test_file)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a test with an incorrect DOI?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a test for wrong DOI test_dry_run_from_doi_wrong() which does a dry run as well, but I guess that's OK and it should do.

tiborsimko added a commit to tiborsimko/cernopendata-client that referenced this pull request Dec 9, 2025
)

When downloading files using `--doi` or `--title` parameters, the
download directory was incorrectly named "None" instead of the actual
record ID. This commit fixes the problem by dereferencing DOI/title into
record ID and using that for storing downloaded files.

Closes cernopendata#166
@tiborsimko tiborsimko force-pushed the 166-download-via-recid-and-xrootd branch from 1414b7a to 013bf0d Compare December 9, 2025 15:03
)

When downloading files using `--doi` or `--title` parameters, the
download directory was incorrectly named "None" instead of the actual
record ID. This commit fixes the problem by dereferencing DOI/title into
record ID and using that for storing downloaded files.

Also enhances `--help` output to clarify that `--recid`, `--doi` and
`--title` options are triggering exact match with no wildcards.

Closes cernopendata#166
@tiborsimko tiborsimko force-pushed the 166-download-via-recid-and-xrootd branch from 013bf0d to 9b97f7c Compare December 9, 2025 15:13
@tiborsimko tiborsimko merged commit 9b97f7c into cernopendata:master Dec 17, 2025
24 checks passed
@tiborsimko tiborsimko deleted the 166-download-via-recid-and-xrootd branch December 17, 2025 06:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

download: improve logs and storage behaviour when a data is identified by record ID vs DOI

2 participants