Skip to content

ci: Remove duckdb nightly from cov#3015

Merged
dangotbanned merged 4 commits intomainfrom
fix-sqlframe-collect-duckdb-nightly
Aug 20, 2025
Merged

ci: Remove duckdb nightly from cov#3015
dangotbanned merged 4 commits intomainfrom
fix-sqlframe-collect-duckdb-nightly

Conversation

@dangotbanned
Copy link
Copy Markdown
Member

@dangotbanned dangotbanned commented Aug 20, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Important

Originally named fix: Handle pa.RecordBatchReader in SparkLikeLazyFrame.collect

@dangotbanned dangotbanned added duckdb Issue is related to duckdb backend sqlframe Issue is related to sqlframe backend fix labels Aug 20, 2025
@dangotbanned dangotbanned marked this pull request as ready for review August 20, 2025 12:28
@dangotbanned
Copy link
Copy Markdown
Member Author

dangotbanned commented Aug 20, 2025

Not really sure what to do about this new failure
(https://github.com/narwhals-dev/narwhals/actions/runs/17098212889/job/48487645716?pr=3015)

FAILED tests/expr_and_series/reduction_test.py::test_empty_scalar_reduction_with_columns[sqlframe] - 
 NotImplementedError: This engine does not support schema inference likely since it does not have an active connection.

I've chased down (e6b21ca) already, but this is all stems from a duckdb nightly issue that hasn't been handled yet in sqlframe

@MarcoGorelli do we still need to run duckdb nightly for coverage?
Every other package seems to have the nightly install only in the nightly workflow 🤔

@MarcoGorelli
Copy link
Copy Markdown
Member

@MarcoGorelli do we still need to run duckdb nightly for coverage?

we don't need it anymore 👍

Copy link
Copy Markdown
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dangotbanned
Copy link
Copy Markdown
Member Author

dangotbanned commented Aug 20, 2025

@MarcoGorelli do we still need to run duckdb nightly for coverage?

we don't need it anymore 👍

Ahhh, in that case should we just take the simpler route and edit the workflows instead? 😂

Edit

And raise an issue in sqlframe to let them know their API may break soon

@MarcoGorelli
Copy link
Copy Markdown
Member

seems fine to merge, we'll need this fix in place eventually anyway, no?

@dangotbanned
Copy link
Copy Markdown
Member Author

dangotbanned commented Aug 20, 2025

seems fine to merge, we'll need this fix in place eventually anyway, no?

We might need it, but I'm not 100% sure

https://github.com/eakmanrq/sqlframe/blob/0245f44bbe118067777a5cef3c904666db4993ff/sqlframe/duckdb/dataframe.py#L55-L65

Unless sqlframe also wants to break backwards-compatibility here:

    @t.overload
    def toArrow(self) -> ArrowTable: ...

    @t.overload
    def toArrow(self, batch_size: int) -> RecordBatchReader: ...

    def toArrow(self, batch_size: t.Optional[int] = None) -> t.Union[ArrowTable, RecordBatchReader]:
        self._collect(skip_rows=True)
        if not batch_size:
            return self.session._last_result.arrow()
        return self.session._last_result.fetch_record_batch(batch_size)

They still have the option mentioned in the duckdb PR:

We also have a dedicated function to explicitly return an arrow table with fetch_arrow_table(), and a record batch reader with fetch_record_batch.

I feel like there's a reasonable chance that arrow() call will end up being replaced, WDYT?

@MarcoGorelli
Copy link
Copy Markdown
Member

sure, could be

let's just remove the nightly install from this job then?

@dangotbanned
Copy link
Copy Markdown
Member Author

dangotbanned commented Aug 20, 2025

sure, could be

let's just remove the nightly install from this job then?

Sounds good, also pinging @eakmanrq!

Edit

Following this duckdb PR, toArrow always returns RecordBatchReader.

@dangotbanned dangotbanned marked this pull request as draft August 20, 2025 16:41
Copy link
Copy Markdown
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feeel free to merge once coverage is addressed

@dangotbanned dangotbanned changed the title fix: Handle pa.RecordBatchReader in SparkLikeLazyFrame.collect ci: Remove duckdb nightly from cov Aug 20, 2025
@dangotbanned dangotbanned marked this pull request as ready for review August 20, 2025 16:52
@dangotbanned dangotbanned merged commit 1a67c16 into main Aug 20, 2025
33 of 34 checks passed
@dangotbanned dangotbanned deleted the fix-sqlframe-collect-duckdb-nightly branch August 20, 2025 16:54
dangotbanned added a commit that referenced this pull request Oct 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci duckdb Issue is related to duckdb backend fix sqlframe Issue is related to sqlframe backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants