Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #664 +/- ##
===========================================
- Coverage 94.64% 94.60% -0.04%
===========================================
Files 73 73
Lines 4759 4800 +41
===========================================
+ Hits 4504 4541 +37
- Misses 255 259 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
davidorme
left a comment
There was a problem hiding this comment.
This all looks good. I've requested a minor change on the failure modes.
One thing - I'm not 100% sure about how the try stack is a mix of CSV failure modes and Excel failure modes without it being clear which one is which. That seems harder to maintain. We could split them, but then you'd duplicate the FileNotFoundError, unless the code specifically tests for the file using Path().exists() first - but that's bad practice because the file could in theory vanish between the two calls.
Should we just have separate CSV and Excel reader functions? It's more verbose but feels cleaner. Thoughts?
| except Exception as err: | ||
| to_raise = Exception(f"Unidentified exception opening {file}: {err}") | ||
| LOGGER.critical(to_raise) | ||
| raise to_raise |
There was a problem hiding this comment.
I think it's worth explicitly adding BadZipError - it's by far the most likely thing to happen and we can effectively report it as "Does not appear to be an Excel file". The general Exception can then handle anything else. I'd add a comment on that exception to say that openpyxl has a whole bunch of assorted failure modes for corrupt/non-excel Zip files and we're not going to try and track down an exhaustive list.
You can then add an actual zip file of (e.g.) a simple text file to test that failure mode.
There was a problem hiding this comment.
I added a corrupted excel file which triggers that error - I think that should cover that failure case?
Yeah, it already pained me a bit to have the duplication from |
|
@davidorme I pushed some changes to split up the tests and now the builds are failing. It seems like it is a problem with the poetry version in |
|
@sallymatson No - that's not you. For reasons I don't understand, The only problem being Can you push a change to to - name: Install Poetry
uses: abatilo/actions-poetry@v3.0.2
with:
poetry-version: 1.5.1 |
|
Aha! It did exist briefly, and the maintainer pulled it. abatilo/actions-poetry#88 |
|
@sallymatson Is this ready for re-review? If so, can you re-request review? |
davidorme
left a comment
There was a problem hiding this comment.
LGTM. I guess I'd also add something to the load_excel that says at present the data is explicitly only loaded from the first sheet. We could fix that, but not now?
Description
Added a dataframe reader that uses pandas to handle csv and excel input. Wrote tests for the functionality of both. Added
openpyxlto poetry as a dependency for reading excel files.Fixes # (issue)
Type of change
Key checklist
pre-commitchecks:$ pre-commit run -a$ poetry run pytestFurther checks