Merged
Conversation
dangotbanned added a commit that referenced this pull request on Jan 25, 2025
This should have been done during #671 (`point.json`). The diff on `income.json` looks like it just removes a newline character?
dsmedia added a commit that referenced this pull request on Feb 2, 2025
* feat: adds generation script for income.json
* style: format income.py with ruff
* refactor: convert lambda sort key to named function

  Replace the lambda sort key in process_state_records with a named get_state_income_sort_key function for better readability and maintainability. This makes the sorting logic more explicit and follows Python's guidance on avoiding complex lambdas.
* ci(typing): Include `income.py` for `pyright`
* fix: Avoid `CRLF` on `win32` (c8f3056, #653)
* feat(typing): Utilize typing some more
  - the shared `group` field is now hinted in all 4 places it is used, including as a key function
  - added `Region` to indicate only 5 unique values
  - changed global constants from `dict` to `Mapping` to reflect that they are not mutated
  - changed the `AggregatedIncomeGroup` field to required; otherwise it is exactly `BaseIncomeGroup`, and the annotation already reflects `BaseIncomeGroup | AggregatedIncomeGroup`
* build: run `build_datapackage.py`

  This should have been done during #671 (`point.json`). The diff on `income.json` looks like it just removes a newline character.

---------

Co-authored-by: dangotbanned <125183946+dangotbanned@users.noreply.github.com>
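The lambda-to-named-key refactor, the `Mapping` hint on an unmutated constant, and the `newline` fix described in this commit message can be sketched as follows. This is an illustrative sketch only: the field names, record shapes, and `REGION_BY_STATE` constant are assumptions, not the actual `income.py` code.

```python
import json
from collections.abc import Mapping

# Module-level constant hinted as Mapping to signal it is never mutated
# (illustrative values; the real mapping lives in income.py).
REGION_BY_STATE: Mapping[str, str] = {"VT": "Northeast", "WA": "West"}


def get_state_income_sort_key(record: dict) -> tuple[str, str]:
    """Named sort key, replacing an inline lambda for readability."""
    return record["state"], record["group"]


records = [
    {"state": "WA", "group": "<10000"},
    {"state": "VT", "group": "<10000"},
]
records.sort(key=get_state_income_sort_key)

# Passing newline="\n" stops Python from translating "\n" to "\r\n" on
# Windows (win32), keeping the generated JSON byte-identical across platforms.
with open("income.json", "w", encoding="utf-8", newline="\n") as f:
    json.dump(records, f, indent=2)
    f.write("\n")
```

The named key function also gives `pyright` a concrete signature to check, which an inline lambda would not.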
dsmedia added a commit that referenced this pull request on Feb 2, 2025
* docs: add sources and license for 7zip resource

  Update datasets.toml with missing source metadata for the 7zip.png dataset
* chore: uvx taplo fmt
* docs: add sources and license for ffox.png
* docs: updates zipcodes.csv resource in datapackage_additions.toml
* update world-110m.json
* docs: updates us-10m.json
* docs: updates wheat.json
  - adds citation to Protovis in description
  - fixes link to image in sources
  - adds license
* docs: adds missing license data to several
  - fixes bad link in annual-precip.json; adds license
  - adds license to birdstrikes.csv, budget.json, burtin.json, and cars.json
* docs: update metadata for co2-concentration.csv
  - expands description to explain units and seasonal adjustment
  - adds additional source directly to dataset csv
  - adds license details from source
* docs: adds license to crimea.json metadata
* docs: update metadata for earthquakes.json
  - expands description
  - adds license
* docs: complete metadata for flights* datasets
  - documents that data used in flights* datasets are collected under US DOT requirements
  - adds row counts to flight dataset descriptions (2k-3M rows)
  - notes regulatory basis (14 CFR Part 234) while acknowledging unclear license terms
* docs: updates london dataset metadata
  - adds license for londonBoroughs.json
  - adds sources and license for londonCentroids.json (itself derived from londonBoroughs.json)
  - expands description, corrects source URL, updates source title, and adds license for londonTubeLines.json
* docs: adds government and IPUMS license metadata to several
  - global-temp.csv
  - iowa-electricity.csv
  - jobs.json
  - monarchs.json
  - political-contributions.json (also updates link to the FEC GitHub; note that the FEC provides an explicit underlying license)
  - population_engineers_hurricanes.csv
  - seattle-weather-hourly-normals.csv
  - seattle-weather.csv
  - unemployment-across-industries.json
  - unemployment.tsv
  - us-employment.csv
  - weather.csv

  Note that many pages hosting US government datasets do not explicitly grant a license. As a result, when there is doubt, a link is provided to the USA government works page, which explains the nuances of licenses for data on US government web sites.
* docs: adds 'undetermined' licenses and sources
  - adds license (football.json, la-riots.csv, penguins.json, platformer-terrain.json, population.json, sp500-2000.csv, sp500.csv, volcano.json)
  - airports.csv (adds description, sources, license)
  - barley.csv (updates description and source; adds license)
  - disasters.csv (expands description, updates sources, adds license)
  - driving.json (adds description, updates source, adds license)
  - ohlc.json (modifies description, adds additional source and license)
  - stocks.csv (adds source, license)
  - weekly-weather.json (adds source, license)
  - windvectors.csv (adds source, license)
* docs: completes anscombe.json metadata
  - updates description, adds sources and license
* docs: adds budgets.json metadata
  - adds description, source, and license
  - makes the license title "U.S. Government Datasets" consistent for cases where specific license terms are undetermined
* docs: adds basic metadata to flare*.json datasets
  - focuses on how the data is used in the edge-bundling example
  - would benefit from additional detail in the description
* docs: completes flights-airport.csv metadata
  - corrects description, adds source and license
* docs: update several file metadata entries
  - ffox.png (updates license)
  - gapminder.json (adds license)
  - gimp.png (updates description, adds source, license)
  - github.csv (adds description, source, license)
  - lookup_groups.csv, lookup_people.csv (adds description, source, license)
  - miserables.json (adds description, source, license)
  - movies.json (adds source, license)
  - normal-2d.json (adds description, source, license)
  - stocks.csv (adds description)
* docs: adds us-state-capitals.json metadata
  - related to #668
* docs: adds uniform-2d.json metadata
* docs: adds obesity.json metadata
* docs: remove points.json metadata
  - dataset was removed from the repo in #671
* docs: adds metadata for income.json
  - relies on the income.py script from #672
* docs: adds metadata for udistrict.json
* docs: adds, fixes metadata
  - adds description and sources for sp500.csv
  - fixes formatting for weekly-weather.json
* docs: updates datapackage
  - uv run scripts/build_datapackage.py # doctest: +SKIP
* docs: begins to recast in PEP 257 style
  - partial fix for #663 (comment)
  - edits descriptions through earthquakes.json
* docs: recasts all in PEP 257 format
  - avoids 'this dataset' and similar
  - reruns datapackage script (json, md)
* fix: corrects year in description of obesity.json
  - a new source was found confirming the data shown is from 1995, not 2008, consistent with CDC data
  - removes link to the Vega example that references the wrong source year
* fix: Use correct heading level in `burtin.json`

  Drive-by fix; this has really been bugging me because it breaks the flow of the navigation
* fix: remove extra space

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* fix: remove extra space from source

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* docs: add column schema to normal-2d.json metadata

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* reformats revision note for monarchs.json

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* fix: typo in monarchs.json metadata
* adjust markdown in anscombe.json

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* adjust punctuation in anscombe.json
* adds column schema for budgets.json, penguins.json
  - runs build_datapackage.py to verify
* docs: removes 'undetermined' source and license info
  - source and license can be clarified in a future PR
* fix: correct lookup example url
* docs: moves gapminder clusters to schema
* update file markdown in flare.json metadata

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* docs: adjust markdown in flare-dependencies.json metadata

  Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
* docs: reformats driving.json metadata
* fix formatting
* adjustments to schemas
  - github.csv: move time range to schema
  - add categories to schema in seattle-weather.csv
  - sp500.csv, udistrict.json, uniform-2d, weather.json: move description content into schema
  - reformat USGS disclaimer in us-state-capitals.json
  - rerun build_datapackage.py
* remove duplication in udistrict description
* uvx run scripts

---------

Co-authored-by: Dan Redding <125183946+dangotbanned@users.noreply.github.com>
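The metadata additions above can be pictured as entries of the following shape. This is a hypothetical sketch, not a copy of the repo's actual files: the resource name, description text, and URLs are invented for illustration, while the `sources` and `licenses` field names follow the Frictionless Data Package specification that `datapackage_additions.toml` appears to feed.

```toml
# Hypothetical resource entry (names, paths, and URLs are illustrative only)
[[resources]]
name = "airports"
description = "Major U.S. airports, with IATA code and coordinates."

[[resources.sources]]
title = "U.S. Government Works"
path = "https://www.usa.gov/government-works"

[[resources.licenses]]
title = "U.S. Government Datasets"
path = "https://www.usa.gov/government-works"
```

Keeping source and license metadata in a TOML overlay like this lets `build_datapackage.py` merge it into the generated `datapackage.json` without hand-editing the build output.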
closes #670