GH-45644: [Doc][Python] Document timezone loss when converting timestamp arrays to NumPy#49843

Open
alex-anast wants to merge 2 commits into apache:main from alex-anast:alex-anast/gh-45644-doc-python-numpy-tz

@alex-anast alex-anast commented Apr 22, 2026

Rationale for this change

NumPy's datetime64 type does not support timezones. When converting a timezone-aware Arrow timestamp array to NumPy via to_numpy(), the timezone information is silently dropped. This behaviour is expected but undocumented, which can surprise users (see #45644).

What changes are included in this PR?

Adds a "Timezone-aware Timestamps" subsection to docs/source/python/numpy.rst that:

  • Explains the timezone loss when calling to_numpy() on tz-aware timestamp arrays
  • Shows a code example demonstrating the behavior
  • Documents two alternatives: to_pandas() for tz-aware Series, and to_pylist() for Python datetime objects with tzinfo

Are these changes tested?

Documentation-only change. All code examples were verified against pyarrow 24.0.0 and sphinx-lint passes clean.

Are there any user-facing changes?

No behaviour changes. This adds documentation for existing behaviour.

AI-generated code disclosure

This PR was developed with assistance from an AI coding tool (Claude, Anthropic). All changes have been reviewed, understood, and verified.

@github-actions github-actions Bot added the awaiting review Awaiting review label Apr 22, 2026

⚠️ GitHub issue #45644 has been automatically assigned in GitHub to PR creator.

@AlenkaF AlenkaF left a comment

Thank you for the PR, the change looks good to me!
One ask I have is to include the caveat when using to_pandas() in the case of nested types described in #41162 (works for structs and maps, not for lists; unions and list views would need to be checked).

@alex-anast alex-anast force-pushed the alex-anast/gh-45644-doc-python-numpy-tz branch from 956b6a6 to 34e5fad Compare April 23, 2026 15:24
@alex-anast alex-anast (Author) commented

Thanks for the review, @AlenkaF! I've added a .. note:: block under the to_pandas() alternative documenting the nested types caveat.

Unrelated, but the most recent commit also fixed the Sphinx doctest failures -- the >>> examples were being picked up by pytest --doctest-glob and failing due to numpy output line-wrapping differences, so I added # doctest: +SKIP to those examples. Please let me know if there's a better way.

AlenkaF commented Apr 23, 2026

> Unrelated, but the most recent commit also fixed the Sphinx doctest failures -- the >>> examples were being picked up by pytest --doctest-glob and failing due to numpy output line-wrapping differences, so I added # doctest: +SKIP to those examples. Please let me know if there's a better way.

I would use +SKIP as little as possible and I would make changes only on the lines that are failing. Also, I like to use ELLIPSIS (...) where possible so you can still check the whole first line before the line break. If that makes sense?
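For reference, the ELLIPSIS suggestion might look something like the sketch below. It is not taken from the PR: the sample array and the `EXAMPLE` string are assumptions, and the standard-library `doctest` runner is used here only to show that the first output line is still checked while the wrapped remainder is ignored.

```python
import doctest

# A doctest whose expected output pins the first element and leaves the rest,
# including the wrapped dtype line, to the ELLIPSIS marker.
EXAMPLE = '''
>>> import numpy as np
>>> np.array(["2024-01-01T12:00", "2024-01-02T12:00"],
...          dtype="datetime64[us]")  # doctest: +ELLIPSIS
array(['2024-01-01T12:00:00...
'''

parser = doctest.DocTestParser()
test = parser.get_doctest(EXAMPLE, {}, "ellipsis_example", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

Unlike `+SKIP`, the example is still executed, so a change in the leading part of the output would be reported as a failure.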

Development

Successfully merging this pull request may close these issues.

[Doc][Python] Timestamp with tz loses its time zone after to_numpy

2 participants