Handle text-mode file objects in parse_wpc_surface_bulletin #4068

Open
gaoflow wants to merge 1 commit into Unidata:main from gaoflow:fix/wpc-bulletin-text-file-object

@gaoflow gaoflow commented May 29, 2026

Description

parse_wpc_surface_bulletin documents accepting a "str or file-like object", but it unconditionally calls .decode('utf-8') on the result of file.read(). open_as_needed opens filename paths in binary mode, but a file-like object passed in directly is returned untouched — so a text-mode object such as io.StringIO makes read() return str, and the .decode() call raises:

AttributeError: 'str' object has no attribute 'decode'
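
As a rough illustration of the root cause, the sketch below mimics the dispatch described above. `open_source` is a hypothetical, simplified stand-in and not MetPy's actual `open_as_needed`; it only assumes the behaviour stated here (paths opened in `'rb'`, file-like objects returned as-is).

```python
import os
import tempfile
from io import StringIO


def open_source(source):
    """Hypothetical, simplified stand-in for the path/file-object dispatch."""
    # Anything that already has read() is handed back untouched.
    if hasattr(source, 'read'):
        return source
    # Paths are opened in binary mode, so read() yields bytes.
    return open(source, 'rb')


# Write a tiny bulletin fragment to a temporary file to exercise the path branch.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('VALID 062818Z\n')
    path = tmp.name

with open_source(path) as fobj:
    from_path = fobj.read()
os.remove(path)

# A text-mode object takes the other branch and keeps returning str.
from_text_obj = open_source(StringIO('VALID 062818Z\n')).read()

print(type(from_path).__name__)      # bytes
print(type(from_text_obj).__name__)  # str
```

Only the `bytes` result has a `.decode()` method, which is why an unconditional decode fails on the text-mode branch.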

Fixes #3923.

Reproduction

from io import StringIO
from metpy.io import parse_wpc_surface_bulletin

sio = StringIO('''VALID 062818Z
HIGHS 1022 3961069 1020 3851069 1026 3750773
LOWS 1016 4510934 1002 3441145 1003 4271229
TROF 2971023 2831018 2691008
 ''')
parse_wpc_surface_bulletin(sio, year=2000)   # AttributeError before this PR

Fix

Only decode when read() returns bytes:

text = file.read()
if isinstance(text, bytes):
    text = text.decode('utf-8')

I went with an explicit isinstance(text, bytes) check rather than a broad try/except AttributeError (discussed in the issue): it handles exactly the two things the docstring promises — binary sources ('rb' file paths, BytesIO) and text sources (StringIO, text-mode files) — without swallowing unrelated AttributeErrors or revisiting the Python 2→3 byte-handling story more broadly.
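
To make the trade-off concrete, here is a minimal sketch comparing the two approaches. `read_text`, `read_text_broad`, and `BrokenReader` are hypothetical names invented for this illustration; the try/except variant is one plausible way such a handler could be written, not code taken from the issue or from MetPy.

```python
from io import BytesIO, StringIO


def read_text(file):
    """Hypothetical helper mirroring the isinstance-based fix."""
    text = file.read()
    # Decode only binary results; str passes through unchanged.
    if isinstance(text, bytes):
        text = text.decode('utf-8')
    return text


def read_text_broad(file):
    """Hypothetical try/except variant, shown only for comparison."""
    text = None
    try:
        text = file.read()
        text = text.decode('utf-8')
    except AttributeError:
        # Meant for the str case, but also hides AttributeErrors raised by read().
        pass
    return text


class BrokenReader:
    """File-like object whose read() has an unrelated bug."""

    def read(self):
        return self.no_such_attribute  # raises AttributeError


# Both helpers agree on well-behaved text and binary sources.
assert read_text(StringIO('VALID 062818Z')) == 'VALID 062818Z'
assert read_text(BytesIO(b'VALID 062818Z')) == 'VALID 062818Z'
assert read_text_broad(StringIO('VALID 062818Z')) == 'VALID 062818Z'

# The broad handler silently returns None for the buggy reader...
assert read_text_broad(BrokenReader()) is None

# ...while the isinstance version lets the real error surface.
try:
    read_text(BrokenReader())
except AttributeError as exc:
    print(f'surfaced: {exc}')
```

In this sketch the broad handler turns a genuine bug into a silent `None`, whereas the type check leaves unrelated errors visible to the caller.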

Tests / docs

  • Added test_parse_wpc_surface_bulletin_text_file_object, asserting a StringIO parses identically (pd.testing.assert_frame_equal) to the equivalent BytesIO.
  • Moved the misplaced year entry from the Returns section to Parameters in the docstring (it is a parameter, not a return value).

tests/io/test_text.py (all 5), the text.py doctest, and ruff pass locally.
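
The shape of the regression test can be sketched without MetPy installed. `toy_parse` below is a hypothetical stand-in parser, not `parse_wpc_surface_bulletin`, and plain equality stands in for `pd.testing.assert_frame_equal`; only the pattern (same content through a text source and a binary source, results compared) reflects the test described above.

```python
from io import BytesIO, StringIO

# Bulletin text taken from the reproduction in this PR.
BULLETIN = '''VALID 062818Z
HIGHS 1022 3961069 1020 3851069 1026 3750773
LOWS 1016 4510934 1002 3441145 1003 4271229
TROF 2971023 2831018 2691008
'''


def toy_parse(file):
    """Hypothetical stand-in parser: one (keyword, values) row per line."""
    text = file.read()
    # Same guard as the fix: decode only when the source was binary.
    if isinstance(text, bytes):
        text = text.decode('utf-8')
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if parts:
            rows.append((parts[0], parts[1:]))
    return rows


def test_text_and_binary_sources_parse_identically():
    from_text = toy_parse(StringIO(BULLETIN))
    from_bytes = toy_parse(BytesIO(BULLETIN.encode('utf-8')))
    # The two sources must yield the same parsed rows.
    assert from_text == from_bytes
    assert from_text[0] == ('VALID', ['062818Z'])


test_text_and_binary_sources_parse_identically()
```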

parse_wpc_surface_bulletin documents accepting a file-like object, but it
unconditionally called .decode('utf-8') on the result of file.read(). For a
text-mode object such as io.StringIO, read() returns str, so this raised
AttributeError: 'str' object has no attribute 'decode'.

Only decode when read() returns bytes, so both binary (file paths opened in
'rb', BytesIO) and text (StringIO, files opened in text mode) sources work as
the docstring promises. Add a regression test asserting a StringIO parses
identically to the equivalent BytesIO, and move the misplaced 'year' entry from
the Returns section to Parameters in the docstring.

Closes Unidata#3923
@gaoflow gaoflow requested a review from a team as a code owner May 29, 2026 09:24
@gaoflow gaoflow requested review from dcamron and removed request for a team May 29, 2026 09:24

Development

Successfully merging this pull request may close these issues.

parse_wpc_surface_bulletin does not handle StringIO object, results in AttributeError: 'str' object has no attribute 'decode'
