[Audio.transcribe] JsonDecodeError when printing vtt from m4a #243

@sheikheddy

Describe the bug

This section of the codebase tries to parse the response body as JSON even when response_format is not a JSON format (here, vtt):

https://github.com/openai/openai-python/blob/75c90a71e88e4194ce22c71edeb3d2dee7f6ac93/openai/api_requestor.py#L668C7-L673

I think I can contribute a quick bug fix PR today!
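
One possible shape for such a fix, as a minimal sketch only: decide whether to call json.loads based on the response's Content-Type header. The function name `interpret_body` is hypothetical and is not part of openai-python, and the assumption that the API labels non-JSON formats such as vtt with a non-JSON content type has not been verified here.

```python
import json
from typing import Any, Mapping, Union


def interpret_body(rbody: str, rheaders: Mapping[str, str]) -> Union[str, Any]:
    """Hypothetical helper: parse JSON only when the server says it sent JSON.

    Assumption: plain-text formats (vtt, srt, text) arrive with a
    Content-Type that does not contain "application/json".
    """
    content_type = rheaders.get("Content-Type", "")
    if "application/json" not in content_type.lower():
        # Not JSON: hand the raw body back untouched.
        return rbody
    return json.loads(rbody)
```

With this approach a VTT body would be returned as a plain string, while JSON responses would still be decoded as before.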

To Reproduce

  1. Open an m4a file in a Jupyter notebook (Python 3.10.10)
  2. Transcribe it with whisper-1, passing response_format="vtt"
  3. Print the transcript

Stack:
JSONDecodeError Traceback (most recent call last)
File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\openai\api_requestor.py:669, in APIRequestor._interpret_response_line(self, rbody, rcode, rheaders, stream)
668 try:
--> 669 data = json.loads(rbody)
670 except (JSONDecodeError, UnicodeDecodeError) as e:

File C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\json\__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\json\decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 """Return the Python representation of ``s`` (a ``str`` instance
    334 containing a JSON document).
    335 
    336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\json\decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
    354 except StopIteration as err:
--> 355     raise JSONDecodeError("Expecting value", s, err.value) from None
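
The final frame of the stack can be reproduced without calling the API at all. The sketch below feeds a made-up WebVTT string (an illustrative sample, not the real response) to json.loads, which is what happens to the vtt body inside _interpret_response_line:

```python
import json

# Made-up sample of a WebVTT body; the real transcript will differ.
vtt_body = "WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nHello world\n"

error = None
try:
    json.loads(vtt_body)
except json.JSONDecodeError as exc:
    # The very first character ("W") cannot start a JSON value.
    error = exc
    print(f"{type(exc).__name__}: {exc.msg} at position {exc.pos}")
```

Because the body starts with the literal text WEBVTT rather than a JSON value, decoding fails immediately, independent of the audio file used.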

Code snippets

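import openai

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("testing.m4a", "rb") as f:
    transcript = openai.Audio.transcribe("whisper-1", f, response_format="vtt")
print(transcript)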


OS

Windows 11

Python version

Python v3.10.10

Library version

openai-python 0.27.0
