Fix Python – Let JSON object accept bytes or let urlopen output strings

Question

Asked By – Peter Smit

With Python 3 I am requesting a JSON document from a URL.

response = urllib.request.urlopen(request)

The response object is a file-like object with read and readline methods. Normally a JSON object can be created with a file opened in text mode.

obj = json.load(fp)

What I would like to do is:

obj = json.load(response)

This however does not work as urlopen returns a file object in binary mode.

A workaround is, of course:

str_response = response.read().decode('utf-8')
obj = json.loads(str_response)

but this feels bad…

Is there a better way that I can transform a bytes file object to a string file object? Or am I missing any parameters for either urlopen or json.load to give an encoding?



Answer

HTTP sends bytes. If the resource in question is text, the character encoding is normally specified, either by the Content-Type HTTP header or by another mechanism (an RFC, HTML meta http-equiv,…).
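As a rough illustration of the Content-Type case, the charset parameter can be read from the response headers and used to decode the body. This is only a sketch: `decode_body` and its `default` fallback are hypothetical names chosen for the example, and falling back to UTF-8 when no charset is declared is an assumption, not something the header guarantees.

```python
from email.message import Message


def decode_body(body: bytes, headers: Message, default: str = "utf-8") -> str:
    """Decode raw response bytes using the charset from Content-Type.

    `headers` is expected to behave like email.message.Message, which is
    what urlopen() responses expose as `response.headers`.
    `default` is an assumed fallback for when no charset is declared.
    """
    # get_content_charset() returns None if the header or parameter is absent.
    charset = headers.get_content_charset() or default
    return body.decode(charset)
```

With a real response this would be called as `decode_body(response.read(), response.headers)`, and the resulting string can then be passed to `json.loads`.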

urllib should know how to decode the bytes to a string, but it’s too naïve; it’s a horribly underpowered and un-Pythonic library.

Dive Into Python 3 gives an overview of the situation.

Your “work-around” is fine—although it feels wrong, it’s the correct way to do it.
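If reading and decoding by hand feels clumsy, one alternative is to wrap the binary file object in a decoding reader, so that `json.load` sees text. The sketch below uses the standard `codecs.getreader`; the helper name `load_json_stream` is hypothetical, and UTF-8 is an assumed encoding that should be replaced by whatever the server actually declares.

```python
import codecs
import io
import json


def load_json_stream(binary_fp, encoding="utf-8"):
    """Parse JSON from a binary file-like object such as a urlopen() response.

    `encoding` defaults to UTF-8 here as an assumption; pass the charset
    declared by the server if it differs.
    """
    # codecs.getreader returns a StreamReader class that decodes on read().
    reader = codecs.getreader(encoding)(binary_fp)
    return json.load(reader)


# Stand-in for a network response, so the example runs offline.
fake_response = io.BytesIO('{"name": "Zoë"}'.encode("utf-8"))
obj = load_json_stream(fake_response)
```

On Python 3.6 and later, `json.loads` also accepts `bytes` directly (assuming UTF-8, UTF-16 or UTF-32), so `json.load(response)` works there without any wrapper.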

Answered By – Humphrey Bogart

This answer was collected from Stack Overflow and reviewed by FixPython community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.