@@ -1252,9 +1252,13 @@ it::
 
    >>> import urllib.request
    >>> with urllib.request.urlopen('https://www.python.org/') as f:
-   ...     print(f.read(300))
-   ...
-   b'<!doctype html>\n<!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 7]> <html class="no-js ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 8]> <html class="no-js ie8 lt-ie9">
+   ...     # The response may be compressed (for example, 'gzip').
+   ...     print(f.headers.get('Content-Encoding'))
+   ...     data = f.read()
+   ...     if f.headers.get('Content-Encoding') == 'gzip':
+   ...         import gzip
+   ...         data = gzip.decompress(data)
+   ...     print(data[:300].decode('utf-8', errors='replace'))
 
 Note that urlopen returns a bytes object. This is because there is no way
 for urlopen to automatically determine the encoding of the byte stream
@@ -1272,25 +1276,29 @@ As the python.org website uses *utf-8* encoding as specified in its meta tag, we
 will use the same for decoding the bytes object::
 
    >>> with urllib.request.urlopen('https://www.python.org/') as f:
-   ...     print(f.read(100).decode('utf-8'))
+   ...     # Check for compression and decode appropriately.
+   ...     enc = f.headers.get('Content-Encoding')
+   ...     data = f.read()
+   ...     if enc == 'gzip':
+   ...         import gzip
+   ...         data = gzip.decompress(data)
+   ...     print(data[:100].decode('utf-8', errors='replace'))
    ...
-   <!doctype html>
-   <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
-   <!-
 
 It is also possible to achieve the same result without using the
 :term:`context manager` approach::
 
    >>> import urllib.request
    >>> f = urllib.request.urlopen('https://www.python.org/')
    >>> try:
-   ...     print(f.read(100).decode('utf-8'))
+   ...     enc = f.headers.get('Content-Encoding')
+   ...     data = f.read()
+   ...     if enc == 'gzip':
+   ...         import gzip
+   ...         data = gzip.decompress(data)
+   ...     print(data[:100].decode('utf-8', errors='replace'))
    ... finally:
    ...     f.close()
-   ...
-   <!doctype html>
-   <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
-   <!--
 
 In the following example, we are sending a data-stream to the stdin of a CGI
 and reading the data it returns to us. Note that this example will only work
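The three examples in this diff repeat the same check-then-decompress step. As a rough sketch only, that step could be factored into a small helper. The name `decode_body` and its parameters are hypothetical and are not part of `urllib`; the sample body is built locally with `gzip.compress` so that the snippet runs without network access.

```python
import gzip


def decode_body(data, content_encoding=None, charset='utf-8'):
    """Return *data* as text, undoing gzip compression if it was applied.

    Hypothetical helper for illustration; it is not part of urllib.
    """
    if content_encoding == 'gzip':
        data = gzip.decompress(data)
    # Undecodable bytes are replaced rather than raising UnicodeDecodeError.
    return data.decode(charset, errors='replace')


# Simulate a gzip-compressed response body without touching the network.
raw = gzip.compress('<!doctype html>'.encode('utf-8'))
print(decode_body(raw, 'gzip'))
```

With a real response object `f`, the call would presumably be `decode_body(f.read(), f.headers.get('Content-Encoding'))`. Other content codings such as `deflate` or `br` are not handled in this sketch.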