Commit d6ae724

gh-146396: Improve tarfile documentation — unify examples, clarify TarInfo.size, add recipes
1 parent 476b649 commit d6ae724

1 file changed

Lines changed: 72 additions & 29 deletions

Doc/library/tarfile.rst

@@ -816,7 +816,9 @@ A ``TarInfo`` object has the following public data attributes:
 .. attribute:: TarInfo.size
    :type: int

-   Size in bytes.
+   Size of the archived file's data in bytes.
+   This is the size of the file data stored in the archive,
+   excluding the tar header blocks (which are typically 512 bytes each).


 .. attribute:: TarInfo.mtime
@@ -1374,9 +1376,8 @@ Reading examples
 How to extract an entire tar archive to the current working directory::

    import tarfile
-   tar = tarfile.open("sample.tar.gz")
-   tar.extractall(filter='data')
-   tar.close()
+   with tarfile.open("sample.tar.gz") as tar:
+       tar.extractall(filter='data')

 How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
 a generator function instead of a list::
@@ -1389,36 +1390,79 @@ a generator function instead of a list::
            if os.path.splitext(tarinfo.name)[1] == ".py":
                yield tarinfo

-   tar = tarfile.open("sample.tar.gz")
-   tar.extractall(members=py_files(tar))
-   tar.close()
+   with tarfile.open("sample.tar.gz") as tar:
+       tar.extractall(members=py_files(tar))

 How to read a gzip compressed tar archive and display some member information::

    import tarfile
-   tar = tarfile.open("sample.tar.gz", "r:gz")
-   for tarinfo in tar:
-       print(tarinfo.name, "is", tarinfo.size, "bytes in size and is ", end="")
-       if tarinfo.isreg():
-           print("a regular file.")
-       elif tarinfo.isdir():
-           print("a directory.")
-       else:
-           print("something else.")
-   tar.close()
+   with tarfile.open("sample.tar.gz", "r:gz") as tar:
+       for tarinfo in tar:
+           print(tarinfo.name, "is", tarinfo.size, "bytes in size and is ", end="")
+           if tarinfo.isreg():
+               print("a regular file.")
+           elif tarinfo.isdir():
+               print("a directory.")
+           else:
+               print("something else.")
+
+How to read a specific file from a tar archive into memory without
+extracting it to the filesystem, using :meth:`TarFile.extractfile`::

-Writing examples
-~~~~~~~~~~~~~~~~
+   import tarfile

-How to create an uncompressed tar archive from a list of filenames::
+   with tarfile.open("sample.tar.gz") as tar:
+       member = tar.getmember("README.txt")
+       f = tar.extractfile(member)
+       if f is not None:
+           content = f.read()
+           print(f"README.txt ({len(content)} bytes):")
+           print(content.decode("utf-8")[:200])
+       else:
+           print("README.txt is not a regular file")

-   import tarfile
-   tar = tarfile.open("sample.tar", "w")
-   for name in ["foo", "bar", "quux"]:
-       tar.add(name)
-   tar.close()
+How to iterate over all members and read their contents, handling
+non-file members (directories, symlinks, etc.) that return ``None``::
+
+   import tarfile
+
+   with tarfile.open("sample.tar.gz") as tar:
+       for member in tar.getmembers():
+           f = tar.extractfile(member)
+           if f is None:
+               print(f"{member.name} is not a regular file "
+                     f"(type code: {member.type})")
+           else:
+               print(f"{member.name}: {len(f.read())} bytes")
+
+How to handle errors when a member does not exist in the archive::
+
+   import tarfile
+
+   with tarfile.open("sample.tar.gz") as tar:
+       try:
+           f = tar.extractfile("nonexistent.txt")
+       except KeyError:
+           print("File not found in archive")
+
+How to stream-read a tar archive without seeking, using the ``'r|*'``
+stream mode. This is useful for large files or network streams where
+the archive is processed sequentially without random access::
+
+   import tarfile
+
+   with tarfile.open("large_archive.tar.gz", "r|gz") as tar:
+       for member in tar:
+           f = tar.extractfile(member)
+           if f is not None:
+               data = f.read(1024)  # read the first 1024 bytes
+               print(f"{member.name}: {len(data)} bytes read")
+
+Writing examples
+~~~~~~~~~~~~~~~~

-The same example using the :keyword:`with` statement::
+How to create an uncompressed tar archive from a list of filenames using
+the :keyword:`with` statement::

    import tarfile
    with tarfile.open("sample.tar", "w") as tar:
@@ -1443,9 +1487,8 @@ parameter in :meth:`TarFile.add`::
        tarinfo.uid = tarinfo.gid = 0
        tarinfo.uname = tarinfo.gname = "root"
        return tarinfo
-   tar = tarfile.open("sample.tar.gz", "w:gz")
-   tar.add("foo", filter=reset)
-   tar.close()
+   with tarfile.open("sample.tar.gz", "w:gz") as tar:
+       tar.add("foo", filter=reset)


 .. _tar-formats:
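
The clarified ``TarInfo.size`` wording in the first hunk can be checked with a small in-memory sketch. The member name ``hello.txt`` and the 10-byte payload are made up for illustration and are not part of the commit. The total archive length depends on header blocks, padding and record size, so the sketch only compares it against the member size rather than assuming an exact figure.

```python
import io
import tarfile

payload = b"0123456789"  # 10 bytes of file data (illustrative)

# Write a one-member archive into memory instead of onto disk.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="hello.txt")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

archive_len = len(buf.getvalue())

# Read it back: TarInfo.size reports only the file data, not the headers.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member_size = tar.getmember("hello.txt").size

print("TarInfo.size:", member_size)
print("archive length:", archive_len)
```

The archive is larger than the member because it also carries header blocks, padding to whole 512-byte blocks, and the end-of-archive marker.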

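The new reading recipes rely on three outcomes of :meth:`TarFile.extractfile`: a file object for a regular file, ``None`` for an existing member with no data (such as a directory), and :exc:`KeyError` for a name that is not in the archive. Links are a separate case: ``extractfile`` tries to resolve them to their target member instead of returning ``None``. The sketch below exercises the three outcomes against an in-memory archive; the member names and the ``describe`` helper are hypothetical and exist only for this illustration.

```python
import io
import tarfile


def build_archive() -> io.BytesIO:
    """Build a tiny in-memory archive with one file and one directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello"
        file_info = tarfile.TarInfo(name="docs/README.txt")
        file_info.size = len(data)
        tar.addfile(file_info, io.BytesIO(data))

        dir_info = tarfile.TarInfo(name="docs/empty")
        dir_info.type = tarfile.DIRTYPE  # directory member, carries no data
        tar.addfile(dir_info)
    buf.seek(0)
    return buf


def describe(tar: tarfile.TarFile, name: str):
    """Return the member's bytes, or a short label when there is nothing to read."""
    try:
        f = tar.extractfile(name)
    except KeyError:
        return "missing"  # name not present in the archive
    if f is None:
        return "not a regular file"  # e.g. a directory
    with f:
        return f.read()


with tarfile.open(fileobj=build_archive(), mode="r") as tar:
    names = ["docs/README.txt", "docs/empty", "docs/missing.txt"]
    results = {name: describe(tar, name) for name in names}

for name, outcome in results.items():
    print(name, "->", outcome)
```

Folding both checks into one helper keeps the ``KeyError`` and ``None`` cases from being confused with each other, which is the distinction the added recipes draw.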