Commit 10835ee

gh-89268: Escape RTL characters in str.__repr__()
Previously, non-escaped right-to-left characters caused misleading output on Unicode-aware terminals and browsers. Correspondingly, str.isprintable() now returns False for such characters.
1 parent 46d5106 commit 10835ee

6 files changed: +1192 -1097 lines changed

Doc/library/stdtypes.rst

Lines changed: 7 additions & 0 deletions
@@ -2330,6 +2330,8 @@ expression support in the :mod:`re` module).
    Number, Punctuation, or Symbol (L, M, N, P, or S); plus the ASCII space 0x20.
    Nonprintable characters are those in group Separator or Other (Z or C),
    except the ASCII space.
+   Additionally, strong right-to-left characters which have a bidirectional
+   class R or AL are considered non-printable.

    For example:

@@ -2339,6 +2341,11 @@ expression support in the :mod:`re` module).
       (True, True)
       >>> '\t'.isprintable(), '\n'.isprintable()
       (False, False)
+      >>> '\u05be'.isprintable(), '\u0608'.isprintable()
+      (False, False)
+
+   .. versionchanged:: next
+      Strong right-to-left characters are considered non-printable.


 .. method:: str.isspace()
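
The rule documented in the hunk above can be approximated with the standard `unicodedata` module. This is only a sketch of the documented behaviour, not CPython's implementation: the helper names `is_strong_rtl` and `documented_isprintable` are hypothetical, and the results depend on the Unicode database bundled with the running interpreter.

```python
import unicodedata


def is_strong_rtl(ch: str) -> bool:
    """Return True if ch has bidirectional class R or AL."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


def documented_isprintable(s: str) -> bool:
    """Approximate the documented rule (hypothetical helper, not CPython code)."""
    for ch in s:
        if ch == " ":
            continue  # the ASCII space is explicitly printable
        if unicodedata.category(ch)[0] in ("Z", "C"):
            return False  # general category Separator or Other
        if is_strong_rtl(ch):
            return False  # strong right-to-left, the case added by this change
    return True
```

Because the helper derives its answer from `unicodedata` rather than from `str.isprintable()` itself, it gives the same result on interpreters built before and after this commit.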

Doc/whatsnew/3.15.rst

Lines changed: 8 additions & 0 deletions
@@ -426,6 +426,14 @@ Other language changes
   controlled by :ref:`environment variables <using-on-controlling-color>`.
   (Contributed by Peter Bierma in :gh:`134170`.)

+* The :meth:`~object.__repr__` of :class:`str` now hex-escapes strong
+  right-to-left characters (bidirectional class R and AL).
+  Previously, non-escaped right-to-left characters caused misleading
+  output on Unicode-aware terminals and browsers.
+  Correspondingly, :meth:`str.isprintable` now returns ``False`` for such
+  characters.
+  (Contributed by Serhiy Storchaka in :gh:`89268`.)
+
 * The :meth:`~object.__repr__` of :class:`ImportError` and :class:`ModuleNotFoundError`
   now shows "name" and "path" as ``name=<name>`` and ``path=<path>`` if they were given
   as keyword arguments at construction time.
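
To illustrate what hex-escaping strong right-to-left characters looks like, here is a minimal sketch that works on any recent Python 3. It is not the code from this commit: `escape_strong_rtl` is a hypothetical helper, and it assumes the escapes take the same `\uXXXX` and `\UXXXXXXXX` forms that `repr()` already uses for other non-printable characters.

```python
import unicodedata


def escape_strong_rtl(s: str) -> str:
    """Replace characters of bidirectional class R or AL with hex escapes."""
    out = []
    for ch in s:
        if unicodedata.bidirectional(ch) in ("R", "AL"):
            cp = ord(ch)
            # Four hex digits inside the BMP, eight beyond it (assumed format).
            out.append(f"\\u{cp:04x}" if cp <= 0xFFFF else f"\\U{cp:08x}")
        else:
            out.append(ch)
    return "".join(out)


print(escape_strong_rtl("a\u05beb"))
```

With the escapes in place, the text keeps its logical order on screen, so a bidi-aware terminal cannot visually reorder the characters around it.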

Lib/test/test_str.py

Lines changed: 6 additions & 0 deletions
@@ -853,6 +853,12 @@ def test_isprintable(self):
         self.assertTrue('\U0001F46F'.isprintable())
         self.assertFalse('\U000E0020'.isprintable())

+        # strong right-to-left character
+        self.assertFalse("\u05be".isprintable())
+        self.assertFalse("\u0608".isprintable())
+        self.assertFalse("\U00010800".isprintable())
+        self.assertFalse("\U00010d00".isprintable())
+
     @support.requires_resource('cpu')
     def test_isprintable_invariant(self):
         for codepoint in range(sys.maxunicode + 1):
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+The :meth:`~object.__repr__` of :class:`str` now hex-escapes strong
+right-to-left characters (bidirectional class R and AL). Previously,
+non-escaped right-to-left characters caused misleading output on
+Unicode-aware terminals and browsers. Correspondingly,
+:meth:`str.isprintable` now returns ``False`` for such characters.
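
The link between `repr()` escaping and `str.isprintable()` can be checked with a short script. This sketch assumes CPython, where `repr()` escapes a non-ASCII character exactly when `isprintable()` is false for it. The names `SAMPLES`, `repr_escapes` and `check_consistency` are hypothetical, and the sample characters are the ones used in the new test cases.

```python
SAMPLES = ["\u05be", "\u0608", "\U00010800", "\U00010d00"]


def repr_escapes(ch: str) -> bool:
    """Return True if repr() shows ch as an escape rather than literally."""
    return repr(ch) != f"'{ch}'"


def check_consistency(chars):
    """Map each character to (isprintable, escaped-in-repr)."""
    return {ch: (ch.isprintable(), repr_escapes(ch)) for ch in chars}


for ch, (printable, escaped) in check_consistency(SAMPLES).items():
    print(f"U+{ord(ch):04X} printable={printable} escaped={escaped}")
```

On an interpreter that includes this change, each sample should report `printable=False escaped=True`; on earlier versions the two flags are expected to be reversed, but they should still disagree with each other.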
