Commit 3d04b1f

gh-150494: Sampling mode for tracemalloc
1 parent 27148d0 commit 3d04b1f

19 files changed

Lines changed: 1048 additions & 82 deletions

Doc/c-api/init_config.rst

Lines changed: 25 additions & 3 deletions
@@ -493,6 +493,10 @@ Configuration Options
      - :c:member:`tracemalloc <PyConfig.tracemalloc>`
      - ``int``
      - Read-only
+   * - ``"tracemalloc_sample_interval"``
+     - :c:member:`tracemalloc_sample_interval <PyConfig.tracemalloc_sample_interval>`
+     - ``int``
+     - Read-only
    * - ``"use_environment"``
      - :c:member:`use_environment <PyConfig.use_environment>`
      - ``bool``
@@ -1891,13 +1895,31 @@ PyConfig

       Enable tracemalloc?

-      If non-zero, call :func:`tracemalloc.start` at startup.
+      If non-zero, call :func:`tracemalloc.start` at startup with
+      :c:member:`tracemalloc` as the traceback limit and
+      :c:member:`tracemalloc_sample_interval` as the sampling interval.

-      Set by :option:`-X tracemalloc=N <-X>` command line option and by the
-      :envvar:`PYTHONTRACEMALLOC` environment variable.
+      Set by :option:`-X tracemalloc=NFRAME[:INTERVAL] <-X>` command line
+      option and by the :envvar:`PYTHONTRACEMALLOC` environment variable.

       Default: ``-1`` in Python mode, ``0`` in isolated mode.

+   .. c:member:: int tracemalloc_sample_interval
+
+      Set the :mod:`tracemalloc` sampling interval in bytes at startup.
+
+      If ``0``, every allocation is traced. If greater than ``0``,
+      allocations are sampled using a Poisson process with a mean
+      inter-arrival of :c:member:`tracemalloc_sample_interval` bytes.
+
+      Only used when :c:member:`tracemalloc` is non-zero.
+
+      Set by the ``INTERVAL`` part of
+      :option:`-X tracemalloc=NFRAME:INTERVAL <-X>` command line option and
+      by the :envvar:`PYTHONTRACEMALLOC` environment variable.
+
+      Default: ``0``.
+
    .. c:member:: int perf_profiling

       Enable the Linux ``perf`` profiler support?

Doc/library/tracemalloc.rst

Lines changed: 34 additions & 3 deletions
@@ -363,13 +363,29 @@ Functions
    See also :func:`start` and :func:`stop` functions.


-.. function:: start(nframe: int=1)
+.. function:: start(nframe: int=1, sample_interval: int=0)

    Start tracing Python memory allocations: install hooks on Python memory
    allocators. Collected tracebacks of traces will be limited to *nframe*
    frames. By default, a trace of a memory block only stores the most recent
    frame: the limit is ``1``. *nframe* must be greater or equal to ``1``.

+   If *sample_interval* is ``0`` (the default), every allocation is traced.
+   If *sample_interval* is greater than zero, allocations are sampled using
+   a `Poisson process <https://en.wikipedia.org/wiki/Poisson_point_process>`_
+   with a mean inter-arrival of *sample_interval* bytes. Sampling can
+   significantly reduce overhead while producing useful aggregate estimates.
+   In sampled mode, :attr:`Trace.size` is an upscaled estimate of the bytes
+   represented by the sample, while :attr:`Trace.real_size` is the actual
+   allocation size.
+
+   Both *nframe* and *sample_interval* can be passed by keyword:
+
+   .. code-block:: python
+
+      tracemalloc.start(sample_interval=512 * 1024)
+      tracemalloc.start(nframe=25, sample_interval=512 * 1024)
+
    You can still read the original number of total frames that composed the
    traceback by looking at the :attr:`Traceback.total_nframe` attribute.

@@ -382,12 +398,17 @@ Functions
    to measure how much memory is used by the :mod:`!tracemalloc` module.

    The :envvar:`PYTHONTRACEMALLOC` environment variable
-   (``PYTHONTRACEMALLOC=NFRAME``) and the :option:`-X` ``tracemalloc=NFRAME``
+   (``PYTHONTRACEMALLOC=NFRAME`` or ``PYTHONTRACEMALLOC=NFRAME:INTERVAL``)
+   and the :option:`-X` ``tracemalloc=NFRAME[:INTERVAL]``
    command line option can be used to start tracing at startup.

    See also :func:`stop`, :func:`is_tracing` and :func:`get_traceback_limit`
    functions.

+   .. versionchanged:: 3.16
+      Added the *sample_interval* parameter. Both arguments can now be
+      passed by keyword.
+

 .. function:: stop()

@@ -697,7 +718,17 @@ Trace

    .. attribute:: size

-      Size of the memory block in bytes (``int``).
+      Size of the memory block in bytes (``int``). When sampling is enabled,
+      this is an upscaled estimate of the total bytes the trace represents.
+      Use this value for aggregation.
+
+   .. attribute:: real_size
+
+      Actual allocation size in bytes (``int``). In exact mode
+      (``sample_interval=0``), this equals :attr:`size`. In sampled mode,
+      this is the real number of bytes requested by the allocation call.
+
+      .. versionadded:: 3.16

    .. attribute:: traceback
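
Reviewer sketch (not part of the commit): the ``versionchanged`` note above implies that interpreters without this change reject keyword arguments to ``tracemalloc.start()``. Code that has to run on both sides of the change could fall back as shown below. The helper name ``start_tracing`` is hypothetical, and the ``sample_interval`` keyword is assumed to exist only on builds that include this diff.

```python
import tracemalloc


def start_tracing(nframe=1, sample_interval=512 * 1024):
    """Start tracemalloc, requesting sampling when the interpreter allows it.

    Returns True if the keyword form was accepted, False if the interpreter
    rejected it and exact (unsampled) tracing was started instead.
    """
    try:
        # Keyword form documented in this diff; assumed unavailable on
        # interpreters that predate the change.
        tracemalloc.start(nframe=nframe, sample_interval=sample_interval)
        return True
    except TypeError:
        # Older signature: a single positional nframe, no keywords.
        tracemalloc.start(nframe)
        return False


sampled = start_tracing(nframe=5)
data = [bytes(1024) for _ in range(100)]  # allocate something to trace
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
```

On a build with sampling, ``sampled`` would be true and the traced totals would be estimates; elsewhere the helper silently degrades to exact tracing.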

Doc/using/cmdline.rst

Lines changed: 12 additions & 2 deletions
@@ -542,12 +542,17 @@ Miscellaneous options
    * ``-X tracemalloc`` to start tracing Python memory allocations using the
      :mod:`tracemalloc` module. By default, only the most recent frame is
      stored in a traceback of a trace. Use ``-X tracemalloc=NFRAME`` to start
-     tracing with a traceback limit of *NFRAME* frames.
+     tracing with a traceback limit of *NFRAME* frames. Use
+     ``-X tracemalloc=NFRAME:INTERVAL`` to enable Poisson sampling with a
+     mean interval of *INTERVAL* bytes, which can reduce tracing overhead.
      See :func:`tracemalloc.start` and :envvar:`PYTHONTRACEMALLOC`
      for more information.

      .. versionadded:: 3.4

+     .. versionchanged:: 3.16
+        Added sampling support via ``NFRAME:INTERVAL`` syntax.
+
    * ``-X int_max_str_digits`` configures the :ref:`integer string conversion
      length limitation <int_max_str_digits>`. See also
      :envvar:`PYTHONINTMAXSTRDIGITS`.
@@ -1033,12 +1038,17 @@ conflict.
    Python memory allocations using the :mod:`tracemalloc` module. The value of
    the variable is the maximum number of frames stored in a traceback of a
    trace. For example, ``PYTHONTRACEMALLOC=1`` stores only the most recent
-   frame.
+   frame. Use ``PYTHONTRACEMALLOC=NFRAME:INTERVAL`` to enable Poisson
+   sampling with a mean interval of *INTERVAL* bytes (e.g.
+   ``PYTHONTRACEMALLOC=1:524288`` for 512 KB sampling).
    See the :func:`tracemalloc.start` function for more information.
    This is equivalent to setting the :option:`-X` ``tracemalloc`` option.

    .. versionadded:: 3.4

+   .. versionchanged:: 3.16
+      Added ``NFRAME:INTERVAL`` syntax for sampling support.
+

 .. envvar:: PYTHONPROFILEIMPORTTIME
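
Reviewer sketch (not part of the commit): the ``NFRAME[:INTERVAL]`` syntax documented above can be illustrated with a small Python parser. The interpreter's real parsing is done in C and is not shown in this excerpt, so the function name and the validation rules here are assumptions. The ``nframe >= 1`` check follows the ``tracemalloc.start`` documentation; rejecting negative intervals is a guess.

```python
def parse_tracemalloc_option(value):
    """Split an ``NFRAME[:INTERVAL]`` string into (nframe, sample_interval).

    A missing INTERVAL part maps to 0, which the docs describe as
    "trace every allocation".
    """
    nframe_text, sep, interval_text = value.partition(":")
    nframe = int(nframe_text)
    # int("") raises ValueError, so a dangling colon ("1:") is rejected.
    interval = int(interval_text) if sep else 0
    if nframe < 1:
        raise ValueError("NFRAME must be >= 1")
    if interval < 0:
        raise ValueError("INTERVAL must be >= 0 (assumed rule)")
    return nframe, interval


# The documented example: one frame, 512 * 1024 bytes mean interval.
example = parse_tracemalloc_option("1:524288")
```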

Include/cpython/initconfig.h

Lines changed: 1 addition & 0 deletions
@@ -141,6 +141,7 @@ typedef struct PyConfig {
     unsigned long hash_seed;
     int faulthandler;
     int tracemalloc;
+    int tracemalloc_sample_interval;
     int perf_profiling;
     int remote_debug;
     int import_time;

Include/internal/pycore_global_objects_fini_generated.h

Lines changed: 2 additions & 0 deletions
Generated file; diff not rendered.

Include/internal/pycore_global_strings.h

Lines changed: 2 additions & 0 deletions
@@ -666,6 +666,7 @@ struct _Py_global_strings {
         STRUCT_FOR_ID(newline)
         STRUCT_FOR_ID(newlines)
         STRUCT_FOR_ID(next)
+        STRUCT_FOR_ID(nframe)
         STRUCT_FOR_ID(nlocals)
         STRUCT_FOR_ID(node_depth)
         STRUCT_FOR_ID(node_offset)
@@ -767,6 +768,7 @@ struct _Py_global_strings {
         STRUCT_FOR_ID(reversed)
         STRUCT_FOR_ID(rounding)
         STRUCT_FOR_ID(salt)
+        STRUCT_FOR_ID(sample_interval)
         STRUCT_FOR_ID(sample_interval_us)
         STRUCT_FOR_ID(sched_priority)
         STRUCT_FOR_ID(scheduler)

Include/internal/pycore_runtime_init_generated.h

Lines changed: 2 additions & 0 deletions
Generated file; diff not rendered.

Include/internal/pycore_tracemalloc.h

Lines changed: 6 additions & 1 deletion
@@ -30,6 +30,10 @@ struct _PyTraceMalloc_Config {
     /* limit of the number of frames in a traceback, 1 by default.
        Variable protected by the GIL. */
     int max_nframe;
+
+    /* Poisson sampling interval in bytes. 0 means trace every allocation.
+       Variable protected by the GIL. */
+    size_t sample_interval;
 };

@@ -113,6 +117,7 @@ struct _tracemalloc_runtime_state {
             .initialized = TRACEMALLOC_NOT_INITIALIZED, \
             .tracing = 0, \
             .max_nframe = 1, \
+            .sample_interval = 0, \
         }, \
         .reentrant_key = Py_tss_NEEDS_INIT, \
     }
@@ -148,7 +153,7 @@ extern PyObject* _PyTraceMalloc_GetObjectTraceback(PyObject *obj);
 extern PyStatus _PyTraceMalloc_Init(void);

 /* Start tracemalloc */
-extern int _PyTraceMalloc_Start(int max_nframe);
+extern int _PyTraceMalloc_Start(int max_nframe, size_t sample_interval);

 /* Stop tracemalloc */
 extern void _PyTraceMalloc_Stop(void);

Include/internal/pycore_tstate.h

Lines changed: 10 additions & 0 deletions
@@ -23,6 +23,14 @@ struct _gc_thread_state {
 };
 #endif

+/* Per-thread tracemalloc Poisson sampling state.
+   Zero-initialized; prng_state == 0 means "needs initialization". */
+struct _tracemalloc_sampling_state {
+    size_t bytes_since_last_sample;
+    size_t threshold;
+    uint64_t prng_state;
+};
+

 // Every PyThreadState is actually allocated as a _PyThreadStateImpl. The
 // PyThreadState fields are exposed as part of the C API, although most fields
@@ -103,6 +111,8 @@ typedef struct _PyThreadStateImpl {
     struct _PyJitTracerState *jit_tracer_state;
 #endif

+    struct _tracemalloc_sampling_state tracemalloc_sampling;
+
 #ifdef Py_GIL_DISABLED
     // gh-144438: Add padding to ensure that the fields above don't share a
     // cache line with other allocations.
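
Reviewer sketch (not part of the commit): the per-thread fields above (``bytes_since_last_sample``, ``threshold``, ``prng_state``) suggest a byte-counting sampler whose thresholds are exponentially distributed gaps. The Python simulation below illustrates that general technique only. The C implementation is not visible in this excerpt, so the class, the use of ``random.Random`` in place of ``prng_state``, and in particular the upscaling formula ``size / (1 - exp(-size / interval))`` are assumptions, not the commit's code.

```python
import math
import random


class SamplingState:
    """Python stand-in for the per-thread sampling fields (hypothetical)."""

    def __init__(self, interval, seed):
        self.interval = interval
        self.rng = random.Random(seed)  # plays the role of prng_state
        self.bytes_since_last_sample = 0
        self.threshold = self._next_threshold()

    def _next_threshold(self):
        # Gaps between points of a Poisson process are exponential;
        # expovariate takes the rate, i.e. 1 / mean.
        return self.rng.expovariate(1.0 / self.interval)

    def should_sample(self, size):
        self.bytes_since_last_sample += size
        if self.bytes_since_last_sample < self.threshold:
            return False
        # Threshold crossed: sample this allocation and draw a fresh gap.
        self.bytes_since_last_sample = 0
        self.threshold = self._next_threshold()
        return True


def upscaled_size(size, interval):
    # Assumed estimator: divide by the probability that an allocation of
    # this size is sampled, 1 - exp(-size / interval).
    return size / -math.expm1(-size / interval)


def simulate(sizes, interval, seed=12345):
    """Return (estimated total bytes, number of sampled allocations)."""
    state = SamplingState(interval, seed)
    estimate = 0.0
    samples = 0
    for size in sizes:
        if state.should_sample(size):
            estimate += upscaled_size(size, interval)
            samples += 1
    return estimate, samples


sizes = [16, 64, 256, 1024, 8192] * 20_000
true_total = sum(sizes)
estimate, samples = simulate(sizes, interval=4096)
```

Under these assumptions only a fraction of the allocations is recorded, yet the sum of upscaled sizes stays close to ``true_total``, which is the trade-off the documentation describes for ``Trace.size`` versus ``Trace.real_size``.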

Include/internal/pycore_unicodeobject_generated.h

Lines changed: 8 additions & 0 deletions
Generated file; diff not rendered.
