@@ -66,7 +66,7 @@ Summary -- Release highlights
.. PEP-sized items next.

* :pep:`799`: :ref:`A dedicated profiling package for organizing Python
-  profiling tools <whatsnew315-sampling-profiler>`
+  profiling tools <whatsnew315-profiling-package>`
* :pep:`686`: :ref:`Python now uses UTF-8 as the default encoding
  <whatsnew315-utf8-default>`
* :pep:`782`: :ref:`A new PyBytesWriter C API to create a Python bytes object
@@ -77,12 +77,32 @@ Summary -- Release highlights
New features
============

+.. _whatsnew315-profiling-package:
+
+:pep:`799`: A dedicated profiling package
+------------------------------------------
+
+A new :mod:`!profiling` module has been added to organize Python's built-in
+profiling tools under a single, coherent namespace. This module contains:
+
+* :mod:`!profiling.tracing`: deterministic function-call tracing (relocated from
+  :mod:`cProfile`).
+* :mod:`!profiling.sampling`: a new statistical sampling profiler (named Tachyon).
+
+The :mod:`cProfile` module remains as an alias for backwards compatibility.
+The :mod:`profile` module is deprecated and will be removed in Python 3.17.
+
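+Because :mod:`!profiling.tracing` is :mod:`cProfile` relocated, switching over is
+mostly a matter of the new module path. A minimal sketch, assuming the relocated
+module keeps the :mod:`cProfile` command-line interface (``myscript.py`` is a
+placeholder):
+
+.. code-block:: shell
+
+   # Old spelling, still available through the cProfile compatibility alias
+   python -m cProfile -s cumulative myscript.py
+
+   # New spelling under the profiling namespace (assumed to accept the same options)
+   python -m profiling.tracing -s cumulative myscript.py
+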
+.. seealso:: :pep:`799` for further details.
+
+(Contributed by Pablo Galindo and László Kiss Kollár in :gh:`138122`.)
+
+
.. _whatsnew315-sampling-profiler:

-:pep:`799`: High frequency statistical sampling profiler
---------------------------------------------------------
+Tachyon: High frequency statistical sampling profiler
+------------------------------------------------------

-A new statistical sampling profiler has been added to the new :mod:`!profiling` module as
+A new statistical sampling profiler (Tachyon) has been added as
:mod:`!profiling.sampling`. This profiler enables low-overhead performance analysis of
running Python processes without requiring code modification or process restart.

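+In its simplest form, the profiler attaches to a running interpreter by process ID.
+A minimal sketch, using the ``attach`` subcommand listed below and a placeholder
+PID of ``1234``:
+
+.. code-block:: shell
+
+   # Sample the running process with PID 1234 using the default settings
+   python -m profiling.sampling attach 1234
+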
@@ -91,101 +111,64 @@ every function call, the sampling profiler periodically captures stack traces fr
running processes. This approach provides virtually zero overhead while achieving
sampling rates of **up to 1,000,000 Hz**, making it the fastest sampling profiler
available for Python (at the time of its contribution) and ideal for debugging
-performance issues in production environments.
+performance issues in production environments, where traditional profiling
+approaches would be too intrusive.

Key features include:

* **Zero-overhead profiling**: Attach to any running Python process without
-  affecting its performance
-* **No code modification required**: Profile existing applications without restart
-* **Real-time statistics**: Monitor sampling quality during data collection
-* **Multiple output formats**: Generate both detailed statistics and flamegraph data
-* **Thread-aware profiling**: Option to profile all threads or just the main thread
-
-Profile process 1234 for 10 seconds with default settings:
-
-.. code-block:: shell
-
-   python -m profiling.sampling 1234
-
-Profile with custom interval and duration, save to file:
-
-.. code-block:: shell
-
-   python -m profiling.sampling -i 50 -d 30 -o profile.stats 1234
-
-Generate collapsed stacks for flamegraph:
-
-.. code-block:: shell
-
-   python -m profiling.sampling --collapsed 1234
-
-Profile all threads and sort by total time:
-
-.. code-block:: shell
-
-   python -m profiling.sampling -a --sort-tottime 1234
-
-The profiler generates statistical estimates of where time is spent:
-
-.. code-block:: text
-
-   Real-time sampling stats: Mean: 100261.5Hz (9.97µs) Min: 86333.4Hz (11.58µs) Max: 118807.2Hz (8.42µs) Samples: 400001
-   Captured 498841 samples in 5.00 seconds
-   Sample rate: 99768.04 samples/sec
-   Error rate: 0.72%
-   Profile Stats:
-   nsamples  sample%  tottime (s)  cumul%  cumtime (s)  filename:lineno(function)
-   43/418858  0.0  0.000  87.9  4.189  case.py:667(TestCase.run)
-   3293/418812  0.7  0.033  87.9  4.188  case.py:613(TestCase._callTestMethod)
-   158562/158562  33.3  1.586  33.3  1.586  test_compile.py:725(TestSpecifics.test_compiler_recursion_limit.<locals>.check_limit)
-   129553/129553  27.2  1.296  27.2  1.296  ast.py:46(parse)
-   0/128129  0.0  0.000  26.9  1.281  test_ast.py:884(AST_Tests.test_ast_recursion_limit.<locals>.check_limit)
-   7/67446  0.0  0.000  14.2  0.674  test_compile.py:729(TestSpecifics.test_compiler_recursion_limit)
-   6/60380  0.0  0.000  12.7  0.604  test_ast.py:888(AST_Tests.test_ast_recursion_limit)
-   3/50020  0.0  0.000  10.5  0.500  test_compile.py:727(TestSpecifics.test_compiler_recursion_limit)
-   1/38011  0.0  0.000  8.0  0.380  test_ast.py:886(AST_Tests.test_ast_recursion_limit)
-   1/25076  0.0  0.000  5.3  0.251  test_compile.py:728(TestSpecifics.test_compiler_recursion_limit)
-   22361/22362  4.7  0.224  4.7  0.224  test_compile.py:1368(TestSpecifics.test_big_dict_literal)
-   4/18008  0.0  0.000  3.8  0.180  test_ast.py:889(AST_Tests.test_ast_recursion_limit)
-   11/17696  0.0  0.000  3.7  0.177  subprocess.py:1038(Popen.__init__)
-   16968/16968  3.6  0.170  3.6  0.170  subprocess.py:1900(Popen._execute_child)
-   2/16941  0.0  0.000  3.6  0.169  test_compile.py:730(TestSpecifics.test_compiler_recursion_limit)
-
-   Legend:
-   nsamples: Direct/Cumulative samples (direct executing / on call stack)
-   sample%: Percentage of total samples this function was directly executing
-   tottime: Estimated total time spent directly in this function
-   cumul%: Percentage of total samples when this function was on the call stack
-   cumtime: Estimated cumulative time (including time in called functions)
-   filename:lineno(function): Function location and name
-
-   Summary of Interesting Functions:
-
-   Functions with Highest Direct/Cumulative Ratio (Hot Spots):
-   1.000 direct/cumulative ratio, 33.3% direct samples: test_compile.py:(TestSpecifics.test_compiler_recursion_limit.<locals>.check_limit)
-   1.000 direct/cumulative ratio, 27.2% direct samples: ast.py:(parse)
-   1.000 direct/cumulative ratio, 3.6% direct samples: subprocess.py:(Popen._execute_child)
-
-   Functions with Highest Call Frequency (Indirect Calls):
-   418815 indirect calls, 87.9% total stack presence: case.py:(TestCase.run)
-   415519 indirect calls, 87.9% total stack presence: case.py:(TestCase._callTestMethod)
-   159470 indirect calls, 33.5% total stack presence: test_compile.py:(TestSpecifics.test_compiler_recursion_limit)
-
-   Functions with Highest Call Magnification (Cumulative/Direct):
-   12267.9x call magnification, 159470 indirect calls from 13 direct: test_compile.py:(TestSpecifics.test_compiler_recursion_limit)
-   10581.7x call magnification, 116388 indirect calls from 11 direct: test_ast.py:(AST_Tests.test_ast_recursion_limit)
-   9740.9x call magnification, 418815 indirect calls from 43 direct: case.py:(TestCase.run)
-
-The profiler automatically identifies performance bottlenecks through statistical
-analysis, highlighting functions with high CPU usage and call frequency patterns.
-
-This capability is particularly valuable for debugging performance issues in
-production systems where traditional profiling approaches would be too intrusive.
-
-.. seealso:: :pep:`799` for further details.
-
-(Contributed by Pablo Galindo and László Kiss Kollár in :gh:`135953`.)
+  affecting its performance. Ideal for production debugging where you cannot
+  afford to restart or slow down your application.
+
+* **No code modification required**: Profile existing applications without restart.
+  Simply point the profiler at a running process by PID and start collecting data.
+
+* **Flexible target modes** (illustrated in the example after this list):
+
+  * ``attach``: profile an already-running process by its PID
+  * ``run``: run a script under the profiler from the very start of execution
+  * ``run -m``: run and profile a module, as with ``python -m module``
+
+* **Multiple profiling modes**: Choose what to measure depending on the question
+  you are investigating:
+
+  * **Wall-clock time** (``--mode wall``, default): Measures real elapsed time,
+    including I/O, network waits, and blocking operations. Use this to understand
+    where your program spends calendar time, including time spent waiting for
+    external resources.
+  * **CPU time** (``--mode cpu``): Measures only active CPU execution time,
+    excluding I/O waits and blocking. Use this to identify CPU-bound bottlenecks
+    and optimize computational work.
+  * **GIL-holding time** (``--mode gil``): Measures time spent holding Python's
+    Global Interpreter Lock. Use this to identify which threads dominate GIL usage
+    in multi-threaded applications.
+
+* **Thread-aware profiling**: Option to profile all threads (``-a``) or just the
+  main thread, essential for understanding multi-threaded application behavior.
+
+* **Multiple output formats**: Choose the visualization that best fits your workflow:
+
+  * ``--pstats``: Detailed tabular statistics compatible with :mod:`pstats`. Shows
+    function-level timing with direct and cumulative samples. Best for detailed
+    analysis and integration with existing Python profiling tools.
+  * ``--collapsed``: Generates collapsed stack traces (one line per stack). This
+    format is designed for creating flamegraphs with external tools such as Brendan
+    Gregg's FlameGraph scripts or speedscope.
+  * ``--flamegraph``: Generates a self-contained, interactive HTML flamegraph using
+    D3.js that opens directly in your browser. Flamegraphs show the call hierarchy,
+    with width representing time spent, making it easy to spot bottlenecks at a glance.
+  * ``--gecko``: Generates Gecko Profiler format compatible with Firefox Profiler
+    (https://profiler.firefox.com). Upload the output to Firefox Profiler for advanced
+    timeline-based analysis with features like stack charts, markers, and network
+    activity.
+  * ``--heatmap``: Generates an interactive HTML heatmap visualization with
+    line-level sample counts. Creates a directory with per-file heatmaps showing
+    exactly where time is spent at the source-code level.
+
+* **Live interactive mode**: Real-time TUI profiler with a top-like interface
+  (``--live``). Monitor performance as your application runs, with interactive
+  sorting and filtering.
+
+* **Async-aware profiling**: Profile async/await code with task-based stack
+  reconstruction (``--async-aware``). See which coroutines are consuming time, with
+  options to show only running tasks or all tasks, including those waiting.
+
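+The invocations below are a minimal sketch combining the subcommands and options
+listed above; ``1234`` and ``myscript.py`` are placeholders, and the exact option
+placement and allowed combinations may differ:
+
+.. code-block:: shell
+
+   # Attach to PID 1234, measure CPU time only, and print pstats-style statistics
+   python -m profiling.sampling attach --mode cpu --pstats 1234
+
+   # Run a script under the profiler and write an interactive HTML flamegraph
+   python -m profiling.sampling run --flamegraph myscript.py
+
+   # Watch all threads of a running process in the live, top-like TUI
+   python -m profiling.sampling attach -a --live 1234
+
+   # Profile an asyncio application with task-based stack reconstruction
+   python -m profiling.sampling attach --async-aware 1234
+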
+(Contributed by Pablo Galindo and László Kiss Kollár in :gh:`135953` and :gh:`138122`.)


.. _whatsnew315-improved-error-messages: