On 15/05/2026 3:11 am, Amir Ayupov wrote:
> In a system-wide `perf record -e cs_etm/.../u` capture on aarch64,
> synthesized samples emitted by `perf script --itrace=il64` are
> sometimes attributed to the WRONG sample.pid/tid (and to the wrong
> EL/cpumode) for the chunk of branches that straddle a context-switch
> boundary on a CPU. A branch actually retired by process A is emitted
> with sample.pid set to the thread that next ran on the same CPU.
>
> Mechanism:
> 1. ETM emits CONTEXTIDR/EL packets in-stream when the kernel updates
> CONTEXTIDR_EL1 on context switch / EL change. OpenCSD turns these
> into OCSD_GEN_TRC_ELEM_PE_CONTEXT elements interleaved with
> OCSD_GEN_TRC_ELEM_INSTR_RANGE elements for retired branch ranges.
> 2. cs_etm_decoder__buffer_range() queues each INSTR_RANGE into
> packet_queue->packet_buffer[]; packets carry start/end addrs,
> instr_count, last-instruction info, etc., but NO owner identity.
> 3. PE_CONTEXT goes through cs_etm_decoder__set_tid() ->
> cs_etm__set_thread(), which immediately mutates tidq->thread and
> tidq->el. Queued packets are not drained first; reset_timestamp()
> is called so the next TIMESTAMP triggers OCSD_RESP_WAIT and a
> drain.
> 4. By drain time in cs_etm__process_traceid_queue() ->
> cs_etm__sample(), sample.pid/tid is read from the now-mutated
> tidq->thread and sample.cpumode from the now-mutated tidq->el.
> Pre-context INSTR_RANGEs get the post-context owner.
>
> The same race affects branch samples via tidq->prev_packet_thread /
> tidq->prev_packet_el, captured at packet-swap time from
> tidq->thread / tidq->el (which may already have flipped).
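
The mechanism above can be reproduced with a small userspace model. This is only a sketch: the struct and function names (range_packet, trace_queue, buffer_range, set_context, drain) are hypothetical stand-ins, not the real perf cs-etm code, and the "thread" is reduced to a single tid.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 8

/* Hypothetical stand-in for a buffered INSTR_RANGE packet. */
struct range_packet {
	unsigned long start_addr;
	unsigned long end_addr;
	/* No pid/tid/EL here: this models the pre-fix layout. */
};

/* Hypothetical stand-in for the per-trace-ID queue. */
struct trace_queue {
	int cur_tid;				/* models tidq->thread */
	struct range_packet buf[QUEUE_MAX];	/* models packet_buffer[] */
	size_t count;
};

/* INSTR_RANGE: queue the range; the owner is not recorded. */
static void buffer_range(struct trace_queue *q, unsigned long start,
			 unsigned long end)
{
	assert(q->count < QUEUE_MAX);
	q->buf[q->count].start_addr = start;
	q->buf[q->count].end_addr = end;
	q->count++;
}

/* PE_CONTEXT: switch the current thread at once, without draining. */
static void set_context(struct trace_queue *q, int tid)
{
	q->cur_tid = tid;
}

/* Drain: every queued packet gets whatever cur_tid is at drain time. */
static size_t drain(struct trace_queue *q, int *sample_tids)
{
	size_t n = q->count;

	for (size_t i = 0; i < n; i++)
		sample_tids[i] = q->cur_tid;
	q->count = 0;
	return n;
}
```

In this model, a range buffered while tid 100 is current and drained after set_context(q, 200) comes out with tid 200, which is the misattribution described in steps 3 and 4.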
>
> This is independent of PERF_RECORD_SWITCH_CPU_WIDE, which is
> deliberately not used to assign sample identity in this path. The
> bug applies to any cs_etm capture with in-stream CONTEXTIDR
> (PIDFMT_CTXTID or PIDFMT_CTXTID2).
>
> Effect on downstream tools: branches that should belong to the
> previous thread on the CPU get attributed to the next thread. When
> the two threads share a binary, leaked branches' VAs land in the
> wrong thread's mappings; samples whose IPs land in r-x mappings
> silently pollute that binary's profile, while samples landing in
> R-only/RW mappings show up as out-of-range / non-text samples.
> Either way, AutoFDO/BOLT profiles built from `perf script --itrace`
> output of system-wide cs_etm captures contain misattributed samples.
>
> Concrete example from `perf script --itrace=il64` of the same
> captured branch (same timestamp, same IP, same from/to addrs) before
> and after this fix:
>
> before: launcher_multia 2638146/2638146 705897.219172: \
> fffcda6b124c 0xfffcda641958/0xfffcda6b123c
> after: ws-tcf-sr-io13 2736581/2741587 705897.219172: \
> fffcda6b124c 0xfffcda641958/0xfffcda6b123c
>
> The branch was retired by ws-tcf-sr-io13 (tid 2741587) but, before
> the fix, was attributed to launcher_multia (the next thread to run on
> that CPU after the context switch). After the fix, it is correctly
> attributed to ws-tcf-sr-io13.
>
> Why not "drain on PE_CONTEXT then switch" (deferred-set_thread):
> tidq->thread has two consumers: sample emission needs the OUTGOING
> identity for queued packets, but cs_etm__mem_access() needs the
> CURRENT thread's maps to fetch instruction bytes for OpenCSD. The
> two needs are temporally inverted; a single tidq->thread cannot
> serve both. Keeping tidq->thread current and stamping owner identity
> per packet is the only design that decouples them cleanly.
>
> Fix: capture the owning pid/tid/EL on each buffered packet at
> cs_etm_decoder__buffer_packet() time (before any subsequent
> PE_CONTEXT can mutate tidq->thread / tidq->el), and read them at
> sample emission time.
>
> - struct cs_etm_packet gains pid_t pid, pid_t tid, int el (storing
> an ocsd_ex_level value; typed as int so the struct does not
> depend on OpenCSD headers, which are only included inside
> HAVE_CSTRACE_SUPPORT).
> - cs_etm__etmq_get_pid_tid_el() (formerly cs_etm__etmq_get_pid_tid)
> returns all three.
> - cs_etm__synth_instruction_sample() reads sample.pid / sample.tid
> from tidq->packet->{pid,tid} and derives sample.cpumode from
> tidq->packet->el.
> - cs_etm__synth_branch_sample() reads sample.pid / sample.tid /
> cpumode from tidq->prev_packet->{pid,tid,el}.
> - The separate prev_packet_thread / prev_packet_el bookkeeping in
> cs_etm__packet_swap() / cs_etm__init_traceid_queue() /
> cs_etm__free_traceid_queues() is removed; the per-packet stamp
> on prev_packet now carries that information.
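
A rough userspace sketch of the per-packet stamping idea described in the list above. The names (stamped_packet, stamped_queue and the stamped_* functions) are hypothetical and the model is much simpler than the real decoder; only the pid/tid/el fields and the "stamp at buffer time, read at sample time" ordering come from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define STAMPED_QUEUE_MAX 8

/* Hypothetical packet: the range plus the owner captured at buffer time. */
struct stamped_packet {
	unsigned long start_addr;
	unsigned long end_addr;
	pid_t pid;
	pid_t tid;
	int el;		/* exception level kept as a plain int */
};

struct stamped_queue {
	pid_t cur_pid;	/* current context, still used for memory access */
	pid_t cur_tid;
	int cur_el;
	struct stamped_packet buf[STAMPED_QUEUE_MAX];
	size_t count;
};

/* Buffer a range and stamp it with the context that is current right now. */
static void stamped_buffer_range(struct stamped_queue *q, unsigned long start,
				 unsigned long end)
{
	struct stamped_packet *p;

	assert(q->count < STAMPED_QUEUE_MAX);
	p = &q->buf[q->count++];
	p->start_addr = start;
	p->end_addr = end;
	p->pid = q->cur_pid;
	p->tid = q->cur_tid;
	p->el = q->cur_el;
}

/* Context change: only the current context moves; queued packets keep theirs. */
static void stamped_set_context(struct stamped_queue *q, pid_t pid, pid_t tid,
				int el)
{
	q->cur_pid = pid;
	q->cur_tid = tid;
	q->cur_el = el;
}

/* Drain: sample identity is read from the packet, not from the queue. */
static size_t stamped_drain(struct stamped_queue *q, pid_t *sample_tids)
{
	size_t n = q->count;

	for (size_t i = 0; i < n; i++)
		sample_tids[i] = q->buf[i].tid;
	q->count = 0;
	return n;
}
```

Here a range buffered under tid 100 keeps tid 100 even if the context has moved to tid 200 before the drain, while the queue's current context stays available for instruction fetches.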
>
> Cost: 12 bytes added to struct cs_etm_packet (~12-16 KB per
> packet_queue with CS_ETM_PACKET_MAX_BUFFER=1024), 16 bytes saved per
> cs_etm_traceid_queue (one struct thread * + one ocsd_ex_level).
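
The 12-16 KB figure can be checked with a few lines of arithmetic. This is a sketch under assumptions: pid_t and int are taken to be 4 bytes (as on Linux), the upper bound assumes the struct grows by the 12 bytes rounded up to 8-byte alignment, and owner_stamp, round_up and queue_growth are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define PACKET_SLOTS 1024	/* CS_ETM_PACKET_MAX_BUFFER per the message */

/* The three fields the description adds to each packet. */
struct owner_stamp {
	pid_t pid;
	pid_t tid;
	int el;
};

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
	assert(align != 0);
	return (n + align - 1) / align * align;
}

/* Growth of one packet queue, in bytes, if each packet grows by per_packet. */
static size_t queue_growth(size_t per_packet)
{
	return per_packet * PACKET_SLOTS;
}
```

With these assumptions, queue_growth(sizeof(struct owner_stamp)) is 12 * 1024 bytes (12 KB), and queue_growth(round_up(12, 8)) is 16 * 1024 bytes (16 KB) if padding rounds the growth up.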
>
> A residual gap: cs_etm__copy_insn() reads sample.insn bytes via
> cs_etm__mem_access(), which still uses tidq->thread (the current
> thread), so the inline insn bytes for an outgoing-thread sample may
> be looked up against the wrong address space. Fixing this requires
> threading the packet's owner pid through cs_etm__mem_access and is
> left for a follow-up. sample.ip / sample.pid attribution (what
> AutoFDO/BOLT consume) is correct.
>
Hi Amir,
Can you test the patch here to see if it fixes your issue [1]?
We thought it didn't make sense to store the thread on every packet when
there is only one active thread for the decoder and one for sample
generation. We also fixed the other issue mentioned above about
cs_etm__copy_insn() not working.
Thanks
James
[1]:
https://lore.kernel.org/linux-perf-users/20260526-james-cs-context-tracking…
Fix thread tracking when decoding Coresight trace and add a new test for
it.
Unfortunately the test has to be added as a separate binary like the
other existing Coresight workloads which needs a bit of makefile
boilerplate. Ideally it would be a builtin Perf test workload, but Perf
has a lot of dependencies and generates a lot of trace when starting up
which makes tracing tests very slow because all that has to be decoded.
Hopefully I can find a generic way to enable Perf events at the
beginning of the workload and then I can migrate everything from
tools/perf/tests/shell/coresight to Perf test workloads, but that will
have to be done as a separate change.
Signed-off-by: James Clark <james.clark(a)linaro.org>
---
James Clark (1):
perf test cs-etm: Test thread attribution
Leo Yan (1):
perf cs-etm: Queue context packets for frontend
tools/perf/tests/shell/coresight/Makefile | 1 +
.../shell/coresight/context_switch_loop/.gitignore | 1 +
.../shell/coresight/context_switch_loop/Makefile | 29 ++++
.../context_switch_loop/context_switch_loop.c | 83 ++++++++++
.../tests/shell/coresight/context_switch_thread.sh | 48 ++++++
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 18 +-
tools/perf/util/cs-etm.c | 181 +++++++++++++--------
tools/perf/util/cs-etm.h | 5 +-
8 files changed, 295 insertions(+), 71 deletions(-)
---
base-commit: 09d355618f7ccc27ffc7fc668b2e232872962079
change-id: 20260515-james-cs-context-tracking-fix-754998bae7ed
Best regards,
--
James Clark <james.clark(a)linaro.org>
On 20/05/2026 11:13 am, Jie Gan wrote:
>
>
> On 5/20/2026 5:27 PM, Mike Leach wrote:
>>
>>
>>> -----Original Message-----
>>> From: James Clark <james.clark(a)linaro.org>
>>> Sent: Wednesday, May 20, 2026 9:38 AM
>>> To: Jie Gan <jie.gan(a)oss.qualcomm.com>
>>> Cc: coresight(a)lists.linaro.org; linux-arm-kernel(a)lists.infradead.org;
>>> linux-kernel(a)vger.kernel.org; Suzuki Poulose <Suzuki.Poulose(a)arm.com>; Mike
>>> Leach <Mike.Leach(a)arm.com>; Leo Yan <Leo.Yan(a)arm.com>; Alexander
>>> Shishkin <alexander.shishkin(a)linux.intel.com>; Mathieu Poirier
>>> <mathieu.poirier(a)linaro.org>; Tingwei Zhang
>>> <tingwei.zhang(a)oss.qualcomm.com>; Greg Kroah-Hartman
>>> <gregkh(a)linuxfoundation.org>
>>> Subject: Re: [PATCH] coresight: fix resource leaks on path build failure
>>>
>>>
>>>
>>> On 20/05/2026 2:55 am, Jie Gan wrote:
>>>>
>>>>
>>>> On 5/19/2026 9:57 PM, James Clark wrote:
>>>>>
>>>>>
>>>>> On 13/05/2026 2:32 am, Jie Gan wrote:
>>>>>> Two related leaks when _coresight_build_path() encounters an error
>>>>>> after
>>>>>> coresight_grab_device() has already incremented the pm_runtime,
>>> module,
>>>>>> and device references for a node:
>>>>>>
>>>>>> 1. In _coresight_build_path(), if kzalloc_obj() for the path node
>>>>>> fails
>>>>>> after coresight_grab_device() succeeds,
>>>>>> coresight_drop_device() was
>>>>>> never called, permanently leaking all three references.
>>>>>>
>>>>>> 2. In coresight_build_path(), on failure the partial path was
>>>>>> freed with
>>>>>> kfree(path) instead of coresight_release_path(path). kfree()
>>>>>> only
>>>>>> frees the coresight_path struct itself; it does not iterate
>>>>>> path_list
>>>>>> to call coresight_drop_device() and kfree() for each
>>>>>> coresight_node
>>>>>> already added by deeper recursive calls, leaking both the
>>>>>> pm_runtime,
>>>>>> module, and device references and the node memory for every
>>>>>> element
>>>>>> on the partial path.
>>>>>>
>>>>>> Fix both by adding coresight_drop_device() in the OOM unwind of
>>>>>> _coresight_build_path(), and replacing kfree(path) with
>>>>>> coresight_release_path(path) in coresight_build_path().
>>>>>>
>>>>>> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in
>>>>>> coresight_grab_device()")
>>>>>> Fixes: b3e94405941e ("coresight: associating path with session rather
>>>>>> than tracer")
>>>>>> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
>>>>>> ---
>>>>>> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
>>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/
>>>>>> hwtracing/coresight/coresight-core.c
>>>>>> index 46f247f73cf6..c1354ea8e11d 100644
>>>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>>>> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct
>>>>>> coresight_device *csdev,
>>>>>> return ret;
>>>>>> node = kzalloc_obj(struct coresight_node);
>>>>>> - if (!node)
>>>>>> + if (!node) {
>>>>>> + coresight_drop_device(csdev);
>>>>>> return -ENOMEM;
>>>>>> + }
>>>>>> node->csdev = csdev;
>>>>>> list_add(&node->link, &path->path_list);
>>>>>> @@ -851,7 +853,7 @@ struct coresight_path
>>>>>> *coresight_build_path(struct coresight_device *source,
>>>>>> rc = _coresight_build_path(source, source, sink, path);
>>>>>> if (rc) {
>>>>>> - kfree(path);
>>>>>> + coresight_release_path(path);
>>>>>> return ERR_PTR(rc);
>>>>>> }
>>>>>>
>>>>>> ---
>>>>>> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
>>>>>> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>>>>>>
>>>>>> Best regards,
>>>>>
>>>>> Looks good to me, but sashiko is complaining:
>>>>> https://sashiko.dev/#/patchset/20260513-fix-memory-leak-issue-v1-1-49822d7bc7d4%40oss.qualcomm.com
>>>>>
>>>>> I'm trying to understand why it's saying that, but I think the
>>>>> scenario is that if there are multiple correct paths to a sink, when
>>>>> one path partially fails and a second path succeeds you could get a
>>>>> path_list with some garbage entries in it.
>>>>
>>>> I think coresight_release_path() was added to address this situation.
>>>> We hit a partial path failure, and we need to release all nodes
>>>> already added to the path.
>>>>
>>>
>>> It wouldn't call coresight_release_path() in this case though. If one
>>> path ends up building to success but a parallel path partially failed
>>> then _coresight_build_path() still returns success. During the search it
>>> would have still added the nodes from the partially failed path to the
>>> path_list. This is only an issue if there are multiple correct paths.
>>>
>
> The point here is that there are multiple routes from the same source
> device to the same sink device, am I right?
>
> I have no experience with this scenario. So in this scenario,
> build_path may succeed on one route and fail on another, but in the
> end _coresight_build_path() still returns success, is that correct?
>
>>>>>
>>>>> That's kind of a different and existing issue to the one you've fixed,
>>>>> and assumes that multiple paths to one sink are possible, which I'm
>>>>> not sure is supported?
>>>>
>>>> Each path is unique. We only deal with the issue path for balancing the
>>>> reference count.
>>>>
>>>> Thanks,
>>>> Jie
>>>>
>>>
>>> I'm not exactly sure what you mean by unique, but the same source and
>>> sink could potentially be connected through two different sets of links.
>>>
>>
>> Multiple paths between a source and sink are not permitted under the
>> CoreSight spec.
>>
>
> As Mike mentioned, my understanding is that a source device is only
> allowed to be added to one valid path—this is what I mean by “unique.”
>
> Thanks,
> Jie
>
That's ok then we can ignore this for this patch. But it would be good
to enforce that in _coresight_build_path() with some kind of assert. Or
at least add a comment to appease the AI reviewers.
>> If such a system was to be built - then a fix would need to be in the
>> declaration of connections - e.g. miss one path out in the device tree
>> for example. Not up to the Coresight drivers to handle out of
>> specification hardware.
>>
>> Mike
>>
>>
>>>>>
>>>>> It might be as easy as breaking the loop early for any return value
>>>>> other than -ENODEV, but I'll leave it to you to decide whether to do
>>>>> that here or not.
>>>>>
>>>>> Reviewed-by: James Clark <james.clark(a)linaro.org>
>>>>>
>>>>
>>
>
On 20/05/2026 2:55 am, Jie Gan wrote:
>
>
> On 5/19/2026 9:57 PM, James Clark wrote:
>>
>>
>> On 13/05/2026 2:32 am, Jie Gan wrote:
>>> Two related leaks when _coresight_build_path() encounters an error after
>>> coresight_grab_device() has already incremented the pm_runtime, module,
>>> and device references for a node:
>>>
>>> 1. In _coresight_build_path(), if kzalloc_obj() for the path node fails
>>> after coresight_grab_device() succeeds, coresight_drop_device() was
>>> never called, permanently leaking all three references.
>>>
>>> 2. In coresight_build_path(), on failure the partial path was freed with
>>> kfree(path) instead of coresight_release_path(path). kfree() only
>>> frees the coresight_path struct itself; it does not iterate
>>> path_list
>>> to call coresight_drop_device() and kfree() for each coresight_node
>>> already added by deeper recursive calls, leaking both the
>>> pm_runtime,
>>> module, and device references and the node memory for every element
>>> on the partial path.
>>>
>>> Fix both by adding coresight_drop_device() in the OOM unwind of
>>> _coresight_build_path(), and replacing kfree(path) with
>>> coresight_release_path(path) in coresight_build_path().
>>>
>>> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in
>>> coresight_grab_device()")
>>> Fixes: b3e94405941e ("coresight: associating path with session rather
>>> than tracer")
>>> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
>>> ---
>>> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/
>>> hwtracing/coresight/coresight-core.c
>>> index 46f247f73cf6..c1354ea8e11d 100644
>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct
>>> coresight_device *csdev,
>>> return ret;
>>> node = kzalloc_obj(struct coresight_node);
>>> - if (!node)
>>> + if (!node) {
>>> + coresight_drop_device(csdev);
>>> return -ENOMEM;
>>> + }
>>> node->csdev = csdev;
>>> list_add(&node->link, &path->path_list);
>>> @@ -851,7 +853,7 @@ struct coresight_path
>>> *coresight_build_path(struct coresight_device *source,
>>> rc = _coresight_build_path(source, source, sink, path);
>>> if (rc) {
>>> - kfree(path);
>>> + coresight_release_path(path);
>>> return ERR_PTR(rc);
>>> }
>>>
>>> ---
>>> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
>>> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>>>
>>> Best regards,
>>
>> Looks good to me, but sashiko is complaining:
>> https://sashiko.dev/#/patchset/20260513-fix-memory-leak-issue-v1-1-49822d7bc7d4%40oss.qualcomm.com
>>
>> I'm trying to understand why it's saying that, but I think the
>> scenario is that if there are multiple correct paths to a sink, when
>> one path partially fails and a second path succeeds you could get a
>> path_list with some garbage entries in it.
>
> I think coresight_release_path() was added to address this situation.
> We hit a partial path failure, and we need to release all nodes
> already added to the path.
>
It wouldn't call coresight_release_path() in this case though. If one
path ends up building to success but a parallel path partially failed
then _coresight_build_path() still returns success. During the search it
would have still added the nodes from the partially failed path to the
path_list. This is only an issue if there are multiple correct paths.
>>
>> That's kind of a different and existing issue to the one you've fixed,
>> and assumes that multiple paths to one sink are possible, which I'm
>> not sure is supported?
>
> Each path is unique. We only deal with the issue path for balancing the
> reference count.
>
> Thanks,
> Jie
>
I'm not exactly sure what you mean by unique, but the same source and
sink could potentially be connected through two different sets of links.
>>
>> It might be as easy as breaking the loop early for any return value
>> other than -ENODEV, but I'll leave it to you to decide whether to do
>> that here or not.
>>
>> Reviewed-by: James Clark <james.clark(a)linaro.org>
>>
>
On 13/05/2026 2:32 am, Jie Gan wrote:
> Two related leaks when _coresight_build_path() encounters an error after
> coresight_grab_device() has already incremented the pm_runtime, module,
> and device references for a node:
>
> 1. In _coresight_build_path(), if kzalloc_obj() for the path node fails
> after coresight_grab_device() succeeds, coresight_drop_device() was
> never called, permanently leaking all three references.
>
> 2. In coresight_build_path(), on failure the partial path was freed with
> kfree(path) instead of coresight_release_path(path). kfree() only
> frees the coresight_path struct itself; it does not iterate path_list
> to call coresight_drop_device() and kfree() for each coresight_node
> already added by deeper recursive calls, leaking both the pm_runtime,
> module, and device references and the node memory for every element
> on the partial path.
>
> Fix both by adding coresight_drop_device() in the OOM unwind of
> _coresight_build_path(), and replacing kfree(path) with
> coresight_release_path(path) in coresight_build_path().
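
The two leaks and their fixes can be illustrated with a userspace model of the reference counting. This is a sketch only: model_dev, model_node, model_path and the model_* functions are hypothetical, a single counter stands in for the pm_runtime, module and device references, and the fail_alloc flag simulates the allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct model_dev {
	int refs;	/* stands in for pm_runtime + module + device refs */
};

struct model_node {
	struct model_dev *dev;
	struct model_node *next;
};

struct model_path {
	struct model_node *head;
};

static void model_grab(struct model_dev *d)
{
	d->refs++;
}

static void model_drop(struct model_dev *d)
{
	assert(d->refs > 0);
	d->refs--;
}

/* Grab the device, then allocate its node; fail_alloc simulates OOM. */
static int model_add_node(struct model_path *p, struct model_dev *d,
			  int fail_alloc)
{
	struct model_node *n;

	model_grab(d);
	n = fail_alloc ? NULL : calloc(1, sizeof(*n));
	if (!n) {
		model_drop(d);	/* the unwind: undo the grab before bailing out */
		return -ENOMEM;
	}
	n->dev = d;
	n->next = p->head;
	p->head = n;
	return 0;
}

/* Release every node and its reference, then the path itself. */
static void model_release_path(struct model_path *p)
{
	struct model_node *n = p->head, *next;

	while (n) {
		next = n->next;
		model_drop(n->dev);
		free(n);
		n = next;
	}
	free(p);
}
```

Calling plain free(p) instead of model_release_path(p) in this model would leave refs at 1 for every device already on the list, which mirrors the second leak.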
>
> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in coresight_grab_device()")
> Fixes: b3e94405941e ("coresight: associating path with session rather than tracer")
> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 46f247f73cf6..c1354ea8e11d 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct coresight_device *csdev,
> return ret;
>
> node = kzalloc_obj(struct coresight_node);
> - if (!node)
> + if (!node) {
> + coresight_drop_device(csdev);
> return -ENOMEM;
> + }
>
> node->csdev = csdev;
> list_add(&node->link, &path->path_list);
> @@ -851,7 +853,7 @@ struct coresight_path *coresight_build_path(struct coresight_device *source,
>
> rc = _coresight_build_path(source, source, sink, path);
> if (rc) {
> - kfree(path);
> + coresight_release_path(path);
> return ERR_PTR(rc);
> }
>
>
> ---
> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>
> Best regards,
Looks good to me, but sashiko is complaining:
https://sashiko.dev/#/patchset/20260513-fix-memory-leak-issue-v1-1-49822d7b…
I'm trying to understand why it's saying that, but I think the scenario
is that if there are multiple correct paths to a sink, when one path
partially fails and a second path succeeds you could get a path_list
with some garbage entries in it.
That's kind of a different and existing issue to the one you've fixed,
and assumes that multiple paths to one sink are possible, which I'm not
sure is supported?
It might be as easy as breaking the loop early for any return value
other than -ENODEV, but I'll leave it to you to decide whether to do
that here or not.
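
To make the multi-route scenario concrete, here is a toy depth-first search in userspace C. It is a sketch under assumptions, not the kernel function: toy_dev, toy_path, toy_add and toy_build are hypothetical names, nodes are appended on the way back from the sink, fail_add simulates an allocation failure, and the stop_on_error flag models the "break early on anything other than -ENODEV" idea.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2
#define MAX_PATH 16

struct toy_dev {
	const char *name;
	bool is_sink;
	bool fail_add;	/* simulate allocation failure for this node */
	struct toy_dev *child[MAX_CHILDREN];
};

struct toy_path {
	const struct toy_dev *nodes[MAX_PATH];
	size_t len;
};

/* Append a device to the path, or fail as if the allocation failed. */
static int toy_add(struct toy_path *p, const struct toy_dev *d)
{
	if (d->fail_add)
		return -ENOMEM;
	assert(p->len < MAX_PATH);
	p->nodes[p->len++] = d;
	return 0;
}

/*
 * Search for the sink below d and add nodes while unwinding.
 * stop_on_error == false: keep trying siblings after any failure.
 * stop_on_error == true: only -ENODEV lets the search continue.
 */
static int toy_build(struct toy_dev *d, struct toy_path *p, bool stop_on_error)
{
	bool found = false;
	int ret;

	if (d->is_sink)
		return toy_add(p, d);

	for (size_t i = 0; i < MAX_CHILDREN && d->child[i]; i++) {
		ret = toy_build(d->child[i], p, stop_on_error);
		if (ret == 0) {
			found = true;
			break;
		}
		if (stop_on_error && ret != -ENODEV)
			return ret;
	}
	if (!found)
		return -ENODEV;
	return toy_add(p, d);
}
```

With a source connected to the sink through both b and c, and b failing after the sink was already added, the permissive search returns success with the sink listed twice. The strict variant returns -ENOMEM with the one stale entry still on the list, so the caller's release of the path can clean it up.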
Reviewed-by: James Clark <james.clark(a)linaro.org>
On 19/05/2026 09:43, Ma Ke wrote:
> bus_find_device() returns a device with its reference count
> incremented. coresight_get_sink_by_id() only uses the returned device
> to find the matching CoreSight sink by id and does not need to
> transfer this lookup reference to its callers.
>
> Keeping the reference forces callers such as etm_setup_aux() to know
> about the internal lookup implementation and to drop the reference
> themselves. This is error-prone and led to a leaked reference when a
> user-selected sink is used for perf AUX tracing.
>
> Drop the reference inside coresight_get_sink_by_id() after converting
> the device to the corresponding coresight_device. The CoreSight path
> code takes device references it needs when building/using the path.
>
> Found by code review.
Thanks for the report. But..
>
> Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
> Cc: stable(a)vger.kernel.org
> Fixes: 226443925887 ("coresight: Use event attributes for sink selection")
I would rather drop the reference in etm_setup_aux(), to make sure we
are still dealing with a valid device that has not been removed from
under our feet.
Suzuki
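
A minimal userspace sketch of the caller-side pattern suggested here, assuming a lookup that returns its result with a reference held (as the quoted description says bus_find_device() does). All names (toy_device, toy_get, toy_put, toy_find, toy_setup) are hypothetical; this is not the CoreSight code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_device {
	unsigned int id;
	int refcount;
	bool released;	/* set once the last reference is dropped */
};

static struct toy_device *toy_get(struct toy_device *d)
{
	d->refcount++;
	return d;
}

static void toy_put(struct toy_device *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->released = true;
}

/* Lookup by id; a match is returned with an extra reference held. */
static struct toy_device *toy_find(struct toy_device *devs, size_t n,
				   unsigned int id)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].id == id && !devs[i].released)
			return toy_get(&devs[i]);
	}
	return NULL;
}

/* Caller keeps the lookup reference while using the device, then drops it. */
static int toy_setup(struct toy_device *devs, size_t n, unsigned int id,
		     int *refs_while_in_use)
{
	struct toy_device *sink = toy_find(devs, n, id);

	if (!sink)
		return -ENODEV;

	/* The lookup reference pins the device for the duration of the work. */
	*refs_while_in_use = sink->refcount;

	toy_put(sink);
	return 0;
}
```

The point of the design is that the reference is dropped only after the caller has finished with the device, so the device cannot be released in between the lookup and its use.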
> ---
> drivers/hwtracing/coresight/coresight-core.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 46f247f73cf6..2cca4ed83e2c 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -624,11 +624,24 @@ static int coresight_sink_by_id(struct device *dev, const void *data)
> struct coresight_device *coresight_get_sink_by_id(u32 id)
> {
> struct device *dev = NULL;
> + struct coresight_device *csdev;
>
> dev = bus_find_device(&coresight_bustype, NULL, &id,
> coresight_sink_by_id);
> + if (!dev)
> + return NULL;
> +
> + csdev = to_coresight_device(dev);
> +
> + /*
> + * bus_find_device() returns a device with its reference count
> + * incremented. coresight_get_sink_by_id() only performs a lookup;
> + * the CoreSight path code takes the references it needs when the
> + * path is built, so drop the lookup reference here.
> + */
> + put_device(dev);
>
> - return dev ? to_coresight_device(dev) : NULL;
> + return csdev;
> }
>
> /**
This series focuses on CoreSight path power management. The changes can
be divided into four parts for review:
Patches 01 - 10: Preparation for CPU PM:
Fix source disabling on idr_alloc failure.
Fix helper enable failure handling.
Refactor CPU ID stored in csdev.
Move CPU lock to sysfs layer.
Move per-CPU source pointer from etm-perf to core layer.
Refactor etm-perf to retrieve the source via the per-CPU event
data (lockless) and take a source reference during AUX
setup.
Patches 11 - 13: Refactor CPU idle flow managed in the CoreSight core
layer.
Patches 14 - 23: Refactor path enable / disable with range, control path
during CPU idle.
Patches 24 - 25: Support the sink (TRBE) control during CPU idle.
Patches 26 - 28: Move CPU hotplug into the core layer, and fix sysfs
mode for hotplug.
This series is rebased on the coresight-next branch and has been verified
on Juno-r2 (ETM + ETR) and FVP RevC (ETE + TRBE). Built successfully
for armv7 (ARCH=arm).
---
Changes in v14:
- Fixed percpu_pm_failed write with per_cpu(percpu_pm_failed, cpu).
- Link to v13: https://lore.kernel.org/r/20260515-arm_coresight_path_power_management_impr…
Changes in v13:
- Cleared percpu_pm_failed flag when source is unregistered (Suzuki).
- Rebased on latest coresight-next.
- Link to v12: https://lore.kernel.org/r/20260511-arm_coresight_path_power_management_impr…
Changes in v12:
- Added comments on coresight_{get|put}_percpu_source_ref (Suzuki).
- Refined failure handling in path enable (Suzuki).
- Added coresight_is_software_source() helper (Suzuki).
- Reordered taking ref on csdev and its parent in patch 07.
- Define the enum mode with bit flags.
- Minor improvements on commit logs.
- Rebased on latest coresight-next.
- Link to v11: https://lore.kernel.org/r/20260501-arm_coresight_path_power_management_impr…
Changes in v11:
- Moved per-CPU source pointer from etm-perf to core (Suzuki).
- Added grabbing/ungrabbing csdev for device reference (Suzuki).
- Minor refine for error handling and logs in CPU PM (James).
- Refactored etm-perf with fetching path/source from event data (Suzuki).
- Fixed Helper error handling (sashiko).
- Added Jie's test tag (thanks!).
- Minor improvement for comments and commit logs.
- Link to v10: https://lore.kernel.org/r/20260405-arm_coresight_path_power_management_impr…
Changes in v10:
- Removed redundant checks in ETMv4 PM callbacks (sashiko).
- Added a new const structure etm4_cs_pm_ops (sashiko).
- Used fine-grained spinlock on sysfs_active_config (sashiko).
- Blocked notification after failures in save / restore to avoid lockups.
- Changed CPUHP_AP_ARM_CORESIGHT_STARTING to
CPUHP_AP_ARM_CORESIGHT_ONLINE so that the CPU hotplug callback runs in
the thread context (sashiko).
- Link to v9: https://lore.kernel.org/r/20260401-arm_coresight_path_power_management_impr…
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
Jie Gan (1):
coresight: Fix source not disabled on idr_alloc_u32 failure
Leo Yan (26):
coresight: Handle helper enable failure properly
coresight: Extract device init into coresight_init_device()
coresight: Populate CPU ID into coresight_device
coresight: Remove .cpu_id() callback from source ops
coresight: Take hotplug lock in enable_source_store() for Sysfs mode
coresight: perf: Retrieve path and source from event data
coresight: Take a reference on csdev
coresight: Move per-CPU source pointer to core layer
coresight: Take per-CPU source reference during AUX setup
coresight: Register CPU PM notifier in core layer
coresight: etm4x: Hook CPU PM callbacks
coresight: etm4x: Remove redundant checks in PM save and restore
coresight: syscfg: Use IRQ-safe spinlock to protect active variables
coresight: Disable source helpers in coresight_disable_path()
coresight: Control path with range
coresight: Use helpers to fetch first and last nodes
coresight: Introduce coresight_enable_source() helper
coresight: Save active path for system tracers
coresight: etm4x: Set active path on target CPU
coresight: etm3x: Set active path on target CPU
coresight: sysfs: Use source's path pointer for path control
coresight: Control path during CPU idle
coresight: Add PM callbacks for sink device
coresight: sysfs: Increment refcount only for software source
coresight: Move CPU hotplug callbacks to core layer
coresight: sysfs: Validate CPU online status for per-CPU sources
Yabin Cui (1):
coresight: trbe: Save and restore state across CPU low power state
drivers/hwtracing/coresight/coresight-catu.c | 2 +-
drivers/hwtracing/coresight/coresight-core.c | 551 +++++++++++++++++++--
drivers/hwtracing/coresight/coresight-cti-core.c | 9 +-
drivers/hwtracing/coresight/coresight-etm-perf.c | 288 ++++++-----
drivers/hwtracing/coresight/coresight-etm3x-core.c | 73 +--
drivers/hwtracing/coresight/coresight-etm4x-core.c | 166 ++-----
drivers/hwtracing/coresight/coresight-priv.h | 6 +
drivers/hwtracing/coresight/coresight-syscfg.c | 38 +-
drivers/hwtracing/coresight/coresight-syscfg.h | 2 +
drivers/hwtracing/coresight/coresight-sysfs.c | 131 ++---
drivers/hwtracing/coresight/coresight-trbe.c | 61 ++-
include/linux/coresight.h | 27 +-
include/linux/cpuhotplug.h | 2 +-
13 files changed, 874 insertions(+), 482 deletions(-)
---
base-commit: f4526ffee6ff9f5845b430957417149eded74bf3
change-id: 20251104-arm_coresight_path_power_management_improvement-dab4966f8280
Best regards,
--
Leo Yan <leo.yan(a)arm.com>
This series focuses on CoreSight path power management. The changes can
be divided into four parts for review:
Patches 01 - 10: Preparation for CPU PM:
Fix source disabling on idr_alloc failure.
Fix helper enable failure handling.
Refactor CPU ID stored in csdev.
Move CPU lock to sysfs layer.
Move per-CPU source pointer from etm-perf to core layer.
Refactor etm-perf to retrieve the source via the per-CPU event
data (lockless) and take a source reference during AUX
setup.
Patches 11 - 13: Refactor CPU idle flow managed in the CoreSight core
layer.
Patches 14 - 23: Refactor path enable / disable with range, control path
during CPU idle.
Patches 24 - 25: Support the sink (TRBE) control during CPU idle.
Patches 26 - 28: Move CPU hotplug into the core layer, and fix sysfs
mode for hotplug.
This series is rebased on the coresight-next branch and has been verified
on Juno-r2 (ETM + ETR) and FVP RevC (ETE + TRBE). Built successfully
for armv7 (ARCH=arm).
---
Changes in v13:
- Cleared percpu_pm_failed flag when source is unregistered (Suzuki).
- Rebased on latest coresight-next.
- Link to v12: https://lore.kernel.org/r/20260511-arm_coresight_path_power_management_impr…
Changes in v12:
- Added comments on coresight_{get|put}_percpu_source_ref (Suzuki).
- Refined failure handling in path enable (Suzuki).
- Added coresight_is_software_source() helper (Suzuki).
- Reordered taking ref on csdev and its parent in patch 07.
- Define the enum mode with bit flags.
- Minor improvements on commit logs.
- Rebased on latest coresight-next.
- Link to v11: https://lore.kernel.org/r/20260501-arm_coresight_path_power_management_impr…
Changes in v11:
- Moved per-CPU source pointer from etm-perf to core (Suzuki).
- Added grabbing/ungrabbing csdev for device reference (Suzuki).
- Minor refine for error handling and logs in CPU PM (James).
- Refactored etm-perf with fetching path/source from event data (Suzuki).
- Fixed Helper error handling (sashiko).
- Added Jie's test tag (thanks!).
- Minor improvement for comments and commit logs.
- Link to v10: https://lore.kernel.org/r/20260405-arm_coresight_path_power_management_impr…
Changes in v10:
- Removed redundant checks in ETMv4 PM callbacks (sashiko).
- Added a new const structure etm4_cs_pm_ops (sashiko).
- Used fine-grained spinlock on sysfs_active_config (sashiko).
- Blocked notification after failures in save / restore to avoid lockups.
- Changed CPUHP_AP_ARM_CORESIGHT_STARTING to
CPUHP_AP_ARM_CORESIGHT_ONLINE so that the CPU hotplug callback runs in
the thread context (sashiko).
- Link to v9: https://lore.kernel.org/r/20260401-arm_coresight_path_power_management_impr…
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
Jie Gan (1):
coresight: Fix source not disabled on idr_alloc_u32 failure
Leo Yan (26):
coresight: Handle helper enable failure properly
coresight: Extract device init into coresight_init_device()
coresight: Populate CPU ID into coresight_device
coresight: Remove .cpu_id() callback from source ops
coresight: Take hotplug lock in enable_source_store() for Sysfs mode
coresight: perf: Retrieve path and source from event data
coresight: Take a reference on csdev
coresight: Move per-CPU source pointer to core layer
coresight: Take per-CPU source reference during AUX setup
coresight: Register CPU PM notifier in core layer
coresight: etm4x: Hook CPU PM callbacks
coresight: etm4x: Remove redundant checks in PM save and restore
coresight: syscfg: Use IRQ-safe spinlock to protect active variables
coresight: Disable source helpers in coresight_disable_path()
coresight: Control path with range
coresight: Use helpers to fetch first and last nodes
coresight: Introduce coresight_enable_source() helper
coresight: Save active path for system tracers
coresight: etm4x: Set active path on target CPU
coresight: etm3x: Set active path on target CPU
coresight: sysfs: Use source's path pointer for path control
coresight: Control path during CPU idle
coresight: Add PM callbacks for sink device
coresight: sysfs: Increment refcount only for software source
coresight: Move CPU hotplug callbacks to core layer
coresight: sysfs: Validate CPU online status for per-CPU sources
Yabin Cui (1):
coresight: trbe: Save and restore state across CPU low power state
drivers/hwtracing/coresight/coresight-catu.c | 2 +-
drivers/hwtracing/coresight/coresight-core.c | 551 +++++++++++++++++++--
drivers/hwtracing/coresight/coresight-cti-core.c | 9 +-
drivers/hwtracing/coresight/coresight-etm-perf.c | 288 ++++++-----
drivers/hwtracing/coresight/coresight-etm3x-core.c | 73 +--
drivers/hwtracing/coresight/coresight-etm4x-core.c | 166 ++-----
drivers/hwtracing/coresight/coresight-priv.h | 6 +
drivers/hwtracing/coresight/coresight-syscfg.c | 38 +-
drivers/hwtracing/coresight/coresight-syscfg.h | 2 +
drivers/hwtracing/coresight/coresight-sysfs.c | 131 ++---
drivers/hwtracing/coresight/coresight-trbe.c | 61 ++-
include/linux/coresight.h | 27 +-
include/linux/cpuhotplug.h | 2 +-
13 files changed, 874 insertions(+), 482 deletions(-)
---
base-commit: f4526ffee6ff9f5845b430957417149eded74bf3
change-id: 20251104-arm_coresight_path_power_management_improvement-dab4966f8280
Best regards,
--
Leo Yan <leo.yan(a)arm.com>
This series focuses on CoreSight path power management. The changes can
be divided into five parts for review:
  Patches 01 - 10: Preparation for CPU PM:
    Fix source disabling on idr_alloc failure.
    Fix helper enable failure handling.
    Refactor the CPU ID to be stored in csdev.
    Move CPU lock to sysfs layer.
    Move per-CPU source pointer from etm-perf to core layer.
    Refactor etm-perf to retrieve the source from per-CPU event
    data for lockless access, and take a source reference during
    AUX setup.
  Patches 11 - 13: Refactor the CPU idle flow so that it is managed in
    the CoreSight core layer.
  Patches 14 - 23: Refactor path enable / disable with range, and control
    the path during CPU idle.
  Patches 24 - 25: Support sink (TRBE) control during CPU idle.
  Patches 26 - 28: Move CPU hotplug into the core layer, and fix sysfs
    mode for hotplug.
This series is rebased on the coresight-next branch and has been verified
on Juno-r2 (ETM + ETR) and FVP RevC (ETE + TRBE). Built successfully
for armv7 (ARCH=arm).
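As a rough illustration of the "path enable / disable with range" idea
and the rollback on enable failure, here is a minimal userspace sketch.
It is not code from the series: every demo_* name, the array-based path
layout, and the sink-to-source enable order are assumptions made only for
this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one component (source, link or sink) on a path. */
struct demo_node {
	const char *name;
	bool enabled;
	bool fail_on_enable;	/* test hook: simulate an enable failure */
};

/* Hypothetical path: nodes[0] is the source, nodes[nr - 1] is the sink. */
struct demo_path {
	struct demo_node *nodes;
	size_t nr;
};

static int demo_enable_node(struct demo_node *n)
{
	if (n->fail_on_enable)
		return -1;
	n->enabled = true;
	return 0;
}

/* Disable nodes[first..last], walking from the source side to the sink. */
static void demo_disable_range(struct demo_path *p, size_t first, size_t last)
{
	assert(first <= last && last < p->nr);
	for (size_t i = first; i <= last; i++)
		p->nodes[i].enabled = false;
}

/*
 * Enable nodes[first..last], walking from the sink side back to the source
 * so that downstream components are ready before data flows. On failure,
 * only the nodes enabled by this call are rolled back; nodes outside the
 * range are left untouched.
 */
static int demo_enable_range(struct demo_path *p, size_t first, size_t last)
{
	assert(first <= last && last < p->nr);
	for (size_t i = last + 1; i-- > first; ) {
		if (demo_enable_node(&p->nodes[i]) < 0) {
			if (i < last)
				demo_disable_range(p, i + 1, last);
			return -1;
		}
	}
	return 0;
}
```

In this model, a CPU idle entry would call demo_disable_range() on the
part of the path owned by the CPU and the idle exit would call
demo_enable_range() on the same range, leaving shared components outside
the range alone.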
---
Changes in v12:
- Added comments on coresight_{get|put}_percpu_source_ref (Suzuki).
- Refined failure handling in path enable (Suzuki).
- Added coresight_is_software_source() helper (Suzuki).
- Reordered taking ref on csdev and its parent in patch 07.
- Defined the enum mode with bit flags.
- Minor improvements to commit logs.
- Rebased on latest coresight-next.
- Link to v11: https://lore.kernel.org/r/20260501-arm_coresight_path_power_management_impr…
Changes in v11:
- Moved per-CPU source pointer from etm-perf to core (Suzuki).
- Added grabbing/ungrabbing csdev for device reference (Suzuki).
- Minor refinements to error handling and logs in CPU PM (James).
- Refactored etm-perf with fetching path/source from event data (Suzuki).
- Fixed helper error handling (sashiko).
- Added Jie's test tag (thanks!).
- Minor improvements to comments and commit logs.
- Link to v10: https://lore.kernel.org/r/20260405-arm_coresight_path_power_management_impr…
Changes in v10:
- Removed redundant checks in ETMv4 PM callbacks (sashiko).
- Added a new const structure etm4_cs_pm_ops (sashiko).
- Used fine-grained spinlock on sysfs_active_config (sashiko).
- Blocked notification after failures in save / restore to avoid lockups.
- Changed CPUHP_AP_ARM_CORESIGHT_STARTING to
CPUHP_AP_ARM_CORESIGHT_ONLINE so that the CPU hotplug callback runs in
thread context (sashiko).
- Link to v9: https://lore.kernel.org/r/20260401-arm_coresight_path_power_management_impr…
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
Jie Gan (1):
coresight: Fix source not disabled on idr_alloc_u32 failure
Leo Yan (26):
coresight: Handle helper enable failure properly
coresight: Extract device init into coresight_init_device()
coresight: Populate CPU ID into coresight_device
coresight: Remove .cpu_id() callback from source ops
coresight: Take hotplug lock in enable_source_store() for Sysfs mode
coresight: perf: Retrieve path and source from event data
coresight: Take a reference on csdev
coresight: Move per-CPU source pointer to core layer
coresight: Take per-CPU source reference during AUX setup
coresight: Register CPU PM notifier in core layer
coresight: etm4x: Hook CPU PM callbacks
coresight: etm4x: Remove redundant checks in PM save and restore
coresight: syscfg: Use IRQ-safe spinlock to protect active variables
coresight: Disable source helpers in coresight_disable_path()
coresight: Control path with range
coresight: Use helpers to fetch first and last nodes
coresight: Introduce coresight_enable_source() helper
coresight: Save active path for system tracers
coresight: etm4x: Set active path on target CPU
coresight: etm3x: Set active path on target CPU
coresight: sysfs: Use source's path pointer for path control
coresight: Control path during CPU idle
coresight: Add PM callbacks for sink device
coresight: sysfs: Increment refcount only for software source
coresight: Move CPU hotplug callbacks to core layer
coresight: sysfs: Validate CPU online status for per-CPU sources
Yabin Cui (1):
coresight: trbe: Save and restore state across CPU low power state
drivers/hwtracing/coresight/coresight-catu.c | 2 +-
drivers/hwtracing/coresight/coresight-core.c | 548 ++++++++++++++++++---
drivers/hwtracing/coresight/coresight-cti-core.c | 9 +-
drivers/hwtracing/coresight/coresight-etm-perf.c | 285 ++++++-----
drivers/hwtracing/coresight/coresight-etm3x-core.c | 73 +--
drivers/hwtracing/coresight/coresight-etm4x-core.c | 166 ++-----
drivers/hwtracing/coresight/coresight-priv.h | 6 +
drivers/hwtracing/coresight/coresight-syscfg.c | 38 +-
drivers/hwtracing/coresight/coresight-syscfg.h | 2 +
drivers/hwtracing/coresight/coresight-sysfs.c | 131 ++---
drivers/hwtracing/coresight/coresight-trbe.c | 61 ++-
include/linux/coresight.h | 27 +-
include/linux/cpuhotplug.h | 2 +-
13 files changed, 870 insertions(+), 480 deletions(-)
---
base-commit: 0ec0a8785d21f63db520bd9d2a67c55e855d36a8
change-id: 20251104-arm_coresight_path_power_management_improvement-dab4966f8280
Best regards,
--
Leo Yan <leo.yan(a)arm.com>