On 03/03/2022 11:36, Jiri Olsa wrote:
> On Wed, Mar 02, 2022 at 04:20:00PM +0000, James Clark wrote:
>>
>>
>> On 27/02/2022 22:50, Jiri Olsa wrote:
>>> On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
>>>> MMAP records that occur after the build-id header is parsed do not have
>>>> their build-id set even if the filename matches an entry from the
>>>> header. Set the build-id on these dsos as long as the MMAP record
>>>> doesn't have its own build-id set.
>>>>
>>>> This fixes an issue with off target analysis where the local version of
>>>> a dso is loaded rather than one from ~/.debug via a build-id.
>>>
>>> nice catch :)
>>>
>>>>
>>>> Reported-by: Denis Nikitin <denik(a)chromium.org>
>>>> Signed-off-by: James Clark <james.clark(a)arm.com>
>>>> ---
>>>> tools/perf/util/dso.h | 1 +
>>>> tools/perf/util/header.c | 1 +
>>>> tools/perf/util/map.c | 16 ++++++++++++++--
>>>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>>>> index 011da3924fc1..3a9fd4d389b5 100644
>>>> --- a/tools/perf/util/dso.h
>>>> +++ b/tools/perf/util/dso.h
>>>> @@ -167,6 +167,7 @@ struct dso {
>>>> enum dso_load_errno load_errno;
>>>> u8 adjust_symbols:1;
>>>> u8 has_build_id:1;
>>>> + u8 header_build_id:1;
>>>> u8 has_srcline:1;
>>>> u8 hit:1;
>>>> u8 annotate_warned:1;
>>>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>>>> index 6da12e522edc..571d73d4f976 100644
>>>> --- a/tools/perf/util/header.c
>>>> +++ b/tools/perf/util/header.c
>>>> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
>>>>
>>>> build_id__init(&bid, bev->data, size);
>>>> dso__set_build_id(dso, &bid);
>>>> + dso->header_build_id = 1;
>>>>
>>>> if (dso_space != DSO_SPACE__USER) {
>>>> struct kmod_path m = { .name = NULL, };
>>>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>>>> index 1803d3887afe..4ae91e491e23 100644
>>>> --- a/tools/perf/util/map.c
>>>> +++ b/tools/perf/util/map.c
>>>> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>>>
>>>> if (map != NULL) {
>>>> char newfilename[PATH_MAX];
>>>> - struct dso *dso;
>>>> + struct dso *dso, *header_bid_dso;
>>>> int anon, no_dso, vdso, android;
>>>>
>>>> android = is_android_lib(filename);
>>>> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>>>
>>>> if (build_id__is_defined(bid))
>>>> dso__set_build_id(dso, bid);
>>>> -
>>>> + else {
>>>
>>> nit please add { } to the if clause as well
>>>
>>>> + /*
>>>> + * If the mmap event had no build ID, search for an existing dso from the
>>>> + * build ID header by name. Otherwise only the dso loaded at the time of
>>>> + * reading the header will have the build ID set and all future mmaps will
>>>> + * have it missing.
>>>> + */
>>>> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>>>
>>> is this 'perf top' safe? I think dso should be added in the
>>> same thread, but please check and add comment why we don't
>>> need locking in here
>>
>> Seems like there are multiple synthesize_threads_workers using the same machine->dsos object so
>> I think locking is needed.
>>
>> At first I thought of doing this:
>>
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index 4ae91e491e23..b87b81e3d41c 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -192,7 +192,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>> * reading the header will have the build ID set and all future mmaps will
>> * have it missing.
>> */
>> + down_read(&machine->dsos.lock);
>> header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>> + up_read(&machine->dsos.lock);
>> if (header_bid_dso && header_bid_dso->header_build_id) {
>> dso__set_build_id(dso, &header_bid_dso->bid);
>> dso->header_build_id = 1;
>>
>> But then I was wondering why it doesn't need a write lock all the way from machine__findnew_dso_id() to
>> dso__put()? At the moment there are writes to the dso like dso__set_loaded(), dso->nsinfo = nsi and
>> dso__set_build_id(), so another thread could find the dso in a partially constructed state.
>
> I think that's fine, machine__findnew_dso_id takes the dso ref so
> the real 'release' is when the machine object goes down at dsos__purge
>
>>
>> Not sure if this is an issue currently without my patch, but at least with it they would have to be found
>> with header_build_id already set to 1 otherwise it will mess things up.
>
> as for the partial changes I think it's also fine, because it happens
> at separate places.. we'd need to investigate specific example to see
> if there's a problem
>
>>
>> Extending the write lock outside of machine__findnew_dso_id() is difficult because it already
>> releases it before it returns. Does it need to be changed so that machine__findnew_dso_id() takes all the
>> arguments needed to construct it inside the lock?
>
> the change above should do it.. as for top threads I don't think
> it actually matters because there's events processing and hists
> processing, and I did not find them clashing in this.. but there's
> some setup session code that touches this, so it's better to be safe
Ok yes I think I agree. The header flag isn't important for perf top so having it all
constructed in place isn't necessary.
I've submitted v2 with this change.
Thanks for the review.
>
> thanks,
> jirka
Changes since v1:
* Add read lock around dso find
* Bracket style fix
Hi,
We are seeing an issue with doing Coresight decode off target where
initially the correct dso from ~/.debug is used, but after a new thread
in the perf.data file is passed with its mmap record, the local version
of the dso is picked up instead. This happens if the binary exists in the
same path on both devices, for example /bin/ls.
Initially when parsing the build-ids in the header, the dso for /bin/ls
will be created, and the file will correctly point to
~/.debug/bin/ls/2f15ad836be3339dec0e2e6a3c637e08e48aacbd/elf, but for any
new threads or mmaps that are also for /bin/ls, they will not have a
build-id set so they point to /bin/ls on the local machine rather than the
debug folder.
To fix this I made it possible to look up which existing dsos have
build id's set that originate from the header and then copy that build-id
onto the new dso if the name matches. Another way to do it would
be to stop comparing the mmap id so it matches on filename only, but I
think we do want to differentiate between different mmaps, even if they
have the same name, which is how it works in this version.
Applies to perf/core 56dce8681
James Clark (1):
perf: Set build-id using build-id header on new mmap records
tools/perf/util/dso.h | 1 +
tools/perf/util/header.c | 1 +
tools/perf/util/map.c | 20 +++++++++++++++++---
3 files changed, 19 insertions(+), 3 deletions(-)
--
2.28.0
On 27/02/2022 22:50, Jiri Olsa wrote:
> On Thu, Feb 24, 2022 at 05:19:55PM +0000, James Clark wrote:
>> MMAP records that occur after the build-id header is parsed do not have
>> their build-id set even if the filename matches an entry from the
>> header. Set the build-id on these dsos as long as the MMAP record
>> doesn't have its own build-id set.
>>
>> This fixes an issue with off target analysis where the local version of
>> a dso is loaded rather than one from ~/.debug via a build-id.
>
> nice catch :)
>
>>
>> Reported-by: Denis Nikitin <denik(a)chromium.org>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> tools/perf/util/dso.h | 1 +
>> tools/perf/util/header.c | 1 +
>> tools/perf/util/map.c | 16 ++++++++++++++--
>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index 011da3924fc1..3a9fd4d389b5 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -167,6 +167,7 @@ struct dso {
>> enum dso_load_errno load_errno;
>> u8 adjust_symbols:1;
>> u8 has_build_id:1;
>> + u8 header_build_id:1;
>> u8 has_srcline:1;
>> u8 hit:1;
>> u8 annotate_warned:1;
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index 6da12e522edc..571d73d4f976 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -2200,6 +2200,7 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev,
>>
>> build_id__init(&bid, bev->data, size);
>> dso__set_build_id(dso, &bid);
>> + dso->header_build_id = 1;
>>
>> if (dso_space != DSO_SPACE__USER) {
>> struct kmod_path m = { .name = NULL, };
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index 1803d3887afe..4ae91e491e23 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -127,7 +127,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>
>> if (map != NULL) {
>> char newfilename[PATH_MAX];
>> - struct dso *dso;
>> + struct dso *dso, *header_bid_dso;
>> int anon, no_dso, vdso, android;
>>
>> android = is_android_lib(filename);
>> @@ -185,7 +185,19 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>>
>> if (build_id__is_defined(bid))
>> dso__set_build_id(dso, bid);
>> -
>> + else {
>
> nit please add { } to the if clause as well
>
>> + /*
>> + * If the mmap event had no build ID, search for an existing dso from the
>> + * build ID header by name. Otherwise only the dso loaded at the time of
>> + * reading the header will have the build ID set and all future mmaps will
>> + * have it missing.
>> + */
>> + header_bid_dso = __dsos__find(&machine->dsos, filename, false);
>
> is this 'perf top' safe? I think dso should be added in the
> same thread, but please check and add comment why we don't
> need locking in here
Seems like there are multiple synthesize_threads_workers using the same machine->dsos object so
I think locking is needed.
At first I thought of doing this:
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 4ae91e491e23..b87b81e3d41c 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -192,7 +192,9 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
* reading the header will have the build ID set and all future mmaps will
* have it missing.
*/
+ down_read(&machine->dsos.lock);
header_bid_dso = __dsos__find(&machine->dsos, filename, false);
+ up_read(&machine->dsos.lock);
if (header_bid_dso && header_bid_dso->header_build_id) {
dso__set_build_id(dso, &header_bid_dso->bid);
dso->header_build_id = 1;
But then I was wondering why it doesn't need a write lock all the way from machine__findnew_dso_id() to
dso__put()? At the moment there are writes to the dso like dso__set_loaded(), dso->nsinfo = nsi and
dso__set_build_id(), so another thread could find the dso in a partially constructed state.
Not sure if this is an issue currently without my patch, but at least with it they would have to be found
with header_build_id already set to 1 otherwise it will mess things up.
Extending the write lock outside of machine__findnew_dso_id() is difficult because it already
releases it before it returns. Does it need to be changed so that machine__findnew_dso_id() takes all the
arguments needed to construct it inside the lock?
James
>
> thanks,
> jirka
>
>> + if (header_bid_dso && header_bid_dso->header_build_id) {
>> + dso__set_build_id(dso, &header_bid_dso->bid);
>> + dso->header_build_id = 1;
>> + }
>> + }
>> dso__put(dso);
>> }
>> return map;
>> --
>> 2.28.0
>>
On 02/03/2022 08:01, Mao Jinlong wrote:
> It is possibe that probe failure issue happens when the device
> and its child_device's probe happens at the same time.
> In coresight_make_links, has_conns_grp is true for parent, but
> has_conns_grp is false for child device as has_conns_grp is set
> to true in coresight_create_conns_sysfs_group. The probe of parent
> device will fail at this condition. Add has_conns_grp check for
> child device before make the links and make the process from
> device_register to connection_create be atomic to avoid this
> probe failure issue.
>
> Suggested-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
> Suggested-by: Mike Leach <mike.leach(a)linaro.org>
> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 88653d1c06a4..7ce78dddfe31 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -1382,7 +1382,7 @@ static int coresight_fixup_device_conns(struct coresight_device *csdev)
> continue;
> conn->child_dev =
> coresight_find_csdev_by_fwnode(conn->child_fwnode);
> - if (conn->child_dev) {
> + if (conn->child_dev && conn->child_dev->has_conns_grp) {
> ret = coresight_make_links(csdev, conn,
> conn->child_dev);
> if (ret)
> @@ -1619,6 +1619,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> csdev->dev.fwnode = fwnode_handle_get(dev_fwnode(desc->dev));
> dev_set_name(&csdev->dev, "%s", desc->name);
>
> + mutex_lock(&coresight_mutex);
> ret = device_register(&csdev->dev);
> if (ret) {
> put_device(&csdev->dev);
> @@ -1645,8 +1646,6 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> }
> }
>
> - mutex_lock(&coresight_mutex);
> -
> ret = coresight_create_conns_sysfs_group(csdev);
> if (!ret)
> ret = coresight_fixup_device_conns(csdev);
> @@ -1666,6 +1665,7 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
> err_free_csdev:
> kfree(csdev);
> err_out:
> + mutex_unlock(&coresight_mutex);
This appears to be wrong. We may do an unlock when we didn't lock it in
the first place.
Please could you double check. Also, I think it may be neater to move
the kfree(csdev) to the only user and return straight away from there.
Cheers
Suzuki
> /* Cleanup the connection information */
> coresight_release_platform_data(NULL, desc->pdata);
> return ERR_PTR(ret);
Hi
On 01/03/2022 13:30, Jinlong Mao wrote:
> Hi Mike,
>
> On 3/1/2022 9:15 PM, Mike Leach wrote:
>> Hi,
>>
>> On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao(a)quicinc.com>
>> wrote:
>>> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>>>
...
>>>
>>> Hi Suzuki,
>>>
>>> This issue happens when race condition happens.
>>> The condition is that the device and its child_device's probe happens
>>> at the same time.
>>>
>>> For example: device0 and its child device device1.
>>> Both of them are calling coresight_register function. device0 is
>>> calling coresight_fixup_device_conns.
>>> device1 is waiting for device0 to release the coresight_mutex.
>>> Because device1's csdev node is allocated,
>>> coresight_make_links will be called for device0. Then in
>>> coresight_add_sysfs_link, has_conns_grp is true
>>> for device0, but has_conns_grp is false for device1 as has_conns_grp
>>> is set to true in coresight_create_conns_sysfs_group .
>>> The probe of device0 will fail for at this condition.
>>>
>>>
>>> struct coresight_device *coresight_register(struct coresight_desc *desc)
>>> {
>>> .........
>>> mutex_lock(&coresight_mutex);
>>>
>>> ret = coresight_create_conns_sysfs_group(csdev);
>>> if (!ret)
>>> ret = coresight_fixup_device_conns(csdev);
>>> if (!ret)
>>> ret = coresight_fixup_orphan_conns(csdev);
>>> if (!ret && cti_assoc_ops && cti_assoc_ops->add)
>>> cti_assoc_ops->add(csdev);
>>>
>>> mutex_unlock(&coresight_mutex);
>>>
>>> .........
>>>
>>> }
>>>
>>> static int coresight_fixup_device_conns(struct coresight_device *csdev)
>>> {
>>> ..........
>>> conn->child_dev =
>>> coresight_find_csdev_by_fwnode(conn->child_fwnode);
>> The issue appears to be a constraint hidden in the lower layers of the
>> code.
>> Would a better solution not be to alter the code here:
>>
>> if (conn->child_dev && conn->child_dev->has_conns_grp) {
>> ...
>> } else {
>> csdev->orphan = true;
>> }
>>
>> which would mean that the connection attempt would drop through to
>> label the connection as an orphan, to be cleaned up by the child
>> itself when it runs coresight_fixup_orphan_conns()
>>
Tnanks Mike, I think that is a good solution. Alternatively, we
could make sure that device_register() and the fixup following
that are atomic.
i.e.
mutex_lock()
device_register()
fixup_connections()
create_sysfs()
mutex_unlock();
The fix may be a bit invasive than Mike's proposal, but it makes
sure we don't end up with half baked device on the coresight-bus.
Suzuki
Hi,
On Tue, 1 Mar 2022 at 11:42, Jinlong Mao <quic_jinlmao(a)quicinc.com> wrote:
>
> On 2/28/2022 10:51 PM, Suzuki K Poulose wrote:
>
> Hi Jinlong
>
> On 28/02/2022 13:31, Mao Jinlong wrote:
>
> From: Mao Jinlong <jinlmao(a)qti.qualcomm.com>
>
> It is possible that when device probe, its child device is not
> probed. Then it will fail when add sysfs connection for the device.
> Make device defer probe when the child device is not probed.
>
>
> Please could you a bit a more specific on the exact issue ?
> I don't see a problem with probe deferral right now, with
> coresight/next.
>
> For e.g.,
>
> root@juno-server:~# lsmod
> Module Size Used by
> coresight 73728 0
> root@juno-server:~# ls /sys/bus/coresight/devices/
> root@juno-server:~# modprobe coresight-etm4x
> root@juno-server:~# lsmod
> Module Size Used by
> coresight_etm4x 81920 0
> coresight 73728 1 coresight_etm4x
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0 etm1
>
> -- Note etm2-etm5 doesn't appear --
>
> root@juno-server:~# modprobe coresight-funnel
> root@juno-server:~# lsmod
> Module Size Used by
> coresight_funnel 20480 0
> coresight_etm4x 81920 0
> coresight 73728 2 coresight_etm4x,coresight_funnel
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0 etm1
>
> -- Still don't appear ---
>
> root@juno-server:~# modprobe coresight-replicator
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0 etm1
> root@juno-server:~# modprobe coresight-tmc
>
> -- At this stage, the devices automatically get probed and appear --
> root@juno-server:~# ls /sys/bus/coresight/devices/
> etm0 etm1 etm2 etm3 etm4 etm5 funnel0 funnel1 funnel2 tmc_etf0 tmc_etr0
>
>
> root@juno-server:~# lsmod
> Module Size Used by
> coresight_tmc 40960 0
> coresight_replicator 20480 0
> coresight_funnel 20480 0
> coresight_etm4x 81920 0
> coresight 73728 4 coresight_tmc,coresight_etm4x,coresight_replicator,coresight_funnel
>
> So, my question is, what is this patch trying to solve ?
>
>
> Cheers
> Suzuki
>
> Hi Suzuki,
>
> This issue happens when race condition happens.
> The condition is that the device and its child_device's probe happens at the same time.
>
> For example: device0 and its child device device1.
> Both of them are calling coresight_register function. device0 is calling coresight_fixup_device_conns.
> device1 is waiting for device0 to release the coresight_mutex. Because device1's csdev node is allocated,
> coresight_make_links will be called for device0. Then in coresight_add_sysfs_link, has_conns_grp is true
> for device0, but has_conns_grp is false for device1 as has_conns_grp is set to true in coresight_create_conns_sysfs_group .
> The probe of device0 will fail for at this condition.
>
>
> struct coresight_device *coresight_register(struct coresight_desc *desc)
> {
> .........
> mutex_lock(&coresight_mutex);
>
> ret = coresight_create_conns_sysfs_group(csdev);
> if (!ret)
> ret = coresight_fixup_device_conns(csdev);
> if (!ret)
> ret = coresight_fixup_orphan_conns(csdev);
> if (!ret && cti_assoc_ops && cti_assoc_ops->add)
> cti_assoc_ops->add(csdev);
>
> mutex_unlock(&coresight_mutex);
>
> .........
>
> }
>
> static int coresight_fixup_device_conns(struct coresight_device *csdev)
> {
> ..........
> conn->child_dev =
> coresight_find_csdev_by_fwnode(conn->child_fwnode);
The issue appears to be a constraint hidden in the lower layers of the code.
Would a better solution not be to alter the code here:
if (conn->child_dev && conn->child_dev->has_conns_grp) {
...
} else {
csdev->orphan = true;
}
which would mean that the connection attempt would drop through to
label the connection as an orphan, to be cleaned up by the child
itself when it runs coresight_fixup_orphan_conns()
Regards
Mike
> if (conn->child_dev) {
> ret = coresight_make_links(csdev, conn,
>
> conn->child_dev);
>
> ..........
>
> }
>
>
> int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
> {
> ................
> if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
> return -EINVAL;
>
>
>
> The probe fail issue is reproduced with reboot stress test on our internal device.
>
> With the patch, the probe fail issue is not reproduced.
>
> Thanks
>
> Jinlong Mao
>
>
>
> Signed-off-by: Mao Jinlong <jinlmao(a)qti.qualcomm.com>
> ---
> drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
> index 34d2a2d31d00..7df9eb59bf2c 100644
> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
> @@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
> if (!info->orig || !info->target ||
> !info->orig_name || !info->target_name)
> return -EINVAL;
> - if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
> + if (!info->orig->has_conns_grp)
> return -EINVAL;
> + if (!info->target->has_conns_grp)
> + return -EPROBE_DEFER;
> /* first link orig->target */
> ret = sysfs_add_link_to_group(&info->orig->dev.kobj,
>
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
OpenCSD v1.2.1 has been released.
This contains a bugfix affecting the sequencing of output packets
around context updates - which can affect clients ability to provide
correct memory images to the library in certain circumstances.
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK