This allows enabling branch broadcast for Perf hosted sessions (the option
is currently only available for the sysfs interface). Hopefully this could
be useful for testing the decode in perf, for example does a determinisitic
run with branch broadcast enabled look the same as with it disabled? It
could also be used for scenarios like OpenJ9's support for JIT code:
java -Xjit:perfTool hello.java
Currently this is not working and you get the usual errors of a missing
DSO, but branch broadcast would have to be enabled anyway before working
through this next issue:
CS ETM Trace: Debug data not found for address 0xffff7b94b058 in /tmp/perf-29360.map
Address range support in Perf for branch broadcast has also not been added
here, but could be added later.
The documentation has been refactored slightly to allow updates to be made
that link the Perf format arguments with the existing documentation.
For Suzuki's comment, I will do it as a separate change that converts all
the other hard coded values to a more consistent sysreg.h style:
nit: While at this, please could you change the hard coded value
to ETM4_CFG_BIT_RETSTK ?
Changes since v1:
* Added Leo's reviewed by on patch 3
* Fix bracket styling
* Add documentation
Applies on top of coresight/next efa56eddf5d5c. But this docs fix is also
required to get the links to work:
https://marc.info/?l=linux-doc&m=164139331923986&w=2
Also available at: https://gitlab.arm.com/linux-arm/linux-jc/-/tree/james-branch-broadcast-v2
James Clark (6):
coresight: Add config flag to enable branch broadcast
coresight: Fail to open with return stacks if they are unavailable
perf cs-etm: Update deduction of TRCCONFIGR register for branch
broadcast
Documentation: coresight: Turn numbered subsections into real
subsections
Documentation: coresight: Link config options to existing
documentation
Documentation: coresight: Expand branch broadcast documentation
.../coresight/coresight-etm4x-reference.rst | 14 ++++-
Documentation/trace/coresight/coresight.rst | 56 +++++++++++++++++--
.../hwtracing/coresight/coresight-etm-perf.c | 2 +
.../coresight/coresight-etm4x-core.c | 23 ++++++--
include/linux/coresight-pmu.h | 2 +
tools/include/linux/coresight-pmu.h | 2 +
tools/perf/arch/arm/util/cs-etm.c | 3 +
7 files changed, 92 insertions(+), 10 deletions(-)
--
2.28.0
From: Carsten Haitzler <carsten.haitzler(a)arm.com>
Perf test's shell runner will just run everything in the tests
directory (as long as it's not another directory or does not begin
with a dot), but sometimes you find files in there that are not shell
scripts - perf.data output for example if you do some testing and then
the next time you run perf test it tries to run these. Check the files
are executable so they are actually intended to be test scripts and
not just some "random junk" files there.
Signed-off-by: Carsten Haitzler <carsten.haitzler(a)arm.com>
---
tools/perf/tests/builtin-test.c | 4 +++-
tools/perf/util/path.c | 12 ++++++++++++
tools/perf/util/path.h | 1 +
3 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 8cb5a1c3489e..ece272b55587 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -295,7 +295,9 @@ static const char *shell_test__description(char *description, size_t size,
#define for_each_shell_test(entlist, nr, base, ent) \
for (int __i = 0; __i < nr && (ent = entlist[__i]); __i++) \
- if (!is_directory(base, ent) && ent->d_name[0] != '.')
+ if (!is_directory(base, ent) && \
+ is_executable_file(base, ent) && \
+ ent->d_name[0] != '.')
static const char *shell_tests__dir(char *path, size_t size)
{
diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
index caed0336429f..7dde8c230ae8 100644
--- a/tools/perf/util/path.c
+++ b/tools/perf/util/path.c
@@ -92,3 +92,15 @@ bool is_directory(const char *base_path, const struct dirent *dent)
return S_ISDIR(st.st_mode);
}
+
+bool is_executable_file(const char *base_path, const struct dirent *dent)
+{
+ char path[PATH_MAX];
+ struct stat st;
+
+ sprintf(path, "%s/%s", base_path, dent->d_name);
+ if (stat(path, &st))
+ return false;
+
+ return !S_ISDIR(st.st_mode) && (st.st_mode & S_IXUSR);
+}
diff --git a/tools/perf/util/path.h b/tools/perf/util/path.h
index 083429b7efa3..d94902c22222 100644
--- a/tools/perf/util/path.h
+++ b/tools/perf/util/path.h
@@ -12,5 +12,6 @@ int path__join3(char *bf, size_t size, const char *path1, const char *path2, con
bool is_regular_file(const char *file);
bool is_directory(const char *base_path, const struct dirent *dent);
+bool is_executable_file(const char *base_path, const struct dirent *dent);
#endif /* _PERF_PATH_H */
--
2.32.0
Hi Jinlong
On 28/02/2022 13:31, Mao Jinlong wrote:
> From: Mao Jinlong <jinlmao(a)qti.qualcomm.com>
>
> It is possible that when device probe, its child device is not
> probed. Then it will fail when add sysfs connection for the device.
> Make device defer probe when the child device is not probed.
Please could you a bit a more specific on the exact issue ?
I don't see a problem with probe deferral right now, with
coresight/next.
For e.g.,
root@juno-server:~# lsmod
Module Size Used by
coresight 73728 0
root@juno-server:~# ls /sys/bus/coresight/devices/
root@juno-server:~# modprobe coresight-etm4x
root@juno-server:~# lsmod
Module Size Used by
coresight_etm4x 81920 0
coresight 73728 1 coresight_etm4x
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0 etm1
-- Note etm2-etm5 doesn't appear --
root@juno-server:~# modprobe coresight-funnel
root@juno-server:~# lsmod
Module Size Used by
coresight_funnel 20480 0
coresight_etm4x 81920 0
coresight 73728 2 coresight_etm4x,coresight_funnel
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0 etm1
-- Still don't appear ---
root@juno-server:~# modprobe coresight-replicator
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0 etm1
root@juno-server:~# modprobe coresight-tmc
-- At this stage, the devices automatically get probed and appear --
root@juno-server:~# ls /sys/bus/coresight/devices/
etm0 etm1 etm2 etm3 etm4 etm5 funnel0 funnel1 funnel2 tmc_etf0
tmc_etr0
root@juno-server:~# lsmod
Module Size Used by
coresight_tmc 40960 0
coresight_replicator 20480 0
coresight_funnel 20480 0
coresight_etm4x 81920 0
coresight 73728 4
coresight_tmc,coresight_etm4x,coresight_replicator,coresight_funnel
So, my question is, what is this patch trying to solve ?
Cheers
Suzuki
>
> Signed-off-by: Mao Jinlong <jinlmao(a)qti.qualcomm.com>
> ---
> drivers/hwtracing/coresight/coresight-sysfs.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-sysfs.c b/drivers/hwtracing/coresight/coresight-sysfs.c
> index 34d2a2d31d00..7df9eb59bf2c 100644
> --- a/drivers/hwtracing/coresight/coresight-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-sysfs.c
> @@ -73,8 +73,10 @@ int coresight_add_sysfs_link(struct coresight_sysfs_link *info)
> if (!info->orig || !info->target ||
> !info->orig_name || !info->target_name)
> return -EINVAL;
> - if (!info->orig->has_conns_grp || !info->target->has_conns_grp)
> + if (!info->orig->has_conns_grp)
> return -EINVAL;
> + if (!info->target->has_conns_grp)
> + return -EPROBE_DEFER;
>
> /* first link orig->target */
> ret = sysfs_add_link_to_group(&info->orig->dev.kobj,
If a profiling program runs in a non-root PID namespace and CoreSight
driver enables PID tracing (with contextID), it can lead to mismatching
issue between the context ID traced in hardware (from the root
namespace) and the PIDs gathered by profiling tool (e.g. perf) in its
non-root namespace.
CoreSight driver has tried to address this issue for the contextID
related interfaces under sysfs, but it misses to prevent user to set
VMID (virtual contextID) for kernel runs in EL2 with VHE; furthermore,
it misses to handle the case when the profiling tool runs in the
non-root PID namespace.
For this reason, this patch series is to correct contextID tracing for
non-root namespace. After applied this patchset, patch 02 doesn't
permit users to access virtual contextID via sysfs nodes in the non-root
PID namespace, patch 03 and 04 stop to trace PID packet for non-root PID
namespace.
This patch series has been rebased on the mainline kernel and applied
cleanly on latest commit dcb85f85fa6f ("gcc-plugins/stackleak: Use
noinstr in favor of notrace").
Leo Yan (4):
coresight: etm4x: Add lock for reading virtual context ID comparator
coresight: etm4x: Don't use virtual contextID for non-root PID
namespace
coresight: etm4x: Don't trace PID for non-root PID namespace
coresight: etm3x: Don't trace PID for non-root PID namespace
.../coresight/coresight-etm3x-core.c | 4 +++
.../coresight/coresight-etm4x-core.c | 10 +++++--
.../coresight/coresight-etm4x-sysfs.c | 30 +++++++++++++++++++
3 files changed, 42 insertions(+), 2 deletions(-)
--
2.25.1
On 07/02/2022 05:44, Anshuman Khandual wrote:
> Hi James,
>
> These are all ETM4X specific changes. Something like this might be cleaner
> and also more compact. Also would suggest to follow the same for subsequent
> patches as well.
>
> coresight: etm4x: Cleanup TRCIDR0 register accesses
>
> Consistency with sysreg.h could be mentioned in the commit message itself.
>
> On 2/3/22 5:35 PM, James Clark wrote:
>> This is a no-op change for style and consistency and has no effect on the
>> binary produced by gcc-11.
>
> This patch adds register definitions, helper macros as well. Please expand
> the commit message to add more details. This is too short, for the change
> it creates. BTW why is it necessary to mention GCC version number here.
>
>>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> .../coresight/coresight-etm4x-core.c | 36 +++++--------------
>> drivers/hwtracing/coresight/coresight-etm4x.h | 17 +++++++++
>> drivers/hwtracing/coresight/coresight-priv.h | 5 +++
>> 3 files changed, 30 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> index e2eebd865241..107e81948f76 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> @@ -1091,41 +1091,21 @@ static void etm4_init_arch_data(void *info)
>> etmidr0 = etm4x_relaxed_read32(csa, TRCIDR0);
>>
>> /* INSTP0, bits[2:1] P0 tracing support field */
>> - if (BMVAL(etmidr0, 1, 2) == 0b11)
>> - drvdata->instrp0 = true;
>> - else
>> - drvdata->instrp0 = false;
>> -
>> + drvdata->instrp0 = !!(REG_VAL(etmidr0, TRCIDR0_INSTP0) == 0b11);
>> /* TRCBB, bit[5] Branch broadcast tracing support bit */
>> - if (BMVAL(etmidr0, 5, 5))
>> - drvdata->trcbb = true;
>> - else
>> - drvdata->trcbb = false;
>> -
>> + drvdata->trcbb = !!(etmidr0 & TRCIDR0_TRCBB);
>> /* TRCCOND, bit[6] Conditional instruction tracing support bit */
>> - if (BMVAL(etmidr0, 6, 6))
>> - drvdata->trccond = true;
>> - else
>> - drvdata->trccond = false;
>> -
>> + drvdata->trccond = !!(etmidr0 & TRCIDR0_TRCCOND);
>> /* TRCCCI, bit[7] Cycle counting instruction bit */
>> - if (BMVAL(etmidr0, 7, 7))
>> - drvdata->trccci = true;
>> - else
>> - drvdata->trccci = false;
>> -
>> + drvdata->trccci = !!(etmidr0 & TRCIDR0_TRCCCI);
>> /* RETSTACK, bit[9] Return stack bit */
>> - if (BMVAL(etmidr0, 9, 9))
>> - drvdata->retstack = true;
>> - else
>> - drvdata->retstack = false;
>> -
>> + drvdata->retstack = !!(etmidr0 & TRCIDR0_RETSTACK);
>> /* NUMEVENT, bits[11:10] Number of events field */
>> - drvdata->nr_event = BMVAL(etmidr0, 10, 11);
>> + drvdata->nr_event = REG_VAL(etmidr0, TRCIDR0_NUMEVENT);
>> /* QSUPP, bits[16:15] Q element support field */
>> - drvdata->q_support = BMVAL(etmidr0, 15, 16);
>> + drvdata->q_support = REG_VAL(etmidr0, TRCIDR0_QSUPP);
>> /* TSSIZE, bits[28:24] Global timestamp size field */
>> - drvdata->ts_size = BMVAL(etmidr0, 24, 28);
>> + drvdata->ts_size = REG_VAL(etmidr0, TRCIDR0_TSSIZE);
>>
>> /* maximum size of resources */
>> etmidr2 = etm4x_relaxed_read32(csa, TRCIDR2);
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
>> index 3c4d69b096ca..2bd8ad953b8e 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
>> @@ -130,6 +130,23 @@
>>
>> #define TRCRSR_TA BIT(12)
>>
>> +/*
>> + * Bit positions of registers that are defined above, in the sysreg.h style
>> + * of _MASK, _SHIFT and BIT().
>> + */
>
> ^^^ not really necessary. Instead the format requirement for below mentioned
> CORESIGHT_REG_VAL() macro might be relevant and should be mentioned.
>
>> +#define TRCIDR0_INSTP0_SHIFT 1
>> +#define TRCIDR0_INSTP0_MASK GENMASK(1, 0)
>> +#define TRCIDR0_TRCBB BIT(5)
>> +#define TRCIDR0_TRCCOND BIT(6)
>> +#define TRCIDR0_TRCCCI BIT(7)
>> +#define TRCIDR0_RETSTACK BIT(9)
>> +#define TRCIDR0_NUMEVENT_SHIFT 10
>> +#define TRCIDR0_NUMEVENT_MASK GENMASK(1, 0)
>> +#define TRCIDR0_QSUPP_SHIFT 15
>> +#define TRCIDR0_QSUPP_MASK GENMASK(1, 0)
>> +#define TRCIDR0_TSSIZE_SHIFT 24
>> +#define TRCIDR0_TSSIZE_MASK GENMASK(4, 0)
>> +
>> /*
>> * System instructions to access ETM registers.
>> * See ETMv4.4 spec ARM IHI0064F section 4.3.6 System instructions
>> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
>> index ff1dd2092ac5..1452c6038421 100644
>> --- a/drivers/hwtracing/coresight/coresight-priv.h
>> +++ b/drivers/hwtracing/coresight/coresight-priv.h
>> @@ -36,6 +36,11 @@
>>
>> #define TIMEOUT_US 100
>> #define BMVAL(val, lsb, msb) ((val & GENMASK(msb, lsb)) >> lsb)
>> +/*
>> + * Extract a field from a register where field is #defined in the form
>> + * <register_name>_<field_name>_MASK and <register_name>_<field_name>_SHIFT
>> + */
>
> Looking at the usage, <register_name> is already embedded in <filed_name>. So
> it requires <field_name>_SHIFT and <field_name>_MASK instead. Unless register
> name should be passed as separate argument (which actually might be better).
>
> REG_VAL(etmidr0, TRCIDR0_TSSIZE) ----> REG_VAL(etmidr0, TRCIDR0, TSSIZE)
I don't see much difference here. So I am fine either way.
>
> with some restructuring in the comment ..
>
> /*
> * Extract a field from a coresight register
> *
> * Required fields are defined as macros like the following
> *
> * <register_name>_<field_name>_MASK and <register_name>_<field_name>_SHIFT
> */
>
>> +#define REG_VAL(val, field) ((val & (field##_MASK << field##_SHIFT)) >> field##_SHIFT)
>
> This is too generic to be in a coresight header or it should just be
> named CORESIGHT_REG_VAL() instead, making it more specific for here.
>
> The build should fail in case any required macro definition is absent.
> I guess no more fortification is required in case macros are missing.
>
> However CORESIGHT_REG_VAL() is better placed in <coresight-etm4x.h>
> just before all the dependent SHIFT/MASK register field definition
> starts.
Not necessarily. CORESIGHT_REG_VAL() is a generic function and doesn't
have anything specific to do with etm4x. We could reuse that for
cleaning up other drivers in CoreSight.
Cheers
Suzuki
Hi,
We are seeing an issue with doing Coresight decode off target where
initially the correct dso from ~/.debug is used, but after a new thread
in the perf.data file is passed with its mmap record, the local version
of the dso is picked up instead. This happens if the binary exists in the
same path on both devices, for example /bin/ls.
Initially when parsing the build-ids in the header, the dso for /bin/ls
will be created, and the file will correctly point to
~/.debug/bin/ls/2f15ad836be3339dec0e2e6a3c637e08e48aacbd/elf, but for any
new threads or mmaps that are also for /bin/ls, they will not have a
build-id set so they point to /bin/ls on the local machine rather than the
debug folder.
To fix this I made it possible to look up which existing dsos have
build id's set that originate from the header and then copy that build-id
onto the new dso if the name matches. Another way to do it would
be to stop comparing the mmap id so it matches on filename only, but I
think we do want to differentiate between different mmaps, even if they
have the same name, which is how it works in this version.
Applies to perf/core 859f7e4554
Thanks
James
James Clark (1):
perf: Set build-id using build-id header on new mmap records
tools/perf/util/dso.h | 1 +
tools/perf/util/header.c | 1 +
tools/perf/util/map.c | 16 ++++++++++++++--
3 files changed, 16 insertions(+), 2 deletions(-)
--
2.28.0