CoreSight September 2018

coresight@lists.linaro.org

25 participants
25 discussions

by Ajith Kurian Issac

Hi, We are trying to us the Open CSD for decoding a onchip trace in our ETB. The trace was enabled and is captured in the ETB. We read the trace back and dumped it into a text file. I am attaching it. How can I use the CSD tool to decode it? Thanks Ajith

7 years

[RFC v3 0/2] perf: Support for Arm A32/T32 instruction sets

by Robert Walker

Hi, I'm taking this back to the linaro coresight list so we can get the OpenCSD library versioning sorted out. The first patch splits the OpenCSD feature check into two parts. The original check is left as is - this just checks for the presence of an OpenCSD library. A new check (libopencsd-numinstr) is added that checks for the new OpenCSD (>0.9.0) that has the num_instr_range member in the ocsd_generic_trace_elem struct. This feature is then used to set a flag used in cs-etm-decoder.c to select which versions of 2 functions are used to get the instruction count / last instruction size of each instruction block - if the flag is not set, then the previous assumptions of a 4 byte instruction size are used. It was suggested that OpenCSD should export a version header - I agree this is a good idea, but this will require a new release of the library, so we would miss support for the instruction sizes when OpenCSD 0.9.{0,1,2} is installed - hence why I've kept the version check using the presence of num_instr_range. The second patch adds support for finding the T32 instruction counts when the OpenCSD library doesn't report the instruction counts. As this involves iterating through the block of instructions and examining each instruction, there is a significant peformance hit (about 5x slower than using the OpenCSD library to report the instruction counts), so I'm not sure this patch should go into upstream. Regards Rob Robert Walker (2): perf: Support for Arm A32/T32 instruction sets in CoreSight trace perf: Full support for Arm T32 instructions with older version of OpenCSD tools/build/Makefile.feature | 3 +- tools/build/feature/Makefile | 4 + tools/build/feature/test-libopencsd-numinstr.c | 15 ++++ tools/perf/Makefile.config | 3 + tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 106 ++++++++++++++++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 +++ tools/perf/util/cs-etm.c | 71 +++++++--------- 7 files changed, 171 insertions(+), 41 deletions(-) create mode 100644 tools/build/feature/test-libopencsd-numinstr.c -- 2.7.4

7 years

[PATCH v2 00/11] dts: Update coresight device tree bindings

by Suzuki K Poulose

Coresight DT bindings have been updated to obey the DTS rules for label/address matching for graph nodes. The changes are in coresight/next tree scheduled for v4.20. This series updates the in kernel dts to match the new bindings along with updating a couple of new examples (e.,g CATU) in the Documentation (which were missed as they were still in flight when we created the series). Please note that this should not be pulled for v4.19, which I think is a safe assumption. But please do pull it for v4.20. The dt updates for the Juno boards were sent earlier with the original DT update series and has been queued for v4.20. Applies on coresight/next (which is based on v4.19) and should apply cleanly on v4.19-rc3. Changes since V1: - Avoid "avoid_unnecessary_addr_size" warnings by removing #address-cells/#size-cells for single port with address 0. - Fix TPIU inport for qcom msm8196. (Leo Yan) - Fix documentation example for TPIU (Leo Yan) - Fix subject tags (as pointed out by Leo and Shawn) - Drop patch for TC2, which has been queued by Sudeep Cc: Alexandre Belloni <alexandre.belloni(a)bootlin.com> Cc: Andy Gross <andy.gross(a)linaro.org> Cc: Benoît Cousson <bcousson(a)baylibre.com> Cc: David Brown <david.brown(a)linaro.org> Cc: Fabio Estevam <fabio.estevam(a)nxp.com> Cc: Frank Rowand <frowand.list(a)gmail.com> Cc: Ivan T. Ivanov <ivan.ivanov(a)linaro.org> Cc: Linus Walleij <linus.walleij(a)linaro.org> Cc: linux-omap(a)vger.kernel.org Cc: lipengcheng8(a)huawei.com Cc: Liviu Dudau <liviu.dudau(a)arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org> Cc: Nicolas Ferre <nicolas.ferre(a)microchip.com> Cc: orsonzhai(a)gmail.com Cc: Pengutronix Kernel Team <kernel(a)pengutronix.de> Cc: Rob Herring <robh(a)kernel.org> Cc: Sascha Hauer <s.hauer(a)pengutronix.de> Cc: Shawn Guo <shawnguo(a)kernel.org> Cc: Sudeep Holla <sudeep.holla(a)arm.com> Cc: Tony Lindgren <tony(a)atomide.com> Cc: Wei Xu <xuwei5(a)hisilicon.com> Cc: xuwei5(a)hisilicon.com Cc: zhang.lyra(a)gmail.com Cc: arm(a)kernel.org Suzuki K Poulose (11): coresight: dts: binding: Fix example for TPIU component coresight: dts: binding: Update coresight binding examples arm64: dts: hi6220: Update coresight bindings for hardware ports arm64: dts: sc9836/sc9860: Update coresight bindings for hardware ports arm64: dts: msm8916: Update coresight bindings for hardware ports arm: dts: hip04: Update coresight bindings for hardware ports arm: dts: imx7: Update coresight binding for hardware ports arm: dts: omap: Update coresight bindings for hardware ports arm: dts: qcom: Update coresight bindings for hardware ports arm: dts: sama5d2: Update coresight bindings for hardware ports arm: dts: ste-dbx5x0: Update coresight bindings for hardware port .../devicetree/bindings/arm/coresight.txt | 27 +- arch/arm/boot/dts/hip04.dtsi | 346 +++++++++--------- arch/arm/boot/dts/imx7d.dtsi | 14 +- arch/arm/boot/dts/imx7s.dtsi | 82 ++--- arch/arm/boot/dts/omap3-beagle-xm.dts | 17 +- arch/arm/boot/dts/omap3-beagle.dts | 17 +- arch/arm/boot/dts/qcom-apq8064.dtsi | 71 ++-- arch/arm/boot/dts/qcom-msm8974.dtsi | 104 +++--- arch/arm/boot/dts/sama5d2.dtsi | 17 +- arch/arm/boot/dts/ste-dbx5x0.dtsi | 65 ++-- .../boot/dts/hisilicon/hi6220-coresight.dtsi | 181 +++++---- arch/arm64/boot/dts/qcom/msm8916.dtsi | 95 ++--- arch/arm64/boot/dts/sprd/sc9836.dtsi | 82 +++-- arch/arm64/boot/dts/sprd/sc9860.dtsi | 215 +++++------ 14 files changed, 682 insertions(+), 651 deletions(-) -- 2.19.0

7 years, 1 month

AutoFDO ETM strobe patch available.

by Mike Leach

Hello, As promised at teh recent Linaro Connect, the patch to enable ETM strobing for AutoFDO is now available on github/Linaro/perf-opencsd, branch master-4.19-rc1-afdo-etm-strobe. https://github.com/Linaro/perf-opencsd/commits/master-4.19-rc1-afdo-etm-str… Regards Mike -- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK

7 years, 1 month

[PATCH 0/3] perf tools: Add support for ETMv3/PTM1.1 decoding

by Mathieu Poirier

This set _should_ add support for ETMv3/PTM1.1 trace decoding. It was produced close to two years ago on top of a code base that no longer exist. At the time though, it did work. So I've rebased the work to the current coresight next branch. It apply and compiles cleanly but other than that, I can't offer any guarantee of proper operation. I am currently traveling and don't have access to a platform where it can be tested. Even if I was, I do not have the bandwidth to work on the feature. As such I am releasing it on this list, in the hope that it can help someone get started with trace decoding on ETMv3/PTM1.1. Let me know how bad it crashes. Mathieu Mathieu Poirier (3): perf tools: Add configuration for ETMv3 trace protocol perf tools: Add support for ETMv3 trace decoding perf tools: Add support for PTMv1.1 decoding tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 31 +++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 9 +++ tools/perf/util/cs-etm.c | 73 ++++++++++++++++++++----- 3 files changed, 99 insertions(+), 14 deletions(-) -- 2.7.4

7 years, 1 month

Re: Coresight traces of the same application decode differently.

by Mike Bazov

Hello, Sorry for the delay. I appreciate your posts. I have recorded a different program now("ping 8.8.8.8"), and it seems that decoding the trace using the "ping" ELF file gives no issues now. I cannot explain how "ls" is the only corrupt trace(i rerecorded, same results). Perhaps the image is indeed wrong. I will check it further. Thank you very much! On Thu, Sep 20, 2018 at 10:42 AM, Mike Bazov <mike(a)perception-point.io> wrote: > Hello, > > Sorry for the delay. I appreciate your posts. > > I have recorded a different program now("ping 8.8.8.8"), and it seems that > decoding > the trace using the "ping" ELF file gives no issues now. I cannot explain > how "ls" > is the only corrupt trace(i rerecorded, same results). Perhaps the image > is indeed wrong. > I will check it further. > > Thank you very much! > > Mike. > > > On Thu, Sep 20, 2018 at 1:28 AM, Mike Leach <mike.leach(a)linaro.org> wrote: > >> Hi Mike, >> >> I have looked into this issue further, found my previous assumption to >> be wrong, and unfortunately have come to the conclusion that the >> generated trace is somehow wrong / corrupt, or the supplied image is >> not what was run when the trace was generated. >> >> If you look at the attached analysis of the trace generated from the >> ls_api.cs data [analysis001.txt] This is at the very start of the >> traced image. >> >> The first few packets [raw packets (0)] show the sync and start at >> 00000000004003f0 <_start>: >> followed by the first 'E' atom that marks the branch to 0x41a158. The >> next two 'E' atoms get us to 0x41a028. >> >> At this point we get an exception packet, followed by a preferred >> return address packet [ raw packets (2)]. >> This return address is 0x400630. >> >> The rules from the ETM architecture specification 4.0-4.4 p6-242 state:- >> >> "The Exception packet contains an address. This means that execution >> has continued from the target of the most >> recent P0 element, up to, but not including, that address, and a trace >> analyzer must analyze each instruction in this >> range." >> >> Thus the decoder is required to analyze from the previous P0 element - >> the 'E' atom that marked the branch to 0x41a028, until the preferred >> return address. >> This is actually lower than the start address, which results in a huge >> range seen here, and also seen by you in the example you described. >> The decoder effectively runs off the end of the memory image before it >> stops. >> >> The trace should be indicating an address after but relatively close >> to 0x41a028 - as otherwise an atom would have been emitted by the cbnz >> 41a054. >> >> If I examine the start of the perf_ls.cs decode, I see the same 3 'E' >> atoms followed by the odd data fault exception. >> >> So for the first few branches at least, the perf and api captures go >> in the same direction. >> >> Given the it is unlikely that the generated trace packets are >> incorrect - it seems more likely that the 'ls' image being used for >> decode is not what is generating this trace. Since we have to analyze >> opcodes to follow the 'E' and 'N' atoms, decode relies on accurate >> memory images being fed into the decoder. The only actual addresses we >> have explicitly stated in the trace are the start: 0x4003f0, and the >> exception return address 0x400360. The others are synthesized from the >> supplied image. >> >> There may be a case for checking when decoding the exception packet >> that the address is not behind the current location and throwing an >> error, but beyond that I do not at this point believe that the decoder >> is at fault. >> >> Regards >> >> Mike >> >> >> >> On 18 September 2018 at 19:32, Mike Leach <mike.leach(a)linaro.org> wrote: >> > Hi Mike, >> > >> > I've looked further at this today, and can see a location where a >> > large block appears in both the api and perf trace data on decode >> > using the library test program. >> > >> > There does appear to be an issue if the decoder is in a "waiting for >> > address state" i.e. it has lost track usually because an area of >> > memory is unavailable, and an exception packet is seen - the exception >> > address appears to be used twice - both to complete an address range >> > and as an exception return - hence in this case the improbable large >> > block. I need to look into this in more detail and fix it up. >> > >> > However - I am seeing before this the api and perf decodes have >> > diverged, which suggests an issue elsewhere too perhaps. I do need to >> > look deeper into this as well. >> > I am not 100% certain that using the ls.bin as a full memory image at >> > 0x400000 is necessarily working in the snapshot tests - there might be >> > another offset needed to access the correct opcodes for the trace. >> > >> > I'll let you know if I make further progress. >> > >> > >> > On 17 September 2018 at 16:53, Mike Leach <mike.leach(a)linaro.org> >> wrote: >> >> Hi Mike, >> >> >> >> I've looked at the data you supplied. >> >> >> >> I created test snapshot directories so that I could run each of the >> >> trace data files through the trc_pkt_lister test program (the attached >> >> .tgz file contains these, plus the results). >> >> >> >> Now the two trace files are different sizes - this is explained by the >> >> fact that the api trace run had cycle counts switched on, whereas the >> >> perf run did not - plus the perf run turned off the trace while in >> >> kernel calls - the api left the trace on, though filtering out the >> >> kernel - but a certain amount of sync packets have come through adding >> >> to the size. >> >> >> >> Now looking at the results I cannot see the 0x4148f4 location in >> >> either trace dump (perf_ls2.ppl and api_ls2.ppl in the .tgz). >> >> >> >> There are no obvious differences I could detect in the results, though >> >> they are difficult to compare given the difference in output. >> >> >> >> The effect you are seeing does look like some sort of runaway - with >> >> the decoder searching for opcodes - possibly in a section of the ls >> >> binary file that does not contain executable code - till it happens >> >> upon something that looks like an opcode. >> >> >> >> At this point I cannot explain the difference you and I are seeing >> >> given the data provided. Can you look at the snapshot results, and see >> >> if there is anything there? You can re-run the tests I ran if you >> >> rename ls to ls.bin and put on level up from the ss-perf or ss-api >> >> snapshot directories where the file is referenced to. >> >> >> >> Regards >> >> >> >> Mike >> >> >> >> >> >> >> >> >> >> On 17 September 2018 at 13:44, Mike Bazov <mike(a)perception-point.io> >> wrote: >> >>> Greetings, >> >>> >> >>> I recorded the program "ls" (statically linked to provide a single >> >>> executable as a memory accesses file). >> >>> >> >>> I recorded the program using perf, and then extracted the actual raw >> trace >> >>> data from the perf.data file using a little tool i wrote. I can use >> OpenCSD >> >>> to fully decode the trace produced by perf. >> >>> >> >>> I also recorded the "ls" util using an API i wrote from kernel mode. I >> >>> published the API here as an [RFC]. Basically, i start recording and >> stop >> >>> recording whenever the __process__ of my interest is scheduling in. >> >>> This post is not much about requesting a review for my API.. but i do >> have >> >>> some issues with the trace that is produced by this API, and i'm not >> quite >> >>> sure why. >> >>> >> >>> I use the OpenCSD directly in my code, and register a decoder >> callback for >> >>> every generic trace element. When my callback is called, i simply >> print the >> >>> element string representation(e.g. OCSD_GEN_TRC_ELEM_INSTR_RANGE). >> >>> >> >>> Now, the weird thing is the perf and API produce the same generic >> elements >> >>> until a certain element: >> >>> >> >>> OCSD_GEN_TRC_ELEM_TRACE_ON() >> >>> ... >> >>> ... >> >>> ... same elements... >> >>> ... same elements... >> >>> ... same elements... >> >>> ... >> >>> ... >> >>> >> >>> And eventually diverge from each other. I assume the perf trace is >> going in >> >>> the right direction, but my trace simply starts going nuts. The last >> >>> __common__ generic element is the following: >> >>> >> >>> OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x4148f4:[0x414910] >> (ISA=A64) E iBR >> >>> A64:ret ) >> >>> >> >>> After this element, perf trace goes in a different route, and the API >> right >> >>> afterwards produced a very weird instruction range element: >> >>> >> >>> OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x414910:[0x498a20] >> (ISA=A64) E --- >> >>> ) >> >>> >> >>> There is no way this 0x498a20 address was reached, and i cannot see >> any >> >>> proof for it in the trace itself(using ptm2human). It seems that the >> decoder >> >>> keeps decoding and disassembling opcodes until it reaches 0x498a20... >> my >> >>> memory callback(callback that is called if the decoder needs memory >> that >> >>> isn't present) is called for the address 0x498a20. From the on, the >> trace >> >>> just goes into a very weird path. I can't explain the address >> branches that >> >>> are taken from here on. >> >>> >> >>> >> >>> Any ideas on how to approach this? OpenCSD experts would be >> appreciated. >> >>> I have attached the perf and API trace, and the "ls" executable which >> is >> >>> loaded into address 0x400000. I also attached the ETMv4 config for >> every >> >>> trace(trace id, etc..). There is no need to create multiple decoders >> for >> >>> different trace ids, theres only a single ID for a single decoder. >> >>> >> >>> Thanks, >> >>> Mike. >> >>> >> >>> _______________________________________________ >> >>> CoreSight mailing list >> >>> CoreSight(a)lists.linaro.org >> >>> https://lists.linaro.org/mailman/listinfo/coresight >> >>> >> >> >> >> >> >> >> >> -- >> >> Mike Leach >> >> Principal Engineer, ARM Ltd. >> >> Manchester Design Centre. UK >> > >> > >> > >> > -- >> > Mike Leach >> > Principal Engineer, ARM Ltd. >> > Manchester Design Centre. UK >> >> >> >> -- >> Mike Leach >> Principal Engineer, ARM Ltd. >> Manchester Design Centre. UK >> > >

7 years, 1 month

New OpenCSD library release v0.9.2

by Mike Leach

Updated library: - fixes bug with generic exception packets being set with wrong exception number in ETMv4 - updates docs with latest AutoFDO instructions, and record.sh script -- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK

7 years, 1 month

Coresight traces of the same application decode differently.

by Mike Bazov

Greetings, I recorded the program "ls" (statically linked to provide a single executable as a memory accesses file). I recorded the program using perf, and then extracted the actual raw trace data from the perf.data file using a little tool i wrote. I can use OpenCSD to fully decode the trace produced by perf. I also recorded the "ls" util using an API i wrote from kernel mode. I published the API here as an* [RFC]*. Basically, i start recording and stop recording whenever the __process__ of my interest is scheduling in. This post is not much about requesting a review for my API.. but i do have some issues with the trace that is produced by this API, and i'm not quite sure why. I use the OpenCSD directly in my code, and register a decoder callback for every generic trace element. When my callback is called, i simply print the element string representation(e.g. OCSD_GEN_TRC_ELEM_INSTR_RANGE). Now, the weird thing is the perf and API produce the same generic elements until a certain element: OCSD_GEN_TRC_ELEM_TRACE_ON() ... ... ... same elements... ... same elements... ... same elements... ... ... And eventually diverge from each other. I assume the perf trace is going in the right direction, but my trace simply starts going nuts. The last *__common__* generic element is the following: OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x4148f4:[0x414910] (ISA=A64) E iBR A64:ret ) After this element, perf trace goes in a different route, and the API right afterwards produced a very *weird *instruction range element: OCSD_GEN_TRC_ELEM_INSTR_RANGE(exec range=0x414910:[0x498a20] (ISA=A64) E --- ) There is no way this 0x498a20 address was reached, and i cannot see any proof for it in the trace itself(using ptm2human). It seems that the decoder keeps decoding and disassembling opcodes until it reaches 0x498a20... my memory callback(callback that is called if the decoder needs memory that isn't present) is called for the address 0x498a20. From the on, the trace just goes into a very weird path. I can't explain the address branches that are taken from here on. Any ideas on how to approach this? OpenCSD experts would be appreciated. I have attached the perf and API trace, and the "ls" executable which is loaded into address 0x400000. I also attached the ETMv4 config for every trace(trace id, etc..). There is no need to create multiple decoders for different trace ids, theres only a single ID for a single decoder. Thanks, Mike.

7 years, 1 month

CoreSight trace decoding for ARMv7

by Stefan Agner

Hi, Attending the Hardware Trace on Linux talk at Linaro Connect the topic of ARMv7 support for trace decoding came up. What is the current state and are there patches available somewhere? -- Stefan

7 years, 1 month

[RFC 1/4] coresight: tmc-etr: Introduce CS_MODE_API and prepare the ground for it

by mike＠perception-point.io

From: Mike Bazov <mike(a)perception-point.io> Introducing a new mode: CS_MODE_API. This mode shall be used by the etm-api framework to expose coresight functionality to kernel mode clients. This commit makes tmc-etr aware of it. The set_buffer()/alloc_buffer()/free_buffer() sink operations are now turned to be mode-oriented. Up to this commit they were perf-specific, because only the perf framework used it. They are now extended to also be used with CS_MODE_API, in the hope of giving a kernel mode client the power of allocating and using a coresight buffer. tmc-etr forbids using the sink for CS_MODE_API mode concurrently with other session modes(perf/sysfs). The implementation doesn't forbid using the sink multiple times if the requested mode is CS_MODE_API. This commit also makes the perf framework aware of the sink operations function signature changes. Signed-off-by: Mike Bazov <mike(a)perception-point.io> --- drivers/hwtracing/coresight/coresight-etb10.c | 13 ++- drivers/hwtracing/coresight/coresight-etm-perf.c | 7 +- drivers/hwtracing/coresight/coresight-priv.h | 1 + drivers/hwtracing/coresight/coresight-tmc-etf.c | 16 ++- drivers/hwtracing/coresight/coresight-tmc-etr.c | 124 +++++++++++++++++++---- drivers/hwtracing/coresight/coresight-tmc.h | 3 +- include/linux/coresight.h | 10 +- 7 files changed, 139 insertions(+), 35 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 08fa660098f8..4fbc8af7851a 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -323,12 +323,15 @@ static void etb_disable(struct coresight_device *csdev) dev_dbg(drvdata->dev, "ETB disabled\n"); } -static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu, +static void *etb_alloc_buffer(struct coresight_device *csdev, u32 mode, int cpu, void **pages, int nr_pages, bool overwrite) { int node; struct cs_buffers *buf; + if (mode != CS_MODE_PERF) + return NULL; + if (cpu == -1) cpu = smp_processor_id(); node = cpu_to_node(cpu); @@ -344,11 +347,13 @@ static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu, return buf; } -static void etb_free_buffer(void *config) +static void etb_free_buffer(u32 mode, void *config) { - struct cs_buffers *buf = config; + if (mode != CS_MODE_API) { + struct cs_buffers *buf = config; - kfree(buf); + kfree(buf); + } } static int etb_set_buffer(struct coresight_device *csdev, diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index abe8249b893b..563f4b47e977 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -119,7 +119,8 @@ static void free_event_data(struct work_struct *work) cpu = cpumask_first(mask); sink = coresight_get_sink(etm_event_cpu_path(event_data, cpu)); if (sink_ops(sink)->free_buffer) - sink_ops(sink)->free_buffer(event_data->snk_config); + sink_ops(sink)->free_buffer(CS_MODE_PERF, + event_data->snk_config); } for_each_cpu(cpu, mask) { @@ -250,7 +251,8 @@ static void *etm_setup_aux(int event_cpu, void **pages, /* Allocate the sink buffer for this session */ event_data->snk_config = - sink_ops(sink)->alloc_buffer(sink, cpu, pages, + sink_ops(sink)->alloc_buffer(sink, CS_MODE_PERF, + cpu, pages, nr_pages, overwrite); if (!event_data->snk_config) goto err; @@ -284,6 +286,7 @@ static void etm_event_start(struct perf_event *event, int flags) goto fail; path = etm_event_cpu_path(event_data, cpu); + /* We need a sink, no need to continue without one */ sink = coresight_get_sink(path); if (WARN_ON_ONCE(!sink)) diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h index c11da5564a67..174e6f1fab3e 100644 --- a/drivers/hwtracing/coresight/coresight-priv.h +++ b/drivers/hwtracing/coresight/coresight-priv.h @@ -72,6 +72,7 @@ enum cs_mode { CS_MODE_DISABLED, CS_MODE_SYSFS, CS_MODE_PERF, + CS_MODE_API }; /** diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 4156c95ce1bb..897dfc1bcbaa 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -307,12 +307,16 @@ static void tmc_disable_etf_link(struct coresight_device *csdev, dev_dbg(drvdata->dev, "TMC-ETF disabled\n"); } -static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, int cpu, - void **pages, int nr_pages, bool overwrite) +static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, u32 mode, + int cpu, void **pages, int nr_pages, + bool overwrite) { int node; struct cs_buffers *buf; + if (mode == CS_MODE_API) + return NULL; + if (cpu == -1) cpu = smp_processor_id(); node = cpu_to_node(cpu); @@ -329,11 +333,13 @@ static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, int cpu, return buf; } -static void tmc_free_etf_buffer(void *config) +static void tmc_free_etf_buffer(u32 mode, void *config) { - struct cs_buffers *buf = config; + if (mode != CS_MODE_API) { + struct cs_buffers *buf = config; - kfree(buf); + kfree(buf); + } } static int tmc_set_etf_buffer(struct coresight_device *csdev, diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 56fea4ff947e..6ffcaaf76680 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1176,32 +1176,35 @@ tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages, return etr_perf; } - -static void *tmc_alloc_etr_buffer(struct coresight_device *csdev, - int cpu, void **pages, int nr_pages, - bool snapshot) +static void *tmc_etr_alloc_api_buffer(struct tmc_drvdata *drvdata, int cpu, + int nr_pages, void **pages, bool snapshot) { - struct etr_perf_buffer *etr_perf; - struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + ssize_t size = 0; - if (cpu == -1) - cpu = smp_processor_id(); + size = nr_pages << PAGE_SHIFT; - etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu), - nr_pages, pages, snapshot); - if (IS_ERR(etr_perf)) { - dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n"); - return NULL; - } + return tmc_alloc_etr_buf(drvdata, size, 0, cpu_to_node(0), pages); +} - etr_perf->snapshot = snapshot; - etr_perf->nr_pages = nr_pages; - etr_perf->pages = pages; +void *tmc_alloc_etr_buffer(struct coresight_device *csdev, u32 mode, + int cpu, void **pages, int nr_pages, bool snapshot) +{ + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - return etr_perf; + switch (mode) { + case CS_MODE_PERF: + return tmc_etr_setup_perf_buf(drvdata, cpu, nr_pages, + pages, snapshot); + case CS_MODE_API: + return tmc_etr_alloc_api_buffer(drvdata, cpu, nr_pages, + pages, snapshot); + default: + /* We shouldn't get here. */ + return ERR_PTR(-EINVAL); + } } -static void tmc_free_etr_buffer(void *config) +static void tmc_etr_free_perf_buffer(void *config) { struct etr_perf_buffer *etr_perf = config; @@ -1210,6 +1213,61 @@ static void tmc_free_etr_buffer(void *config) kfree(etr_perf); } +static void tmc_etr_free_api_buffer(void *buffer) +{ + if (buffer) + tmc_free_etr_buf(buffer); +} + +static void tmc_free_etr_buffer(u32 mode, void *buffer) +{ + switch (mode) { + case CS_MODE_PERF: + tmc_etr_free_perf_buffer(buffer); + break; + case CS_MODE_API: + tmc_etr_free_api_buffer(buffer); + default: + break; + } +} + +static int tmc_etr_set_api_buffer(struct coresight_device *csdev, void *buffer) +{ + struct etr_buf *etr_buf = buffer; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + unsigned long flags; + + spin_lock_irqsave(&drvdata->spinlock, flags); + + /* + * Disable the etr if it's running. The user must pause any playing + * sources if it doesn't want to lose data. + */ + if (drvdata->mode == CS_MODE_API && drvdata->api_buf != NULL) + tmc_etr_disable_hw(drvdata); + + drvdata->api_buf = etr_buf; + + if (drvdata->mode == CS_MODE_API && drvdata->api_buf != NULL) + tmc_etr_enable_hw(drvdata, drvdata->api_buf); + + spin_unlock_irqrestore(&drvdata->spinlock, flags); + + return 0; +} + +static int tmc_etr_set_buffer(struct coresight_device *csdev, u32 mode, + struct perf_output_handle *handle, void *buffer) +{ + switch (mode) { + case CS_MODE_API: + return tmc_etr_set_api_buffer(csdev, buffer); + default: + return -EINVAL; + } +} + /* * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware * buffer to the perf ring buffer. @@ -1350,6 +1408,29 @@ static int tmc_enable_etr_sink_perf(struct coresight_device *csdev, void *data) return rc; } +static int tmc_enable_etr_sink_api(struct coresight_device *csdev) +{ + int rc = 0; + unsigned long flags; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + spin_lock_irqsave(&drvdata->spinlock, flags); + + /* + * If the sink is already being used by someone, we cannot interfere. + */ + if (drvdata->mode != CS_MODE_DISABLED && + drvdata->mode != CS_MODE_API) { + rc = -EBUSY; + goto out; + } + + drvdata->mode = CS_MODE_API; +out: + spin_unlock_irqrestore(&drvdata->spinlock, flags); + return rc; +} + static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode, void *data) { @@ -1358,13 +1439,15 @@ static int tmc_enable_etr_sink(struct coresight_device *csdev, return tmc_enable_etr_sink_sysfs(csdev); case CS_MODE_PERF: return tmc_enable_etr_sink_perf(csdev, data); + case CS_MODE_API: + return tmc_enable_etr_sink_api(csdev); } /* We shouldn't be here */ return -EINVAL; } -static void tmc_disable_etr_sink(struct coresight_device *csdev) +void tmc_disable_etr_sink(struct coresight_device *csdev) { unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -1391,6 +1474,7 @@ static const struct coresight_ops_sink tmc_etr_sink_ops = { .disable = tmc_disable_etr_sink, .alloc_buffer = tmc_alloc_etr_buffer, .update_buffer = tmc_update_etr_buffer, + .set_buffer = tmc_etr_set_buffer, .free_buffer = tmc_free_etr_buffer, }; diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 487c53701e9c..39c6646c82d3 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -172,6 +172,7 @@ struct etr_buf { * device configuration register (DEVID) * @perf_data: PERF buffer for ETR. * @sysfs_data: SYSFS buffer for ETR. + * @api_buf: API buffer for ETR. */ struct tmc_drvdata { void __iomem *base; @@ -193,6 +194,7 @@ struct tmc_drvdata { u32 etr_caps; struct etr_buf *sysfs_buf; void *perf_data; + struct etr_buf *api_buf; }; struct etr_buf_operations { @@ -257,7 +259,6 @@ extern const struct coresight_ops tmc_etr_cs_ops; ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata, loff_t pos, size_t len, char **bufpp); - #define TMC_REG_PAIR(name, lo_off, hi_off) \ static inline u64 \ tmc_read_##name(struct tmc_drvdata *drvdata) \ diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 53535821dc25..8f7f98570afd 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -183,16 +183,20 @@ struct coresight_device { * Operations available for sinks * @enable: enables the sink. * @disable: disables the sink. - * @alloc_buffer: initialises perf's ring buffer for trace collection. + * @alloc_buffer: allocates the trace buffer. * @free_buffer: release memory allocated in @get_config. + * @set_buffer: set the trace buffer. * @update_buffer: update buffer pointers after a trace session. */ struct coresight_ops_sink { int (*enable)(struct coresight_device *csdev, u32 mode, void *data); void (*disable)(struct coresight_device *csdev); - void *(*alloc_buffer)(struct coresight_device *csdev, int cpu, + void *(*alloc_buffer)(struct coresight_device *csdev, u32 mode, int cpu, void **pages, int nr_pages, bool overwrite); - void (*free_buffer)(void *config); + void (*free_buffer)(u32 mode, void *buffer); + int (*set_buffer)(struct coresight_device *csdev, u32 mode, + struct perf_output_handle *handle, + void *buffer); unsigned long (*update_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config); -- 2.16.2

7 years, 1 month

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

CoreSight September 2018