On 30/06/2025 11:03 am, Yuanfang Zhang wrote:
On 6/27/2025 7:23 PM, James Clark wrote:
On 27/06/2025 12:10 pm, Yuanfang Zhang wrote:
The current implementation uses a fixed timeout via coresight_timeout(), which may be insufficient when multiple sources are enabled or under heavy load, leading to TMC readiness or flush completion timeout.
This patch introduces a configurable timeout mechanism for flush and tmcready.
What kind of values are you using? Is there a reason to not increase the global one?
1000, Because only TMC FLUSH will face timeout situations.
How long was the flush taking exactly? You should be able to log the time it took to get past the flush. Because if it's 101us then we can increase the global timeout to 150us or 200us without too much thought.
I don't think we can justify why 100us was picked over any other value anyway. And we've seen a couple of timeouts in our own CI.
I don't think it's important what value we choose because it's only to stop hangs and make it terminate eventually. As far as I can see it wouldn't matter if we set it to a huge value like 1 second. That would only cause a big delay when something has actually gone wrong. Under normal circumstances the timeout won't be hit so it doesn't really need to be configurable.
But in some cases, TMC doesn't hang up, it just requires a longer waiting time.
Signed-off-by: Yuanfang Zhang quic_yuanfang@quicinc.com
drivers/hwtracing/coresight/coresight-tmc-core.c | 43 ++++++++++++++++++++++-- 1 file changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c index 88afb16bb6bec395ba535155228d176250f38625..286d56ce88fe80fbfa022946dc798f0f4e72f961 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-core.c +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c @@ -8,6 +8,7 @@ #include <linux/kernel.h> #include <linux/init.h> #include <linux/types.h> +#include <linux/delay.h> #include <linux/device.h> #include <linux/idr.h> #include <linux/io.h> @@ -35,13 +36,31 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb"); DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf"); DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr"); +static u32 tmc_timeout;
+static void tmc_extend_timeout(struct csdev_access *csa, u32 offset, int pos, int val) +{ + int i;
+ for (i = tmc_timeout; i > 0; i--) { + if (i - 1)
I didn't get what the if is for here? Removing it does basically the same thing, but if you do want to keep it maybe if (i > 1) is more explanatory.
sure.
+ udelay(1);
Can you not do udelay(tmc_timeout)?
sure.
+ } +}
+static int tmc_wait_status(struct csdev_access *csa, u32 offset, int pos, int val) +{ + return coresight_timeout_action(csa, offset, pos, val, + tmc_extend_timeout); +}
int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata) { struct coresight_device *csdev = drvdata->csdev; struct csdev_access *csa = &csdev->access; /* Ensure formatter, unformatter and hardware fifo are empty */ - if (coresight_timeout(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) { + if (tmc_wait_status(csa, TMC_STS, TMC_STS_TMCREADY_BIT, 1)) { dev_err(&csdev->dev, "timeout while waiting for TMC to be Ready\n"); return -EBUSY; @@ -61,7 +80,7 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata) ffcr |= BIT(TMC_FFCR_FLUSHMAN_BIT); writel_relaxed(ffcr, drvdata->base + TMC_FFCR); /* Ensure flush completes */ - if (coresight_timeout(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) { + if (tmc_wait_status(csa, TMC_FFCR, TMC_FFCR_FLUSHMAN_BIT, 0)) { dev_err(&csdev->dev, "timeout while waiting for completion of Manual Flush\n"); } @@ -561,11 +580,31 @@ static ssize_t stop_on_flush_store(struct device *dev, static DEVICE_ATTR_RW(stop_on_flush); +static ssize_t timeout_cfg_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return scnprintf(buf, PAGE_SIZE, "%d\n", tmc_timeout); +}
+static ssize_t timeout_cfg_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t size) +{ + unsigned long val;
+ if (kstrtoul(buf, 0, &val)) + return -EINVAL; + tmc_timeout = val;
+ return size; +} +static DEVICE_ATTR_RW(timeout_cfg);
Seeing as the existing timeout is global for all devices, if we do want a configurable one shouldn't we make the global one configurable rather than per-device? That seems too fine grained to me.
sure.
static struct attribute *coresight_tmc_attrs[] = { &dev_attr_trigger_cntr.attr, &dev_attr_buffer_size.attr, &dev_attr_stop_on_flush.attr, + &dev_attr_timeout_cfg.attr, NULL, };
base-commit: 408c97c4a5e0b634dcd15bf8b8808b382e888164 change-id: 20250627-flush_timeout-a598b4c0ce7b
Best regards,