Hi Sebastian,
I agree that this is either a decoder or inject issue.
First, can you confirm you are using version 0.7.3 of the decoder - I recently fixed an addressing bug in there.
Otherwise there are a few possibilities: A) an as-yet-undiscovered decoder bug; B) gaps in the trace that are not being communicated correctly, making inject assume the trace is continuous when it is not; C) some other misunderstanding or misinterpretation between the decoder and perf inject.
Whichever it is, I need to look at the raw trace data alongside the inject output at one of the A/B points and follow the packet => decode => inject flow to see why we get a bad address value.
Can you send me the capture you are using to create the examples you quote above, plus instructions on how to reproduce the output you were getting? I'll then turn up the logging on the decoder and walk through the trace decode path.
Regards
Mike
On Tue, 19 Sep 2017 at 23:05, Dehao Chen dehao@google.com wrote:
I agree. If we can fix the issue upstream, we don't want to have hacks downstream to patch the issue.
Dehao
On Tue, Sep 19, 2017 at 2:59 PM, Sebastian Pop sebpop@gmail.com wrote:
On Tue, Sep 19, 2017 at 3:16 PM, Dehao Chen dehao@google.com wrote:
On Tue, Sep 19, 2017 at 1:02 PM, Sebastian Pop sebpop@gmail.com wrote:
By popular demand, I started debugging this problem again.
With the two patches that I posted earlier, the traces seem correct with the exception of a few "holes" where the trace seems to jump over a few instructions that are not reported in the trace, creating jumps that do not exist in the control flow graph of the code.
The nested loop for bubble sort is:
4008a0: 9100e3a0 add x0, x29, #0x38
4008a4: 52800004 mov w4, #0x0 // #0
4008a8: 29400402 ldp w2, w1, [x0]
4008ac: 6b02003f cmp w1, w2
4008b0: 5400006a b.ge 4008bc <sort_array+0x84>
4008b4: 52800024 mov w4, #0x1 // #1
4008b8: 29000801 stp w1, w2, [x0]
4008bc: 91001000 add x0, x0, #0x4
4008c0: eb00007f cmp x3, x0
4008c4: 54ffff21 b.ne 4008a8 <sort_array+0x70>
4008c8: 35fffec4 cbnz w4, 4008a0 <sort_array+0x68>
..... 34: 00000000004008b0 -> 00000000004008b4 0 cycles P 0
..... 35: 00000000004008c4 -> 00000000004008a8 0 cycles P 0
..... 36: 00000000004008b0 -> 00000000004008a8 0 cycles P 0
Edge #36 does not exist in the code, so the trace is not correct here. 4008b0 is "b.ge 4008bc" and should either jump to 4008bc or fall through to the next instruction, 4008b4, but the trace wrongly jumps to 4008a8.
Several hundred jumps later, we see the following sequence:
..... 40: 00000000004008c4 -> 00000000004008a8 0 cycles P 0
..... 41: 00000000004008b0 -> 00000000004008b4 0 cycles P 0
..... 42: 00000000004008c4 -> 00000000004008b4 0 cycles P 0
..... 43: 00000000004008c4 -> 00000000004008a8 0 cycles P 0
where edge #42 is not correct either: 4008c4 should either branch to 4008a8 or fall through to 4008c8.
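As a rough sketch of the check described above (not something perf or the decoder provides), the valid successors of each conditional branch can be derived from the listing and compared against the reported edges. The names BRANCH_TARGETS, EDGES and is_valid_edge are made up for illustration; the addresses and edge numbers are the ones quoted in this mail.

```python
# Conditional branches from the sort_array listing: branch address -> taken target.
BRANCH_TARGETS = {
    0x4008b0: 0x4008bc,  # b.ge 4008bc
    0x4008c4: 0x4008a8,  # b.ne 4008a8
    0x4008c8: 0x4008a0,  # cbnz w4, 4008a0
}
INSN_SIZE = 4  # AArch64 instructions are 4 bytes wide


def is_valid_edge(src, dst):
    """A conditional branch may only go to its target or fall through."""
    if src not in BRANCH_TARGETS:
        return False
    return dst in (BRANCH_TARGETS[src], src + INSN_SIZE)


# Edges quoted above, keyed by the edge number shown in the trace output.
EDGES = {
    34: (0x4008b0, 0x4008b4),
    35: (0x4008c4, 0x4008a8),
    36: (0x4008b0, 0x4008a8),
    40: (0x4008c4, 0x4008a8),
    41: (0x4008b0, 0x4008b4),
    42: (0x4008c4, 0x4008b4),
    43: (0x4008c4, 0x4008a8),
}

bad = [n for n, (src, dst) in sorted(EDGES.items()) if not is_valid_edge(src, dst)]
print(bad)  # edge numbers that do not exist in the control flow graph
```

On the edges quoted here this flags only #36 and #42, which matches the reading of the disassembly above.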
Maybe these inconsistencies are due to interruptions in the trace recording? I think such interruptions cannot be avoided during trace collection.
Dehao, could these wrong edges be fixed in the compiler when reading the coverage file?
I cannot see an easy way for the compiler or the create_gcov tool to handle these issues. Why can't the trace collection tool fix them? This looks like a bug to me.
My thinking was that the compiler knows that there are no edges between the blocks at these addresses and may just ignore the counts at these addresses.
Maybe we can figure out why this pattern occurs and try to solve it in either perf inject or in the decoder? The pattern looks very regular.
These first two errors occur at a distance of 389 branches, and the next error occurs after another 389 branches. If we call the first incorrect jump "A" and the second incorrect jump "B", we have this pattern:
A, 389 correct jumps, B, 389 correct jumps, A, 389 correct jumps, B,
There are 343 occurrences of A and 322 of B in a trace of sorting 3000 elements.
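One way to confirm the regularity on a full trace would be to measure the number of correct edges between consecutive bad ones. The sketch below is only an illustration: gaps_between_bad_edges is a hypothetical helper, and the input is a synthetic sequence built from the pattern described above, not real trace data.

```python
A = (0x4008b0, 0x4008a8)     # incorrect jump "A" (edge #36 above)
B = (0x4008c4, 0x4008b4)     # incorrect jump "B" (edge #42 above)
GOOD = (0x4008c4, 0x4008a8)  # a valid loop back-edge, used here as filler


def gaps_between_bad_edges(edges, bad_edges):
    """Return the count of other edges between each pair of consecutive bad edges."""
    positions = [i for i, edge in enumerate(edges) if edge in bad_edges]
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


# Synthetic sequence: A, 389 correct jumps, B, 389 correct jumps, A.
synthetic = [A] + [GOOD] * 389 + [B] + [GOOD] * 389 + [A]
gaps = gaps_between_bad_edges(synthetic, {A, B})
print(gaps)
```

If the gaps in the real trace are all 389, that would point at something periodic (for example a fixed packet or buffer boundary) rather than random interruptions, though that is a guess until the raw trace is examined.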