next/master boot: 210 boots: 35 failed, 174 passed with 1 conflict (next-20171115)
Full Boot Summary: https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/
Full Build Summary: https://kernelci.org/build/next/branch/master/kernel/next-20171115/
Tree: next
Branch: master
Git Describe: next-20171115
Git Commit: 63fb091c80188ec51f53514d07de907c1dd3d61d
Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Tested: 35 unique boards, 16 SoC families, 30 builds out of 213
Boot Regressions Detected:
arm:
exynos_defconfig: exynos4412-odroidx2_rootfs:nfs: lab-collabora: failing since 6 days (last pass: next-20171107 - first fail: next-20171108)
imx_v4_v5_defconfig: imx27-phytec-phycard-s-rdk: lab-pengutronix: failing since 5 days (last pass: next-20171108 - first fail: next-20171109)
multi_v7_defconfig: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_ARM_LPAE=y: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_EFI=y: am335x-boneblack: lab-collabora: new failure (last pass: next-20171114)
multi_v7_defconfig+CONFIG_EFI=y: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_EFI=y+CONFIG_ARM_LPAE=y: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_LKDTM=y: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y: imx6q-nitrogen6x: lab-free-electrons: failing since 81 days (last pass: next-20170727 - first fail: next-20170825)
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_SMP=n: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y+CONFIG_ARM_MODULE_PLTS=y: am335x-boneblack: lab-collabora: new failure (last pass: next-20171114)
multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y+CONFIG_ARM_MODULE_PLTS=y: exynos5250-snow: lab-collabora: new failure (last pass: next-20171114)
multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y+CONFIG_ARM_MODULE_PLTS=y: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
multi_v7_defconfig+kselftest: exynos5250-snow: lab-collabora: failing since 1 day (last pass: next-20171110 - first fail: next-20171113)
multi_v7_defconfig+kselftest: exynos5422-odroidxu3: lab-collabora: new failure (last pass: next-20171110)
multi_v7_defconfig+kselftest: exynos5800-peach-pi: lab-collabora: failing since 4 days (last pass: next-20171108 - first fail: next-20171110)
multi_v7_defconfig+kselftest: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
omap2plus_defconfig: am335x-boneblack_rootfs:nfs: lab-collabora: failing since 6 days (last pass: next-20171107 - first fail: next-20171108)
tegra_defconfig: tegra124-jetson-tk1_rootfs:nfs: lab-collabora: failing since 6 days (last pass: next-20171107 - first fail: next-20171108)
tegra_defconfig: tegra124-nyan-big: lab-collabora: failing since 6 days (last pass: next-20171107 - first fail: next-20171108)
x86:
defconfig+CONFIG_LKDTM=y: x86-atom330: lab-mhart: new failure (last pass: next-20171114)
defconfig+kselftest: x86-atom330: lab-mhart: failing since 5 days (last pass: next-20171108 - first fail: next-20171109)
x86_64_defconfig: x86-atom330: lab-mhart: failing since 1 day (last pass: next-20171113 - first fail: next-20171114)
Boot Failures Detected:
arm:
multi_v7_defconfig+kselftest exynos5250-snow: 1 failed lab
multi_v7_defconfig+kselftest exynos5422-odroidxu3: 1 failed lab
multi_v7_defconfig+kselftest exynos5800-peach-pi: 1 failed lab
multi_v7_defconfig+kselftest rk3288-rock2-square: 1 failed lab
multi_v7_defconfig+kselftest tegra124-nyan-big: 1 failed lab
multi_v7_defconfig+CONFIG_LKDTM=y tegra124-nyan-big: 1 failed lab
exynos_defconfig exynos4412-odroidx2_rootfs:nfs: 1 failed lab
multi_v7_defconfig tegra124-nyan-big: 1 failed lab
omap2plus_defconfig am335x-boneblack_rootfs:nfs: 1 failed lab
multi_v7_defconfig+CONFIG_ARM_LPAE=y tegra124-nyan-big: 1 failed lab
tegra_defconfig tegra124-nyan-big: 1 failed lab
multi_v7_defconfig+CONFIG_EFI=y+CONFIG_ARM_LPAE=y tegra124-nyan-big: 1 failed lab
multi_v7_defconfig+CONFIG_EFI=y am335x-boneblack: 1 failed lab
multi_v7_defconfig+CONFIG_EFI=y tegra124-nyan-big: 1 failed lab
multi_v7_defconfig+CONFIG_SMP=n tegra124-nyan-big: 1 failed lab
multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y+CONFIG_ARM_MODULE_PLTS=y am335x-boneblack: 1 failed lab
multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y+CONFIG_ARM_MODULE_PLTS=y exynos5250-snow: 1 failed lab
multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y+CONFIG_ARM_MODULE_PLTS=y tegra124-nyan-big: 1 failed lab
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y imx6q-nitrogen6x: 1 failed lab
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y rk3288-rock2-square: 1 failed lab
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y tegra124-nyan-big: 1 failed lab
imx_v4_v5_defconfig imx27-phytec-phycard-s-rdk: 1 failed lab
arm64:
defconfig+CONFIG_KASAN=y r8a7795-salvator-x: 1 failed lab
defconfig+kselftest apq8016-sbc: 1 failed lab
defconfig+kselftest bcm2837-rpi-3-b: 1 failed lab
defconfig+kselftest meson-gxl-s905x-khadas-vim: 1 failed lab
defconfig+CONFIG_LKDTM=y r8a7795-salvator-x: 1 failed lab
defconfig r8a7795-salvator-x: 1 failed lab
defconfig+CONFIG_EXPERT=y+CONFIG_ACPI=y r8a7795-salvator-x: 1 failed lab
defconfig+CONFIG_OF_UNITTEST=y r8a7795-salvator-x: 1 failed lab
defconfig+CONFIG_CPU_BIG_ENDIAN=y r8a7795-salvator-x: 1 failed lab
defconfig+CONFIG_RANDOMIZE_BASE=y r8a7795-salvator-x: 1 failed lab
x86:
defconfig+kselftest x86-atom330: 1 failed lab
x86_64_defconfig x86-atom330: 1 failed lab
defconfig+CONFIG_LKDTM=y x86-atom330: 1 failed lab
Conflicting Boot Failure Detected: (These likely are not failures as other labs are reporting PASS. Needs review.)
arm:
tegra_defconfig: tegra124-jetson-tk1_rootfs:nfs: lab-mhart: PASS lab-collabora: FAIL
--- For more info write to info@kernelci.org
On Wed, Nov 15, 2017 at 9:13 AM, kernelci.org bot <bot@kernelci.org> wrote:
imx_v4_v5_defconfig: imx27-phytec-phycard-s-rdk: lab-pengutronix https://kernelci.org/boot/id/5a0bf25f59b5149ef01cdd3b/: failing since 5 days https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/#regressions (last pass: next-20171108 https://kernelci.org/boot/id/5a02ca3059b5147ea81cdd21/ - first fail: next-20171109 https://kernelci.org/boot/id/5a0417d359b514c5411cdd35/)
What is the commit causing this boot regression?
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y:
imx6q-nitrogen6x: lab-free-electrons https://kernelci.org/boot/id/5a0c0e5659b514af3d1cdd1a/: failing since 81 days https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/#regressions (last pass: next-20170727 https://kernelci.org/boot/id/5979a0c359b51412eb71ddf7/ - first fail: next-20170825 https://kernelci.org/boot/id/59a0072959b514741bef1de2/)
Is this a real error?
multi_v7_defconfig+CONFIG_PROVE_LOCKING=y succeeds on imx6q-wandboard: https://storage.kernelci.org/next/master/next-20171115/arm/multi_v7_defconfi...
On 15/11/17 11:13, kernelci.org bot wrote:
next/master boot: 210 boots: 35 failed, 174 passed with 1 conflict (next-20171115)
Full Boot Summary: https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/ Full Build Summary: https://kernelci.org/build/next/branch/master/kernel/next-20171115/
Tree: next Branch: master Git Describe: next-20171115 Git Commit: 63fb091c80188ec51f53514d07de907c1dd3d61d Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git Tested: 35 unique boards, 16 SoC families, 30 builds out of 213
Boot Regressions Detected:
arm:
[...]
multi_v7_defconfig: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
I've run another automated bisection with some tweaks to remove known failures and found this breaking change:
commit 859eb05676f67d4960130dff36d3368006716110
Author: Shawn Nematbakhsh <shawnn@chromium.org>
Date:   Fri Sep 8 13:50:11 2017 -0700
platform/chrome: Use proper protocol transfer function
The bisection was run with CONFIG_MODULES and CONFIG_DRM_NOUVEAU disabled and d89e2378a97fafdc74cbf997e7c88af75b81610a ("drivers: flag buses which demand DMA configuration") reverted in each iteration as these things are known to cause other boot failures. See the full log here (from staging.kernelci.org Jenkins):
https://people.collabora.com/~gtucker/kernelci/bisections/20171116-nyan-big-...
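As an aside, the per-iteration tweaks described above can be wrapped into a "git bisect run" helper. Here is a minimal sketch only, assuming ARCH and CROSS_COMPILE are already exported in the environment; the boot-test.sh command and its board argument are hypothetical placeholders and not the actual kernelci.org tooling:

#!/usr/bin/env python3
"""Minimal sketch of a "git bisect run" helper along the lines described
above: each iteration disables the options known to break boots, reverts
the known-breaking commit, builds the kernel and boot-tests it.  The
boot-test.sh command and board name are placeholders, not the real
kernelci.org tooling."""

import subprocess
import sys

KNOWN_BAD_COMMIT = "d89e2378a97fafdc74cbf997e7c88af75b81610a"
DISABLED_OPTIONS = ("MODULES", "DRM_NOUVEAU")


def run(*cmd):
    """Run a command in the kernel tree; return True if it succeeded."""
    return subprocess.run(cmd).returncode == 0


def main():
    # Revert the commit known to cause unrelated boot failures, in every iteration.
    if not run("git", "revert", "--no-edit", KNOWN_BAD_COMMIT):
        run("git", "revert", "--abort")
        return 125                           # cannot test this revision: skip it

    try:
        # Base config for the board, with the known-bad options disabled.
        ok = run("make", "multi_v7_defconfig")
        for opt in DISABLED_OPTIONS:
            ok = ok and run("scripts/config", "--disable", opt)
        ok = ok and run("make", "olddefconfig")
        ok = ok and run("make", "-j8", "zImage", "dtbs")
        if not ok:
            return 125                       # build failure: skip, do not mark bad

        # Hypothetical boot test, e.g. submitting a LAVA job and waiting for
        # the verdict; a non-zero exit means the board failed to boot.
        return 0 if run("./boot-test.sh", "tegra124-nyan-big") else 1
    finally:
        # Drop the temporary revert so the next bisection step starts clean.
        run("git", "reset", "--hard", "HEAD~1")


if __name__ == "__main__":
    sys.exit(main())

Such a script would be invoked with something like "git bisect run ./bisect-test.py" between the last known good and first known bad revisions; the actual kernelci.org bisections run through Jenkins and LAVA instead, as in the log linked above.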
As the kernelci.org automated bisection tool is still experimental, I then did a couple of checks to confirm whether this commit was still an issue:
* 859eb056 with MODULES and DRM_NOUVEAU disabled, failed: https://lava.collabora.co.uk/scheduler/job/992002
* same as above but with 859eb056 reverted in-place, passes: https://lava.collabora.co.uk/scheduler/job/992003
* next-20171115 with MODULES and DRM_NOUVEAU disabled, and d89e2378 reverted, fails: https://lava.collabora.co.uk/scheduler/job/992004
* same as above but with 859eb056 reverted on top, passes: https://lava.collabora.co.uk/scheduler/job/992005
So the commit found by the bisection still appears to be causing problems on tegra124-nyan-big. I haven't investigated any further, so I don't know whether other platforms show the same symptoms, but at least this should narrow things down a bit.
Note: with this in mind, another possible bisection would be to enable DRM_NOUVEAU again while reverting the two known breaking commits, to try to find what is actually causing the issue in that driver. I'm not planning to do this right now, though...
Hope this helps!
Guillaume
On 16/11/17 14:42, Guillaume Tucker wrote:
On 15/11/17 11:13, kernelci.org bot wrote:
next/master boot: 210 boots: 35 failed, 174 passed with 1 conflict (next-20171115)
Full Boot Summary: https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/ Full Build Summary: https://kernelci.org/build/next/branch/master/kernel/next-20171115/
Tree: next Branch: master Git Describe: next-20171115 Git Commit: 63fb091c80188ec51f53514d07de907c1dd3d61d Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git Tested: 35 unique boards, 16 SoC families, 30 builds out of 213
Boot Regressions Detected:
arm:
[...]
multi_v7_defconfig: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
I've run another automated bisection with some tweaks to remove known failures and found this breaking change:
commit 859eb05676f67d4960130dff36d3368006716110
Author: Shawn Nematbakhsh <shawnn@chromium.org>
Date:   Fri Sep 8 13:50:11 2017 -0700
platform/chrome: Use proper protocol transfer function
Should have added, this is essentially the kernel error:
[ 1.711581] kernel BUG at drivers/platform/chrome/cros_ec_proto.c:34!
[ 1.718004] Internal error: Oops - BUG: 0 [#1] SMP ARM
Please see the links to the LAVA jobs below for the full logs.
The bisection was run with CONFIG_MODULES and CONFIG_DRM_NOUVEAU disabled and d89e2378a97fafdc74cbf997e7c88af75b81610a ("drivers: flag buses which demand DMA configuration") reverted in each iteration as these things are known to cause other boot failures. See the full log here (from staging.kernelci.org Jenkins):
https://people.collabora.com/~gtucker/kernelci/bisections/20171116-nyan-big-...
As the kernelci.org automated bisection tool is still experimental, I then did a couple of checks to confirm whether this commit was still an issue:
859eb056 with MODULES and DRM_NOUVEAU disabled, failed: https://lava.collabora.co.uk/scheduler/job/992002
same as above but with 859eb056 reverted in-place, passes: https://lava.collabora.co.uk/scheduler/job/992003
next-20171115 with MODULES and DRM_NOUVEAU disabled, and d89e2378 reverted, fails: https://lava.collabora.co.uk/scheduler/job/992004
same as above but with 859eb056 reverted on top, passes: https://lava.collabora.co.uk/scheduler/job/992005
So the commit found by the bisection still appears to be causing problems on tegra124-nyan-big. I haven't investigated any further, so I don't know whether other platforms show the same symptoms, but at least this should narrow things down a bit.
Note: with this in mind, another possible bisection would be to enable DRM_NOUVEAU again while reverting the two known breaking commits, to try to find what is actually causing the issue in that driver. I'm not planning to do this right now, though...
Hope this helps!
Guillaume
On 16/11/17 14:50, Guillaume Tucker wrote:
On 16/11/17 14:42, Guillaume Tucker wrote:
On 15/11/17 11:13, kernelci.org bot wrote:
next/master boot: 210 boots: 35 failed, 174 passed with 1 conflict (next-20171115)
Full Boot Summary: https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/
Full Build Summary: https://kernelci.org/build/next/branch/master/kernel/next-20171115/
Tree: next Branch: master Git Describe: next-20171115 Git Commit: 63fb091c80188ec51f53514d07de907c1dd3d61d Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git Tested: 35 unique boards, 16 SoC families, 30 builds out of 213
Boot Regressions Detected:
arm:
[...]
multi_v7_defconfig: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass: next-20171102 - first fail: next-20171103)
I've run another automated bisection with some tweaks to remove known failures and found this breaking change:
commit 859eb05676f67d4960130dff36d3368006716110
Author: Shawn Nematbakhsh <shawnn@chromium.org>
Date:   Fri Sep 8 13:50:11 2017 -0700
platform/chrome: Use proper protocol transfer function
Should have added, this is essentially the kernel error:
[ 1.711581] kernel BUG at drivers/platform/chrome/cros_ec_proto.c:34!
[ 1.718004] Internal error: Oops - BUG: 0 [#1] SMP ARM
Please see the links to the LAVA jobs below for the full logs.
This one is a known issue (which I believe I have mentioned a couple times). There is a fix available [0].
I am not sure that continuing to bisect this is going to bear any fruit, given the number of issues plaguing this board at the moment. It seems to be a DRM-related issue, and when I get some time I will see if I can figure out what is causing it. This is the furthest I have gotten so far [1].
Jon
[0] https://patchwork.kernel.org/patch/9974835/
[1] https://www.spinics.net/lists/arm-kernel/msg616616.html
On 16/11/17 17:55, Jon Hunter wrote:
On 16/11/17 14:50, Guillaume Tucker wrote:
On 16/11/17 14:42, Guillaume Tucker wrote:
On 15/11/17 11:13, kernelci.org bot wrote:
next/master boot: 210 boots: 35 failed, 174 passed with 1 conflict (next-20171115)
Full Boot Summary: https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20171115/
Full Build Summary: https://kernelci.org/build/next/branch/master/kernel/next-20171115/
Tree: next Branch: master Git Describe: next-20171115 Git Commit: 63fb091c80188ec51f53514d07de907c1dd3d61d Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git Tested: 35 unique boards, 16 SoC families, 30 builds out of 213
Boot Regressions Detected:
arm:
[...]
multi_v7_defconfig: tegra124-nyan-big: lab-collabora: failing since 11 days (last pass:
next-20171102 - first fail: next-20171103)
I've run another automated bisection with some tweaks to remove known failures and found this breaking change:
commit 859eb05676f67d4960130dff36d3368006716110
Author: Shawn Nematbakhsh <shawnn@chromium.org>
Date:   Fri Sep 8 13:50:11 2017 -0700
platform/chrome: Use proper protocol transfer function
Should have added, this is essentially the kernel error:
[ 1.711581] kernel BUG at drivers/platform/chrome/cros_ec_proto.c:34!
[ 1.718004] Internal error: Oops - BUG: 0 [#1] SMP ARM
Please see the links to the LAVA jobs below for the full logs.
This one is a known issue (which I believe I have mentioned a couple times). There is a fix available [0].
Right, it's true that this kernel error showed up in the previous bisection I ran last week. As the bisection landed on a different commit this time, I thought it was worth sharing the result.
I am not sure that continuing to bisect this is going to bear any fruit, given the number of issues plaguing this board at the moment. It seems to be a DRM-related issue, and when I get some time I will see if I can figure out what is causing it. This is the furthest I have gotten so far [1].
Thanks for the update. Please keep in mind that DRM_NOUVEAU was disabled in every iteration this time, so there might also be something else going on that is unrelated to DRM.
Agreed, it's not worth bisecting much further. In fact I found these issues mostly as a side effect of working on the automatic bisection mechanism in kernelci.org, and linux-next is especially tricky to bisect. On this subject, we'll probably start by enabling this on stable trees and mainline, then see what we can do with linux-next and other tricky trees at a later stage.
Thanks, Guillaume
[0] https://patchwork.kernel.org/patch/9974835/
[1] https://www.spinics.net/lists/arm-kernel/msg616616.html