Dear fellow firmware aficionados,
Static ACPI has been adopted by Mercedes and other silicon vendors to:
- meet the safety requirements
- stay away from DT lifecycle issues
- leverage chiplet and CXL bindings
- be truly multi-host/hypervisor (or even secure/non-secure, should people want it), as bindings are defined in an ad-hoc forum (not by an OS community)
DT community leaders and enthusiasts: I believe a discussion of the bigger picture, namely the long-run relevance of DT, may be needed, as many embedded solutions will follow Mercedes' example.
Constructively yours,
François-Frédéric
PS: static ACPI can be handled by a simple parser, does not require executing any ACPI byte code, is discoverable via the EFI configuration tables, and its code base is even smaller than libfdt.
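To make that last point concrete, here is a purely illustrative, untested C sketch (abbreviated structures, no checksum or length validation, physical addresses assumed identity-mapped) of how static ACPI tables can be located from the EFI configuration table and walked without any AML interpreter:

#include <stdint.h>
#include <string.h>

/* ACPI 2.0+ table GUID from the UEFI spec: 8868e871-e4f1-11d3-bc22-0080c73c8881 */
typedef struct { uint32_t a; uint16_t b, c; uint8_t d[8]; } efi_guid;
static const efi_guid acpi20_guid = {
	0x8868e871, 0xe4f1, 0x11d3, { 0xbc, 0x22, 0x00, 0x80, 0xc7, 0x3c, 0x88, 0x81 }
};

/* mirrors EFI_CONFIGURATION_TABLE */
struct efi_cfg { efi_guid guid; void *table; };

/* abbreviated ACPI structures; only the fields used here */
struct rsdp { char sig[8]; uint8_t csum; char oemid[6]; uint8_t rev;
	      uint32_t rsdt; uint32_t len; uint64_t xsdt; };
struct sdt_hdr { char sig[4]; uint32_t len; };	/* full header is 36 bytes */

const struct sdt_hdr *find_static_table(const struct efi_cfg *cfg, size_t n,
					const char sig[4])
{
	for (size_t i = 0; i < n; i++) {
		if (memcmp(&cfg[i].guid, &acpi20_guid, sizeof(efi_guid)))
			continue;
		const struct rsdp *r = cfg[i].table;
		const struct sdt_hdr *xsdt = (const void *)(uintptr_t)r->xsdt;
		/* XSDT body is an array of 64-bit physical table addresses */
		size_t entries = (xsdt->len - 36) / 8;
		const uint64_t *ent = (const void *)((const uint8_t *)xsdt + 36);
		for (size_t j = 0; j < entries; j++) {
			const struct sdt_hdr *t = (const void *)(uintptr_t)ent[j];
			if (!memcmp(t->sig, sig, 4))
				return t;	/* e.g. "APIC" or "SPCR" */
		}
	}
	return NULL;
}

Everything here is plain table walking; nothing is executed.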
ff ff@shokubai.tech wrote on Mon, 8 Sept 2025, 19:27:
Dear fellow firmware aficionados,
Static ACPI has been adopted by Mercedes and other silicon vendors to:
- meet the safety requirements
- stay away from DT lifecycle issues
- leverage chiplet and CXL bindings
- be truly multi-host/hypervisor (or even secure/non-secure, should people want it), as bindings are defined in an ad-hoc forum (not by an OS community)
Hello François,
Thanks for sharing.
Which organization do you refer to by ad-hoc forum? Ad-hoc does not sound like a specification body. Wouldn't this work be done in the UEFI Forum?
If ACPI loses the dynamic powers of ASL, what purpose would it serve that is not already covered by device trees?
Do the Mercedes aficionados plan to upstream the driver changes?
Best regards
Heinrich
...
On 8 Sep 2025, at 20:43, Heinrich Schuchardt heinrich.schuchardt@canonical.com wrote:
ff ff@shokubai.tech wrote on Mon, 8 Sept 2025, 19:27: Dear fellow firmware aficionados,
...
Hello François,
Thanks for sharing.
Which organization do you refer to by ad-hoc forum? Ad-hoc does not sound like a specification body. Wouldn't this work be done in the UEFI Forum?

Yes.
If ACPI loses the dynamic powers of ASL, what purpose would it serve that is not already covered by device trees?

Absolutely. And so, from a descriptive point of view, they are « equal ». The existential DT problem is its life cycle, i.e. it is not provided by firmware (secure or not). A new forum should be established, addressing Arm, RISC-V and x86, to define DT (EBBR is a good example that cross-arch collaboration can be done). And the Linux community shall stop tinkering with this all the time.
Do the Mercedes aficionados plan to upstream the driver changes?

It will be, and partly already is, in the UEFI Forum. There will be public publications by other vendors.
Best regards
Heinrich
...
Hi,
On Mon, 8 Sept 2025 at 15:41, ff ff@shokubai.tech wrote:
...
DT community leaders and enthusiasts: I believe a discussion of the bigger picture, namely the long-run relevance of DT, may be needed, as many embedded solutions will follow Mercedes' example.
IMO the reason ACPI doesn't have to worry about the OS needing a particular devicetree is that the ACPI tables don't describe everything. The new way to handle this seems to be with an OEM- or device-specific Windows driver. I'm not sure how that would work in Linux. My Qualcomm laptop (using Linux) currently just reboots if it gets too hot.
We can deal with the devicetree being in two places, e.g. see this blog post [1] and standard boot [2].
Regards, Simon
[1] https://u-boot.org/blog/supercharging-fits-u-boots-new-two-stage-boot-capabi... [2] https://docs.u-boot.org/en/latest/develop/bootstd/overview.html
On Fri, 19 Sept 2025 at 17:10, Simon Glass sjg@chromium.org wrote:
...
IMO the reason ACPI doesn't have to worry about the OS needing a particular devicetree is that the ACPI tables don't describe everything.
ACPI tables and AML do describe everything that cannot describe itself; otherwise, how would the OS know about the presence of those undescribed peripherals?
The main difference is the level of abstraction: AML carries code logic along with the device description that can en/disable the device and put it into different power states. This is backed by so-called OperationRegions, which are ways to expose [abstracted] SPI, I2C and serial buses to the AML interpreter (as well as MMIO memory), so that the code sequences effectuating things like power state changes can be reduced to pokes of device registers, regardless of how those are accessed on the particular system.
On x86, many onboard devices are simply described as PCIe devices, even though they are not actually connected to any PCIe fabric. This solves the self-description problem, vastly reducing the number of devices that need to be described via AML.
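(For illustration: once a peripheral shows up as a PCI function, a completely generic enumerator finds it without any per-board description. A hedged sketch of an ECAM config-space walk over bus 0; the ECAM base address below is an invented placeholder, and in reality it comes from ACPI's MCFG table:)

#include <stdint.h>
#include <stdio.h>

/* invented placeholder for the platform's ECAM window */
#define ECAM_BASE ((volatile uint8_t *)0xE0000000u)

static uint32_t cfg_read32(unsigned bus, unsigned dev, unsigned fn, unsigned off)
{
	/* ECAM: bus << 20 | device << 15 | function << 12 | register offset */
	return *(volatile uint32_t *)(ECAM_BASE +
		((uintptr_t)bus << 20 | dev << 15 | fn << 12 | off));
}

void enumerate_bus0(void)
{
	/* root complex integrated endpoints show up on bus 0 like any device */
	for (unsigned dev = 0; dev < 32; dev++) {
		uint32_t id = cfg_read32(0, dev, 0, 0);
		if ((id & 0xffff) == 0xffff)
			continue;	/* no function present */
		printf("00:%02x.0 vendor %04x device %04x\n",
		       dev, id & 0xffff, id >> 16);
	}
}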
Also, there is a lot more homogeneity in how the system topology is constructed: on embedded systems, it is quite common to, e.g., tie the PHY interrupt line from the PCIe NIC to some GPIO controller that is not naturally associated with that device at all; this is something ACPI struggles with, and where DT shines.
DT simply operates at a different abstraction level - it describes every detail of the system topology, including every clock generator and power source. This makes it very flexible and very powerful, but also a maintenance burden: e.g., if some OEM issues a v2 of some board where one clock generator IC has been replaced because the original is EOL, it requires a new DT and potentially an OS update if the new part was not supported yet. ACPI is more flexible here, as it can simply ship different ACPI tables that make the v2 board look 100% identical to the v1 as far as the OS is concerned.
But the problem of having to worry about the OS needing a particular devicetree has nothing to do with any of this. The problem here is that there is no process or hygiene in the kernel community around backward compatibility. The exact same piece of equipment is described in a different way in every kernel version, and how these descriptions differ from one another is not documented.
If DT bindings were versioned, and drivers would remain compatible with the old version as support for a new one is added, many of the kernel vs DT version issues would go away, and only actual bugs/inaccuracies in device trees would require firmware updates or other means to switch over to an updated version.
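As a hedged sketch of what that could look like on the driver side ("vendor,foo-clk" and its v2 compatible are invented examples, not real bindings): the match table keeps honouring the old compatible while the new one is added, so existing DTBs keep booting:

#include <linux/module.h>
#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/property.h>

enum foo_binding { FOO_BINDING_V1 = 1, FOO_BINDING_V2 };

static const struct of_device_id foo_of_match[] = {
	/* v1: original binding, kept working indefinitely */
	{ .compatible = "vendor,foo-clk",    .data = (void *)FOO_BINDING_V1 },
	/* v2: adds a required clock; new DTBs use this */
	{ .compatible = "vendor,foo-clk-v2", .data = (void *)FOO_BINDING_V2 },
	{ }
};
MODULE_DEVICE_TABLE(of, foo_of_match);

static int foo_probe(struct platform_device *pdev)
{
	enum foo_binding ver =
		(enum foo_binding)(uintptr_t)device_get_match_data(&pdev->dev);

	if (ver == FOO_BINDING_V1)
		/* old DTB: the clock added in v2 is absent, apply defaults */
		dev_info(&pdev->dev, "legacy binding, using default clock\n");

	return 0;
}

static struct platform_driver foo_driver = {
	.probe	= foo_probe,
	.driver	= {
		.name		= "foo-clk",
		.of_match_table	= foo_of_match,
	},
};
module_platform_driver(foo_driver);

MODULE_DESCRIPTION("Sketch of a driver honouring two binding versions");
MODULE_LICENSE("GPL");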
However, there is no ambition whatsoever in the Linux community to address these issues. Developers are actively opposed to putting DTs in firmware, because then they are on the hook to honour the bindings indefinitely.
The new way to handle this seems to be with an OEM- or device-specific Windows driver. I'm not sure how that would work in Linux.
This is why the Windows-on-ARM ACPI laptops cannot boot in ACPI mode in Linux: ACPI is not suitable for describing these systems, and so on Windows, they basically re-invented board files for ACPI (called PEP). This is mostly due to the complex SoC topology, which pure ACPI cannot describe with sufficient accuracy and so they just ship some board-specific drivers and wire them up via minimal ACPI abstractions.
My Qualcomm laptop (using Linux) currently just reboots if it gets too hot.
Not sure what you are trying to say here. Is this a dig at ACPI? Or Windows? Or both?
Hi Ard,
On Fri, 19 Sept 2025 at 09:50, Ard Biesheuvel ardb@kernel.org wrote:
...
IMO the reason ACPI doesn't have to worry about the OS needing a particular devicetree is that the ACPI tables don't describe everything.
ACPI tables and AML do describe everything that cannot describe itself; otherwise, how would the OS know about the presence of those undescribed peripherals?
Indeed, but you have actually explained this yourself below.
...
There is also the PEP addition you mention below, which I tend to see as an admission that ACPI cannot handle the complexity of modern systems.
...
However, there is no ambition whatsoever in the Linux community to address these issues. Developers are actively opposed to putting DTs in firmware, because then they are on the hook to honour the bindings indefinitely.
Yes, I agree. I did see that Rob Herring did some work on checking for incompatible changes in the schema, but I have not been following it.
For now we need to support having the DT in both firmware and the OS.
The new way to handle this seems to be with an OEM- or device-specific Windows driver. I'm not sure how that would work in Linux.
This is why the Windows-on-ARM ACPI laptops cannot boot in ACPI mode in Linux: ACPI is not suitable for describing these systems, and so on Windows, they basically re-invented board files for ACPI (called PEP). This is mostly due to the complex SoC topology, which pure ACPI cannot describe with sufficient accuracy and so they just ship some board-specific drivers and wire them up via minimal ACPI abstractions.
This is exactly my point.
My Qualcomm laptop (using Linux) currently just reboots if it gets too hot.
Not sure what you are trying to say here. Is this a dig at ACPI? Or Windows? Or both?
Neither... I'm just pointing out the implications of ACPI for these systems. Without a driver for the complex thermal (in this case) trade-offs, they are not reliable. We need DT rather than ACPI. We also need a way to control the fan / read temperature sensors etc., which may mean talking to an EC. So we also need a kernel driver for that, or ideally some standard message format.
Regards, Simon
On Tue, 23 Sept 2025 at 21:32, Simon Glass sjg@chromium.org wrote:
Hi Ard,
...
There is also the PEP addition you mention below, which I tend to see as an admission that ACPI cannot handle the complexity of modern systems.
No. The problem is not the complexity itself, but the fact that it is exposed to software.
x86 systems are just as complex, but they a) make more effort to abstract away the OS-visible differences in firmware, and b) design the system with ACPI in mind, e.g., masquerade on-board peripherals as PCIe devices (so-called root complex integrated endpoints) so they can describe themselves, and use PCI standard abstractions for configuration and power management.
My Qualcomm laptop (using Linux) currently just reboots if it gets too hot.
Not sure what you are trying to say here. Is this a dig at ACPI? Or Windows? Or both?
Neither...I'm just pointing out the implications of ACPI for these systems. Without a driver for the complex thermal (in this case) trade-offs, they are not reliable.
Indeed.
We need DT rather than ACPI.
I tend to agree with you, but not for the reason you might think.
The ACPI vs DT debate gets very religious and heated at times, but it is often like watching people argue over whether hammers are fundamentally better than screwdrivers: it really depends on whether you are using nails or screws, and ACPI is really a much better solution than DT for certain markets.
However, the reason I think we need DT for these systems is the fact that there is prior art there. Many of these SoCs and subsystems (e.g., Qualcomm) are already shipping in major volumes with DT on Android phones, as well as Chrome OS, which are markets where performance and energy use are meticulously measured and managed.
Any effort to bring up Linux+ACPI on those SoCs in parallel for a niche market such as Linux laptops is bound to be futile, and it is much better to build on the existing DT support to fill in the blanks.
The problem, of course, is that the idea that we would maintain the DTs for these systems in the kernel tree is laughable. So either these systems need to ship as vertically integrated systems (Android, CrOS), or we need to muster the self-discipline to create a DT description and *stick with it* rather than drop it like a brick as soon as the Linux minor version changes, so that we can support users installing their own Linux distros.
Hi Ard,
On Wed, 24 Sept 2025 at 10:15, Ard Biesheuvel ardb@kernel.org wrote:
...
There is also the PEP addition you mention below, which I tend to see as an admission that ACPI cannot handle the complexity of modern systems.
No. The problem is not the complexity itself, but the fact that it is exposed to software.
x86 systems are just as complex, but they a) make more effort to abstract away the OS-visible differences in firmware, and b) design the system with ACPI in mind, e.g., masquerade on-board peripherals as PCIe devices (so-called root complex integrated endpoints) so they can describe themselves, and use PCI standard abstractions for configuration and power management.
Right. But are you saying that Windows shouldn't have PEP drivers? Or that Linux shouldn't need them?
...
OK, makes sense. I agree a religious discussion isn't very useful, and you've pointed out the difference in design, which explains a lot of this.
The problem, of course, is that the idea that we would maintain the DTs for these systems in the kernel tree is laughable. So either these systems need to ship as vertically integrated systems (Android, CrOS), or we need to muster the self-discipline to create a DT description and *stick with it* rather than drop it like a brick as soon as the Linux minor version changes, so that we can support users installing their own Linux distros.
Yes.
I'm assuming no one has a magic solution for this?
One option could be for OEMs to provide a devicetree package for each kernel version, perhaps in a /boot/oem directory, with the firmware / bootloader selecting the closest one available (see the sketch below). In other words, we try to solve the problem of 'OEMs owning the platform vs. distros owning the OS' by separating the concerns.
I suppose another would be to separate the DTs into a package for each SoC vendor or family (but still distributed by the distro and associated with the kernel), so we don't need to install lots of unnecessary cruft.
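For the first option, the selection logic could be as simple as the following untested sketch (the /boot/oem layout and the "board-<major>.<minor>.dtb" naming scheme are invented for illustration):

#include <dirent.h>
#include <stdio.h>

/* expects names like "board-6.6.dtb"; returns 1 on success */
static int parse_ver(const char *name, int *maj, int *min)
{
	return sscanf(name, "board-%d.%d.dtb", maj, min) == 2;
}

/* pick the newest dtb that is not newer than the running kernel */
int pick_dtb(const char *dir, int kmaj, int kmin, char *out, size_t outsz)
{
	DIR *d = opendir(dir);	/* e.g. "/boot/oem" */
	struct dirent *e;
	int best_maj = -1, best_min = -1;

	if (!d)
		return -1;
	while ((e = readdir(d)) != NULL) {
		int maj, min;

		if (!parse_ver(e->d_name, &maj, &min))
			continue;
		if (maj > kmaj || (maj == kmaj && min > kmin))
			continue;	/* newer than the kernel: skip */
		if (maj > best_maj || (maj == best_maj && min > best_min)) {
			best_maj = maj;
			best_min = min;
			snprintf(out, outsz, "%s/%s", dir, e->d_name);
		}
	}
	closedir(d);
	return best_maj < 0 ? -1 : 0;
}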
Regards, Simon
On Wed, 24 Sept 2025 at 18:27, Simon Glass sjg@chromium.org wrote:
...
Right. But are you saying that Windows shouldn't have PEP drivers? Or that Linux shouldn't need them?
ACPI + PEP does not provide the advantage of the higher abstraction level that 'pure' ACPI provides. Windows only supports ACPI, so PEP was bolted onto the side to be able to support these systems. Linux should not implement ACPI + PEP, because it serves the same purpose as DT (i.e., a more vertically integrated system), so we already solved that problem.
...
I'm assuming no one has a magic solution for this?
Well, if we cared about breaking DT compatibility as much as Linus makes us care about breaking user space, the problem wouldn't exist.
One option could be for OEMs to provide a devicetree package for each kernel version, perhaps in a /boot/oem directory, with the firmware / bootloader selecting the closest one available. In other words, we try to solve the problem of 'OEMs owning the platform vs. distros owning the OS' by separating the concerns.
No. The problem is on the kernel side, and that is where we should fix it.
Perhaps add a meta-property to DT bindings that indicates whether they will be kept compatible going forward, and tell OEMs to only use ones that do?
That will create some friction in the beginning, but at least it highlights the problem where it actually occurs: changes being made to drivers and bindings with zero regard for systems using them in production.
I suppose another would be to separate the DTs into a package for each SoC vendor or family (but still distributed by the distro and associated with the kernel), so we don't need to install lots of unnecessary cruft.
What we should be addressing is this mindset that DTs are perpetually evolving things even when the hardware they describe has not changed in years.
On 29 Sept 2025, at 14:54, Ard Biesheuvel ardb@kernel.org wrote:
On Wed, 24 Sept 2025 at 18:27, Simon Glass sjg@chromium.org wrote:
...
ACPI + PEP does not provide the advantage of the higher abstraction level that 'pure' ACPI provides. Windows only supports ACPI, so PEP was bolted onto the side to be able to support these systems. Linux should not implement ACPI + PEP, because it serves the same purpose as DT (i.e., a more vertically integrated system), so we already solved that problem.
The data representation is mostly solved, but not in a multi-OS, multi-hypervisor environment. I believe the root cause is that DT (like ACPI) is not used as a hardware representation but as « everything a driver of a particular OS needs to know to do its job », centralized, and the « particular OS » is Linux. That is very visible in the power, clock and IRQ domains: System Device Tree was proposed to better describe hardware relations/dependencies.

Other illustrations that DT is not used purely as a hardware representation: the clock domain representation needs to be very complete for the secure world of some SoCs, while the Linux world only sees a fraction of it; panels with basic VGA could be bound to a minimalistic scheme for the secure world while being fully described for Linux; SerDes configuration is not a choice but a consequence of board wiring (which may be HATs).

I am not saying that there should be a unique DT for the entire system, but that there should be a unique hardware description (probably System DT + DT) with different « projections » for secure FW, non-secure FW, OS... That arrangement would fit a firmware-supplied DT, and it needs an organization independent from any OS, FW or hypervisor origin.
...
Hi Ard,
On Mon, 29 Sept 2025 at 06:54, Ard Biesheuvel ardb@kernel.org wrote:
...
ACPI + PEP does not provide the advantage of the higher abstraction level that 'pure' ACPI provides. Windows only supports ACPI, so PEP was bolted onto the side to be able to support these systems. Linux should not implement ACPI + PEP, because it serves the same purpose as DT (i.e., a more vertically integrated system), so we already solved that problem.
Good.
...
Well, if we cared about breaking DT compatibility as much as Linus makes us care about breaking user space, the problem wouldn't exist.
But there is the question of when a DT compatible is considered stable. See below.
...
Perhaps add a meta-property to DT bindings that indicates whether they will be kept compatible going forward, and tell OEMs to only use ones that do?
That will create some friction in the beginning, but at least it highlights the problem where it actually occurs: changes being made to drivers and bindings with zero regard for systems using them in production.
I suppose another would be to separate the DTs into a package for each SoC vendor or family (but still distributed by the distro and associated with the kernel), so we don't need to install lots of unnecessary cruft.
What we should be addressing is this mindset that DTs are perpetually evolving things even when the hardware they describe has not changed in years.
There is also the somewhat separate challenge of vendors failing to upstream their code and DT before the hardware has been released, meaning that a downstream kernel/DT is used for a period of a year or two. Perhaps we should have a node and a compatible (prefix?) to indicate the schema is subject to change?
It is surprising to me that Linus is not worried about this issue. Have you asked him about it?
Regards, Simon
On Mon, Sep 29, 2025 at 7:55 AM Ard Biesheuvel ardb@kernel.org wrote:
...
Well, if we cared about breaking DT compatibility as much as Linus makes us care about breaking user space, the problem wouldn't exist.
I try, but I'm not Linus, nor can I police everything. I think we need tools to detect this first; then we can decide if and when compatibility breaks are okay. Sometimes they are unavoidable or just don't matter (e.g. new h/w which has no users). How to distinguish stable vs. unstable platforms has been discussed multiple times in the past with no conclusion.
...
Perhaps add a meta-property to DT bindings that indicates whether they will be kept compatible going forward, and tell OEMs to only use ones that do?
The challenge is that there are multiple aspects to being compatible: it's at both the binding level and the level of the DTB as a whole.
At the binding level, there are changes to required properties or to the entries of properties (e.g. a new required clock). Now that we have schemas, we can actually check for these changes. I have a PoC tool that can detect them. It seems to work okay unless the schema is also restructured in some way. I haven't figured out how exactly to integrate it into our processes.
At the DTB level, we need to check for changed compatibles. For example, a platform changing from fixed clocks to a clock controller breaks forward compatibility, as an existing OS will not have the clock driver. Adding pinctrl or power-domains later on creates similar problems.
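These checks are fairly mechanical. As an untested sketch using libfdt's read-only API, a tool can dump every compatible string in a DTB so that two versions can be diffed:

#include <libfdt.h>
#include <stdio.h>
#include <string.h>

void dump_compatibles(const void *fdt)
{
	int node;

	if (fdt_check_header(fdt))
		return;
	/* offset -1 starts the traversal at the root node */
	for (node = fdt_next_node(fdt, -1, NULL); node >= 0;
	     node = fdt_next_node(fdt, node, NULL)) {
		int len;
		const char *prop = fdt_getprop(fdt, node, "compatible", &len);

		/* "compatible" is a list of NUL-terminated strings */
		for (const char *s = prop; prop && s < prop + len;
		     s += strlen(s) + 1)
			printf("%s\n", s);
	}
}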
These aren't really hard tools to write, but no one seems to care enough to do something other than complain.
Rob