10.6. PSCI OS-initiated mode
- Author:
Maulik Shah & Wing Li
- Organization:
Qualcomm Innovation Center, Inc. & Google LLC
- Contact:
Maulik Shah <quic_mkshah@quicinc.com> & Wing Li <wingers@google.com>
- Status:
Accepted
10.6.1. Introduction
10.6.1.1. Power state coordination
A power domain topology is a logical hierarchy of power domains in a system that arises from the physical dependencies between power domains.
Local power states describe power states for an individual node, and composite power states describe the combined power states for an individual node and its parent node(s).
Entry into low-power states for a topology node above the core level requires coordinating its children nodes. For example, in a system with a power domain that encompasses a shared cache, and a separate power domain for each core that uses the shared cache, the core power domains must be powered down before the shared cache power domain can be powered down.
PSCI supports two modes of power state coordination: platform-coordinated and OS-initiated.
10.6.1.1.1. Platform-coordinated
Platform-coordinated mode is the default mode of power state coordination, and is currently the only supported mode in TF-A.
In platform-coordinated mode, the platform is responsible for coordinating power states, and chooses the deepest power state for a topology node that can be tolerated by its children.
10.6.1.1.2. OS-initiated
OS-initiated mode is optional.
In OS-initiated mode, the calling OS is responsible for coordinating power states, and may request for a topology node to enter a low-power state when its last child enters the low-power state.
10.6.2. Motivation
There are two reasons why OS-initiated mode might be a more suitable option than platform-coordinated mode for a platform.
10.6.2.1. Scalability
In platform-coordinated mode, each core independently selects their own local power states, and doesn’t account for composite power states that are shared between cores.
In OS-initiated mode, the OS has knowledge of the next wakeup event for each core, and can have more precise control over the entry, exit, and wakeup latencies when deciding if a composite power state (e.g. for a cluster) is appropriate. This is especially important for multi-cluster SMP systems and heterogeneous systems like big.LITTLE, where different processor types can have different power efficiencies.
10.6.2.2. Simplicity
In platform-coordinated mode, the OS doesn’t have visibility when the last core at a power level enters a low-power state. If the OS wants to perform last man activity (e.g. powering off a shared resource when it is no longer needed), it would have to communicate with an API side channel to know when it can do so. This could result in a design smell where the platform is using platform-coordinated mode when it should be using OS-initiated mode instead.
In OS-initiated mode, the OS can perform last man activity if it selects a composite power state when the last core enters a low-power state. This eliminates the need for a side channel, and uses the well documented API between the OS and the platform.
10.6.2.3. Current vendor implementations and workarounds
STMicroelectronics
For their ARM32 platforms, they’re using OS-initiated mode implemented in OP-TEE.
For their future ARM64 platforms, they are interested in using OS-initiated mode in TF-A.
Qualcomm
For their mobile platforms, they’re using OS-initiated mode implemented in their own custom secure monitor firmware.
For their Chrome OS platforms, they’re using platform-coordinated mode in TF-A with custom driver logic to perform last man activity.
Google
They’re using platform-coordinated mode in TF-A with custom driver logic to perform last man activity.
Both Qualcomm and Google would like to be able to use OS-initiated mode in TF-A in order to simplify custom driver logic.
10.6.3. Requirements
10.6.3.1. PSCI_FEATURES
PSCI_FEATURES is for checking whether or not a PSCI function is implemented and what its properties are.
-
PSCI_FEATURES
- Parameters:
func_id – 0x8400_000A.
psci_func_id – the function ID of a PSCI function.
- Return values:
NOT_SUPPORTED – if the function is not implemented.
feature flags associated with the function – if the function is implemented.
10.6.3.1.1. CPU_SUSPEND feature flags
Reserved, bits[31:2]
Power state parameter format, bit[1]
A value of 0 indicates the original format is used.
A value of 1 indicates the extended format is used.
OS-initiated mode, bit[0]
A value of 0 indicates OS-initiated mode is not supported.
A value of 1 indicates OS-initiated mode is supported.
See sections 5.1.14 and 5.15 of the PSCI spec (DEN0022D.b) for more details.
10.6.3.2. PSCI_SET_SUSPEND_MODE
PSCI_SET_SUSPEND_MODE is for switching between the two different modes of power state coordination.
-
PSCI_SET_SUSPEND_MODE
- Parameters:
func_id – 0x8400_000F.
mode – 0 indicates platform-coordinated mode, 1 indicates OS-initiated mode.
- Return values:
SUCCESS – if the request is successful.
NOT_SUPPORTED – if OS-initiated mode is not supported.
INVALID_PARAMETERS – if the requested mode is not a valid value (0 or 1).
DENIED – if the cores are not in the correct state.
Switching from platform-coordinated to OS-initiated is only allowed if the following conditions are met:
All cores are in one of the following states:
Running.
Off, through a call to CPU_OFF or not yet booted.
Suspended, through a call to CPU_DEFAULT_SUSPEND.
None of the cores has called CPU_SUSPEND since the last change of mode or boot.
Switching from OS-initiated to platform-coordinated is only allowed if all cores other than the calling core are off, either through a call to CPU_OFF or not yet booted.
If these conditions are not met, the PSCI implementation must return DENIED.
See sections 5.1.19 and 5.20 of the PSCI spec (DEN0022D.b) for more details.
10.6.3.3. CPU_SUSPEND
CPU_SUSPEND is for moving a topology node into a low-power state.
-
CPU_SUSPEND
- Parameters:
func_id – 0xC400_0001.
power_state – the requested low-power state to enter.
entry_point_address – the address at which the core must resume execution following wakeup from a powerdown state.
context_id – this field specifies a pointer to the saved context that must be restored on a core following wakeup from a powerdown state.
- Return values:
SUCCESS – if the request is successful.
INVALID_PARAMETERS – in OS-initiated mode, this error is returned when a low-power state is requested for a topology node above the core level, and at least one of the node’s children is in a local low-power state that is incompatible with the request.
INVALID_ADDRESS – if the entry_point_address argument is invalid.
DENIED – only in OS-initiated mode; this error is returned when a low-power state is requested for a topology node above the core level, and at least one of the node’s children is running, i.e. not in a low-power state.
In platform-coordinated mode, the PSCI implementation coordinates requests from all cores to determine the deepest power state to enter.
In OS-initiated mode, the calling OS is making an explicit request for a specific power state, as opposed to expressing a vote. The PSCI implementation must comply with the request, unless the request is not consistent with the implementation’s view of the system’s state, in which case, the implementation must return INVALID_PARAMETERS or DENIED.
See sections 5.1.2 and 5.4 of the PSCI spec (DEN0022D.b) for more details.
10.6.3.3.1. Power state formats
Original format
Power Level, bits[25:24]
The requested level in the power domain topology to enter a low-power state.
State Type, bit[16]
A value of 0 indicates a standby or retention state.
A value of 1 indicates a powerdown state.
State ID, bits[15:0]
Field to specify the requested composite power state.
The state ID encodings must uniquely describe every possible composite power state.
In OS-initiated mode, the state ID encoding must allow expressing the power level at which the calling core is the last to enter a powerdown state.
Extended format
State Type, bit[30]
State ID, bits[27:0]
10.6.3.3.2. Races in OS-initiated mode
In OS-initiated mode, there are race windows where the OS’s view and implementation’s view of the system’s state differ. It is possible for the OS to make requests that are invalid given the implementation’s view of the system’s state. For example, the OS might request a powerdown state for a node from one core, while at the same time, the implementation observes that another core in that node is powering up.
To address potential race conditions in power state requests:
The calling OS must specify in each CPU_SUSPEND request the deepest power level for which it sees the calling core as the last running core (last man). This is required even if the OS doesn’t want the node at that power level to enter a low-power state.
The implementation must validate that the requested power states in the CPU_SUSPEND request are consistent with the system’s state, and that the calling core is the last core running at the requested power level, or deny the request otherwise.
See sections 4.2.3.2, 6.2, and 6.3 of the PSCI spec (DEN0022D.b) for more details.
10.6.4. Caveats
10.6.4.1. CPU_OFF
CPU_OFF is always platform-coordinated, regardless of whether the power state coordination mode for suspend is platform-coordinated or OS-initiated. If all cores in a topology node call CPU_OFF, the last core will power down the node.
In OS-initiated mode, if a subset of the cores in a topology node has called CPU_OFF, the last running core may call CPU_SUSPEND to request a powerdown state at or above that node’s power level.
See section 5.5.2 of the PSCI spec (DEN0022D.b) for more details.
10.6.5. Implementation
10.6.5.1. Current implementation of platform-coordinated mode
Platform-coordinated is currently the only supported power state coordination mode in TF-A.
The functions of interest in the psci_cpu_suspend
call stack are as follows:
psci_validate_power_state
This function calls a platform specific
validate_power_state
handler, which takes thepower_state
parameter, and updates thestate_info
object with the requested states for each power level.
psci_find_target_suspend_lvl
This function takes the
state_info
object containing the requested power states for each power level, and returns the deepest power level that was requested to enter a low power state, i.e. the target power level.
psci_do_state_coordination
This function takes the target power level and the
state_info
object containing the requested power states for each power level, and updates thestate_info
object with the coordinated target power state for each level.
pwr_domain_suspend
This is a platform specific handler that takes the
state_info
object containing the target power states for each power level, and transitions each power level to the specified power state.
10.6.5.2. Proposed implementation of OS-initiated mode
To add support for OS-initiated mode, the following changes are proposed:
Add a boolean build option
PSCI_OS_INIT_MODE
for a platform to enable optional support for PSCI OS-initiated mode. This build option defaults to 0.
Note
If PSCI_OS_INIT_MODE=0
, the following changes will not be compiled into
the build.
Update
psci_features
to return 1 in bit[0] to indicate support for OS-initiated mode for CPU_SUSPEND.Define a
suspend_mode
enum:PLAT_COORD
andOS_INIT
.Define a
psci_suspend_mode
global variable with a default value ofPLAT_COORD
.Implement a new function handler
psci_set_suspend_mode
for PSCI_SET_SUSPEND_MODE.Since
psci_validate_power_state
calls a platform specificvalidate_power_state
handler, the platform implementation should populate thestate_info
object based on the state ID from the givenpower_state
parameter.psci_find_target_suspend_lvl
remains unchanged.Implement a new function
psci_validate_state_coordination
that ensures the request satisfies the following conditions, and denies any requests that don’t:The requested power states for each power level are consistent with the system’s state
The calling core is the last core running at the requested power level
This function differs from
psci_do_state_coordination
in that:The
psci_req_local_pwr_states
map is not modified if the request were to be deniedThe
state_info
argument is never modified since it contains the power states requested by the calling OS
Update
psci_cpu_suspend_start
to do the following:If
PSCI_SUSPEND_MODE
isPLAT_COORD
, callpsci_do_state_coordination
.If
PSCI_SUSPEND_MODE
isOS_INIT
, callpsci_validate_state_coordination
. If validation fails, propagate the error up the call stack.
Add a new optional member
pwr_domain_validate_suspend
toplat_psci_ops_t
to allow the platform to optionally perform validations based on hardware states.The platform specific
pwr_domain_suspend
handler remains unchanged.

10.6.6. Testing
The proposed patches can be found at https://review.trustedfirmware.org/q/topic:psci-osi.
10.6.6.1. Testing on FVP and Google platforms
The proposed patches add a new CPU Suspend in OSI mode test suite to TF-A Tests.
This has been enabled and verified on the FVP_Base_RevC-2xAEMvA platform and
Google platforms, and excluded from all other platforms via the build option
PLAT_TESTS_SKIP_LIST
.
10.6.6.2. Testing on STM32MP15
The proposed patches have been tested and verified on the STM32MP15 platform, which has a single cluster with 2 CPUs, by Gabriel Fernandez <gabriel.fernandez@st.com> from STMicroelectronics with this device tree configuration:
cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu0: cpu@0 {
device_type = "cpu";
compatible = "arm,cortex-a7";
reg = <0>;
enable-method = "psci";
power-domains = <&CPU_PD0>;
power-domain-names = "psci";
};
cpu1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a7";
reg = <1>;
enable-method = "psci";
power-domains = <&CPU_PD1>;
power-domain-names = "psci";
};
idle-states {
cpu_retention: cpu-retention {
compatible = "arm,idle-state";
arm,psci-suspend-param = <0x00000001>;
entry-latency-us = <130>;
exit-latency-us = <620>;
min-residency-us = <700>;
local-timer-stop;
};
};
domain-idle-states {
CLUSTER_STOP: core-power-domain {
compatible = "domain-idle-state";
arm,psci-suspend-param = <0x01000001>;
entry-latency-us = <230>;
exit-latency-us = <720>;
min-residency-us = <2000>;
local-timer-stop;
};
};
};
psci {
compatible = "arm,psci-1.0";
method = "smc";
CPU_PD0: power-domain-cpu0 {
#power-domain-cells = <0>;
power-domains = <&pd_core>;
domain-idle-states = <&cpu_retention>;
};
CPU_PD1: power-domain-cpu1 {
#power-domain-cells = <0>;
power-domains = <&pd_core>;
domain-idle-states = <&cpu_retention>;
};
pd_core: power-domain-cluster {
#power-domain-cells = <0>;
domain-idle-states = <&CLUSTER_STOP>;
};
};
10.6.6.3. Testing on Qualcomm SC7280
The proposed patches have been tested and verified on the SC7280 platform by Maulik Shah <quic_mkshah@quicinc.com> from Qualcomm with this device tree configuration:
cpus {
#address-cells = <2>;
#size-cells = <0>;
CPU0: cpu@0 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x0>;
enable-method = "psci";
power-domains = <&CPU_PD0>;
power-domain-names = "psci";
};
CPU1: cpu@100 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x100>;
enable-method = "psci";
power-domains = <&CPU_PD1>;
power-domain-names = "psci";
};
CPU2: cpu@200 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x200>;
enable-method = "psci";
power-domains = <&CPU_PD2>;
power-domain-names = "psci";
};
CPU3: cpu@300 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x300>;
enable-method = "psci";
power-domains = <&CPU_PD3>;
power-domain-names = "psci";
}
CPU4: cpu@400 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x400>;
enable-method = "psci";
power-domains = <&CPU_PD4>;
power-domain-names = "psci";
};
CPU5: cpu@500 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x500>;
enable-method = "psci";
power-domains = <&CPU_PD5>;
power-domain-names = "psci";
};
CPU6: cpu@600 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x600>;
enable-method = "psci";
power-domains = <&CPU_PD6>;
power-domain-names = "psci";
};
CPU7: cpu@700 {
device_type = "cpu";
compatible = "arm,kryo";
reg = <0x0 0x700>;
enable-method = "psci";
power-domains = <&CPU_PD7>;
power-domain-names = "psci";
};
idle-states {
entry-method = "psci";
LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
compatible = "arm,idle-state";
idle-state-name = "little-power-down";
arm,psci-suspend-param = <0x40000003>;
entry-latency-us = <549>;
exit-latency-us = <901>;
min-residency-us = <1774>;
local-timer-stop;
};
LITTLE_CPU_SLEEP_1: cpu-sleep-0-1 {
compatible = "arm,idle-state";
idle-state-name = "little-rail-power-down";
arm,psci-suspend-param = <0x40000004>;
entry-latency-us = <702>;
exit-latency-us = <915>;
min-residency-us = <4001>;
local-timer-stop;
};
BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
compatible = "arm,idle-state";
idle-state-name = "big-power-down";
arm,psci-suspend-param = <0x40000003>;
entry-latency-us = <523>;
exit-latency-us = <1244>;
min-residency-us = <2207>;
local-timer-stop;
};
BIG_CPU_SLEEP_1: cpu-sleep-1-1 {
compatible = "arm,idle-state";
idle-state-name = "big-rail-power-down";
arm,psci-suspend-param = <0x40000004>;
entry-latency-us = <526>;
exit-latency-us = <1854>;
min-residency-us = <5555>;
local-timer-stop;
};
};
domain-idle-states {
CLUSTER_SLEEP_0: cluster-sleep-0 {
compatible = "arm,idle-state";
idle-state-name = "cluster-power-down";
arm,psci-suspend-param = <0x40003444>;
entry-latency-us = <3263>;
exit-latency-us = <6562>;
min-residency-us = <9926>;
local-timer-stop;
};
};
};
psci {
compatible = "arm,psci-1.0";
method = "smc";
CPU_PD0: cpu0 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>;
};
CPU_PD1: cpu1 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>;
};
CPU_PD2: cpu2 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>;
};
CPU_PD3: cpu3 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&LITTLE_CPU_SLEEP_0 &LITTLE_CPU_SLEEP_1>;
};
CPU_PD4: cpu4 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>;
};
CPU_PD5: cpu5 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>;
};
CPU_PD6: cpu6 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>;
};
CPU_PD7: cpu7 {
#power-domain-cells = <0>;
power-domains = <&CLUSTER_PD>;
domain-idle-states = <&BIG_CPU_SLEEP_0 &BIG_CPU_SLEEP_1>;
};
CLUSTER_PD: cpu-cluster0 {
#power-domain-cells = <0>;
domain-idle-states = <&CLUSTER_SLEEP_0>;
};
};
10.6.6.4. Comparisons on Qualcomm SC7280
10.6.6.4.1. CPUIdle states
8 CPUs, 1 L3 cache
Platform-coordinated mode
CPUIdle states
State0 - WFI
State1 - Core collapse
State2 - Rail collapse
State3 - L3 cache off and system resources voted off
OS-initiated mode
CPUIdle states
State0 - WFI
State1 - Core collapse
State2 - Rail collapse
Cluster domain idle state
State3 - L3 cache off and system resources voted off

10.6.6.4.2. Results
The following stats have been captured with fixed CPU frequencies from the use case of 10 seconds of device idle with the display turned on and Wi-Fi and modem turned off.
Count refers to the number of times a CPU or cluster entered power collapse.
Residency refers to the time in seconds a CPU or cluster stayed in power collapse.
The results are an average of 3 iterations of actual counts and residencies.

OS-initiated mode was able to scale better than platform-coordinated mode for multiple CPUs. The count and residency results for state3 (i.e. a cluster domain idle state) in OS-initiated mode for multiple CPUs were much closer to the results for a single CPU than in platform-coordinated mode.
Copyright (c) 2023, Arm Limited and Contributors. All rights reserved.