Five EmbedDev

An Embedded RISC-V Blog
The RISC-V Instruction Set Manual, Volume II: Privileged Architecture

5 Hypervisor Extension, Version 0.5

Warning! This draft specification is likely to change before being accepted as standard by the RISC-V Foundation.

This chapter describes the RISC-V hypervisor extension, which virtualizes the supervisor-level architecture to support the efficient hosting of guest operating systems atop a type-1 or type-2 hypervisor. The hypervisor extension changes supervisor mode into hypervisor-extended supervisor mode (HS-mode, or hypervisor mode for short), where a hypervisor or a hosting-capable operating system runs. The hypervisor extension also adds another stage of address translation, from guest physical addresses to supervisor physical addresses, to virtualize the memory and memory-mapped I/O subsystems for a guest operating system. HS-mode acts the same as S-mode, but with additional instructions and CSRs that control the new stage of address translation and support hosting a guest OS in virtual S-mode (VS-mode). Regular S-mode operating systems can execute without modification either in HS-mode or as VS-mode guests.

In HS-mode, an OS or hypervisor interacts with the machine through the same SBI as an OS normally does from S-mode. An HS-mode hypervisor is expected to implement the SBI for its VS-mode guest.

The hypervisor extension is enabled by setting bit 7 in the misa CSR, which corresponds to the letter H. When misa[7] is clear, the hart behaves as though this extension were not implemented, and attempts to use hypervisor CSRs or instructions raise an illegal instruction exception. Implementations that include the hypervisor extension are encouraged not to hardwire misa[7], so that the extension may be disabled.
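As a minimal sketch of the misa check described above, the following C helper tests bit 7 of a misa value that has already been read. The function name is hypothetical, and reading the CSR itself (which requires M-mode) is left out so that the example runs on any host.

```c
#include <stdbool.h>
#include <stdint.h>

/* misa extension bits are indexed by letter: A is bit 0, so H is bit 7. */
#define MISA_H_BIT ('H' - 'A')

/* Hypothetical helper: true when the given misa value has the H bit set. */
bool misa_has_hypervisor(uint64_t misa)
{
    return ((misa >> MISA_H_BIT) & 1u) != 0;
}
```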

This draft is based on earlier proposals by John Hauser and Paolo Bonzini.

The baseline privileged architecture is designed to simplify the use of classic virtualization techniques, where a guest OS is run at user-level, as the few privileged instructions can be easily detected and trapped. The hypervisor extension improves virtualization performance by reducing the frequency of these traps.

The hypervisor extension has been designed to be efficiently emulable on platforms that do not implement the extension, by running the hypervisor in S-mode and trapping into M-mode for hypervisor CSR accesses and to maintain shadow page tables. The majority of CSR accesses for type-2 hypervisors are valid S-mode accesses so need not be trapped. Hypervisors can support nested virtualization analogously.

5.1 Privilege Modes

The current virtualization mode, denoted V, indicates whether the hart is currently executing in a guest. When V=1, the hart is either in virtual S-mode (VS-mode), or in virtual U-mode (VU-mode) atop a guest OS running in VS-mode. When V=0, the hart is either in M-mode, in HS-mode, or in U-mode atop an OS running in HS-mode. The virtualization mode also indicates whether two-stage address translation is active (V=1) or inactive (V=0). Table 1.1 lists the possible operating modes of a RISC-V hart with the hypervisor extension.

Operating modes with the hypervisor extension.
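The mode combinations described above can be sketched as a small C model. The enum and function names are hypothetical; the numeric privilege encodings (U=0, S=1, M=3) are the ones used elsewhere in the privileged architecture, and the combination of M-mode with V=1 is treated here as invalid because the text lists M-mode only under V=0.

```c
/* Nominal privilege levels (standard encodings U=0, S=1, M=3). */
enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Operating modes of a hart with the hypervisor extension. */
enum op_mode { MODE_U, MODE_HS, MODE_M, MODE_VU, MODE_VS, MODE_INVALID };

/* Map (V, nominal privilege) to the operating mode named in the text. */
enum op_mode operating_mode(unsigned v, enum priv p)
{
    switch (p) {
    case PRIV_U: return v ? MODE_VU : MODE_U;
    case PRIV_S: return v ? MODE_VS : MODE_HS;
    case PRIV_M: return v ? MODE_INVALID : MODE_M;
    }
    return MODE_INVALID;
}

/* Two-stage address translation is active exactly when V=1. */
int two_stage_translation_active(unsigned v)
{
    return v != 0;
}
```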

5.2 Hypervisor and Virtual Supervisor CSRs

An OS or hypervisor running in HS-mode uses the supervisor CSRs to interact with the exception, interrupt, and address-translation subsystems. Additional CSRs are provided to HS-mode, but not to VS-mode, to manage two-stage address translation and to control the behavior of a VS-mode guest: hstatus, hedeleg, hideleg, hip, hie, hgeip, hgeie, hcounteren, htimedelta, htimedeltah, htval, htinst, and hgatp.

Furthermore, several virtual supervisor CSRs (VS CSRs) are replicas of the normal supervisor CSRs. For example, vsstatus is the VS CSR that duplicates the usual sstatus CSR.

When V=1, the VS CSRs substitute for the corresponding supervisor CSRs, taking over all functions of the usual supervisor CSRs except as specified otherwise. Instructions that normally read or modify a supervisor CSR shall instead access the corresponding VS CSR. In VS-mode, an attempt to read or write a VS CSR directly by its own separate CSR address causes an illegal instruction exception. The VS CSRs can be directly accessed only from M-mode or HS-mode.

While V=1, the normal HS-level supervisor CSRs that are replaced by VS CSRs retain their values but do not affect the behavior of the machine unless specifically documented to do so. Conversely, when V=0, the VS CSRs do not ordinarily affect the behavior of the machine other than being readable and writable by CSR instructions.
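The substitution rule can be illustrated with a rough C model of how an emulator might resolve a CSR address. This is only a sketch: the function names are hypothetical, the CSR addresses are assumptions taken from the standard CSR address map (each VS CSR sits 0x100 above its supervisor counterpart), and all other privilege checks are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define CSR_ILLEGAL (-1)

/* Supervisor CSRs that have a VS replica (addresses are assumptions):
 * sstatus, sie, stvec, sscratch, sepc, scause, stval, sip, satp. */
bool has_vs_replica(uint16_t csr)
{
    switch (csr) {
    case 0x100: case 0x104: case 0x105:
    case 0x140: case 0x141: case 0x142: case 0x143: case 0x144:
    case 0x180:
        return true;
    default:
        return false;
    }
}

/* Resolve the CSR actually accessed for a given virtualization mode V.
 * Returns CSR_ILLEGAL when a VS CSR is addressed directly while V=1. */
int resolve_csr(uint16_t csr, unsigned v)
{
    if (!v)
        return csr;                 /* V=0: no substitution */
    if (csr >= 0x200 && has_vs_replica((uint16_t)(csr - 0x100)))
        return CSR_ILLEGAL;         /* direct VS CSR access from a guest */
    if (has_vs_replica(csr))
        return csr + 0x100;         /* supervisor CSR -> VS CSR */
    return csr;                     /* e.g. scounteren has no replica */
}
```

For example, a guest access to sstatus resolves to vsstatus, while scounteren (assumed address 0x106) is left unchanged because it has no matching VS CSR.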

A few standard supervisor CSRs (scounteren and, if the N extension is implemented, sedeleg and sideleg) have no matching VS CSR. These supervisor CSRs continue to have their usual function and accessibility even when V=1, except with VS-mode and VU-mode substituting for HS-mode and U-mode. Hypervisor software is expected to manually swap the contents of these registers as needed.

Matching VS CSRs exist only for the supervisor CSRs that must be duplicated, which are mainly those that get automatically written by traps or that impact instruction execution immediately after trap entry and/or right before SRET, when software alone is unable to swap a CSR at exactly the right moment. Currently, most supervisor CSRs fall into this category, but future ones might not.

In this chapter, we use the term HSXLEN to refer to the effective XLEN when executing in HS-mode, and VSXLEN to refer to the effective XLEN when executing in VS-mode.

5.2.1 Hypervisor Status Register (hstatus)

The hstatus register is an HSXLEN-bit read/write register formatted as shown in Figure 1.2 when HSXLEN=32 and Figure 1.3 when HSXLEN=64. The hstatus register provides facilities analogous to the mstatus register for tracking and controlling the exception behavior of a VS-mode guest.

Hypervisor status register (hstatus) for RV32.
Hypervisor status register (hstatus) for RV64.

The VSXL field controls the effective XLEN for VS-mode (known as VSXLEN), which may differ from the XLEN for HS-mode (HSXLEN). When HSXLEN=32, the VSXL field does not exist, and VSXLEN=32. When HSXLEN=64, VSXL is a WARL field that is encoded the same as the MXL field of misa, shown in Table [misabase] on page . In particular, the implementation may hardwire VSXL so that VSXLEN=HSXLEN.

If HSXLEN is changed from 32 to a wider width, and if field VSXL is not hardwired to a forced value, it gets the value corresponding to the widest supported width not wider than the new HSXLEN.

The hstatus fields VTSR and VTVM are defined analogously to the mstatus fields TSR and TVM, but affect the trapping behavior of the SRET and virtual-memory management instructions only when V=1.

The VGEIN (Virtual Guest External Interrupt Number) field selects a guest external interrupt source for VS-level external interrupts. VGEIN is a WLRL field that must be able to hold values between zero and the maximum guest external interrupt number (known as GEILEN), inclusive. When VGEIN=0, no guest external interrupt source is selected for VS-level external interrupts. GEILEN may be zero, in which case VGEIN may be hardwired to zero. Guest external interrupts are explained in Section 1.2.4, and the use of VGEIN is covered further in Section 1.2.3.

The SPV bit (Supervisor Previous Virtualization Mode) is written by the implementation whenever a trap is taken into HS-mode. Just as the SPP bit in sstatus is set to the privilege mode at the time of the trap, the SPV bit in hstatus is set to the value of the virtualization mode V at the time of the trap. When an SRET instruction is executed when V=0, V is set to SPV.

When a trap is taken into HS-mode, bits SP2V and SP2P are set to the values that SPV and the HS-level SPP had before the trap. When an SRET instruction is executed when V=0, the reverse assignments occur: after SPV and sstatus.SPP have supplied the new virtualization and privilege modes, they are written with SP2V and SP2P, respectively.
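The two-level stacking of SPV/SPP and SP2V/SP2P can be sketched as follows. The struct and function names are hypothetical, only the fields named above are modeled (interrupt-enable stacking is omitted), and the values left in SP2V and SP2P after SRET are simply kept, since the text does not state them.

```c
/* Minimal hart model: current mode plus the stacked previous-mode fields. */
struct hart_state {
    unsigned v;     /* current virtualization mode */
    unsigned priv;  /* current privilege: 0 = U, 1 = S */
    unsigned spv, spp;    /* hstatus.SPV and HS-level sstatus.SPP */
    unsigned sp2v, sp2p;  /* hstatus.SP2V and hstatus.SP2P */
};

/* Trap into HS-mode: push SPV/SPP down, then record the interrupted mode. */
void trap_to_hs(struct hart_state *h)
{
    h->sp2v = h->spv;
    h->sp2p = h->spp;
    h->spv = h->v;
    h->spp = h->priv;
    h->v = 0;
    h->priv = 1;
}

/* SRET executed with V=0: restore the mode, then pop SP2V/SP2P back up. */
void sret_from_hs(struct hart_state *h)
{
    h->v = h->spv;
    h->priv = h->spp;
    h->spv = h->sp2v;
    h->spp = h->sp2p;
}
```

A trap taken from VU-mode followed by SRET returns the hart to VU-mode with SPV and SPP restored to their earlier values.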

The VSBE bit is a WARL field that controls the endianness of explicit memory accesses made from VS-mode. If VSBE=0, explicit load and store memory accesses made from VS-mode are little-endian, and if VSBE=1, they are big-endian. VSBE also controls the endianness of all implicit accesses to VS-level memory management data structures, such as page tables. An implementation may hardwire VSBE to specify always the same endianness as for HS-mode.

The SPRV bit modifies the privilege of explicit memory accesses made in HS-mode. When SPRV=0, translation and protection behave as normal. When SPRV=1, explicit memory accesses in HS-mode are translated and protected, and endianness is applied, as though the current virtualization mode were set to hstatus.SPV and the current privilege mode were set to the HS-level SPP. The following table enumerates the cases.

SPRV SPV SPP Effect
0 - - Normal access; current privilege and virtualization modes apply.
1 0 0 U-level access with HS-level translation and protection only.
1 0 1 HS-level access with HS-level translation and protection only.
1 1 0 VU-level access with two-stage translation and protection. The HS-level MXR bit makes any executable page readable. vsstatus.MXR makes readable those pages marked executable at the VS translation stage, but only if readable at the guest-physical translation stage.
1 1 1 VS-level access with two-stage translation and protection. The HS-level MXR bit makes any executable page readable. vsstatus.MXR makes readable those pages marked executable at the VS translation stage, but only if readable at the guest-physical translation stage. vsstatus.SUM applies instead of the HS-level SUM bit.
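The cases above reduce to a simple selection of the effective mode for explicit accesses made in HS-mode. The sketch below uses hypothetical names and models only the mode selection, not the MXR and SUM details.

```c
/* Effective mode used for translation and protection of an explicit access. */
struct access_mode {
    unsigned v;     /* effective virtualization mode */
    unsigned priv;  /* effective privilege: 0 = U-level, 1 = S-level */
};

/* Effective mode for an explicit memory access made while in HS-mode. */
struct access_mode hs_effective_access_mode(unsigned sprv, unsigned spv,
                                            unsigned spp)
{
    struct access_mode m;
    if (!sprv) {
        m.v = 0;        /* normal access: HS-mode itself */
        m.priv = 1;
    } else {
        m.v = spv;      /* as though V were hstatus.SPV */
        m.priv = spp;   /* as though privilege were the HS-level SPP */
    }
    return m;
}
```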

An MRET or SRET instruction that changes the operating mode to U-mode, VS-mode, or VU-mode also sets SPRV=0.

5.2.2 Hypervisor Trap Delegation Registers (hedeleg and hideleg)

Registers hedeleg and hideleg are HSXLEN-bit read/write registers, formatted as shown in Figures 1.4 and 1.5 respectively. By default, all traps at any privilege level are handled in M-mode, though M-mode usually uses the medeleg and mideleg CSRs to delegate some traps to HS-mode. The hedeleg and hideleg CSRs allow these traps to be further delegated to a VS-mode guest; their layout is the same as medeleg and mideleg.

Hypervisor exception delegation register (hedeleg).
Hypervisor interrupt delegation register (hideleg).

Bit Attribute Corresponding Exception
0 (See text) Instruction address misaligned
1 Writable Instruction access fault
2 Writable Illegal instruction
3 Writable Breakpoint
4 Writable Load address misaligned
5 Writable Load access fault
6 Writable Store/AMO address misaligned
7 Writable Store/AMO access fault
8 Writable Environment call from U-mode or VU-mode
9 Read-only 0 Environment call from HS-mode
11 Read-only 0 Environment call from M-mode
12 Writable Instruction page fault
13 Writable Load page fault
15 Writable Store/AMO page fault
20 Read-only 0 Instruction guest-page fault
21 Read-only 0 Load guest-page fault
23 Read-only 0 Store/AMO guest-page fault

A synchronous trap that has been delegated to HS-mode (using medeleg) is further delegated to VS-mode if V=1 before the trap and the corresponding hedeleg bit is set. Each bit of hedeleg shall be either writable or hardwired to zero. Many bits of hedeleg are required specifically to be writable or zero, as enumerated in the table above. Bit 0, corresponding to instruction address misaligned exceptions, must be writable if IALIGN=32.
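The delegation chain for synchronous traps can be sketched as a decision function. The names are hypothetical, and the sketch assumes the trap is raised from a mode below M-mode, so that delegation is permitted at all.

```c
#include <assert.h>
#include <stdint.h>

enum trap_target { TRAP_TO_M, TRAP_TO_HS, TRAP_TO_VS };

/* Decide where a synchronous exception with the given cause is handled.
 * v is the virtualization mode before the trap. */
enum trap_target sync_trap_target(unsigned cause, unsigned v,
                                  uint64_t medeleg, uint64_t hedeleg)
{
    assert(cause < 64);
    if (((medeleg >> cause) & 1u) == 0)
        return TRAP_TO_M;       /* not delegated: handled in M-mode */
    if (v && ((hedeleg >> cause) & 1u))
        return TRAP_TO_VS;      /* further delegated to the guest */
    return TRAP_TO_HS;
}
```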

Requiring that certain bits of hedeleg be writable reduces some of the burden on a hypervisor to handle variations of implementation.

An interrupt that has been delegated to HS-mode (using mideleg) is further delegated to VS-mode if the corresponding hideleg bit is set. Among bits 15:0 of hideleg, only bits 10, 6, and 2 (corresponding to the standard VS-level interrupts) shall be writable, and the others shall be hardwired to zero.

When a virtual supervisor external interrupt (code 10) is delegated to VS-mode, it is automatically translated by the machine into a supervisor external interrupt (code 9) for VS-mode, including the value written to vscause on an interrupt trap. Likewise, a virtual supervisor timer interrupt (6) is translated into a supervisor timer interrupt (5) for VS-mode, and a virtual supervisor software interrupt (2) is translated into a supervisor software interrupt (1) for VS-mode. Similar translations may or may not be done for platform or custom interrupt causes (codes 16 and above).
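The cause translation for the three standard VS-level interrupts is small enough to write out directly. The function name is hypothetical, and leaving platform or custom codes unchanged is only one of the behaviors the text permits.

```c
/* Translate an interrupt code delegated to VS-mode into the code the guest
 * observes in vscause. */
unsigned vs_interrupt_code(unsigned hs_code)
{
    switch (hs_code) {
    case 10: return 9;  /* VS external -> supervisor external */
    case 6:  return 5;  /* VS timer    -> supervisor timer */
    case 2:  return 1;  /* VS software -> supervisor software */
    default: return hs_code;  /* assumption: other codes pass through */
    }
}
```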

5.2.3 Hypervisor Interrupt Registers (hip and hie)

Registers hip and hie are HSXLEN-bit read/write registers that supplement HS-level’s sip and sie respectively. The hip register indicates pending VS-level and hypervisor-specific interrupts, while hie contains enable bits for the same interrupts. Like sip and sie, interrupt cause number i corresponds with bit i in both hip and hie.

Hypervisor interrupt-pending register (hip).
Hypervisor interrupt-enable register (hie).

For each writable bit in sie, the corresponding bit shall be hardwired to zero in both hip and hie. Hence, the nonzero bits in sie and hie are always mutually exclusive, and likewise for sip and hip.

The active bits of hip and hie cannot be placed in HS-level’s sip and sie because doing so would make it impossible for software to emulate the hypervisor extension on platforms that do not implement it in hardware.

If bit i of sie is hardwired to zero, the same bit in register hip may be writable or may be read-only. When bit i in hip is writable, a pending interrupt i can be cleared by writing 0 to this bit. If interrupt i can become pending in hip but bit i in hip is read-only, the implementation must provide some other mechanism for clearing the pending interrupt (which may involve a call to the execution environment).

A bit in hie shall be writable if the corresponding interrupt can ever become pending in hip. Bits of hie that are not writable shall be hardwired to zero.

The standard portions (bits 15:0) of registers hip and hie are formatted as shown in Figures 1.8 and 1.9 respectively.

Standard portion (bits 15:0) of hip.
Standard portion (bits 15:0) of hie.

Bits hip.SGEIP and hie.SGEIE are the interrupt-pending and interrupt-enable bits for guest external interrupts at supervisor level (HS-level). SGEIP is read-only in hip, and is 1 if and only if the bitwise logical-AND of CSRs hgeip and hgeie is nonzero in any bit. (See Section 1.2.4.)

Bits hip.VSEIP and hie.VSEIE are the interrupt-pending and interrupt-enable bits for VS-level external interrupts. VSEIP is writable in hip, and may be written by a hypervisor to indicate to VS-mode that an external interrupt is pending. Additionally, VS-level external interrupts may come from other sources, such as a platform-level interrupt controller. VS-level external interrupts are made pending based on the logical-OR of:

the software-writable VSEIP bit;

the bit of hgeip selected by hstatus.VGEIN; and

any other platform-specific external interrupt signal directed to VS-level.

When hip is read with a CSR instruction, the value of the VSEIP bit returned in the rd destination register is the logical-OR of all the sources listed above. However, the value used in the read-modify-write sequence of a CSRRS or CSRRC instruction contains only the software-writable VSEIP bit, ignoring other interrupt sources.
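A sketch of how the SGEIP bit and the value of VSEIP returned by a CSR read might be computed from their inputs. The function and parameter names are hypothetical; `platform_signal` stands in for any other platform-specific source directed to VS-level.

```c
#include <assert.h>
#include <stdint.h>

/* hip.SGEIP: 1 if and only if hgeip AND hgeie is nonzero in any bit. */
unsigned sgeip_pending(uint64_t hgeip, uint64_t hgeie)
{
    return (hgeip & hgeie) != 0;
}

/* Value of hip.VSEIP as returned by a CSR read: the OR of the
 * software-writable bit, the hgeip bit selected by hstatus.VGEIN,
 * and any other platform signal. */
unsigned vseip_read_value(unsigned sw_vseip, uint64_t hgeip,
                          unsigned vgein, unsigned platform_signal)
{
    assert(vgein < 64);
    /* VGEIN=0 selects no guest external interrupt source. */
    unsigned selected = vgein ? (unsigned)((hgeip >> vgein) & 1u) : 0u;
    return (sw_vseip | selected | platform_signal) & 1u;
}
```

Note that a CSRRS or CSRRC read-modify-write would operate on `sw_vseip` alone, not on the combined value.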

The VSEIP field behavior is designed to allow a hypervisor to mimic external interrupts cleanly for a guest virtual machine, without losing any real external interrupts that may be directed to VS-level. The behavior of the CSR instructions is slightly modified from regular CSR accesses as a result. This modified CSR behavior for VSEIP is the same as given to mip.SEIP for like reason.

Bits hip.VSTIP and hie.VSTIE are the interrupt-pending and interrupt-enable bits for VS-level timer interrupts. VSTIP is writable in hip, and may be written by a hypervisor to deliver timer interrupts to VS-mode.

Bits hip.VSSIP and hie.VSSIE are the interrupt-pending and interrupt-enable bits for VS-level software interrupts. VSSIP is writable in hip.

Multiple simultaneous interrupts destined for HS-mode are handled in the following decreasing priority order: SEI, SSI, STI, SGEI, VSEI, VSSI, VSTI.

5.2.4 Hypervisor Guest External Interrupt Registers (hgeip and hgeie)

The hgeip register is an HSXLEN-bit read-only register, formatted as shown in Figure 1.10, that indicates pending guest external interrupts for this hart. The hgeie register is an HSXLEN-bit read/write register, formatted as shown in Figure 1.11, that contains enable bits for the guest external interrupts at this hart. Guest external interrupt number i corresponds with bit i in both hgeip and hgeie.

Hypervisor guest external interrupt-pending register (hgeip).
Hypervisor guest external interrupt-enable register (hgeie).

Guest external interrupts represent interrupts directed to individual virtual machines at VS-level. If a RISC-V platform supports placing a physical device under the direct control of a guest OS with minimal hypervisor intervention (known as pass-through or direct assignment between a virtual machine and the physical device), then, in such circumstance, interrupts from the device are intended for a specific virtual machine. Each bit of hgeip summarizes all pending interrupts directed to one virtual hart, as collected and reported by an interrupt controller. To distinguish specific pending interrupts from multiple devices, software must query the interrupt controller.

Support for guest external interrupts requires an interrupt controller that can collect virtual-machine-directed interrupts separately from other interrupts.

The number of bits implemented in hgeip and hgeie for guest external interrupts is unspecified and may be zero. This number is known as GEILEN. The least-significant bits are implemented first, apart from bit 0. Hence, if GEILEN is nonzero, bits GEILEN:1 shall be writable in hgeie, and all other bit positions shall be hardwired to zeros in both hgeip and hgeie.
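The writable bit range GEILEN:1 can be expressed as a mask. The helper below is a hypothetical sketch for a 64-bit register; bit 0 is always excluded.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of writable hgeie bits for a given GEILEN (bits GEILEN:1). */
uint64_t hgeie_writable_mask(unsigned geilen)
{
    assert(geilen <= 63);
    if (geilen == 0)
        return 0;
    /* Avoid shifting by 64, which is undefined in C. */
    uint64_t low = (geilen == 63) ? ~0ull : ((1ull << (geilen + 1)) - 1);
    return low & ~1ull;  /* bit 0 is never implemented */
}
```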

The set of guest external interrupts received and handled at one physical hart may differ from those received at other harts. Guest external interrupt number i at one physical hart is typically expected not to be the same as guest external interrupt i at any other hart. For any one physical hart, the maximum number of virtual harts that may directly receive guest external interrupts is limited by GEILEN. The maximum this number can be for any implementation is 31 for RV32 and 63 for RV64, per physical hart.

A hypervisor is always free to emulate devices for any number of virtual harts without being limited by GEILEN. Only direct pass-through (direct assignment) of interrupts is affected by the GEILEN limit, and the limit is on the number of virtual harts receiving such interrupts, not the number of distinct interrupts received. The number of distinct interrupts a single virtual hart may receive is determined by the interrupt controller.

Register hgeie selects the subset of guest external interrupts that cause a supervisor-level (HS-level) guest external interrupt. The enable bits in hgeie do not affect the VS-level external interrupt signal selected from hgeip by hstatus.VGEIN.

5.2.5 Hypervisor Counter-Enable Register (hcounteren)

The counter-enable register hcounteren is a 32-bit register that controls the availability of the hardware performance monitoring counters to the guest virtual machine.

Hypervisor counter-enable register (hcounteren).

When the CY, TM, IR, or HPMn bit in the hcounteren register is clear, attempts to read the cycle, time, instret, or hpmcountern register while V=1 will cause an illegal instruction exception. When one of these bits is set, access to the corresponding register is permitted when V=1, unless prevented for some other reason. In VU-mode, a counter is not readable unless the applicable bits are set in both hcounteren and scounteren.
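The access rule can be sketched as a predicate over the two counter-enable registers. The function name is hypothetical, the bit positions assume the usual counter-enable layout (CY=0, TM=1, IR=2, HPMn=n), and any other reason for denying access, which the text allows for, is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if counter idx (0=cycle, 1=time, 2=instret, 3..31=hpmcounter) may be
 * read while V=1, considering only hcounteren and scounteren. */
bool counter_readable_when_v1(unsigned idx, bool vu_mode,
                              uint32_t hcounteren, uint32_t scounteren)
{
    assert(idx < 32);
    bool h_ok = ((hcounteren >> idx) & 1u) != 0;
    bool s_ok = ((scounteren >> idx) & 1u) != 0;
    /* VS-mode needs hcounteren; VU-mode needs both registers. */
    return vu_mode ? (h_ok && s_ok) : h_ok;
}
```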

hcounteren must be implemented. However, any of the bits may contain a hardwired value of zero, indicating reads to the corresponding counter will cause an exception when V=1. Hence, they are effectively WARL fields.

5.2.6 Hypervisor Time Delta Registers (htimedelta, htimedeltah)

The htimedelta CSR is a read/write register that contains the delta between the value of the time CSR and the value returned in VS-mode or VU-mode. That is, reading the time CSR in VS or VU mode returns the sum of the contents of htimedelta and the actual value of time.

Because overflow is ignored when summing htimedelta and time, large values of htimedelta may be used to represent negative time offsets.

Hypervisor time delta register, HSXLEN=64.

For HSXLEN=32 only, htimedelta holds the lower 32 bits of the delta, and htimedeltah holds the upper 32 bits of the delta.
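The time offset arithmetic can be sketched with unsigned 64-bit addition, which wraps on overflow in C just as the text requires. The function names are hypothetical.

```c
#include <stdint.h>

/* Value of the time CSR as seen in VS-mode or VU-mode.
 * Unsigned addition wraps modulo 2^64, so overflow is ignored. */
uint64_t guest_time(uint64_t time, uint64_t htimedelta)
{
    return time + htimedelta;
}

/* For HSXLEN=32: combine htimedelta (low half) and htimedeltah (high half). */
uint64_t delta_from_halves(uint32_t htimedelta_lo, uint32_t htimedeltah)
{
    return ((uint64_t)htimedeltah << 32) | htimedelta_lo;
}
```

For instance, a delta of 0xFFFFFFFFFFFFFF9C acts as an offset of -100, so a host time of 1000 is seen by the guest as 900.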

Hypervisor time delta registers, HSXLEN=32.

5.2.7 Hypervisor Trap Value Register (htval)

The htval register is an HSXLEN-bit read/write register formatted as shown in Figure 1.15. When a trap is taken into HS-mode, htval is written with additional exception-specific information, alongside stval, to assist software in handling the trap.

Hypervisor trap value register (htval).

When a guest-page-fault trap is taken into HS-mode, htval is written with either zero or the guest physical address that faulted, shifted right by 2 bits. For other traps, htval is set to zero, but a future standard or extension may redefine htval’s setting for other traps.

For misaligned loads and stores that cause guest-page faults, htval will contain zero or the guest physical address of the portion of the access that caused the fault. For instruction guest-page faults on systems with variable-length instructions, htval will contain zero or the guest physical address of the portion of the instruction that caused the fault.

The least-significant two bits of a faulting guest physical address are not available in htval. If needed, these bits are ordinarily the same as the least-significant two bits of the faulting virtual address in stval. For faults due to implicit memory accesses for VS-level address translation, the least-significant two bits are instead zeros. These cases can be distinguished using the value provided in register htinst.
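A hypervisor's fault handler might reconstruct the full guest physical address along these lines. This is a sketch with hypothetical names: it assumes htval was written with a nonzero shifted address, and it takes the implicit-access case as a flag that the caller has already derived from htinst.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rebuild the faulting guest physical address from htval and stval.
 * implicit_vs_walk: the fault came from an implicit access made for
 * VS-level address translation, so the low two bits are zeros. */
uint64_t faulting_gpa(uint64_t htval, uint64_t stval, bool implicit_vs_walk)
{
    uint64_t low = implicit_vs_walk ? 0 : (stval & 3u);
    return (htval << 2) | low;
}
```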

htval is a WARL register that must be able to hold zero and may be capable of holding only an arbitrary subset of other 2-bit-shifted guest physical addresses, if any.

Unless it has reason to assume otherwise (such as a platform standard), software that writes a value to htval should read back from htval to confirm the stored value.

5.2.8 Hypervisor Trap Instruction Register (htinst)

The htinst register is an HSXLEN-bit read/write register formatted as shown in Figure 1.16. When a trap is taken into HS-mode, htinst is written with a value that, if nonzero, provides information about the instruction that trapped, to assist software in handling the trap. The values that may be written to htinst on a trap are documented in Section 1.7.3.

Hypervisor trap instruction register (htinst).

htinst is a WARL register that need only be able to hold the values that the implementation may automatically write to it on a trap.

5.2.9 Hypervisor Guest Address Translation and Protection Register (hgatp)

The hgatp register is an HSXLEN-bit read/write register, formatted as shown in Figure 1.17 for HSXLEN=32 and Figure 1.18 for HSXLEN=64, which controls guest physical address translation and protection. Similar to CSR satp, this register holds the physical page number (PPN) of the guest-physical root page table; a virtual machine identifier (VMID), which facilitates address-translation fences on a per-virtual-machine basis; and the MODE field, which selects the address-translation scheme for guest physical addresses. When mstatus.TVM=1, attempts to read or write hgatp while executing in HS-mode will raise an illegal instruction exception.

RV32 Hypervisor guest address translation and protection register hgatp.
RV64 Hypervisor guest address translation and protection register hgatp, for MODE values Bare, Sv39x4, and Sv48x4.

Table 1.19 shows the encodings of the MODE field for RV32 and RV64. When MODE=Bare, guest physical addresses are equal to supervisor physical addresses, and there is no further memory protection for a guest virtual machine beyond the physical memory protection scheme described in Section [sec:pmp]. In this case, the remaining fields in hgatp must be set to zeros.

For RV32, the only other valid setting for MODE is Sv32x4, which is a modification of the usual Sv32 paged virtual-memory scheme, extended to support 34-bit guest physical addresses. For RV64, modes Sv39x4 and Sv48x4 are defined as modifications of the Sv39 and Sv48 paged virtual-memory schemes. All of these paged virtual-memory schemes are described in Section 1.5.1. An additional RV64 scheme, Sv57x4, may be defined in a later version of this specification.

The remaining MODE settings for RV64 are reserved for future use and may define different interpretations of the other fields in hgatp.

Encoding of hgatp MODE field.

RV64 implementations are not required to support all defined RV64 MODE settings.

A write to hgatp with an unsupported MODE value is not ignored as it is for satp. Instead, the fields of hgatp are WARL in the normal way, when so indicated.

As explained in Section 1.5.1, for the paged virtual-memory schemes (Sv32x4, Sv39x4, and Sv48x4), the root page table is 16 KiB and must be aligned to a 16-KiB boundary. In these modes, the lowest two bits of the physical page number (PPN) in hgatp always read as zeros. An implementation that supports only the defined paged virtual-memory schemes and/or Bare may hardwire PPN[1:0] to zero.
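The alignment requirement can be checked before a root page table address is placed in hgatp. The helper below is a hypothetical sketch that assumes 4-KiB base pages, so that the PPN is the physical address shifted right by 12 bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT       12u       /* assumption: 4-KiB base pages */
#define ROOT_TABLE_ALIGN 16384u    /* 16 KiB */

/* Compute the hgatp PPN for a guest-physical root page table.
 * Returns false (leaving *ppn untouched) if the table is misaligned. */
bool hgatp_root_ppn(uint64_t root_pa, uint64_t *ppn)
{
    if (root_pa % ROOT_TABLE_ALIGN != 0)
        return false;
    *ppn = root_pa >> PAGE_SHIFT;  /* low two PPN bits are zero here */
    return true;
}
```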

The number of VMID bits is unspecified and may be zero. The number of implemented VMID bits, termed VMIDLEN, may be determined by writing one to every bit position in the VMID field, then reading back the value in hgatp to see which bit positions in the VMID field hold a one. The least-significant bits of VMID are implemented first: that is, if VMIDLEN > 0, VMID[VMIDLEN-1:0] is writable. The maximal value of VMIDLEN, termed VMIDMAX, is 7 for Sv32x4 or 14 for Sv39x4 and Sv48x4.
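The probing procedure ends with counting the ones that survive in the VMID field. The sketch below covers only that last step, with a hypothetical function name; the CSR write and read-back need HS-mode or M-mode access and are assumed to have been done by the caller, which passes in the extracted VMID field.

```c
#include <stdint.h>

/* Count the implemented VMID bits from the value read back after writing
 * all ones to the VMID field. Because the least-significant bits are
 * implemented first, the read-back value has VMIDLEN low bits set. */
unsigned vmidlen_from_readback(uint32_t vmid_readback)
{
    unsigned len = 0;
    while (vmid_readback & 1u) {
        len++;
        vmid_readback >>= 1;
    }
    return len;
}
```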

Note that writing hgatp does not imply any ordering constraints between page-table updates and subsequent guest physical address translations. If the new virtual machine’s guest physical page tables have been modified, it may be necessary to execute an HFENCE.GVMA instruction (see Section 1.3.1) before or after writing hgatp.

5.2.10 Virtual Supervisor Status Register (vsstatus)

The vsstatus register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register sstatus, formatted as shown in Figure 1.20 when VSXLEN=32 and Figure 1.21 when VSXLEN=64. When V=1, vsstatus substitutes for the usual sstatus, so instructions that normally read or modify sstatus actually access vsstatus instead.

Virtual supervisor status register (vsstatus) for RV32.
Virtual supervisor status register (vsstatus) for RV64.

The UXL field controls the effective XLEN for VU-mode, which may differ from the XLEN for VS-mode (VSXLEN). When VSXLEN=32, the UXL field does not exist, and VU-mode XLEN=32. When VSXLEN=64, UXL is a WARL field that is encoded the same as the MXL field of misa, shown in Table [misabase] on page . In particular, the implementation may hardwire field UXL so that VU-mode XLEN=VSXLEN.

If VSXLEN is changed from 32 to a wider width, and if field UXL is not hardwired to a forced value, it gets the value corresponding to the widest supported width not wider than the new VSXLEN.

When V=1, both vsstatus.FS and the HS-level sstatus.FS are in effect. Attempts to execute a floating-point instruction when either field is 0 (Off) raise an illegal-instruction exception. Modifying the floating-point state when V=1 causes both fields to be set to 3 (Dirty).
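The interaction of the two FS fields while V=1 can be sketched as follows. The names are hypothetical, and only the Off (0) and Dirty (3) encodings mentioned above are given symbolic names.

```c
#include <stdbool.h>

#define FS_OFF   0u
#define FS_DIRTY 3u

/* A floating-point instruction executed with V=1 is legal only if neither
 * vsstatus.FS nor the HS-level sstatus.FS is Off. */
bool fp_instruction_legal_v1(unsigned vsstatus_fs, unsigned hs_sstatus_fs)
{
    return vsstatus_fs != FS_OFF && hs_sstatus_fs != FS_OFF;
}

/* Modifying floating-point state with V=1 marks both fields Dirty. */
void mark_fp_state_modified_v1(unsigned *vsstatus_fs, unsigned *hs_sstatus_fs)
{
    *vsstatus_fs = FS_DIRTY;
    *hs_sstatus_fs = FS_DIRTY;
}
```

Because the HS-level field is tracked independently, a hypervisor can check it after running a guest to decide whether the floating-point state needs to be saved, regardless of what the guest wrote to vsstatus.FS.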

For a hypervisor to benefit from the extension context status, it must have its own copy in the HS-level sstatus, maintained independently of a guest OS running in VS-mode. While a version of the extension context status obviously must exist in vsstatus for VS-mode, a hypervisor cannot rely on this version being maintained correctly, given that VS-level software can change vsstatus.FS arbitrarily. If the HS-level sstatus.FS were not independently active and maintained by the hardware in parallel with vsstatus.FS while V=1, hypervisors would always be forced to conservatively swap all floating-point state when context-switching between virtual machines.

Read-only fields SD and XS summarize the extension context status as it is visible to VS-mode only. For example, the value of the HS-level sstatus.FS does not affect vsstatus.SD.

An implementation may hardwire field UBE to be always the same as hstatus.VSBE.

When V=0, vsstatus does not directly affect the behavior of the machine, unless the MPRV feature in the mstatus register or the SPRV feature in the hstatus register is used to execute a load or store as though V=1.

5.2.11 Virtual Supervisor Interrupt Registers (vsip and vsie)

The vsip and vsie registers are VSXLEN-bit read/write registers that are VS-mode’s versions of supervisor CSRs sip and sie, formatted as shown in Figures 1.22 and 1.23 respectively. When V=1, vsip and vsie substitute for the usual sip and sie, so instructions that normally read or modify sip/sie actually access vsip/vsie instead. However, interrupts directed to HS-level continue to be indicated in the HS-level sip register, not in vsip, when V=1.

Virtual supervisor interrupt-pending register (vsip).
Virtual supervisor interrupt-enable register (vsie).

The standard portions (bits 15:0) of registers vsip and vsie are formatted as shown in Figures 1.24 and 1.25 respectively.

Standard portion (bits 15:0) of vsip.
Standard portion (bits 15:0) of vsie.

When bit 10 of hideleg is zero, vsip.SEIP and vsie.SEIE are read-only zeros. Else, vsip.SEIP is a read-only alias of hip.VSEIP, and vsie.SEIE is an alias of hie.VSEIE.

When bit 6 of hideleg is zero, vsip.STIP and vsie.STIE are read-only zeros. Else, vsip.STIP is a read-only alias of hip.VSTIP, and vsie.STIE is an alias of hie.VSTIE.

When bit 2 of hideleg is zero, vsip.SSIP and vsie.SSIE are read-only zeros. Else, vsip.SSIP and vsie.SSIE are aliases (both writable) of hip.VSSIP and hie.VSSIE.
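The aliasing rules for the standard bits of vsip can be sketched as a function of hip and hideleg. The function name is hypothetical; bit positions follow from interrupt cause i corresponding to bit i (VSEIP=10, VSTIP=6, VSSIP=2 in hip; SEIP=9, STIP=5, SSIP=1 in vsip), and `hip` here is taken to be the value a CSR read of hip would return.

```c
#include <stdint.h>

/* Standard portion of vsip as seen by a guest read, derived from hip and
 * gated by the corresponding hideleg bits. */
uint64_t vsip_read_value(uint64_t hip, uint64_t hideleg)
{
    uint64_t vsip = 0;
    if ((hideleg >> 10) & 1u)
        vsip |= ((hip >> 10) & 1u) << 9;  /* SEIP aliases hip.VSEIP */
    if ((hideleg >> 6) & 1u)
        vsip |= ((hip >> 6) & 1u) << 5;   /* STIP aliases hip.VSTIP */
    if ((hideleg >> 2) & 1u)
        vsip |= ((hip >> 2) & 1u) << 1;   /* SSIP aliases hip.VSSIP */
    return vsip;
}
```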

5.2.12 Virtual Supervisor Trap Vector Base Address Register (vstvec)

The vstvec register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register stvec, formatted as shown in Figure 1.26. When V=1, vstvec substitutes for the usual stvec, so instructions that normally read or modify stvec actually access vstvec instead. When V=0, vstvec does not directly affect the behavior of the machine.

Virtual supervisor trap vector base address register (vstvec).

5.2.13 Virtual Supervisor Scratch Register (vsscratch)

The vsscratch register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register sscratch, formatted as shown in Figure 1.27. When V=1, vsscratch substitutes for the usual sscratch, so instructions that normally read or modify sscratch actually access vsscratch instead. The contents of vsscratch never directly affect the behavior of the machine.

Virtual supervisor scratch register (vsscratch).

5.2.14 Virtual Supervisor Exception Program Counter (vsepc)

The vsepc register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register sepc, formatted as shown in Figure 1.28. When V=1, vsepc substitutes for the usual sepc, so instructions that normally read or modify sepc actually access vsepc instead. When V=0, vsepc does not directly affect the behavior of the machine.

vsepc is a WARL register that must be able to hold the same set of values that sepc can hold.

Virtual supervisor exception program counter (vsepc).

5.2.15 Virtual Supervisor Cause Register (vscause)

The vscause register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register scause, formatted as shown in Figure 1.29. When V=1, vscause substitutes for the usual scause, so instructions that normally read or modify scause actually access vscause instead. When V=0, vscause does not directly affect the behavior of the machine.

vscause is a WLRL register that must be able to hold the same set of values that scause can hold.

Virtual supervisor cause register (vscause).

5.2.16 Virtual Supervisor Trap Value Register (vstval)

The vstval register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register stval, formatted as shown in Figure 1.30. When V=1, vstval substitutes for the usual stval, so instructions that normally read or modify stval actually access vstval instead. When V=0, vstval does not directly affect the behavior of the machine.

vstval is a WARL register that must be able to hold the same set of values that stval can hold.

Virtual supervisor trap value register (vstval).

5.2.17 Virtual Supervisor Address Translation and Protection Register (vsatp)

The vsatp register is a VSXLEN-bit read/write register that is VS-mode’s version of supervisor register satp, formatted as shown in Figure 1.31 for VSXLEN=32 and Figure 1.32 for VSXLEN=64. When V=1, vsatp substitutes for the usual satp, so instructions that normally read or modify satp actually access vsatp instead. vsatp controls VS-level address translation, the first stage of two-stage translation for guest virtual addresses (see Section 5.5).

When V=0, a write to vsatp with an unsupported MODE value is not ignored as it is for satp. Instead, the fields of vsatp are WARL in the normal way.

When V=0, vsatp does not directly affect the behavior of the machine, unless the MPRV feature in the mstatus register or the SPRV feature in the hstatus register is used to execute a load or store as though V=1.

RV32 virtual supervisor address translation and protection register vsatp.
RV64 virtual supervisor address translation and protection register vsatp, for MODE values Bare, Sv39, and Sv48.

5.3 Hypervisor Instructions

The hypervisor extension adds two privileged fence instructions.

5.3.1 Hypervisor Memory-Management Fence Instructions

The hypervisor memory-management fence instructions, HFENCE.GVMA and HFENCE.VVMA, are valid only in HS-mode when mstatus.TVM=0, or in M-mode (irrespective of mstatus.TVM). These instructions perform a function similar to SFENCE.VMA (Section [sec:sfence.vma]), except applying to the guest-physical memory-management data structures controlled by CSR hgatp (HFENCE.GVMA) or the VS-level memory-management data structures controlled by CSR vsatp (HFENCE.VVMA). Instruction SFENCE.VMA applies only to the memory-management data structures controlled by the current satp (either the HS-level satp when V=0 or vsatp when V=1).

If an HFENCE.VVMA instruction executes without trapping, its effect is much the same as temporarily entering VS-mode and executing SFENCE.VMA. Executing an HFENCE.VVMA guarantees that any previous stores already visible to the current hart are ordered before all subsequent implicit reads by that hart of the VS-level memory-management data structures, when those implicit reads are for instructions that

  • are subsequent to the HFENCE.VVMA, and

  • execute when hgatp.VMID has the same setting as it did when HFENCE.VVMA executed.

Implicit reads need not be ordered when hgatp.VMID is different than at the time HFENCE.VVMA executed. If operand rs1≠x0, it specifies a single guest virtual address, and if operand rs2≠x0, it specifies a single guest address-space identifier (ASID).

An HFENCE.VVMA instruction applies only to a single virtual machine, identified by the setting of hgatp.VMID when HFENCE.VVMA executes.

When rs2≠x0, bits XLEN-1:ASIDMAX of the value held in rs2 are reserved for future use and should be zeroed by software and ignored by current implementations. Furthermore, if ASIDLEN < ASIDMAX, the implementation shall ignore bits ASIDMAX-1:ASIDLEN of the value held in rs2.

Simpler implementations of HFENCE.VVMA can ignore the guest virtual address in rs1 and the guest ASID value in rs2, as well as hgatp.VMID, and always perform a global fence for the VS-level memory management of all virtual machines, or even a global fence for all memory-management data structures.

Executing an HFENCE.GVMA instruction guarantees that any previous stores already visible to the current hart are ordered before all subsequent implicit reads by that hart of guest-physical memory-management data structures done for instructions that follow the HFENCE.GVMA. If operand rs1≠x0, it specifies a single guest physical address, shifted right by 2 bits, and if operand rs2≠x0, it specifies a single virtual machine identifier (VMID).

For HFENCE.GVMA, a guest physical address specified in rs1 is shifted right by 2 bits to accommodate addresses wider than the current XLEN. For RV32, the hypervisor extension permits guest physical addresses as wide as 34 bits, and rs1 specifies bits 33:2 of such an address. This shift-by-2 encoding of guest physical addresses matches the encoding of physical addresses in PMP address registers (Section [sec:pmp]) and in page table entries (Sections [sec:sv32], [sec:sv39], and [sec:sv48]).
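As a rough illustration of the shift-by-2 encoding, the following Python sketch computes the value software would place in rs1. The helper name `hfence_gvma_rs1` and its range check are illustrative assumptions, not part of the specification.

```python
def hfence_gvma_rs1(guest_pa: int, xlen: int) -> int:
    """Encode a guest physical address as the rs1 operand of HFENCE.GVMA.

    Hypothetical helper: rs1 holds the address shifted right by 2 bits,
    so on RV32 it carries bits 33:2 of a 34-bit guest physical address.
    """
    value = guest_pa >> 2
    if value >> xlen:
        raise ValueError("guest physical address does not fit in rs1")
    return value


# A 34-bit guest physical address still fits in a 32-bit register.
rs1_value = hfence_gvma_rs1(0x3_FFFF_F000, xlen=32)
```

Note that the two low-order address bits are dropped by the encoding, so the operand cannot distinguish addresses that differ only in those bits.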

When rs2≠x0, bits XLEN-1:VMIDMAX of the value held in rs2 are reserved for future use and should be zeroed by software and ignored by current implementations. Furthermore, if VMIDLEN < VMIDMAX, the implementation shall ignore bits VMIDMAX-1:VMIDLEN of the value held in rs2.
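The masking rule for rs2 can be sketched as follows. The function name and the idea of passing VMIDLEN and VMIDMAX as plain integers are assumptions made for illustration; the same pattern would apply to ASIDLEN and ASIDMAX for HFENCE.VVMA.

```python
def effective_vmid(rs2_value: int, vmidlen: int, vmidmax: int) -> int:
    """Return the VMID bits an implementation would actually compare.

    Bits at or above VMIDMAX are reserved and ignored; bits between
    VMIDLEN and VMIDMAX-1 are ignored because they are not implemented.
    """
    significant_bits = min(vmidlen, vmidmax)
    return rs2_value & ((1 << significant_bits) - 1)
```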

Simpler implementations of HFENCE.GVMA can ignore the guest physical address in rs1 and the VMID value in rs2 and always perform a global fence for the guest-physical memory management of all virtual machines, or even a global fence for all memory-management data structures.

5.4 Machine-Level CSRs

The hypervisor extension augments or modifies machine CSRs mstatus, mstatush, mideleg, mip, and mie, and adds CSRs mtval2 and mtinst.

5.4.1 Machine Status Registers (mstatus and mstatush)

The hypervisor extension adds one field, MPV, to the machine-level mstatus or mstatush CSR, and modifies the behavior of several existing mstatus fields. Figure 1.33 shows the modified mstatus register when the hypervisor extension is implemented and MXLEN=64. When MXLEN=32, the hypervisor extension adds MPV not to mstatus but to mstatush, which must exist. Figure 1.34 shows the mstatush register when the hypervisor extension is implemented and MXLEN=32.

Machine status register (mstatus) for RV64 when the hypervisor extension is implemented.
Additional machine status register (mstatush) for RV32 when the hypervisor extension is implemented. The format of mstatus is unchanged for RV32.

The MPV bit (Machine Previous Virtualization Mode) is written by the implementation whenever a trap is taken into M-mode. Just as the MPP bit is set to the privilege mode at the time of the trap, the MPV bit is set to the value of the virtualization mode V at the time of the trap. When an MRET instruction is executed, the virtualization mode V is set to MPV, unless MPP=3, in which case V remains 0.

The TSR and TVM fields of mstatus affect execution only in HS-mode, not in VS-mode. The TW field affects execution in all modes except M-mode.

MPRV MPV MPP Effect
0 – – Normal access; current privilege and virtualization modes apply.
1 0 0 U-level access with HS-level translation and protection only.
1 0 1 HS-level access with HS-level translation and protection only.
1 – 3 M-level access with no translation.
1 1 0 VU-level access with two-stage translation and protection. The HS-level MXR bit makes any executable page readable. vsstatus.MXR makes readable those pages marked executable at the VS translation stage, but only if readable at the guest-physical translation stage.
1 1 1 VS-level access with two-stage translation and protection. The HS-level MXR bit makes any executable page readable. vsstatus.MXR makes readable those pages marked executable at the VS translation stage, but only if readable at the guest-physical translation stage. vsstatus.SUM applies instead of the HS-level SUM bit.

The hypervisor extension changes the behavior of the Modify Privilege field, MPRV, of mstatus. When MPRV=0, translation and protection behave as normal. When MPRV=1, explicit memory accesses are translated and protected, and endianness is applied, as though the current virtualization mode were set to MPV and the current privilege mode were set to MPP. The table above enumerates the cases.
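The MPRV cases can be modeled with a small lookup. This is only a sketch: the function name, the mode labels, and the representation of modes as (V, privilege) pairs are assumptions for illustration.

```python
# (virtualization mode V, privilege level) -> mode label
MODE_NAMES = {(0, 0): "U", (0, 1): "HS", (0, 3): "M", (1, 0): "VU", (1, 1): "VS"}


def effective_access_mode(mprv, mpv, mpp, current_v, current_priv):
    """Return the mode whose translation and protection rules apply
    to an explicit memory access, following the MPRV cases."""
    if mprv == 0:
        v, priv = current_v, current_priv   # normal access
    elif mpp == 3:
        v, priv = 0, 3                      # M-level, MPV is irrelevant
    else:
        v, priv = mpv, mpp                  # as though V=MPV, priv=MPP
    return MODE_NAMES[(v, priv)]
```

For example, M-mode software with MPRV=1, MPV=1, and MPP=1 would have its loads and stores treated as VS-level accesses with two-stage translation. The reserved MPP value 2 is deliberately not handled here.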

The mstatus register is a superset of the HS-level sstatus register but is not a superset of vsstatus.

5.4.2 Machine Interrupt Delegation Register (mideleg)

When the hypervisor extension is implemented, bits 10, 6, and 2 of mideleg (corresponding to the standard VS-level interrupts) are each hardwired to one. Furthermore, if any guest external interrupts are implemented (GEILEN is nonzero), bit 12 of mideleg (corresponding to supervisor-level guest external interrupts) is also hardwired to one. VS-level interrupts and guest external interrupts are always delegated past M-mode to HS-mode.

5.4.3 Machine Interrupt Registers (mip and mie)

The hypervisor extension gives registers mip and mie additional active bits for the hypervisor-added interrupts. Figures 1.35 and 1.36 show the standard portions (bits 15:0) of registers mip and mie when the hypervisor extension is implemented.

Standard portion (bits 15:0) of mip.
Standard portion (bits 15:0) of mie.

Bits SGEIP, VSEIP, VSTIP, and VSSIP in mip are aliases for the same bits in hypervisor CSR hip, while SGEIE, VSEIE, VSTIE, and VSSIE in mie are aliases for the same bits in hie.

Instructions CSRRS and CSRRC have the same modified behavior for bit VSEIP in mip as they do for VSEIP in hip, as described in Section 5.2.3. There is only one software-writable VSEIP bit, accessible through either mip or hip.

5.4.4 Machine Second Trap Value Register (mtval2)

The mtval2 register is an MXLEN-bit read/write register formatted as shown in Figure 1.37. When a trap is taken into M-mode, mtval2 is written with additional exception-specific information, alongside mtval, to assist software in handling the trap.

Machine second trap value register (mtval2).

When a guest-page-fault trap is taken into M-mode, mtval2 is written with either zero or the guest physical address that faulted, shifted right by 2 bits. For other traps, mtval2 is set to zero, but a future standard or extension may redefine mtval2’s setting for other traps.

For misaligned loads and stores that cause guest-page faults, mtval2 will contain zero or the guest physical address of the portion of the access that caused the fault. For instruction guest-page faults on systems with variable-length instructions, mtval2 will contain zero or the guest physical address of the portion of the instruction that caused the fault.

mtval2 is a WARL register that must be able to hold zero and may be capable of holding only an arbitrary subset of other 2-bit-shifted guest physical addresses, if any.
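A trap handler might recover the reported address as sketched below. The function name is hypothetical, and treating zero as "no address supplied" is an assumption about handler policy, since a zero value cannot by itself be distinguished from a fault near guest physical address zero.

```python
def guest_pa_from_mtval2(mtval2: int):
    """Undo the shift-by-2 encoding of mtval2 (or htval).

    Returns None when the register is zero, which this sketch treats as
    "hardware supplied no guest physical address". The result has its two
    low-order bits cleared because the encoding does not carry them.
    """
    if mtval2 == 0:
        return None
    return mtval2 << 2
```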

5.4.5 Machine Trap Instruction Register (mtinst)

The mtinst register is an MXLEN-bit read/write register formatted as shown in Figure 1.38. When a trap is taken into M-mode, mtinst is written with a value that, if nonzero, provides information about the instruction that trapped, to assist software in handling the trap. The values that may be written to mtinst on a trap are documented in Section 5.7.3.

Machine trap instruction register (mtinst).

mtinst is a WARL register that need only be able to hold the values that the implementation may automatically write to it on a trap.

5.5 Two-Stage Address Translation

Whenever the current virtualization mode V is 1 (and assuming mstatus.MPRV=0), two-stage address translation and protection is in effect. For any virtual memory access, the original virtual address is converted in the first stage by VS-level address translation, as controlled by the vsatp register, into a guest physical address. The guest physical address is then converted in the second stage by guest physical address translation, as controlled by the hgatp register, into a supervisor physical address. Although there is no option to disable two-stage address translation when V=1, either stage of translation can be effectively disabled by zeroing the corresponding vsatp or hgatp register.

The vsstatus field MXR, which makes execute-only pages readable, only overrides VS-level page protection. Setting MXR at VS-level does not override guest-physical page protections. Setting MXR at HS-level, however, overrides both VS-level and guest-physical execute-only permissions.

When V=1, memory accesses that would normally bypass address translation are subject to guest physical address translation alone. This includes memory accesses made in support of VS-level address translation, such as reads and writes of VS-level page tables.

Machine-level physical memory protection applies to supervisor physical addresses and is in effect regardless of virtualization mode.

5.5.1 Guest Physical Address Translation

The mapping of guest physical addresses to supervisor physical addresses is controlled by CSR hgatp (Section 5.2.9).

When the address translation scheme selected by the MODE field of hgatp is Bare, guest physical addresses are equal to supervisor physical addresses without modification, and no memory protection applies in the trivial translation of guest physical addresses to supervisor physical addresses.

When hgatp.MODE specifies a translation scheme of Sv32x4, Sv39x4, or Sv48x4, guest physical address translation is a variation on the usual page-based virtual address translation scheme of Sv32, Sv39, or Sv48, respectively. In each case, the size of the incoming address is widened by 2 bits (to 34, 41, or 50 bits). To accommodate the 2 extra bits, the root page table (only) is expanded by a factor of four to be 16 KiB instead of the usual 4 KiB. Matching its larger size, the root page table also must be aligned to a 16 KiB boundary instead of the usual 4 KiB page boundary. Except as noted, all other aspects of Sv32, Sv39, or Sv48 are adopted unchanged for guest physical address translation. Non-root page tables and all page table entries (PTEs) have the same formats as documented in Sections [sec:sv32], [sec:sv39], and [sec:sv48].

For Sv32x4, an incoming guest physical address is partitioned into a virtual page number (VPN) and page offset as shown in Figure 1.39. This partitioning is identical to that for an Sv32 virtual address as depicted in Figure [sv32va], except with 2 more bits at the high end in VPN[1]. (Note that the fields of a partitioned guest physical address also correspond one-for-one with the structure that Sv32 assigns to a physical address, depicted in Figure [rv32va].)

Sv32x4 virtual address (guest physical address).

For Sv39x4, an incoming guest physical address is partitioned as shown in Figure 1.40. This partitioning is identical to that for an Sv39 virtual address as depicted in Figure [sv39va], except with 2 more bits at the high end in VPN[2]. Address bits 63:41 must all be zeros, or else a guest-page-fault exception occurs.

Sv39x4 virtual address (guest physical address).

For Sv48x4, an incoming guest physical address is partitioned as shown in Figure 1.41. This partitioning is identical to that for an Sv48 virtual address as depicted in Figure [sv48va], except with 2 more bits at the high end in VPN[3]. Address bits 63:50 must all be zeros, or else a guest-page-fault exception occurs.

Sv48x4 virtual address (guest physical address).
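The three partitionings can be summarized in a short sketch. The table of scheme parameters and the names `SCHEMES`, `GuestPageFault`, and `split_guest_pa` are illustrative assumptions; the field widths follow the Sv32, Sv39, and Sv48 layouts with 2 extra bits in the topmost VPN field.

```python
# scheme -> (number of levels, bits per ordinary VPN field, address width)
SCHEMES = {
    "Sv32x4": (2, 10, 34),
    "Sv39x4": (3, 9, 41),
    "Sv48x4": (4, 9, 50),
}


class GuestPageFault(Exception):
    """Stands in for a guest-page-fault exception."""


def split_guest_pa(gpa: int, scheme: str):
    """Split a guest physical address into ([VPN[0], VPN[1], ...], offset)."""
    levels, vpn_bits, width = SCHEMES[scheme]
    if gpa >> width:
        raise GuestPageFault("address bits above the scheme width must be zero")
    offset = gpa & 0xFFF
    vpn = []
    for level in range(levels):
        # Only the topmost VPN field is widened by 2 bits.
        bits = vpn_bits + 2 if level == levels - 1 else vpn_bits
        vpn.append((gpa >> (12 + level * vpn_bits)) & ((1 << bits) - 1))
    return vpn, offset
```

Because the top-level index gains 2 bits, the root page table holds four times as many entries as usual, which is where the 16 KiB size comes from.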

The page-based guest physical address translation scheme for RV32, Sv32x4, is defined to support a 34-bit guest physical address so that an RV32 hypervisor need not be limited in its ability to virtualize real 32-bit RISC-V machines, even those with 33-bit or 34-bit physical addresses. This may include the possibility of a machine virtualizing itself, if it happens to use 33-bit or 34-bit physical addresses. Multiplying the size and alignment of the root page table by a factor of four is the cheapest way to extend Sv32 to cover a 34-bit address. The possible wastage of 12 KiB for an unnecessarily large root page table is expected to be of negligible consequence for most (maybe all) real uses.

A consistent ability to virtualize machines having as much as four times the physical address space as virtual address space is believed to be of some utility also for RV64. For a machine implementing 39-bit virtual addresses (Sv39), for example, this allows the hypervisor extension to support up to a 41-bit guest physical address space without either necessitating hardware support for 48-bit virtual addresses (Sv48) or falling back to emulating the larger address space using shadow page tables.

The conversion of an Sv32x4, Sv39x4, or Sv48x4 guest physical address is accomplished with the same algorithm used for Sv32, Sv39, or Sv48, as presented in Section [sv32algorithm], except that:

  • in step 1, $a = \texttt{hgatp.PPN} \times \mathrm{PAGESIZE}$;

  • the current privilege mode is always taken to be U-mode; and

  • guest-page-fault exceptions are raised instead of regular page-fault exceptions.

For guest physical address translation, all memory accesses (including those made to access data structures for VS-level address translation) are considered to be user-level accesses, as though executed in U-mode. Access type permissions—readable, writable, or executable—are checked during guest physical address translation the same as for VS-level address translation. For a memory access made to support VS-level address translation (such as to read/write a VS-level page table), permissions are checked as though for a load or store, not for the original access type. However, any exception is always reported for the original access type (instruction, load, or store/AMO).
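The modified walk can be sketched for Sv39x4 as below. This is a simplified model under stated assumptions, not a complete implementation: memory is a hypothetical dict from PTE byte address to 64-bit PTE value, the A and D bits and MXR are ignored, and the names `translate_sv39x4` and `GuestPageFault` are invented for the example. The PTE bit positions are those of the standard Sv39 format.

```python
PAGESIZE = 4096
PTESIZE = 8      # bytes per Sv39 page table entry
LEVELS = 3
PTE_V, PTE_R, PTE_W, PTE_X, PTE_U = 0x01, 0x02, 0x04, 0x08, 0x10


class GuestPageFault(Exception):
    """Stands in for a guest-page-fault exception."""


def translate_sv39x4(mem, hgatp_ppn, gpa, access):
    """Translate a guest physical address; access is 'r', 'w', or 'x'."""
    if gpa >> 41:
        raise GuestPageFault("address bits 63:41 must be zero")
    # VPN[2] is 11 bits wide, so the root table has 2048 entries (16 KiB).
    vpn = [(gpa >> 12) & 0x1FF, (gpa >> 21) & 0x1FF, (gpa >> 30) & 0x7FF]
    a = hgatp_ppn * PAGESIZE                      # step 1 uses hgatp.PPN
    for i in reversed(range(LEVELS)):
        pte = mem.get(a + vpn[i] * PTESIZE, 0)
        if not (pte & PTE_V) or ((pte & PTE_W) and not (pte & PTE_R)):
            raise GuestPageFault("invalid PTE")
        ppn = (pte >> 10) & ((1 << 44) - 1)
        if pte & (PTE_R | PTE_X):                 # leaf PTE found at level i
            break
        a = ppn * PAGESIZE                        # descend one level
    else:
        raise GuestPageFault("no leaf PTE")
    # Every access is treated as a U-mode access, so the U bit must be set.
    required = {"r": PTE_R, "w": PTE_W, "x": PTE_X}[access]
    if not (pte & PTE_U) or not (pte & required):
        raise GuestPageFault("permission denied")
    low_mask = (1 << (9 * i)) - 1                 # PPN bits below a superpage
    if ppn & low_mask:
        raise GuestPageFault("misaligned superpage")
    return ((ppn | ((gpa >> 12) & low_mask)) << 12) | (gpa & 0xFFF)
```

A read of a VS-level page table during first-stage translation would call this with access 'r' (or 'w' for an update), matching the rule that such accesses are checked as loads or stores even though the fault is reported for the original access type.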

Guest physical address translation uses the identical format for PTEs as regular address translation, even including the U bit, due to the possibility of sharing some (or all) page tables between guest physical address translation and regular HS-level address translation. Regardless of whether this usage will ever become common, we chose not to preclude it.

5.5.2 Guest-Page Faults

Guest-page-fault traps may be delegated from M-mode to HS-mode under the control of CSR medeleg, but cannot be delegated to other operating modes. On a guest-page fault, CSR mtval or stval is written with the faulting guest virtual address as usual, and mtval2 or htval is written either with zero or with the faulting guest physical address, shifted right by 2 bits. CSR mtinst or htinst may also be written with information about the faulting instruction or other reason for the access, as explained in Section 5.7.3.

When an instruction fetch or a misaligned memory access straddles a page boundary, two different address translations are involved. When a guest-page fault occurs in such a circumstance, the faulting virtual address written to mtval/stval is the same as would be required for a regular page fault. Thus, the faulting virtual address may be a page-boundary address that is higher than the instruction’s original virtual address, if the byte at that page boundary is among the accessed bytes. A nonzero guest physical address written by the machine to mtval2/htval shall always correspond to the exact guest virtual address written to mtval/stval.

5.5.3 Memory-Management Fences

The behavior of the SFENCE.VMA instruction is affected by the current virtualization mode V. When V=0, the virtual-address argument is an HS-level virtual address, and the ASID argument is an HS-level ASID. The instruction orders stores only to HS-level address-translation structures with subsequent HS-level address translations.

When V=1, the virtual-address argument to SFENCE.VMA is a guest virtual address within the current virtual machine, and the ASID argument is a VS-level ASID within the current virtual machine. The current virtual machine is identified by the VMID field of CSR hgatp, and the effective ASID can be considered to be the combination of this VMID with the VS-level ASID. The SFENCE.VMA instruction orders stores only to the VS-level address-translation structures with subsequent VS-level address translations for the same virtual machine, i.e., only when hgatp.VMID is the same as when the SFENCE.VMA executed.

Hypervisor instructions HFENCE.GVMA and HFENCE.VVMA provide additional memory-management fences to complement SFENCE.VMA. These instructions are described in Section 5.3.1.

Section [pmp-vmem] discusses the intersection between physical memory protection (PMP) and page-based address translation. It is noted there that, when PMP settings are modified in a manner that affects either the physical memory that holds page tables or the physical memory to which page tables point, M-mode software must synchronize the PMP settings with the virtual memory system. For HS-level address translation, this is accomplished by executing in M-mode an SFENCE.VMA instruction with rs1=x0 and rs2=x0, after the PMP CSRs are written. If guest physical address translation is in use, synchronization with its data structures is also needed. When PMP settings are modified in a manner that affects either the physical memory that holds guest-physical page tables or the physical memory to which guest-physical page tables point, an HFENCE.GVMA instruction with rs1=x0 and rs2=x0 must be executed in M-mode after the PMP CSRs are written. An HFENCE.VVMA instruction is not required.

5.6 WFI in Virtual Operating Modes

Executing instruction WFI when V=1 causes an illegal instruction exception, unless it completes within an implementation-specific, bounded time limit.

The behavior required of WFI in VS-mode and VU-mode is the same as required of it in U-mode when S-mode exists.

5.7 Traps

5.7.1 Trap Cause Codes

Interrupt Exception Code Description
1 0 Reserved
1 1 Supervisor software interrupt
1 2 Virtual supervisor software interrupt
1 3 Machine software interrupt
1 4 Reserved
1 5 Supervisor timer interrupt
1 6 Virtual supervisor timer interrupt
1 7 Machine timer interrupt
1 8 Reserved
1 9 Supervisor external interrupt
1 10 Virtual supervisor external interrupt
1 11 Machine external interrupt
1 12 Supervisor guest external interrupt
1 13–15 Reserved
1 ≥16 Available for platform or custom use
0 0 Instruction address misaligned
0 1 Instruction access fault
0 2 Illegal instruction
0 3 Breakpoint
0 4 Load address misaligned
0 5 Load access fault
0 6 Store/AMO address misaligned
0 7 Store/AMO access fault
0 8 Environment call from U-mode or VU-mode
0 9 Environment call from HS-mode
0 10 Environment call from VS-mode
0 11 Environment call from M-mode
0 12 Instruction page fault
0 13 Load page fault
0 14 Reserved
0 15 Store/AMO page fault
0 16–19 Reserved
0 20 Instruction guest-page fault
0 21 Load guest-page fault
0 22 Reserved
0 23 Store/AMO guest-page fault
0 24–31 Available for custom use
0 32–47 Reserved
0 48–63 Available for custom use
0 ≥64 Reserved

The hypervisor extension augments the trap cause encoding. The table above lists the possible M-mode and HS-mode trap cause codes when the hypervisor extension is implemented. Codes are added for VS-level interrupts (interrupts 2, 6, 10) and for guest-page faults (exceptions 20, 21, 23). Furthermore, environment calls from VS-mode are assigned cause 10, whereas those from HS-mode or S-mode use cause 9 as usual.

HS-mode and VS-mode ECALLs use different cause values so they can be delegated separately.
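A handler could pick out the hypervisor-added codes as sketched here. The function and dictionary names are assumptions; the sketch relies on the Interrupt flag being the most-significant bit of mcause or scause, as in the base privileged architecture.

```python
HYPERVISOR_INTERRUPTS = {
    2: "Virtual supervisor software interrupt",
    6: "Virtual supervisor timer interrupt",
    10: "Virtual supervisor external interrupt",
    12: "Supervisor guest external interrupt",
}
HYPERVISOR_EXCEPTIONS = {
    10: "Environment call from VS-mode",
    20: "Instruction guest-page fault",
    21: "Load guest-page fault",
    23: "Store/AMO guest-page fault",
}


def decode_cause(cause: int, xlen: int):
    """Split a cause register value into (is_interrupt, code, description).

    The description is None for codes the hypervisor extension did not add.
    """
    is_interrupt = bool((cause >> (xlen - 1)) & 1)
    code = cause & ((1 << (xlen - 1)) - 1)
    table = HYPERVISOR_INTERRUPTS if is_interrupt else HYPERVISOR_EXCEPTIONS
    return is_interrupt, code, table.get(code)
```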

5.7.2 Trap Entry

When a trap occurs in HS-mode or U-mode, it goes to M-mode, unless delegated by medeleg or mideleg, in which case it goes to HS-mode. When a trap occurs in VS-mode or VU-mode, it goes to M-mode, unless delegated by medeleg or mideleg, in which case it goes to HS-mode, unless further delegated by hedeleg or hideleg, in which case it goes to VS-mode.
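The delegation chain can be expressed as a small decision function. This sketch covers only traps that occur outside M-mode, and the function name and argument layout are assumptions for illustration.

```python
def trap_target(v, is_interrupt, code, medeleg, mideleg, hedeleg, hideleg):
    """Return 'M', 'HS', or 'VS' for a trap taken from a mode below M-mode.

    v is the virtualization mode at the time of the trap; the delegation
    arguments are the raw CSR values, indexed by cause code.
    """
    m_deleg = mideleg if is_interrupt else medeleg
    h_deleg = hideleg if is_interrupt else hedeleg
    if not (m_deleg >> code) & 1:
        return "M"                 # not delegated past M-mode
    if v and (h_deleg >> code) & 1:
        return "VS"                # further delegated to the guest
    return "HS"
```

Note that hedeleg and hideleg are consulted only when V=1, so a trap from HS-mode or U-mode never lands in VS-mode.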

When a trap is taken into M-mode, virtualization mode V gets set to 0, and in mstatus (or mstatush) MPV and MPP are set according to the table below. A trap into M-mode also writes CSRs mepc, mcause, mtval, mtval2, and mtinst.

Previous Mode MPV MPP
U-mode 0 0
HS-mode 0 1
M-mode 0 3
VU-mode 1 0
VS-mode 1 1
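The M-mode entry values, together with the MRET rule from Section 5.4.1, can be modeled as follows. The dictionary and function names are hypothetical, and only the V, MPV, and MPP state is tracked.

```python
# previous mode -> (MPV, MPP) written on a trap into M-mode
M_TRAP_ENTRY = {"U": (0, 0), "HS": (0, 1), "M": (0, 3), "VU": (1, 0), "VS": (1, 1)}


def trap_into_m(prev_mode: str) -> dict:
    """Return the V/MPV/MPP state after a trap into M-mode."""
    mpv, mpp = M_TRAP_ENTRY[prev_mode]
    return {"V": 0, "MPV": mpv, "MPP": mpp}


def virtualization_mode_after_mret(mpv: int, mpp: int) -> int:
    """MRET sets V to MPV, unless MPP=3, in which case V remains 0."""
    return 0 if mpp == 3 else mpv
```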

When a trap is taken into HS-mode, virtualization mode V is first set to 0, hstatus.SP2V is set to hstatus.SPV, hstatus.SP2P is set to sstatus.SPP, and lastly hstatus.SPV and sstatus.SPP are set according to the table below. A trap into HS-mode also writes CSRs sepc, scause, stval, htval, and htinst.

Previous Mode SPV SPP
U-mode 0 0
HS-mode 0 1
VU-mode 1 0
VS-mode 1 1
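The ordering of the HS-mode entry updates matters, because SP2V and SP2P must capture the old SPV and SPP before those are overwritten. A minimal sketch, with the state held in a hypothetical dict rather than real CSRs:

```python
# previous mode -> (SPV, SPP) written on a trap into HS-mode
HS_TRAP_ENTRY = {"U": (0, 0), "HS": (0, 1), "VU": (1, 0), "VS": (1, 1)}


def trap_into_hs(state: dict, prev_mode: str) -> dict:
    """Return a new state dict after a trap into HS-mode."""
    new = dict(state)
    new["V"] = 0
    new["SP2V"] = state["SPV"]     # hstatus.SP2V <- old hstatus.SPV
    new["SP2P"] = state["SPP"]     # hstatus.SP2P <- old sstatus.SPP
    new["SPV"], new["SPP"] = HS_TRAP_ENTRY[prev_mode]
    return new
```

For instance, a trap from VS-mode followed by a nested trap from HS-mode leaves SP2V=1 and SP2P=1, preserving the guest's mode across the nested trap.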

When a trap is taken into VS-mode, vsstatus.SPP is set according to the table below. Register hstatus and the HS-level sstatus are not modified, and the virtualization mode V remains 1. A trap into VS-mode also writes CSRs vsepc, vscause, and vstval.

Previous Mode SPP
VU-mode 0
VS-mode 1

5.7.3 Transformed Instruction or Pseudoinstruction for mtinst or htinst

On any trap into M-mode or HS-mode, one of these values is written automatically into the appropriate trap instruction CSR, mtinst or htinst:

  • zero;

  • a transformation of the trapping instruction;

  • a custom value (allowed only if the trapping instruction is nonstandard); or

  • a special pseudoinstruction.

Except when a pseudoinstruction value is required (described later), the value written to mtinst or htinst may always be zero, indicating that the hardware is providing no information in the register for this particular trap.

The value written to the trap instruction CSR serves two purposes. The first is to improve the speed of instruction emulation in a trap handler, partly by allowing the handler to skip loading the trapping instruction from memory, and partly by obviating some of the work of decoding and executing the instruction. The second purpose is to supply, via pseudoinstructions, additional information about guest-page-fault exceptions caused by implicit memory accesses done for VS-level address translation.

A transformation of the trapping instruction is written instead of simply a copy of the original instruction in order to minimize the burden for hardware yet still provide to a trap handler the information needed to emulate the instruction. An implementation may at any time reduce its effort by substituting zero in place of the transformed instruction.

On an interrupt, the value written to the trap instruction register is always zero. On a synchronous exception, if a nonzero value is written, one of the following shall be true about the value:

  • Bit 0 is 1, and replacing bit 1 with 1 makes the value into a valid encoding of a standard instruction.

    In this case, the instruction that trapped is the same kind as indicated by the register value, and the register value is the transformation of the trapping instruction, as defined later. For example, if bits 1:0 are binary 11 and the register value is the encoding of a standard LW (load word) instruction, then the trapping instruction is LW, and the register value is the transformation of the trapping LW instruction.

  • Bit 0 is 1, and replacing bit 1 with 1 makes the value into an instruction encoding that is explicitly available for a custom instruction (not an unused reserved encoding).

    This is a custom value. The instruction that trapped is a nonstandard instruction. The interpretation of a custom value is not otherwise specified by this standard.

  • The value is one of the special pseudoinstructions defined later, all of which have bits 1:0 equal to 00.

These three cases exclude a large number of other possible values, such as all those having bits 1:0 equal to binary 10. A future standard or extension may define additional cases, thus allowing values that are currently excluded. Software may safely treat an unrecognized value in a trap instruction register the same as zero.

To be forward-compatible with future revisions of this standard, software that interprets a nonzero value from mtinst or htinst must fully verify that the value conforms to one of the cases listed above. For instance, for RV64, discovering that bits 6:0 of mtinst are 0000011 and bits 14:12 are 010 is not sufficient to establish that the first case applies and the trapping instruction is a standard LW instruction; rather, software must also confirm that bits 63:32 of mtinst are all zeros. A future standard might define new values for 64-bit mtinst that are nonzero in bits 63:32 yet may coincidentally have in bits 31:0 the same bit patterns as standard RV64 instructions.
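The LW example above can be turned into a check along these lines. This is a sketch for one instruction only; the function name is an assumption, and a real handler would need equivalent checks for every instruction it recognizes.

```python
def is_transformed_lw(tinst: int) -> bool:
    """Check whether a 64-bit mtinst/htinst value is a transformed LW."""
    if tinst >> 32:
        return False               # bits 63:32 must all be zero
    if (tinst & 1) != 1:
        return False               # bit 0 must be 1 (not a pseudoinstruction)
    as_instruction = tinst | 0b10  # replace bit 1 with 1
    opcode = as_instruction & 0x7F
    funct3 = (as_instruction >> 12) & 0x7
    return opcode == 0b0000011 and funct3 == 0b010
```

Setting bit 1 before comparing lets the same check accept both the compressed (bits 1:0 = 01) and noncompressed (bits 1:0 = 11) forms.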

Unlike for standard instructions, there is no requirement that the instruction encoding of a custom value be of the same “kind” as the instruction that trapped (or even have any correlation with the trapping instruction).

The table below shows the values that may be automatically written to the trap instruction register for each standard exception cause. For exceptions that prevent the fetching of an instruction, only zero or a pseudoinstruction value may be written. A custom value may be automatically written only if the instruction that traps is nonstandard. A future standard or extension may permit other values to be written, chosen from the set of allowed values established earlier.

Exception | Zero | Transformed Standard Instruction | Custom Value | Pseudoinstruction Value
Instruction address misaligned | Yes | No | Yes | No
Instruction access fault | Yes | No | No | No
Illegal instruction | Yes | No | No | No
Breakpoint | Yes | No | Yes | No
Load address misaligned | Yes | Yes | Yes | No
Load access fault | Yes | Yes | Yes | No
Store/AMO address misaligned | Yes | Yes | Yes | No
Store/AMO access fault | Yes | Yes | Yes | No
Environment call | Yes | No | Yes | No
Instruction page fault | Yes | No | No | No
Load page fault | Yes | Yes | Yes | No
Store/AMO page fault | Yes | Yes | Yes | No
Instruction guest-page fault | Yes | No | No | Yes
Load guest-page fault | Yes | Yes | Yes | Yes
Store/AMO guest-page fault | Yes | Yes | Yes | Yes

As enumerated in the table, a synchronous exception may write to the trap instruction register a standard transformation of the trapping instruction only for exceptions that arise from explicit memory accesses (from loads, stores, and AMO instructions). Accordingly, standard transformations are currently defined only for these memory-access instructions. If a synchronous trap occurs for a standard instruction for which no transformation has been defined, the trap instruction register shall be written with zero (or, under certain circumstances, with a special pseudoinstruction value).

For a standard load instruction that is not a compressed instruction and is one of LB, LBU, LH, LHU, LW, LWU, LD, FLW, FLD, or FLQ, the transformed instruction has the format shown in Figure 1.42.

Transformed noncompressed load instruction (LB, LBU, LH, LHU, LW, LWU, LD, FLW, FLD, or FLQ). Fields funct3, rd, and opcode are the same as the trapping load instruction.

For a standard store instruction that is not a compressed instruction and is one of SB, SH, SW, SD, FSW, FSD, or FSQ, the transformed instruction has the format shown in Figure 1.43.

Transformed noncompressed store instruction (SB, SH, SW, SD, FSW, FSD, or FSQ). Fields rs2, funct3, and opcode are the same as the trapping store instruction.

For a standard atomic instruction (load-reserved, store-conditional, or AMO instruction), the transformed instruction has the format shown in Figure 1.44.

Transformed atomic instruction (load-reserved, store-conditional, or AMO instruction). All fields are the same as the trapping instruction except bits 19:15, Addr. Offset.

In the transformed instructions above, the Addr. Offset field that replaces the instruction’s rs1 field in bits 19:15 is the positive difference between the faulting virtual address (written to mtval or stval) and the original virtual address. This difference can be nonzero only for an access or page fault that occurs for a misaligned memory access. Note also that, for basic loads and stores, the transformations replace the instruction’s immediate offset fields with zero.
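The load and store transformations amount to masking the original encoding and inserting the address offset, as in this sketch. The function names are assumptions; the masks follow from the fields that the text says are preserved (funct3, rd or rs2, and opcode) in the standard 32-bit load and store formats.

```python
def transform_load(instruction: int, addr_offset: int) -> int:
    """Transform a noncompressed load: keep funct3, rd, and opcode."""
    kept = instruction & 0x0000_7FFF          # bits 14:12, 11:7, 6:0
    return kept | ((addr_offset & 0x1F) << 15)


def transform_store(instruction: int, addr_offset: int) -> int:
    """Transform a noncompressed store: keep rs2, funct3, and opcode."""
    kept = instruction & 0x01F0_707F          # bits 24:20, 14:12, 6:0
    return kept | ((addr_offset & 0x1F) << 15)


# lw a0, 8(a1) is encoded as 0x0085A503; sw a0, 8(a1) as 0x00A5A423.
transformed_lw = transform_load(0x0085A503, addr_offset=0)
transformed_sw = transform_store(0x00A5A423, addr_offset=0)
```

The immediate and rs1 fields vanish in both cases, which is why a handler must take the faulting address from mtval or stval rather than recomputing it from the register value.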

For a standard compressed instruction (16-bit size), the transformed instruction is found as follows:

  1. Expand the compressed instruction to its 32-bit equivalent.

  2. Transform the 32-bit equivalent instruction.

  3. Replace bit 1 with a 0.

Bits 1:0 of a transformed standard instruction will be binary 01 if the trapping instruction is compressed and 11 if not.
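As a rough illustration of step 3 and of how software might use bits 1:0, consider the following C sketch. The helper names are hypothetical, and expanding a compressed instruction to its 32-bit equivalent (step 1) is assumed to have been done elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Step 3 of the list above: clear bit 1 of the transformed 32-bit
 * equivalent, so that bits 1:0 read 01 instead of 11. */
uint32_t mark_compressed(uint32_t transformed32)
{
    return transformed32 & ~UINT32_C(2);
}

/* True if bits 1:0 are 01, i.e. the trapping instruction was compressed. */
bool tinst_was_compressed(uint32_t tinst)
{
    return (tinst & 3u) == 1u;
}

/* Set bit 1 again so the value can be decoded as an ordinary 32-bit
 * encoding. Only meaningful when bit 0 of the value is already 1. */
uint32_t tinst_as_32bit(uint32_t tinst)
{
    return tinst | 2u;
}
```

A trap handler could use `tinst_was_compressed` to decide whether to advance the saved program counter by 2 or by 4 bytes when it emulates the access.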

In decoding the contents of mtinst or htinst, once software has determined that the register contains the encoding of a standard basic load (LB, LBU, LH, LHU, LW, LWU, LD, FLW, FLD, or FLQ) or basic store (SB, SH, SW, SD, FSW, FSD, or FSQ), it is not necessary to confirm also that the immediate offset fields (31:25, and 24:20 or 11:7) are zeros. The knowledge that the register’s value is the encoding of a basic load/store is sufficient to prove that the trapping instruction is of the same kind.
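A software decoder along these lines might look like the sketch below. It is only one possible approach: the type and function names are hypothetical, and the major opcode values (0x03 LOAD, 0x07 LOAD-FP, 0x23 STORE, 0x27 STORE-FP, 0x2F AMO) are taken from the base ISA opcode map rather than from this chapter. Note that it deliberately does not check the immediate fields for zero.

```c
#include <stdbool.h>
#include <stdint.h>

enum tinst_kind {
    TINST_ZERO,   /* register holds zero: no information provided */
    TINST_LOAD,   /* basic load (integer or floating-point) */
    TINST_STORE,  /* basic store (integer or floating-point) */
    TINST_ATOMIC, /* load-reserved, store-conditional, or AMO */
    TINST_OTHER   /* pseudoinstruction or any value not recognized here */
};

struct tinst_info {
    enum tinst_kind kind;
    bool     compressed;  /* trapping instruction was 16 bits wide */
    unsigned funct3;      /* bits 14:12 */
    unsigned addr_offset; /* bits 19:15 */
    unsigned rd;          /* bits 11:7, meaningful for loads and atomics */
    unsigned rs2;         /* bits 24:20, meaningful for stores and atomics */
};

struct tinst_info decode_tinst(uint32_t tinst)
{
    struct tinst_info info = { TINST_OTHER, false, 0, 0, 0, 0 };

    if (tinst == 0) {
        info.kind = TINST_ZERO;
        return info;
    }
    if ((tinst & 1u) == 0)
        return info; /* bits 1:0 are neither 01 nor 11 */

    info.compressed  = (tinst & 2u) == 0;
    info.funct3      = (tinst >> 12) & 0x7u;
    info.addr_offset = (tinst >> 15) & 0x1Fu;
    info.rd          = (tinst >> 7) & 0x1Fu;
    info.rs2         = (tinst >> 20) & 0x1Fu;

    switch ((tinst & 0x7Fu) | 2u) { /* major opcode with bit 1 restored */
    case 0x03: case 0x07: info.kind = TINST_LOAD;   break;
    case 0x23: case 0x27: info.kind = TINST_STORE;  break;
    case 0x2F:            info.kind = TINST_ATOMIC; break;
    default:              break;
    }
    return info;
}
```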

A future version of this standard may add information to the fields that are currently zeros; however, for backwards compatibility, any such information will be for performance purposes only and can safely be ignored.

For guest-page faults, the trap instruction register is written with a special pseudoinstruction value if: (a) the faulting memory access is an implicit access for VS-level address translation, and (b) a nonzero value (the faulting guest physical address) is written to mtval2 or htval. If both conditions are met, the value written to mtinst or htinst must be taken from the following table; zero is not allowed.

Value Meaning
0x00002000 32-bit read for VS-level address translation (RV32)
0x00002020 32-bit write for VS-level address translation (RV32)
0x00003000 64-bit read for VS-level address translation (RV64)
0x00003020 64-bit write for VS-level address translation (RV64)

The defined pseudoinstruction values are designed to correspond closely with the encodings of basic loads and stores, as illustrated by the following table.

Encoding Instruction
0x00002003 lw x0,0(x0)
0x00002023 sw x0,0(x0)
0x00003003 ld x0,0(x0)
0x00003023 sd x0,0(x0)
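Comparing the two tables, each pseudoinstruction value equals the corresponding basic encoding with bits 1:0 cleared. The C sketch below illustrates that relationship and one way a handler might classify a pseudoinstruction value. The function names are hypothetical, and the read/write and width tests are inferred from the table values rather than stated by the specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clear bits 1:0 of a basic load/store encoding such as lw x0,0(x0). */
uint32_t pseudo_from_basic(uint32_t encoding)
{
    return encoding & ~UINT32_C(3);
}

/* True only for the four pseudoinstruction values defined in the table. */
bool tinst_is_pseudo(uint32_t tinst)
{
    return tinst == 0x00002000u || tinst == 0x00002020u ||
           tinst == 0x00003000u || tinst == 0x00003020u;
}

/* Bit 5 separates the store-like values (0x20) from the load-like ones. */
bool pseudo_is_write(uint32_t pseudo)
{
    return (pseudo & 0x20u) != 0;
}

/* Bits 14:12 mirror funct3 of lw/sw (2) or ld/sd (3). */
unsigned pseudo_width_bits(uint32_t pseudo)
{
    return ((pseudo >> 12) & 0x7u) == 3u ? 64u : 32u;
}
```

The last two helpers are meant to be called only after `tinst_is_pseudo` has returned true.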

A write pseudoinstruction (0x00002020 or 0x00003020) is used for the case that the machine is attempting automatically to update bits A and/or D in VS-level page tables. All other implicit memory accesses for VS-level address translation will be reads. If a machine never automatically updates bits A or D in VS-level page tables (leaving this to software), the write case will never arise. The fact that such a page table update must actually be atomic, not just a simple write, is ignored for the pseudoinstruction.

If the conditions that necessitate a pseudoinstruction value can ever occur for M-mode, then mtinst cannot be hardwired entirely to zero; and likewise for HS-mode and htinst. However, in that case, the trap instruction registers may minimally support only values 0 and 0x00002000 or 0x00003000, and possibly 0x00002020 or 0x00003020, requiring as few as one or two flip-flops in hardware, per register.

There is no harm here in ignoring the atomicity requirement for page table updates, because a hypervisor is not expected in these circumstances to emulate an implicit memory access that fails. Rather, the hypervisor is given enough information about the faulting access to be able to make the memory accessible (e.g., by restoring a missing page of virtual memory) before resuming execution by retrying the faulting instruction.

5.7.4 Trap Return

The MRET instruction is used to return from a trap taken into M-mode. MRET first determines what the new operating mode will be according to the values of MPP and MPV in mstatus or mstatush, as encoded in Table [h-mpp]. MRET then sets MPV=0, MPP=0, MIE=MPIE, and MPIE=1 in mstatus/mstatush. If the new operating mode will be U, VS, or VU, MRET also sets hstatus.SPRV=0. Lastly, MRET sets the virtualization and privilege modes as previously determined, and sets pc=mepc.
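The MRET sequence can be modeled in C roughly as follows. This is a simplified sketch, not a definitive implementation: the `struct hart` layout and names are hypothetical, and because the MPV/MPP table is not reproduced here, the code assumes that MPP gives the new privilege level (0 = U, 1 = S, 3 = M) and that MPV gives the new virtualization mode unless MPP selects M-mode.

```c
#include <stdbool.h>
#include <stdint.h>

enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Hypothetical model of the state that MRET reads and writes. */
struct hart {
    bool     mpv;  /* mstatus/mstatush.MPV */
    unsigned mpp;  /* mstatus.MPP */
    bool     mie;  /* mstatus.MIE */
    bool     mpie; /* mstatus.MPIE */
    bool     sprv; /* hstatus.SPRV */
    uint64_t mepc;
    bool     v;    /* current virtualization mode */
    unsigned priv; /* current privilege level */
    uint64_t pc;
};

void mret(struct hart *h)
{
    /* 1. Determine the new mode (ASSUMED encoding, see the text above). */
    unsigned new_priv = h->mpp;
    bool     new_v    = (new_priv != PRIV_M) && h->mpv;

    /* 2. Update mstatus/mstatush. */
    h->mpv  = false;
    h->mpp  = PRIV_U;
    h->mie  = h->mpie;
    h->mpie = true;

    /* 3. Clear hstatus.SPRV when returning to U, VS, or VU. */
    if (new_v || new_priv == PRIV_U)
        h->sprv = false;

    /* 4. Switch modes and jump to mepc. */
    h->v    = new_v;
    h->priv = new_priv;
    h->pc   = h->mepc;
}
```

In this model, a return to HS-mode or M-mode leaves hstatus.SPRV unchanged, which matches the condition stated in the text.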

The SRET instruction is used to return from a trap taken into HS-mode or VS-mode. Its behavior depends on the current virtualization mode.

When executed in M-mode or HS-mode (i.e., V=0), SRET first determines what the new operating mode will be according to the values in hstatus.SPV and sstatus.SPP, as encoded in Table [h-spp]. SRET then sets hstatus.SPV=hstatus.SP2V, sstatus.SPP=hstatus.SP2P, hstatus.SP2V=0, hstatus.SP2P=0, sstatus.SIE=sstatus.SPIE, and sstatus.SPIE=1. If the new operating mode will be U, VS, or VU, SRET also sets hstatus.SPRV=0. Lastly, SRET sets the virtualization and privilege modes as previously determined, and sets pc=sepc.

When executed in VS-mode (i.e., V=1), SRET sets the privilege mode according to Table [h-vspp], then sets SPP=0, SIE=SPIE, and SPIE=1 in vsstatus, and lastly sets pc=vsepc.
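Both SRET cases can be modeled together, as in the C sketch below. The structure and names are hypothetical, and since the SPV/SPP tables are not reproduced here, the code assumes that SPP gives the new privilege level (0 = U, 1 = S) and, for V=0, that hstatus.SPV gives the new virtualization mode.

```c
#include <stdbool.h>
#include <stdint.h>

enum priv { PRIV_U = 0, PRIV_S = 1, PRIV_M = 3 };

/* Hypothetical model of the state that SRET reads and writes. */
struct hart {
    bool     spv, sp2v, sprv;   /* hstatus.SPV, SP2V, SPRV */
    unsigned sp2p;              /* hstatus.SP2P */
    unsigned spp;               /* sstatus.SPP */
    bool     sie, spie;         /* sstatus.SIE, SPIE */
    uint64_t sepc;
    unsigned vs_spp;            /* vsstatus.SPP */
    bool     vs_sie, vs_spie;   /* vsstatus.SIE, SPIE */
    uint64_t vsepc;
    bool     v;                 /* current virtualization mode */
    unsigned priv;              /* current privilege level */
    uint64_t pc;
};

void sret(struct hart *h)
{
    if (!h->v) {
        /* V=0: executed in M-mode or HS-mode (ASSUMED table encoding). */
        unsigned new_priv = h->spp;
        bool     new_v    = h->spv;

        /* Pop the two-level stack held in hstatus and sstatus. */
        h->spv  = h->sp2v;
        h->spp  = h->sp2p;
        h->sp2v = false;
        h->sp2p = 0;
        h->sie  = h->spie;
        h->spie = true;

        /* Clear hstatus.SPRV when returning to U, VS, or VU. */
        if (new_v || new_priv == PRIV_U)
            h->sprv = false;

        h->v    = new_v;
        h->priv = new_priv;
        h->pc   = h->sepc;
    } else {
        /* V=1: executed in VS-mode; only vsstatus and vsepc are involved. */
        unsigned new_priv = h->vs_spp;

        h->vs_spp  = 0;
        h->vs_sie  = h->vs_spie;
        h->vs_spie = true;

        h->priv = new_priv; /* V stays 1, so the result is VU or VS */
        h->pc   = h->vsepc;
    }
}
```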