Five EmbedDev

An Embedded RISC-V Blog

Intro

For information on assembler programming:
GCC gives direct access to instructions via __asm__, for example:
  • No argument instructions:
        __asm__ volatile ("nop");
        __asm__ volatile ("wfi");  
  • With register arguments:
    __asm__ volatile ("csrrw    %0, mie, %1"  /* read and write atomically */
                          : "=r" (ret) /* output: register %0 */
                          : "r" (value)  /* input: register %1 */
                          : /* clobbers: none */);
Opcodes are listed in machine readable format here.

Instruction List


rv32 rv64 rv128 a c d f m n q v custom csr supervisor hypervisor

rv32

conditional branches unconditional jumps programmers model for base integer isa integer register immediate instructions integer computational instructions
integer register register operations load and store instructions sec:fence rv32 environment call and breakpoints

rv32 / conditional-branches

RV32I Base Integer Instruction Set, Version 2.1 / Control Transfer Instructions
beq rs1, rs2, bimm12
BEQ and BNE take the branch if registers rs1 and rs2 are equal or unequal, respectively.
blt rs1, rs2, bimm12
BLT and BLTU take the branch if rs1 is less than rs2, using signed and unsigned comparison respectively.
Note, BGT, BGTU, BLE, and BLEU can be synthesized by reversing the operands to BLT, BLTU, BGE, and BGEU, respectively.
bge rs1, rs2, bimm12
BGE and BGEU take the branch if rs1 is greater than or equal to rs2, using signed and unsigned comparison respectively.
bltu rs1, rs2, bimm12
Signed array bounds may be checked with a single BLTU instruction, since any negative index will compare greater than any nonnegative bound.
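The BLTU bounds-check trick can be modelled in portable C. This is only a sketch of the comparison a single BLTU performs; the function name index_in_bounds is made up for illustration and is not part of any RISC-V header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models "bltu index, bound, in_range".
 * Reinterpreting a negative signed index as unsigned yields a value of at
 * least 0x80000000, which is larger than any nonnegative 32-bit bound. */
static inline bool index_in_bounds(int32_t index, int32_t bound) {
    return (uint32_t)index < (uint32_t)bound;
}
```

One unsigned comparison therefore replaces the pair of checks index >= 0 and index < bound.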

rv32 / unconditional-jumps

RV32I Base Integer Instruction Set, Version 2.1 / Control Transfer Instructions
jalr rd, rs1, imm12
The indirect jump instruction JALR (jump and link register) uses the I-type encoding.
The JALR instruction was defined to enable a two-instruction sequence to jump anywhere in a 32-bit absolute address range.
Note that the JALR instruction does not treat the 12-bit immediate as multiples of 2 bytes, unlike the conditional branch instructions.
In practice, most uses of JALR will have either a zero immediate or be paired with a LUI or AUIPC, so the slight reduction in range is not significant.
Clearing the least-significant bit when calculating the JALR target address both simplifies the hardware slightly and allows the low bit of function pointers to be used to store auxiliary information.
When used with a base rs1 = x0, JALR can be used to implement a single-instruction subroutine call to the lowest 2 KiB or highest 2 KiB address region from anywhere in the address space.
JALR instructions should push/pop a RAS (return-address stack) as shown in the table in the specification.
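As a rough illustration of the JALR target calculation described above, the following C sketch models the address arithmetic on RV32. The helper names sext12 and jalr_target are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 12 bits of an I-type immediate. */
static inline int32_t sext12(uint32_t imm12) {
    imm12 &= 0xFFFu;
    return (imm12 & 0x800u) ? (int32_t)imm12 - 0x1000 : (int32_t)imm12;
}

/* Hypothetical model of the JALR target on RV32:
 * add the sign-extended immediate to rs1, then clear bit 0. */
static inline uint32_t jalr_target(uint32_t rs1, uint32_t imm12) {
    return (rs1 + (uint32_t)sext12(imm12)) & ~UINT32_C(1);
}
```

With rs1 = x0 (value 0) the reachable targets are the 2 KiB just above address zero and the 2 KiB just below the top of the address space, which matches the range quoted above.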

rv32 / programmers-model-for-base-integer-isa

RV32I Base Integer Instruction Set, Version 2.1 / Programmers’ Model for Base Integer ISA
jal rd, jimm20
See the descriptions of the JAL and JALR instructions.

rv32 / integer-register-immediate-instructions

RV32I Base Integer Instruction Set, Version 2.1 / Integer Computational Instructions
lui rd, imm20
LUI (load upper immediate) is used to build 32-bit constants and uses the U-type format.
LUI places the U-immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.
auipc rd, imm20
AUIPC (add upper immediate to pc) is used to build pc-relative addresses and uses the U-type format.
AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the address of the AUIPC instruction, then places the result in register rd.
The AUIPC instruction supports two-instruction sequences to access arbitrary offsets from the PC for both control-flow transfers and data accesses.
The combination of an AUIPC and the 12-bit immediate in a JALR can transfer control to any 32-bit PC-relative address, while an AUIPC plus the 12-bit immediate offset in regular load or store instructions can access any 32-bit PC-relative data address.
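Because the 12-bit immediate of ADDI, JALR, loads and stores is sign-extended, the upper 20 bits have to be rounded up whenever bit 11 of the constant is set. The sketch below shows that split for RV32 in plain C; the names hi20, lo12 and lui_addi are invented for this example (GNU assemblers expose the same split through the %hi and %lo operators).

```c
#include <stdint.h>

/* Hypothetical helper: the 20-bit value to hand to LUI (or AUIPC),
 * rounded so that the sign-extended low part can correct it. */
static inline uint32_t hi20(uint32_t value) {
    return ((value + 0x800u) >> 12) & 0xFFFFFu;
}

/* Hypothetical helper: the low 12 bits read as a signed immediate. */
static inline int32_t lo12(uint32_t value) {
    int32_t low = (int32_t)(value & 0xFFFu);
    return low >= 0x800 ? low - 0x1000 : low;
}

/* Models "lui rd, hi" followed by "addi rd, rd, lo" on RV32. */
static inline uint32_t lui_addi(uint32_t hi, int32_t lo) {
    return (hi << 12) + (uint32_t)lo;
}
```

For example, 0x12345FFF splits into hi = 0x12346 and lo = -1, and the two-instruction sequence adds them back to the original constant.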
addi rd, rs1, imm12
ADDI adds the sign-extended 12-bit immediate to register rs1.
ADDI rd, rs1, 0 is used to implement the MV rd, rs1 assembler pseudoinstruction.
slli rd, rs1
The right shift type is encoded in bit 30. SLLI is a logical left shift (zeros are shifted into the lower bits); SRLI is a logical right shift (zeros are shifted into the upper bits); and SRAI is an arithmetic right shift (the original sign bit is copied into the vacated upper bits).
slti rd, rs1, imm12
SLTI (set less than immediate) places the value 1 in register rd if register rs1 is less than the sign-extended immediate when both are treated as signed numbers, else 0 is written to rd.
sltiu rd, rs1, imm12
SLTIU is similar but compares the values as unsigned numbers (i.e., the immediate is first sign-extended to XLEN bits then treated as an unsigned number).
Note, SLTIU rd, rs1, 1 sets rd to 1 if rs1 equals zero, otherwise sets rd to 0 (assembler pseudoinstruction SEQZ rd, rs).
xori rd, rs1, imm12
Note, XORI rd, rs1, -1 performs a bitwise logical inversion of register rs1 (assembler pseudoinstruction NOT rd, rs).
andi rd, rs1, imm12
ANDI, ORI, XORI are logical operations that perform bitwise AND, OR, and XOR on register rs1 and the sign-extended 12-bit immediate and place the result in rd.
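The SEQZ and NOT idioms above both fall out of the immediate being sign-extended before use. A small C model for RV32 makes this visible; the function names are assumptions for this sketch, and the immediate is passed as its raw 12-bit field.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a raw 12-bit immediate field. */
static inline int32_t sext12(uint32_t imm12) {
    imm12 &= 0xFFFu;
    return (imm12 & 0x800u) ? (int32_t)imm12 - 0x1000 : (int32_t)imm12;
}

/* Models SLTIU: sign-extend first, then compare as unsigned. */
static inline uint32_t sltiu(uint32_t rs1, uint32_t imm12) {
    return rs1 < (uint32_t)sext12(imm12) ? 1u : 0u;
}

/* Models XORI: XOR with the sign-extended immediate. */
static inline uint32_t xori(uint32_t rs1, uint32_t imm12) {
    return rs1 ^ (uint32_t)sext12(imm12);
}
```

sltiu(x, 1) is 1 only for x == 0 (SEQZ), and xori(x, 0xFFF), i.e. immediate -1, flips every bit (NOT).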

rv32 / integer-computational-instructions

RV32I Base Integer Instruction Set, Version 2.1 / Integer Computational Instructions
add rd, rs1, rs2
Overflow check for signed addition:
        add  t0, t1, t2
        slti t3, t2, 0
        slt  t4, t0, t1
        bne  t3, t4, overflow
In RV64I, checks of 32-bit signed additions can be optimized further by comparing the results of ADD and ADDW on the operands.
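The four-instruction overflow check can be followed step by step in C. This is a sketch rather than production code: the function name add_overflows is made up, and the wrapping add relies on the unsigned-to-signed conversion behaving as two's complement, which holds for GCC and Clang but is implementation-defined in ISO C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the add/slti/slt/bne sequence on RV32.
 * The sum is smaller than t1 exactly when t2 is negative, unless the
 * addition overflowed. */
static inline bool add_overflows(int32_t t1, int32_t t2) {
    /* add t0, t1, t2 (wrapping; done in unsigned arithmetic to avoid UB) */
    int32_t t0 = (int32_t)((uint32_t)t1 + (uint32_t)t2);
    bool t3 = t2 < 0;   /* slti t3, t2, 0 */
    bool t4 = t0 < t1;  /* slt  t4, t0, t1 */
    return t3 != t4;    /* bne  t3, t4, overflow */
}
```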

rv32 / integer-register-register-operations

RV32I Base Integer Instruction Set, Version 2.1 / Integer Computational Instructions
sub rd, rs1, rs2
SUB performs the subtraction of rs2 from rs1.
sll rd, rs1, rs2
SLL, SRL, and SRA perform logical left, logical right, and arithmetic right shifts on the value in register rs1 by the shift amount held in the lower 5 bits of register rs2.
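A portable C model of the three register shifts on RV32 might look like the following. The function names mirror the mnemonics but are otherwise assumptions; the arithmetic shift is built from unsigned operations because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Only the lower 5 bits of rs2 are used as the shift amount. */
static inline uint32_t sll(uint32_t rs1, uint32_t rs2) {
    return rs1 << (rs2 & 31u);
}

static inline uint32_t srl(uint32_t rs1, uint32_t rs2) {
    return rs1 >> (rs2 & 31u);
}

static inline uint32_t sra(uint32_t rs1, uint32_t rs2) {
    uint32_t shamt = rs2 & 31u;
    uint32_t logical = rs1 >> shamt;
    /* Copy the original sign bit into the vacated upper bits. */
    uint32_t fill = (rs1 & 0x80000000u) ? ~(0xFFFFFFFFu >> shamt) : 0u;
    return logical | fill;
}
```

Note that a shift amount of 33 behaves like a shift by 1, since bits above the low five are ignored.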
slt rd, rs1, rs2
SLT and SLTU perform signed and unsigned compares respectively, writing 1 to rd if rs1 < rs2, 0 otherwise.
sltu rd, rs1, rs2
Note, SLTU rd, x0, rs2 sets rd to 1 if rs2 is not equal to zero, otherwise sets rd to zero (assembler pseudoinstruction SNEZ rd, rs).
and rd, rs1, rs2
AND, OR, and XOR perform bitwise logical operations.

rv32 / load-and-store-instructions

RV32I Base Integer Instruction Set, Version 2.1 / Load and Store Instructions
lb rd, rs1, imm12
LB and LBU are defined analogously for 8-bit values.
lh rd, rs1, imm12
LH loads a 16-bit value from memory, then sign-extends to 32 bits before storing in rd.
lw rd, rs1, imm12
The LW instruction loads a 32-bit value from memory into rd.
lhu rd, rs1, imm12
LHU loads a 16-bit value from memory but then zero-extends to 32 bits before storing in rd.
sw imm12hi, rs1, rs2, imm12lo
The SW, SH, and SB instructions store 32-bit, 16-bit, and 8-bit values from the low bits of register rs2 to memory.
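The difference between the sign-extending and zero-extending loads is easy to see in a small C model. The sketch below assumes a little-endian byte buffer standing in for memory; the function names and the demo_mem array are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical sample memory, little-endian: bytes 80 FF 34 12. */
static const uint8_t demo_mem[4] = {0x80, 0xFF, 0x34, 0x12};

static inline uint32_t lbu(const uint8_t *mem, uint32_t addr) {
    return mem[addr];
}

static inline uint32_t lb(const uint8_t *mem, uint32_t addr) {
    uint32_t v = lbu(mem, addr);
    return (v & 0x80u) ? (v | 0xFFFFFF00u) : v;   /* sign-extend bit 7 */
}

static inline uint32_t lhu(const uint8_t *mem, uint32_t addr) {
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
}

static inline uint32_t lh(const uint8_t *mem, uint32_t addr) {
    uint32_t v = lhu(mem, addr);
    return (v & 0x8000u) ? (v | 0xFFFF0000u) : v; /* sign-extend bit 15 */
}

static inline uint32_t lw(const uint8_t *mem, uint32_t addr) {
    return lhu(mem, addr) | (lhu(mem, addr + 2) << 16);
}
```

Loading the halfword at offset 0 gives 0xFFFFFF80 with LH but 0x0000FF80 with LHU.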

rv32 / sec:fence

RV32I Base Integer Instruction Set, Version 2.1 / Memory Ordering Instructions
fence rs1, rd
The FENCE instruction is used to order device I/O and memory accesses as viewed by other RISC-V harts and external devices or coprocessors.
Informally, no other RISC-V hart or external device can observe any operation in the successor set following a FENCE before any operation in the predecessor set preceding the FENCE.
Instruction-set extensions might also describe new I/O instructions that will also be ordered using the I and O bits in a FENCE.
The fence mode field fm defines the semantics of the FENCE.
A FENCE with fm=0000 orders all memory operations in its predecessor set before all memory operations in its successor set.
The optional FENCE.TSO instruction is encoded as a FENCE instruction with fm=1000, predecessor=RW, and successor=RW.
FENCE.TSO orders all load operations in its predecessor set before all memory operations in its successor set, and all store operations in its predecessor set before all store operations in its successor set.
This leaves non-AMO store operations in the FENCE.TSO’s predecessor set unordered with non-AMO loads in its successor set.
The FENCE.TSO encoding was added as an optional extension to the original base FENCE instruction encoding.
The base definition requires that implementations ignore any set bits and treat the FENCE as global, and so this is a backwards-compatible extension.
The unused fields in the FENCE instructions, rs1 and rd, are reserved for finer-grain fences in future extensions.
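From C, fences are usually reached through C11 atomics rather than inline assembly. The sketch below shows a release/acquire message-passing pattern; the variable and function names are made up, and the claim that a RISC-V compiler lowers these calls to FENCE instructions (commonly fence rw,w for release and fence r,rw for acquire) is an assumption worth checking against the generated assembly.

```c
#include <stdatomic.h>
#include <stdint.h>

static uint32_t payload;   /* ordinary data being handed over */
static atomic_int ready;   /* flag set once the data is valid */

/* Producer: write the data, fence, then raise the flag. */
static inline void publish(uint32_t value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the flag has been observed. */
static inline int try_consume(uint32_t *out) {
    if (!atomic_load_explicit(&ready, memory_order_relaxed)) {
        return 0;
    }
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

The release fence keeps the payload write in the predecessor set ahead of the flag store, and the acquire fence keeps the payload read behind the flag load.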

rv32 / rv32

RV32I Base Integer Instruction Set, Version 2.1 /
ecall
RV32I contains 40 unique instructions, though a simple implementation might cover the ECALL/EBREAK instructions with a single SYSTEM hardware instruction that always traps and might be able to implement the FENCE instruction as a NOP, reducing base instruction count to 38 total

rv32 / environment-call-and-breakpoints

RV32I Base Integer Instruction Set, Version 2.1 / Environment Call and Breakpoints
ebreak
The EBREAK instruction is used to return control to a debugging environment.
EBREAK was primarily designed to be used by a debugger to cause execution to stop and fall back into the debugger
EBREAK is also used by the standard gcc compiler to mark code paths that should not be executed.
Another use of EBREAK is to support “semihosting”, where the execution environment includes a debugger that can provide services over an alternate system call interface built around the EBREAK instruction
Because the RISC-V base ISA does not provide more than one EBREAK instruction, RISC-V semihosting uses a special sequence of instructions to distinguish a semihosting EBREAK from a debugger inserted EBREAK

rv64

integer register immediate instructions integer register register operations load and store instructions

rv64 / integer-register-immediate-instructions

RV64I Base Integer Instruction Set, Version 2.1 / Integer Computational Instructions
addiw rd, rs1, imm12
ADDIW is an RV64I instruction that adds the sign-extended 12-bit immediate to register rs1 and produces the proper sign-extension of a 32-bit result in rd
Note, ADDIW rd, rs1, 0 writes the sign-extension of the lower 32 bits of register rs1 into register rd (assembler pseudoinstruction SEXT.W).
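The effect of ADDIW, including the SEXT.W special case, can be sketched in C as follows. The names sext32 and addiw are illustrative assumptions, and the immediate is expected to be in the 12-bit range -2048 to 2047.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the lower 32 bits to 64 bits. */
static inline uint64_t sext32(uint64_t x) {
    uint64_t low = x & UINT64_C(0xFFFFFFFF);
    return (low & UINT64_C(0x80000000)) ? (low | UINT64_C(0xFFFFFFFF00000000)) : low;
}

/* Models ADDIW on RV64: add, keep the low 32 bits, sign-extend the result. */
static inline uint64_t addiw(uint64_t rs1, int32_t imm) {
    return sext32(rs1 + (uint64_t)(int64_t)imm);
}
```

addiw(x, 0) discards the upper half of x and replaces it with copies of bit 31, which is what SEXT.W does.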
slliw rd, rs1
SLLIW, SRLIW, and SRAIW are RV64I-only instructions that are analogously defined but operate on 32-bit values and produce signed 32-bit results.
SLLIW, SRLIW, and SRAIW encodings with imm[5] ≠ 0 are reserved.
Previously, SLLIW, SRLIW, and SRAIW with imm[5] ≠ 0 were defined to cause illegal instruction exceptions, whereas now they are marked as reserved.

rv64 / integer-register-register-operations

RV64I Base Integer Instruction Set, Version 2.1 / Integer Computational Instructions
addw rd, rs1, rs2
ADDW and SUBW are RV64I-only instructions that are defined analogously to ADD and SUB but operate on 32-bit values and produce signed 32-bit results
sllw rd, rs1, rs2
SLLW, SRLW, and SRAW are RV64I-only instructions that are analogously defined but operate on 32-bit values and produce signed 32-bit results

rv64 / load-and-store-instructions

RV64I Base Integer Instruction Set, Version 2.1 / Load and Store Instructions
ld rd, rs1, imm12
The LD instruction loads a 64-bit value from memory into register rd for RV64I.
lwu rd, rs1, imm12
The LWU instruction, on the other hand, zero-extends the 32-bit value from memory for RV64I
sd imm12hi, rs1, rs2, imm12lo
The SD, SW, SH, and SB instructions store 64-bit, 32-bit, 16-bit, and 8-bit values from the low bits of register rs2 to memory respectively.

rv128

rv128 / rv128

RV128I Base Integer Instruction Set, Version 1.7 /
fmv.x.q rd, rs1
The floating-point instruction set is unchanged, although the 128-bit Q floating-point extension can now support FMV.X.Q and FMV.Q.X instructions, together with additional FCVT instructions to and from the T (128-bit) integer format.

a

sec:amo sec:lrsc

a / sec:amo

"A" Standard Extension for Atomic Instructions, Version 2.1 / Atomic Memory Operations
amoadd.w rd, rs1, rs2
amoxor.w rd, rs1, rs2
amoor.w rd, rs1, rs2
amoand.w rd, rs1, rs2
amomin.w rd, rs1, rs2
amomax.w rd, rs1, rs2
amominu.w rd, rs1, rs2
amomaxu.w rd, rs1, rs2
amoswap.w rd, rs1, rs2
amoadd.d rd, rs1, rs2
amoxor.d rd, rs1, rs2
amoor.d rd, rs1, rs2
amoand.d rd, rs1, rs2
amomin.d rd, rs1, rs2
amomax.d rd, rs1, rs2
amominu.d rd, rs1, rs2
amomaxu.d rd, rs1, rs2
amoswap.d rd, rs1, rs2
These AMO instructions atomically load a data value from the address in rs1, place the value into register rd, apply a binary operator to the loaded value and the original value in rs2, then store the result back to the address in rs1.
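The load, operate, store-back behaviour of an AMO corresponds closely to the fetch-and-op functions in C11 <stdatomic.h>. The wrappers below are a sketch with made-up names; whether a given compiler emits amoadd.w, amoor.w or amoswap.w for them depends on the target having the A extension, so treat that mapping as an assumption.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Each wrapper returns the value that was in memory before the update,
 * which plays the role of rd in the AMO description. */
static inline int32_t amoadd_w_model(_Atomic int32_t *addr, int32_t value) {
    return atomic_fetch_add_explicit(addr, value, memory_order_relaxed);
}

static inline int32_t amoor_w_model(_Atomic int32_t *addr, int32_t value) {
    return atomic_fetch_or_explicit(addr, value, memory_order_relaxed);
}

static inline int32_t amoswap_w_model(_Atomic int32_t *addr, int32_t value) {
    return atomic_exchange_explicit(addr, value, memory_order_relaxed);
}
```

Relaxed ordering is used here for simplicity; stronger orderings can be requested through the memory_order argument.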

a / sec:lrsc

"A" Standard Extension for Atomic Instructions, Version 2.1 / Load-Reserved/Store-Conditional Instructions
lr.w rd, rs1
LR.W loads a word from the address in rs1, places the sign-extended value in rd, and registers a reservation set: a set of bytes that subsumes the bytes in the addressed word.
sc.w rd, rs1, rs2
SC.W conditionally writes a word in rs2 to the address in rs1: the SC.W succeeds only if the reservation is still valid and the reservation set contains the bytes being written.
If the SC.W succeeds, the instruction writes the word in rs2 to memory, and it writes zero to rd.
If the SC.W fails, the instruction does not write to memory, and it writes a nonzero value to rd.
Regardless of success or failure, executing an SC.W instruction invalidates any reservation held by this hart.
lr.d rd, rs1
LR.D and SC.D act analogously on doublewords and are only available on RV64. For RV64, LR.W and SC.W sign-extend the value placed in rd.
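LR/SC pairs are typically used to build read-modify-write operations that have no dedicated AMO. In C that pattern is written as a compare-exchange retry loop; the sketch below implements an atomic multiply with a made-up name, and the expectation that a RISC-V compiler turns the loop into an LR.W/SC.W sequence is an assumption, not something the C standard promises.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically multiply *addr by factor and return
 * the previous value. */
static inline uint32_t atomic_fetch_mul(_Atomic uint32_t *addr, uint32_t factor) {
    uint32_t old = atomic_load_explicit(addr, memory_order_relaxed);
    /* A failed exchange stores the current memory value into old, so the
     * next attempt recomputes old * factor from fresh data. The weak form
     * may also fail spuriously, much like an SC that loses its reservation. */
    while (!atomic_compare_exchange_weak_explicit(addr, &old, old * factor,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* retry */
    }
    return old;
}
```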

c

integer register immediate operations register based loads and stores control transfer instructions integer constant generation instructions integer register register operations
stack pointer based loads and stores compressed

c / integer-register-immediate-operations

"C" Standard Extension for Compressed Instructions, Version 2.0 / Integer Computational Instructions
c.addi4spn
In the standard RISC-V calling convention, the stack pointer sp is always 16-byte aligned. C.ADDI4SPN is a CIW-format instruction that adds a zero-extended non-zero immediate, scaled by 4, to the stack pointer, x2, and writes the result to rd′.
C.ADDI4SPN is only valid when nzuimm ≠ 0.
c.addi
C.ADDI adds the non-zero sign-extended 6-bit immediate to the value in register rd then writes the result to rd.
C.ADDI expands into addi rd, rd, nzimm[5:0].
C.ADDI is only valid when rd ≠ x0 and nzimm ≠ 0.
c.srli
C.SRLI is a CB-format instruction that performs a logical right shift of the value in register rd′ then writes the result to rd′.
For RV128C, a shift amount of zero is used to encode a shift of 64. Furthermore, the shift amount is sign-extended for RV128C, and so the legal shift amounts are 1–31, 64, and 96–127. C.SRLI expands into srli rd′, rd′, shamt[5:0], except for RV128C with shamt=0, which expands to srli rd′, rd′, 64.
c.srai
C.SRAI is defined analogously to C.SRLI, but instead performs an arithmetic right shift.
C.SRAI expands to srai rd′, rd′, shamt[5:0].
c.andi
C.ANDI is a CB-format instruction that computes the bitwise AND of the value in register rd′ and the sign-extended 6-bit immediate, then writes the result to rd′.
C.ANDI expands to andi rd′, rd′, imm[5:0].
c.slli
C.SLLI is a CI-format instruction that performs a logical left shift of the value in register rd then writes the result to rd.
For RV128C, a shift amount of zero is used to encode a shift of 64. C.SLLI expands into slli rd, rd, shamt[5:0], except for RV128C with shamt=0, which expands to slli rd, rd, 64.

c / register-based-loads-and-stores

"C" Standard Extension for Compressed Instructions, Version 2.0 / Load and Store Instructions
c.fld
C.FLD is an RV32DC/RV64DC-only instruction that loads a double-precision floating-point value from memory into floating-point register rd′.
c.lw
C.LW loads a 32-bit value from memory into register rd′.
c.flw
C.FLW is an RV32FC-only instruction that loads a single-precision floating-point value from memory into floating-point register rd′.
c.fsd
C.FSD is an RV32DC/RV64DC-only instruction that stores a double-precision floating-point value in floating-point register rs2′ to memory.
c.sw
C.SW stores a 32-bit value in register rs2′ to memory.
c.fsw
C.FSW is an RV32FC-only instruction that stores a single-precision floating-point value in floating-point register rs2′ to memory.

c / control-transfer-instructions

"C" Standard Extension for Compressed Instructions, Version 2.0 / Control Transfer Instructions
c.jal
C.JAL is an RV32C-only instruction that performs the same operation as C.J, but additionally writes the address of the instruction following the jump (pc+2) to the link register, x1.
C.JAL expands to jal x1, offset[11:1].
c.j
C.J performs an unconditional control transfer.
The offset is sign-extended and added to the pc to form the jump target address, so C.J can target a ±2 KiB range.
C.J expands to jal x0, offset[11:1].
c.beqz
C.BEQZ performs conditional control transfers.
C.BEQZ takes the branch if the value in register rs1′ is zero.
c.bnez
C.BNEZ is defined analogously, but it takes the branch if rs1′ contains a nonzero value.

c / integer-constant-generation-instructions

"C" Standard Extension for Compressed Instructions, Version 2.0 / Integer Computational Instructions
c.li
C.LI loads the sign-extended 6-bit immediate, imm, into register rd.
C.LI expands into addi rd, x0, imm[5:0].
C.LI is only valid when rd ≠ x0; the code points with rd = x0 encode HINTs.
c.lui rd=2
C.LUI loads the non-zero 6-bit immediate field into bits 17–12 of the destination register, clears the bottom 12 bits, and sign-extends bit 17 into all higher bits of the destination.
C.LUI expands into lui rd, nzimm[17:12].
C.LUI is only valid when rd ≠ {x0, x2}.
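To make the C.LUI immediate handling concrete, here is a small C model of the value it writes on RV32. The function name c_lui_value is an assumption for this sketch, and the argument is the raw 6-bit field nzimm[17:12].

```c
#include <stdint.h>

/* Hypothetical model of the register value produced by C.LUI on RV32. */
static inline uint32_t c_lui_value(uint32_t imm6) {
    imm6 &= 0x3Fu;
    /* Bit 5 of the field is bit 17 of the result and acts as the sign. */
    int32_t signed_imm = (imm6 & 0x20u) ? (int32_t)imm6 - 64 : (int32_t)imm6;
    /* Multiply instead of shifting: left-shifting a negative value is
     * undefined in C, while unsigned multiplication wraps as intended. */
    return (uint32_t)signed_imm * 4096u;
}
```

A field of 0x20 therefore yields 0xFFFE0000: bit 17 is set and copied into every higher bit, while the bottom 12 bits stay clear.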

c / integer-register-register-operations

"C" Standard Extension for Compressed Instructions, Version 2.0 / Integer Computational Instructions
c.sub
C.SUB subtracts the value in register rs2′ from the value in register rd′, then writes the result to register rd′. C.SUB expands into sub rd′, rd′, rs2′.
c.xor
C.XOR computes the bitwise XOR of the values in registers rd′ and rs2′, then writes the result to register rd′. C.XOR expands into xor rd′, rd′, rs2′.
c.or
C.OR computes the bitwise OR of the values in registers rd′ and rs2′, then writes the result to register rd′. C.OR expands into or rd′, rd′, rs2′.
c.and
C.AND computes the bitwise AND of the values in registers rd′ and rs2′, then writes the result to register rd′. C.AND expands into and rd′, rd′, rs2′.
c.subw
C.SUBW is an RV64C/RV128C-only instruction that subtracts the value in register rs2′ from the value in register rd′, then sign-extends the lower 32 bits of the difference before writing the result to register rd′. C.SUBW expands into subw rd′, rd′, rs2′.
c.addw
C.ADDW is an RV64C/RV128C-only instruction that adds the values in registers rd′ and rs2′, then sign-extends the lower 32 bits of the sum before writing the result to register rd′. C.ADDW expands into addw rd′, rd′, rs2′.
c.mv !rs2
C.MV copies the value in register rs2 into register rd.
C.MV expands into add rd, x0, rs2.
C.MV is only valid when rs2 ≠ x0; the code points with rs2 = x0 correspond to the C.JR instruction.
C.MV expands to a different instruction than the canonical MV pseudoinstruction, which instead uses ADDI.
Implementations that handle MV specially, e.g. using register-renaming hardware, may find it more convenient to expand C.MV to MV instead of ADD, at slight additional hardware cost.
c.add !rs1, !rs2=c.jalr
C.ADD adds the values in registers rd and rs2 and writes the result to register rd.
C.ADD expands into add rd, rd, rs2.
C.ADD is only valid when rs2 ≠ x0; the code points with rs2 = x0 correspond to the C.JALR and C.EBREAK instructions.

c / stack-pointer-based-loads-and-stores

"C" Standard Extension for Compressed Instructions, Version 2.0 / Load and Store Instructions
c.fldsp
C.FLDSP is an RV32DC/RV64DC-only instruction that loads a double-precision floating-point value from memory into floating-point register rd
c.lwsp
C.LWSP loads a 32-bit value from memory into register rd
C.LWSP is only valid when rd ≠ x0; the code points with rd = x0 are reserved.
c.flwsp
C.FLWSP is an RV32FC-only instruction that loads a single-precision floating-point value from memory into floating-point register rd
c.fsdsp
C.FSDSP is an RV32DC/RV64DC-only instruction that stores a double-precision floating-point value in floating-point register rs2 to memory
c.swsp
C.SWSP stores a 32-bit value in register rs2 to memory
c.fswsp
C.FSWSP is an RV32FC-only instruction that stores a single-precision floating-point value in floating-point register rs2 to memory

c / compressed

"C" Standard Extension for Compressed Instructions, Version 2.0 /
@c.nop
@c.addi16sp
@c.jr
@c.jalr
@c.ebreak
@c.ld
@c.sd
@c.addiw
@c.ldsp
@c.sdsp
@c.lq
@c.sq
@c.lqsp
@c.sqsp

d

sec:single float compute double precision floating point conversion and move instructions single precision floating point compare instructions double precision floating point classify instruction fld fsd

d / sec:single-float-compute

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Computational Instructions
fadd.d rd, rs1, rs2
The double-precision floating-point computational instructions are defined analogously to their single-precision counterparts, but operate on double-precision operands and produce double-precision results.
FADD.D and FMUL.D perform double-precision floating-point addition and multiplication respectively, between rs1 and rs2.
fsub.d rd, rs1, rs2
FSUB.D performs the double-precision floating-point subtraction of rs2 from rs1.
fdiv.d rd, rs1, rs2
FDIV.D performs the double-precision floating-point division of rs1 by rs2.
fmin.d rd, rs1, rs2
Floating-point minimum-number and maximum-number instructions FMIN.D and FMAX.D write, respectively, the smaller or larger of rs1 and rs2 to rd.
Note that in version 2.2 of the F extension, the FMIN.S and FMAX.S instructions were amended to implement the proposed IEEE 754-201x minimumNumber and maximumNumber operations, rather than the IEEE 754-2008 minNum and maxNum operations.
fsqrt.d rd, rs1
FSQRT.D computes the square root of rs1.
fmadd.d rd, rs1, rs2, rs3
FMADD.D multiplies the values in rs1 and rs2, adds the value in rs3, and writes the final result to rd.
FMADD.D computes (rs1 × rs2)+rs3.
fmsub.d rd, rs1, rs2, rs3
FMSUB.D multiplies the values in rs1 and rs2, subtracts the value in rs3, and writes the final result to rd.
FMSUB.D computes (rs1 × rs2)-rs3.
fnmsub.d rd, rs1, rs2, rs3
FNMSUB.D multiplies the values in rs1 and rs2, negates the product, adds the value in rs3, and writes the final result to rd.
FNMSUB.D computes -(rs1 × rs2)+rs3.
fnmadd.d rd, rs1, rs2, rs3
FNMADD.D multiplies the values in rs1 and rs2, negates the product, subtracts the value in rs3, and writes the final result to rd.
FNMADD.D computes -(rs1 × rs2)-rs3.

d / double-precision-floating-point-conversion-and-move-instructions

"D" Standard Extension for Double-Precision Floating-Point, Version 2.2 / Double-Precision Floating-Point Conversion and Move Instructions
fsgnj.d rd, rs1, rs2
Floating-point to floating-point sign-injection instructions, FSGNJ.D, FSGNJN.D, and FSGNJX.D are defined analogously to the single-precision sign-injection instruction.
fcvt.s.d rd, rs1
The double-precision to single-precision and single-precision to double-precision conversion instructions, FCVT.S.D and FCVT.D.S, are encoded in the OP-FP major opcode space and both the source and destination are floating-point registers
FCVT.S.D rounds according to the RM field; FCVT.D.S will never round.
fcvt.w.d rd, rs1
FCVT.W.D or FCVT.L.D converts a double-precision floating-point number in floating-point register rs1 to a signed 32-bit or 64-bit integer, respectively, in integer register rd
fcvt.wu.d rd, rs1
FCVT.WU.D, FCVT.LU.D, FCVT.D.WU, and FCVT.D.LU variants convert to or from unsigned integer values
fmv.x.d rd, rs1
FMV.X.D moves the double-precision value in floating-point register rs1 to a representation in IEEE 754-2008 standard encoding in integer register rd
FMV.X.D and FMV.D.X do not modify the bits being transferred; in particular, the payloads of non-canonical NaNs are preserved.
fcvt.d.w rd, rs1
FCVT.D.W or FCVT.D.L converts a 32-bit or 64-bit signed integer, respectively, in integer register rs1 into a double-precision floating-point number in floating-point register rd
Note FCVT.D.W[U] always produces an exact result and is unaffected by rounding mode.
fcvt.d.l rd, rs1
FCVT.L[U].D and FCVT.D.L[U] are RV64-only instructions
fmv.d.x rd, rs1
FMV.D.X moves the double-precision value encoded in IEEE 754-2008 standard encoding from the integer register rs1 to the floating-point register rd .

d / single-precision-floating-point-compare-instructions

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Compare Instructions
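flt.d rd, rs1, rs2
The double-precision floating-point compare instructions are defined analogously to their single-precision counterparts, but operate on double-precision operands.
FLT.D and FLE.D perform what the IEEE 754-2008 standard refers to as signaling comparisons: that is, they set the invalid operation exception flag if either input is NaN.
feq.d rd, rs1, rs2
Floating-point compare instructions (FEQ.D, FLT.D, FLE.D) perform the specified comparison between floating-point registers (rs1 = rs2, rs1 < rs2, rs1 ≤ rs2), writing 1 to the integer register rd if the condition holds, and 0 otherwise.
FEQ.D performs a quiet comparison: it only sets the invalid operation exception flag if either input is a signaling NaN.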

d / double-precision-floating-point-classify-instruction

"D" Standard Extension for Double-Precision Floating-Point, Version 2.2 / Double-Precision Floating-Point Classify Instruction
fclass.d rd, rs1
The double-precision floating-point classify instruction, FCLASS.D, is defined analogously to its single-precision counterpart, but operates on double-precision operands.

d / fld_fsd

"D" Standard Extension for Double-Precision Floating-Point, Version 2.2 / Double-Precision Load and Store Instructions
fld rd, rs1, imm12
The FLD instruction loads a double-precision floating-point value from memory into floating-point register rd
FLD and FSD are only guaranteed to execute atomically if the effective address is naturally aligned and XLEN ≥ 64.
FLD and FSD do not modify the bits being transferred; in particular, the payloads of non-canonical NaNs are preserved.
fsd imm12hi, rs1, rs2, imm12lo
FSD stores a double-precision value from the floating-point registers to memory

f

single precision floating point conversion and move instructions sec:single float compute single precision floating point compare instructions single precision floating point classify instruction single precision load and store instructions

f / single-precision-floating-point-conversion-and-move-instructions

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 /
fsgnj.s rd, rs1, rs2
Floating-point to floating-point sign-injection instructions, FSGNJ.S, FSGNJN.S, and FSGNJX.S, produce a result that takes all bits except the sign bit from rs1.
For FSGNJ, the result’s sign bit is rs2’s sign bit; for FSGNJN, the result’s sign bit is the opposite of rs2’s sign bit; and for FSGNJX, the sign bit is the XOR of the sign bits of rs1 and rs2.
Note, FSGNJ.S rx, ry, ry moves ry to rx (assembler pseudoinstruction FMV.S rx, ry); FSGNJN.S rx, ry, ry moves the negation of ry to rx (assembler pseudoinstruction FNEG.S rx, ry); and FSGNJX.S rx, ry, ry moves the absolute value of ry to rx (assembler pseudoinstruction FABS.S rx, ry).
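Sign injection is a pure bit operation, so it can be modelled in C by working on the raw 32-bit pattern of a float. The sketch below assumes float is IEEE 754 binary32; all function names and the SIGN_BIT macro are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_BIT 0x80000000u

/* Reinterpret a float as its bit pattern and back, without aliasing issues. */
static inline uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static inline float bits_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* All bits except the sign come from rs1; the sign comes from rs2. */
static inline float fsgnj_s(float rs1, float rs2) {
    return bits_float((float_bits(rs1) & ~SIGN_BIT) | (float_bits(rs2) & SIGN_BIT));
}

/* As above, but with the opposite of rs2's sign. */
static inline float fsgnjn_s(float rs1, float rs2) {
    return bits_float((float_bits(rs1) & ~SIGN_BIT) | (~float_bits(rs2) & SIGN_BIT));
}

/* The sign is the XOR of both signs. */
static inline float fsgnjx_s(float rs1, float rs2) {
    return bits_float(float_bits(rs1) ^ (float_bits(rs2) & SIGN_BIT));
}
```

Passing the same value for both operands reproduces the pseudoinstructions: fsgnj_s(y, y) is a move, fsgnjn_s(y, y) negates, and fsgnjx_s(y, y) gives the absolute value.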
fcvt.w.s rd, rs1
FCVT.W.S or FCVT.L.S converts a floating-point number in floating-point register rs1 to a signed 32-bit or 64-bit integer, respectively, in integer register rd
fcvt.wu.s rd, rs1
FCVT.WU.S, FCVT.LU.S, FCVT.S.WU, and FCVT.S.LU variants convert to or from unsigned integer values.
fcvt.l.s rd, rs1
fcvt.lu.s rd, rs1
fmv.x.w rd, rs1
FMV.X.W moves the single-precision value in floating-point register rs1, represented in IEEE 754-2008 encoding, to the lower 32 bits of integer register rd.
fcvt.s.w rd, rs1
FCVT.S.W or FCVT.S.L converts a 32-bit or 64-bit signed integer, respectively, in integer register rs1 into a floating-point number in floating-point register rd
A floating-point register can be initialized to floating-point positive zero using FCVT.S.W rd, x0, which will never set any exception flags.
fcvt.s.l rd, rs1
FCVT.L[U].S and FCVT.S.L[U] are RV64-only instructions
fmv.w.x rd, rs1
FMV.W.X moves the single-precision value encoded in IEEE 754-2008 standard encoding from the lower 32 bits of integer register rs1 to the floating-point register rd
The FMV.W.X and FMV.X.W instructions were previously called FMV.S.X and FMV.X.S

f / sec:single-float-compute

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Computational Instructions
fadd.s rd, rs1, rs2
FADD.S and FMUL.S perform single-precision floating-point addition and multiplication respectively, between rs1 and rs2
fsub.s rd, rs1, rs2
FSUB.S performs the single-precision floating-point subtraction of rs2 from rs1
fdiv.s rd, rs1, rs2
FDIV.S performs the single-precision floating-point division of rs1 by rs2
fmin.s rd, rs1, rs2
Floating-point minimum-number and maximum-number instructions FMIN.S and FMAX.S write, respectively, the smaller or larger of rs1 and rs2 to rd
Note that in version 2.2 of the F extension, the FMIN.S and FMAX.S instructions were amended to implement the proposed IEEE 754-201x minimumNumber and maximumNumber operations, rather than the IEEE 754-2008 minNum and maxNum operations
fsqrt.s rd, rs1
FSQRT.S computes the square root of rs1
fmadd.s rd, rs1, rs2, rs3
FMADD.S multiplies the values in rs1 and rs2 , adds the value in rs3 , and writes the final result to rd
FMADD.S computes (rs1 × rs2)+rs3 .
fmsub.s rd, rs1, rs2, rs3
FMSUB.S multiplies the values in rs1 and rs2 , subtracts the value in rs3 , and writes the final result to rd
FMSUB.S computes (rs1 × rs2)-rs3 .
fnmsub.s rd, rs1, rs2, rs3
FNMSUB.S multiplies the values in rs1 and rs2 , negates the product, adds the value in rs3 , and writes the final result to rd
FNMSUB.S computes -(rs1 × rs2)+rs3 .
fnmadd.s rd, rs1, rs2, rs3
FNMADD.S multiplies the values in rs1 and rs2 , negates the product, subtracts the value in rs3 , and writes the final result to rd
FNMADD.S computes -(rs1 × rs2)-rs3 .
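The four fused multiply-add variants differ only in the signs applied to the product and to rs3. The sketch below writes the sign conventions out in C. It is an approximation: the hardware instructions round once, while this model computes in double and then rounds to float, which can differ in rare cases. The function names are invented for the example.

```c
/* Sign conventions of the fused multiply-add family, modelled in C.
 * Assumption: double arithmetic stands in for the single fused rounding. */

float fmadd_s(float rs1, float rs2, float rs3) {
    return (float)((double)rs1 * (double)rs2 + (double)rs3);    /*  (rs1*rs2) + rs3 */
}

float fmsub_s(float rs1, float rs2, float rs3) {
    return (float)((double)rs1 * (double)rs2 - (double)rs3);    /*  (rs1*rs2) - rs3 */
}

float fnmsub_s(float rs1, float rs2, float rs3) {
    return (float)(-((double)rs1 * (double)rs2) + (double)rs3); /* -(rs1*rs2) + rs3 */
}

float fnmadd_s(float rs1, float rs2, float rs3) {
    return (float)(-((double)rs1 * (double)rs2) - (double)rs3); /* -(rs1*rs2) - rs3 */
}
```

Note that FNMSUB.S adds rs3 and FNMADD.S subtracts it, which is easy to get backwards when reading the mnemonics.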

f / single-precision-floating-point-compare-instructions

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Compare Instructions
flt.s rd, rs1, rs2
FLT.S and FLE.S perform what the IEEE 754-2008 standard refers to as signaling comparisons: that is, they set the invalid operation exception flag if either input is NaN
feq.s rd, rs1, rs2
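Floating-point compare instructions (FEQ.S, FLT.S, FLE.S) perform the specified comparison between floating-point registers (rs1 = rs2, rs1 < rs2, rs1 ≤ rs2), writing 1 to the integer register rd if the condition holds, and 0 otherwise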
FEQ.S performs a quiet comparison: it only sets the invalid operation exception flag if either input is a signaling NaN
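The difference between the quiet comparison (FEQ.S) and the signaling comparisons (FLT.S, FLE.S) can be modelled in C on raw bit patterns. This is a sketch under assumptions: the fcmp_result struct and its nv field (standing in for the invalid operation flag) are hypothetical, and a signaling NaN is identified by a clear most-significant fraction bit.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: rd is the value written to the integer
 * register, nv models the invalid operation exception flag. */
typedef struct { uint32_t rd; bool nv; } fcmp_result;

static bool is_nan(uint32_t b) {
    return (b & 0x7F800000u) == 0x7F800000u && (b & 0x007FFFFFu) != 0;
}

static bool is_snan(uint32_t b) {
    return is_nan(b) && (b & 0x00400000u) == 0;  /* quiet bit clear */
}

static float as_float(uint32_t b) {
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* FEQ.S: quiet comparison, flags only signaling NaNs. */
fcmp_result feq_s(uint32_t rs1, uint32_t rs2) {
    fcmp_result r = { 0, is_snan(rs1) || is_snan(rs2) };
    if (!is_nan(rs1) && !is_nan(rs2)) r.rd = as_float(rs1) == as_float(rs2);
    return r;
}

/* FLT.S: signaling comparison, flags any NaN input. */
fcmp_result flt_s(uint32_t rs1, uint32_t rs2) {
    fcmp_result r = { 0, is_nan(rs1) || is_nan(rs2) };
    if (!r.nv) r.rd = as_float(rs1) < as_float(rs2);
    return r;
}

/* FLE.S: signaling comparison, flags any NaN input. */
fcmp_result fle_s(uint32_t rs1, uint32_t rs2) {
    fcmp_result r = { 0, is_nan(rs1) || is_nan(rs2) };
    if (!r.nv) r.rd = as_float(rs1) <= as_float(rs2);
    return r;
}
```

In all three cases the result written to rd is 0 when either operand is NaN; only the flag behaviour differs.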

f / single-precision-floating-point-classify-instruction

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Classify Instruction
fclass.s rd, rs1
The FCLASS.S instruction examines the value in floating-point register rs1 and writes to integer register rd a 10-bit mask that indicates the class of the floating-point number
FCLASS.S does not set the floating-point exception flags
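The 10-bit mask can be illustrated with a C model that classifies a raw single-precision bit pattern. The bit assignment used here (bit 0 for negative infinity up to bit 9 for quiet NaN) follows the classification table in the F extension specification, which is not reproduced on this page; the function name is hypothetical.

```c
#include <stdint.h>

/* Model of FCLASS.S: exactly one bit of the result is set.
 *   bit 0: -infinity          bit 5: positive subnormal
 *   bit 1: negative normal    bit 6: positive normal
 *   bit 2: negative subnormal bit 7: +infinity
 *   bit 3: -0                 bit 8: signaling NaN
 *   bit 4: +0                 bit 9: quiet NaN
 */
uint32_t fclass_s(uint32_t bits) {
    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xFFu;
    uint32_t frac = bits & 0x007FFFFFu;

    if (exp == 0xFFu) {
        if (frac == 0) return sign ? (1u << 0) : (1u << 7);   /* infinities */
        return (frac & 0x00400000u) ? (1u << 9) : (1u << 8);  /* NaNs */
    }
    if (exp == 0) {
        if (frac == 0) return sign ? (1u << 3) : (1u << 4);   /* zeros */
        return sign ? (1u << 2) : (1u << 5);                  /* subnormals */
    }
    return sign ? (1u << 1) : (1u << 6);                      /* normals */
}
```

A caller would typically AND the result with a mask of the classes it cares about, for instance (1u << 8) | (1u << 9) to test for any NaN.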

f / single-precision-load-and-store-instructions

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Load and Store Instructions
flw rd, rs1, imm12
The FLW instruction loads a single-precision floating-point value from memory into floating-point register rd
FLW and FSW are only guaranteed to execute atomically if the effective address is naturally aligned.
FLW and FSW do not modify the bits being transferred; in particular, the payloads of non-canonical NaNs are preserved.
fsw imm12hi, rs1, rs2, imm12lo
FSW stores a single-precision value from floating-point register rs2 to memory.

m

multiplication operations division operations

m / multiplication-operations

"M" Standard Extension for Integer Multiplication and Division, Version 2.0 / Multiplication Operations
mul rd, rs1, rs2
MUL performs an XLEN-bit × XLEN-bit multiplication of rs1 by rs2 and places the lower XLEN bits in the destination register
In RV64, MUL can be used to obtain the upper 32 bits of the 64-bit product, but signed arguments must be proper 32-bit signed values, whereas unsigned arguments must have their upper 32 bits clear
mulh rd, rs1, rs2
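MULH, MULHU, and MULHSU perform the same multiplication but return the upper XLEN bits of the full 2×XLEN-bit product, for signed × signed, unsigned × unsigned, and signed rs1 × unsigned rs2 multiplication, respectively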
If both the high and low bits of the same product are required, then the recommended code sequence is: MULH[[S]U] rdh, rs1, rs2 ; MUL rdl, rs1, rs2 (source register specifiers must be in same order and rdh cannot be the same as rs1 or rs2 )
If the arguments are not known to be sign- or zero-extended, an alternative is to shift both arguments left by 32 bits, then use MULH[[S]U].
mulhsu rd, rs1, rs2
MULHSU is used in multi-word signed multiplication to multiply the most-significant word of the multiplicand (which contains the sign bit) with the less-significant words of the multiplier (which are unsigned).
mulw rd, rs1, rs2
MULW is an RV64 instruction that multiplies the lower 32 bits of the source registers, placing the sign-extension of the lower 32 bits of the result into the destination register.
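For XLEN=32, the low and high halves of the product can be modelled in C with 64-bit intermediate arithmetic. This is only a sketch of the arithmetic, with hypothetical function names; it says nothing about how a compiler selects these instructions.

```c
#include <stdint.h>

/* MUL: lower 32 bits of the product (same for signed and unsigned). */
uint32_t mul32(uint32_t rs1, uint32_t rs2) {
    return rs1 * rs2;
}

/* MULH: upper 32 bits of signed x signed. */
uint32_t mulh32(int32_t rs1, int32_t rs2) {
    int64_t product = (int64_t)rs1 * (int64_t)rs2;
    return (uint32_t)((uint64_t)product >> 32);
}

/* MULHU: upper 32 bits of unsigned x unsigned. */
uint32_t mulhu32(uint32_t rs1, uint32_t rs2) {
    uint64_t product = (uint64_t)rs1 * (uint64_t)rs2;
    return (uint32_t)(product >> 32);
}

/* MULHSU: upper 32 bits of signed rs1 x unsigned rs2. */
uint32_t mulhsu32(int32_t rs1, uint32_t rs2) {
    int64_t product = (int64_t)rs1 * (int64_t)rs2;  /* rs2 widens as non-negative */
    return (uint32_t)((uint64_t)product >> 32);
}
```

Pairing mulh32 (or a sibling) with mul32 on the same operands mirrors the recommended MULH[[S]U] then MUL sequence for obtaining the full 64-bit product.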

m / division-operations

"M" Standard Extension for Integer Multiplication and Division, Version 2.0 / Division Operations
div rd, rs1, rs2
DIV and DIVU perform an XLEN bits by XLEN bits signed and unsigned integer division of rs1 by rs2 , rounding towards zero
If both the quotient and remainder are required from the same division, the recommended code sequence is: DIV[U] rdq, rs1, rs2 ; REM[U] rdr, rs1, rs2 ( rdq cannot be the same as rs1 or rs2 )
DIV[W]
divu rd, rs1, rs2
DIVU[W]
rem rd, rs1, rs2
REM and REMU provide the remainder of the corresponding division operation
For REM, the sign of the result equals the sign of the dividend.
REM[W]
remu rd, rs1, rs2
REMU[W]
divw rd, rs1, rs2
DIVW and DIVUW are RV64 instructions that divide the lower 32 bits of rs1 by the lower 32 bits of rs2 , treating them as signed and unsigned integers respectively, placing the 32-bit quotient in rd , sign-extended to 64 bits
remw rd, rs1, rs2
REMW and REMUW are RV64 instructions that provide the corresponding signed and unsigned remainder operations respectively
Both REMW and REMUW always sign-extend the 32-bit result to 64 bits, including on a divide by zero.
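The division rules can be sketched in C for XLEN=32. Plain C division is undefined for a zero divisor and for the signed overflow case, so the model handles those explicitly, using the results the M extension specification defines (quotient with all bits set and remainder equal to the dividend on divide by zero; quotient equal to the dividend and remainder zero on overflow). Function names are hypothetical.

```c
#include <stdint.h>

/* DIV: signed division, rounding towards zero. */
int32_t div32(int32_t rs1, int32_t rs2) {
    if (rs2 == 0) return -1;                         /* all bits set */
    if (rs1 == INT32_MIN && rs2 == -1) return rs1;   /* signed overflow */
    return rs1 / rs2;                                /* C99 truncates towards zero */
}

/* REM: sign of the result equals the sign of the dividend. */
int32_t rem32(int32_t rs1, int32_t rs2) {
    if (rs2 == 0) return rs1;                        /* dividend returned */
    if (rs1 == INT32_MIN && rs2 == -1) return 0;     /* signed overflow */
    return rs1 % rs2;
}

/* DIVU / REMU: unsigned variants. */
uint32_t divu32(uint32_t rs1, uint32_t rs2) {
    return rs2 == 0 ? UINT32_MAX : rs1 / rs2;
}

uint32_t remu32(uint32_t rs1, uint32_t rs2) {
    return rs2 == 0 ? rs1 : rs1 % rs2;
}
```

Because neither case traps, code that needs to detect division by zero has to test the divisor itself.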

n

n / user-status-register-ustatus

"N" Standard Extension for User-Level Interrupts, Version 1.1 / Additional CSRs
uret
A new instruction, URET, is used to return from traps in U-mode
URET copies UPIE into UIE, then sets UPIE, before copying uepc to the pc
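The state change performed by URET can be written as three assignments. The struct below is a hypothetical model of the relevant hart state, with the UIE and UPIE bits of ustatus held as separate booleans rather than as fields of a real CSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the user-level trap state touched by URET. */
typedef struct {
    bool     uie;   /* user interrupt enable */
    bool     upie;  /* previous user interrupt enable */
    uint32_t uepc;  /* user exception program counter */
    uint32_t pc;    /* program counter */
} hart_state;

/* Model of URET: restore UIE from UPIE, set UPIE, resume at uepc. */
void uret(hart_state *h) {
    h->uie  = h->upie;
    h->upie = true;
    h->pc   = h->uepc;
}
```

The order matters: UIE must take the old value of UPIE before UPIE is set.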

q

sec:single float compute quad precision convert and move instructions single precision floating point compare instructions quad precision floating point classify instruction quad precision load and store instructions

q / sec:single-float-compute

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Computational Instructions
fadd.q rd, rs1, rs2
FADD.S and FMUL.S perform single-precision floating-point addition and multiplication respectively, between rs1 and rs2
fsub.q rd, rs1, rs2
FSUB.S performs the single-precision floating-point subtraction of rs2 from rs1
fdiv.q rd, rs1, rs2
FDIV.S performs the single-precision floating-point division of rs1 by rs2
fmin.q rd, rs1, rs2
Floating-point minimum-number and maximum-number instructions FMIN.S and FMAX.S write, respectively, the smaller or larger of rs1 and rs2 to rd
Note that in version 2.2 of the F extension, the FMIN.S and FMAX.S instructions were amended to implement the proposed IEEE 754-201x minimumNumber and maximumNumber operations, rather than the IEEE 754-2008 minNum and maxNum operations
fsqrt.q rd, rs1
FSQRT.S computes the square root of rs1
fmadd.q rd, rs1, rs2, rs3
FMADD.S multiplies the values in rs1 and rs2 , adds the value in rs3 , and writes the final result to rd
FMADD.S computes (rs1 × rs2)+rs3 .
fmsub.q rd, rs1, rs2, rs3
FMSUB.S multiplies the values in rs1 and rs2 , subtracts the value in rs3 , and writes the final result to rd
FMSUB.S computes (rs1 × rs2)-rs3 .
fnmsub.q rd, rs1, rs2, rs3
FNMSUB.S multiplies the values in rs1 and rs2 , negates the product, adds the value in rs3 , and writes the final result to rd
FNMSUB.S computes -(rs1 × rs2)+rs3 .
fnmadd.q rd, rs1, rs2, rs3
FNMADD.S multiplies the values in rs1 and rs2 , negates the product, subtracts the value in rs3 , and writes the final result to rd
FNMADD.S computes -(rs1 × rs2)-rs3 .

q / quad-precision-convert-and-move-instructions

"Q" Standard Extension for Quad-Precision Floating-Point, Version 2.2 / Quad-Precision Convert and Move Instructions
fsgnj.q rd, rs1, rs2
Floating-point to floating-point sign-injection instructions, FSGNJ.Q, FSGNJN.Q, and FSGNJX.Q are defined analogously to the double-precision sign-injection instruction.
fcvt.s.q rd, rs1
FCVT.S.Q or FCVT.Q.S converts a quad-precision floating-point number to a single-precision floating-point number, or vice-versa, respectively
fcvt.d.q rd, rs1
FCVT.D.Q or FCVT.Q.D converts a quad-precision floating-point number to a double-precision floating-point number, or vice-versa, respectively.
fcvt.w.q rd, rs1
FCVT.W.Q or FCVT.L.Q converts a quad-precision floating-point number to a signed 32-bit or 64-bit integer, respectively
fcvt.wu.q rd, rs1
FCVT.WU.Q, FCVT.LU.Q, FCVT.Q.WU, and FCVT.Q.LU variants convert to or from unsigned integer values
fcvt.q.w rd, rs1
FCVT.Q.W or FCVT.Q.L converts a 32-bit or 64-bit signed integer, respectively, into a quad-precision floating-point number
fcvt.q.l rd, rs1
FCVT.L[U].Q and FCVT.Q.L[U] are RV64-only instructions.

q / single-precision-floating-point-compare-instructions

"F" Standard Extension for Single-Precision Floating-Point, Version 2.2 / Single-Precision Floating-Point Compare Instructions
flt.q rd, rs1, rs2
FLT.S and FLE.S perform what the IEEE 754-2008 standard refers to as signaling comparisons: that is, they set the invalid operation exception flag if either input is NaN
feq.q rd, rs1, rs2
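Floating-point compare instructions (FEQ.S, FLT.S, FLE.S) perform the specified comparison between floating-point registers (rs1 = rs2, rs1 < rs2, rs1 ≤ rs2), writing 1 to the integer register rd if the condition holds, and 0 otherwise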
FEQ.S performs a quiet comparison: it only sets the invalid operation exception flag if either input is a signaling NaN

q / quad-precision-floating-point-classify-instruction

"Q" Standard Extension for Quad-Precision Floating-Point, Version 2.2 / Quad-Precision Floating-Point Classify Instruction
fclass.q rd, rs1
The quad-precision floating-point classify instruction, FCLASS.Q, is defined analogously to its double-precision counterpart, but operates on quad-precision operands.

q / quad-precision-load-and-store-instructions

"Q" Standard Extension for Quad-Precision Floating-Point, Version 2.2 / Quad-Precision Load and Store Instructions
flq rd, rs1, imm12
FLQ and FSQ are only guaranteed to execute atomically if the effective address is naturally aligned and XLEN=128.
FLQ and FSQ do not modify the bits being transferred; in particular, the payloads of non-canonical NaNs are preserved.

v

vector length register code vl code vector type register code vtype code vector unit stride instructions examples vector strided instructions
vector indexed instructions unit stride fault only first loads vector single width floating point add subtract instructions vector floating point min max instructions vector floating point sign injection instructions
vector floating point move instruction vector floating point merge instruction introduction vector single width floating point multiply divide instructions vector single width floating point fused multiply add instructions
vector widening floating point add subtract instructions vector widening floating point multiply vector widening floating point fused multiply add instructions vector single width floating point reduction instructions vector widening floating point reduction instructions
vector floating point dot product instruction vector single width integer add and subtract vector integer min max instructions vector bitwise logical instructions vector register gather instruction
vector slide instructions vector slidedown instructions vector integer add with carry subtract with borrow instructions vector integer merge instructions vector integer move instructions
vector single width saturating add and subtract vector single width averaging add and subtract vector single width bit shift instructions vector single width fractional multiply with rounding and saturation vector single width scaling shift instructions
vector narrowing integer right shift instructions sec narrowing vector narrowing fixed point clip instructions vector widening integer reduction instructions vector integer dot product instruction
vector single width integer reduction instructions vector compress instruction vector integer comparison instructions vector floating point compare instructions sec mask register logical
code vfirst code find first set mask bit code vmsif m code set including first mask bit code vmsbf m code set before first mask bit vector iota instruction vector element index instruction
vector integer divide instructions vector single width integer multiply instructions vector single width integer multiply add instructions vector widening integer add subtract vector widening integer multiply instructions
vector widening integer multiply add instructions vector slide1up vector slide1down instruction

v / _vector_length_register_code_vl_code

3. Vector Extension Programmer’s Model / 3.4. Vector Length Register
vsetvli zimm11, rs1, rd
The XLEN-bit-wide read-only vl CSR can only be updated by the vsetvli and vsetvl instructions, and the fault-only-first vector load instruction variants.

v / _vector_type_register_code_vtype_code

3. Vector Extension Programmer’s Model / 3.3. Vector type register, vtype
vsetvl rs2, rs1, rd
The read-only XLEN-wide vector type CSR, vtype, provides the default type used to interpret the contents of the vector register file, and can only be updated by vsetvl{i} instructions
Allowing updates only via the vsetvl{i} instructions simplifies maintenance of the vtype register state

v / _vector_unit_stride_instructions

7. Vector Loads and Stores / 7.4. Vector Unit-Stride Instructions
vlb.v rs1, vd
vlb.v vd, (rs1), vm # 8b signed
vlw.v rs1, vd
vlw.v vd, (rs1), vm # 32b signed
vle.v rs1, vd
vle.v vd, (rs1), vm # SEW
vlbu.v rs1, vd
vlbu.v vd, (rs1), vm # 8b unsigned
vlhu.v rs1, vd
vlhu.v vd, (rs1), vm # 16b unsigned
vlwu.v rs1, vd
vlwu.v vd, (rs1), vm # 32b unsigned
vsb.v rs1, vs3
vsb.v vs3, (rs1), vm # 8b store
vsh.v rs1, vs3
vsh.v vs3, (rs1), vm # 16b store
vse.v rs1, vs3
vse.v vs3, (rs1), vm # SEW store

v / _examples

6. Configuration-Setting Instructions / 6.4. Examples
vlh.v rs1, vd
vlh.v v8, (a1) # Sign-extend 16b load values to 32b elements
vlh.v v4, (a1) # Get 16b vector
vsw.v rs1, vs3
vsw.v v8, (a2) # Store vector of 32b results
vsw.v v8, (a2) # Store vector of 32b
vsrl.vx vs2, rs1, vd
vsrl.vi v8, v8, 3 # Shift elements
vsrl.vi v8, v8, 3
vsrl.vv vs2, rs1, vd
vsrl.vi v8, v8, 3 # Shift elements
vsrl.vi v8, v8, 3
vsrl.vi vs2, simm5, vd
vsrl.vi v8, v8, 3 # Shift elements
vsrl.vi v8, v8, 3
vmul.vv vs2, vs1, vd
vmul.vx v8, v8, x10 # 32b multiply result
vwmul.vv vs2, vs1, vd
vwmul.vx v8, v4, x10 # 32b in <v8--v15>
vmul.vx vs2, rs1, vd
vmul.vx v8, v8, x10 # 32b multiply result
vwmul.vx vs2, rs1, vd
vwmul.vx v8, v4, x10 # 32b in <v8--v15>

v / _vector_strided_instructions

7. Vector Loads and Stores / 7.5. Vector Strided Instructions
vlsb.v rs2, rs1, vd
vlsb.v vd, (rs1), rs2, vm # 8b
vlsh.v rs2, rs1, vd
vlsh.v vd, (rs1), rs2, vm # 16b
vlsw.v rs2, rs1, vd
vlsw.v vd, (rs1), rs2, vm # 32b
vlse.v rs2, rs1, vd
vlse.v vd, (rs1), rs2, vm # SEW
vlsbu.v rs2, rs1, vd
vlsbu.v vd, (rs1), rs2, vm # unsigned 8b
vlshu.v rs2, rs1, vd
vlshu.v vd, (rs1), rs2, vm # unsigned 16b
vlswu.v rs2, rs1, vd
vlswu.v vd, (rs1), rs2, vm # unsigned 32b
vssb.v rs2, rs1, vs3
vssb.v vs3, (rs1), rs2, vm # 8b
vssh.v rs2, rs1, vs3
vssh.v vs3, (rs1), rs2, vm # 16b
vssw.v rs2, rs1, vs3
vssw.v vs3, (rs1), rs2, vm # 32b
vsse.v rs2, rs1, vs3
vsse.v vs3, (rs1), rs2, vm # SEW
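As a rough illustration of the strided form, the C sketch below models vlse.v with SEW=32: element i is read from the byte address base + i * stride. The listing above gives only the syntax, so the details here are assumptions: the mask is represented as one byte per element with the mask bit in the LSB, masked-off elements are left untouched, and the function name and parameters are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of a strided load with 32-bit elements.
 *   vd     destination elements
 *   base   models x[rs1], the base address
 *   stride models x[rs2], the byte stride between elements
 *   mask   per-element mask bytes (LSB used), or NULL for unmasked
 *   vl     number of elements to process
 */
void vlse32(uint32_t *vd, const uint8_t *base, ptrdiff_t stride,
            const uint8_t *mask, size_t vl) {
    for (size_t i = 0; i < vl; i++) {
        if (mask != NULL && !(mask[i] & 1u)) {
            continue;  /* assumption: inactive elements keep their old value */
        }
        memcpy(&vd[i], base + (ptrdiff_t)i * stride, sizeof vd[i]);
    }
}
```

With a stride equal to the element size this degenerates to the unit-stride form; the strided stores follow the same address pattern with the copy direction reversed.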

v / _vector_indexed_instructions

7. Vector Loads and Stores / 7.6. Vector Indexed Instructions
vlxb.v vs2, rs1, vd
vlxb.v vd, (rs1), vs2, vm # 8b
vlxh.v vs2, rs1, vd
vlxh.v vd, (rs1), vs2, vm # 16b
vlxw.v vs2, rs1, vd
vlxw.v vd, (rs1), vs2, vm # 32b
vlxe.v vs2, rs1, vd
vlxe.v vd, (rs1), vs2, vm # SEW
vlxbu.v vs2, rs1, vd
vlxbu.v vd, (rs1), vs2, vm # 8b unsigned
vlxhu.v vs2, rs1, vd
vlxhu.v vd, (rs1), vs2, vm # 16b unsigned
vlxwu.v vs2, rs1, vd
vlxwu.v vd, (rs1), vs2, vm # 32b unsigned
vsxb.v vs2, rs1, vs3
vsxb.v vs3, (rs1), vs2, vm # 8b
vsxh.v vs2, rs1, vs3
vsxh.v vs3, (rs1), vs2, vm # 16b
vsxw.v vs2, rs1, vs3
vsxw.v vs3, (rs1), vs2, vm # 32b
vsxe.v vs2, rs1, vs3
vsxe.v vs3, (rs1), vs2, vm # SEW
vsuxb.v vs2, rs1, vs3
vsuxb.v vs3, (rs1), vs2, vm # 8b
vsuxh.v vs2, rs1, vs3
vsuxh.v vs3, (rs1), vs2, vm # 16b
vsuxw.v vs2, rs1, vs3
vsuxw.v vs3, (rs1), vs2, vm # 32b
vsuxe.v vs2, rs1, vs3
vsuxe.v vs3, (rs1), vs2, vm # SEW

v / _unit_stride_fault_only_first_loads

7. Vector Loads and Stores / 7.7. Unit-stride Fault-Only-First Loads
vlbff.v rs1, vd
vlbff.v vd, (rs1), vm # 8b
vlbff.v v1, (a3) # Load bytes
vlhff.v rs1, vd
vlhff.v vd, (rs1), vm # 16b
vlwff.v rs1, vd
vlwff.v vd, (rs1), vm # 32b
vleff.v rs1, vd
vleff.v vd, (rs1), vm # SEW
vlbuff.v rs1, vd
vlbuff.v vd, (rs1), vm # unsigned 8b
vlhuff.v rs1, vd
vlhuff.v vd, (rs1), vm # unsigned 16b
vlwuff.v rs1, vd
vlwuff.v vd, (rs1), vm # unsigned 32b

v / _vector_single_width_floating_point_add_subtract_instructions

14. Vector Floating-Point Instructions / 14.2. Vector Single-Width Floating-Point Add/Subtract Instructions
vfadd.vf vs2, rs1, vd
vfadd.vv vd, vs2, vs1, vm # Vector-vector
vfadd.vf vd, vs2, rs1, vm # vector-scalar
vfsub.vf vs2, rs1, vd
vfsub.vv vd, vs2, vs1, vm # Vector-vector
vfsub.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] - f[rs1]
vfadd.vv vs2, vs1, vd
vfadd.vv vd, vs2, vs1, vm # Vector-vector
vfadd.vf vd, vs2, rs1, vm # vector-scalar
vfsub.vv vs2, vs1, vd
vfsub.vv vd, vs2, vs1, vm # Vector-vector
vfsub.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] - f[rs1]

v / _vector_floating_point_min_max_instructions

14. Vector Floating-Point Instructions / 14.9. Vector Floating-Point MIN/MAX Instructions
vfmin.vf vs2, rs1, vd
The vector floating-point vfmin and vfmax instructions have the same behavior as the corresponding scalar floating-point instructions in version 2.2 of the RISC-V F/D/Q extension.
vfmin.vv vd, vs2, vs1, vm # Vector-vector
vfmin.vf vd, vs2, rs1, vm # vector-scalar
vfmax.vf vs2, rs1, vd
vfmax.vv vd, vs2, vs1, vm # Vector-vector
vfmax.vf vd, vs2, rs1, vm # vector-scalar
vfmin.vv vs2, vs1, vd
The vector floating-point vfmin and vfmax instructions have the same behavior as the corresponding scalar floating-point instructions in version 2.2 of the RISC-V F/D/Q extension.
vfmin.vv vd, vs2, vs1, vm # Vector-vector
vfmin.vf vd, vs2, rs1, vm # vector-scalar
vfmax.vv vs2, vs1, vd
vfmax.vv vd, vs2, vs1, vm # Vector-vector
vfmax.vf vd, vs2, rs1, vm # vector-scalar

v / _vector_floating_point_sign_injection_instructions

14. Vector Floating-Point Instructions / 14.10. Vector Floating-Point Sign-Injection Instructions
vfsgnj.vf vs2, rs1, vd
vfsgnj.vv vd, vs2, vs1, vm # Vector-vector
vfsgnj.vf vd, vs2, rs1, vm # vector-scalar
vfsgnjn.vf vs2, rs1, vd
vfsgnjn.vv vd, vs2, vs1, vm # Vector-vector
vfsgnjn.vf vd, vs2, rs1, vm # vector-scalar
vfsgnjx.vf vs2, rs1, vd
vfsgnjx.vv vd, vs2, vs1, vm # Vector-vector
vfsgnjx.vf vd, vs2, rs1, vm # vector-scalar
vfsgnj.vv vs2, vs1, vd
vfsgnj.vv vd, vs2, vs1, vm # Vector-vector
vfsgnj.vf vd, vs2, rs1, vm # vector-scalar
vfsgnjn.vv vs2, vs1, vd
vfsgnjn.vv vd, vs2, vs1, vm # Vector-vector
vfsgnjn.vf vd, vs2, rs1, vm # vector-scalar
vfsgnjx.vv vs2, vs1, vd
vfsgnjx.vv vd, vs2, vs1, vm # Vector-vector
vfsgnjx.vf vd, vs2, rs1, vm # vector-scalar

v / _vector_floating_point_move_instruction

14. Vector Floating-Point Instructions / 14.14. Vector Floating-Point Move Instruction
vfmv.s.f rs1, vd
Note: the vfmv.v.f instruction shares its encoding with the vfmerge.vfm instruction, but with vm=1 and vs2=v0
Note vfmv.v.f substitutes a canonical NaN for f[rs1] if the latter is not properly NaN-boxed
vfmv.v.f vd, rs1 # vd[i] = f[rs1]
vfmv.v.f rs1, vd
Note: the vfmv.v.f instruction shares its encoding with the vfmerge.vfm instruction, but with vm=1 and vs2=v0
Note vfmv.v.f substitutes a canonical NaN for f[rs1] if the latter is not properly NaN-boxed
vfmv.v.f vd, rs1 # vd[i] = f[rs1]
vfmv.f.s vs2, rd
Note: the vfmv.v.f instruction shares its encoding with the vfmerge.vfm instruction, but with vm=1 and vs2=v0
Note vfmv.v.f substitutes a canonical NaN for f[rs1] if the latter is not properly NaN-boxed
vfmv.v.f vd, rs1 # vd[i] = f[rs1]

v / _vector_floating_point_merge_instruction

14. Vector Floating-Point Instructions / 14.13. Vector Floating-Point Merge Instruction
vfmerge.vfm vs2, rs1, vd
The vfmerge.vfm instruction is always masked ( vm=0 )
Note vfmerge.vfm substitutes a canonical NaN for f[rs1] if the latter is not properly NaN-boxed
vfmerge.vfm vd, vs2, rs1, v0 # vd[i] = v0[i].LSB ? f[rs1] : vs2[i]
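The per-element selection in the comment above, vd[i] = v0[i].LSB ? f[rs1] : vs2[i], can be written out as a small C loop. This is a sketch only: the mask register v0 is modelled as one byte per element, NaN-boxing checks are ignored, and the function names are hypothetical. The second function models vfmv.v.f as the unmasked case, where every element receives the scalar.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of vfmerge.vfm: pick the scalar where the mask LSB is set,
 * otherwise keep the element from vs2. */
void vfmerge_vfm(float *vd, const float *vs2, float f_rs1,
                 const uint8_t *v0, size_t vl) {
    for (size_t i = 0; i < vl; i++) {
        vd[i] = (v0[i] & 1u) ? f_rs1 : vs2[i];
    }
}

/* Model of vfmv.v.f: splat the scalar into every element. */
void vfmv_v_f(float *vd, float f_rs1, size_t vl) {
    for (size_t i = 0; i < vl; i++) {
        vd[i] = f_rs1;
    }
}
```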

v / _introduction

1. Introduction /
vfeq.vf vs2, rs1, vd
vfle.vf vs2, rs1, vd
vford.vf vs2, rs1, vd
vflt.vf vs2, rs1, vd
vfne.vf vs2, rs1, vd
vfgt.vf vs2, rs1, vd
vfge.vf vs2, rs1, vd
vfeq.vv vs2, vs1, vd
vfle.vv vs2, vs1, vd
vford.vv vs2, vs1, vd
vflt.vv vs2, vs1, vd
vfne.vv vs2, vs1, vd
vfunary0.vv vs2, vs1, vd
vfunary1.vv vs2, vs1, vd
vseq.vx vs2, rs1, vd
vsne.vx vs2, rs1, vd
vsltu.vx vs2, rs1, vd
vslt.vx vs2, rs1, vd
vsleu.vx vs2, rs1, vd
vsle.vx vs2, rs1, vd
vsgtu.vx vs2, rs1, vd
vsgt.vx vs2, rs1, vd
vwsmaccu.vx vs2, rs1, vd
vwsmacc.vx vs2, rs1, vd
vwsmaccsu.vx vs2, rs1, vd
vwsmaccus.vx vs2, rs1, vd
vseq.vv vs2, rs1, vd
vsne.vv vs2, rs1, vd
vsltu.vv vs2, rs1, vd
vslt.vv vs2, rs1, vd
vsleu.vv vs2, rs1, vd
vsle.vv vs2, rs1, vd
vwsmaccu.vv vs2, rs1, vd
vwsmacc.vv vs2, rs1, vd
vwsmaccsu.vv vs2, rs1, vd
vseq.vi vs2, simm5, vd
vsne.vi vs2, simm5, vd
vsleu.vi vs2, simm5, vd
vsle.vi vs2, simm5, vd
vsgtu.vi vs2, simm5, vd
vsgt.vi vs2, simm5, vd
vext.x.v vs2, rs1, vd
vmpopc.m vs2, vs1, rd
vmfirst.m vs2, vs1, rd

v / _vector_single_width_floating_point_multiply_divide_instructions

14. Vector Floating-Point Instructions / 14.4. Vector Single-Width Floating-Point Multiply/Divide Instructions
vfdiv.vf vs2, rs1, vd
vfdiv.vv vd, vs2, vs1, vm # Vector-vector
vfdiv.vf vd, vs2, rs1, vm # vector-scalar
vfrdiv.vf vs2, rs1, vd
vfrdiv.vf vd, vs2, rs1, vm # scalar-vector, vd[i] = f[rs1]/vs2[i]
vfmul.vf vs2, rs1, vd
vfmul.vv vd, vs2, vs1, vm # Vector-vector
vfmul.vf vd, vs2, rs1, vm # vector-scalar
vfdiv.vv vs2, vs1, vd
vfdiv.vv vd, vs2, vs1, vm # Vector-vector
vfdiv.vf vd, vs2, rs1, vm # vector-scalar
vfmul.vv vs2, vs1, vd
vfmul.vv vd, vs2, vs1, vm # Vector-vector
vfmul.vf vd, vs2, rs1, vm # vector-scalar

v / _vector_single_width_floating_point_fused_multiply_add_instructions

14. Vector Floating-Point Instructions / 14.6. Vector Single-Width Floating-Point Fused Multiply-Add Instructions
vfmadd.vf vs2, rs1, vd
vfmadd.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) + vs2[i]
vfmadd.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) + vs2[i]
vfnmadd.vf vs2, rs1, vd
vfnmadd.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) - vs2[i]
vfnmadd.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) - vs2[i]
vfmsub.vf vs2, rs1, vd
vfmsub.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) - vs2[i]
vfmsub.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) - vs2[i]
vfnmsub.vf vs2, rs1, vd
vfnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i]
vfnmsub.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) + vs2[i]
vfmacc.vf vs2, rs1, vd
vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i]
vfnmacc.vf vs2, rs1, vd
vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i]
vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i]
vfmsac.vf vs2, rs1, vd
vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i]
vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i]
vfnmsac.vf vs2, rs1, vd
vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i]
vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i]
vfmadd.vv vs2, vs1, vd
vfmadd.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) + vs2[i]
vfmadd.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) + vs2[i]
vfnmadd.vv vs2, vs1, vd
vfnmadd.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) - vs2[i]
vfnmadd.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) - vs2[i]
vfmsub.vv vs2, vs1, vd
vfmsub.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) - vs2[i]
vfmsub.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) - vs2[i]
vfnmsub.vv vs2, vs1, vd
vfnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i]
vfnmsub.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) + vs2[i]
vfmacc.vv vs2, vs1, vd
vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i]
vfnmacc.vv vs2, vs1, vd
vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i]
vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i]
vfmsac.vv vs2, vs1, vd
vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i]
vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i]
vfnmsac.vv vs2, vs1, vd
vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i]
vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i]
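The main thing to read off the comments above is which operand is overwritten: the "macc"/"msac" forms accumulate into vd, while the "madd"/"msub" forms multiply by vd and add or subtract vs2. A minimal C sketch of one instruction from each group follows; it is unmasked, uses separate multiply and add rather than a fused operation, and the function names are invented for this example.

```c
#include <stddef.h>

/* vfmacc.vv: vd[i] = +(vs1[i] * vs2[i]) + vd[i]  (vd is the addend) */
void vfmacc_vv(float *vd, const float *vs1, const float *vs2, size_t vl) {
    for (size_t i = 0; i < vl; i++) {
        vd[i] = (vs1[i] * vs2[i]) + vd[i];
    }
}

/* vfmadd.vv: vd[i] = +(vs1[i] * vd[i]) + vs2[i]  (vd is a multiplicand) */
void vfmadd_vv(float *vd, const float *vs1, const float *vs2, size_t vl) {
    for (size_t i = 0; i < vl; i++) {
        vd[i] = (vs1[i] * vd[i]) + vs2[i];
    }
}
```

The negated and subtracting variants change only the signs, following the same pattern as the scalar FMADD.S family.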

v / _vector_widening_floating_point_add_subtract_instructions

14. Vector Floating-Point Instructions / 14.3. Vector Widening Floating-Point Add/Subtract Instructions
vfwadd.vf vs2, rs1, vd
vfwadd.vv vd, vs2, vs1, vm # vector-vector
vfwadd.vf vd, vs2, rs1, vm # vector-scalar
vfwadd.wv vd, vs2, vs1, vm # vector-vector
vfwadd.wf vd, vs2, rs1, vm # vector-scalar
vfwsub.vf vs2, rs1, vd
vfwsub.vv vd, vs2, vs1, vm # vector-vector
vfwsub.vf vd, vs2, rs1, vm # vector-scalar
vfwsub.wv vd, vs2, vs1, vm # vector-vector
vfwsub.wf vd, vs2, rs1, vm # vector-scalar
vfwadd.wf vs2, rs1, vd
vfwadd.vv vd, vs2, vs1, vm # vector-vector
vfwadd.vf vd, vs2, rs1, vm # vector-scalar
vfwadd.wv vd, vs2, vs1, vm # vector-vector
vfwadd.wf vd, vs2, rs1, vm # vector-scalar
vfwsub.wf vs2, rs1, vd
vfwsub.vv vd, vs2, vs1, vm # vector-vector
vfwsub.vf vd, vs2, rs1, vm # vector-scalar
vfwsub.wv vd, vs2, vs1, vm # vector-vector
vfwsub.wf vd, vs2, rs1, vm # vector-scalar
vfwadd.vv vs2, vs1, vd
vfwadd.vv vd, vs2, vs1, vm # vector-vector
vfwadd.vf vd, vs2, rs1, vm # vector-scalar
vfwadd.wv vd, vs2, vs1, vm # vector-vector
vfwadd.wf vd, vs2, rs1, vm # vector-scalar
vfwsub.vv vs2, vs1, vd
vfwsub.vv vd, vs2, vs1, vm # vector-vector
vfwsub.vf vd, vs2, rs1, vm # vector-scalar
vfwsub.wv vd, vs2, vs1, vm # vector-vector
vfwsub.wf vd, vs2, rs1, vm # vector-scalar
vfwadd.wv vs2, vs1, vd
vfwadd.vv vd, vs2, vs1, vm # vector-vector
vfwadd.vf vd, vs2, rs1, vm # vector-scalar
vfwadd.wv vd, vs2, vs1, vm # vector-vector
vfwadd.wf vd, vs2, rs1, vm # vector-scalar
vfwsub.wv vs2, vs1, vd
vfwsub.vv vd, vs2, vs1, vm # vector-vector
vfwsub.vf vd, vs2, rs1, vm # vector-scalar
vfwsub.wv vd, vs2, vs1, vm # vector-vector
vfwsub.wf vd, vs2, rs1, vm # vector-scalar

v / _vector_widening_floating_point_multiply

14. Vector Floating-Point Instructions / 14.5. Vector Widening Floating-Point Multiply
vfwmul.vf vs2, rs1, vd
vfwmul.vv vd, vs2, vs1, vm # vector-vector
vfwmul.vf vd, vs2, rs1, vm # vector-scalar
vfwmul.vv vs2, vs1, vd
vfwmul.vv vd, vs2, vs1, vm # vector-vector
vfwmul.vf vd, vs2, rs1, vm # vector-scalar

v / _vector_widening_floating_point_fused_multiply_add_instructions

14. Vector Floating-Point Instructions / 14.7. Vector Widening Floating-Point Fused Multiply-Add Instructions
vfwmacc.vf vs2, rs1, vd
vfwmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vfwmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i]
vfwnmacc.vf vs2, rs1, vd
vfwnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i]
vfwnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i]
vfwmsac.vf vs2, rs1, vd
vfwmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i]
vfwmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i]
vfwnmsac.vf vs2, rs1, vd
vfwnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i]
vfwnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i]
vfwmacc.vv vs2, vs1, vd
vfwmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vfwmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i]
vfwnmacc.vv vs2, vs1, vd
vfwnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i]
vfwnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i]
vfwmsac.vv vs2, vs1, vd
vfwmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i]
vfwmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i]
vfwnmsac.vv vs2, vs1, vd
vfwnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i]
vfwnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i]

v / _vector_single_width_floating_point_reduction_instructions

15. Vector Reduction Operations / 15.3. Vector Single-Width Floating-Point Reduction Instructions
vfredsum.vs vs2, vs1, vd
vfredsum.vs vd, vs2, vs1, vm # Unordered sum
vfredosum.vs vs2, vs1, vd
vfredosum.vs vd, vs2, vs1, vm # Ordered sum
vfredmin.vs vs2, vs1, vd
vfredmin.vs vd, vs2, vs1, vm # Minimum value
vfredmax.vs vs2, vs1, vd
vfredmax.vs vd, vs2, vs1, vm # Maximum value

v / _vector_widening_floating_point_reduction_instructions

15. Vector Reduction Operations / 15.4. Vector Widening Floating-Point Reduction Instructions
vfwredsum.vs vs2, vs1, vd
vfwredsum.vs vd, vs2, vs1, vm # Unordered sum
vfwredosum.vs vs2, vs1, vd
vfwredosum.vs vd, vs2, vs1, vm # Ordered sum

v / _vector_floating_point_dot_product_instruction

19. Divided Element Extension ('Zvediv') / 19.4. Vector Floating-Point Dot Product Instruction
vfdot.vv vs2, vs1, vd
The floating-point dot-product reduction vfdot.vv performs an element-wise multiplication between the source sub-elements then accumulates the results into the destination vector element
vfdot.vv vd, vs2, vs1, vm # Vector-vector
vfdot.vv vd, vs2, vs1, vm # vd[i][31:0] += vs2[i][31:16] * vs1[i][31:16] + vs2[i][15:0] * vs1[i][15:0]
vfdot.vv v1, v2, v3 # v1[i][31:0] += v2[i][31:16]*v3[i][31:16] + v2[i][15:0]*v3[i][15:0]
vfdot.vv v1, v2, v3

v / _vector_single_width_integer_add_and_subtract

12. Vector Integer Arithmetic Instructions / 12.1. Vector Single-Width Integer Add and Subtract
vadd.vx vs2, rs1, vd
vadd.vv vd, vs2, vs1, vm # Vector-vector
vadd.vx vd, vs2, rs1, vm # vector-scalar
vadd.vi vd, vs2, imm, vm # vector-immediate
vsub.vx vs2, rs1, vd
vsub.vv vd, vs2, vs1, vm # Vector-vector
vsub.vx vd, vs2, rs1, vm # vector-scalar
vrsub.vx vs2, rs1, vd
vrsub.vx vd, vs2, rs1, vm # vd[i] = rs1 - vs2[i]
vrsub.vi vd, vs2, imm, vm # vd[i] = imm - vs2[i]
vadd.vv vs2, rs1, vd
vadd.vv vd, vs2, vs1, vm # Vector-vector
vadd.vx vd, vs2, rs1, vm # vector-scalar
vadd.vi vd, vs2, imm, vm # vector-immediate
vsub.vv vs2, rs1, vd
vsub.vv vd, vs2, vs1, vm # Vector-vector
vsub.vx vd, vs2, rs1, vm # vector-scalar
vadd.vi vs2, simm5, vd
vadd.vv vd, vs2, vs1, vm # Vector-vector
vadd.vx vd, vs2, rs1, vm # vector-scalar
vadd.vi vd, vs2, imm, vm # vector-immediate
vrsub.vi vs2, simm5, vd
vrsub.vx vd, vs2, rs1, vm # vd[i] = rs1 - vs2[i]
vrsub.vi vd, vs2, imm, vm # vd[i] = imm - vs2[i]

v / _vector_integer_min_max_instructions

12. Vector Integer Arithmetic Instructions / 12.8. Vector Integer Min/Max Instructions
vminu.vx vs2, rs1, vd
vminu.vv vd, vs2, vs1, vm # Vector-vector
vminu.vx vd, vs2, rs1, vm # vector-scalar
vmin.vx vs2, rs1, vd
vmin.vv vd, vs2, vs1, vm # Vector-vector
vmin.vx vd, vs2, rs1, vm # vector-scalar
vmaxu.vx vs2, rs1, vd
vmaxu.vv vd, vs2, vs1, vm # Vector-vector
vmaxu.vx vd, vs2, rs1, vm # vector-scalar
vmax.vx vs2, rs1, vd
vmax.vv vd, vs2, vs1, vm # Vector-vector
vmax.vx vd, vs2, rs1, vm # vector-scalar
vminu.vv vs2, rs1, vd
vminu.vv vd, vs2, vs1, vm # Vector-vector
vminu.vx vd, vs2, rs1, vm # vector-scalar
vmin.vv vs2, rs1, vd
vmin.vv vd, vs2, vs1, vm # Vector-vector
vmin.vx vd, vs2, rs1, vm # vector-scalar
vmaxu.vv vs2, rs1, vd
vmaxu.vv vd, vs2, vs1, vm # Vector-vector
vmaxu.vx vd, vs2, rs1, vm # vector-scalar
vmax.vv vs2, rs1, vd
vmax.vv vd, vs2, vs1, vm # Vector-vector
vmax.vx vd, vs2, rs1, vm # vector-scalar

v / _vector_bitwise_logical_instructions

12. Vector Integer Arithmetic Instructions / 12.4. Vector Bitwise Logical Instructions
vand.vx vs2, rs1, vd
vand.vv vd, vs2, vs1, vm # Vector-vector
vand.vx vd, vs2, rs1, vm # vector-scalar
vand.vi vd, vs2, imm, vm # vector-immediate
vor.vx vs2, rs1, vd
vor.vv vd, vs2, vs1, vm # Vector-vector
vor.vx vd, vs2, rs1, vm # vector-scalar
vor.vi vd, vs2, imm, vm # vector-immediate
vxor.vx vs2, rs1, vd
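Note: with an immediate of -1, the scalar-immediate form of vxor provides a bitwise NOT operation, which can be offered as the assembler pseudoinstruction vnot.v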
vxor.vv vd, vs2, vs1, vm # Vector-vector
vxor.vx vd, vs2, rs1, vm # vector-scalar
vxor.vi vd, vs2, imm, vm # vector-immediate
vand.vv vs2, rs1, vd
vand.vv vd, vs2, vs1, vm # Vector-vector
vand.vx vd, vs2, rs1, vm # vector-scalar
vand.vi vd, vs2, imm, vm # vector-immediate
vor.vv vs2, rs1, vd
vor.vv vd, vs2, vs1, vm # Vector-vector
vor.vx vd, vs2, rs1, vm # vector-scalar
vor.vi vd, vs2, imm, vm # vector-immediate
vxor.vv vs2, rs1, vd
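Note: with an immediate of -1, the scalar-immediate form of vxor becomes a unary bitwise-not operation; the assembler pseudoinstruction vnot.v can be defined this way.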
vxor.vv vd, vs2, vs1, vm # Vector-vector
vxor.vx vd, vs2, rs1, vm # vector-scalar
vxor.vi vd, vs2, imm, vm # vector-immediate
vand.vi vs2, simm5, vd
vand.vv vd, vs2, vs1, vm # Vector-vector
vand.vx vd, vs2, rs1, vm # vector-scalar
vand.vi vd, vs2, imm, vm # vector-immediate
vor.vi vs2, simm5, vd
vor.vv vd, vs2, vs1, vm # Vector-vector
vor.vx vd, vs2, rs1, vm # vector-scalar
vor.vi vd, vs2, imm, vm # vector-immediate
vxor.vi vs2, simm5, vd
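Note: with an immediate of -1, the scalar-immediate form of vxor becomes a unary bitwise-not operation; the assembler pseudoinstruction vnot.v can be defined this way.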
vxor.vv vd, vs2, vs1, vm # Vector-vector
vxor.vx vd, vs2, rs1, vm # vector-scalar
vxor.vi vd, vs2, imm, vm # vector-immediate

v / _vector_register_gather_instruction

17. Vector Permutation Instructions / 17.4. Vector Register Gather Instruction
vrgather.vx vs2, rs1, vd
For any vrgather instruction, the destination vector register group cannot overlap with the source vector register groups, including the mask register if the operation is masked; otherwise an illegal instruction exception is raised.
Note: when SEW=8, vrgather.vv can only reference vector elements 0-255.
vrgather.vv vd, vs2, vs1, vm # vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]];
vrgather.vx vd, vs2, rs1, vm # vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[x[rs1]]
vrgather.vi vd, vs2, uimm, vm # vd[i] = (uimm >= VLMAX) ? 0 : vs2[uimm]
vrgather.vv vs2, rs1, vd
For any vrgather instruction, the destination vector register group cannot overlap with the source vector register groups, including the mask register if the operation is masked; otherwise an illegal instruction exception is raised.
Note: when SEW=8, vrgather.vv can only reference vector elements 0-255.
vrgather.vv vd, vs2, vs1, vm # vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]];
vrgather.vx vd, vs2, rs1, vm # vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[x[rs1]]
vrgather.vi vd, vs2, uimm, vm # vd[i] = (uimm >= VLMAX) ? 0 : vs2[uimm]
vrgather.vi vs2, simm5, vd
For any vrgather instruction, the destination vector register group cannot overlap with the source vector register groups, including the mask register if the operation is masked; otherwise an illegal instruction exception is raised.
Note: when SEW=8, vrgather.vv can only reference vector elements 0-255.
vrgather.vv vd, vs2, vs1, vm # vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]];
vrgather.vx vd, vs2, rs1, vm # vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[x[rs1]]
vrgather.vi vd, vs2, uimm, vm # vd[i] = (uimm >= VLMAX) ? 0 : vs2[uimm]
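The gather semantics in the comments above can be sketched as a scalar reference model in C. This is only an illustration for SEW=8: the function name `vrgather_vv_u8` and the one-array-per-register layout are assumptions made here, not part of any toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vrgather.vv for SEW=8.
 * Indices at or beyond vlmax select 0 instead of reading out of range. */
void vrgather_vv_u8(uint8_t *vd, const uint8_t *vs2, const uint8_t *vs1,
                    size_t vl, size_t vlmax)
{
    assert(vl <= vlmax);
    assert(vd != vs2 && vd != vs1); /* destination may not overlap the sources */
    for (size_t i = 0; i < vl; i++) {
        size_t idx = vs1[i];
        vd[i] = (idx >= vlmax) ? 0 : vs2[idx];
    }
}
```

Because the index is itself an 8-bit element here, it can never exceed 255, which is the limit the note above refers to.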

v / _vector_slide_instructions

17. Vector Permutation Instructions / 17.3. Vector Slide Instructions
vslideup.vx vs2, rs1, vd
Note: implementations may optimize certain OFFSET values for vslideup and vslidedown.
For all of the vslideup, vslidedown, vslide1up, and vslide1down instructions, if vstart ≥ vl, the instruction performs no operation and leaves the destination vector register unchanged.
vslideup.vi vs2, simm5, vd
Note: implementations may optimize certain OFFSET values for vslideup and vslidedown.
For all of the vslideup, vslidedown, vslide1up, and vslide1down instructions, if vstart ≥ vl, the instruction performs no operation and leaves the destination vector register unchanged.

v / _vector_slidedown_instructions

17. Vector Permutation Instructions / 17.3. Vector Slide Instructions
vslidedown.vx vs2, rs1, vd
For vslidedown, the value in vl specifies the number of destination elements that are written.
vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i+rs1]
vslidedown.vi vd, vs2, uimm[4:0], vm # vd[i] = vs2[i+uimm]
vslidedown behavior for source elements for element i in slide
vslidedown behavior for destination element i in slide
vslidedown.vi vs2, simm5, vd
For vslidedown, the value in vl specifies the number of destination elements that are written.
vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i+rs1]
vslidedown.vi vd, vs2, uimm[4:0], vm # vd[i] = vs2[i+uimm]
vslidedown behavior for source elements for element i in slide
vslidedown behavior for destination element i in slide
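A minimal C sketch of the vslidedown.vx behaviour, assuming SEW=32 and that source positions at or beyond VLMAX read as zero (my reading of the vector spec; the text above only gives `vd[i] = vs2[i+rs1]`). The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vslidedown.vx for SEW=32.
 * vl destination elements are written; sources past vlmax read as 0. */
void vslidedown_vx_u32(uint32_t *vd, const uint32_t *vs2, size_t offset,
                       size_t vl, size_t vlmax)
{
    assert(vl <= vlmax);
    for (size_t i = 0; i < vl; i++) {
        /* Compare against vlmax - i so that i + offset cannot overflow. */
        vd[i] = (offset >= vlmax - i) ? 0 : vs2[i + offset];
    }
}
```

Elements of `vd` at index vl and above are left untouched in this sketch.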

v / _vector_integer_add_with_carry_subtract_with_borrow_instructions

12. Vector Integer Arithmetic Instructions / 12.3. Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions
vadc.vxm vs2, rs1, vd
vadc and vsbc add or subtract the source operands and the carry-in or borrow-in, and write the result to vector register vd. Due to encoding constraints, the carry input must come from the implicit v0 register.
For vadc and vsbc, an illegal instruction exception is raised if the destination vector register is v0 and LMUL > 1.
vadc.vvm vd, vs2, vs1, v0 # Vector-vector
vadc.vxm vd, vs2, rs1, v0 # Vector-scalar
vadc.vim vd, vs2, imm, v0 # Vector-immediate
vadc.vvm v4, v4, v8, v0 # Calc new sum
vmadc.vxm vs2, rs1, vd
vmadc and vmsbc add or subtract the source operands, optionally add the carry-in or subtract the borrow-in if masked (vm=0), and write the result back to mask register vd
For vmadc and vmsbc, an illegal instruction exception is raised if the destination vector register overlaps a source vector register group and LMUL > 1.
vmadc.vvm vd, vs2, vs1, v0 # Vector-vector
vmadc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmadc.vim vd, vs2, imm, v0 # Vector-immediate
vmadc.vv vd, vs2, vs1 # Vector-vector, no carry-in
vmadc.vx vd, vs2, rs1 # Vector-scalar, no carry-in
vmadc.vi vd, vs2, imm # Vector-immediate, no carry-in
vmadc.vvm v1, v4, v8, v0 # Get carry into temp register v1
vsbc.vxm vs2, rs1, vd
The subtract with borrow instruction vsbc performs the equivalent function to support long word arithmetic for subtraction
vsbc.vvm vd, vs2, vs1, v0 # Vector-vector
vsbc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmsbc.vxm vs2, rs1, vd
For vmsbc, the borrow is defined to be 1 iff the difference, prior to truncation, is negative.
vmsbc.vvm vd, vs2, vs1, v0 # Vector-vector
vmsbc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmsbc.vv vd, vs2, vs1 # Vector-vector, no borrow-in
vmsbc.vx vd, vs2, rs1 # Vector-scalar, no borrow-in
vadc.vvm vs2, rs1, vd
vadc and vsbc add or subtract the source operands and the carry-in or borrow-in, and write the result to vector register vd. Due to encoding constraints, the carry input must come from the implicit v0 register.
For vadc and vsbc, an illegal instruction exception is raised if the destination vector register is v0 and LMUL > 1.
vadc.vvm vd, vs2, vs1, v0 # Vector-vector
vadc.vxm vd, vs2, rs1, v0 # Vector-scalar
vadc.vim vd, vs2, imm, v0 # Vector-immediate
vadc.vvm v4, v4, v8, v0 # Calc new sum
vmadc.vvm vs2, rs1, vd
vmadc and vmsbc add or subtract the source operands, optionally add the carry-in or subtract the borrow-in if masked (vm=0), and write the result back to mask register vd
For vmadc and vmsbc, an illegal instruction exception is raised if the destination vector register overlaps a source vector register group and LMUL > 1.
vmadc.vvm vd, vs2, vs1, v0 # Vector-vector
vmadc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmadc.vim vd, vs2, imm, v0 # Vector-immediate
vmadc.vv vd, vs2, vs1 # Vector-vector, no carry-in
vmadc.vx vd, vs2, rs1 # Vector-scalar, no carry-in
vmadc.vi vd, vs2, imm # Vector-immediate, no carry-in
vmadc.vvm v1, v4, v8, v0 # Get carry into temp register v1
vsbc.vvm vs2, rs1, vd
The subtract with borrow instruction vsbc performs the equivalent function to support long word arithmetic for subtraction
vsbc.vvm vd, vs2, vs1, v0 # Vector-vector
vsbc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmsbc.vvm vs2, rs1, vd
For vmsbc, the borrow is defined to be 1 iff the difference, prior to truncation, is negative.
vmsbc.vvm vd, vs2, vs1, v0 # Vector-vector
vmsbc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmsbc.vv vd, vs2, vs1 # Vector-vector, no borrow-in
vmsbc.vx vd, vs2, rs1 # Vector-scalar, no borrow-in
vadc.vim vs2, simm5, vd
vadc and vsbc add or subtract the source operands and the carry-in or borrow-in, and write the result to vector register vd. Due to encoding constraints, the carry input must come from the implicit v0 register.
For vadc and vsbc, an illegal instruction exception is raised if the destination vector register is v0 and LMUL > 1.
vadc.vvm vd, vs2, vs1, v0 # Vector-vector
vadc.vxm vd, vs2, rs1, v0 # Vector-scalar
vadc.vim vd, vs2, imm, v0 # Vector-immediate
vadc.vvm v4, v4, v8, v0 # Calc new sum
vmadc.vim vs2, simm5, vd
vmadc and vmsbc add or subtract the source operands, optionally add the carry-in or subtract the borrow-in if masked (vm=0), and write the result back to mask register vd
For vmadc and vmsbc, an illegal instruction exception is raised if the destination vector register overlaps a source vector register group and LMUL > 1.
vmadc.vvm vd, vs2, vs1, v0 # Vector-vector
vmadc.vxm vd, vs2, rs1, v0 # Vector-scalar
vmadc.vim vd, vs2, imm, v0 # Vector-immediate
vmadc.vv vd, vs2, vs1 # Vector-vector, no carry-in
vmadc.vx vd, vs2, rs1 # Vector-scalar, no carry-in
vmadc.vi vd, vs2, imm # Vector-immediate, no carry-in
vmadc.vvm v1, v4, v8, v0 # Get carry into temp register v1
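The pairing of vmadc.vvm (carry out into a temporary) with vadc.vvm (new sum) can be modelled in C as below. This is a sketch for SEW=8 only; the function names are hypothetical, and the mask register is simplified to one byte per element with the carry in its least-significant bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of vmadc.vvm: carry out of vs2[i] + vs1[i] + carry_in[i]. */
void vmadc_vvm_u8(uint8_t *carry_out, const uint8_t *vs2, const uint8_t *vs1,
                  const uint8_t *carry_in, size_t vl)
{
    assert(carry_out && vs2 && vs1 && carry_in);
    for (size_t i = 0; i < vl; i++) {
        unsigned wide = (unsigned)vs2[i] + vs1[i] + (carry_in[i] & 1u);
        carry_out[i] = (uint8_t)(wide >> 8); /* bit 8 is the carry */
    }
}

/* Hypothetical model of vadc.vvm: the truncated sum including the carry-in. */
void vadc_vvm_u8(uint8_t *vd, const uint8_t *vs2, const uint8_t *vs1,
                 const uint8_t *carry_in, size_t vl)
{
    assert(vd && vs2 && vs1 && carry_in);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (uint8_t)((unsigned)vs2[i] + vs1[i] + (carry_in[i] & 1u));
}
```

The carry is computed first because vadc.vvm v4, v4, v8, v0 overwrites one of its own source operands.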

v / _vector_integer_merge_instructions

12. Vector Integer Arithmetic Instructions / 12.15. Vector Integer Merge Instructions
vmerge.vxm vs2, rs1, vd
The vmerge instructions are always masked (vm=0)
vmerge.vvm vd, vs2, vs1, v0 # vd[i] = v0[i].LSB ? vs1[i] : vs2[i]
vmerge.vxm vd, vs2, rs1, v0 # vd[i] = v0[i].LSB ? x[rs1] : vs2[i]
vmerge.vim vd, vs2, imm, v0 # vd[i] = v0[i].LSB ? imm : vs2[i]
vmerge.vvm vs2, rs1, vd
The vmerge instructions are always masked (vm=0)
vmerge.vvm vd, vs2, vs1, v0 # vd[i] = v0[i].LSB ? vs1[i] : vs2[i]
vmerge.vxm vd, vs2, rs1, v0 # vd[i] = v0[i].LSB ? x[rs1] : vs2[i]
vmerge.vim vd, vs2, imm, v0 # vd[i] = v0[i].LSB ? imm : vs2[i]
vmerge.vim vs2, simm5, vd
The vmerge instructions are always masked (vm=0)
vmerge.vvm vd, vs2, vs1, v0 # vd[i] = v0[i].LSB ? vs1[i] : vs2[i]
vmerge.vxm vd, vs2, rs1, v0 # vd[i] = v0[i].LSB ? x[rs1] : vs2[i]
vmerge.vim vd, vs2, imm, v0 # vd[i] = v0[i].LSB ? imm : vs2[i]
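As a rough C equivalent of the vmerge.vvm comment above, assuming SEW=32 and a mask stored one byte per element (a simplification of the real mask layout); `vmerge_vvm_i32` is a name invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of vmerge.vvm: pick vs1[i] where the mask LSB is set,
 * otherwise vs2[i]. Every element below vl is written. */
void vmerge_vvm_i32(int32_t *vd, const int32_t *vs2, const int32_t *vs1,
                    const uint8_t *v0, size_t vl)
{
    assert(vd && vs2 && vs1 && v0);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (v0[i] & 1u) ? vs1[i] : vs2[i];
}
```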

v / _vector_integer_move_instructions

12. Vector Integer Arithmetic Instructions / 12.16. Vector Integer Move Instructions
vmv.v.x rs1, vd
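This instruction copies the vs1, rs1, or immediate operand to the first vl locations of the destination vector register group. Note: mask values can be widened into SEW-width elements using the sequence vmv.v.i vd, 0; vmerge.vim vd, vd, 1, v0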
vmv.v.v vd, vs1 # vd[i] = vs1[i]
vmv.v.x vd, rs1 # vd[i] = rs1
vmv.v.i vd, imm # vd[i] = imm
vmv.v.v rs1, vd
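This instruction copies the vs1, rs1, or immediate operand to the first vl locations of the destination vector register group. Note: mask values can be widened into SEW-width elements using the sequence vmv.v.i vd, 0; vmerge.vim vd, vd, 1, v0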
vmv.v.v vd, vs1 # vd[i] = vs1[i]
vmv.v.x vd, rs1 # vd[i] = rs1
vmv.v.i vd, imm # vd[i] = imm
vmv.v.i simm5, vd
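This instruction copies the vs1, rs1, or immediate operand to the first vl locations of the destination vector register group. Note: mask values can be widened into SEW-width elements using the sequence vmv.v.i vd, 0; vmerge.vim vd, vd, 1, v0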
vmv.v.v vd, vs1 # vd[i] = vs1[i]
vmv.v.x vd, rs1 # vd[i] = rs1
vmv.v.i vd, imm # vd[i] = imm
vmv.s.x rs1, vd
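This instruction copies the vs1, rs1, or immediate operand to the first vl locations of the destination vector register group. Note: mask values can be widened into SEW-width elements using the sequence vmv.v.i vd, 0; vmerge.vim vd, vd, 1, v0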
vmv.v.v vd, vs1 # vd[i] = vs1[i]
vmv.v.x vd, rs1 # vd[i] = rs1
vmv.v.i vd, imm # vd[i] = imm

v / _vector_single_width_saturating_add_and_subtract

13. Vector Fixed-Point Arithmetic Instructions / 13.1. Vector Single-Width Saturating Add and Subtract
vsaddu.vx vs2, rs1, vd
vsaddu.vv vd, vs2, vs1, vm # Vector-vector
vsaddu.vx vd, vs2, rs1, vm # vector-scalar
vsaddu.vi vd, vs2, imm, vm # vector-immediate
vsadd.vx vs2, rs1, vd
vsadd.vv vd, vs2, vs1, vm # Vector-vector
vsadd.vx vd, vs2, rs1, vm # vector-scalar
vsadd.vi vd, vs2, imm, vm # vector-immediate
vssubu.vx vs2, rs1, vd
vssubu.vv vd, vs2, vs1, vm # Vector-vector
vssubu.vx vd, vs2, rs1, vm # vector-scalar
vssub.vx vs2, rs1, vd
vssub.vv vd, vs2, vs1, vm # Vector-vector
vssub.vx vd, vs2, rs1, vm # vector-scalar
vsaddu.vv vs2, rs1, vd
vsaddu.vv vd, vs2, vs1, vm # Vector-vector
vsaddu.vx vd, vs2, rs1, vm # vector-scalar
vsaddu.vi vd, vs2, imm, vm # vector-immediate
vsadd.vv vs2, rs1, vd
vsadd.vv vd, vs2, vs1, vm # Vector-vector
vsadd.vx vd, vs2, rs1, vm # vector-scalar
vsadd.vi vd, vs2, imm, vm # vector-immediate
vssubu.vv vs2, rs1, vd
vssubu.vv vd, vs2, vs1, vm # Vector-vector
vssubu.vx vd, vs2, rs1, vm # vector-scalar
vssub.vv vs2, rs1, vd
vssub.vv vd, vs2, vs1, vm # Vector-vector
vssub.vx vd, vs2, rs1, vm # vector-scalar
vsaddu.vi vs2, simm5, vd
vsaddu.vv vd, vs2, vs1, vm # Vector-vector
vsaddu.vx vd, vs2, rs1, vm # vector-scalar
vsaddu.vi vd, vs2, imm, vm # vector-immediate
vsadd.vi vs2, simm5, vd
vsadd.vv vd, vs2, vs1, vm # Vector-vector
vsadd.vx vd, vs2, rs1, vm # vector-scalar
vsadd.vi vd, vs2, imm, vm # vector-immediate
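Saturating add differs from plain vadd only in what happens on overflow: the result is clamped to the range of the element type instead of wrapping. A per-element sketch for SEW=8, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-element behaviour of vsaddu at SEW=8: clamp to UINT8_MAX. */
uint8_t sat_addu8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Per-element behaviour of vsadd at SEW=8: clamp to [INT8_MIN, INT8_MAX]. */
int8_t sat_add8(int8_t a, int8_t b)
{
    int sum = a + b; /* computed in int, so it cannot overflow here */
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}

/* Hypothetical vector-vector wrapper over the first vl elements. */
void vsaddu_vv_u8(uint8_t *vd, const uint8_t *vs2, const uint8_t *vs1, size_t vl)
{
    assert(vd && vs2 && vs1);
    for (size_t i = 0; i < vl; i++)
        vd[i] = sat_addu8(vs2[i], vs1[i]);
}
```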

v / _vector_single_width_averaging_add_and_subtract

13. Vector Fixed-Point Arithmetic Instructions / 13.2. Vector Single-Width Averaging Add and Subtract
vaadd.vx vs2, rs1, vd
For vaaddu, vaadd, and vasub, there can be no overflow in the result
vaadd.vv vd, vs2, vs1, vm # roundoff_signed(vs2[i] + vs1[i], 1)
vaadd.vx vd, vs2, rs1, vm # roundoff_signed(vs2[i] + x[rs1], 1)
vasub.vx vs2, rs1, vd
vasub.vv vd, vs2, vs1, vm # roundoff_signed(vs2[i] - vs1[i], 1)
vasub.vx vd, vs2, rs1, vm # roundoff_signed(vs2[i] - x[rs1], 1)
vaadd.vv vs2, rs1, vd
For vaaddu, vaadd, and vasub, there can be no overflow in the result
vaadd.vv vd, vs2, vs1, vm # roundoff_signed(vs2[i] + vs1[i], 1)
vaadd.vx vd, vs2, rs1, vm # roundoff_signed(vs2[i] + x[rs1], 1)
vasub.vv vs2, rs1, vd
vasub.vv vd, vs2, vs1, vm # roundoff_signed(vs2[i] - vs1[i], 1)
vasub.vx vd, vs2, rs1, vm # roundoff_signed(vs2[i] - x[rs1], 1)
vaadd.vi vs2, simm5, vd
For vaaddu, vaadd, and vasub, there can be no overflow in the result
vaadd.vv vd, vs2, vs1, vm # roundoff_signed(vs2[i] + vs1[i], 1)
vaadd.vx vd, vs2, rs1, vm # roundoff_signed(vs2[i] + x[rs1], 1)
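The `roundoff_signed(v, 1)` in the comments above shifts right by one bit and adds a rounding increment chosen by the vxrm rounding mode. The sketch below follows the four modes as I understand them from the vector spec (rnu, rne, rdn, rod); the enum and function names are made up for illustration, and it assumes `>>` on a negative value is an arithmetic shift, as it is with GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* vxrm rounding modes, numbered as in the vxrm field encoding. */
enum vxrm { VXRM_RNU = 0, VXRM_RNE = 1, VXRM_RDN = 2, VXRM_ROD = 3 };

/* roundoff_signed(v, d) = (v >> d) + r, where r depends on the rounding mode. */
int64_t roundoff_signed(int64_t v, unsigned d, enum vxrm mode)
{
    assert(d < 63);
    assert((-1 >> 1) == -1); /* assumption: arithmetic right shift */
    if (d == 0)
        return v;
    uint64_t u = (uint64_t)v;
    unsigned bit_d = (unsigned)((u >> d) & 1u);         /* v[d]          */
    unsigned bit_dm1 = (unsigned)((u >> (d - 1)) & 1u); /* v[d-1]        */
    unsigned low_nonzero =
        (u & ((UINT64_C(1) << (d - 1)) - 1)) != 0;      /* v[d-2:0] != 0 */
    unsigned r = 0;
    switch (mode) {
    case VXRM_RNU: r = bit_dm1; break;                          /* nearest, ties up */
    case VXRM_RNE: r = bit_dm1 & (low_nonzero | bit_d); break;  /* nearest, ties even */
    case VXRM_RDN: r = 0; break;                                /* truncate */
    case VXRM_ROD: r = (bit_d ^ 1u) & (bit_dm1 | low_nonzero); break; /* round to odd */
    }
    return (v >> d) + (int64_t)r;
}

/* Per-element behaviour of vaadd at SEW=8: the sum is formed at full
 * precision, so the halved result always fits and cannot overflow. */
int8_t vaadd_i8(int8_t a, int8_t b, enum vxrm mode)
{
    return (int8_t)roundoff_signed((int64_t)a + b, 1, mode);
}
```

For example, averaging 2 and 3 gives 3 under rnu but 2 under rne, since 2.5 is a tie and 2 is even.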

v / _vector_single_width_bit_shift_instructions

12. Vector Integer Arithmetic Instructions / 12.5. Vector Single-Width Bit Shift Instructions
vsll.vx vs2, rs1, vd
vsll.vv vd, vs2, vs1, vm # Vector-vector
vsll.vx vd, vs2, rs1, vm # vector-scalar
vsll.vi vd, vs2, uimm, vm # vector-immediate
vsra.vx vs2, rs1, vd
vsra.vv vd, vs2, vs1, vm # Vector-vector
vsra.vx vd, vs2, rs1, vm # vector-scalar
vsra.vi vd, vs2, uimm, vm # vector-immediate
vsll.vv vs2, rs1, vd
vsll.vv vd, vs2, vs1, vm # Vector-vector
vsll.vx vd, vs2, rs1, vm # vector-scalar
vsll.vi vd, vs2, uimm, vm # vector-immediate
vsra.vv vs2, rs1, vd
vsra.vv vd, vs2, vs1, vm # Vector-vector
vsra.vx vd, vs2, rs1, vm # vector-scalar
vsra.vi vd, vs2, uimm, vm # vector-immediate
vsll.vi vs2, simm5, vd
vsll.vv vd, vs2, vs1, vm # Vector-vector
vsll.vx vd, vs2, rs1, vm # vector-scalar
vsll.vi vd, vs2, uimm, vm # vector-immediate
vsra.vi vs2, simm5, vd
vsra.vv vd, vs2, vs1, vm # Vector-vector
vsra.vx vd, vs2, rs1, vm # vector-scalar
vsra.vi vd, vs2, uimm, vm # vector-immediate

v / _vector_single_width_fractional_multiply_with_rounding_and_saturation

13. Vector Fixed-Point Arithmetic Instructions / 13.3. Vector Single-Width Fractional Multiply with Rounding and Saturation
vsmul.vx vs2, rs1, vd
vsmul.vv vd, vs2, vs1, vm # vd[i] = clip(roundoff_signed(vs2[i]*vs1[i], SEW-1))
vsmul.vx vd, vs2, rs1, vm # vd[i] = clip(roundoff_signed(vs2[i]*x[rs1], SEW-1))
vsmul.vv vs2, rs1, vd
vsmul.vv vd, vs2, vs1, vm # vd[i] = clip(roundoff_signed(vs2[i]*vs1[i], SEW-1))
vsmul.vx vd, vs2, rs1, vm # vd[i] = clip(roundoff_signed(vs2[i]*x[rs1], SEW-1))
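With SEW=16 the expression `clip(roundoff_signed(vs2[i]*vs1[i], SEW-1))` is the familiar Q15 fractional multiply. A per-element sketch, fixed to the rnu rounding mode for brevity (the real instruction takes the mode from vxrm); the function name is hypothetical and arithmetic right shift of negative values is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Per-element behaviour of vsmul at SEW=16 with vxrm = rnu. */
int16_t vsmul_i16_rnu(int16_t a, int16_t b)
{
    assert((-1 >> 1) == -1); /* assumption: arithmetic right shift */
    int32_t prod = (int32_t)a * b; /* full 2*SEW-bit product */
    /* rnu: adding half an LSB before shifting equals (v >> d) + v[d-1]. */
    int32_t rounded = (prod + (1 << 14)) >> 15;
    /* clip(): only -1.0 * -1.0 can exceed the signed 16-bit range. */
    if (rounded > INT16_MAX) return INT16_MAX;
    if (rounded < INT16_MIN) return INT16_MIN;
    return (int16_t)rounded;
}
```

Here 0x4000 represents 0.5, so `vsmul_i16_rnu(0x4000, 0x4000)` yields 0x2000 (0.25).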

v / _vector_single_width_scaling_shift_instructions

13. Vector Fixed-Point Arithmetic Instructions / 13.4. Vector Single-Width Scaling Shift Instructions
vssrl.vx vs2, rs1, vd
The scaling right shifts have both zero-extending (vssrl) and sign-extending (vssra) forms
vssrl.vv vd, vs2, vs1, vm # vd[i] = roundoff_unsigned(vs2[i], vs1[i])
vssrl.vx vd, vs2, rs1, vm # vd[i] = roundoff_unsigned(vs2[i], x[rs1])
vssrl.vi vd, vs2, uimm, vm # vd[i] = roundoff_unsigned(vs2[i], uimm)
vssra.vx vs2, rs1, vd
vssra.vv vd, vs2, vs1, vm # vd[i] = roundoff_signed(vs2[i],vs1[i])
vssra.vx vd, vs2, rs1, vm # vd[i] = roundoff_signed(vs2[i], x[rs1])
vssra.vi vd, vs2, uimm, vm # vd[i] = roundoff_signed(vs2[i], uimm)
vssrl.vv vs2, rs1, vd
The scaling right shifts have both zero-extending (vssrl) and sign-extending (vssra) forms
vssrl.vv vd, vs2, vs1, vm # vd[i] = roundoff_unsigned(vs2[i], vs1[i])
vssrl.vx vd, vs2, rs1, vm # vd[i] = roundoff_unsigned(vs2[i], x[rs1])
vssrl.vi vd, vs2, uimm, vm # vd[i] = roundoff_unsigned(vs2[i], uimm)
vssra.vv vs2, rs1, vd
vssra.vv vd, vs2, vs1, vm # vd[i] = roundoff_signed(vs2[i],vs1[i])
vssra.vx vd, vs2, rs1, vm # vd[i] = roundoff_signed(vs2[i], x[rs1])
vssra.vi vd, vs2, uimm, vm # vd[i] = roundoff_signed(vs2[i], uimm)
vssrl.vi vs2, simm5, vd
The scaling right shifts have both zero-extending (vssrl) and sign-extending (vssra) forms
vssrl.vv vd, vs2, vs1, vm # vd[i] = roundoff_unsigned(vs2[i], vs1[i])
vssrl.vx vd, vs2, rs1, vm # vd[i] = roundoff_unsigned(vs2[i], x[rs1])
vssrl.vi vd, vs2, uimm, vm # vd[i] = roundoff_unsigned(vs2[i], uimm)
vssra.vi vs2, simm5, vd
vssra.vv vd, vs2, vs1, vm # vd[i] = roundoff_signed(vs2[i],vs1[i])
vssra.vx vd, vs2, rs1, vm # vd[i] = roundoff_signed(vs2[i], x[rs1])
vssra.vi vd, vs2, uimm, vm # vd[i] = roundoff_signed(vs2[i], uimm)

v / _vector_narrowing_integer_right_shift_instructions

12. Vector Integer Arithmetic Instructions / 12.6. Vector Narrowing Integer Right Shift Instructions
vnsrl.vx vs2, rs1, vd
vnsrl.wv vd, vs2, vs1, vm # vector-vector
vnsrl.wx vd, vs2, rs1, vm # vector-scalar
vnsrl.wi vd, vs2, uimm, vm # vector-immediate
vnsrl.vv vs2, rs1, vd
vnsrl.wv vd, vs2, vs1, vm # vector-vector
vnsrl.wx vd, vs2, rs1, vm # vector-scalar
vnsrl.wi vd, vs2, uimm, vm # vector-immediate
vnsrl.vi vs2, simm5, vd
vnsrl.wv vd, vs2, vs1, vm # vector-vector
vnsrl.wx vd, vs2, rs1, vm # vector-scalar
vnsrl.wi vd, vs2, uimm, vm # vector-immediate

v / sec-narrowing

11. Vector Arithmetic Instruction Formats / 11.3. Narrowing Vector Arithmetic Instructions
vnsra.vx vs2, rs1, vd
The double-width source vector register group is signified by a w in the source operand suffix (e.g., vnsra.wv)
vnsra.vv vs2, rs1, vd
The double-width source vector register group is signified by a w in the source operand suffix (e.g., vnsra.wv)
vnsra.vi vs2, simm5, vd
The double-width source vector register group is signified by a w in the source operand suffix (e.g., vnsra.wv)

v / _vector_narrowing_fixed_point_clip_instructions

13. Vector Fixed-Point Arithmetic Instructions / 13.5. Vector Narrowing Fixed-Point Clip Instructions
vnclipu.vx vs2, rs1, vd
For vnclipu / vnclip, the rounding mode is specified in the vxrm CSR. For vnclipu, the shifted rounded source value is treated as an unsigned integer and saturates if the result would overflow the destination viewed as an unsigned integer.
vnclipu.wv vd, vs2, vs1, vm # vd[i] = clip(roundoff_unsigned(vs2[i], vs1[i]))
vnclipu.wx vd, vs2, rs1, vm # vd[i] = clip(roundoff_unsigned(vs2[i], x[rs1]))
vnclipu.wi vd, vs2, uimm, vm # vd[i] = clip(roundoff_unsigned(vs2[i], uimm5))
vnclip.vx vs2, rs1, vd
The vnclip instructions are used to pack a fixed-point value into a narrower destination
For vnclip, the shifted rounded source value is treated as a signed integer and saturates if the result would overflow the destination viewed as a signed integer.
vnclip.wv vd, vs2, vs1, vm # vd[i] = clip(roundoff_signed(vs2[i], vs1[i]))
vnclip.wx vd, vs2, rs1, vm # vd[i] = clip(roundoff_signed(vs2[i], x[rs1]))
vnclip.wi vd, vs2, uimm, vm # vd[i] = clip(roundoff_signed(vs2[i], uimm5))
vnclipu.vv vs2, rs1, vd
For vnclipu / vnclip, the rounding mode is specified in the vxrm CSR. For vnclipu, the shifted rounded source value is treated as an unsigned integer and saturates if the result would overflow the destination viewed as an unsigned integer.
vnclipu.wv vd, vs2, vs1, vm # vd[i] = clip(roundoff_unsigned(vs2[i], vs1[i]))
vnclipu.wx vd, vs2, rs1, vm # vd[i] = clip(roundoff_unsigned(vs2[i], x[rs1]))
vnclipu.wi vd, vs2, uimm, vm # vd[i] = clip(roundoff_unsigned(vs2[i], uimm5))
vnclip.vv vs2, rs1, vd
The vnclip instructions are used to pack a fixed-point value into a narrower destination
For vnclip, the shifted rounded source value is treated as a signed integer and saturates if the result would overflow the destination viewed as a signed integer.
vnclip.wv vd, vs2, vs1, vm # vd[i] = clip(roundoff_signed(vs2[i], vs1[i]))
vnclip.wx vd, vs2, rs1, vm # vd[i] = clip(roundoff_signed(vs2[i], x[rs1]))
vnclip.wi vd, vs2, uimm, vm # vd[i] = clip(roundoff_signed(vs2[i], uimm5))
vnclipu.vi vs2, simm5, vd
For vnclipu / vnclip, the rounding mode is specified in the vxrm CSR. For vnclipu, the shifted rounded source value is treated as an unsigned integer and saturates if the result would overflow the destination viewed as an unsigned integer.
vnclipu.wv vd, vs2, vs1, vm # vd[i] = clip(roundoff_unsigned(vs2[i], vs1[i]))
vnclipu.wx vd, vs2, rs1, vm # vd[i] = clip(roundoff_unsigned(vs2[i], x[rs1]))
vnclipu.wi vd, vs2, uimm, vm # vd[i] = clip(roundoff_unsigned(vs2[i], uimm5))
vnclip.vi vs2, simm5, vd
The vnclip instructions are used to pack a fixed-point value into a narrower destination
For vnclip, the shifted rounded source value is treated as a signed integer and saturates if the result would overflow the destination viewed as a signed integer.
vnclip.wv vd, vs2, vs1, vm # vd[i] = clip(roundoff_signed(vs2[i], vs1[i]))
vnclip.wx vd, vs2, rs1, vm # vd[i] = clip(roundoff_signed(vs2[i], x[rs1]))
vnclip.wi vd, vs2, uimm, vm # vd[i] = clip(roundoff_signed(vs2[i], uimm5))
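The narrowing clip can be pictured as "shift right with rounding, then saturate into the narrower type". A per-element sketch of vnclipu for a 16-bit source and 8-bit destination, fixed to the rnu rounding mode; the function name is hypothetical, and masking the shift amount to the low 4 bits reflects my understanding that only lg2(2*SEW) bits of the shift operand are used.

```c
#include <assert.h>
#include <stdint.h>

/* Per-element behaviour of vnclipu (2*SEW=16 -> SEW=8) with vxrm = rnu. */
uint8_t vnclipu_u16_rnu(uint16_t wide, unsigned shift)
{
    shift &= 15u; /* assumption: only the low lg2(2*SEW) bits are used */
    uint32_t rounded = (uint32_t)wide >> shift;
    if (shift > 0)
        rounded += ((uint32_t)wide >> (shift - 1)) & 1u; /* rnu: add v[d-1] */
    assert(rounded <= 0x10000u);
    /* Saturate into the unsigned 8-bit destination. */
    return rounded > UINT8_MAX ? UINT8_MAX : (uint8_t)rounded;
}
```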

v / _vector_widening_integer_reduction_instructions

15. Vector Reduction Operations / 15.2. Vector Widening Integer Reduction Instructions
vwredsumu.vs vs2, rs1, vd
The unsigned vwredsumu.vs instruction zero-extends the SEW-wide vector elements before summing them, then adds the 2*SEW-width scalar element, and stores the result in a 2*SEW-width scalar element.
vwredsumu.vs vd, vs2, vs1, vm # 2*SEW = 2*SEW + sum(zero-extend(SEW))
vwredsum.vs vs2, rs1, vd
The vwredsum.vs instruction sign-extends the SEW-wide vector elements before summing them.
vwredsum.vs vd, vs2, vs1, vm # 2*SEW = 2*SEW + sum(sign-extend(SEW))
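The only difference between the two widening reductions is how each SEW-wide element is extended before it is added to the 2*SEW-wide accumulator. A sketch for SEW=8 with hypothetical function names; the scalar element vs1[0] is passed in as a plain argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of vwredsumu.vs: zero-extend each 8-bit element, sum into 16 bits. */
uint16_t vwredsumu_vs_u8(const uint8_t *vs2, uint16_t vs1_0, size_t vl)
{
    assert(vs2);
    uint16_t acc = vs1_0;
    for (size_t i = 0; i < vl; i++)
        acc = (uint16_t)(acc + vs2[i]); /* wraps at 2*SEW bits */
    return acc;
}

/* Model of vwredsum.vs: sign-extend each 8-bit element, sum into 16 bits. */
int16_t vwredsum_vs_i8(const int8_t *vs2, int16_t vs1_0, size_t vl)
{
    assert(vs2);
    uint16_t acc = (uint16_t)vs1_0;
    for (size_t i = 0; i < vl; i++)
        acc = (uint16_t)(acc + (uint16_t)(int16_t)vs2[i]);
    return (int16_t)acc; /* assumes two's-complement conversion, as on GCC/Clang */
}
```

The byte 0xFF therefore contributes 255 to the unsigned sum but -1 to the signed one.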

v / _vector_integer_dot_product_instruction

19. Divided Element Extension ('Zvediv') / 19.3. Vector Integer Dot-Product Instruction
vdotu.vv vs2, rs1, vd
vdotu.vv vd, vs2, vs1, vm # Vector-vector
vdot.vv vs2, rs1, vd
The integer dot-product reduction vdot.vv performs an element-wise multiplication between the source sub-elements then accumulates the results into the destination vector element
vdot.vv vd, vs2, vs1, vm # Vector-vector
vdot.vv vd, vs2, vs1, vm # vd[i][31:0] += vs2[i][31:0] * vs1[i][31:0]
vdot.vv vd, vs2, vs1, vm # vd[i][31:0] += vs2[i][31:16] * vs1[i][31:16]
vdot.vv vd, vs2, vs1, vm # vd[i][31:0] += vs2[i][31:24] * vs1[i][31:24]

v / _vector_single_width_integer_reduction_instructions

15. Vector Reduction Operations / 15.1. Vector Single-Width Integer Reduction Instructions
vredsum.vs vs2, vs1, vd
vredsum.vs vd, vs2, vs1, vm # vd[0] = sum( vs1[0] , vs2[*] )
vredand.vs vs2, vs1, vd
vredand.vs vd, vs2, vs1, vm # vd[0] = and( vs1[0] , vs2[*] )
vredor.vs vs2, vs1, vd
vredor.vs vd, vs2, vs1, vm # vd[0] = or( vs1[0] , vs2[*] )
vredxor.vs vs2, vs1, vd
vredxor.vs vd, vs2, vs1, vm # vd[0] = xor( vs1[0] , vs2[*] )
vredminu.vs vs2, vs1, vd
vredminu.vs vd, vs2, vs1, vm # vd[0] = minu( vs1[0] , vs2[*] )
vredmin.vs vs2, vs1, vd
vredmin.vs vd, vs2, vs1, vm # vd[0] = min( vs1[0] , vs2[*] )
vredmaxu.vs vs2, vs1, vd
vredmaxu.vs vd, vs2, vs1, vm # vd[0] = maxu( vs1[0] , vs2[*] )
vredmax.vs vs2, vs1, vd
vredmax.vs vd, vs2, vs1, vm # vd[0] = max( vs1[0] , vs2[*] )

v / _vector_compress_instruction

17. Vector Permutation Instructions / 17.5. Vector Compress Instruction
vcompress.vm vs2, vs1, vd
vcompress is encoded as an unmasked instruction (vm=1)
A trap on a vcompress instruction is always reported with a vstart of 0. Executing a vcompress instruction with a non-zero vstart raises an illegal instruction exception.
Note: vcompress is one of the more difficult instructions to restart with a non-zero vstart, so the assumption is that implementations will choose not to do that and will instead restart from element 0. This does mean that elements in the destination register after vstart will already have been updated
vcompress.vm vd, vs2, vs1 # Compress into vd elements of vs2 where vs1 is enabled
Example use of vcompress instruction
vcompress.vm v2, v1, v0
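In scalar terms vcompress packs the enabled elements of vs2 contiguously at the start of vd. The C sketch below assumes SEW=8 and a mask stored one byte per element; the function name and the returned count are additions for illustration (the instruction itself returns nothing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of vcompress.vm: copy vs2[i] for every enabled mask
 * element, in order, to the next free slot of vd. Returns how many were packed. */
size_t vcompress_vm_u8(uint8_t *vd, const uint8_t *vs2, const uint8_t *vs1_mask,
                       size_t vl)
{
    assert(vd != vs2); /* destination may not overlap the source */
    size_t out = 0;
    for (size_t i = 0; i < vl; i++) {
        if (vs1_mask[i] & 1u)
            vd[out++] = vs2[i];
    }
    return out; /* elements of vd from index out onwards are left unchanged here */
}
```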

v / _vector_integer_comparison_instructions

12. Vector Integer Arithmetic Instructions / 12.7. Vector Integer Comparison Instructions
vmandnot.mm vs2, vs1, vd
expansion: vmslt{u}.vx vt, va, x; vmandnot.mm vd, vd, vt
vmxor.mm vs2, vs1, vd
expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
vmnand.mm vs2, vs1, vd
expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd

v / _vector_floating_point_compare_instructions

14. Vector Floating-Point Instructions / 14.11. Vector Floating-Point Compare Instructions
vmand.mm vs2, vs1, vd
Note: in the ordered-comparison sequence, it is tempting to mask the second vmfeq instruction and remove the vmand instruction, but this more efficient sequence incorrectly fails to raise the invalid exception when an element of va contains a quiet NaN and the corresponding element in vb contains a signaling NaN
vmand.mm v0, v0, v1 # Only set where A and B are ordered,

v / sec-mask-register-logical

16. Vector Mask Instructions / 16.1. Vector Mask-Register Logical Instructions
vmor.mm vs2, vs1, vd
vmor.mm vd, vs2, vs1 # vd[i] = vs2[i].LSB || vs1[i].LSB
vmornot.mm vs2, vs1, vd
vmornot.mm vd, src2, src1
vmornot.mm vd, src1, src2
vmornot.mm vd, vs2, vs1 # vd[i] = vs2[i].LSB || !vs1[i].LSB
vmnor.mm vs2, vs1, vd
vmnor.mm vd, src1, src2
vmnor.mm vd, vs2, vs1 # vd[i] = !(vs2[i].LSB || vs1[i].LSB)
vmxnor.mm vs2, vs1, vd
vmxnor.mm vd, src1, src2
vmxnor.mm vd, vd, vd
vmxnor.mm vd, vs2, vs1 # vd[i] = !(vs2[i].LSB ^^ vs1[i].LSB)
vmset.m vd => vmxnor.mm vd, vd, vd # Set mask register

v / __code_vfirst_code_find_first_set_mask_bit

16. Vector Mask Instructions / find-first-set mask bit
vmsbf.m vs2, vd
vmsbf.m vd, vs2, vm # set-before-first mask bit

v / __code_vmsif_m_code_set_including_first_mask_bit

16. Vector Mask Instructions / set-including-first mask bit
vmsof.m vs2, vd
vmsof.m vd, vs2, vm # set-only-first mask bit

v / __code_vmsbf_m_code_set_before_first_mask_bit

16. Vector Mask Instructions / set-before-first mask bit
vmsif.m vs2, vd
vmsif.m vd, vs2, vm # set-including-first mask bit
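The three mask instructions vmsbf.m (set-before-first), vmsif.m (set-including-first) and vmsof.m (set-only-first) differ only in how they treat the first set bit of the source mask. A scalar C sketch of the unmasked forms, with one byte per mask element and hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* vmsbf.m: 1 for every element before the first set bit, 0 from there on. */
void vmsbf_m(uint8_t *vd, const uint8_t *vs2, size_t vl)
{
    assert(vd != vs2); /* destination may not overlap the source */
    int seen = 0;
    for (size_t i = 0; i < vl; i++) {
        if (vs2[i] & 1u) seen = 1;
        vd[i] = (uint8_t)!seen;
    }
}

/* vmsif.m: like vmsbf.m, but the first set bit itself is also 1. */
void vmsif_m(uint8_t *vd, const uint8_t *vs2, size_t vl)
{
    assert(vd != vs2);
    int seen = 0;
    for (size_t i = 0; i < vl; i++) {
        vd[i] = (uint8_t)!seen;
        if (vs2[i] & 1u) seen = 1;
    }
}

/* vmsof.m: 1 only at the position of the first set bit. */
void vmsof_m(uint8_t *vd, const uint8_t *vs2, size_t vl)
{
    assert(vd != vs2);
    int seen = 0;
    for (size_t i = 0; i < vl; i++) {
        int bit = vs2[i] & 1u;
        vd[i] = (uint8_t)(!seen && bit);
        if (bit) seen = 1;
    }
}
```

For a source mask of 0 0 1 0 1 (element 0 first) the results are 1 1 0 0 0, 1 1 1 0 0 and 0 0 1 0 0 respectively.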

v / _vector_iota_instruction

16. Vector Mask Instructions / 16.8. Vector Iota Instruction
viota.m vs2, vd
The viota.m instruction reads a source vector mask register and writes to each element of the destination vector register group the sum of all the least-significant bits of elements in the mask register whose index is less than the element, i.e., a parallel prefix sum of the mask values.
Traps on viota.m are always reported with a vstart of 0, and execution is always restarted from the beginning when resuming after a trap handler
The viota.m instruction can be combined with memory scatter instructions (indexed stores) to perform vector compress functions.
viota.m vd, vs2, vm
viota.m v4, v2 # Unmasked
viota.m v4, v2, v0.t # Masked
viota.m v16, v0 # Get destination offsets of active elements
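The prefix sum described above is exclusive: element i receives the count of set mask bits strictly below index i. A scalar sketch of the unmasked form, assuming SEW=32 and one byte per mask element; `viota_m_u32` is a name chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of unmasked viota.m: exclusive prefix sum of mask LSBs. */
void viota_m_u32(uint32_t *vd, const uint8_t *vs2_mask, size_t vl)
{
    assert(vd && vs2_mask);
    uint32_t count = 0;
    for (size_t i = 0; i < vl; i++) {
        vd[i] = count;            /* set bits seen so far, excluding element i */
        count += vs2_mask[i] & 1u;
    }
}
```

Used as scatter offsets, these values send each active element to the next free slot, which is how viota.m plus an indexed store gives a compress.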

v / _vector_element_index_instruction

16. Vector Mask Instructions / 16.9. Vector Element Index Instruction
vid.v vs2, vd
The vid.v instruction writes each element’s index to the destination vector register group, from 0 to vl-1.
Note: microarchitectures can implement the vid.v instruction using the same datapath as viota.m but with an implicit set mask source
vid.v vd, vm # Write element ID to destination.

v / _vector_integer_divide_instructions

12. Vector Integer Arithmetic Instructions / 12.10. Vector Integer Divide Instructions
vdivu.vv vs2, vs1, vd
vdivu.vv vd, vs2, vs1, vm # Vector-vector
vdivu.vx vd, vs2, rs1, vm # vector-scalar
vdiv.vv vs2, vs1, vd
vdiv.vv vd, vs2, vs1, vm # Vector-vector
vdiv.vx vd, vs2, rs1, vm # vector-scalar
vremu.vv vs2, vs1, vd
vremu.vv vd, vs2, vs1, vm # Vector-vector
vremu.vx vd, vs2, rs1, vm # vector-scalar
vrem.vv vs2, vs1, vd
vrem.vv vd, vs2, vs1, vm # Vector-vector
vrem.vx vd, vs2, rs1, vm # vector-scalar
vdivu.vx vs2, rs1, vd
vdivu.vv vd, vs2, vs1, vm # Vector-vector
vdivu.vx vd, vs2, rs1, vm # vector-scalar
vdiv.vx vs2, rs1, vd
vdiv.vv vd, vs2, vs1, vm # Vector-vector
vdiv.vx vd, vs2, rs1, vm # vector-scalar
vremu.vx vs2, rs1, vd
vremu.vv vd, vs2, vs1, vm # Vector-vector
vremu.vx vd, vs2, rs1, vm # vector-scalar
vrem.vx vs2, rs1, vd
vrem.vv vd, vs2, vs1, vm # Vector-vector
vrem.vx vd, vs2, rs1, vm # vector-scalar

v / _vector_single_width_integer_multiply_instructions

12. Vector Integer Arithmetic Instructions / 12.9. Vector Single-Width Integer Multiply Instructions
vmulhu.vv vs2, vs1, vd
vmulhu.vv vd, vs2, vs1, vm # Vector-vector
vmulhu.vx vd, vs2, rs1, vm # vector-scalar
vmulhsu.vv vs2, vs1, vd
vmulhsu.vv vd, vs2, vs1, vm # Vector-vector
vmulhsu.vx vd, vs2, rs1, vm # vector-scalar
vmulh.vv vs2, vs1, vd
Note: the vmulh* opcodes perform simple fractional multiplies, but with no option to scale, round, and/or saturate the result
Can consider changing definition of vmulh, vmulhu, vmulhsu to use vxrm rounding mode when discarding low half of product
vmulh.vv vd, vs2, vs1, vm # Vector-vector
vmulh.vx vd, vs2, rs1, vm # vector-scalar
vmulhu.vx vs2, rs1, vd
vmulhu.vv vd, vs2, vs1, vm # Vector-vector
vmulhu.vx vd, vs2, rs1, vm # vector-scalar
vmulhsu.vx vs2, rs1, vd
vmulhsu.vv vd, vs2, vs1, vm # Vector-vector
vmulhsu.vx vd, vs2, rs1, vm # vector-scalar
vmulh.vx vs2, rs1, vd
Note: the vmulh* opcodes perform simple fractional multiplies, but with no option to scale, round, and/or saturate the result
Can consider changing definition of vmulh, vmulhu, vmulhsu to use vxrm rounding mode when discarding low half of product
vmulh.vv vd, vs2, vs1, vm # Vector-vector
vmulh.vx vd, vs2, rs1, vm # vector-scalar
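Each vmulh variant returns the upper SEW bits of the 2*SEW-bit product; they differ only in the signedness of the operands. A per-element sketch for SEW=32 with hypothetical function names, assuming arithmetic right shift for negative values and that vmulhsu treats vs2 as signed and the other operand as unsigned (my reading of the spec).

```c
#include <assert.h>
#include <stdint.h>

/* vmulhu: unsigned * unsigned, high half. */
uint32_t vmulhu32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* vmulh: signed * signed, high half. */
int32_t vmulh32(int32_t a, int32_t b)
{
    assert((-1 >> 1) == -1); /* assumption: arithmetic right shift */
    return (int32_t)(((int64_t)a * b) >> 32);
}

/* vmulhsu: signed (vs2) * unsigned (vs1 or rs1), high half. */
int32_t vmulhsu32(int32_t a, uint32_t b)
{
    assert((-1 >> 1) == -1);
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```

The low half is simply truncated, which is the lack of rounding the note above refers to.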

v / _vector_single_width_integer_multiply_add_instructions

12. Vector Integer Arithmetic Instructions / 12.12. Vector Single-Width Integer Multiply-Add Instructions
vmadd.vv vs2, vs1, vd
vmadd.vv vd, vs1, vs2, vm # vd[i] = (vs1[i] * vd[i]) + vs2[i]
vmadd.vx vd, rs1, vs2, vm # vd[i] = (x[rs1] * vd[i]) + vs2[i]
vnmsub.vv vs2, vs1, vd
Similarly for the "vnmsub" opcode
vnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i]
vnmsub.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vd[i]) + vs2[i]
vmacc.vv vs2, vs1, vd
The integer multiply-add instructions are destructive and are provided in two forms, one that overwrites the addend or minuend (vmacc, vnmsac) and one that overwrites the first multiplicand (vmadd, vnmsub).
vmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vmacc.vx vd, rs1, vs2, vm # vd[i] = +(x[rs1] * vs2[i]) + vd[i]
vnmsac.vv vs2, vs1, vd
vnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i]
vnmsac.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vs2[i]) + vd[i]
vmadd.vx vs2, rs1, vd
vmadd.vv vd, vs1, vs2, vm # vd[i] = (vs1[i] * vd[i]) + vs2[i]
vmadd.vx vd, rs1, vs2, vm # vd[i] = (x[rs1] * vd[i]) + vs2[i]
vnmsub.vx vs2, rs1, vd
Similarly for the "vnmsub" opcode
vnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i]
vnmsub.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vd[i]) + vs2[i]
vmacc.vx vs2, rs1, vd
The integer multiply-add instructions are destructive and are provided in two forms, one that overwrites the addend or minuend (vmacc, vnmsac) and one that overwrites the first multiplicand (vmadd, vnmsub).
vmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vmacc.vx vd, rs1, vs2, vm # vd[i] = +(x[rs1] * vs2[i]) + vd[i]
vnmsac.vx vs2, rs1, vd
vnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i]
vnmsac.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vs2[i]) + vd[i]
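The two destructive forms differ in which role the old value of vd plays. A scalar sketch for SEW=32 contrasting vmacc.vv and vmadd.vv; the function names are hypothetical and `uint32_t` is used so that overflow wraps as it does in hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* vmacc.vv: vd is the addend. vd[i] = (vs1[i] * vs2[i]) + vd[i] */
void vmacc_vv_u32(uint32_t *vd, const uint32_t *vs1, const uint32_t *vs2, size_t vl)
{
    assert(vd && vs1 && vs2);
    for (size_t i = 0; i < vl; i++)
        vd[i] = vs1[i] * vs2[i] + vd[i];
}

/* vmadd.vv: vd is a multiplicand. vd[i] = (vs1[i] * vd[i]) + vs2[i] */
void vmadd_vv_u32(uint32_t *vd, const uint32_t *vs1, const uint32_t *vs2, size_t vl)
{
    assert(vd && vs1 && vs2);
    for (size_t i = 0; i < vl; i++)
        vd[i] = vs1[i] * vd[i] + vs2[i];
}
```

vmacc suits accumulation loops such as dot products, while vmadd suits cases where the running value is scaled and then offset.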

v / _vector_widening_integer_add_subtract

12. Vector Integer Arithmetic Instructions / 12.2. Vector Widening Integer Add/Subtract
vwaddu.vv vs2, vs1, vd
vwaddu.vv vd, vs2, vs1, vm # vector-vector
vwaddu.vx vd, vs2, rs1, vm # vector-scalar
vwaddu.wv vd, vs2, vs1, vm # vector-vector
vwaddu.wx vd, vs2, rs1, vm # vector-scalar
vwadd.vv vs2, vs1, vd
Can define assembly pseudoinstructions vwcvt.x.x.v vd,vs,vm = vwadd.vx vd,vs,x0,vm and vwcvtu.x.x.v vd,vs,vm = vwaddu.vx vd,vs,x0,vm
vwadd.vv vd, vs2, vs1, vm # vector-vector
vwadd.vx vd, vs2, rs1, vm # vector-scalar
vwadd.wv vd, vs2, vs1, vm # vector-vector
vwadd.wx vd, vs2, rs1, vm # vector-scalar
vwsubu.vv vs2, vs1, vd
vwsubu.vv vd, vs2, vs1, vm # vector-vector
vwsubu.vx vd, vs2, rs1, vm # vector-scalar
vwsubu.wv vd, vs2, vs1, vm # vector-vector
vwsubu.wx vd, vs2, rs1, vm # vector-scalar
vwsub.vv vs2, vs1, vd
vwsub.vv vd, vs2, vs1, vm # vector-vector
vwsub.vx vd, vs2, rs1, vm # vector-scalar
vwsub.wv vd, vs2, vs1, vm # vector-vector
vwsub.wx vd, vs2, rs1, vm # vector-scalar
vwaddu.wv vs2, vs1, vd
vwaddu.vv vd, vs2, vs1, vm # vector-vector
vwaddu.vx vd, vs2, rs1, vm # vector-scalar
vwaddu.wv vd, vs2, vs1, vm # vector-vector
vwaddu.wx vd, vs2, rs1, vm # vector-scalar
vwadd.wv vs2, vs1, vd
Assemblers can define the pseudoinstructions vwcvt.x.x.v vd,vs,vm = vwadd.vx vd,vs,x0,vm and vwcvtu.x.x.v vd,vs,vm = vwaddu.vx vd,vs,x0,vm, which double the width of an integer value.
vwadd.vv vd, vs2, vs1, vm # vector-vector
vwadd.vx vd, vs2, rs1, vm # vector-scalar
vwadd.wv vd, vs2, vs1, vm # vector-vector
vwadd.wx vd, vs2, rs1, vm # vector-scalar
vwsubu.wv vs2, vs1, vd
vwsubu.vv vd, vs2, vs1, vm # vector-vector
vwsubu.vx vd, vs2, rs1, vm # vector-scalar
vwsubu.wv vd, vs2, vs1, vm # vector-vector
vwsubu.wx vd, vs2, rs1, vm # vector-scalar
vwsub.wv vs2, vs1, vd
vwsub.vv vd, vs2, vs1, vm # vector-vector
vwsub.vx vd, vs2, rs1, vm # vector-scalar
vwsub.wv vd, vs2, vs1, vm # vector-vector
vwsub.wx vd, vs2, rs1, vm # vector-scalar
vwaddu.vx vs2, rs1, vd
vwaddu.vv vd, vs2, vs1, vm # vector-vector
vwaddu.vx vd, vs2, rs1, vm # vector-scalar
vwaddu.wv vd, vs2, vs1, vm # vector-vector
vwaddu.wx vd, vs2, rs1, vm # vector-scalar
vwadd.vx vs2, rs1, vd
Assemblers can define the pseudoinstructions vwcvt.x.x.v vd,vs,vm = vwadd.vx vd,vs,x0,vm and vwcvtu.x.x.v vd,vs,vm = vwaddu.vx vd,vs,x0,vm, which double the width of an integer value.
vwadd.vv vd, vs2, vs1, vm # vector-vector
vwadd.vx vd, vs2, rs1, vm # vector-scalar
vwadd.wv vd, vs2, vs1, vm # vector-vector
vwadd.wx vd, vs2, rs1, vm # vector-scalar
vwsubu.vx vs2, rs1, vd
vwsubu.vv vd, vs2, vs1, vm # vector-vector
vwsubu.vx vd, vs2, rs1, vm # vector-scalar
vwsubu.wv vd, vs2, vs1, vm # vector-vector
vwsubu.wx vd, vs2, rs1, vm # vector-scalar
vwsub.vx vs2, rs1, vd
vwsub.vv vd, vs2, vs1, vm # vector-vector
vwsub.vx vd, vs2, rs1, vm # vector-scalar
vwsub.wv vd, vs2, vs1, vm # vector-vector
vwsub.wx vd, vs2, rs1, vm # vector-scalar
vwaddu.wx vs2, rs1, vd
vwaddu.vv vd, vs2, vs1, vm # vector-vector
vwaddu.vx vd, vs2, rs1, vm # vector-scalar
vwaddu.wv vd, vs2, vs1, vm # vector-vector
vwaddu.wx vd, vs2, rs1, vm # vector-scalar
vwadd.wx vs2, rs1, vd
Assemblers can define the pseudoinstructions vwcvt.x.x.v vd,vs,vm = vwadd.vx vd,vs,x0,vm and vwcvtu.x.x.v vd,vs,vm = vwaddu.vx vd,vs,x0,vm, which double the width of an integer value.
vwadd.vv vd, vs2, vs1, vm # vector-vector
vwadd.vx vd, vs2, rs1, vm # vector-scalar
vwadd.wv vd, vs2, vs1, vm # vector-vector
vwadd.wx vd, vs2, rs1, vm # vector-scalar
vwsubu.wx vs2, rs1, vd
vwsubu.vv vd, vs2, vs1, vm # vector-vector
vwsubu.vx vd, vs2, rs1, vm # vector-scalar
vwsubu.wv vd, vs2, vs1, vm # vector-vector
vwsubu.wx vd, vs2, rs1, vm # vector-scalar
vwsub.wx vs2, rs1, vd
vwsub.vv vd, vs2, vs1, vm # vector-vector
vwsub.vx vd, vs2, rs1, vm # vector-scalar
vwsub.wv vd, vs2, vs1, vm # vector-vector
vwsub.wx vd, vs2, rs1, vm # vector-scalar
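
The .vv/.vx forms widen both SEW-wide sources to 2*SEW, while the .wv/.wx forms take a vs2 that is already 2*SEW wide. A rough scalar sketch in C, assuming SEW=8 and no masking; the function names are hypothetical and only a few representative forms are modelled:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar models with SEW=8 (so 2*SEW=16), unmasked. */

/* vwaddu.vv: 2*SEW = zero-extend(SEW) + zero-extend(SEW) */
static void vwaddu_vv(uint16_t *vd, const uint8_t *vs2, const uint8_t *vs1, size_t vl)
{
    assert(vd != NULL && vs2 != NULL && vs1 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (uint16_t)(vs2[i] + vs1[i]);    /* max 510, never overflows */
}

/* vwsub.vv: 2*SEW = sign-extend(SEW) - sign-extend(SEW) */
static void vwsub_vv(int16_t *vd, const int8_t *vs2, const int8_t *vs1, size_t vl)
{
    assert(vd != NULL && vs2 != NULL && vs1 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (int16_t)(vs2[i] - vs1[i]);     /* range -255..255 fits in 16 bits */
}

/* vwaddu.wv: 2*SEW = 2*SEW + zero-extend(SEW); the result wraps modulo 2^16 */
static void vwaddu_wv(uint16_t *vd, const uint16_t *vs2, const uint8_t *vs1, size_t vl)
{
    assert(vd != NULL && vs2 != NULL && vs1 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (uint16_t)(vs2[i] + vs1[i]);
}

/* vwcvt.x.x.v vd,vs = vwadd.vx vd,vs,x0: adding zero only sign-extends */
static void vwcvt_x_x_v(int16_t *vd, const int8_t *vs, size_t vl)
{
    assert(vd != NULL && vs != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (int16_t)(vs[i] + 0);
}
```

For example, vwaddu_vv on the 8-bit inputs 200 and 100 yields the 16-bit value 300, which a single-width add could not represent.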

v / _vector_widening_integer_multiply_instructions

12. Vector Integer Arithmetic Instructions / 12.11. Vector Widening Integer Multiply Instructions
vwmulu.vv vs2, vs1, vd
vwmulu.vv vd, vs2, vs1, vm # vector-vector
vwmulu.vx vd, vs2, rs1, vm # vector-scalar
vwmulsu.vv vs2, vs1, vd
vwmulsu.vv vd, vs2, vs1, vm # vector-vector
vwmulsu.vx vd, vs2, rs1, vm # vector-scalar
vwmulu.vx vs2, rs1, vd
vwmulu.vv vd, vs2, vs1, vm # vector-vector
vwmulu.vx vd, vs2, rs1, vm # vector-scalar
vwmulsu.vx vs2, rs1, vd
vwmulsu.vv vd, vs2, vs1, vm # vector-vector
vwmulsu.vx vd, vs2, rs1, vm # vector-scalar
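
The listing above does not say which operand of vwmulsu is signed. The sketch below assumes vs2 is the signed operand and vs1 (or rs1) the unsigned one, which is my reading of the V specification and should be checked against it. It is a scalar C model for SEW=8, unmasked, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar models with SEW=8 (so 2*SEW=16), unmasked. */

/* vwmulu.vv: 2*SEW = unsigned(SEW) * unsigned(SEW) */
static void vwmulu_vv(uint16_t *vd, const uint8_t *vs2, const uint8_t *vs1, size_t vl)
{
    assert(vd != NULL && vs2 != NULL && vs1 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (uint16_t)((unsigned)vs2[i] * (unsigned)vs1[i]); /* max 65025 */
}

/* vwmulsu.vv: 2*SEW = signed(vs2) * unsigned(vs1)
   ASSUMPTION: operand signedness as described in the lead-in. */
static void vwmulsu_vv(int16_t *vd, const int8_t *vs2, const uint8_t *vs1, size_t vl)
{
    assert(vd != NULL && vs2 != NULL && vs1 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (int16_t)((int)vs2[i] * (int)vs1[i]); /* range -32640..32385 */
}
```

The widened product always fits in 2*SEW bits, so neither model needs a wrap-around case.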

v / _vector_widening_integer_multiply_add_instructions

12. Vector Integer Arithmetic Instructions / 12.13. Vector Widening Integer Multiply-Add Instructions
vwmaccu.vv vs2, vs1, vd
vwmaccu.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vwmaccu.vx vd, rs1, vs2, vm # vd[i] = +(x[rs1] * vs2[i]) + vd[i]
vwmacc.vv vs2, vs1, vd
vwmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vwmacc.vx vd, rs1, vs2, vm # vd[i] = +(x[rs1] * vs2[i]) + vd[i]
vwmaccsu.vv vs2, vs1, vd
vwmaccsu.vv vd, vs1, vs2, vm # vd[i] = +(signed(vs1[i]) * unsigned(vs2[i])) + vd[i]
vwmaccsu.vx vd, rs1, vs2, vm # vd[i] = +(signed(x[rs1]) * unsigned(vs2[i])) + vd[i]
vwmaccu.vx vs2, rs1, vd
vwmaccu.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vwmaccu.vx vd, rs1, vs2, vm # vd[i] = +(x[rs1] * vs2[i]) + vd[i]
vwmacc.vx vs2, rs1, vd
vwmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i]
vwmacc.vx vd, rs1, vs2, vm # vd[i] = +(x[rs1] * vs2[i]) + vd[i]
vwmaccsu.vx vs2, rs1, vd
vwmaccsu.vv vd, vs1, vs2, vm # vd[i] = +(signed(vs1[i]) * unsigned(vs2[i])) + vd[i]
vwmaccsu.vx vd, rs1, vs2, vm # vd[i] = +(signed(x[rs1]) * unsigned(vs2[i])) + vd[i]
vwmaccus.vx vs2, rs1, vd
vwmaccus.vx vd, rs1, vs2, vm # vd[i] = +(unsigned(x[rs1]) * signed(vs2[i])) + vd[i]
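
vwmaccsu and vwmaccus swap which multiplicand is treated as signed, so the same bit patterns can accumulate different values. A small scalar sketch in C of the two .vx forms, assuming SEW=8 sources, 16-bit accumulators and no masking (function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar models: SEW=8 sources, 2*SEW=16 accumulators, unmasked.
   The final cast to int16_t is assumed to wrap on overflow, which holds on
   common two's-complement compilers but is implementation-defined in C. */

/* vwmaccsu.vx: vd[i] = +(signed(x[rs1]) * unsigned(vs2[i])) + vd[i] */
static void vwmaccsu_vx(int16_t *vd, int8_t x_rs1, const uint8_t *vs2, size_t vl)
{
    assert(vd != NULL && vs2 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (int16_t)(vd[i] + (int)x_rs1 * (int)vs2[i]);
}

/* vwmaccus.vx: vd[i] = +(unsigned(x[rs1]) * signed(vs2[i])) + vd[i] */
static void vwmaccus_vx(int16_t *vd, uint8_t x_rs1, const int8_t *vs2, size_t vl)
{
    assert(vd != NULL && vs2 != NULL);
    for (size_t i = 0; i < vl; i++)
        vd[i] = (int16_t)(vd[i] + (int)x_rs1 * (int)vs2[i]);
}
```

With a scalar bit pattern of 0xFF and an element of 2, vwmaccsu accumulates -2 (scalar read as -1) whereas vwmaccus accumulates 510 (scalar read as 255).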

v / _vector_slide1up

17. Vector Permutation Instructions / 17.3. Vector Slide Instructions
vslide1up.vx vs2, rs1, vd
The vslide1up instruction places the x register argument at location 0 of the destination vector register group, provided that element 0 is active; otherwise the destination element is unchanged.
The vslide1up instruction requires that the destination vector register group does not overlap the source vector register group or the mask register.
vslide1up.vx vd, vs2, rs1, vm # vd[0]=x[rs1], vd[i+1] = vs2[i]
vslide1up behavior
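
The behavior above can be sketched as a scalar C model. This is an illustration under assumptions: SEW=32, inactive elements are left unchanged as the text describes, and a NULL mask stands for the unmasked form. The name vslide1up_vx and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar model of vslide1up.vx for SEW=32.
   mask == NULL means every element is active. */
static void vslide1up_vx(uint32_t *vd, const uint32_t *vs2, uint32_t x_rs1,
                         const bool *mask, size_t vl)
{
    assert(vd != NULL && vs2 != NULL);
    assert(vd != vs2);              /* destination must not overlap the source */
    for (size_t i = 0; i < vl; i++) {
        if (mask != NULL && !mask[i])
            continue;               /* inactive element: left unchanged */
        vd[i] = (i == 0) ? x_rs1 : vs2[i - 1];
    }
}
```

Sliding {10, 20, 30, 40} up with x[rs1] = 99 gives {99, 10, 20, 30}; if element 0 is masked off, vd[0] keeps its old value.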

v / _vector_slide1down_instruction

17. Vector Permutation Instructions / 17.3. Vector Slide Instructions
vslide1down.vx vs2, rs1, vd
The vslide1down instruction copies the first vl-1 active element values from index i+1 in the source vector register group to index i in the destination vector register group.
The vslide1down instruction places the x register argument at location vl-1 in the destination vector register, provided that element vl-1 is active; otherwise the destination element is unchanged.
Note: the vslide1down instruction can be used to load values into a vector register without using memory and without disturbing other vector registers.
This provides a path for debuggers to modify the contents of a vector register, albeit slowly, with multiple repeated vslide1down invocations.
vslide1down.vx vd, vs2, rs1, vm # vd[i] = vs2[i+1], vd[vl-1]=x[rs1]
vslide1down behavior
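
The debugger use case mentioned above amounts to calling vslide1down once per element with the same register as source and destination. A scalar C sketch, assuming SEW=32 and the unmasked form only; vslide1down_vx and load_by_sliding are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar model of unmasked vslide1down.vx for SEW=32.
   vd may alias vs2: each vd[i] reads vs2[i+1] before it is overwritten. */
static void vslide1down_vx(uint32_t *vd, const uint32_t *vs2, uint32_t x_rs1,
                           size_t vl)
{
    assert(vd != NULL && vs2 != NULL);
    if (vl == 0)
        return;
    for (size_t i = 0; i + 1 < vl; i++)
        vd[i] = vs2[i + 1];
    vd[vl - 1] = x_rs1;
}

/* Fill a vector register without touching memory: after vl slides,
   values[0] has moved down to element 0 and values[vl-1] sits at vl-1. */
static void load_by_sliding(uint32_t *vreg, const uint32_t *values, size_t vl)
{
    assert(vreg != NULL && values != NULL);
    for (size_t k = 0; k < vl; k++)
        vslide1down_vx(vreg, vreg, values[k], vl);
}
```

This costs vl instructions per register, which matches the "albeit slowly" caveat in the text.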

custom

custom /

@custom0 rd, rs1, imm12
@custom0.rs1 rd, rs1, imm12
@custom0.rs1.rs2 rd, rs1, imm12
@custom0.rd rd, rs1, imm12
@custom0.rd.rs1 rd, rs1, imm12
@custom0.rd.rs1.rs2 rd, rs1, imm12
@custom1 rd, rs1, imm12
@custom1.rs1 rd, rs1, imm12
@custom1.rs1.rs2 rd, rs1, imm12
@custom1.rd rd, rs1, imm12
@custom1.rd.rs1 rd, rs1, imm12
@custom1.rd.rs1.rs2 rd, rs1, imm12
@custom2 rd, rs1, imm12
@custom2.rs1 rd, rs1, imm12
@custom2.rs1.rs2 rd, rs1, imm12
@custom2.rd rd, rs1, imm12
@custom2.rd.rs1 rd, rs1, imm12
@custom2.rd.rs1.rs2 rd, rs1, imm12
@custom3 rd, rs1, imm12
@custom3.rs1 rd, rs1, imm12
@custom3.rs1.rs2 rd, rs1, imm12
@custom3.rd rd, rs1, imm12
@custom3.rd.rs1 rd, rs1, imm12
@custom3.rd.rs1.rs2 rd, rs1, imm12

csr

csr / csr-instructions

"Zicsr", Control and Status Register (CSR) Instructions, Version 2.0 / CSR Instructions
csrrw rd, rs1, imm12
The CSRRW (Atomic Read/Write CSR) instruction atomically swaps values in the CSRs and integer registers
CSRRW reads the old value of the CSR, zero-extends the value to XLEN bits, then writes it to integer register rd
A CSRRW with rs1 = x0 will attempt to write zero to the destination CSR.
The assembler pseudoinstruction to write a CSR, CSRW csr, rs1 , is encoded as CSRRW x0, csr, rs1 , while CSRWI csr, uimm , is encoded as CSRRWI x0, csr, uimm .
csrrs rd, rs1, imm12
The CSRRS (Atomic Read and Set Bits in CSR) instruction reads the value of the CSR, zero-extends the value to XLEN bits, and writes it to integer register rd
For both CSRRS and CSRRC, if rs1 = x0 , then the instruction will not write to the CSR at all, and so shall not cause any of the side effects that might otherwise occur on a CSR write, such as raising illegal instruction exceptions on accesses to read-only CSRs
Both CSRRS and CSRRC always read the addressed CSR and cause any read side effects regardless of rs1 and rd fields
The CSRRS and CSRRC instructions have the same behavior, so are shown as CSRR
The assembler pseudoinstruction to read a CSR, CSRR rd, csr , is encoded as CSRRS rd, csr, x0
csrrc rd, rs1, imm12
The CSRRC (Atomic Read and Clear Bits in CSR) instruction reads the value of the CSR, zero-extends the value to XLEN bits, and writes it to integer register rd
csrrwi rd, rs1, imm12
The CSRRWI, CSRRSI, and CSRRCI variants are similar to CSRRW, CSRRS, and CSRRC respectively, except they update the CSR using an XLEN-bit value obtained by zero-extending a 5-bit unsigned immediate (uimm[4:0]) field encoded in the rs1 field instead of a value from an integer register
For CSRRWI, if rd = x0 , then the instruction shall not read the CSR and shall not cause any of the side effects that might occur on a CSR read
csrrsi rd, rs1, imm12
For CSRRSI and CSRRCI, if the uimm[4:0] field is zero, then these instructions will not write to the CSR, and shall not cause any of the side effects that might otherwise occur on a CSR write
Both CSRRSI and CSRRCI will always read the CSR and cause any read side effects regardless of rd and rs1 fields.
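
The read and write side-effect rules above can be condensed into a small decision function. The sketch below is a C model, not an encoding of any real API; csr_op, csr_access and csr_side_effects are hypothetical names. It also assumes that CSRRW with rd = x0 skips the CSR read in the same way the text states for CSRRWI, which is my reading of the Zicsr specification rather than something quoted on this page.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CSRRW, CSRRS, CSRRC, CSRRWI, CSRRSI, CSRRCI } csr_op;
typedef struct { bool reads; bool writes; } csr_access;

/* rd is the destination register number (0 = x0).
   rs1_or_uimm is the 5-bit rs1 field: a register number for the register
   forms, or the uimm[4:0] immediate for the *I forms. */
static csr_access csr_side_effects(csr_op op, unsigned rd, unsigned rs1_or_uimm)
{
    assert(rd < 32 && rs1_or_uimm < 32);
    csr_access a;
    switch (op) {
    case CSRRW:                     /* ASSUMPTION: same read rule as CSRRWI */
    case CSRRWI:
        a.reads  = (rd != 0);       /* rd = x0: no read, no read side effects */
        a.writes = true;            /* always writes, even when rs1 = x0 */
        break;
    default:                        /* CSRRS, CSRRC, CSRRSI, CSRRCI */
        a.reads  = true;            /* always reads */
        a.writes = (rs1_or_uimm != 0); /* rs1 = x0 or uimm = 0: no write */
        break;
    }
    return a;
}
```

For instance, the CSRR pseudoinstruction (CSRRS rd, csr, x0) reads without writing, while CSRW (CSRRW x0, csr, rs1) writes without reading.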

supervisor

sstatus sec:satp

supervisor / sstatus

Supervisor-Level ISA, Version 1.12 / Supervisor CSRs
sret
When an SRET instruction (see Section 
When a trap is taken into supervisor mode, SPIE is set to SIE, and SIE is set to 0. When an SRET instruction is executed, SIE is set to SPIE, then SPIE is set to 1.
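
The SIE/SPIE handoff described above behaves like a one-entry stack. A minimal C sketch, modelling only these two sstatus bits; the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Only the two interrupt-enable bits of sstatus are modelled here. */
typedef struct { bool sie; bool spie; } sstatus_bits;

/* Trap taken into supervisor mode: SPIE <- SIE, then SIE <- 0. */
static void trap_to_s_mode(sstatus_bits *s)
{
    assert(s != NULL);
    s->spie = s->sie;
    s->sie = false;
}

/* SRET executed: SIE <- SPIE, then SPIE <- 1. */
static void exec_sret(sstatus_bits *s)
{
    assert(s != NULL);
    s->sie = s->spie;
    s->spie = true;
}
```

Starting from SIE=1, a trap followed by SRET restores SIE=1, and interrupts stay disabled in between.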

supervisor / sec:satp

Supervisor-Level ISA, Version 1.12 / Supervisor CSRs
sfence.vma rs1, rs2
If the new address space’s page tables have been modified, or if an ASID is reused, it may be necessary to execute an SFENCE.VMA instruction (see Section 

hypervisor

sec:hinterruptregs sec:tinst vals hypervisor status register hstatus wfi in virtual operating modes sec:hgatp

hypervisor / sec:hinterruptregs

Hypervisor Extension, Version 0.5 / Hypervisor and Virtual Supervisor CSRs
or rd, rs1, rs2
VS-level external interrupts are made pending based on the logical-OR of:
When hip is read with a CSR instruction, the value of the VSEIP bit returned in the rd destination register is the logical-OR of all the sources listed above

hypervisor / sec:tinst-vals

Hypervisor Extension, Version 0.5 / Traps
sb imm12hi, rs1, rs2, imm12lo
For a standard store instruction that is not a compressed instruction and is one of SB, SH, SW, SD, FSW, FSD, or FSQ, the transformed instruction has the format shown in Figure 
Transformed noncompressed store instruction (SB, SH, SW, SD, FSW, FSD, or FSQ)

hypervisor / hypervisor-status-register-hstatus

Hypervisor Extension, Version 0.5 / Hypervisor and Virtual Supervisor CSRs
mret
An MRET or SRET instruction that changes the operating mode to U-mode, VS-mode, or VU-mode also sets SPRV=0.

hypervisor / wfi-in-virtual-operating-modes

Hypervisor Extension, Version 0.5 / WFI in Virtual Operating Modes
wfi
Executing instruction WFI when V=1 causes an illegal instruction exception, unless it completes within an implementation-specific, bounded time limit.
The behavior required of WFI in VS-mode and VU-mode is the same as required of it in U-mode when S-mode exists.

hypervisor / sec:hgatp

Hypervisor Extension, Version 0.5 / Hypervisor and Virtual Supervisor CSRs
hfence.gvma rs1, rs2
If the new virtual machine’s guest physical page tables have been modified, it may be necessary to execute an HFENCE.GVMA instruction (see Section