GcnInstrsMubuf - CLRX/CLRX-mirror GitHub Wiki
GCN ISA MUBUF instructions
These instructions access main memory. MUBUF instructions operate on buffer resources. A buffer resource is 4 dwords that hold the base address, the buffer size, the buffer structure and the format of the data. These instructions are untyped: they take the number/data format from the resource, or the format is determined by the operation (the data format is not encoded in the instruction).
List of fields for the MUBUF encoding (GCN 1.0/1.1):
Bits | Name | Description |
---|---|---|
0-11 | OFFSET | Unsigned byte offset |
12 | OFFEN | If set, send additional offset from VADDR |
13 | IDXEN | If set, send index from VADDR |
14 | GLC | Operation globally coherent |
15 | ADDR64 | If set, address is 64-bit (VADDR is 64-bit) |
16 | LDS | Data is read from or written to LDS, otherwise from VGPR |
18-24 | OPCODE | Operation code |
26-31 | ENCODING | Encoding type. Must be 0b111000 |
32-39 | VADDR | Vector address registers |
40-47 | VDATA | Vector data register |
48-52 | SRSRC | Scalar registers with buffer resource (SGPR# is 4*value) |
54 | SLC | System level coherent |
55 | TFE | Texture Fail Enable ??? |
56-63 | SOFFSET | Scalar base offset operand |
List of fields for the MUBUF encoding (GCN 1.2/1.4):
Bits | Name | Description |
---|---|---|
0-11 | OFFSET | Unsigned byte offset |
12 | OFFEN | If set, send additional offset from VADDR |
13 | IDXEN | If set, send index from VADDR |
14 | GLC | Operation globally coherent |
16 | LDS | Data is read from or written to LDS, otherwise from VGPR |
17 | SLC | System level coherent |
18-24 | OPCODE | Operation code |
26-31 | ENCODING | Encoding type. Must be 0b111000 |
32-39 | VADDR | Vector address registers |
40-47 | VDATA | Vector data register |
48-52 | SRSRC | Scalar registers with buffer resource (SGPR# is 4*value) |
55 | TFE | Texture Fail Enable ??? |
56-63 | SOFFSET | Scalar base offset operand |
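The two field tables above can be read as a bit-slicing recipe. The Python sketch below is not part of CLRX; `decode_mubuf` and its `gcn12` parameter are names invented for illustration. It assumes that bits 0-31 come from the first instruction dword and bits 32-63 from the second.

```python
def decode_mubuf(dword0, dword1, gcn12=False):
    """Split the two MUBUF instruction dwords into named fields.

    gcn12=False uses the GCN 1.0/1.1 layout, gcn12=True the GCN 1.2/1.4 one.
    Assumption: dword0 holds bits 0-31 and dword1 holds bits 32-63.
    """
    def bits(value, lo, hi):
        # Extract the inclusive bit range lo..hi.
        return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

    fields = {
        "OFFSET": bits(dword0, 0, 11),
        "OFFEN": bits(dword0, 12, 12),
        "IDXEN": bits(dword0, 13, 13),
        "GLC": bits(dword0, 14, 14),
        "LDS": bits(dword0, 16, 16),
        "OPCODE": bits(dword0, 18, 24),
        "ENCODING": bits(dword0, 26, 31),
        "VADDR": bits(dword1, 0, 7),      # bits 32-39
        "VDATA": bits(dword1, 8, 15),     # bits 40-47
        "SRSRC": bits(dword1, 16, 20),    # bits 48-52, SGPR# is 4*value
        "TFE": bits(dword1, 23, 23),      # bit 55
        "SOFFSET": bits(dword1, 24, 31),  # bits 56-63
    }
    if gcn12:
        fields["SLC"] = bits(dword0, 17, 17)
    else:
        fields["ADDR64"] = bits(dword0, 15, 15)
        fields["SLC"] = bits(dword1, 22, 22)  # bit 54
    return fields
```

With the GCN 1.0/1.1 layout, `decode_mubuf(0xE0301010, 0x80010201)` gives OPCODE 12, OFFEN set and OFFSET 16.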
Instruction syntax: INSTRUCTION VDATA, [VADDR(1:2),] SRSRC(4), SOFFSET [MODIFIERS]
Modifiers can be supplied in any order. Modifiers list: OFFEN, IDXEN, SLC, GLC, TFE, ADDR64, LDS, OFFSET:OFFSET. The TFE flag requires an additional VDATA register. Enabling both IDXEN and OFFEN requires a 64-bit VADDR. VADDR is optional if none of IDXEN, OFFEN and ADDR64 is set.
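The register-count rules for VADDR and VDATA can be sketched as two small Python helpers. The function names are hypothetical, and the sketch assumes that a 64-bit VADDR occupies two consecutive VGPRs.

```python
def vaddr_register_count(offen=False, idxen=False, addr64=False):
    """Number of VGPRs needed for VADDR under the rules above."""
    if addr64 or (offen and idxen):
        return 2  # 64-bit VADDR (assumed to be two VGPRs)
    if offen or idxen:
        return 1
    return 0      # VADDR may be omitted


def vdata_register_count(base_count, tfe=False):
    """VDATA size in VGPRs; TFE requires one additional register."""
    return base_count + 1 if tfe else base_count
```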
The MUBUF instructions are executed in order. Any MUBUF instruction increments VMCNT, and decrements it after the memory operation. Any memory-write operation also increments EXPCNT, and decrements it after the data has been read from VDATA.
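A toy Python model of this counter bookkeeping may help when reasoning about where to wait. The class and method names are invented for illustration, and the hardware is assumed to behave exactly as the paragraph above states.

```python
class WaitCounters:
    """Toy model of VMCNT/EXPCNT bookkeeping for MUBUF instructions."""

    def __init__(self):
        self.vmcnt = 0
        self.expcnt = 0

    def issue(self, is_write):
        # Every MUBUF instruction increments VMCNT; writes also bump EXPCNT.
        self.vmcnt += 1
        if is_write:
            self.expcnt += 1

    def vdata_read(self):
        # The write data has been read from VDATA.
        self.expcnt -= 1

    def memory_done(self):
        # The memory operation has finished.
        self.vmcnt -= 1
```

In this model, a load followed by a store leaves VMCNT at 2 and EXPCNT at 1 until the corresponding completion events arrive.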
Instructions by opcode
List of the MUBUF instructions by opcode (GCN 1.0/1.1):
Opcode | GCN 1.0 | GCN 1.1 | Mnemonic |
---|---|---|---|
0 (0x0) | ✓ | ✓ | BUFFER_LOAD_FORMAT_X |
1 (0x1) | ✓ | ✓ | BUFFER_LOAD_FORMAT_XY |
2 (0x2) | ✓ | ✓ | BUFFER_LOAD_FORMAT_XYZ |
3 (0x3) | ✓ | ✓ | BUFFER_LOAD_FORMAT_XYZW |
4 (0x4) | ✓ | ✓ | BUFFER_STORE_FORMAT_X |
5 (0x5) | ✓ | ✓ | BUFFER_STORE_FORMAT_XY |
6 (0x6) | ✓ | ✓ | BUFFER_STORE_FORMAT_XYZ |
7 (0x7) | ✓ | ✓ | BUFFER_STORE_FORMAT_XYZW |
8 (0x8) | ✓ | ✓ | BUFFER_LOAD_UBYTE |
9 (0x9) | ✓ | ✓ | BUFFER_LOAD_SBYTE |
10 (0xa) | ✓ | ✓ | BUFFER_LOAD_USHORT |
11 (0xb) | ✓ | ✓ | BUFFER_LOAD_SSHORT |
12 (0xc) | ✓ | ✓ | BUFFER_LOAD_DWORD |
13 (0xd) | ✓ | ✓ | BUFFER_LOAD_DWORDX2 |
14 (0xe) | ✓ | ✓ | BUFFER_LOAD_DWORDX4 |
15 (0xf) | | ✓ | BUFFER_LOAD_DWORDX3 |
24 (0x18) | ✓ | ✓ | BUFFER_STORE_BYTE |
26 (0x1a) | ✓ | ✓ | BUFFER_STORE_SHORT |
28 (0x1c) | ✓ | ✓ | BUFFER_STORE_DWORD |
29 (0x1d) | ✓ | ✓ | BUFFER_STORE_DWORDX2 |
30 (0x1e) | ✓ | ✓ | BUFFER_STORE_DWORDX4 |
31 (0x1f) | | ✓ | BUFFER_STORE_DWORDX3 |
48 (0x30) | ✓ | ✓ | BUFFER_ATOMIC_SWAP |
49 (0x31) | ✓ | ✓ | BUFFER_ATOMIC_CMPSWAP |
50 (0x32) | ✓ | ✓ | BUFFER_ATOMIC_ADD |
51 (0x33) | ✓ | ✓ | BUFFER_ATOMIC_SUB |
52 (0x34) | ✓ | | BUFFER_ATOMIC_RSUB |
53 (0x35) | ✓ | ✓ | BUFFER_ATOMIC_SMIN |
54 (0x36) | ✓ | ✓ | BUFFER_ATOMIC_UMIN |
55 (0x37) | ✓ | ✓ | BUFFER_ATOMIC_SMAX |
56 (0x38) | ✓ | ✓ | BUFFER_ATOMIC_UMAX |
57 (0x39) | ✓ | ✓ | BUFFER_ATOMIC_AND |
58 (0x3a) | ✓ | ✓ | BUFFER_ATOMIC_OR |
59 (0x3b) | ✓ | ✓ | BUFFER_ATOMIC_XOR |
60 (0x3c) | ✓ | ✓ | BUFFER_ATOMIC_INC |
61 (0x3d) | ✓ | ✓ | BUFFER_ATOMIC_DEC |
62 (0x3e) | ✓ | ✓ | BUFFER_ATOMIC_FCMPSWAP |
63 (0x3f) | ✓ | ✓ | BUFFER_ATOMIC_FMIN |
64 (0x40) | ✓ | ✓ | BUFFER_ATOMIC_FMAX |
80 (0x50) | ✓ | ✓ | BUFFER_ATOMIC_SWAP_X2 |
81 (0x51) | ✓ | ✓ | BUFFER_ATOMIC_CMPSWAP_X2 |
82 (0x52) | ✓ | ✓ | BUFFER_ATOMIC_ADD_X2 |
83 (0x53) | ✓ | ✓ | BUFFER_ATOMIC_SUB_X2 |
84 (0x54) | ✓ | | BUFFER_ATOMIC_RSUB_X2 |
85 (0x55) | ✓ | ✓ | BUFFER_ATOMIC_SMIN_X2 |
86 (0x56) | ✓ | ✓ | BUFFER_ATOMIC_UMIN_X2 |
87 (0x57) | ✓ | ✓ | BUFFER_ATOMIC_SMAX_X2 |
88 (0x58) | ✓ | ✓ | BUFFER_ATOMIC_UMAX_X2 |
89 (0x59) | ✓ | ✓ | BUFFER_ATOMIC_AND_X2 |
90 (0x5a) | ✓ | ✓ | BUFFER_ATOMIC_OR_X2 |
91 (0x5b) | ✓ | ✓ | BUFFER_ATOMIC_XOR_X2 |
92 (0x5c) | ✓ | ✓ | BUFFER_ATOMIC_INC_X2 |
93 (0x5d) | ✓ | ✓ | BUFFER_ATOMIC_DEC_X2 |
94 (0x5e) | ✓ | ✓ | BUFFER_ATOMIC_FCMPSWAP_X2 |
95 (0x5f) | ✓ | ✓ | BUFFER_ATOMIC_FMIN_X2 |
96 (0x60) | ✓ | ✓ | BUFFER_ATOMIC_FMAX_X2 |
112 (0x70) | ✓ | ✓ | BUFFER_WBINVL1_SC |
113 (0x71) | ✓ | ✓ | BUFFER_WBINVL1 |
List of the MUBUF instructions by opcode (GCN 1.2/1.4):
Opcode | GCN 1.2 | GCN 1.4 | Mnemonic |
---|---|---|---|
0 (0x0) | ✓ | ✓ | BUFFER_LOAD_FORMAT_X |
1 (0x1) | ✓ | ✓ | BUFFER_LOAD_FORMAT_XY |
2 (0x2) | ✓ | ✓ | BUFFER_LOAD_FORMAT_XYZ |
3 (0x3) | ✓ | ✓ | BUFFER_LOAD_FORMAT_XYZW |
4 (0x4) | ✓ | ✓ | BUFFER_STORE_FORMAT_X |
5 (0x5) | ✓ | ✓ | BUFFER_STORE_FORMAT_XY |
6 (0x6) | ✓ | ✓ | BUFFER_STORE_FORMAT_XYZ |
7 (0x7) | ✓ | ✓ | BUFFER_STORE_FORMAT_XYZW |
8 (0x8) | ✓ | ✓ | BUFFER_LOAD_FORMAT_D16_X |
9 (0x9) | ✓ | ✓ | BUFFER_LOAD_FORMAT_D16_XY |
10 (0xa) | ✓ | ✓ | BUFFER_LOAD_FORMAT_D16_XYZ |
11 (0xb) | ✓ | ✓ | BUFFER_LOAD_FORMAT_D16_XYZW |
12 (0xc) | ✓ | ✓ | BUFFER_STORE_FORMAT_D16_X |
13 (0xd) | ✓ | ✓ | BUFFER_STORE_FORMAT_D16_XY |
14 (0xe) | ✓ | ✓ | BUFFER_STORE_FORMAT_D16_XYZ |
15 (0xf) | ✓ | ✓ | BUFFER_STORE_FORMAT_D16_XYZW |
16 (0x10) | ✓ | ✓ | BUFFER_LOAD_UBYTE |
17 (0x11) | ✓ | ✓ | BUFFER_LOAD_SBYTE |
18 (0x12) | ✓ | ✓ | BUFFER_LOAD_USHORT |
19 (0x13) | ✓ | ✓ | BUFFER_LOAD_SSHORT |
20 (0x14) | ✓ | ✓ | BUFFER_LOAD_DWORD |
21 (0x15) | ✓ | ✓ | BUFFER_LOAD_DWORDX2 |
22 (0x16) | ✓ | ✓ | BUFFER_LOAD_DWORDX3 |
23 (0x17) | ✓ | ✓ | BUFFER_LOAD_DWORDX4 |
24 (0x18) | ✓ | ✓ | BUFFER_STORE_BYTE |
25 (0x19) | | ✓ | BUFFER_STORE_BYTE_D16 |
26 (0x1a) | ✓ | ✓ | BUFFER_STORE_SHORT |
27 (0x1b) | | ✓ | BUFFER_STORE_SHORT_D16 |
28 (0x1c) | ✓ | ✓ | BUFFER_STORE_DWORD |
29 (0x1d) | ✓ | ✓ | BUFFER_STORE_DWORDX2 |
30 (0x1e) | ✓ | ✓ | BUFFER_STORE_DWORDX3 |
31 (0x1f) | ✓ | ✓ | BUFFER_STORE_DWORDX4 |
32 (0x20) | | ✓ | BUFFER_LOAD_UBYTE_D16 |
33 (0x21) | | ✓ | BUFFER_LOAD_UBYTE_D16_HI |
34 (0x22) | | ✓ | BUFFER_LOAD_SBYTE_D16 |
35 (0x23) | | ✓ | BUFFER_LOAD_SBYTE_D16_HI |
36 (0x24) | | ✓ | BUFFER_LOAD_SHORT_D16 |
37 (0x25) | | ✓ | BUFFER_LOAD_SHORT_D16_HI |
38 (0x26) | | ✓ | BUFFER_LOAD_FORMAT_D16_HI_X |
39 (0x27) | | ✓ | BUFFER_STORE_FORMAT_D16_HI_X |
61 (0x3d) | ✓ | ✓ | BUFFER_STORE_LDS_DWORD |
62 (0x3e) | ✓ | ✓ | BUFFER_WBINVL1 |
63 (0x3f) | ✓ | ✓ | BUFFER_WBINVL1_VOL |
64 (0x40) | ✓ | ✓ | BUFFER_ATOMIC_SWAP |
65 (0x41) | ✓ | ✓ | BUFFER_ATOMIC_CMPSWAP |
66 (0x42) | ✓ | ✓ | BUFFER_ATOMIC_ADD |
67 (0x43) | ✓ | ✓ | BUFFER_ATOMIC_SUB |
68 (0x44) | ✓ | ✓ | BUFFER_ATOMIC_SMIN |
69 (0x45) | ✓ | ✓ | BUFFER_ATOMIC_UMIN |
70 (0x46) | ✓ | ✓ | BUFFER_ATOMIC_SMAX |
71 (0x47) | ✓ | ✓ | BUFFER_ATOMIC_UMAX |
72 (0x48) | ✓ | ✓ | BUFFER_ATOMIC_AND |
73 (0x49) | ✓ | ✓ | BUFFER_ATOMIC_OR |
74 (0x4a) | ✓ | ✓ | BUFFER_ATOMIC_XOR |
75 (0x4b) | ✓ | ✓ | BUFFER_ATOMIC_INC |
76 (0x4c) | ✓ | ✓ | BUFFER_ATOMIC_DEC |
96 (0x60) | ✓ | ✓ | BUFFER_ATOMIC_SWAP_X2 |
97 (0x61) | ✓ | ✓ | BUFFER_ATOMIC_CMPSWAP_X2 |
98 (0x62) | ✓ | ✓ | BUFFER_ATOMIC_ADD_X2 |
99 (0x63) | ✓ | ✓ | BUFFER_ATOMIC_SUB_X2 |
100 (0x64) | ✓ | ✓ | BUFFER_ATOMIC_SMIN_X2 |
101 (0x65) | ✓ | ✓ | BUFFER_ATOMIC_UMIN_X2 |
102 (0x66) | ✓ | ✓ | BUFFER_ATOMIC_SMAX_X2 |
103 (0x67) | ✓ | ✓ | BUFFER_ATOMIC_UMAX_X2 |
104 (0x68) | ✓ | ✓ | BUFFER_ATOMIC_AND_X2 |
105 (0x69) | ✓ | ✓ | BUFFER_ATOMIC_OR_X2 |
106 (0x6a) | ✓ | ✓ | BUFFER_ATOMIC_XOR_X2 |
107 (0x6b) | ✓ | ✓ | BUFFER_ATOMIC_INC_X2 |
108 (0x6c) | ✓ | ✓ | BUFFER_ATOMIC_DEC_X2 |
Details
Information about addressing and format conversion is here: Main memory handling
Instruction set
Alphabetically sorted instruction list:
BUFFER_ATOMIC_ADD
Opcode: 50 (0x32) for GCN 1.0/1.1; 66 (0x42) for GCN 1.2
Syntax: BUFFER_ATOMIC_ADD VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Add VDATA to value of SRSRC resource, and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = *VM + VDATA; VDATA = (GLC) ? P : VDATA // atomic
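The read-modify-write pattern and the role of GLC can be sketched in Python. This is only a model: `memory` stands in for the dword that VMEM resolves to, the function name is made up, atomicity is not modelled, and the addition is assumed to wrap modulo 2^32 as UINT32 arithmetic does.

```python
MASK32 = 0xFFFFFFFF


def buffer_atomic_add(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_ADD on a list of 32-bit dwords.

    Returns the value VDATA holds afterwards.
    """
    previous = memory[index]
    memory[index] = (previous + vdata) & MASK32  # assumed UINT32 wrap-around
    return previous if glc else vdata
```

The same shape applies to the other two-operand atomics in this list. Only the expression that computes the stored value changes.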
BUFFER_ATOMIC_ADD_X2
Opcode: 82 (0x52) for GCN 1.0/1.1; 98 (0x62) for GCN 1.2
Syntax: BUFFER_ATOMIC_ADD_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Add 64-bit VDATA to 64-bit value of SRSRC resource, and store result
to this resource. If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = *VM + VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_AND
Opcode: 57 (0x39) for GCN 1.0/1.1; 72 (0x48) for GCN 1.2
Syntax: BUFFER_ATOMIC_AND VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Do bitwise AND on VDATA and value of SRSRC resource,
and store result to this resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = *VM & VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_AND_X2
Opcode: 89 (0x59) for GCN 1.0/1.1; 104 (0x68) for GCN 1.2
Syntax: BUFFER_ATOMIC_AND_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Do 64-bit bitwise AND on VDATA and value of SRSRC resource,
and store result to this resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = *VM & VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_CMPSWAP
Opcode: 49 (0x31) for GCN 1.0/1.1; 65 (0x41) for GCN 1.2
Syntax: BUFFER_ATOMIC_CMPSWAP VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store lower VDATA dword into SRSRC resource if previous value
from resource is equal to VDATA>>32, otherwise keep old value from resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = *VM==(VDATA>>32) ? VDATA&0xffffffff : *VM // part of atomic
VDATA[0] = (GLC) ? P : VDATA[0] // last part of atomic
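The packing of the compare and swap values into the 64-bit VDATA is easy to get backwards, so here is a small Python model of it. The function name is hypothetical, VDATA is passed as one 64-bit integer, and atomicity is not modelled.

```python
MASK32 = 0xFFFFFFFF


def buffer_atomic_cmpswap(memory, index, vdata64, glc):
    """Model of BUFFER_ATOMIC_CMPSWAP; returns the resulting 64-bit VDATA."""
    compare = (vdata64 >> 32) & MASK32  # upper dword: value to compare with
    new_value = vdata64 & MASK32        # lower dword: value to store
    previous = memory[index]
    if previous == compare:
        memory[index] = new_value
    if glc:
        # Only VDATA[0] (the lower dword) receives the previous value.
        return (compare << 32) | previous
    return vdata64
```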
BUFFER_ATOMIC_CMPSWAP_X2
Opcode: 81 (0x51) for GCN 1.0/1.1; 97 (0x61) for GCN 1.2
Syntax: BUFFER_ATOMIC_CMPSWAP_X2 VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store lower VDATA 64-bit word into SRSRC resource if previous value
from resource is equal to VDATA>>64, otherwise keep old value from resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = *VM==(VDATA[2:3]) ? VDATA[0:1] : *VM // part of atomic
VDATA[0:1] = (GLC) ? P : VDATA[0:1] // last part of atomic
BUFFER_ATOMIC_DEC
Opcode: 61 (0x3d) for GCN 1.0/1.1; 76 (0x4c) for GCN 1.2
Syntax: BUFFER_ATOMIC_DEC VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Compare value from SRSRC resource and if less than or equal to VDATA
and this value is not zero, then decrement value from resource,
otherwise store VDATA to resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = (*VM <= VDATA && *VM!=0) ? *VM-1 : VDATA // atomic
VDATA = (GLC) ? P : VDATA // atomic
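The wrapping behaviour of the decrement is easiest to see by stepping a counter. The Python sketch below follows the operation above; the function name is made up and atomicity is not modelled.

```python
def buffer_atomic_dec(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_DEC; returns the resulting VDATA."""
    previous = memory[index]
    if previous != 0 and previous <= vdata:
        memory[index] = previous - 1
    else:
        memory[index] = vdata  # wrap back to the limit held in VDATA
    return previous if glc else vdata
```

Starting from 0 with VDATA equal to 3, successive calls leave 3, 2, 1, 0 and then 3 again in memory.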
BUFFER_ATOMIC_DEC_X2
Opcode: 93 (0x5d) for GCN 1.0/1.1; 108 (0x6c) for GCN 1.2
Syntax: BUFFER_ATOMIC_DEC_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Compare 64-bit value from SRSRC resource and if less than or equal to VDATA
and this value is not zero, then decrement value from resource,
otherwise store VDATA to resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = (*VM <= VDATA && *VM!=0) ? *VM-1 : VDATA // atomic
VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_FCMPSWAP
Opcode: 62 (0x3e) for GCN 1.0/1.1
Syntax: BUFFER_ATOMIC_FCMPSWAP VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store lower VDATA dword into SRSRC resource if previous single floating point
value from resource is equal to single floating point value VDATA>>32,
otherwise keep old value from resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
FLOAT* VM = (FLOAT*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
FLOAT P = *VM; *VM = *VM==ASFLOAT(VDATA>>32) ? VDATA&0xffffffff : *VM // part of atomic
VDATA[0] = (GLC) ? P : VDATA[0] // last part of atomic
BUFFER_ATOMIC_FCMPSWAP_X2
Opcode: 94 (0x5e) for GCN 1.0/1.1
Syntax: BUFFER_ATOMIC_FCMPSWAP_X2 VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store lower VDATA 64-bit word into SRSRC resource if previous double
floating point value from resource is equal to double floating point value VDATA>>64,
otherwise keep old value from resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
DOUBLE* VM = (DOUBLE*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
DOUBLE P = *VM; *VM = *VM==ASDOUBLE(VDATA[2:3]) ? VDATA[0:1] : *VM // part of atomic
VDATA[0:1] = (GLC) ? P : VDATA[0:1] // last part of atomic
BUFFER_ATOMIC_FMAX
Opcode: 64 (0x40) for GCN 1.0/1.1
Syntax: BUFFER_ATOMIC_FMAX VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose greatest single floating point value from VDATA and from
SRSRC resource, and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
FLOAT* VM = (FLOAT*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = MAX(*VM, ASFLOAT(VDATA)); VDATA = (GLC) ? P : VDATA // atomic
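Because the comparison is done on floating point values rather than on raw bits, the result can differ from an unsigned integer maximum. The Python sketch below illustrates this with the standard `struct` module. The helper names are invented, and NaN and denormal handling are not modelled.

```python
import struct


def as_float(bits):
    """Reinterpret a 32-bit pattern as a single precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def as_bits(value):
    """Reinterpret a single precision float as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def buffer_atomic_fmax(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_FMAX on a list of 32-bit patterns."""
    previous = memory[index]
    memory[index] = as_bits(max(as_float(previous), as_float(vdata)))
    return previous if glc else vdata
```

For instance, -2.0 has the bit pattern 0xC0000000, which is larger than that of 1.5 (0x3FC00000) as an unsigned integer, yet the floating point maximum keeps 1.5.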
BUFFER_ATOMIC_FMAX_X2
Opcode: 96 (0x60) for GCN 1.0/1.1
Syntax: BUFFER_ATOMIC_FMAX_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose greatest double floating point value from VDATA and from
SRSRC resource, and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
DOUBLE* VM = (DOUBLE*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = MAX(*VM, ASDOUBLE(VDATA)); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_FMIN
Opcode: 63 (0x3f) for GCN 1.0/1.1
Syntax: BUFFER_ATOMIC_FMIN VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose smallest single floating point value from VDATA and from
SRSRC resource, and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
FLOAT* VM = (FLOAT*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = MIN(*VM, ASFLOAT(VDATA)); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_FMIN_X2
Opcode: 95 (0x5f) for GCN 1.0/1.1
Syntax: BUFFER_ATOMIC_FMIN_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose smallest double floating point value from VDATA and from
SRSRC resource, and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
DOUBLE* VM = (DOUBLE*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = MIN(*VM, ASDOUBLE(VDATA)); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_INC
Opcode: 60 (0x3c) for GCN 1.0/1.1; 75 (0x4b) for GCN 1.2
Syntax: BUFFER_ATOMIC_INC VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Compare value from SRSRC resource and if less than VDATA,
then increment value from resource, otherwise store zero to resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = (*VM < VDATA) ? *VM+1 : 0; VDATA = (GLC) ? P : VDATA // atomic
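The increment wraps to zero once the stored value reaches the limit in VDATA. A short Python model of the operation above, with a made-up function name and no modelling of atomicity:

```python
def buffer_atomic_inc(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_INC; returns the resulting VDATA."""
    previous = memory[index]
    memory[index] = previous + 1 if previous < vdata else 0
    return previous if glc else vdata
```

With VDATA equal to 2, a location that starts at 0 cycles through 1, 2, 0, 1 and so on.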
BUFFER_ATOMIC_INC_X2
Opcode: 92 (0x5c) for GCN 1.0/1.1; 107 (0x6b) for GCN 1.2
Syntax: BUFFER_ATOMIC_INC_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Compare 64-bit value from SRSRC resource and if less than VDATA,
then increment value from resource, otherwise store zero to resource.
If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = (*VM < VDATA) ? *VM+1 : 0; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_OR
Opcode: 58 (0x3a) for GCN 1.0/1.1; 73 (0x49) for GCN 1.2
Syntax: BUFFER_ATOMIC_OR VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Do bitwise OR on VDATA and value of SRSRC resource,
and store result to this resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = *VM | VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_OR_X2
Opcode: 90 (0x5a) for GCN 1.0/1.1; 105 (0x69) for GCN 1.2
Syntax: BUFFER_ATOMIC_OR_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Do 64-bit bitwise OR on VDATA and value of SRSRC resource,
and store result to this resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = *VM | VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_RSUB
Opcode: 52 (0x34) for GCN 1.0
Syntax: BUFFER_ATOMIC_RSUB VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Subtract value of SRSRC resource from VDATA, and store result to
this resource. If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = VDATA - *VM; VDATA = (GLC) ? P : VDATA // atomic
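The only difference from BUFFER_ATOMIC_SUB is the operand order, which the following Python sketch makes explicit. Both function names are hypothetical, results are assumed to wrap modulo 2^32, and atomicity is not modelled.

```python
MASK32 = 0xFFFFFFFF


def buffer_atomic_sub(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_SUB: memory = memory - VDATA."""
    previous = memory[index]
    memory[index] = (previous - vdata) & MASK32
    return previous if glc else vdata


def buffer_atomic_rsub(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_RSUB: memory = VDATA - memory."""
    previous = memory[index]
    memory[index] = (vdata - previous) & MASK32
    return previous if glc else vdata
```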
BUFFER_ATOMIC_RSUB_X2
Opcode: 84 (0x54) for GCN 1.0
Syntax: BUFFER_ATOMIC_RSUB_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Subtract 64-bit value of SRSRC resource from 64-bit VDATA, and store result
to this resource. If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = VDATA - *VM; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SMAX
Opcode: 55 (0x37) for GCN 1.0/1.1; 70 (0x46) for GCN 1.2
Syntax: BUFFER_ATOMIC_SMAX VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose greatest signed 32-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
INT32* VM = (INT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = MAX(*VM, (INT32)VDATA); VDATA = (GLC) ? P : VDATA // atomic
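The signed variant differs from BUFFER_ATOMIC_UMAX only in how the 32-bit patterns are interpreted. A Python sketch under that reading, with invented helper names and no modelling of atomicity:

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def buffer_atomic_smax(memory, index, vdata, glc):
    """Model of BUFFER_ATOMIC_SMAX on a list of 32-bit patterns."""
    previous = memory[index]
    memory[index] = max(to_signed32(previous), to_signed32(vdata)) & MASK32
    return previous if glc else vdata
```

If memory holds 0xFFFFFFFF (signed -1) and VDATA is 1, the signed maximum stores 1, whereas an unsigned maximum would keep 0xFFFFFFFF.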
BUFFER_ATOMIC_SMAX_X2
Opcode: 87 (0x57) for GCN 1.0/1.1; 102 (0x66) for GCN 1.2
Syntax: BUFFER_ATOMIC_SMAX_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose greatest signed 64-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
INT64* VM = (INT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = MAX(*VM, (INT64)VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SMIN
Opcode: 53 (0x35) for GCN 1.0/1.1; 68 (0x44) for GCN 1.2
Syntax: BUFFER_ATOMIC_SMIN VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose smallest signed 32-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
INT32* VM = (INT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = MIN(*VM, (INT32)VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SMIN_X2
Opcode: 85 (0x55) for GCN 1.0/1.1; 100 (0x64) for GCN 1.2
Syntax: BUFFER_ATOMIC_SMIN_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose smallest signed 64-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
INT64* VM = (INT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = MIN(*VM, (INT64)VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SUB
Opcode: 51 (0x33) for GCN 1.0/1.1; 67 (0x43) for GCN 1.2
Syntax: BUFFER_ATOMIC_SUB VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Subtract VDATA from value from SRSRC resource, and store result to
this resource. If GLC flag is set then return previous value from resource to VDATA,
otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = *VM - VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SUB_X2
Opcode: 83 (0x53) for GCN 1.0/1.1; 99 (0x63) for GCN 1.2
Syntax: BUFFER_ATOMIC_SUB_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Subtract 64-bit VDATA from 64-bit value from SRSRC resource, and store
result to this resource. If GLC flag is set then return previous value from resource to
VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = *VM - VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SWAP
Opcode: 48 (0x30) for GCN 1.0/1.1; 64 (0x40) for GCN 1.2
Syntax: BUFFER_ATOMIC_SWAP VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store VDATA dword into SRSRC resource. If GLC flag is set then
return previous value from resource to VDATA, otherwise keep VDATA value.
Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_SWAP_X2
Opcode: 80 (0x50) for GCN 1.0/1.1; 96 (0x60) for GCN 1.2
Syntax: BUFFER_ATOMIC_SWAP_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store VDATA 64-bit word into SRSRC resource. If GLC flag is set then
return previous value from resource to VDATA, otherwise keep VDATA value.
Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_UMAX
Opcode: 56 (0x38) for GCN 1.0/1.1; 71 (0x47) for GCN 1.2
Syntax: BUFFER_ATOMIC_UMAX VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose greatest unsigned 32-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise
keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = MAX(*VM, VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_UMAX_X2
Opcode: 88 (0x58) for GCN 1.0/1.1; 103 (0x67) for GCN 1.2
Syntax: BUFFER_ATOMIC_UMAX_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose greatest unsigned 64-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = MAX(*VM, VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_UMIN
Opcode: 54 (0x36) for GCN 1.0/1.1; 69 (0x45) for GCN 1.2
Syntax: BUFFER_ATOMIC_UMIN VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose smallest unsigned 32-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = MIN(*VM, VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_UMIN_X2
Opcode: 86 (0x56) for GCN 1.0/1.1; 101 (0x65) for GCN 1.2
Syntax: BUFFER_ATOMIC_UMIN_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Choose smallest unsigned 64-bit value from VDATA and from SRSRC resource,
and store result to this resource.
If GLC flag is set then return previous value from resource to VDATA, otherwise keep
VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = MIN(*VM, VDATA); VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_XOR
Opcode: 59 (0x3b) for GCN 1.0/1.1; 74 (0x4a) for GCN 1.2
Syntax: BUFFER_ATOMIC_XOR VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Do bitwise XOR on VDATA and value of SRSRC resource,
and store result to this resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT32 P = *VM; *VM = *VM ^ VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_ATOMIC_XOR_X2
Opcode: 91 (0x5b) for GCN 1.0/1.1; 106 (0x6a) for GCN 1.2
Syntax: BUFFER_ATOMIC_XOR_X2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Do 64-bit bitwise XOR on VDATA and value of SRSRC resource,
and store result to this resource. If GLC flag is set then return previous value
from resource to VDATA, otherwise keep VDATA value. Operation is atomic.
Operation:
UINT64* VM = (UINT64*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
UINT64 P = *VM; *VM = *VM ^ VDATA; VDATA = (GLC) ? P : VDATA // atomic
BUFFER_LOAD_DWORD
Opcode: 12 (0xc) for GCN 1.0/1.1; 20 (0x14) for GCN 1.2
Syntax: BUFFER_LOAD_DWORD VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load dword to VDATA from SRSRC resource.
Operation:
VDATA = *(UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_DWORDX2
Opcode: 13 (0xd) for GCN 1.0/1.1; 21 (0x15) for GCN 1.2
Syntax: BUFFER_LOAD_DWORDX2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load two dwords to VDATA from SRSRC resource.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
VDATA[0] = VM[0]
VDATA[1] = VM[1]
BUFFER_LOAD_DWORDX3
Opcode: 15 (0xf) for GCN 1.1; 22 (0x16) for GCN 1.2
Syntax: BUFFER_LOAD_DWORDX3 VDATA(3), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load three dwords to VDATA from SRSRC resource.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
VDATA[0] = VM[0]
VDATA[1] = VM[1]
VDATA[2] = VM[2]
BUFFER_LOAD_DWORDX4
Opcode: 14 (0xe) for GCN 1.0/1.1; 23 (0x17) for GCN 1.2
Syntax: BUFFER_LOAD_DWORDX4 VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load four dwords to VDATA from SRSRC resource.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
VDATA[0] = VM[0]
VDATA[1] = VM[1]
VDATA[2] = VM[2]
VDATA[3] = VM[3]
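The multi-dword loads copy consecutive dwords into consecutive VDATA registers. The Python sketch below models that with the standard `struct` module. The function name is made up, the buffer is a plain `bytes` object, the address is an already resolved byte offset, and little-endian dword order is assumed.

```python
import struct


def buffer_load_dwords(buffer_bytes, byte_address, count):
    """Return `count` consecutive 32-bit dwords starting at byte_address.

    Element i of the result corresponds to VDATA[i].
    Assumption: dwords are stored in little-endian byte order.
    """
    return list(struct.unpack_from("<%dI" % count, buffer_bytes, byte_address))
```

Calling it with a count of 1, 2, 3 or 4 corresponds to BUFFER_LOAD_DWORD, BUFFER_LOAD_DWORDX2, BUFFER_LOAD_DWORDX3 and BUFFER_LOAD_DWORDX4 respectively.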
BUFFER_LOAD_FORMAT_D16_X
Opcode: 8 (0x8) for GCN 1.2/1.4
Syntax: BUFFER_LOAD_FORMAT_D16_X VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first component of the element from SRSRC including format from
buffer resource. Store result as 16-bit value (half FP or 16-bit integer).
Operation:
VDATA = LOAD_FORMAT_D16_X(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_D16_HI_X
Opcode: 38 (0x26) for GCN 1.4
Syntax: BUFFER_LOAD_FORMAT_D16_HI_X VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first component of the element from SRSRC including format from
buffer resource. Store result as 16-bit value to higher part of VDATA register
(half FP or 16-bit integer).
Operation:
VDATA = LOAD_FORMAT_D16_X(SRSRC, VADDR(1:2), SOFFSET, OFFSET)<<16
BUFFER_LOAD_FORMAT_D16_XY
Opcode: 9 (0x9) for GCN 1.2/1.4
Syntax: BUFFER_LOAD_FORMAT_D16_XY VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Syntax (GCN 1.4): BUFFER_LOAD_FORMAT_D16_XY VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first two components of the element from SRSRC resource
including format from SRSRC. Store result as 16-bit value (half FP or 16-bit integer).
Operation:
VDATA = LOAD_FORMAT_D16_XY(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_D16_XYZ
Opcode: 10 (0xa) for GCN 1.2/1.4
Syntax: BUFFER_LOAD_FORMAT_D16_XYZ VDATA(3), VADDR(1:2), SRSRC(4), SOFFSET
Syntax (GCN 1.4): BUFFER_LOAD_FORMAT_D16_XYZ VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first three components of the element from SRSRC resource
including format from SRSRC. Store result as 16-bit value (half FP or 16-bit integer).
Operation:
VDATA = LOAD_FORMAT_D16_XYZ(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_D16_XYZW
Opcode: 11 (0xb) for GCN 1.2/1.4
Syntax: BUFFER_LOAD_FORMAT_D16_XYZW VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Syntax (GCN 1.4): BUFFER_LOAD_FORMAT_D16_XYZW VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load all four components of the element from SRSRC resource
including format from SRSRC. Store result as 16-bit value (half FP or 16-bit integer).
Operation:
VDATA = LOAD_FORMAT_D16_XYZW(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_X
Opcode: 0 (0x0)
Syntax: BUFFER_LOAD_FORMAT_X VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first component of the element from SRSRC including format from
buffer resource.
Operation:
VDATA = LOAD_FORMAT_X(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_XY
Opcode: 1 (0x1)
Syntax: BUFFER_LOAD_FORMAT_XY VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first two components of the element from SRSRC resource
including format from SRSRC.
Operation:
VDATA = LOAD_FORMAT_XY(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_XYZ
Opcode: 2 (0x2)
Syntax: BUFFER_LOAD_FORMAT_XYZ VDATA(3), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load the first three components of the element from SRSRC resource
including format from SRSRC.
Operation:
VDATA = LOAD_FORMAT_XYZ(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_FORMAT_XYZW
Opcode: 3 (0x3)
Syntax: BUFFER_LOAD_FORMAT_XYZW VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Description: Load all four components of the element from SRSRC resource
including format from SRSRC.
Operation:
VDATA = LOAD_FORMAT_XYZW(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_SBYTE
Opcode: 9 (0x9) for GCN 1.0/1.1; 17 (0x11) for GCN 1.2
Syntax: BUFFER_LOAD_SBYTE VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load byte to VDATA from SRSRC resource with sign extending.
Operation:
VDATA = *(INT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
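Sign extension is what distinguishes this instruction from BUFFER_LOAD_UBYTE. A Python sketch of both, with hypothetical function names and a `bytes` object standing in for the resolved buffer memory:

```python
MASK32 = 0xFFFFFFFF


def buffer_load_ubyte(buffer_bytes, byte_address):
    """Zero-extend one byte to a 32-bit VDATA value."""
    return buffer_bytes[byte_address]


def buffer_load_sbyte(buffer_bytes, byte_address):
    """Sign-extend one byte to a 32-bit VDATA value."""
    value = buffer_bytes[byte_address]
    if value & 0x80:
        value -= 0x100  # negative in two's complement
    return value & MASK32
```

For the byte 0x80, the signed load yields 0xFFFFFF80 and the unsigned load yields 0x00000080.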
BUFFER_LOAD_SBYTE_D16
Opcode: 34 (0x22) for GCN 1.4
Syntax: BUFFER_LOAD_SBYTE_D16 VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load byte to VDATA from SRSRC resource with sign extending to
lower 16-bit part of VDATA.
Operation:
VDATA &= 0xffff0000
VDATA |= (UINT32)*(INT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)&0xffff
BUFFER_LOAD_SBYTE_D16_HI
Opcode: 35 (0x23) for GCN 1.4
Syntax: BUFFER_LOAD_SBYTE_D16_HI VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load byte to VDATA from SRSRC resource with sign extending to
higher 16-bit part of VDATA.
Operation:
VDATA &= 0xffff
VDATA |= (UINT32)*(INT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)<<16
BUFFER_LOAD_SHORT_D16
Opcode: 36 (0x24) for GCN 1.4
Syntax: BUFFER_LOAD_SHORT_D16 VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load 16-bit word from SRSRC resource into the lower 16-bit part of VDATA.
Operation:
VDATA &= 0xffff0000
VDATA |= *(UINT16*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_SHORT_D16_HI
Opcode: 37 (0x25) for GCN 1.4
Syntax: BUFFER_LOAD_SHORT_D16_HI VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load 16-bit word from SRSRC resource into the higher 16-bit part of VDATA.
Operation:
VDATA &= 0xffff
VDATA |= *(UINT16*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)<<16
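The D16 loads above overwrite only one 16-bit half of VDATA and preserve the other half. A minimal Python sketch of that merge follows. It assumes a 32-bit register modelled as a Python integer and takes the loaded value as an argument instead of reading memory; the helper names are hypothetical and not part of the ISA or of CLRX.

```python
def sext8_to_16(byte):
    """Sign-extend an 8-bit value to a 16-bit pattern."""
    byte &= 0xFF
    return (byte | 0xFF00) if byte & 0x80 else byte

def load_d16(vdata, value16):
    """Replace the lower 16 bits of vdata, keep the upper 16 bits."""
    return (vdata & 0xFFFF0000) | (value16 & 0xFFFF)

def load_d16_hi(vdata, value16):
    """Replace the upper 16 bits of vdata, keep the lower 16 bits."""
    return (vdata & 0x0000FFFF) | ((value16 & 0xFFFF) << 16)

old = 0x12345678
print(hex(load_d16(old, sext8_to_16(0x80))))     # like SBYTE_D16:    0x1234ff80
print(hex(load_d16_hi(old, sext8_to_16(0x80))))  # like SBYTE_D16_HI: 0xff805678
print(hex(load_d16(old, 0xBEEF)))                # like SHORT_D16:    0x1234beef
print(hex(load_d16_hi(old, 0xBEEF)))             # like SHORT_D16_HI: 0xbeef5678
```

The untouched half of `old` survives in every case, which is what distinguishes these loads from BUFFER_LOAD_SBYTE and BUFFER_LOAD_SSHORT, which replace the whole register.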
BUFFER_LOAD_SSHORT
Opcode: 11 (0xb) for GCN 1.0/1.1; 19 (0x13) for GCN 1.2
Syntax: BUFFER_LOAD_SSHORT VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load 16-bit word to VDATA from SRSRC resource with sign extension.
Operation:
VDATA = *(INT16*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_UBYTE
Opcode: 8 (0x8) for GCN 1.0/1.1; 16 (0x10) for GCN 1.2
Syntax: BUFFER_LOAD_UBYTE VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load byte to VDATA from SRSRC resource with zero extension.
Operation:
VDATA = *(UINT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
BUFFER_LOAD_UBYTE_D16
Opcode: 32 (0x20) for GCN 1.4
Syntax: BUFFER_LOAD_UBYTE_D16 VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load byte from SRSRC resource with zero extension into the
lower 16-bit part of VDATA.
Operation:
VDATA &= 0xffff0000
VDATA |= (UINT32)*(UINT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)&0xffff
BUFFER_LOAD_UBYTE_D16_HI
Opcode: 33 (0x21) for GCN 1.4
Syntax: BUFFER_LOAD_UBYTE_D16_HI VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load byte from SRSRC resource with zero extension into the
higher 16-bit part of VDATA.
Operation:
VDATA &= 0xffff
VDATA |= (UINT32)*(UINT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)<<16
BUFFER_LOAD_USHORT
Opcode: 10 (0xa) for GCN 1.0/1.1; 18 (0x12) for GCN 1.2
Syntax: BUFFER_LOAD_USHORT VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Load 16-bit word to VDATA from SRSRC resource with zero extension.
Operation:
VDATA = *(UINT16*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
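The difference between the signed and unsigned byte/short loads is only in how the value is widened to 32 bits. A small Python sketch, assuming little-endian memory modelled as a `bytes` object and a byte address that has already been computed (the VMEM address calculation is not modelled here; `buffer_load` is a hypothetical helper):

```python
import struct

def buffer_load(mem, addr, kind):
    """Return the 32-bit pattern that would land in VDATA.

    kind is one of 'sbyte', 'ubyte', 'sshort', 'ushort'.
    """
    fmt = {"sbyte": "<b", "ubyte": "<B", "sshort": "<h", "ushort": "<H"}[kind]
    (value,) = struct.unpack_from(fmt, mem, addr)
    # Negative Python ints become their 32-bit two's complement pattern.
    return value & 0xFFFFFFFF

mem = bytes([0x80, 0xFF, 0x7F, 0x00])
print(hex(buffer_load(mem, 0, "sbyte")))   # 0xffffff80
print(hex(buffer_load(mem, 0, "ubyte")))   # 0x80
print(hex(buffer_load(mem, 0, "sshort")))  # 0xffffff80
print(hex(buffer_load(mem, 0, "ushort")))  # 0xff80
```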
BUFFER_STORE_BYTE
Opcode: 24 (0x18)
Syntax: BUFFER_STORE_BYTE VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store byte from VDATA into SRSRC resource.
Operation:
*(UINT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET) = VDATA&0xff
BUFFER_STORE_BYTE_D16_HI
Opcode: 25 (0x19)
Syntax: BUFFER_STORE_BYTE_D16_HI VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store byte from bits 16-23 of VDATA into SRSRC resource.
Operation:
*(UINT8*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET) = (VDATA>>16)&0xff
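The two byte stores differ only in which bits of VDATA are written. A short Python sketch, assuming memory modelled as a `bytearray` and an already computed byte address (both function names are hypothetical):

```python
def store_byte(mem, addr, vdata):
    """Write bits 0-7 of vdata, as BUFFER_STORE_BYTE does."""
    mem[addr] = vdata & 0xFF

def store_byte_d16_hi(mem, addr, vdata):
    """Write bits 16-23 of vdata, as BUFFER_STORE_BYTE_D16_HI does."""
    mem[addr] = (vdata >> 16) & 0xFF

mem = bytearray(2)
store_byte(mem, 0, 0xAABBCCDD)
store_byte_d16_hi(mem, 1, 0xAABBCCDD)
print(mem.hex())  # ddbb
```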
BUFFER_STORE_DWORD
Opcode: 28 (0x1c)
Syntax: BUFFER_STORE_DWORD VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store dword from VDATA into SRSRC resource.
Operation:
*(UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET) = VDATA
BUFFER_STORE_DWORDX2
Opcode: 29 (0x1d)
Syntax: BUFFER_STORE_DWORDX2 VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store two dwords from VDATA into SRSRC resource.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
VM[0] = VDATA[0]
VM[1] = VDATA[1]
BUFFER_STORE_DWORDX3
Opcode: 31 (0x1f) for GCN 1.1; 30 (0x1e) for GCN 1.2
Syntax: BUFFER_STORE_DWORDX3 VDATA(3), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store three dwords from VDATA into SRSRC resource.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
VM[0] = VDATA[0]
VM[1] = VDATA[1]
VM[2] = VDATA[2]
BUFFER_STORE_DWORDX4
Opcode: 30 (0x1e) for GCN 1.1; 31 (0x1f) for GCN 1.2
Syntax: BUFFER_STORE_DWORDX4 VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store four dwords from VDATA into SRSRC resource.
Operation:
UINT32* VM = (UINT32*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET)
VM[0] = VDATA[0]
VM[1] = VDATA[1]
VM[2] = VDATA[2]
VM[3] = VDATA[3]
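The multi-dword stores write consecutive VDATA registers to consecutive dwords in memory. A minimal Python sketch for a single lane, assuming little-endian memory modelled as a `bytearray` and an already computed byte address (`store_dwords` is a hypothetical helper, not an ISA or CLRX name):

```python
import struct

def store_dwords(mem, addr, vdata):
    """Write each 32-bit register in vdata to successive dwords at addr."""
    for i, reg in enumerate(vdata):
        struct.pack_into("<I", mem, addr + 4 * i, reg & 0xFFFFFFFF)

mem = bytearray(16)
store_dwords(mem, 0, [1, 2, 3])  # like BUFFER_STORE_DWORDX3
print(mem.hex())  # 01000000020000000300000000000000
```

Passing two or four registers models BUFFER_STORE_DWORDX2 and BUFFER_STORE_DWORDX4 in the same way.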
BUFFER_STORE_FORMAT_D16_X
Opcode: 12 (0xc)
Syntax: BUFFER_STORE_FORMAT_D16_X VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first component of the element into SRSRC resource
including format from SRSRC. Treat input as 16-bit value (half FP or 16-bit integer).
Operation:
STORE_FORMAT_D16_X(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_FORMAT_D16_HI_X
Opcode: 39 (0x27) for GCN 1.4
Syntax: BUFFER_STORE_FORMAT_D16_HI_X VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first component of the element into SRSRC resource
including format from SRSRC. Treat input as 16-bit value stored in higher part of VDATA
(half FP or 16-bit integer).
Operation:
STORE_FORMAT_D16_X(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA>>16)
BUFFER_STORE_FORMAT_D16_XY
Opcode: 13 (0xd)
Syntax: BUFFER_STORE_FORMAT_D16_XY VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Syntax (GCN 1.4): BUFFER_STORE_FORMAT_D16_XY VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first two components of the element into SRSRC resource
including format from SRSRC. Treat input as 16-bit value (half FP or 16-bit integer).
Operation:
STORE_FORMAT_D16_XY(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_FORMAT_D16_XYZ
Opcode: 14 (0xe)
Syntax: BUFFER_STORE_FORMAT_D16_XYZ VDATA(3), VADDR(1:2), SRSRC(4), SOFFSET
Syntax (GCN 1.4): BUFFER_STORE_FORMAT_D16_XYZ VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first three components of the element into SRSRC resource
including format from SRSRC. Treat input as 16-bit value (half FP or 16-bit integer).
Operation:
STORE_FORMAT_D16_XYZ(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_FORMAT_D16_XYZW
Opcode: 15 (0xf)
Syntax: BUFFER_STORE_FORMAT_D16_XYZW VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Syntax (GCN 1.4): BUFFER_STORE_FORMAT_D16_XYZW VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store all four components of the element into SRSRC resource
including format from SRSRC. Treat input as 16-bit value (half FP or 16-bit integer).
Operation:
STORE_FORMAT_D16_XYZW(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
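The GCN 1.4 syntax lines above take half as many VDATA registers as the older form, which suggests that two 16-bit components share one dword. The Python sketch below illustrates such a packing for half-precision inputs. The component order (first component in the lower 16 bits) and the zero padding of an odd component count are assumptions made for illustration and are not confirmed by this document; `pack_d16` is a hypothetical helper.

```python
import struct

def pack_d16(components):
    """Pack half floats two per dword (assumed: first component in low half)."""
    padded = list(components) + [0.0] * (len(components) % 2)
    raw = struct.pack("<%de" % len(padded), *padded)
    return list(struct.unpack("<%dI" % (len(padded) // 2), raw))

print([hex(d) for d in pack_d16([1.0, 2.0])])             # ['0x40003c00']
print([hex(d) for d in pack_d16([1.0, 2.0, 0.5, -1.0])])  # ['0x40003c00', '0xbc003800']
```

Under these assumptions an XY store needs one register and an XYZ or XYZW store needs two, matching the VDATA and VDATA(2) operand counts in the GCN 1.4 syntax.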
BUFFER_STORE_FORMAT_X
Opcode: 4 (0x4)
Syntax: BUFFER_STORE_FORMAT_X VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first component of the element into SRSRC resource
including format from SRSRC.
Operation:
STORE_FORMAT_X(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_FORMAT_XY
Opcode: 5 (0x5)
Syntax: BUFFER_STORE_FORMAT_XY VDATA(2), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first two components of the element into SRSRC resource
including format from SRSRC.
Operation:
STORE_FORMAT_XY(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_FORMAT_XYZ
Opcode: 6 (0x6)
Syntax: BUFFER_STORE_FORMAT_XYZ VDATA(3), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store the first three components of the element into SRSRC resource
including format from SRSRC.
Operation:
STORE_FORMAT_XYZ(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_FORMAT_XYZW
Opcode: 7 (0x7)
Syntax: BUFFER_STORE_FORMAT_XYZW VDATA(4), VADDR(1:2), SRSRC(4), SOFFSET
Description: Store all four components of the element into SRSRC resource
including format from SRSRC.
Operation:
STORE_FORMAT_XYZW(SRSRC, VADDR(1:2), SOFFSET, OFFSET, VDATA)
BUFFER_STORE_LDS_DWORD
Opcode: 61 (0x3d) for GCN 1.2
Syntax: BUFFER_STORE_LDS_DWORD SRSRC(4), SOFFSET
Description: Store single dword from LDS into SRSRC resource.
Operation:
UINT32 VAL = *(UINT32*)(LDS + (M0&0xffff) + OFFSET + LANEID*4)
*(UINT32*)VMEM(SRSRC, 0, SOFFSET, OFFSET) = VAL
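The LDS source address in the operation above depends on the low 16 bits of M0, the instruction OFFSET and the lane number. A short Python sketch of that calculation, with a hypothetical function name and plain integers standing in for the register values:

```python
def lds_source_address(m0, offset, lane_id):
    """Byte address inside LDS read by one lane: (M0 & 0xffff) + OFFSET + LANEID*4."""
    return (m0 & 0xFFFF) + offset + lane_id * 4

# Only the low 16 bits of M0 contribute, and each lane reads its own dword.
print(lds_source_address(0x12340100, 16, 0))  # 272
print(lds_source_address(0x12340100, 16, 3))  # 284
```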
BUFFER_STORE_SHORT
Opcode: 26 (0x1a)
Syntax: BUFFER_STORE_SHORT VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store 16-bit word from VDATA into SRSRC resource.
Operation:
*(UINT16*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET) = VDATA&0xffff
BUFFER_STORE_SHORT_D16_HI
Opcode: 27 (0x1b) for GCN 1.4
Syntax: BUFFER_STORE_SHORT_D16_HI VDATA, VADDR(1:2), SRSRC(4), SOFFSET
Description: Store higher 16-bit word from VDATA into SRSRC resource.
Operation:
*(UINT16*)VMEM(SRSRC, VADDR(1:2), SOFFSET, OFFSET) = VDATA>>16