VREDUCEPS - REDUCE Packed Single
VREDUCEPS xmm1{k1}{z}, xmm2/m128/m32bcst, imm8 (V5+DQ+VL)
__m128 _mm_reduce_ps(__m128 a, int imm8)
__m128 _mm_mask_reduce_ps(__m128 s, __mmask8 k, __m128 a, int imm8)
__m128 _mm_maskz_reduce_ps(__mmask8 k, __m128 a, int imm8)
For each float, round the source element to n fraction bits (n = imm8 bits 7:4), subtract the rounded value from the source element, and store the remainder in the destination element.
VREDUCEPS ymm1{k1}{z}, ymm2/m256/m32bcst, imm8 (V5+DQ+VL)
__m256 _mm256_reduce_ps(__m256 a, int imm8)
__m256 _mm256_mask_reduce_ps(__m256 s, __mmask8 k, __m256 a, int imm8)
__m256 _mm256_maskz_reduce_ps(__mmask8 k, __m256 a, int imm8)
For each float, round the source element to n fraction bits (n = imm8 bits 7:4), subtract the rounded value from the source element, and store the remainder in the destination element.
VREDUCEPS zmm1{k1}{z}, zmm2/m512/m32bcst{sae}, imm8 (V5+DQ)
__m512 _mm512_reduce_ps(__m512 a, int imm8)
__m512 _mm512_mask_reduce_ps(__m512 s, __mmask16 k, __m512 a, int imm8)
__m512 _mm512_maskz_reduce_ps(__mmask16 k, __m512 a, int imm8)
__m512 _mm512_reduce_round_ps(__m512 a, int imm8, int sae)
__m512 _mm512_mask_reduce_round_ps(__m512 s, __mmask16 k, __m512 a, int imm8, int sae)
__m512 _mm512_maskz_reduce_round_ps(__mmask16 k, __m512 a, int imm8, int sae)
For each float, round the source element to n fraction bits (n = imm8 bits 7:4), subtract the rounded value from the source element, and store the remainder in the destination element.
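The per-element operation described above can be sketched in scalar C. This is only an illustrative model, not the hardware definition: the function name `reduce_ps_model_rz` is hypothetical, it assumes round toward zero (imm8 bits 1:0 = 11 with bit 2 = 0), it assumes the scaled value fits in a `long long`, and it ignores NaN, infinity, denormals and exception flags.

```c
#include <assert.h>

/* Scalar sketch of one VREDUCEPS lane with round toward zero.
   m is the number of fraction bits to keep (imm8 bits 7:4).
   Assumptions: |src * 2^m| fits in long long; no NaN/Inf handling. */
float reduce_ps_model_rz(float src, unsigned m)
{
    assert(m <= 15);                    /* the field is 4 bits wide */
    float scale = (float)(1u << m);     /* 2^m, exactly representable */
    /* The integer cast truncates toward zero, so "rounded" is src
       with only m bits kept after the binary point. */
    float rounded = (float)(long long)(src * scale) / scale;
    return src - rounded;               /* the remainder */
}
```

Under these assumptions, `reduce_ps_model_rz(2.75f, 0)` yields 0.75 (2.75 minus 2), and `reduce_ps_model_rz(2.8125f, 2)` yields 0.0625 (2.8125 minus 2.75).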
imm8
bit | meaning
7:4 | Number of fraction bits to preserve (0 to 15). Set this field to 0 to round to an integer (like ROUNDPD / ROUNDPS).
3 | 0: precision exception mask is specified by MXCSR; 1: precision exception is masked
2 | 0: rounding mode is specified by bits 1:0; 1: rounding mode is specified by MXCSR
1:0 | 00: round to nearest (ties to even); 01: round toward negative infinity; 10: round toward positive infinity; 11: round toward zero
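As a small aid for reading the imm8 layout, the fields can be packed with a helper like the following. The enum and function names are hypothetical and exist only for illustration; they are not part of any intrinsics header.

```c
#include <assert.h>

/* Hypothetical names for the rounding-mode field (imm8 bits 1:0). */
enum reduce_rc {
    REDUCE_NEAREST = 0,   /* 00: round to nearest (ties to even) */
    REDUCE_DOWN    = 1,   /* 01: round toward negative infinity  */
    REDUCE_UP      = 2,   /* 10: round toward positive infinity  */
    REDUCE_TRUNC   = 3    /* 11: round toward zero               */
};

/* Pack the four imm8 fields into a single value. */
int reduce_imm8(unsigned fraction_bits, int mask_precision_exc,
                int use_mxcsr_rounding, enum reduce_rc rc)
{
    assert(fraction_bits <= 15);                          /* bits 7:4 */
    return (int)((fraction_bits << 4)
               | ((mask_precision_exc ? 1u : 0u) << 3)    /* bit 3    */
               | ((use_mxcsr_rounding ? 1u : 0u) << 2)    /* bit 2    */
               | ((unsigned)rc & 3u));                    /* bits 1:0 */
}
```

For example, `reduce_imm8(4, 1, 0, REDUCE_TRUNC)` gives 0x4B: keep 4 fraction bits, mask the precision exception, and round toward zero. Because imm8 is an immediate operand, the intrinsics need a compile-time constant, so in real code the computed value would be written as a literal or a constant expression rather than passed from a runtime call.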