Clean up / speed up various SIMD functions
- In InterPrediction::rightShiftMSB, use existing floorLog2 function
- In addAvg_SSE, avoid unnecessary use of 32-bit path
- In copyBufferSimd, reduce number of loops
- In paddingSimd, take advantage of padding extent being either 1 or 2
- In addBIOAvg4_SSE, reduce number of operations and avoid nasty Xmm register -> memory -> integer register path

Overall, a decoder runtime reduction of about 3% is expected.
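
To illustrate the rightShiftMSB item, here is a minimal sketch of the idea, not the code from this merge request. It assumes the old function found the most significant bit of the denominator with a search loop and shifted the numerator by that index. `floorLog2Sketch`, `rightShiftMsbLoop` and `rightShiftMsbSketch` are hypothetical names for illustration only; the codebase's own floorLog2 may be implemented differently (for example with a compiler builtin). Both variants assume `denom >= 1`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the existing floorLog2 helper.
// Returns the index of the most significant set bit, or -1 for x == 0.
static inline int floorLog2Sketch(uint32_t x)
{
  if (x == 0)
  {
    return -1;
  }
  int result = 0;
  if (x & 0xffff0000u) { x >>= 16; result += 16; }
  if (x & 0x0000ff00u) { x >>= 8;  result += 8;  }
  if (x & 0x000000f0u) { x >>= 4;  result += 4;  }
  if (x & 0x0000000cu) { x >>= 2;  result += 2;  }
  if (x & 0x00000002u) { result += 1; }
  return result;
}

// Assumed shape of a loop-based version: search for the first power of two
// that exceeds denom, then shift by one less than that index.
static inline int rightShiftMsbLoop(int numer, int denom)
{
  int msbIdx = 0;
  while (msbIdx < 32 && denom >= (int64_t(1) << msbIdx))
  {
    ++msbIdx;
  }
  return numer >> (msbIdx - 1);
}

// Same result, delegating the MSB search to the shared helper.
static inline int rightShiftMsbSketch(int numer, int denom)
{
  return numer >> floorLog2Sketch(static_cast<uint32_t>(denom));
}
```

The two functions agree for every positive denominator, so the change is a pure refactor; any speed difference comes from how the helper itself is implemented.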