Refine SIMD implementation of JVET-O0304 (Reduction of number of multiplications in BDOF)

See merge request !797