Refactoring and cleanup of CCCM-related code. The CccmModel struct can now carry information about the type of CCCM it represents. This cleans up the IntraPrediction::predIntraCCCM function in particular, as there are no longer separate branches for each of the CCCM variants available in ECM. Encoding and decoding are bit-exact with the current master.
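The idea of a model that records its own variant can be illustrated with a small standalone sketch. Everything in it is hypothetical (ModelType, SketchModel, Neighbourhood, inputTerms, predict, and the two input layouts); it does not reproduce the actual CccmModel struct or the real CCCM filter shapes. It only shows how the variant dependency can be confined to one helper so that the prediction path itself is shared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical variant tags; ECM has more CCCM variants with other inputs.
enum class ModelType { Cross, Gradient };

// Hypothetical model that stores its variant next to its coefficients.
struct SketchModel
{
  ModelType        type;
  std::vector<int> params;   // one coefficient per input term, bias last
};

// Luma samples around the predicted position (illustrative layout only).
struct Neighbourhood { int c, n, s, w, e; };

// The only place that depends on the variant: building the input terms.
std::vector<int> inputTerms(const SketchModel& m, const Neighbourhood& nb)
{
  switch (m.type)
  {
  case ModelType::Cross:    return { nb.c, nb.n, nb.s, nb.w, nb.e, 1 };
  case ModelType::Gradient: return { nb.c, nb.e - nb.w, nb.s - nb.n, 1 };
  }
  return {};
}

// Shared prediction path: a single weighted sum for every variant.
int predict(const SketchModel& m, const Neighbourhood& nb)
{
  const std::vector<int> terms = inputTerms(m, nb);
  assert(terms.size() == m.params.size());
  int sum = 0;
  for (std::size_t i = 0; i < terms.size(); ++i)
  {
    sum += m.params[i] * terms[i];
  }
  return sum;
}
```

In this sketch, adding a variant means adding one enum value and one case in inputTerms, while predict stays unchanged.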
The TrQuant::transformNxN function (the variant that takes a transform candidate list as input and performs forward transforms for all candidates) has an issue in its final loop, which compares costs against thresholds and selects the transforms that survive to full RDO.
In that final loop, the threshold intended for the transform skip case (thrTS) is selected if itC.second equals 1. The itC.second values are set in the earlier iteration over the candidates to the position of each transform candidate in the input candidate list.
Under CTC conditions, transform skip is always checked as the second candidate (so itC.second equals 1 for it), and thus there is no issue within CTC. However, depending on the encoder configuration, and especially if transform skip is disabled, itC.second can equal 1 for other transform modes. The threshold apparently intended specifically for transform skip is then applied to whatever transform mode happens to be in the second position of the candidate list.
A merge request fixing the issue is provided. The fix improves all-intra performance by roughly 0.2% when it is triggered (e.g. when LFNST and transform skip are disabled and only MTS modes are tested by the encoder). There is no change in operation under CTC.
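The difference between choosing the threshold by list position and choosing it by transform mode can be shown with a minimal standalone sketch. The types and names below (TrMode, Candidate, thresholdByPosition, thresholdByMode, selectSurvivors) are hypothetical and do not mirror the actual ECM data structures or the exact change in the merge request; only thr and thrTS are taken from the description above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; the candidate type and mode encoding in ECM differ.
enum class TrMode { DCT2, TransformSkip, MTS };

struct Candidate
{
  TrMode mode;   // transform represented by this candidate
  double cost;   // cost estimated after the forward transform
};

// Position-based choice, as described above: thrTS goes to whatever
// candidate sits at index 1 of the list.
double thresholdByPosition(std::size_t pos, double thr, double thrTS)
{
  return pos == 1 ? thrTS : thr;
}

// Mode-based choice: thrTS goes only to an actual transform skip candidate.
double thresholdByMode(TrMode mode, double thr, double thrTS)
{
  return mode == TrMode::TransformSkip ? thrTS : thr;
}

// Marks which candidates would survive to full RDO under either rule.
std::vector<bool> selectSurvivors(const std::vector<Candidate>& cands,
                                  double thr, double thrTS, bool byMode)
{
  std::vector<bool> keep(cands.size());
  for (std::size_t i = 0; i < cands.size(); ++i)
  {
    const double limit = byMode ? thresholdByMode(cands[i].mode, thr, thrTS)
                                : thresholdByPosition(i, thr, thrTS);
    keep[i] = cands[i].cost <= limit;
  }
  return keep;
}
```

For a list without transform skip, such as { DCT2, MTS }, the position-based rule compares the MTS candidate against thrTS, whereas the mode-based rule compares it against thr. With illustrative values thr = 120 and thrTS = 100, an MTS candidate with cost 115 is dropped by the first rule and kept by the second.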
There seems to be a conflict between InterPrediction::applyBiOptFlow and the JVET_Z0136_OOB functionality when SIMD optimizations are off. The following lines are executed in both the SIMD and non-SIMD cases:
if (bioDx == 4)
{
  g_pelBufOP.addAvg4(srcY0Temp, src0Stride, srcY1Temp, src1Stride, dstY + dstBlockOffset,
                     dstStride, bioDx, bioDy, shiftNum, offset, clpRng, pSubMcMask, width, isOOBTmp);
}
else
{
  g_pelBufOP.addAvg8(srcY0Temp, src0Stride, srcY1Temp, src1Stride, dstY + dstBlockOffset,
                     dstStride, bioDx, bioDy, shiftNum, offset, clpRng, pSubMcMask, width, isOOBTmp);
}
However, addAvg4 and addAvg8 include OOB handling only in their SIMD implementations. Otherwise, the OOB handling is expected to happen outside of these functions. As a possible fix, the blocks could be wrapped in PelBuf objects and PelBuf.addAvg used for the averaging in the non-SIMD case, as that seems to include proper OOB handling:
#if ENABLE_SIMD_OPT_BUFFER && defined(TARGET_SIMD_X86)
  if (bioDx == 4)
  {
    g_pelBufOP.addAvg4(srcY0Temp, src0Stride, srcY1Temp, src1Stride, dstY + dstBlockOffset,
                       dstStride, bioDx, bioDy, shiftNum, offset, clpRng, pSubMcMask, width, isOOBTmp);
  }
  else
  {
    g_pelBufOP.addAvg8(srcY0Temp, src0Stride, srcY1Temp, src1Stride, dstY + dstBlockOffset,
                       dstStride, bioDx, bioDy, shiftNum, offset, clpRng, pSubMcMask, width, isOOBTmp);
  }
#else
  PelBuf  pelBufDest = PelBuf(dstY + dstBlockOffset, dstStride, bioDx, bioDy);
  CPelBuf pelBufSrc0 = CPelBuf(srcY0Temp, src0Stride, bioDx, bioDy);
  CPelBuf pelBufSrc1 = CPelBuf(srcY1Temp, src1Stride, bioDx, bioDy);
  pelBufDest.addAvg(pelBufSrc0, pelBufSrc1, clpRng, pSubMcMask, width, isOOBTmp);
#endif
It would be good if experts more familiar with JVET_Z0136_OOB could verify whether this is meaningful or whether some other fix would serve better.
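To illustrate why a mask-unaware averaging path can diverge from a mask-aware one, here is a heavily simplified standalone sketch. It assumes that the OOB rule amounts to using only the in-bounds prediction sample when exactly one of the two is flagged out of bounds; that rule, the function names (averageWithOobMask, averagePlain) and the plain integer arithmetic without intermediate precision offsets or clipping are all assumptions of this sketch, not the ECM implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mask-aware averaging (hypothetical): when exactly one of the two samples
// is flagged out of bounds, only the other sample is used.
std::vector<int> averageWithOobMask(const std::vector<int>& src0, const std::vector<int>& src1,
                                    const std::vector<bool>& oob0, const std::vector<bool>& oob1)
{
  assert(src0.size() == src1.size());
  assert(oob0.size() == src0.size() && oob1.size() == src1.size());
  std::vector<int> dst(src0.size());
  for (std::size_t i = 0; i < src0.size(); ++i)
  {
    if (oob0[i] && !oob1[i])
    {
      dst[i] = src1[i];
    }
    else if (!oob0[i] && oob1[i])
    {
      dst[i] = src0[i];
    }
    else
    {
      dst[i] = (src0[i] + src1[i] + 1) >> 1;   // rounded average
    }
  }
  return dst;
}

// Mask-unaware averaging: every position gets the rounded average.
std::vector<int> averagePlain(const std::vector<int>& src0, const std::vector<int>& src1)
{
  assert(src0.size() == src1.size());
  std::vector<int> dst(src0.size());
  for (std::size_t i = 0; i < src0.size(); ++i)
  {
    dst[i] = (src0[i] + src1[i] + 1) >> 1;
  }
  return dst;
}
```

The two functions agree wherever neither or both samples are flagged and differ wherever exactly one is flagged, which is the kind of mismatch that would appear if one build applies the mask inside the averaging call and the other does not apply it at all.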
It seems that the following two settings produce different encoding results, although one would expect the results to be identical:
Alternative 1: m_EqualCoeffComputer = xEqualCoeffComputer;
Alternative 2: m_EqualCoeffComputer = simdEqualCoeffComputer<vext>;
The effect can be noticed by commenting out the assignment of the SIMD function in _initAffineGradientSearchX86() as follows (or by running the software without SIMD optimizations):
template <X86_VEXT vext>
void AffineGradientSearch::_initAffineGradientSearchX86()
{
#if !AFFINE_ENC_OPT
  m_HorizontalSobelFilter = simdHorizontalSobelFilter<vext>;
  m_VerticalSobelFilter = simdVerticalSobelFilter<vext>;
#endif
  // m_EqualCoeffComputer = simdEqualCoeffComputer<vext>;
}