The deblocking filter has been the performance bottleneck in parallelizing H.264/AVC coding on many-core platforms. Recent work by the Multimedia Computing Group (MCG), entitled “Efficient Parallel Framework for H.264/AVC Deblocking Filter on Many-core Platform”, substantially improves on this point. Compared with the well-known 2D-wavefront method, it achieves average speedups of 15.23, 15.88, 12.34 and 10.22 times for QCIF, CIF, SD and HD videos, respectively, using 62 cores.

Figure 1. The proposed three-step parallel framework (TSPF), which considers both task-level and data-level parallelization.
There are two ways to parallelize an application on a many-core platform: task-level parallelization and data-level parallelization. All existing research focuses on data-level parallelization, but the entire deblocking filter process shows strong data dependencies. As shown in Figure 1, this work proposes a novel three-step parallel framework (TSPF) that considers both task-level and data-level parallelization. TSPF offers more parallelism and less synchronization overhead than the 2D-wavefront method. It also alleviates the load imbalance problem of BSC.
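To make the data-dependency limit concrete, the following is a minimal sketch of the 2D-wavefront baseline only, not of TSPF. It assumes, as a simplification, that each macroblock depends only on its left and upper neighbors, so macroblock (x, y) can be filtered in wave x + y. The function names are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblocks into waves under the assumed left/upper dependency.

    Macroblock (x, y) is placed in wave x + y, so every macroblock in a wave
    has its left and upper neighbors in the previous wave and the members of
    one wave can be filtered concurrently.
    """
    waves = defaultdict(list)
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves[x + y].append((x, y))
    return [waves[i] for i in range(mb_cols + mb_rows - 1)]


def max_parallelism(mb_cols, mb_rows):
    """Largest number of macroblocks that can be processed at the same time."""
    return max(len(wave) for wave in wavefront_schedule(mb_cols, mb_rows))


if __name__ == "__main__":
    # QCIF is 176x144 pixels, i.e. 11x9 macroblocks of 16x16 pixels.
    schedule = wavefront_schedule(11, 9)
    print(len(schedule), max_parallelism(11, 9))
```

Under this assumption a QCIF frame needs 19 sequential waves and never exposes more than 9 independent macroblocks, far fewer than the 62 cores mentioned above. This suggests why a purely data-level scheme leaves many cores idle on small frames.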
This work was first published at the International Conference on Multimedia and Expo (ICME) 2011, where it was selected as a “Best Paper Candidate” (22 of 744 papers were nominated for best paper). It was subsequently invited for publication and appeared in IEEE Transactions on Multimedia (T-MM) in 2012.