Please refer to "OpenCL kernel compilation using ppl.common.ocl" and "Building commands on linux" in the OpenCL Platform Guide.
Please refer to "How to run benchmark" in the OpenCL Platform Guide.
Each function in our benchmark has two implementations: the OpenCL implementation in ppl.cv and its CPU counterpart in OpenCV. Both run on a series of parameter combinations covering common usage, and the elapsed time is recorded. Besides the function-specific parameters, the supported data types (uchar/float), the channel counts (1/3/4), and the commonly used image sizes are tested for each function. The input images are filled with randomly generated pixel values.
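As a rough illustration of the sweep described above, the following sketch enumerates data type, channel, and image size combinations, fills each input with random pixel values, and records the elapsed time of a callable. It is not the ppl.cv benchmark harness: the names `run_grid`, `random_image`, and `time_call`, the image sizes, and the repeat count are all assumptions made for this example.

```python
import itertools
import random
import time

# Assumed grid: the data types and channel counts come from the text above;
# the image sizes are illustrative (width, height) placeholders.
DATA_TYPES = ("uchar", "float")
CHANNELS = (1, 3, 4)
IMAGE_SIZES = ((320, 240), (640, 480), (1280, 720))


def random_image(width, height, channels, dtype, rng):
    """Return a flat list of random pixel values for one image."""
    count = width * height * channels
    if dtype == "uchar":
        return [rng.randrange(256) for _ in range(count)]
    return [rng.random() for _ in range(count)]


def time_call(func, image, repeats=5):
    """Return the average wall-clock seconds of func(image) over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(image)
    return (time.perf_counter() - start) / repeats


def run_grid(func, sizes=IMAGE_SIZES, seed=0):
    """Time `func` on every (dtype, channels, size) combination."""
    rng = random.Random(seed)
    results = {}
    for dtype, channels, (width, height) in itertools.product(
            DATA_TYPES, CHANNELS, sizes):
        image = random_image(width, height, channels, dtype, rng)
        results[(dtype, channels, width, height)] = time_call(func, image)
    return results
```

Calling `run_grid(sum, sizes=((64, 48),))` would time a stand-in function on six combinations (two data types times three channel counts). The same grid would be run once per implementation so that the two sets of timings can be compared combination by combination.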
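We report performance as an acceleration ratio (speedup), using the OpenCV implementation as the baseline. For each function, we sort the speedups over all tested parameter combinations and pick the minimum, the median, and the maximum. These three values form a compact box-plot-style summary of the acceleration ratio, which we use instead of a single average speedup. Each table entry lists them as (minimum, median, maximum), followed by the data type they apply to.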
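The summary statistic can be sketched as follows, assuming the speedup of one parameter combination is the OpenCV elapsed time divided by the ppl.cv OpenCL elapsed time. The function name `summarize_speedups` is hypothetical, and the handling of an even number of samples (averaging the two middle values, as `statistics.median` does) is an assumption, since the text does not specify it.

```python
import statistics


def summarize_speedups(opencv_times, opencl_times):
    """Return (min, median, max) speedup with OpenCV as the baseline.

    Both arguments hold elapsed times for the same parameter
    combinations, in the same order.
    """
    if not opencv_times or len(opencv_times) != len(opencl_times):
        raise ValueError("need two non-empty timing lists of equal length")
    # speedup > 1.0 means the OpenCL implementation was faster
    speedups = sorted(cpu / gpu for cpu, gpu in zip(opencv_times, opencl_times))
    return speedups[0], statistics.median(speedups), speedups[-1]


# Three parameter combinations: speedups are 2.0, 3.0 and 0.5
print(summarize_speedups([4.0, 9.0, 2.0], [2.0, 3.0, 4.0]))  # (0.5, 2.0, 3.0)
```

Reporting the three order statistics rather than the mean keeps a few very large speedups from hiding combinations where the OpenCL version is slower than the baseline.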
Machine information:
- X86 desktop computer with NVIDIA GPU:
  - CPU: Intel® Core™ i7-7700 CPU (4 cores, 8 threads, 3.60 GHz)
  - GPU: GeForce GTX 1060 (1280 CUDA cores, 1772 MHz)
  - Host memory: 32 GB
  - Device memory: 6 GB
  - OS: Ubuntu 16.04
- Smartphone with Qualcomm GPU:
  - CPU: Qualcomm Snapdragon 8 Gen 1 (8 cores, 1x3.00 GHz Cortex-X2, 3x2.50 GHz Cortex-A710, 4x1.80 GHz Cortex-A510)
  - GPU: Qualcomm Adreno 730 (1536[768] ALU/FP32 SIMDs, 900 MHz)
  - Host + Device memory: 12 GB
  - OS: Android 12
- Smartphone with Arm GPU:
  - CPU: MediaTek 9000 (8 cores, 1x3.05 GHz Cortex-X2, 3x2.85 GHz Cortex-A710, 4x1.80 GHz Cortex-A510)
  - GPU: Arm Mali G710 (7-16 shader cores, 850 MHz)
  - Host + Device memory: 12 GB
  - OS: Android 12
Function | GeForce GTX 1060 | Qualcomm Adreno 730 | Arm Mali G710 |
---|---|---|---|
Abs | (5.416160, 28.455800, 35.307097)(schar), (3.251696, 4.843355, 8.665003)(float) | (2.005357, 7.450896, 10.767174)(schar), (0.804077, 1.079686, 2.846432)(float) | (1.207542, 3.523319, 8.460250)(schar), (0.313592, 0.888851, 2.565313)(float) |
Add | (0.629620, 1.521336, 4.110602)(uchar), (1.814871, 3.119679, 7.844336)(float) | (0.186685, 0.708007, 0.922503)(uchar), (0.467970, 0.733883, 1.192058)(float) | (0.093987, 0.295026, 0.703502)(uchar), (0.258608, 0.565505, 0.976595)(float) |
AddWeighted | (4.391424, 7.928370, 11.803992)(uchar), (6.067815, 7.641039, 10.299791)(float) | (2.032676, 7.332689, 9.929187)(uchar), (1.490589, 1.842683, 2.183199)(float) | (1.029032, 2.674006, 5.396308)(uchar), (0.749904, 1.082015, 1.466735)(float) |
Subtract | (0.871823, 1.984403, 5.135535)(uchar), (2.152269, 4.518141, 7.329983)(float) | (0.208567, 0.717055, 1.132377)(uchar), (0.409122, 0.770779, 1.228698)(float) | (0.086469, 0.301720, 0.782554)(uchar), (0.180858, 0.521634, 1.007402)(float) |
Mul | (3.928378, 8.032842, 12.238446)(uchar), (6.320840, 7.844774, 9.984651)(float) | (1.107996, 6.444566, 8.970873)(uchar), (0.708928, 0.979360, 1.237413)(float) | (0.440297, 2.248678, 4.670598)(uchar), (0.233147, 0.581068, 0.978025)(float) |
Div | (4.895800, 8.788351, 11.193963)(uchar), (2.052837, 4.926071, 8.028919)(float) | (1.826428, 9.382538, 11.078437)(uchar), (0.800212, 1.135968, 1.269375)(float) | (1.434918, 3.566551, 6.789003)(uchar), (0.338806, 0.697945, 1.005605)(float) |
BGR2BGRA | (0.974841, 3.108321, 4.440421)(uchar), (1.573301, 5.776895, 7.338641)(float) | (0.604667, 0.734449, 0.758314)(uchar), (0.732046, 1.038377, 1.395864)(float) | (0.223615, 0.307121, 0.532409)(uchar), (0.450082, 0.812675, 0.923834)(float) |
BGRA2BGR | (0.850672, 1.801313, 4.369492)(uchar), (1.650191, 6.152087, 7.176821)(float) | (0.647767, 0.869807, 1.012246)(uchar), (0.739518, 1.264502, 1.286878)(float) | (0.217492, 0.363819, 0.620785)(uchar), (0.431237, 0.778965, 0.938064)(float) |
BGR2RGB | (1.062282, 3.434147, 4.355159)(uchar), (1.893317, 6.852732, 8.067297)(float) | (0.889952, 1.079841, 1.154151)(uchar), (1.037111, 1.520092, 2.171429)(float) | (0.180235, 0.421414, 0.644859)(uchar), (0.502864, 0.909259, 1.063143)(float) |
BGRA2RGBA | (0.958358, 3.043023, 4.575337)(uchar), (2.043202, 5.882593, 6.744154)(float) | (1.149952, 1.196268, 1.613213)(uchar), (1.158094, 1.301830, 2.829942)(float) | (0.592855, 1.232628, 1.372080)(uchar), (1.211833, 1.977722, 2.060533)(float) |
BGR2GRAY | (2.756502, 4.425517, 8.784967)(uchar), (0.874284, 5.229536, 5.774894)(float) | (1.566102, 1.736963, 1.852622)(uchar), (0.739817, 1.037248, 1.222367)(float) | (0.440939, 0.679690, 0.874412)(uchar), (0.310942, 0.611687, 0.815759)(float) |
BGRA2GRAY | (2.969068, 5.563711, 5.729600)(uchar), (1.632213, 5.141780, 5.655110)(float) | (1.912655, 2.011225, 2.111646)(uchar), (0.870554, 1.195535, 1.261483)(float) | (0.417412, 0.941263, 1.102902)(uchar), (0.449798, 0.703511, 0.878358)(float) |
GRAY2BGR | (0.740988, 1.296377, 2.584280)(uchar), (0.858851, 3.725808, 6.151738)(float) | (0.220107, 0.483014, 0.507121)(uchar), (0.464497, 0.760800, 1.009707)(float) | (0.053869, 0.187043, 0.254366)(uchar), (0.179098, 0.397241, 0.689631)(float) |
GRAY2BGRA | (0.779905, 2.430677, 2.675184)(uchar), (1.027257, 4.753796, 6.033175)(float) | (0.314025, 0.799843, 0.863792)(uchar), (0.925298, 1.101186, 1.202580)(float) | (0.206335, 0.307763, 0.367603)(uchar), (0.352923, 0.645320, 0.757814)(float) |
BGR2YCrCb | (3.463624, 4.462537, 13.514200)(uchar), (1.558110, 5.286368, 7.172311)(float) | (3.008087, 3.345875, 3.759379)(uchar), (0.902357, 1.120763, 1.280343)(float) | (0.802399, 1.435395, 1.929783)(uchar), (0.441911, 0.767839, 0.877563)(float) |
YCrCb2BGR | (3.287511, 4.426856, 15.504667)(uchar), (1.724060, 5.791151, 7.035543)(float) | (1.215096, 1.711383, 2.954303)(uchar), (1.142472, 1.223283, 2.570193)(float) | (0.716777, 1.666506, 2.011290)(uchar), (0.426489, 0.720878, 0.867341)(float) |
BGR2HSV | (3.209511, 4.072085, 11.681967)(uchar), (1.990049, 5.160914, 7.105344)(float) | (5.330833, 6.057025, 6.163449)(uchar), (1.574174, 1.750919, 2.487114)(float) | (1.249934, 1.948987, 3.271431)(uchar), (0.817982, 1.100557, 1.139319)(float) |
HSV2BGR | (4.526536, 6.578686, 23.392593)(uchar), (3.425116, 7.055993, 9.749869)(float) | (15.951101, 17.846157, 20.810139)(uchar), (3.496314, 3.726228, 5.755785)(float) | (3.073298, 5.776957, 9.423345)(uchar), (1.922469, 2.595319, 2.753271)(float) |
BGR2LAB | (3.291130, 4.147037, 10.681438)(uchar), (6.117594, 8.086083, 20.700696)(float) | (0.141159, 0.151672, 0.207139)(uchar), (0.197151, 0.249189, 0.353354)(float) | (1.761170, 5.503978, 6.140955)(uchar), (3.879701, 12.294167, 13.732485)(float) |
LAB2BGR | (11.207289, 15.564405, 41.478815)(uchar), (8.616276, 12.089414, 29.431104)(float) | (0.387117, 0.426584, 0.729339)(uchar), (7.340573, 10.229358, 14.307704)(float) | (2.089282, 7.045394, 8.734020)(uchar), (4.319507, 6.692209, 7.446671)(float) |
NV122BGR | (6.955140, 9.007715, 9.025621)(uchar) | (1.193405, 1.931122, 2.156615)(uchar) | (0.290176, 0.754971, 1.205207)(uchar) |
NV122BGRA | (7.691295, 8.168457, 8.423945)(uchar) | (1.466854, 2.190175, 2.432521)(uchar) | (0.313627, 1.018684, 1.479219)(uchar) |
NV212BGR | (9.523349, 9.964800, 11.229756)(uchar) | (1.284312, 1.943022, 2.188577)(uchar) | (0.272937, 0.758974, 1.178800)(uchar) |
NV212BGRA | (7.414431, 8.551869, 8.726700)(uchar) | (1.859573, 2.181483, 2.496289)(uchar) | (0.415347, 1.025094, 1.506224)(uchar) |
BGR2I420 | (7.274811, 10.599371, 11.101121)(uchar) | (1.496386, 3.322885, 3.519494)(uchar) | (0.387179, 1.186616, 1.921722)(uchar) |
BGRA2I420 | (10.366012, 12.612798, 13.864323)(uchar) | (2.689910, 3.687712, 3.914777)(uchar) | (0.573129, 1.420915, 2.265892)(uchar) |
I4202BGR | (8.714725, 9.596988, 10.294361)(uchar) | (1.905832, 3.420804, 3.786738)(uchar) | (0.493650, 1.431669, 2.401181)(uchar) |
I4202BGRA | (8.039565, 8.908183, 8.945429)(uchar) | (2.384061, 3.614948, 3.872713)(uchar) | (0.611662, 1.702791, 2.716904)(uchar) |
YUV2GRAY | (0.731453, 2.108811, 2.252680)(uchar) | (0.090324, 0.507611, 0.801996)(uchar) | (0.045608, 0.186470, 0.264298)(uchar) |
UYVY2BGR | (8.105145, 10.242159, 10.581748)(uchar) | (3.303848, 5.221017, 5.878847)(uchar) | (0.790456, 2.250966, 3.042487)(uchar) |
UYVY2GRAY | (7.581833, 13.641688, 22.882275)(uchar) | (1.430621, 3.639795, 3.652843)(uchar) | (0.693652, 1.544861, 1.906346)(uchar) |
YUYV2BGR | (8.315967, 10.240000, 10.774420)(uchar) | (3.426641, 5.112942, 5.776664)(uchar) | (0.643460, 2.159281, 2.867851)(uchar) |
YUYV2GRAY | (7.600400, 13.997617, 22.851875)(uchar) | (1.334884, 3.576374, 3.664124)(uchar) | (0.346315, 1.486616, 1.977368)(uchar) |
CopyMakeBorder | (1.291183, 2.437038, 5.501862)(uchar), (1.993371, 4.043501, 7.029448)(float) | (0.201976, 0.871601, 1.463774)(uchar), (0.361648, 0.994192, 1.683650)(float) | (0.055294, 0.228041, 1.088783)(uchar), (0.072316, 0.479716, 1.236188)(float) |
Dilate | (0.493765, 1.168194, 2.552728)(uchar), (1.113849, 2.199795, 4.701637)(float) | (0.208704, 0.966052, 4.310404)(uchar), (0.605631, 1.580653, 3.112223)(float) | (0.132084, 0.763279, 5.039372)(uchar), (0.308198, 1.777006, 3.824586)(float) |
Erode | (0.496050, 1.200636, 2.749696)(uchar), (1.177196, 2.371961, 5.822210)(float) | (0.170810, 0.825647, 3.633979)(uchar), (0.491056, 1.435523, 2.788140)(float) | (0.131214, 0.745331, 5.016607)(uchar), (0.347553, 1.625592, 3.779340)(float) |
Flip | (0.964233, 2.506900, 5.677341)(uchar), (1.939363, 3.829587, 6.028222)(float) | (0.121993, 0.584105, 1.291385)(uchar), (0.266594, 1.224184, 3.340781)(float) | (0.043007, 0.435122, 2.476177)(uchar), (0.144471, 0.696561, 2.201557)(float) |
Resize | (1.952568, 10.617821, 24.096063)(uchar), (1.694113, 7.237342, 54.583162)(float) | (0.084918, 3.830496, 17.331950)(uchar), (1.167608, 3.822324, 8.110966)(float) | (0.021003, 0.854726, 3.856212)(uchar), (0.266691, 1.215268, 2.896390)(float) |
WarpAffine | (4.865464, 27.285814, 59.551404)(uchar), (5.973823, 19.805469, 78.941249)(float) | (6.508233, 14.626695, 37.170669)(uchar), (5.757505, 12.219769, 25.183554)(float) | (0.999877, 3.840523, 25.156036)(uchar), (1.130103, 3.302975, 15.866380)(float) |
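
To read the table programmatically, a cell can be split into one (minimum, median, maximum) tuple per data type. The sketch below assumes the cell layout used in the table above; `parse_cell` and `slower_than_opencv` are hypothetical helper names, not part of ppl.cv.

```python
import re

# Matches "(min, median, max)(dtype)" groups as they appear in the table cells.
CELL_PATTERN = re.compile(r"\(([\d.]+),\s*([\d.]+),\s*([\d.]+)\)\((\w+)\)")


def parse_cell(cell):
    """Map each data type in a table cell to its (min, median, max) speedup."""
    return {
        dtype: (float(low), float(median), float(high))
        for low, median, high, dtype in CELL_PATTERN.findall(cell)
    }


def slower_than_opencv(cell):
    """List the data types whose median speedup is below 1.0."""
    return [dtype for dtype, (_, median, _) in parse_cell(cell).items()
            if median < 1.0]


# The "Add" row, Arm Mali G710 column, copied from the table.
add_mali = ("(0.093987, 0.295026, 0.703502)(uchar), "
            "(0.258608, 0.565505, 0.976595)(float)")
print(parse_cell(add_mali)["uchar"])  # (0.093987, 0.295026, 0.703502)
print(slower_than_opencv(add_mali))   # ['uchar', 'float']
```

A median below 1.0 means that, for at least half of the tested parameter combinations, the OpenCL implementation took longer than the OpenCV baseline on that device.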