OpenCV DSP Acceleration: Weekly Report 6

Status and Accomplishment

Worked on performance benchmarking of the Sobel algorithm. To my surprise, the DSP version was slower than the non-DSP OpenCV Sobel implementation. I then changed how Codec Engine is called. Earlier, the engine was opened and closed around every call to the algorithm. Now it is opened once, stays open throughout execution, and is closed at the end when all processing is done. Performance improved and is now close to the non-DSP OpenCV implementation: processing and displaying the video "tree.avi", which ships with the OpenCV examples, frame by frame with a 50 ms wait between frames takes around 6 seconds, compared to 5 seconds without the DSP. I am still looking into ways to improve performance further.
Extended the Sobel algorithm. It now works with 5x5 and 7x7 kernels.
I am currently working on implementing cvIntegral and extending DFT. Tested the algorithm for calculating the integral image. Some more work is needed before it can be applied to images.
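
For reference, here is a minimal plain-C sketch of what an integral image computes. It is not the DSP-side code from this project; the function names `integral_image` and `box_sum` are made up for illustration. It assumes an 8-bit single-channel input and follows the cvIntegral convention of an output that is one row and one column larger than the input, with a zero first row and column.

```c
#include <assert.h>
#include <stdint.h>

/* Integral image (summed-area table) of an 8-bit, single-channel image.
 * The output has (height + 1) x (width + 1) entries, and
 *   sum[y][x] = sum of src over all rows < y and all columns < x,
 * so the first row and first column are zero.
 * integral_image and box_sum are illustrative names, not OpenCV API. */
void integral_image(const uint8_t *src, int width, int height,
                    int src_stride, int32_t *sum)
{
    int sum_stride = width + 1;
    int x, y;

    assert(src && sum && width > 0 && height > 0 && src_stride >= width);

    /* Zero first row. */
    for (x = 0; x <= width; x++)
        sum[x] = 0;

    for (y = 0; y < height; y++) {
        int32_t row_total = 0;             /* running sum of the current row */
        sum[(y + 1) * sum_stride] = 0;     /* zero first column */
        for (x = 0; x < width; x++) {
            row_total += src[y * src_stride + x];
            /* value above plus everything to the left in this row */
            sum[(y + 1) * sum_stride + (x + 1)] =
                sum[y * sum_stride + (x + 1)] + row_total;
        }
    }
}

/* Sum of the pixels in the rectangle [x0, x1) x [y0, y1),
 * obtained with four table lookups. */
int32_t box_sum(const int32_t *sum, int width,
                int x0, int y0, int x1, int y1)
{
    int s = width + 1;
    return sum[y1 * s + x1] - sum[y0 * s + x1]
         - sum[y1 * s + x0] + sum[y0 * s + x0];
}
```

Each output row depends only on the row above it and a running total, so the table is built in one pass over the image. Once it is built, any rectangular sum costs four lookups regardless of the rectangle size.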

Looked into integrating my library with the existing OpenCV library. Modified the existing library to conditionally call my library after some error checking and an environment-variable check. Had some issues with CMake, which I have described under Blockers.
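
The conditional call can be sketched roughly as follows. This is only an illustration of the idea, not the actual patch: the variable name `OPENCV_USE_DSP`, the value "1", and the functions `dsp_requested`, `should_use_dsp` and `select_backend` are all assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name; the real one may differ. */
#define DSP_ENV_VAR "OPENCV_USE_DSP"

typedef enum { BACKEND_CPU, BACKEND_DSP } backend_t;

/* Returns 1 only when the variable is set to exactly "1". */
int dsp_requested(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/* The DSP path is taken only if the DSP side is usable
 * and the user asked for it through the environment. */
int should_use_dsp(int dsp_available)
{
    return dsp_available && dsp_requested(getenv(DSP_ENV_VAR));
}

/* Decide which implementation handles a call.
 * args_ok stands for the error checks done before dispatch:
 * if they fail, fall back to the regular OpenCV code path. */
backend_t select_backend(int args_ok, int dsp_available)
{
    if (!args_ok)
        return BACKEND_CPU;
    return should_use_dsp(dsp_available) ? BACKEND_DSP : BACKEND_CPU;
}
```

Keeping the decision in one small function means the existing OpenCV code path stays the default, and the DSP path is taken only when every check passes.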

Plans

Blockers

When trying to build the OpenCV library after integrating my library with an existing algorithm, the linker could not find my library. After spending almost a day on it, I gave up and moved on to other tasks. I will look into the CMake build procedure and the changes it needs in more detail once I am done with the other algorithms. Meanwhile, I plan to look into it only in my free time.

Chris, June 28, 2010 at 9:29 AM
To improve performance, you might try the asynchronous APIs (i.e., rather than UNIVERSAL_process(), consider queuing up multiple operations with UNIVERSAL_processAsync()/processWait()). http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/ce/latest_2_x/docs/html/group__ti__sdo__ce__universal___u_n_i_v_e_r_s_a_l.html

Also, depending on your DSP-side algorithm's implementation, you may also get a lift by enabling the recently introduced Server.skelCachingPolicy feature and setting it to .WBINVALL. In some systems, with algs that manage large data buffers, this can improve performance. http://processors.wiki.ti.com/index.php/Codec_Engine_skelCachingPolicy
