Weekly Report 6
Status and Accomplishment
- Worked on performance benchmarking of the Sobel algorithm. To my surprise, the DSP version turned out to be slower than the non-DSP OpenCV Sobel implementation. I then changed how the Codec Engine is called. Earlier, the Codec Engine was opened and closed every time the algorithm was called. Now it stays open throughout execution and is closed once at the end, when all processing is done. Performance improved and is now close to the non-DSP OpenCV implementation: processing and displaying the video "tree.avi", which ships with the OpenCV examples, frame by frame with a 50 ms wait between frames takes around 6 seconds, compared to 5 seconds with the non-DSP version. I am still looking into ways to boost performance further.
- Extended the Sobel algorithm. It now works with 5x5 and 7x7 kernels.
- I am currently working on implementing cvIntegral and extending DFT. I tested the algorithm for calculating the integral image; some more work is needed before it can be applied to images.
- Looked into integrating my library with the existing OpenCV library. Modified the existing library to conditionally call my library after some error checking and an environment-variable check. Had some issues with CMake, which I have described under Blockers.
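The two design points above (keeping the engine open across calls, and dispatching to the DSP library only when an environment variable asks for it) can be sketched roughly as follows. This is only an illustration: `DspSession`, `dsp_session_open()`, `dsp_session_close()`, and the variable name `OPENCV_USE_DSP` are hypothetical stand-ins, not the actual Codec Engine calls or the names used in my library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an opened DSP engine; NOT a Codec Engine type. */
typedef struct { int is_open; } DspSession;

static DspSession g_session;   /* single session kept for the whole run */
static int g_open_calls = 0;   /* how many times the costly open step ran */

/* Placeholder for the expensive open step (assumed to return 0 on success). */
static int dsp_session_open(DspSession *s)
{
    assert(s != NULL);
    s->is_open = 1;
    ++g_open_calls;
    return 0;
}

/* Placeholder for the matching close step. */
static void dsp_session_close(DspSession *s)
{
    assert(s != NULL);
    s->is_open = 0;
}

/* Open the session on first use only; later calls reuse it. */
static DspSession *dsp_session_get(void)
{
    if (!g_session.is_open && dsp_session_open(&g_session) != 0)
        return NULL;
    return &g_session;
}

typedef enum { BACKEND_CPU, BACKEND_DSP } Backend;

/* Choose the backend from the value of the (hypothetical) environment
 * variable: only the exact string "1" selects the DSP path. */
Backend sobel_dispatch(const char *env_value)
{
    if (env_value != NULL && strcmp(env_value, "1") == 0 &&
        dsp_session_get() != NULL)
        return BACKEND_DSP;    /* the DSP Sobel would be called here */
    return BACKEND_CPU;        /* fall back to the stock OpenCV path */
}

/* Convenience wrapper that reads the variable from the environment. */
Backend sobel_dispatch_env(void)
{
    return sobel_dispatch(getenv("OPENCV_USE_DSP"));
}

/* Number of times the session was actually opened. */
int dsp_open_count(void) { return g_open_calls; }

/* Close the session once, when all processing is done. */
void dsp_shutdown(void)
{
    if (g_session.is_open)
        dsp_session_close(&g_session);
}
```

With this shape, a per-frame loop calls `sobel_dispatch_env()` as often as it likes, the open cost is paid at most once, and `dsp_shutdown()` is called a single time after the last frame.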
Plans
- Look further into the performance hurdles and try to overcome them.
- Expand the integral and DFT algorithms.
- Look into their performance and compare it with the non-DSP OpenCV algorithms.
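For reference, the integral image mentioned above can be computed in a single pass. The following is a minimal plain-C sketch, not the DSP-side code: it assumes an 8-bit single-channel input and follows the cvIntegral convention of a (width+1) x (height+1) sum image whose first row and column are zero. The function names `integral_image()` and `rect_sum()` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the integral image of an 8-bit grayscale image.
 * sum must hold (height + 1) * (width + 1) entries; afterwards
 * sum[(y + 1) * (width + 1) + (x + 1)] is the sum of all source
 * pixels in rows 0..y and columns 0..x. src_stride is the number
 * of bytes between the starts of consecutive source rows. */
void integral_image(const uint8_t *src, int width, int height,
                    size_t src_stride, uint32_t *sum)
{
    assert(src != NULL && sum != NULL && width > 0 && height > 0);
    const size_t sw = (size_t)width + 1;   /* row length of the sum image */

    for (size_t x = 0; x < sw; ++x)
        sum[x] = 0;                        /* zero first row */

    for (int y = 0; y < height; ++y) {
        const uint32_t *prev = sum + (size_t)y * sw;
        uint32_t *cur = sum + ((size_t)y + 1) * sw;
        uint32_t row = 0;                  /* running sum of the current row */
        cur[0] = 0;                        /* zero first column */
        for (int x = 0; x < width; ++x) {
            row += src[(size_t)y * src_stride + (size_t)x];
            cur[x + 1] = prev[x + 1] + row;
        }
    }
}

/* Sum of the source pixels in columns x0..x1-1 and rows y0..y1-1,
 * obtained with four lookups into the integral image. */
uint32_t rect_sum(const uint32_t *sum, int width,
                  int x0, int y0, int x1, int y1)
{
    const size_t sw = (size_t)width + 1;
    return sum[(size_t)y1 * sw + (size_t)x1] - sum[(size_t)y0 * sw + (size_t)x1]
         - sum[(size_t)y1 * sw + (size_t)x0] + sum[(size_t)y0 * sw + (size_t)x0];
}
```

The running row sum keeps the inner loop at one add per pixel plus one add with the row above, which is the kind of regular access pattern that should also map reasonably well to a DSP implementation.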
Blockers
- When building the OpenCV library after integrating my library with the existing algorithm, the linker could not find my library. After losing almost a day on this, I gave up and moved on to other tasks. I will look into the CMake build procedure and the changes it needs in more detail once I am done with the other algorithms. Until then, I plan to look into it only during free time.
To improve performance, you might try the asynchronous APIs (i.e., rather than UNIVERSAL_process(), consider queuing up multiple operations with UNIVERSAL_processAsync()/processWait()). http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/ce/latest_2_x/docs/html/group__ti__sdo__ce__universal___u_n_i_v_e_r_s_a_l.html
Also, depending on your DSP-side algorithm's implementation, you may also get a lift by enabling the recently introduced Server.skelCachingPolicy feature and setting it to .WBINVALL. In some systems, with algs that manage large data buffers, this can improve performance. http://processors.wiki.ti.com/index.php/Codec_Engine_skelCachingPolicy