Accomplishment
Plans
- Take a couple of days off.
- Add further APIs and refine the existing work.
- Provide support and help to anyone interested in this work.
Status and Accomplishment
- Completed porting all of my algorithms to C6accel. Added a new algorithm, cvCvtColor(), to support the CV_RGB2GRAY conversion. Added VLIB support to the C6accel library: cvIntegral() can now use VLIB_integralImage8(), and a flag controls whether it does. Users need to request VLIB access at www.ti.com/vlibrequest. Implemented chaining of the OpenCV APIs cvCvtColor() and cvSobel(); DSP_cvCvtColor_cvSobel() demonstrates this. Chaining reduces Codec Engine overhead between API calls.
- Worked on the API documentation and on the procedure for adding a new API to the existing library. The documentation is at http://code.google.com/p/opencv-dsp-acceleration/wiki/API_DOCUMENTATION and http://code.google.com/p/opencv-dsp-acceleration/wiki/Procedure_To_add_other_OpenCV_Algorithms
- Worked on an application that demonstrates the use of these APIs.
- Every API call now takes an almost constant ~380 usec to set up the asynchronous DSP function call. The ARM and the DSP must be synchronized with DSP_cvSyncDSP() before the output is accessed. For a 640x480 image this can give a performance boost of more than 10x in any algorithm, provided the application handles data dependencies carefully.
- Waiting for a C6accel tag to be created before releasing the code for evaluation.
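The call-then-synchronize pattern described above can be sketched with a POSIX thread standing in for the DSP. Everything named fake_* below is a hypothetical stand-in invented for illustration; it is not the C6accel or DSP_cv* API, whose signatures are not shown in this post. The sketch only illustrates the contract: the asynchronous call returns immediately, the ARM does unrelated work, and the output buffer is read only after the synchronization call (the role DSP_cvSyncDSP() plays).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one in-flight DSP request; NOT a C6accel type. */
typedef struct {
    pthread_t worker;
    const unsigned char *src;
    unsigned char *dst;
    size_t n;
    int pending;            /* 1 while a worker thread still has to be joined */
} fake_dsp_job;

/* Worker thread: pretends to be the DSP by inverting every pixel. */
static void *fake_dsp_run(void *arg)
{
    fake_dsp_job *job = arg;
    for (size_t i = 0; i < job->n; ++i)
        job->dst[i] = (unsigned char)(255 - job->src[i]);
    return NULL;
}

/* Starts the work and returns immediately (the role of an async DSP call).
 * Returns 0 on success, -1 if the worker could not be started. */
static int fake_dsp_invert_async(fake_dsp_job *job, const unsigned char *src,
                                 unsigned char *dst, size_t n)
{
    assert(job != NULL && src != NULL && dst != NULL);
    job->src = src;
    job->dst = dst;
    job->n = n;
    job->pending = (pthread_create(&job->worker, NULL, fake_dsp_run, job) == 0);
    return job->pending ? 0 : -1;
}

/* Blocks until the worker has finished (the role of DSP_cvSyncDSP()). */
static void fake_dsp_sync(fake_dsp_job *job)
{
    if (job->pending) {
        pthread_join(job->worker, NULL);
        job->pending = 0;
    }
}

/* Overlaps ARM-side work with the fake DSP job, then reads the result.
 * Returns the ARM-side sum of the input pixels, or -1 on failure. */
static int demo_overlap(unsigned char out[4])
{
    static const unsigned char src[4] = {0, 10, 200, 255};
    fake_dsp_job job;
    int arm_sum = 0;

    if (fake_dsp_invert_async(&job, src, out, 4) != 0)
        return -1;

    /* ARM-side work that never touches `out` overlaps with the worker. */
    for (size_t i = 0; i < 4; ++i)
        arm_sum += src[i];

    fake_dsp_sync(&job);    /* only after this point is `out` safe to read */
    return arm_sum;
}
```

Built with something like gcc -pthread, demo_overlap() computes the ARM-side sum while the worker fills the output buffer. Reading out[] before fake_dsp_sync() would be a data race, which is the hazard the synchronization call exists to prevent.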
Plans
- Refine the documentation.
- Review the code.
- Look into the DFT algorithm for a possible race condition, since I can see the result only when CE_DEBUG=3 is set.
Blockers
- Suggestions on my issue with the DFT algorithm, as described in the Plans section, would be helpful.
Status and Accomplishment
- OpenCV now allocates memory using CMEM in a contiguous region. This saves the overhead of copying buffers: the same buffer allocated by OpenCV can be passed to the DSP. Functionality is fine, but when the main() process exits, the following error message appears: 'CMEM Error: CMEM_exit() already called, check stderr output for earlier CMEM failure messages (possibly version mismatch).'
- Most of the time went into investigating asynchronous (ASYNC) DSP calls. All DSP_OpenCV calls are now ASYNC. There are now two APIs: the native OpenCV API, which is synchronous and runs on the ARM, and the ASYNC DSP_OpenCV API. This allows tasks to execute in parallel and frees the ARM for other work. An API is provided to synchronize the DSP and the ARM. Setting up an ASYNC call takes only 274 us for a 320x240 monochrome image and 305 us for a 640x480 monochrome image, whereas synchronous processing with the native OpenCV Sobel 3x3 algorithm takes 2655 us and 8820 us respectively. That is a benefit of roughly 9.7x and 28.9x respectively for an algorithm, provided tasks are scheduled properly with the latency of the DSP processing in mind.
- I am now working with the C6accel library. Thanks to the C6accel team for their support and for providing some tweaks to suit my needs.
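The performance ratios above follow directly from the quoted timings; the 320x240 case comes out just under 10. A minimal sketch of the arithmetic, assuming the ARM has other useful work to overlap with the DSP (the DSP's own processing latency is not given in this post, so it is left out). The function names are hypothetical helpers, not part of any library.

```c
#include <assert.h>

/* Ratio of ARM time spent in the synchronous native call to ARM time spent
 * merely setting up the asynchronous DSP call. This is an upper bound on the
 * ARM-side gain; it is reached only if the ARM stays busy with other work
 * while the DSP runs. */
static double arm_time_ratio(double sync_us, double async_setup_us)
{
    assert(sync_us > 0.0 && async_setup_us > 0.0);
    return sync_us / async_setup_us;
}

/* ARM microseconds freed per call under the same assumption. */
static double arm_time_freed_us(double sync_us, double async_setup_us)
{
    return sync_us - async_setup_us;
}
```

With the timings from this post, arm_time_ratio(2655, 274) is about 9.69 for 320x240 and arm_time_ratio(8820, 305) is about 28.92 for 640x480, freeing 2381 us and 8515 us of ARM time per call respectively.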
Plans
- Work more on performance and provide benchmarks for the implemented algorithms.
- Work on the documentation.
- Work on the application.
Blockers