Digital video has become the face of television, the internet and mobile devices. According to an official blog post (May 2009), about 20 hours of video are uploaded to the YouTube site every minute of real time. This is equivalent to Hollywood releasing over 114,000 new full-length movies into theaters each week! But digital video also plays a huge role in biomedical devices, surveillance and manufacturing quality assurance.
Did you know there are approximately eight million users sharing 10 petabytes of data (mostly media files) at any given time? This accounts for nearly 10% of worldwide internet broadband connections [1]. So how can near real-time actionable intelligence be gleaned from the vast amounts of video data being generated? One answer is to exploit the power of emerging commodity multicore computers. When used properly, each core can run its own thread of computation, but new software applications will need to be developed to make this happen. Today there is a parallel programming gap between multicore systems and software applications. With the end of uniprocessor performance gains, the average software developer will have to write parallel programs to maintain performance growth. The goal in parallel computing is to perform multiple calculations simultaneously. The Pervasive DataRush™ (DataRush) platform exploits multiple forms of parallelism, facilitating concurrency in video processing and video analytics from spatio-temporal partitioning down to the pixel level.
There are several paths to parallelism given the languages and programming frameworks available today, and the most common one is data parallelism. Data parallelism is a simple divide-and-conquer technique that emerged from SPMD (single program, multiple data): the data is partitioned and distributed over multiple workers (nodes on a cluster, VMs on a cloud), each running the same program. Hadoop, an open source implementation of MapReduce, logically partitions the data and allocates one map task, called a Mapper, per partition. There may be hundreds of Mappers on a single machine. In this way a single-threaded legacy application can be deployed against subsets of a large-scale dataset in a cloud or grid environment. A second path is coarse-grain parallelism via parallelization of loops (TPL: Parallel.For, Parallel.ForEach and RParallel: runParallel) and arrays (ParallelArrays and INVOKE-IN-PARALLEL), with further orchestration onto multiple workers. True fine-grain parallelism requires writing complex and correct multithreaded programs. Fine-grain parallelism here refers to thread-level parallelism (not instruction-level parallelism).
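To make the data-parallel (SPMD) pattern described above concrete, here is a minimal sketch in Python. It is not DataRush or Hadoop code. The names `partition`, `mean_brightness` and `run_data_parallel` are hypothetical, and frames are assumed to be plain lists of pixel intensities. A thread pool is used only to keep the example self-contained; for CPU-bound work in CPython a process pool or separate machines would normally take its place.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def mean_brightness(frames):
    """The 'single program': it runs unchanged on every partition."""
    return [sum(frame) / len(frame) for frame in frames]


def run_data_parallel(frames, n_workers):
    """Partition the frames, run the same function on each chunk, merge results."""
    chunks = partition(frames, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in the same order as the input chunks.
        results = list(pool.map(mean_brightness, chunks))
    return [value for chunk in results for value in chunk]


if __name__ == "__main__":
    # Ten toy "frames" of three pixels each (illustrative data only).
    toy_frames = [[i, i + 1, i + 2] for i in range(10)]
    print(run_data_parallel(toy_frames, n_workers=3))
```

The worker function knows nothing about partitioning or scheduling, which is why a single-threaded legacy routine can be reused as-is in this model.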
This figure is a cartoon depiction of a data pipeline for video object detection using principal component analysis (PCA) for background subtraction. By projecting the original frame onto its eigenspace and subtracting the projected image from the original image, foreground objects become clearly identifiable. This work is based on Yilmaz et al. (2006). A white paper detailing this work can be found here.
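The projection-and-subtraction step can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the implementation from the white paper: frames are flat lists of grayscale intensities, the eigenspace is reduced to a single leading eigenvector, and that eigenvector is estimated by power iteration rather than a full eigendecomposition. The function names and the threshold value are hypothetical.

```python
import math


def mean_frame(frames):
    """Per-pixel mean over the training frames."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def top_component(centered, iterations=50):
    """Leading principal direction of mean-centered frames (power iteration)."""
    dim = len(centered[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # w = X^T (X v), computed without forming the covariance matrix.
        scores = [sum(x * y for x, y in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the training data
            break
        v = [x / norm for x in w]
    return v


def foreground_mask(frame, mean, component, threshold):
    """Project the frame onto the component, subtract, threshold the residual."""
    centered = [p - m for p, m in zip(frame, mean)]
    coeff = sum(c * e for c, e in zip(centered, component))
    residual = [c - coeff * e for c, e in zip(centered, component)]
    return [abs(r) > threshold for r in residual]


if __name__ == "__main__":
    # Toy background: a 9-pixel gradient under varying illumination.
    background = [10.0 * (i + 1) for i in range(9)]
    training = [[p * s for p in background] for s in (0.8, 0.9, 1.0, 1.1, 1.2)]
    mean = mean_frame(training)
    component = top_component([[p - m for p, m in zip(f, mean)] for f in training])

    # New frame: slightly brighter background plus a bright object at pixel 0.
    frame = [p * 1.05 for p in background]
    frame[0] += 100.0
    print(foreground_mask(frame, mean, component, threshold=30.0))
```

In this toy setup the illumination change is absorbed by the projection, so only the pixel covered by the object leaves a large residual.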
Video analytics can also be used in medicine for guided surgery and video tele-monitoring of patients. One use case (see Figure 2 below), and one task in this video processing pipeline, is the identification of regions of interest (ROIs) to support the decisions of physicians and clinicians. DataRush parallelism has been applied in experiments with K-Means clustering of digital colposcopic images to identify acetic acid enhanced pre-cancerous lesions (highlighted in red below). This image analysis can be applied concurrently to individual video frames in order to identify and label ROIs.
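As a rough illustration of the clustering step, the sketch below runs a plain K-Means over the RGB pixels of one frame, using only the Python standard library. It is not the DataRush experiment itself. The function names, the deterministic initialisation, and the rule of treating the brightest cluster as the candidate ROI are all assumptions made for the example, and the pixel values are invented.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two colour tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain K-Means; returns (labels, centroids)."""
    # Deterministic initialisation (an assumption): k evenly spaced points.
    step = max(1, len(points) // k)
    centroids = [tuple(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(ch) / len(members)
                                     for ch in zip(*members))
    return labels, centroids


def roi_mask(points, k=2):
    """Mark pixels in the brightest cluster as the candidate ROI (assumption)."""
    labels, centroids = kmeans(points, k)
    brightest = max(range(k), key=lambda c: sum(centroids[c]))
    return [lab == brightest for lab in labels]


if __name__ == "__main__":
    # Invented pixels: four pinkish tissue pixels, then four whitish ones.
    pixels = [(200, 120, 130), (205, 118, 128), (198, 125, 135), (202, 122, 131),
              (245, 240, 238), (250, 245, 240), (248, 242, 241), (252, 247, 243)]
    print(roi_mask(pixels))
```

Because each frame is clustered independently, the same function could be handed to one worker per frame, which is the frame-level concurrency mentioned above.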

The DataRush platform is designed specifically to make full use of emerging commodity multicore computers. It addresses gaps in design-time cost, programming, parallelism, scalability and performance per watt, enabling rapid prototyping of video processing applications.
The volume of video streaming onto televisions, computers and mobile devices is forcing video processing onto cloud and distributed environments. Current cloud and grid computing platforms are still not capable of real-time processing. Video processing is inherently parallel, and current solutions mostly leverage data parallelism. Emergent fine-grain parallelism in video processing exploits concurrency in slice-level, frame-level, intra-frame and pixel-level operations. Such fine granularity has traditionally been achieved using video encoding hardware, but this hardware-based approach lacks flexibility. Our approach introduces a video processing development platform that exploits multiple levels of parallelism while facilitating rapid development of agile and adaptive video analytics models.