VSD Technologies in Safe City Storage and Application Development

As IT development intensifies, the public security sector is making significant updates to its systems through the accelerated rollout of Safe City initiatives. Video surveillance is a key component of these build-outs, as shown by the growing amount of surveillance equipment on the street. Deploying surveillance devices helps maintain law and order, combat crime, and safeguard social stability.

However, the amount of data generated, stored, and accessed has increased massively. This places tremendous pressure on already overworked systems and makes ease of application and effective management all the more important. Video Structured Description (VSD) technologies provide the breakthrough needed to keep up with demand. They are poised to become a mainstay in the future development of Safe City projects.

Capacity is not the only storage bottleneck

Network-based HD video systems have become the norm in Safe City deployment. Many vendors offer applications and management utilities to access front-end digital video streams for browsing, storage, and playback, including Network Video Recording (NVR) and Cloud Video Recording (CVR).

In these distributed architectures, interconnections are implemented through IP networks complying with ONVIF, GB28181, and other communications standards. Open communication protocols and interfaces allow the video surveillance platform to connect with more types of devices while allowing for enhanced scalability.

Advances in the cloud

Network-based HD video surveillance systems are also benefiting from advances in cloud computing and storage technology. Most existing cloud platforms can deliver strong computing capabilities while mass storage can provide a viable platform for video surveillance management and applications. The powerful storage capabilities of the cloud can be further leveraged for the interconnections in the surveillance network to solve some of the existing storage shortages.

However, even if network transmission and video storage capacity are upgraded, storing HD video surveillance content still creates enormous pressure. For example, 1080p HD video is typically encoded with H.264, a format with a high compression ratio, and the encoded HD digital video stream runs between 4 Mbps and 8 Mbps. Consequently, each video channel produces nearly 3 GB of data per hour. Allowing for changes in activity and in the amount of recorded content, that amounts to about 50 GB per day.
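
The bitrate figures above can be checked with a short calculation. The following is a minimal sketch, assuming a constant bitrate and decimal units (1 GB = 10^9 bytes); the function name is illustrative and not part of any surveillance product.

```python
def stream_volume_gb(bitrate_mbps: float, seconds: float) -> float:
    """Data volume in GB (10**9 bytes) for a stream at a constant bitrate."""
    bits = bitrate_mbps * 1_000_000 * seconds
    return bits / 8 / 1_000_000_000


SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Lower and upper ends of the 4 Mbps to 8 Mbps range quoted above.
low_per_hour = stream_volume_gb(4, SECONDS_PER_HOUR)    # 1.8 GB
high_per_hour = stream_volume_gb(8, SECONDS_PER_HOUR)   # 3.6 GB
low_per_day = stream_volume_gb(4, SECONDS_PER_DAY)      # 43.2 GB
high_per_day = stream_volume_gb(8, SECONDS_PER_DAY)     # 86.4 GB
```

Under these assumptions one channel produces between 1.8 GB and 3.6 GB per hour and between 43.2 GB and 86.4 GB per day, which brackets the per-hour and per-day figures quoted in the text.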

Alleviating the bottleneck with efficiency and speed

For a city deploying tens of thousands of cameras, conventional storage systems prove incapable of handling the network traffic or accommodating the storage requirements. The problem of storing vast amounts of data cannot be solved by expanding storage capacity alone. Alleviating the bottleneck also depends on the efficiency and speed of the applications.
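
To give a sense of scale, the sketch below multiplies the per-channel figures across a camera fleet. The fleet size of 10,000 is a hypothetical value chosen only to represent "tens of thousands"; the 50 GB per day and 4 Mbps inputs come from the figures quoted earlier in the article.

```python
def fleet_daily_ingest_tb(cameras: int, gb_per_camera_per_day: float) -> float:
    """Total data written per day across all cameras, in TB (decimal)."""
    return cameras * gb_per_camera_per_day / 1000


def fleet_throughput_gbps(cameras: int, bitrate_mbps: float) -> float:
    """Aggregate sustained network throughput across all cameras, in Gbps."""
    return cameras * bitrate_mbps / 1000


CAMERAS = 10_000  # hypothetical fleet size, not a figure from the article

daily_tb = fleet_daily_ingest_tb(CAMERAS, 50)    # 500 TB per day
throughput = fleet_throughput_gbps(CAMERAS, 4)   # 40 Gbps sustained
```

Even at the low end of the bitrate range, this hypothetical fleet writes about 500 TB every day and needs roughly 40 Gbps of sustained aggregate bandwidth, which illustrates why adding capacity alone does not remove the bottleneck.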

Amount of redundant data vs. extracting useful information

Video data contains a great deal of redundant information. Unlike other types of data, it is visual, which complicates effective searches. Because of these visual attributes and the limited processing capabilities available, there are few workable automated approaches. As a result, vast amounts of labor must be expended to monitor live video feeds or browse recorded content.

Monitoring personnel are often responsible for viewing real-time feeds from dozens of cameras, and workers monitoring public venues with hundreds or thousands of cameras are forced to watch only a handful of the more important or incident-prone areas. The operator's level of alertness, skill in operating the cameras, and other factors all affect the video surveillance task. These factors make it difficult to notice unusual events at monitored sites, especially given the lack of automated video processing technologies to help with filtering. Video surveillance is currently resigned to a static state. The situation not only complicates rapid response, but also leaves early detection and intervention far from the level needed to ensure complete safety during large-scale events.

Lack of efficient, accurate video search and retrieval methods

A dedicated, highly efficient, and accurate means of searching and retrieving video data has yet to appear. Specific content must be found through primitive manual browsing, which is an inefficient approach that leads to high costs.

Data silos, resource integration, and interoperability

A major issue limiting data sharing is the vast amount of video already competing for limited bandwidth. Further complicating the situation are other sensory data integrated into the surveillance system to reduce data silos and make police work more effective.

For example, integrating RFID authentication and synthesizing multiple sources of information from beat officers, detectives, and other police can help provide the needed information and analysis for early detection and decision-making. To be effective, these multi-dimensional inputs are needed, yet they also strain the surveillance system and present challenges in resource integration and interoperability.

Effective solutions in VSD-based technology

In looking at existing video surveillance systems, video capture and simple storage models lack an effective way to sift and accurately retrieve desired information from the massive amounts of video. This leads to a considerable waste in storage space and complicates leveraging the value of the content. VSD technology can solve the existing dilemmas in storage and applications.

VSD uses time segmentation, object recognition, and other means to extract key features in the footage and determine their syntactic relationships. The technology then collates that information into text that computers and people can read. There are two main layers: applying text to the video content and associating the video resources.

Layer 1: Applying text

In the first layer, video content is collated into standardized descriptive formats so the objects of interest and their identified behavior and features can be put into text form. This layer is an intelligent process to extract and organize the information in the video resources.
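
As an illustration of what such a text description might look like, the following minimal sketch defines one record and serializes it to JSON. The article does not specify a schema, so every field name here (`camera_id`, `object_type`, `attributes`, `behavior`, and so on) is a hypothetical choice made for the example.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ObjectDescription:
    """One structured description of an object of interest (hypothetical schema)."""
    camera_id: str
    start_time: str            # ISO 8601 timestamp of the video segment start
    end_time: str              # ISO 8601 timestamp of the video segment end
    object_type: str           # e.g. "vehicle" or "person"
    attributes: dict = field(default_factory=dict)  # identified features
    behavior: str = ""         # identified behavior, as free text


def to_text(record: ObjectDescription) -> str:
    """Serialize a record to JSON text that people and programs can both read."""
    return json.dumps(asdict(record), sort_keys=True)


example = ObjectDescription(
    camera_id="cam-01",
    start_time="2015-01-01T08:00:00",
    end_time="2015-01-01T08:00:12",
    object_type="vehicle",
    attributes={"color": "red", "plate": "ABC123"},
    behavior="turning left",
)
example_text = to_text(example)
```

A record like this occupies a few hundred bytes, compared with the megabytes of video it describes, and it can be indexed and queried with ordinary text tools.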

Layer 2: Applying syntactical associations

The second layer applies syntactical associations to the video captured from cameras at different locations or filmed from different angles. The second layer uses data mining tools for highly efficient analysis, making retrieval of pertinent and syntactic information across the entire surveillance system and other information systems possible. Layer 2 collates, manages, and mines the data in the video resources and also assists with tasks in other systems.
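
One simple way to picture the association step is to group descriptions from different cameras by a shared feature. The sketch below is an assumption-laden illustration rather than the method VSD systems actually use: it links records that carry the same attribute value (a hypothetical `plate` field) and keeps only the groups seen by more than one camera.

```python
from collections import defaultdict


def associate(records, attribute):
    """Group records by a shared attribute value across cameras.

    Returns a dict mapping each attribute value to its records, ordered by
    start time, keeping only values observed on at least two cameras.
    """
    groups = defaultdict(list)
    for rec in records:
        value = rec.get("attributes", {}).get(attribute)
        if value is not None:
            groups[value].append(rec)

    tracks = {}
    for value, recs in groups.items():
        cameras = {r["camera_id"] for r in recs}
        if len(cameras) > 1:
            tracks[value] = sorted(recs, key=lambda r: r["start_time"])
    return tracks


# Hypothetical sample data; ISO 8601 strings sort chronologically.
sample = [
    {"camera_id": "cam-02", "start_time": "2015-01-01T08:05:00",
     "attributes": {"plate": "ABC123"}},
    {"camera_id": "cam-01", "start_time": "2015-01-01T08:00:00",
     "attributes": {"plate": "ABC123"}},
    {"camera_id": "cam-01", "start_time": "2015-01-01T08:01:00",
     "attributes": {"plate": "XYZ789"}},
]
tracks = associate(sample, "plate")
```

Here the vehicle with plate `ABC123` is linked across `cam-01` and `cam-02` in time order, while `XYZ789` is dropped because only one camera saw it.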

Syntactic structure

VSD technology allows video content to be understood in formats that can be more easily processed. It leverages the accumulated information in the databases to collate and analyze the objects of interest, behavior, and events in the video archives. VSD provides a structured description of the video content according to pre-built object types, features, and associations to extract the useful syntactic relationships between the elements.

More simply, VSD extracts data from the video and places it into a standard syntactic structure using a pattern recognition process. These constructs allow the information to be effectively extracted and integrated to enable correlative analysis of the indexed, retrieved, and digested video data. The system can also store only the key target images and text features, which consolidates the video data and considerably reduces pressure on existing systems.

VSD applications in storage models

During the VSD process, structured descriptive data is attached to video, image, and other types of content. This forms the basis for analyzing the many types of data on the platform. The data used for video analysis, processing, retrieval, and other processes each has its own distinctive input and output modes as well as access rules. As the application data associated with VSD continues to mushroom, system design must fully consider these attributes from the very start to ensure the entire system is able to operate efficiently.

Coupling computing and storage is important in video surveillance and other Big Data platforms to avoid bottlenecks across the entire storage system. Engineers must fully consider the computing model and how the massive amount of data is accessed and analyzed when designing data storage policies.

Storage design considerations

Video and image data storage designs must consider how to most efficiently apply structured descriptions. This type of data usually occupies much of the storage space, meaning considerable network bandwidth is occupied when accessing the video and image content. In order to improve the efficiency of the structured descriptions, the burden of processing real-time video content should be shared across multiple service nodes while considering the access model and capacity of the system architecture. At the same time, the computing node should be placed as close as possible to the original video feed to reduce the I/O overhead associated with moving the data within the processing cluster.
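
The placement idea above can be sketched as a greedy assignment: each camera stream goes to the closest processing node that still has free capacity. This is only one possible policy, shown under simplifying assumptions. The `distance` table (for example, network hops between a camera and a node) and the per-node stream capacities are hypothetical inputs.

```python
def assign_streams(cameras, capacities, distance):
    """Assign each camera to the nearest node with remaining capacity.

    cameras:    list of camera ids, processed in order
    capacities: dict of node id -> number of streams the node can process
    distance:   dict of (camera id, node id) -> cost, e.g. network hops
    """
    remaining = dict(capacities)
    assignment = {}
    for cam in cameras:
        candidates = [node for node, free in remaining.items() if free > 0]
        if not candidates:
            raise RuntimeError("no processing capacity left for " + cam)
        # Prefer the closest node; break ties by most free capacity, then id.
        best = min(candidates,
                   key=lambda n: (distance[(cam, n)], -remaining[n], n))
        assignment[cam] = best
        remaining[best] -= 1
    return assignment


# Hypothetical topology: node A is near c1 and c2 but can take one stream.
plan = assign_streams(
    ["c1", "c2", "c3"],
    {"A": 1, "B": 2},
    {("c1", "A"): 1, ("c1", "B"): 3,
     ("c2", "A"): 1, ("c2", "B"): 2,
     ("c3", "A"): 2, ("c3", "B"): 1},
)
```

In this example `c1` takes the only slot on node `A`, so `c2` falls back to node `B` even though `A` is closer. That is the trade-off between locality and load sharing that the paragraph describes.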

In the VSD system, the structured descriptive data for the video content is the most important data structure. The design of the storage policies will have a significant impact on overall system performance. Various statistical and discovery (search and retrieval) applications need to access the structured descriptive data, and each of them has its own distinct computing model, which seriously complicates storage system design.

For text-based search applications, the descriptive data is more applicable to the storage system database or the text-search server because the output description is the desired objective of the structure. For image-based retrieval, the descriptive data should be stored as files in the distributed file system to facilitate concurrent processing for the algorithms that compare the attributes in the images.
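
For the text-based case, the core of what a text-search server does can be pictured as an inverted index. The sketch below is a deliberately simplified, in-memory version assuming whitespace-separated terms; a production system would use a dedicated search server rather than this code, and the record ids and descriptions are made up for the example.

```python
from collections import defaultdict


def build_index(descriptions):
    """Map each lowercase term to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in descriptions.items():
        for term in text.lower().split():
            index[term].add(record_id)
    return index


def search(index, query):
    """Return the ids of records that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical descriptions produced by the structuring step.
descriptions = {
    "r1": "red sedan northbound",
    "r2": "white van parked",
    "r3": "red van northbound",
}
index = build_index(descriptions)
hits = search(index, "red northbound")
```

The query touches only the short term lists for "red" and "northbound" rather than the video itself, which is what makes text search over descriptions fast.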

To further mine the structured descriptions and associated systems, analytical applications need to read a particular category, or column, of the data to carry out statistical analysis. To meet this common requirement, column-oriented databases or HBase-type column stores can be considered to improve the efficiency of the applications. Unlike the row-oriented layout of traditional relational databases, the core design of these two methods is sequential access to column data. Avoiding reads on every row of recorded data keeps access to a particular type of data efficient during global analysis.
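
The benefit of column-wise access can be shown with a toy example. The sketch below converts row-shaped records into per-column lists and then counts values in a single column without touching the others. It illustrates only the layout idea, not how HBase or any particular database stores data, and it assumes every row has the same fields; the sample rows are hypothetical.

```python
from collections import Counter


def to_columns(rows):
    """Convert a list of row dicts into a dict of column name -> value list."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def count_by(columns, name):
    """Count values in one column, reading only that column's list."""
    return Counter(columns[name])


# Hypothetical structured descriptions in row form.
rows = [
    {"camera_id": "cam-01", "object_type": "vehicle", "color": "red"},
    {"camera_id": "cam-02", "object_type": "person", "color": "blue"},
    {"camera_id": "cam-01", "object_type": "vehicle", "color": "white"},
]
columns = to_columns(rows)
type_counts = count_by(columns, "object_type")
```

A global statistic such as the number of vehicles seen needs only the `object_type` list, so the `camera_id` and `color` data are never read.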

For these applications, an appropriate data redundancy and replication level can ensure rapid access to the particular type of data on the computing node. An appropriate redundant storage policy can enhance fault tolerance while also enabling a high level of adaptability to different application models. However, the scale at which redundancy is implemented must also be weighed to avoid consistency issues during updates.


Video surveillance systems are an important part of Safe Cities. Incorporating VSD significantly improves storage efficiency and the benefits from applications in network-based video surveillance systems. VSD makes for a highly efficient storage model and can be applied to a wide range of application scenarios. VSD-based systems and off-shoot products are sure to become a major pillar in Safe City deployments.

By Dai Jie

Chinese Ministry of Public Security
