Introduction
In 1994, an ambitious project in the multimedia domain was started at
the University of Mannheim under the guidance of
Prof. Dr. W. Effelsberg. We realized that multimedia applications
using continuous media such as video and audio absolutely require
access to the semantic content of these media types, in a manner
similar to that available for textual and numerical data. Imagine a
situation for textual media in which large digital collections of
books, reports, articles, etc. exist, but nobody is able to search
them for pertinent keywords. Content analysis of continuous data,
especially of video data, is currently based mainly on manual
annotation. This implies that the searchable content is reduced to the
annotated content, which usually does not contain the required
information. The aim of the MoCA project is therefore to extract the
structural and semantic content of videos automatically.

Over the past years, various applications have been implemented, and
the scope of the project has concentrated on the analysis of movie
material as found on TV, in cinemas, and in video-on-demand
databases. This has provided a large amount of input data for our
algorithms. The algorithms developed for video and audio analysis
therefore focus on movie material; however, they are also applicable
to general video and audio material.

Analysis features developed and used within the MoCA project fall into
four categories:
- features of single pictures (frames), such as brightness, colors, and text,
- features of frame sequences, such as motion and video cuts,
- features of the audio track, such as audio cuts and loudness, and
- combinations of features from the three classes above, used to
extract, e.g., scenes.
The first two categories are usually considered together and called
video features. We have implemented a large number of well-known and
new features in all categories. Details can be found in our
publications.