Sound Research WIKINDX

WIKINDX Resources

Velivelli, A., Ngo, C.-W., & Huang, T. S. (2003). Detection of documentary scene changes by audio-visual fusion. Lecture Notes in Computer Science, 2728, 227–238. 
Added by: sirfragalot (06/09/2005 11:21:24 AM)   
Resource type: Journal Article
BibTeX citation key: Velivelli2003
Categories: General, Typologies/Taxonomies
Keywords: Semantic categorization
Creators: Huang, Ngo, Velivelli
Collection: Lecture Notes in Computer Science
The concept of a documentary scene was inferred from the audio-visual characteristics of certain documentary videos. It was observed that the amount of information from the visual component alone was not enough to convey a semantic context to most portions of these videos, but a joint observation of the visual component and the audio component conveyed a better semantic context. From the observations that we made on the video data, we generated an audio score and a visual score. We later generated a weighted audio-visual score within an interval and adaptively expanded or shrunk this interval until we found a local maximum score value. The video ultimately will be divided into a set of intervals that correspond to the documentary scenes in the video. After we obtained a set of documentary scenes, we made a check for any redundant detections.
Added by: sirfragalot  
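The interval search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-segment score lists, the mean-based weighted score, the weight w_audio, the one-step expand/shrink rule, and all function names below are assumptions made for illustration only. The abstract's final check for redundant detections is not covered.

```python
from typing import List, Sequence, Tuple


def fused_score(audio: Sequence[float], visual: Sequence[float],
                start: int, end: int, w_audio: float = 0.5) -> float:
    """Weighted mean of audio and visual scores over [start, end).

    Assumption: the paper's actual scoring function is not given in the
    abstract; a simple weighted mean stands in for it here.
    """
    n = end - start
    a = sum(audio[start:end]) / n
    v = sum(visual[start:end]) / n
    return w_audio * a + (1.0 - w_audio) * v


def find_scene_end(audio: Sequence[float], visual: Sequence[float],
                   start: int, init_len: int,
                   w_audio: float = 0.5) -> Tuple[int, float]:
    """Expand or shrink the interval end by one step at a time until
    neither move raises the fused score (a local maximum)."""
    total = len(audio)
    end = min(start + max(init_len, 1), total)
    best = fused_score(audio, visual, start, end, w_audio)
    while True:
        candidates = []
        if end + 1 <= total:          # try expanding
            candidates.append(end + 1)
        if end - 1 >= start + 1:      # try shrinking, keep at least one unit
            candidates.append(end - 1)
        for c in candidates:
            s = fused_score(audio, visual, start, c, w_audio)
            if s > best:              # strict improvement guarantees termination
                best, end = s, c
                break
        else:
            return end, best          # no neighbour improves: local maximum


def segment(audio: Sequence[float], visual: Sequence[float],
            init_len: int = 1, w_audio: float = 0.5) -> List[Tuple[int, int]]:
    """Split the whole sequence into consecutive [start, end) intervals."""
    scenes = []
    start = 0
    while start < len(audio):
        end, _ = find_scene_end(audio, visual, start, init_len, w_audio)
        scenes.append((start, end))
        start = end
    return scenes
```

Each returned pair is a half-open interval of segment indices. The starting length init_len and the weight w_audio would need tuning on real data.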
An experiment in combining video and audio analysis for indexing scenes and shots in documentaries by semantic context.
Added by: sirfragalot  
p.228   One of the team's observations of documentaries is that usually "the visual pattern has a counterpart audio pattern."

An example they give is:

audio class:     speech   <-------- speech + siren    <-------- speech
visual sequence: aircraft <-------- hanger [sic]/fire <-------- officer speaking
Added by: sirfragalot
p.231   For documentaries, they define six audio classes:

  • Speech
  • Speech + Music
  • Music
  • Speech + Noise
  • Noise
  • Silence
  Added by: sirfragalot