Dataset The Movie DB (and Youtube) - Rostlab/DM_CS_WS_2016-17 GitHub Wiki

Dataset The Movie DB (+ Youtube)

  • Proposer: Chaoran Chen - @chaoran-chen
  • Votes:
    1. 🙋 @ssoima
    2. 🙋 @aserhany
    3. 🙋 @muhammadasad1

Summary

The Movie DB contains a lot of information about movies (and TV series) and offers free and open access through a website and a JSON-based API. In particular, it provides links to trailers and other clips on YouTube. With these YouTube videos, it is possible to extract interesting features from the audio tracks.

Prediction Goals

  • How much revenue will a movie generate?
  • Which cast fits a particular movie project?
  • How much budget is needed?
  • What rating would a video receive?

Long Description

The Movie DB provides information about more than 300,000 movies. Some of the features:

  • Title
  • Description
  • Release date
  • Runtime
  • Production company / country
  • Cast and crew
  • Genres, Keywords
  • Popularity, Ratings, Reviews
  • Budget, Revenue
  • Links to trailers / tracks (on YouTube)
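
As a rough illustration of how the trailer links could be collected, here is a minimal Python sketch. It assumes the API v3 layout as we understand it (base URL `https://api.themoviedb.org/3`, an `api_key` query parameter, and `append_to_response=videos` adding a `videos.results` list with `site`, `type` and `key` fields). These details should be checked against the current API documentation. The helper names and the sample response are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of The Movie DB API v3; verify against the official docs.
TMDB_BASE_URL = "https://api.themoviedb.org/3"


def build_movie_url(movie_id, api_key):
    """Build the request URL for one movie, including its video list."""
    query = urlencode({"api_key": api_key, "append_to_response": "videos"})
    return f"{TMDB_BASE_URL}/movie/{movie_id}?{query}"


def youtube_trailer_urls(movie):
    """Return YouTube watch URLs for all trailers in a decoded movie response."""
    results = movie.get("videos", {}).get("results", [])
    return [
        f"https://www.youtube.com/watch?v={video['key']}"
        for video in results
        if video.get("site") == "YouTube" and video.get("type") == "Trailer"
    ]


# Hypothetical, hand-written response fragment (not real API output).
sample_movie = {
    "title": "Example Movie",
    "budget": 1000000,
    "revenue": 5000000,
    "videos": {
        "results": [
            {"site": "YouTube", "type": "Trailer", "key": "abc123"},
            {"site": "YouTube", "type": "Featurette", "key": "def456"},
        ]
    },
}

print(build_movie_url(550, "YOUR_API_KEY"))
print(youtube_trailer_urls(sample_movie))
```

The URL can be fetched with any HTTP client and decoded with `json.loads`. Keeping the parsing separate from the download makes it easy to test on stored responses.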

Although these data may already be enough to make some predictions, the key idea of this data set is to combine them with the audio tracks of the trailers. In general, a trailer contains important scenes and has background music that reflects the atmosphere of the movie. Audio analysis tools can extract features such as tempo, beat and energy. It is also possible to extract the speech from the trailers, which often contains the first sentences that the audience hears from a movie. Furthermore, the ratings of the trailers can be an interesting feature.
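
To make the audio features more concrete, here is a simplified pure-Python sketch of two of them: energy (as root-mean-square amplitude) and tempo (as the strongest periodicity in an energy envelope, found by autocorrelation). It is a simplified illustration, not the method of any particular audio analysis tool. The function names, the 60 to 180 BPM search range and the synthetic test signals are assumptions made for the example, and a real pipeline would first decode the trailer audio into samples.

```python
import math


def rms_energy(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def frame_energies(samples, frame_size):
    """RMS energy of consecutive, non-overlapping frames."""
    last_start = len(samples) - frame_size
    return [
        rms_energy(samples[start:start + frame_size])
        for start in range(0, last_start + 1, frame_size)
    ]


def estimate_tempo_bpm(envelope, frames_per_second, min_bpm=60, max_bpm=180):
    """Estimate tempo from the autocorrelation peak of an energy envelope.

    Only lags that correspond to tempi between min_bpm and max_bpm are tried
    (an assumed, illustrative search range).
    """
    if len(envelope) < 2:
        raise ValueError("envelope is too short")
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]
    min_lag = max(1, round(frames_per_second * 60 / max_bpm))
    max_lag = min(len(centered) - 1, round(frames_per_second * 60 / min_bpm))
    if min_lag > max_lag:
        raise ValueError("envelope is too short for the requested tempo range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            centered[i] * centered[i + lag] for i in range(len(centered) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return 60.0 * frames_per_second / best_lag


# Synthetic signals for illustration only (not taken from a real trailer).
sine = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
clicks = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]  # 100 frames/s

print(rms_energy(sine))
print(estimate_tempo_bpm(clicks, frames_per_second=100))
```

For real trailers, the envelope passed to `estimate_tempo_bpm` could come from `frame_energies`, with `frames_per_second` equal to the sample rate divided by the frame size.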

Links / Data / Other