Dataset Global Terrorism Database - Rostlab/DM_CS_WS_2016-17 GitHub Wiki

Dataset Global Terrorism Database

  • Proposer: @paulafortuna

  • Final Team:

    1. @paulafortuna
    2. @msandim
    3. @kapoorabhishek24
    4. @avradips
  • Votes: 1. @brosequartz, 2. @msandim, 3. @avradips, 4. @vivek-sethia, 5. @kapoorabhishek24, 6. @muhammadasad1, 7. @ashwarypande, 8. @ishaanraj

Summary

The Global Terrorism Database provides a set of instances and features that describe terrorist attacks worldwide from 1970 to 2015. The topic has drawn wide attention because of the terrorist attacks of recent years. Although it is a hard task, predicting the occurrence of terrorist attacks is the most compelling question this dataset can shed light on.

Prediction Goals

  • Describe terrorism and terrorist attacks (locations, time, type of attack, terrorist groups)
  • Predict terrorist attacks (location, time, type of attack)

Weekly Progress

  • Week 01 (W46-Nov16) Terrorism Database -- Main findings:

    • Terrorist attacks have increased in recent years, or at least the reporting of them has.
    • A large number of terrorist attacks occurred in the Middle East.
    • The groups responsible for the most terrorism events are the Taliban, Shining Path (SL), the Farabundo Marti National Liberation Front, and the Islamic State of Iraq and the Levant.
    • To handle some of the missing values, we are going to merge related attributes.
  • Week 02 (W47-Nov23) Terrorism Database -- Main findings:

    • The locations of terrorist attacks shift over time.
    • There is a large number of terrorist groups, and their activity changes over time.
    • There are around 40 features with a substantial number of missing values.
  • Week 03 (W48-Nov30) Terrorism Database -- Main findings:

    • The targets of terrorism are not the same in different parts of the globe.
    • Some instances of our dataset are marked with uncertainty about being a terrorist attack.
  • Weeks 04-05 (W49-50 Dec14) Terrorism Database -- Main findings:

    • Around half of the attacks have no identified perpetrator.
    • It is possible to cluster the terrorist groups based on target type, attack type, and weapons used.
    • In the coming weeks, we should start on predictive analysis tasks.
  • Week 06 (W51 Dec21) Terrorism Database -- Main findings:

    • We started our prediction phase and discussed a few of the topics that we want to consider.
  • Week 07 (W02 Jan11) Terrorism Database -- Main findings:

    • A preliminary experiment in predicting the number of victims achieved good results.
    • In the future, it will also be important to predict the terrorist group.
  • Week 08 (W03 Jan18) Terrorism Database -- Main findings:

    • For our first prediction task, we provide a model that predicts with good accuracy whether a terrorist attack will cause fatalities. However, predicting the precise number of victims is a more challenging task.
    • For our second prediction task, building a separate model for each region proved to be a good strategy. The resulting models reach an accuracy of around 0.80.
    • Relating terrorism to demographic data is a more difficult prediction task. We are still searching for a demographic dataset that fits our Terrorism Dataset.
  • Week 09 (W04 Jan25) Terrorism Database -- Main findings:

    • We were able to improve the prediction of the number of kills; however, we are still far from a good model.
    • The terrorist group prediction achieved good accuracy results, even when distinguishing between many possible terrorist groups in some regions.
    • We found a demographic dataset that can help us in the prediction tasks.
  • Week 10 (W05 Feb01) Terrorism Database -- Main findings:

    • We achieved good results in the terrorist group prediction task, and from our model we concluded that 50% of the attacks in the Middle East may have been conducted by ISIS and Al-Qaida.
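
Two of the preprocessing steps mentioned in the progress notes, merging related attributes to fill missing values (Week 01) and deriving a binary "has fatalities" target (Week 08), can be sketched roughly as follows. This is only an illustration under assumptions: the field names (`city`, `provstate`, `nkill`), the helper names, and the toy records are hypothetical, and the notes do not say which attributes the team actually merged.

```python
from typing import Optional


def coalesce(*values: Optional[str]) -> Optional[str]:
    """Return the first value that is not missing (None or empty string)."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


def has_fatalities(nkill: Optional[float]) -> Optional[bool]:
    """Binary target: True if at least one person was killed.

    Returns None when the count is missing, so such rows can be
    excluded from training instead of being silently labeled False.
    """
    if nkill is None:
        return None
    return nkill > 0


# Toy records with hypothetical field names; these are not real GTD rows.
records = [
    {"city": "", "provstate": "Province A", "nkill": 0},
    {"city": "City B", "provstate": "Province B", "nkill": 3},
    {"city": None, "provstate": None, "nkill": None},
]

for record in records:
    # Merge two related location attributes, preferring the more specific one.
    record["place"] = coalesce(record["city"], record["provstate"])
    record["fatal"] = has_fatalities(record["nkill"])
```

Keeping a missing count as `None` rather than `False` is a design choice of this sketch; it avoids treating unknown outcomes as attacks without fatalities.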

Long Description

1 - Dataset Description

  • Size: 73.9 MB
  • Attributes: 120
  • Rows: 156773
  • Format: xlsx
  • Incidents from 1970 to 2015
  • Instance Example:

    • Place: Germany, Hanover
    • Description: Assailants set fire to refugee housing near Hanover, Lower Saxony state, Germany. There were no reported casualties in the attack. No group claimed responsibility for the incident.
    • Latitude, Longitude: 52.375892, 9.73201

2 - Attributes categories

  • Incident Date
  • Region
  • Country
  • State/Province
  • City
  • Latitude and Longitude (beta)
  • Perpetrator Group Name
  • Tactic used in attack
  • Nature of the target (type and sub-type, up to three targets)
  • Identity, corporation, and nationality of the target (up to three nationalities)
  • Type of weapons used (type and sub-type, up to three weapons types)
  • Whether the incident was considered a success
  • If and how a claim(s) of responsibility was made
  • Amount of damage, and more narrowly, the amount of United States damage
  • Total number of fatalities (persons, United States nationals, terrorists)
  • Total number of injured (persons, United States nationals, terrorists)
  • Indication of whether the attack is international or domestic

Several Data Types: categories, text, coordinates, boolean, numeric, timestamps

The detailed description is presented in the Codebook of the GTD [2] (p. 14).
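
To make the mix of data types concrete, one incident could be modeled as a small record type. The sketch below is an assumption for illustration only: the class and field names are hypothetical and are not the column names used in the GTD codebook. It is populated with the instance example given above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    # Field names are illustrative, not the GTD codebook's column names.
    country: str                        # category
    city: Optional[str]                 # category
    latitude: Optional[float]           # coordinate
    longitude: Optional[float]          # coordinate
    summary: Optional[str]              # free text
    group_name: Optional[str] = None    # category; None when no group is identified
    success: Optional[bool] = None      # boolean
    fatalities: Optional[int] = None    # numeric

    def has_coordinates(self) -> bool:
        """True when both latitude and longitude are present."""
        return self.latitude is not None and self.longitude is not None


# The instance example from the dataset description.
example = Incident(
    country="Germany",
    city="Hanover",
    latitude=52.375892,
    longitude=9.73201,
    summary=(
        "Assailants set fire to refugee housing near Hanover, "
        "Lower Saxony state, Germany."
    ),
)
```

Most fields are optional because, as noted in the weekly findings, many attributes have a substantial number of missing values.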

Dataset collector

Global Terrorism Database (GTD) is maintained by the National Consortium for the Study of Terrorism and Responses to Terrorism (START) [1].

Data Quality

The main attributes (date, location, and summary) are complete in more than 90,000 instances.
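
A completeness figure of this kind could be computed along the following lines. This is a minimal sketch, not the team's actual procedure: the helper names, the field names (`date`, `location`, `summary`), and the toy rows are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def is_filled(value: Optional[str]) -> bool:
    """A value counts as filled when it is neither None nor an empty string."""
    return value is not None and value != ""


def completeness(rows: List[Row], fields: List[str]) -> Dict[str, float]:
    """Fraction of rows in which each field is filled."""
    total = len(rows)
    return {
        field: (sum(1 for row in rows if is_filled(row.get(field))) / total
                if total else 0.0)
        for field in fields
    }


def count_complete(rows: List[Row], fields: List[str]) -> int:
    """Number of rows in which every listed field is filled."""
    return sum(1 for row in rows if all(is_filled(row.get(f)) for f in fields))


# Toy rows with hypothetical field names; these are not real GTD rows.
rows = [
    {"date": "2015-01-01", "location": "A", "summary": "text"},
    {"date": "2015-01-02", "location": "", "summary": "text"},
    {"date": "2015-01-03", "location": "B", "summary": None},
    {"date": "2015-01-04", "location": "C", "summary": "text"},
]
main_fields = ["date", "location", "summary"]

per_field = completeness(rows, main_fields)
n_complete = count_complete(rows, main_fields)
```

Here `per_field` reports how often each attribute is present, while `n_complete` counts the rows in which all main attributes are filled at once, which is the quantity the statement above refers to.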

Links / Data / Other