Week 03 (W48 Nov30) Terrorism Database - Rostlab/DM_CS_WS_2016-17 GitHub Wiki
Summary & Index:
During this week the group continued to explore the dataset, focusing on an overview of the targets of terrorism. The main findings for this week are:
- The targets of terrorism are not the same in different parts of the globe.
- Some instances of our dataset are marked with uncertainty about being a terrorism attack.
Wiki Status - concluded
Index
- 1 - Targets of terrorism description
  - 1.1 - Research Questions
  - 1.2 - Who are the targets of terrorism?
  - 1.3 - How many victims die in terrorist attacks?
  - 1.4 - Geo-Referential Analysis
    - 1.4.1 Incidents per Target Type or Victims
    - 1.4.2 Major Terrorist Targets or Victims since 1970
  - 1.5 When there are hostages, are they released? Analysis of Hijacking or Kidnapping Events
    - 1.5.1 Number of kidnappings/hijackings per year
    - 1.5.2 Countries affected by kidnappings/hijackings
    - 1.5.3 Outcome of Events
    - 1.5.4 Groups involved
    - 1.5.5 Ransom Demanded vs Ransom Paid
    - 1.5.6 Number of persons held hostage, released and killed
- 2 - Exploratory analysis - continuation
  - 2.1 - Variable doubtterrorism
  - 2.2 - Define main variables
  - 2.3 - Variables that should be handled as one
- 3 - Action on Previous Feedback
- 4 - Weekly Presentation
- 5 - Perceived Feedback
Weekly work
1 - Targets of terrorism description
1.1 - Research Questions
In this week the group intended to explore the following research questions:
- Who are the targets of terrorism? Are there some specific categories of targets?
- How many victims die in terrorist attacks?
- When there are hostages, are they released?
1.2 - Who are the targets of terrorism?
The targets of terrorism present in this dataset are:
- "Private Citizens & Property"
- "Government (Diplomatic)"
- "Journalists & Media"
- "Police"
- "Utilities"
- "Military"
- "Government (General)"
- "Airports & Aircraft"
- "Business"
- "Educational Institution"
- "Violent Political Party"
- "Religious Figures/Institutions"
- "Unknown"
- "Transportation"
- "Tourists"
- "NGO"
- "Telecommunication"
- "Food or Water Supply"
- "Terrorists/Non-State Militia"
- "Maritime"
- "Abortion Related"
- "Other"
The values listed are the distinct entries of the variable targtype1_txt.
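As a minimal sketch of how such a list of distinct values could be obtained, the snippet below counts the labels in targtype1_txt using only the Python standard library. The inline sample rows and the helper name `count_target_types` are made up for illustration; they stand in for the real CSV export and are not the group's actual code.

```python
import csv
import io
from collections import Counter


def count_target_types(rows, column="targtype1_txt"):
    """Count incidents per distinct target type label, skipping empty cells."""
    counts = Counter()
    for row in rows:
        label = (row.get(column) or "").strip()
        if label:
            counts[label] += 1
    return counts


# Made-up sample standing in for the real dataset file.
SAMPLE = """eventid,targtype1_txt
1,Police
2,Business
3,Police
4,Unknown
"""

target_counts = count_target_types(csv.DictReader(io.StringIO(SAMPLE)))
for label, n in target_counts.most_common():
    print(f"{label}: {n}")
```

The keys of the resulting counter are the distinct target types; the counts give a first impression of how frequent each one is.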
1.3 - How many victims die in terrorist attacks?
Our findings show that the number of victims per year follows the same pattern as the number of incidents, as we can see in the figure below.
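The comparison between incidents and victims per year could be computed roughly as follows. This is a sketch, assuming each row carries iyear and nkill as in the variable list of this wiki; the helper name `yearly_totals` and the sample rows are hypothetical, and rows with a missing nkill are counted as incidents but contribute no fatalities.

```python
from collections import defaultdict


def yearly_totals(rows):
    """Return {year: {"incidents": count, "killed": total fatalities}}."""
    totals = defaultdict(lambda: {"incidents": 0, "killed": 0.0})
    for row in rows:
        year = int(row["iyear"])
        totals[year]["incidents"] += 1
        nkill = row.get("nkill")
        if nkill not in (None, ""):  # missing fatality counts are skipped
            totals[year]["killed"] += float(nkill)
    return dict(totals)


# Made-up rows for illustration only.
sample_rows = [
    {"iyear": "1970", "nkill": "1"},
    {"iyear": "1970", "nkill": "2"},
    {"iyear": "1971", "nkill": ""},
]

for year, t in sorted(yearly_totals(sample_rows).items()):
    ratio = t["killed"] / t["incidents"]
    print(year, t["incidents"], t["killed"], round(ratio, 2))
```

The last column, fatalities per incident, is one possible way to normalize the number of victims by the number of attacks.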
1.4 - Geo-Referential Analysis
With the Geo-referential Analysis, our goal is to consider the interaction of different attributes and to visualize it on a map as an animation/video.
This time we considered the type of targets or victims. As per the codebook, when a victim is attacked specifically because of his or her relationship to a particular person, such as a prominent figure, the target type reflects that motive. For example, if a family member of a government official is attacked because of his or her relationship to that individual, the type of target is "government."
1.4.1 Incidents per Target Type or Victims
In order to understand the particular Targets or Victims of terrorist attacks, we recorded a video. In this Youtube Link it is possible to see a clear analysis. In the figure below there is a frame of the video. Each red dot on the map corresponds to a terrorist attack.
As we can see from the video, the major targets were Business, Government, Journalists & Media, Military, Police, Private Citizens & Property, Religious Figures/Institutions, and Transportation. The victims/targets in the most affected countries are:
- Pakistan: Business, Educational Institutions, Government (General), Journalists & Media, Military, Non-State Militia, Telecommunication, Police, Utilities, Transportation, and Violent Political Parties.
- India: Business, Government (General), Police, Military, Private Citizens & Property, Transportation, and Violent Political Parties.
- Iraq: Business, Educational Institutions, Government, Military, Non-State Militia, Police, Private Citizens & Property, Utilities, Religious Figures/Institutions, Transportation, and Violent Political Parties.
- Abortion-Related incidents are mostly confined to the United States.
1.4.2 Major Terrorist Targets or Victims since 1970
In this Youtube Link it is possible to see a clear analysis. We have grouped the Targets/Victims into 5 groups:
1. "Government and Military" comprises Government (General), Government (Diplomatic), Violent Political Party, Military, and Police.
2. "Infrastructure and Transportation" comprises Airports & Aircraft, Transportation, Maritime, Food or Water Supply, Utilities, and Telecommunication.
3. "People and Public Institutions" comprises Educational Institution, NGO, Religious Figures/Institutions, Private Citizens & Property, Tourists, Business, and Journalists & Media.
4. Terrorists/Non-State Militia
5. Abortion-Related
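The grouping above can be written down as a simple lookup table. The sketch below is one possible encoding, not the code used to produce the video; the names `TARGET_GROUPS`, `GROUP_OF` and `group_of` are hypothetical. The labels "Unknown" and "Other" are not assigned to any of the five groups in the list above, so they fall back to a placeholder here (an assumption).

```python
# Group name -> target type labels, as listed in this section.
TARGET_GROUPS = {
    "Government and Military": [
        "Government (General)", "Government (Diplomatic)",
        "Violent Political Party", "Military", "Police",
    ],
    "Infrastructure and Transportation": [
        "Airports & Aircraft", "Transportation", "Maritime",
        "Food or Water Supply", "Utilities", "Telecommunication",
    ],
    "People and Public Institutions": [
        "Educational Institution", "NGO", "Religious Figures/Institutions",
        "Private Citizens & Property", "Tourists", "Business",
        "Journalists & Media",
    ],
    "Terrorists/Non-State Militia": ["Terrorists/Non-State Militia"],
    "Abortion-Related": ["Abortion Related"],
}

# Inverted lookup: target type label -> group name.
GROUP_OF = {
    target: group
    for group, targets in TARGET_GROUPS.items()
    for target in targets
}


def group_of(target_type, default="Ungrouped"):
    """Map a targtype1_txt label to its group; unlisted labels get `default`."""
    return GROUP_OF.get(target_type, default)


print(group_of("Police"))
print(group_of("Unknown"))
```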
In the figure below there is a frame of the video. Each dot on the map corresponds to a terrorist attack.
1.5 When there are hostages, are they released? Analysis of Hijacking or Kidnapping Events
We found that the Hijacking and Kidnapping events are interesting because it may help predict behavior in ongoing situations, especially when the perpetrators are open to negotiations.
1.5.1 Number of kidnappings/hijackings per year
In the figure below we can see the number of kidnapping or hijacking related events occurring every year.
1.5.2 Countries affected by kidnappings/hijackings
In the figure below we see the fraction of events occurring in each country. 19 countries account for 75% of kidnappings in the world.
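A statement such as "N countries account for 75% of the events" can be checked with a cumulative count over the countries sorted by frequency. The following is a sketch under assumptions: the helper name `countries_covering` and the sample counts are invented, and in practice the counter would be filled from country_txt for rows with ishostkid equal to 1.

```python
from collections import Counter


def countries_covering(counts, share=0.75):
    """Return the most affected countries that together reach `share` of all events."""
    total = sum(counts.values())
    running = 0
    selected = []
    for country, n in counts.most_common():
        if running >= share * total:
            break
        selected.append(country)
        running += n
    return selected


# Made-up counts for illustration only.
sample_counts = Counter({"A": 50, "B": 30, "C": 15, "D": 5})
top = countries_covering(sample_counts)
print(len(top), top)
```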
1.5.3 Outcome of Events
In the figure below we see the outcome of events. Prediction of the event outcome would be interesting, but we are missing data for ~6000 kidnapping events.
1.5.4 Groups involved
In the figure below we see the groups involved in kidnapping or hijacking events. This is a very scattered distribution.
1.5.5 Ransom Demanded vs Ransom Paid
In the figure below we plot the mean value of the ransom demanded each year alongside mean of the ransom paid. In some years we can see that there is more ransom paid than demanded, whereas after 2008 ransoms are not paid out very often. This may be due to the changing nature of hostage situations.
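The yearly means behind such a plot could be computed as sketched below. Several things here are assumptions rather than facts from this wiki: the column name `ransompaid` for the paid amount, the treatment of negative values as "unknown" codes, and the helper name `mean_by_year`. Only ransomamt and iyear appear in the variable list of this page.

```python
from collections import defaultdict
from statistics import mean


def mean_by_year(rows, column):
    """Mean of `column` per year, ignoring empty cells and negative (unknown) codes."""
    values = defaultdict(list)
    for row in rows:
        raw = row.get(column)
        if raw in (None, ""):
            continue
        amount = float(raw)
        if amount < 0:  # assumption: negative values encode "unknown"
            continue
        values[int(row["iyear"])].append(amount)
    return {year: mean(v) for year, v in sorted(values.items())}


# Made-up rows for illustration only.
sample_rows = [
    {"iyear": "2000", "ransomamt": "100", "ransompaid": "50"},
    {"iyear": "2000", "ransomamt": "200", "ransompaid": ""},
    {"iyear": "2001", "ransomamt": "-99", "ransompaid": "10"},
]

print(mean_by_year(sample_rows, "ransomamt"))
print(mean_by_year(sample_rows, "ransompaid"))
```

Because the two means are taken over different subsets of events (those with a known demand and those with a known payment), a year can show a higher mean paid than mean demanded.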
1.5.6 Number of persons held hostage, released and killed
Only ~250 events have all of this information, so this analysis does not seem useful to pursue further.
2 - Exploratory analysis - continuation
2.1 - Variable doubtterrorism
The doubt terrorism variable in our dataset indicates whether there is certainty that an instance is a terrorist attack. It is encoded as: no doubt that it is terrorism (0), doubt that it is terrorism (1), and data not present in that year (-9).
The analysis of this variable showed one instance with an empty value, which must be replaced with 0. In addition, many instances have the value -9; these should be kept because they belong to years without data for this variable. For the instances where there is doubt about terrorism (1), we checked the alternative-to-terrorism variable. The results are shown in the next figure.
In the majority of cases, the doubt arises from possible confusion with guerrilla actions. Conclusion: we prefer to remove the attacks with doubt terrorism equal to 1, because their categorization as terrorist attacks is uncertain and we could be mixing two phenomena (terrorism and guerrilla incidents).
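The cleaning rule described in this section (empty value becomes 0, -9 is kept, 1 is removed) could be sketched as follows. The column name `doubtterr` is an assumption about how the variable is spelled in the data file, and `clean_doubt` and the sample rows are hypothetical.

```python
def clean_doubt(rows, column="doubtterr"):
    """Apply the rule: empty -> 0, keep 0 and -9, drop rows coded 1."""
    kept = []
    for row in rows:
        value = str(row.get(column) or "").strip()
        code = int(float(value)) if value else 0  # empty value replaced with 0
        if code == 1:  # doubtful cases are removed
            continue
        kept.append({**row, column: code})
    return kept


# Made-up rows for illustration only.
sample_rows = [
    {"eventid": "1", "doubtterr": ""},
    {"eventid": "2", "doubtterr": "1"},
    {"eventid": "3", "doubtterr": "-9"},
    {"eventid": "4", "doubtterr": "0"},
]

for row in clean_doubt(sample_rows):
    print(row["eventid"], row["doubtterr"])
```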
2.2 - Define main variables
This week we also decided which variables are central to us, in order to examine them for outliers (e.g. those found in the correlations) and to get a better idea of the dataset. We concluded that the central variables are:
- iyear, imonth, iday
- duration: compute a new variable with the duration in days (from the variables extended and resolution)
- country, country_txt, region, latitude, longitude, specificity
- vicinity: change the name to suburbs
- multiple, success, suicide
- attacktype1: compare attacktype1 with attacktype2 and attacktype3; if there is no overlapping, replace by binary indicators (e.g. "assassination" 0 or 1)
- targtypes: handle in the same way as attacktype1
- gname, motive, nperps, nperpcap, claimed, compclaim
- weaptype1: note that there are also weaptype2, 3 and 4
- weapsubtype1, weapdetail
- nkill, nkillter, nwound, nwoundte
- property, propextent, propvalue, propcomment
- ishostkid, nhostkid, nhours (number of hours of the kidnapping), divert (country that kidnappers/hijackers diverted to), kidhijcountry
- ransom, ransomamt, ransomnote, nreleased
- INT_LOG (crossed frontier), INT_IDEO (same country perpetrator and victim), INT_MISC, INT_ANY
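The planned duration variable could be derived along these lines. This is a sketch under assumptions: the resolution date is taken to be stored as an ISO string (YYYY-MM-DD), which may not match the real file, and the helper name `duration_days` is hypothetical. Rows without a resolution date, or with an unknown month or day (coded 0), yield no duration.

```python
from datetime import date


def duration_days(row):
    """Days between the incident date and its resolution, or None if unknown."""
    resolution = (row.get("resolution") or "").strip()
    if not resolution:
        return None
    try:
        start = date(int(row["iyear"]), int(row["imonth"]), int(row["iday"]))
    except ValueError:  # e.g. month or day coded as 0 (unknown)
        return None
    # Assumption: resolution is formatted as YYYY-MM-DD.
    return (date.fromisoformat(resolution) - start).days


# Made-up row for illustration only.
print(duration_days(
    {"iyear": "2001", "imonth": "3", "iday": "1", "resolution": "2001-03-11"}
))
```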
We found some variables without explanation: ingroup, ingroup2, ingroup3, related
2.3 - Variables that should be handled as one
None. Conclusion: it is important to put the table in third normal form, or else we are losing information.
3 - Action on Previous Feedback
- Figure in Sec. 1.2: remove coloring from bars; improve the text related to this figure. Explain the axis. check the new version here
- They only go a bit too fast and so are difficult to follow. Perhaps more gradual transitions would ease the interpretation. Also, the legend of used colors is too tiny to notice and by themselves, the meaning of the colors is not obvious. Plus, what does the size mean then? That is, explain all your visual cues.
  Corrected the transition time for Number of Deaths, Regions, increased legend size and a few other changes. check the new version here
- Figure "Type of Attacks evolving since 1970", again the legend is too small to read.
  Increased legend size and transition time for better understanding. check the new version here
- Figure "Number of Kills vs Attack Type" why not just write the attack types labels on the x-axis? Very difficult to read, even more considering that the odd but unwritten numbers are actually valid values. check the new version here
4 - Weekly Presentation
5 - Perceived Feedback
- Combine variables (e.g. incident number for different target types). Are there more incidents of bomb attacks, for example?
- Analyze geographical characteristics in hostage situations. Are hostage rescues more common in specific countries? Or in specific towns, countryside, etc.?
- Clustering: similarities between terrorists (politically motivated, religiously motivated, IS style). We could do this manually or by applying a clustering algorithm.
- Normalize the number of kills with the number of attacks (number of kills over time), or consider the population in the places where the killings happened.
- Summarize the results and conclusions better (for example, we could see the US abortion pattern successfully, but the others are more unclear).
- Add ransom value (if it was paid) to the kidnapping charts.
Professor wrote
- I would not bother with enumerating the sections, # ## and ### without numbers may suffice
- It's actually more important to enumerate the figures & tables
- Fig. 1.3 -- can we normalize by a number of attacks?
- Now even epic music in the videos?? This is going to be an interesting semester
- "In this Youtube Link it is possible to see a clear analysis" this is very vague. Rather do not put it. What clear analysis? State / summarize what we see in the video.
  - I would write it like: "The most popular targets per country are the following (Fig. X)" ... and make the figure a clickable link to the youtube video, as in:
- Still, that list of countries - target is too much info for me to digest. So... what should I learn and take home from all these points? What's most important?
- Again, avoid these "In this Youtube Link, it is possible to see a clear analysis", and summarize way more your findings.
- "We found that the Hijacking and Kidnapping events are interesting because it may help predict behavior in ongoing situations, especially when the perpetrators are open to negotiations."? Too vague, and ofc, that's the whole point of hijacking/kidnapping
- Avoid the boring "In the figure below we can see the number of kidnapping or hijacking related events occurring every year." -- Just use a caption for the figure and state "Number of kidnapping ..." --> Simple
- ... similar feedback for the rest