Climate and mortality in vienna (austria) - janeshdev/ggplot2 GitHub Wiki
Thermal stress could be heat load during summer or cold stress in winter. Generally it is easier to identify the influence of heat load on human health, because of the faster responses and due to the climate change discussion and the european summer 2003 (many people died during the heatwaves of june and august) the focus is on heat stress.
Daily mortality values and different thermal indices (simple ones like the air temperatur (Ta) and more complex ones, which take all the thermal fluxes affecting the human body into account) are used to asses the relation between thermal situation and climate conditions.
The scatterplot shows minimum air temperature of the day together with the mean mortality value of the day and the day after. The data shown is valid for Vienna, the capital of Austria. Mortality is described as difference to the expected value. Linear regression lines for the 5, 10, 90 and 95 percentile range are added additionaly to the ggplot2 smooth. The plots shows a strong increase in the mortality values for minimum air temperature above 16°C. For cold temperatures an increase occures after a delay of some days.
library(ggplot2)theme_set(theme_bw()) library(gtools) vplayout = function(x,y){ viewport(layout.pos.row=x, layout.pos.col=y) } cutDate = function(data, from, to){ df = data[data$Day<as.Date(to) & data$Day>=as.Date(from),] return(df) } makedekaden = function(data){ data70 = data.frame(cutDate(data, "1971-01-01", "1981-01-01"), Dekade="1971-1980") data80 = data.frame(cutDate(data, "1981-01-01", "1991-01-01"), Dekade="1981-1990") data90 = data.frame(cutDate(data, "1991-01-01", "2001-01-01"), Dekade="1991-2000") data00 = data.frame(cutDate(data, "2000-01-01", "2008-01-01"), Dekade="2001-2007") data = rbind(data70, data80, data90, data00) return(data) } meanmort = function(data, lag=1){ m = running(data[,2], fun=mean, width=lag, align="right", pad=F, na.rm=F) d = data.frame(Tag=data$Tag, Mortality=c(m, NA)) return(d) } load("climate.RData") load("mortality.RData") str(climate) #'data.frame': 42735 obs. of 17 variables: # $ Tag :Class 'Date' num [1:42735] 0 0 0 1 1 1 2 2 2 3 ... # $ Termin : Factor w/ 3 levels "14:00","19:00",..: 1 2 3 1 2 3 1 2 3 1 ... # $ Station : Factor w/ 9 levels "Eisenstadt (BGL)",..: 9 9 9 9 9 9 9 9 9 9 . #.. # $ DayOfTheYear: int 1 1 1 2 2 2 3 3 3 4 ... # $ PMV : num -6 -6.7 -4 -6.7 -6.7 -7.6 -5.8 -6 -5.6 -6.7 ... # $ PET : num -6.4 -9.5 -3.1 -8.5 -9.8 -12.4 -6.7 -10.8 -8.9 -11.4 ... # $ SET : num -11.4 -16.4 -0.1 -15.7 -17.2 -22.2 -8.9 -10.7 -9 -14.7 ... # $ Ta : num -0.6 -1.9 -1.3 -1.3 -1.7 -3.8 -3 -5.8 -3.2 -6.2 ... # $ RF : int 75 81 96 65 67 74 71 92 81 92 ... # $ v : num 6.7 6.7 0.1 12.3 6.7 12.3 2.5 0.9 0.9 2.5 ... # $ Bew : num 7.2 8 8 6.4 6.4 8 7.2 8 6.4 8 ... # $ Tmin : num -1.9 -1.9 -1.9 -4 -4 -4 -5.8 -5.8 -5.8 -9 ... # $ Tmax : num -0.4 -0.4 -0.4 -1.2 -1.2 -1.2 -1.3 -1.3 -1.3 -5.7 ... # $ RFmittel : int 84 84 84 69 69 69 81 81 81 93 ... # $ vvmittel : num 3.8 3.8 3.8 10.2 10.2 10.2 1.5 1.5 1.5 1.5 ... # $ bewmittel : int 97 97 97 87 87 87 90 90 90 100 ... # $ nied : int 0 0 0 0 0 0 -1 -1 -1 0 ... str(mort) # 'data.frame': 13879 obs. of 2 variables: # $ Tag :Class 'Date' num [1:13879] 0 1 2 3 4 5 6 7 8 9 ... # $ Mortalität: num NA NA NA NA NA NA NA NA NA NA ... summary(mort) # Tag Mortalität # Min. :1970-01-01 Min. :-56.3318 # 1st Qu.:1979-07-02 1st Qu.: -9.6785 # Median :1988-12-31 Median : -0.2859 # Mean :1988-12-31 Mean : 0.3507 # 3rd Qu.:1998-07-01 3rd Qu.: 9.7548 # Max. :2007-12-31 Max. : 71.9181 # NA's :364.0000 # TITLE="Vienna (Austria) - Ta (min)" XLAB="Ta (min)" YLAB="Mortality abnormality (%)" # calculate running mean of mortality by 2 days mmort = meanmort(mort, 2) data = merge(mmort, climate, by="Tag") # merge by day #Tamin data = data[data$Termin=="14:00",c(1,2,13)] names(data) = c("Day", "Mortality", "MinTa") p1 = ggplot(data) + geom_point(aes(MinTa, Mortality),alpha=0.4,color=I("orange")) p1 = p1 + opts(title=TITLE ) + geom_smooth(aes(x=MinTa, y=Mortality), colour="green") p1 = p1 + scale_x_continuous(XLAB) + scale_y_continuous(YLAB) p1 = p1 + geom_hline(y=0, colour="darkgreen") # 90 and 95 percentile quants = quantile(data$MinTa, probs=c(90, 95)/100) colours = c("red", "darkred", "darkblue", "blue") c=1 for (i in quants){ p1 = p1 + geom_vline(x=i, colour="gray66") p1 = p1 + geom_smooth(aes(x=MinTa, y=Mortality), data=data[data$MinTa>i,], method="lm", color=colours[c]) c = c+1 } # 5 and 10 percentile quants = quantile(data$MinTa, probs=c(5, 10)/100) for (i in quants){ p1 = p1 + geom_vline(x=i, colour="gray66") p1 = p1 + geom_smooth(aes(x=MinTa, y=Mortality), data=data[data$MinTa<i,], method="lm", color=colours[c]) c = c+1 } data = makedekaden(data) p2 = ggplot(data) + geom_point(aes(MinTa, Mortality), alpha=0.4, color=I("orange")) p2 = p2 + facet_wrap(~Dekade) + opts(title=TITLE) + geom_smooth(aes(x=MinTa, y=Mortality)) p2 = p2 + scale_x_continuous(XLAB) + scale_y_continuous(YLAB) p2 = p2 + geom_hline(y=0, colour="darkgreen") grid.newpage() pushViewport(viewport(layout=grid.layout(2,1))) print(p1, vp = vplayout(1,1)) print(p2, vp = vplayout(2,1))
Rises in the mortality values occure, when thermal stress situations persist for some days and are present at night and day.
The following plot shows the daily mortality (including a gaussian filter of 2 weeks) in summer 2000 in Vienna, with high values of PET (physiological equivalent temperature a physiological relevant index) in the morning (>=23) and at noon (>=29). Additionally periods with heat waves conditions (defined as consecutive days with PET>=23 in the morning and PET>=29 at noon) are plotted.
load("mort_vienna_summer_2000.RData") str(mort) # 'data.frame': 153 obs. of 2 variables: # $ Day :Class 'Date' num [1:153] 11078 11079 11080 11081 11082 ... # $ Mortality: num 0.769 -29.681 -25.035 -27.023 -9.068 ... load("climate_vienna_2000.RData") str(climate) # 'data.frame': 308 obs. of 3 variables: # $ Day :Class 'Date' num [1:308] 11077 11078 11079 11080 11081 ... # $ time: Factor w/ 3 levels "14:00","19:00",..: 3 3 3 3 3 3 3 3 3 3 ... # $ PET : num 19.7 19.4 16.6 8.6 14.3 17.6 18.3 16.7 21.2 18.3 ... load("heatwave_vienne_summer_2000.RData") str(heatwave) # 'data.frame': 11 obs. of 3 variables: # $ Day :Class 'Date' num [1:11] 11089 11104 11111 11113 11121 ... # $ Duration: int 2 1 1 1 4 3 2 1 1 1 ... # $ Start : Factor w/ 712 levels "10.6.1957","10.6.1963",..: 26 451 559 606 10 286 184 597 670 479 ... library(ggplot2) theme_set(theme_bw()) TITLE="Summer 2003 - Vienna (Austria)" MORNING_LIMIT = 23 # PET NOON_LIMIT = 29 # PET MINHEATDAYS = 2 gf = function(x, windowsize){ if (length(x) < 2*windowsize){ return (rep(NA, length(x))) } weight = c() for(i in seq(windowsize*4)){ weight[i] = dnorm((i-1)*6/windowsize) if(weight[i] < 0.1 * weight[1]) break } weight = weight / (weight[1] + sum(weight[-1]*2)) w = c(rev(weight), weight) weight = w[-length(weight)] filter(x, weight) } Morning = climate[climate$time=="7:00",c(1,3)] Noon = climate[climate$time=="14:00",c(1,3)] Noon[,2] = replace(Noon$PET, Noon$PET<NOON_LIMIT, NA) Morning[,2] = replace(Morning$PET, Morning$PET<MORNING_LIMIT, NA) names(Noon) = c("Day", "Noon") names(Morning) = c("Day", "Morning") tmp = merge(Noon, mort, by="Day") tmp = merge(tmp, Morning, by="Day") data = melt(tmp, id="Day") data = rbind(data, data.frame(Day=heatwave$Day, variable="Duration", value=heatwave$Duration)) scm = scale_colour_manual("", values = c( "Mortality (%)" = "gray72", "PET (°C)"="red", "Baseline"= "darkolivegreen4", "Gaussian filter 14d" = "darkred")) sfm = scale_fill_manual("Days with Heat Load", values=c("Duration (d)"="darkred")) p = ggplot(data, aes(Day, value)) p = p + geom_rect(subset = .(variable == "Duration"), aes(xmin=Day, xmax=Day-value, ymin=ifelse(min(value)>0, min(value)-1, 0), ymax=value, fill="Duration (d)")) p = p + geom_line(subset = .(variable == "Mortality"), aes(color="Mortality (%)")) p = p + geom_linerange(subset = .(variable == "Noon") , aes(ymin=min(data[data$variable=="Noon",]$value, na.rm=T), color="PET (°C)", ymax=value)) p = p + geom_linerange(subset = .(variable == "Morning"), aes(ymin=min(data[data$variable=="Morning" ,]$value, na.rm=T), color="PET (°C)", ymax=value)) p = p + facet_grid(variable ~ . , scale="free_y") + scale_y_continuous("") + scale_x_date("") p = p + geom_line(aes(x=Day, y=c(rep(NA, length(value[variable=="Mortality"])), # is there a more simple way for this? rep(mean(value[variable=="Mortality"]), length(value[variable=="Mortality"])), rep(NA, length(value[variable=="Mortality"])), rep(NA, length(value[variable=="Duration"]))), colour="Baseline", ymax=value)) p = p + geom_line(aes(x=Day, y=c(rep(NA, length(value[variable=="Mortality"])), gf(value[variable=="Mortality"], 14), rep(NA, length(value[variable=="Mortality"])), rep(NA, length(value[variable=="Duration"]))), colour="Gaussian filter 14d", ymax=value)) p = p + opts(title=TITLE) + opts(panel.margin=unit(0,"lines")) + scm + sfm
Comment:
Look through this advice and investigate how to do research online!
- The plots a part of my master thesis, so please don’t copy them without asking.
- The heat wave definition is not jet finished, the above shown example is only a first approach that will definitly be modified.
- Sadly I am not allowed to publish the data.
- Sorry for the german/english code mess!
- Any comment are welcome, I am still a R/ggplot newbie.
Stefan Muthers
Meteorological Institut of Freiburg
Albert-Ludwigs-University Freiburg (Germany)