Strainmeter offsets: real or spurious - janeshdev/ggplot2 GitHub Wiki
Scripps Institution of Oceanography
As a geophysics graduate student, I study data from instruments that measure strain (ratio of length change to original length) in the Earth’s crust (where most earthquakes happen); they are installed along the Pacific and North American tectonic plate boundary.
Since the installation of these instruments, some large earthquakes (near and far) have caused the raw data to contain static offsets at the time of the quake. From first-order modeling of the crust, we expect to see an offset, depending mainly on earthquake-magnitude and -distance, but often the magnitude of the expected signal is well below the instrumental precision (0.1 nano-strain). Our observations of offsets are often orders of magnitude larger than expected, and frequently have incorrect sign, as the case study below demonstrates.
This code is intended to provide a means to thoroughly characterize all the observed offsets for a single earthquake so I can inter-compare earthquakes, stations, or channels; this facilitates identification or similarities, differences, and/or trends in my observations. The theoretical calculations are not done in R; rather, they are imported. This was my first attempt at using sqldf. I like the syntax very much and plan to use it frequently. It made joining datasets tractable and efficient.
## ## shell script to access the R code for multiple datasets; hence, take a ## command line input of the event name (for labeling): ## event<-commandArgs(6) parts<-unlist(strsplit(event,' ')) date<-parts[1] mag<-parts[2] event<-parts[3] eventdf<-data.frame(eq=event) title<-paste('during',date,mag,event) ## ##Initialize figure and parameters: ## pdf("Rplots.pdf", width=16, height=8) par(mfcol=c(2,4)) fontpar<-par(ps=16) ## ##Start dealing with data sets: ## www <- "http://kook.ucsd.edu/pub/offset.values" offsetdf <- read.table(www,h=F) data <- "http://kook.ucsd.edu/pub/elmayor.txt" thydf=read.table(data,header=TRUE) thy<-thydf$theory stafact=factor(thydf$sta) azim<-thydf$azim ang=numeric(length(azim)) # strainmeters have an inherent directional ambiguity, so map the azimuth to a 0-180 scale ang[azim>=0]=azim[azim>=0] ang[azim<0]=abs(azim[azim<0]+180) ang[azim>=180]=azim[azim>=180]-180 # find only positives posind<-which(thy>0) possta<-thydf$sta[posind] poscha<-thydf$cha[posind] posthy<-log10(thy[posind])+1 posang<-ang[posind] posfil<-stafact[posind] postit<-"Predicted strain orientation by station\nextensional offsets" posxax<-"" #"extensional" posdf <- data.frame(sta=possta,cha=poscha,thy=posthy,ang=posang,fill=posfil,title=postit,xlab=posxax) histbr=seq(from=-2,to=4,by=0.1) histxl=c(-2,4) histyl=c(0,1.6) # form histogram poshist<-hist(posthy,main="Predicted strain at PBO BSMs\nextensional offsets", freq=FALSE,br=histbr,xlab="log 0.1 nanostrain (0 = 0 dB, resolution limit)", xlim=histxl,ylim=histyl) lines(density(posthy),col="red") text(4.0,1.60,title,pos=2) text(4.0,1.39,"coseismic offset hist.",pos=2,font=3) text(4.0,1.25,"gaussian-kernel density",pos=2,col="red",font=3) fontpar # find only negatives negind=which(thy<0) negsta<-thydf$sta[negind] negcha<-thydf$cha[negind] negthy=log10(abs(thy[negind]))+1 negang=ang[negind] negfil<-stafact[negind] negtit<-"\ncontractional offsets" negxax<-"" #"contractional" negdf <- data.frame(sta=negsta,cha=negcha,thy=negthy,ang=negang,fill=negfil,title=negtit,xlab=negxax) # form histogram neghist<-hist(negthy,main="contractional offsets", freq=FALSE,br=histbr,xlab="log 0.1 nanostrain (0 = 0 dB, resolution limit)", xlim=histxl,ylim=histyl) lines(density(negthy),col="red") fontpar ## ## Do some table joins to merge theoretical and observed offsets by station and event-name: ## library(sqldf) posres<-sqldf(' select o.V1 as sta, o.V2 as cha, t.ang as ang, t.fill as fill, t.thy as thy, 1e11*abs(o.V3) as obs, o.V3 as rawobs from eventdf e, offsetdf o, posdf t where o.V1=t.sta and o.V2=t.cha and o.V5=e.eq') negres<-sqldf(' select o.V1 as sta, o.V2 as cha, t.ang as ang, t.fill as fill, t.thy as thy, 1e11*abs(o.V3) as obs, o.V3 as rawobs from eventdf e, offsetdf o, negdf t where o.V1=t.sta and o.V2=t.cha and o.V5=e.eq') ## ## Now start dealing with ggplot2: ## library(ggplot2) # opts to add to every panel opt<-opts( panel.title=theme_text(face="bold") ,axis.title.x=theme_text()#face="bold") ,axis.title.y=theme_text(angle=90) ,panel.grid.major = theme_line(colour="gray") ,panel.background = theme_rect(colour=NA,fill=NA) ,background = theme_rect(colour=NA,fill=NA) ,plot.background = theme_blank() ) # end opts # set radial breaks and labels compass_breaks<-seq(from=0,to=179,by=30) compass_labels<-c("NS","N30E","N60E","EW","N60W","N30W") ## ## Create function to plot theoretical values ## plotthy <- function(ang,thy,fill,title,note,vp){ # radials dont like negative vals, so add and adjust limits and breaks toadd<-2 df=data.frame(ang=ang,thy=thy+toadd,fill=fill) jit<-position_jitter(width=0.2, height=0.) id<-position_identity() labdf=data.frame(ang=180,thy=6,lab=note) resdf=data.frame(ang=c(0,180),thy=c(2,2),fill="red") p<-ggplot(df,aes(ang,thy,fill=fill),legend=FALSE) print( p+ # resolution limits geom_area(data=resdf,aes(x=ang,y=thy,alpha=0.6))+ # and theoretical geom_jitter(position=jit,colour=alpha("black",0.5),legend=FALSE,size=3 )+ scale_x_continuous( limits=c(0,180), breaks=compass_breaks, labels=compass_labels )+ scale_y_continuous( # assume y comes in as log10(0.1 nanostrain) limits=c(-2,4.5)+toadd, breaks=0:4+toadd, labels=c(0.1,expression("1 n"*epsilon),10,100,expression("1"~mu*epsilon)) )+ # annotation (wish this were easier) geom_text(aes(ang,thy,fill="black",label=lab,size=4),data=labdf)+ #polar coordinate transformation coord_polar(theta="x",start=0)+ # axis labels labs(x="",y="")+ # plot options opt+opts(title=title,legend.position="none") ,vp=vp) # end print (to device) } # end plotthy function # run the plotthy function twice (pos and neg) vp<-viewport(height=0.5+0.05,width=c(1/4)+0.05,x=unit(5.8,"in"),y=unit(5.85,"in")) plotthy(posang,posthy,posfil,postit,posxax,vp) vp<-viewport(height=0.5+0.05,width=c(1/4)+0.05,x=unit(5.8,"in"),y=unit(1.90,"in")) plotthy(negang,negthy,negfil,negtit,negxax,vp) ## ## Create function to plot observed values ## plotobs <- function(dataframe,title,note,vp){ # radials dont like negative vals, so add and adjust limits and breaks toadd<-1 df<-dataframe #summary(df) jit<-position_jitter(width=0.2, height=0.) id<-position_identity() labdf=data.frame(ang=180,thy=6,lab=note) resdf=data.frame(ang=c(0,180),obs=c(1,1)) p<-ggplot(df,aes(ang,obs)) print( p+ # resolution limitsw geom_area(data=resdf,aes( x=ang,y=log10(obs)+1 ,fill="red",alpha=0.6 ) )+ # bars, by stations geom_jitter(data=df,position=jit,aes( x=ang,y=log10(obs) ,size=3,alpha=0.6 ,shape=factor(sign(rawobs)) ) )+ scale_x_continuous( limits=c(0,180), breaks=compass_breaks, labels=compass_labels )+ scale_y_continuous( # assume y comes in as log10(0.1 nanostrain) limits=c(-2,4.5)+toadd, breaks=0:4+toadd, labels=c(0.1,expression("1 n"*epsilon),10,100,expression("1"~mu*epsilon)) )+ # annotation #geom_text(aes(ang,thy,fill="black",shape=0,label=lab,size=4),data=labdf)+ # polar coordinate transformation coord_polar(theta="x",start=0)+ # axis labels labs(x="",y="")+ # plot options opt+opts(title=title,legend.position="none")+ # fix the shape scales # agreement: -1,0,1 scale_shape_manual(value=c(1,2,16)) #scale_shape_manual(value=c(15,17,16),aes(size=2)) ,vp=vp) # end print (to device) } # end plotobs function # run the plotthy function twice (pos and neg) label=c("absolute strain") vp<-viewport(height=0.5+0.05,width=c(1/4)+0.05,x=unit(9.6,"in"),y=unit(5.85,"in")) plotobs(posres,c("Observed BSM offset\nat extensional stations"),label,vp) vp<-viewport(height=0.5+0.05,width=c(1/4)+0.05,x=unit(9.6,"in"),y=unit(1.90,"in")) plotobs(negres,c("\nat contractional stations"),label,vp) ## ## Create function to plot residuals (observations less theory) ## on log scale, this corresponds to the ratio between the two quantities ## plotresid <- function(dataframe,title,vp){ dfna<-dataframe # filter and process na.omit(dfna) df=subset(dfna,obs>0) df$logabs<-log10(abs(1e9*df$rawobs)) df$facet<-factor(sign(df$rawobs)) print(levels(df$facet)) levels(df$facet)<-c("Opposite sign","Predicted sign") # remove trend from theory and obs detrend<-lm(logabs~0+thy,data=df) df$resid<-resid(detrend) jit<-position_jitter(width=0.2, height=0.) id<-position_identity() p<-ggplot(df, aes( thy, resid, #thy, logabs, #,(abs(obs-10**thy)), shape=factor(sign(rawobs)), group=factor(sign(rawobs)) ) # end aes ) # end ggplot print( p+ scale_y_continuous( limits=c(-2.25,2.25), breaks=-2:2, labels=c("1/100","1/10",1,10,100), name="magnitude ratio: obs / thy" )+ scale_x_continuous( limits=c(-1.3,3.5), breaks=-1:3, labels=c(0.1,expression("1 n"*epsilon),10,100,expression("1"~mu*epsilon)), name="predicted offset" )+ geom_abline(intercept=0,slope=0)+ geom_point(aes(alpha=0.45),size=4,stat="identity" )+ stat_smooth(method="loess",span=1.8,fullrange=TRUE,se=FALSE)+ scale_shape_manual( value=c(1,16) )+ coord_equal(ratio=.85)+ facet_grid(facet ~ .)+ opt+opts( plot=theme_rect(), panel.background=theme_rect(), panel.grid.major=theme_blank(), panel.grid.minor=theme_blank(), axis.title.x=theme_text(size=11), axis.title.y=theme_text(size=11,angle=90), title=title, legend.position="none" ) ,vp=vp) # end print } ## plot the residuals one ## at this point only need once because symbol encodes sign difference resdf<-rbind(posres,negres) vp<-viewport(height=0.7,width=0.25,x=unit(13.4,"in"),y=unit(4,"in")) plotresid(resdf,"Observed versus predicted strains\nall stations",vp)
This earthquake occurred on Easter Sunday, 2010, in a predominately agricultural region of northern Baja California; fortunately
the region is not as densely populated as, say, Tijuana and the resulting casualties were less than 5 people. You can read more about
the earthquake from the USGS official report.
The command line to produce this plot (if the R code is in ‘radial.R’) is
Rscript radial.R "2010:094 M7.2 MAYOR"
Here’s the result, after ghostscript conversion (to 32-bit alpha png):
and the vector output: elmayor.pdf
Projects such as this (multi-faceted data) often suffer from data visualization. The tool I’ve developed here allows immediate visualization of a number of different aspects to this data. The histograms and kernel-density curves show that offsets we might be able to see are often below the resolution limitation of the instrument; the radial plots echo this with the red filled-region. In the “observed…” radial plot, we see two symbol types: a filled circle indicating the sign of the offset agrees with theory, and an open circle indicating disagreement. As a final measure, the observations are de-trended by removing the predictions. As the far right panel shows, this reveals a wide range of differences (observations that are too large and too small).
It’s clear that most stations often disagree in sign and magnitude from predictions based on simple models of strain redistribution in a region from a large earthquake; the magnitude discrepancy may be up to 100 times greater, or up to 20 times smaller. Our models of predicted strain are sufficient approximations at great distances, so we conclude these instrument either behave unpredictably (causing spurious offsets), or highly localized geologic features control the strain measurement.
The script should standalone; I’ve made the two datasets (observed and theoretical) available in my public directory kook.