Means over Time - RhoInc/sas-sgplot GitHub Wiki
- The Goal
- Dummy Data
- Means Only
- Make Nice Axes
- Add Error Bars
- Make Nice Legend
- Add Sample Sizes
- Style Modifications
The Means over Time plot will be developed in several steps. A succinct version of the code is available here.
On this page, we walk through the process of creating a means over time graph, complete with error bars and sample sizes. The end goal looks like this:
The dummy data used to produce this plot was created with the following data step. This data step is only included for the sake of completeness. Do not read, study, or obsess over this data step!
data derive.adlb;
   length paramcd $8 param $100;
   do paramcd = "ALT", "AST";
      if paramcd = "ALT" then
         param = "Alanine Aminotransferase (IU/L)";
      else if paramcd = "AST" then
         param = "Aspartate Transaminase (IU/L)";
      do trtpn = 1 to 4;
         trtp = 'Treatment ' || put(trtpn,1.);
         do subjid = 1 to 40;
            do avisitn = 0, 1, 2, 4, 6, 8 to 24 by 4;
               length avisit $20;
               if avisitn = 0 then avisit = "B/L";
               else avisit = 'Week ' || strip(put(avisitn,best.));
               aval = 45 + rannor(1)*(1+avisitn/24)*(trtpn/4);
               if ranuni(1) < (50-avisitn)/50 then output;
            end;
         end;
      end;
   end;
run;
In this step, we learn how to produce a "means only" plot.
In order to plot means over time, we must first calculate means over time. Fortunately, there's a PROC for that.
proc sort data=derive.adlb out=mot00;
by trtpn trtp avisitn;
where paramcd = "ALT";
run;
proc means data=mot00 noprint;
var aval;
by trtpn trtp avisitn;
output out=mot10 n=n mean=mean stderr=stderr;
run;
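As an aside, the sort is only needed because of the by statement. The following is a minimal sketch of an alternative that lists the grouping variables in a class statement instead; it is not the approach used on this page. The output name mot10_alt is hypothetical, chosen so that it does not overwrite mot10. The nway option keeps only the rows for the full combination of trtpn, trtp, and avisitn.

```sas
*--- sketch: summarize without a prior PROC SORT (mot10_alt is a hypothetical name) ---;
proc means data=derive.adlb nway noprint;
   where paramcd = "ALT";
   class trtpn trtp avisitn;
   var aval;
   output out=mot10_alt n=n mean=mean stderr=stderr;
run;
```

The later steps on this page still read param from mot00, so the sorted subset is needed either way if you follow along as written.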
If we are going to produce a figure for a commercial study, it will probably need to be saved as an RTF file. The typical options, ods graphics, and ods rtf statements that surround the sgplot code are as follows:
options
nonumber
nodate
orientation=portrait
;
ods graphics /
noborder
height=6in
width=6in
outputfmt=png
;
ods results off;
ods listing close;
ods rtf file="&PgmDir\means_only.rtf";
<<sgplot portion>>
ods rtf close;
ods listing;
ods results on;
The <<sgplot portion>> contains a series statement along with a couple of basic options.
proc sgplot data=mot10;
*--- draw the markers and lines ---;
series y=mean x=avisitn /
group=trtpn
markers
markerattrs=(size=10px)
;
run;
In this step, we make the axes look nice.
First we create a label for the y-axis based on the variable param.
proc sql noprint;
select distinct strip(param) || " +/- SE"
into :ylabel
from mot00
;
quit;
%let ylabel = &ylabel;
%put &=ylabel;
The result in the log shows us:
YLABEL=Alanine Aminotransferase (IU/L) +/- SE
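The %let ylabel = &ylabel; line is there because into : keeps the trailing blanks of the character expression; reassigning the macro variable to itself strips leading and trailing blanks. In SAS 9.3 and later, the trimmed keyword on the into clause should achieve the same thing in one step. A sketch, assuming a release that supports it:

```sas
*--- sketch: let PROC SQL trim the value, so no %let is needed afterwards ---;
proc sql noprint;
   select distinct strip(param) || " +/- SE"
      into :ylabel trimmed
      from mot00
   ;
quit;
%put &=ylabel;
```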
Next we create a custom set of tick marks for the x-axis using the variable avisitn.
proc sql noprint;
select distinct avisitn
into :values separated by " "
from mot10
;
quit;
%put &=values;
The result in the log shows us:
VALUES=0 1 2 4 6 8 12 16 20 24
We also want to format the value 0 as B/L.
proc format;
value avisitn
0 = "B/L"
other = [best.]
;
run;
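Before using the format in a plot, it can be worth checking it in the log. This is a small sketch; the variable names wk and shown and the three week values are arbitrary choices for illustration.

```sas
*--- sketch: write a few values through the avisitn. format to the log ---;
data _null_;
   length shown $12;
   do wk = 0, 1, 12;
      shown = put(wk, avisitn.);   * apply the user-defined format;
      put wk= shown=;
   end;
run;
```

We would expect week 0 to come back as B/L and the other values to be passed through to best. unchanged.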
Finally, we incorporate the above format and macro variables into the plot.
proc sgplot data=mot10;
*--- draw the markers and lines ---;
series y=mean x=avisitn /
group=trtpn
markers markerattrs=(size=10px)
;
*--- cosmetics ---;
yaxis
label="&ylabel"
;
xaxis
values=(&values)
valuesformat=avisitn.
label="Weeks"
;
run;
Next we add error bars to our plot.
We start by calculating the endpoints of the error bars.
data mot20;
set mot10;
upper = mean + stderr;
lower = mean - stderr;
run;
Then we use the scatter statement to add the error bars to the plot. The options yerrorlower and yerrorupper are what cause the bars to be drawn.
proc sgplot data=mot20;
*--- draw the markers and lines ---;
series y=mean x=avisitn /
group=trtpn
markers markerattrs=(size=10px)
;
*--- draw the error bars ---;
scatter y=mean x=avisitn /
group=trtpn
yerrorupper=upper
yerrorlower=lower
markerattrs=(size=0px)
;
*--- cosmetics ---;
yaxis …
xaxis …
run;
Notice how the error bars at each time point lie on top of one another. This can be remedied by adding the groupdisplay=cluster option to the series and scatter statements.
proc sgplot data=mot20;
*--- draw the markers and lines ---;
series y=mean x=avisitn /
group=trtpn
groupdisplay=cluster
markers markerattrs=(size=10px)
;
*--- draw the error bars ---;
scatter y=mean x=avisitn /
group=trtpn
groupdisplay=cluster
yerrorupper=upper
yerrorlower=lower
markerattrs=(size=0px)
;
*--- cosmetics ---;
yaxis …
xaxis …
run;
Next we clean up the legend.
First we identify all trtpn and trtp combinations, build a format from them, and apply that format to the trtpn variable.
proc sql;
create table trtp as
select distinct "trtp" as fmtname,
trtpn as start,
trtp as label
from mot20
;
quit;
proc format cntlin=trtp;
run;
proc sql noprint;
alter table mot20
modify trtpn format=trtp.
;
quit;
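The alter table step is just one way to attach the format. As a sketch of an alternative (not what this page uses), proc datasets can make the same change to the data set's metadata; this assumes mot20 lives in the work library, as it does when created by the code above.

```sas
*--- sketch: attach the trtp. format to trtpn with PROC DATASETS ---;
proc datasets library=work nolist;
   modify mot20;
   format trtpn trtp.;
quit;
```

Either way, only the descriptor of mot20 changes; the data values of trtpn stay numeric.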
Next we add a keylegend statement. The purpose of options such as title= and noborder is intuitively obvious. The purpose of the text "series" is less obvious: it tells SGPLOT which plot statement the legend should be based on. You'll notice that the series statement specifies name="series", which is what links it to the keylegend statement.
proc sgplot data=mot20;
*--- draw the markers and lines ---;
series y=mean x=avisitn /
group=trtpn
groupdisplay=cluster
markers
markerattrs=(size=10px)
name="series"
;
*--- draw the error bars ---;
scatter …
*--- cosmetics ---;
yaxis …
xaxis …
keylegend "series" /
title="Treatment Group"
noborder
linelength=15pct
outerpad=(bottom=10pt)
;
run;
The next-to-last step is to add the sample sizes at each time point.
We use the variable n to create a new variable xatn, complete with a format that prevents missing values from showing up as a dot.
proc format;
value xatn
. = " "
other = [best.]
;
run;
data mot30;
set mot20;
if avisitn in (&values) then
xatn = n;
format xatn xatn.;
run;
Next we add the xaxistable statement. For some reason SAS has us use the class= option to create groups in the x-axis table, whereas we use the group= option in the series and scatter statements.
proc sgplot data=mot30;
*--- draw the markers and lines ---;
series …
*--- draw the error bars ---;
scatter …
*--- add the sample sizes ---;
xaxistable xatn /
class=trtpn
colorgroup=trtpn
;
*--- cosmetics ---;
yaxis …
xaxis …
keylegend …
run;
As a final step, we perform some style modifications to change colors, symbols, and fonts.
The simplest way to change colors and symbols for grouped data is with the %modstyle macro. Here we use the rtf style as the parent of a new style named motstyle0.
%modstyle
(name=motstyle0
,parent=rtf
,type=CLM
,colors=black black black black
,markers=circle square diamond star
);
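In SAS 9.4, a similar effect can often be had without defining a new style, by using the styleattrs statement inside sgplot. The following is a sketch of that alternative, not the method this page uses. It sets attrpriority=none on the assumption that colors and symbols should advance together from one group to the next; under color priority, the four identical black entries would be cycled through before the symbol ever changed.

```sas
*--- sketch: per-plot colors and symbols via STYLEATTRS (SAS 9.4) ---;
ods graphics / attrpriority=none;

proc sgplot data=mot30;
   styleattrs
      datacontrastcolors=(black black black black)
      datasymbols=(circle square diamond star)
   ;
   series y=mean x=avisitn /
      group=trtpn
      markers
   ;
run;
```

Line patterns are not set here, so they follow whatever the active style provides. The %modstyle approach has the advantage that the result is a reusable style that can be named in the ods rtf statement.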
Changing fonts looks a little more awkward. Here we use the motstyle0 from above to create a new style named motstyle. Try not to worry yourself too much about understanding this PROC TEMPLATE code. As long as you understand that Courier New is the font we want in our plot then you've understood enough.
proc template;
define style styles.motstyle;
parent=styles.motstyle0;
class GraphFonts /
"GraphDataFont" = ("Courier New, <MTserif>, <serif>", 7pt)
"GraphValueFont" = ("Courier New, <MTserif>, <serif>", 9pt)
"GraphLabelFont" = ("Courier New, <MTserif>, <serif>",10pt)
;
end;
run;
Finally, we apply this new style by naming it in the ods rtf statement.
ods rtf
style=styles.motstyle
file="&PgmDir\style_mods.rtf"
;
proc sgplot data=mot30;
series …
scatter …
xaxistable …
yaxis …
xaxis …
keylegend …
run;
ods rtf close;
And that's all there is to it.