A Univariate Descriptive Stats - JulTob/R GitHub Wiki
📈 Chart of Univariate Descriptive Stats
“A single variable can whisper secrets if ye listen well enough.”
This chart shows how to summarize and explore one variable in R. A single variable says little on its own, but it tells ye plenty when interpreted well. We’ll cover frequencies, central tendency, spread, and how to read the waters.
📊 One Variable, One Dimension: X
We’ll assume a numeric vector, like:
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)
⸻
📦 Frequency
Let’s define the building blocks.
⚓ Total Frequency (n)
The number of data points:
n <- length(x)
n
[1] 25
🔢 Absolute Frequency ni
How many times each distinct value (type i) appears:
table(x)
Example output:
4 5 7 8 9 10 11 12 13 16
1 2 2 5 3 4 3 3 1 1
This tells ye: 8 appears 5 times, 10 appears 4 times, etc.
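As a small sketch (the names `freq` and `top` are only illustrative, not part of the original chart), ye can index the table by value or sort it to bring the most frequent values to the front:

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

freq <- table(x)

# Look up one count by its value; table names are character strings
freq["8"]

# Sort from most to least frequent
sort(freq, decreasing = TRUE)

# Most frequent value (which.max returns the first one if there are ties)
top <- names(which.max(freq))
top
```

Note that `top` comes back as a character string, so wrap it in `as.numeric()` before doing arithmetic with it.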
📉 Relative Frequency fi
The proportion of the total that belongs to type i:
table(x) / length(x)
This gives each value’s relative frequency, fi = ni / n, such that the fi add up to 1.
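A quick check, offered as a sketch rather than a required step: dividing the counts by n and calling `prop.table()` should give the same proportions, and those proportions should add up to 1.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

# Relative frequencies by explicit division
fi <- table(x) / length(x)

# prop.table() performs the same division
all.equal(as.vector(fi), as.vector(prop.table(table(x))))

# Relative frequencies sum to 1 (up to floating point error)
sum(fi)
```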
⛓️ Cumulative Frequency Ni
Cumulative counts in ordered data:
cumsum(table(x))
Shows how many data points are less than or equal to each value.
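To see what the cumulative counts mean, this sketch reads one entry and compares it with a direct count on the data (the name `Ni` simply mirrors the heading):

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

Ni <- cumsum(table(x))

# Observations less than or equal to 9, read from the cumulative table
Ni["9"]

# The same count taken straight from the data
sum(x <= 9)

# The last cumulative count always equals n
Ni[length(Ni)]
```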
⚖️ Relative Cumulative Frequency Fi
cumsum(table(x) / length(x))
Shows the running total proportion—useful for percentiles and decision thresholds.
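One way to use this for a threshold, sketched under the assumption that ye want the smallest value at which at least half the data has been covered. The names `Fi`, `p50` and `F_hat` are illustrative; `ecdf()` is base R's empirical cumulative distribution function.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

Fi <- cumsum(table(x) / length(x))

# Smallest value whose cumulative proportion reaches 50%
p50 <- as.numeric(names(Fi)[which(Fi >= 0.5)[1]])
p50

# ecdf() builds the same step function directly from the data
F_hat <- ecdf(x)
F_hat(9)   # proportion of observations <= 9
```

For this vector the threshold lands on the same value as `median(x)`, which is what ye would expect from a 50% cut.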
⸻
📄 Frequency Table
| Type i | Absolute frequency ni | Relative frequency fi | Cumulative Ni | Relative cumulative Fi |
|---|---|---|---|---|
| 4 | 1 | 0.04 | 1 | 0.04 |
| 5 | 2 | 0.08 | 3 | 0.12 |
| … | … | … | … | … |
Ye can build this table like so:
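freq_abs <- table(x)               # absolute frequencies ni
freq_rel <- prop.table(freq_abs)   # relative frequencies fi
freq_acc <- cumsum(freq_abs)       # cumulative frequencies Ni
freq_rel_acc <- cumsum(freq_rel)   # relative cumulative frequencies Fi
# as.vector() drops the table class; a table passed to data.frame()
# would otherwise expand into two columns (value and Freq)
tabla <- data.frame(
  Modo = names(freq_abs),
  ni = as.vector(freq_abs),
  fi = round(as.vector(freq_rel), 2),
  Ni = as.vector(freq_acc),
  Fi = round(as.vector(freq_rel_acc), 2)
)
print(tabla, row.names = FALSE)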
⸻
📍 Central Tendency
These tell ye where the center of the distribution lies:
mean(x) # Mean (average)
median(x) # Median (middle value)
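The mean can also be recovered from the frequency table, which ties this section back to the relative frequencies fi. A minimal sketch, with `values` and `mean_from_table` as illustrative names:

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

freq <- table(x)
values <- as.numeric(names(freq))      # the distinct values
fi <- as.vector(freq) / length(x)      # their relative frequencies

# Mean as a frequency-weighted sum of the distinct values
mean_from_table <- sum(values * fi)

all.equal(mean_from_table, mean(x))
```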
⸻
🌊 Spread and Shape
These tell ye how the values vary:
var(x) # Sample variance (n - 1 denominator)
sd(x) # Standard deviation (square root of the variance)
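To make the n - 1 denominator visible, here is the same calculation done by hand. It is a sketch for checking understanding, not a replacement for `var()`; the names `var_manual`, `var_pop` and `sd_manual` are illustrative.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

n <- length(x)
dev <- x - mean(x)                     # deviations from the mean

var_manual <- sum(dev^2) / (n - 1)     # sample variance, what var() returns
var_pop    <- sum(dev^2) / n           # population version, for comparison
sd_manual  <- sqrt(var_manual)

all.equal(var_manual, var(x))
all.equal(sd_manual, sd(x))
```

The population version is always a little smaller than the sample variance, and the gap shrinks as n grows.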
Get min, max, quartiles in one command:
summary(x)
Example output:
Min. 1st Qu. Median Mean 3rd Qu. Max.
4.00 8.00 9.00 9.32 11.00 16.00
Or get the quartiles directly (pass probs for other percentiles):
quantile(x)
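A short sketch of going beyond the default quartiles. The 10% and 90% cut points are arbitrary choices for illustration, and R's default interpolation method (type 7) is assumed.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

# 10th and 90th percentiles
quantile(x, probs = c(0.10, 0.90))

# Interquartile range: third quartile minus first quartile
IQR(x)
```

Other `type` values in `quantile()` interpolate differently, so small samples can give slightly different percentiles depending on the method.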
⸻
📊 Plot It!
Every good summary needs a visual:
hist(x, col = "skyblue", main = "Distribution of X", xlab = "Values")
boxplot(x, col = "tomato", main = "Boxplot of X")
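To read the numbers behind the boxplot without drawing it, `boxplot.stats()` returns the values the plot uses. A sketch, assuming the default whisker rule of 1.5 times the box length; the name `bs` is illustrative.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

bs <- boxplot.stats(x)

# Lower whisker, lower hinge, median, upper hinge, upper whisker
bs$stats

# Points drawn individually beyond the whiskers
bs$out
```

For this vector the only point beyond the whiskers is 16, so that is the lone dot ye will see above the box.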
⸻
🧭 Summary of Metrics
| Concept | R function |
|---|---|
| Total size | `length(x)` |
| Absolute frequency | `table(x)` |
| Relative frequency | `prop.table(table(x))` |
| Mean | `mean(x)` |
| Median | `median(x)` |
| Variance | `var(x)` |
| Standard deviation | `sd(x)` |
| Quartiles | `quantile(x)` |
| General summary | `summary(x)` |
Cheatsheet
> x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)
> n <- length(x)     # 25
> E <- mean(x)       # 9.32
> Mdn <- median(x)   # 9
> quantile(x)
  0%  25%  50%  75% 100%
   4    8    9   11   16
> var_x <- var(x)    # 7.31
> sd_x <- sd(x)      # 2.703701, same as sqrt(var_x)
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4.00    8.00    9.00    9.32   11.00   16.00