A Univariate Descriptive Stats - JulTob/R GitHub Wiki

📈 Chart of Univariate Descriptive Stats

“A single variable can whisper secrets if ye listen well enough.”

This chart shows how to summarize and explore a single variable in R. One variable on its own tells a limited story, but it is powerful when interpreted well. We’ll cover frequencies, central tendency, spread, and how to read the waters.

📊 One Variable, One Dimension: X

We’ll assume a numeric vector, like:

x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

⸻

📦 Frequency

Let’s define the building blocks.

⚓ Total Frequency (n)

The number of data points:

n <- length(x)
n
[1] 25

🔢 Absolute Frequency `ni`

How many times each distinct value (type i) appears:

table(x)

Example output:

x
 4  5  7  8  9 10 11 12 13 16 
 1  2  2  5  3  4  3  3  1  1 

This tells ye: 8 appears 5 times, 10 appears 4 times, etc.

📉 Relative Frequency `fi`

The proportion of Type i from the total:

table(x) / length(x)

This gives each value’s relative frequency, fi = ni / n. Together, the fi add up to 1.
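As a quick sanity check, here is a minimal sketch showing that the relative frequencies are the counts divided by n, that they add up to 1, and that `prop.table()` gives the same result as the manual division. The variable names `ni` and `fi` are illustrative only.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

ni <- table(x)            # absolute frequencies
fi <- ni / length(x)      # relative frequencies, fi = ni / n

sum(fi)                   # the proportions add up to 1
all.equal(as.vector(fi), as.vector(prop.table(ni)))   # same as prop.table()
fi["8"]                   # 5 of the 25 observations are 8, so 0.2
```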

⛓️ Cumulative Frequency `Ni`

Cumulative counts in ordered data:

cumsum(table(x))

Shows how many data points are less than or equal to each value.

⚖️ Relative Cumulative Frequency `Fi`

cumsum(table(x) / length(x))

Shows the running proportion of data points at or below each value, which is useful for percentiles and decision thresholds.
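To make the percentile reading concrete, here is a small sketch (the names `Fi`, `vals`, and `Fx` are illustrative). For this odd-sized sample, the first value whose cumulative proportion reaches 0.5 is the median. Base R’s `ecdf()` gives the same cumulative proportions as a function ye can call.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

Fi   <- cumsum(table(x) / length(x))   # relative cumulative frequencies
vals <- as.numeric(names(Fi))          # the distinct values, in ascending order

# Smallest value whose cumulative proportion reaches 50%
p50 <- vals[which(Fi >= 0.5)[1]]
p50                                    # 9, which matches median(x) here

# ecdf() returns a function giving the proportion of data <= a given value
Fx <- ecdf(x)
Fx(8)                                  # 0.4, since 10 of 25 values are <= 8
```

With an even number of observations the median may fall between two values, so treat this lookup as a reading aid rather than a replacement for `median()`.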

⸻

📄 Frequency Table

| Value i | Absolute frequency ni | Relative frequency fi | Cumulative Ni | Cumulative relative Fi |
|---|---|---|---|---|
| 4 | 1 | 0.04 | 1 | 0.04 |
| 5 | 2 | 0.08 | 3 | 0.12 |
| … | … | … | … | … |

Ye can build this table like so:

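freq_abs <- table(x)                 # absolute frequencies ni
freq_rel <- prop.table(freq_abs)     # relative frequencies fi
freq_acc <- cumsum(freq_abs)         # cumulative frequencies Ni
freq_rel_acc <- cumsum(freq_rel)     # relative cumulative frequencies Fi

# as.vector() strips the table class so each argument becomes one plain column;
# passing a table straight to data.frame() would expand it into two columns
tabla <- data.frame(
  Modo = as.numeric(names(freq_abs)),
  ni = as.vector(freq_abs),
  fi = round(as.vector(freq_rel), 2),
  Ni = as.vector(freq_acc),
  Fi = round(as.vector(freq_rel_acc), 2)
)
print(tabla, row.names = FALSE)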

⸻

📍 Central Tendency

These tell ye where the center of the distribution lies:

mean(x)       # Mean (average)
median(x)     # Median (middle value)
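Both measures can be recovered by hand, which shows what the functions are doing. This is a minimal sketch with illustrative variable names: the mean as a frequency-weighted sum of the distinct values, and the median as the middle element of the sorted data (the simple index below assumes an odd number of observations, as in this sample).

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)
n <- length(x)

ni   <- as.vector(table(x))              # absolute frequencies
vals <- as.numeric(names(table(x)))      # the distinct values

mean_by_hand <- sum(vals * ni) / n       # weighted sum: 233 / 25 = 9.32
median_by_hand <- sort(x)[(n + 1) / 2]   # 13th of 25 sorted values: 9

all.equal(mean_by_hand, mean(x))         # TRUE
median_by_hand == median(x)              # TRUE
```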

⸻

🌊 Spread and Shape

These tell ye how the values vary:

var(x)        # Sample variance (divides by n - 1)
sd(x)         # Standard deviation (square root of the variance)
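Note that `var()` and `sd()` use the sample formulas, dividing by n - 1 rather than n. The sketch below, with illustrative names `s2` and `s2_pop`, computes both versions by hand so ye can see the difference.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)
n <- length(x)

dev <- x - mean(x)               # deviations from the mean

s2     <- sum(dev^2) / (n - 1)   # sample variance, what var(x) computes: 7.31
s2_pop <- sum(dev^2) / n         # population version, dividing by n: 7.0176

all.equal(s2, var(x))            # TRUE
all.equal(sqrt(s2), sd(x))       # TRUE
```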

Get min, max, quartiles in one command:

summary(x)

Example output:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   4.00    8.00    9.00    9.32   11.00   16.00

Or get the quartiles directly (pass `probs` for other percentiles):

quantile(x)
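By default `quantile()` returns the minimum, the three quartiles, and the maximum. A short sketch of asking for other percentiles through the `probs` argument, plus the interquartile range via `IQR()`; the variable names are illustrative.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

p <- quantile(x, probs = c(0.10, 0.90))   # 10th and 90th percentiles
p

q <- quantile(x, probs = c(0.25, 0.75))   # first and third quartiles: 8 and 11
iqr <- IQR(x)                             # interquartile range, Q3 - Q1 = 3
iqr
```

R interpolates between neighboring data points by default, so a percentile need not be one of the observed values.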

⸻

📊 Plot It!

Every good summary needs a visual:

hist(x, col = "skyblue", main = "Distribution of X", xlab = "Values")
boxplot(x, col = "tomato", main = "Boxplot of X")
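The numbers behind the boxplot can be inspected without drawing anything. This sketch uses base R’s `boxplot.stats()`, which returns the five values the box and whiskers are drawn from, plus any points flagged as outliers; `bs` is an illustrative name.

```r
x <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)

bs <- boxplot.stats(x)   # the statistics boxplot() uses, with no plot drawn

bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # points more than 1.5 box lengths beyond the box: here, 16
```

For this sample the box runs from 8 to 11, so anything above 15.5 is drawn as a separate point, which is why 16 shows up alone on the plot.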

⸻

🧭 Summary of Metrics

| Concept | R function |
|---|---|
| Total size | `length(x)` |
| Absolute frequency | `table(x)` |
| Relative frequency | `prop.table(table(x))` |
| Mean | `mean(x)` |
| Median | `median(x)` |
| Variance | `var(x)` |
| Standard deviation | `sd(x)` |
| Quartiles | `quantile(x)` |
| General summary | `summary(x)` |

Cheatsheet

> x   <- c(11,16,5,10,5,9,9,11,9,4,8,13,7,12,8,12,8,7,11,8,10,8,10,12,10)
> n   <- length(x)    # 25
> E   <- mean(x)      # 9.32
> Mdn <- median(x)    # 9
> quantile(x)
  0%  25%  50%  75% 100% 
   4    8    9   11   16 
> v   <- var(x)       # 7.31
> s   <- sd(x)        # 2.703701, same as sqrt(v)
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   4.00    8.00    9.00    9.32   11.00   16.00