ggplot2 - Statistics-and-Machine-Learning-with-R/Statistical-Methods-and-Machine-Learning-in-R GitHub Wiki
Visualization of Data
Why is Visualization necessary?
Because of the way the human brain processes information, charts and graphs make large amounts of complex data easier to understand than spreadsheets or reports. Data visualization is a quick, easy way to convey concepts in a universal manner, and you can experiment with different scenarios by making slight adjustments.
Data visualization is primarily used to:
- Identify areas of significance.
- Clarify influential factors.
Package to Visualize: ggplot2
ggplot2 is a powerful and flexible R package, used for producing elegant graphics. The concept behind ggplot2 divides the plot into three different fundamental parts:
Plot = data + Aesthetics + Geometry.
The principal components of every plot can be defined as follows:
- data is a data frame
- Aesthetics is used to indicate x and y variables. It can also be used to control the color, the size or the shape of points, the height of bars, etc.
- Geometry defines the type of graphics (histogram, box plot, line plot, density plot, dot plot, etc.)
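The three components above can be sketched in a few lines of R. This is a minimal illustration, not a canonical recipe: it assumes ggplot2 is installed, and it uses the built-in mtcars data set and its wt and mpg columns purely as example data (they are not part of the original text).

```r
library(ggplot2)

# data: a data frame (mtcars ships with R; used here only for illustration)
# aesthetics: aes() maps columns of the data frame to x and y
# geometry: geom_point() draws one point per row
p_basic <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()

print(p_basic)
```

Swapping geom_point() for another geometry, such as geom_line(), changes the type of graphic while the data and aesthetics stay the same.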
Scatter Plots using ggplot2
Basic scatter plots can be created with this package. The color, size, and shape of the points can be changed using the function geom_point() as follows:
geom_point(size, color, shape)
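As a hedged sketch of how these arguments might be used, the example below sets size, color, and shape to fixed values, then shows a variant where color is mapped to a variable instead. The mtcars data set, the chosen columns, and the specific values (3, "steelblue", 17) are illustrative assumptions.

```r
library(ggplot2)

# Fixed appearance: every point gets the same size, color, and shape.
# shape = 17 is a filled triangle in R's standard point shapes.
p_fixed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 3, color = "steelblue", shape = 17)

# Mapped appearance: color varies with the number of cylinders.
# factor() makes cyl discrete so each value gets its own color.
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl)), size = 3)

print(p_fixed)
print(p_mapped)
```

Arguments given outside aes() apply to all points, while anything placed inside aes() is driven by the data and gets a legend.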