Low-code Data Exploration Tools

Data Exploration or Exploratory Data Analysis

[Image credit: Devopedia]

For reading and cleaning data, as well as for data analysis, the Pandas Python library is a popular choice for everyday data science tasks. Pandas also includes a set of essential visualization functions for exploring the properties of a dataset.
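
As a point of comparison for the low-code tools, the following is a minimal sketch of a manual first pass with Pandas. The dataset, its column names, and the cleaning choices (dropping rows without a city, filling missing temperatures with the column mean) are made up for illustration and are not part of the workshop material.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "city": ["Tucson", "Phoenix", "Flagstaff", "Yuma", None],
        "temp_c": [38.5, 41.0, 27.3, None, 35.1],
        "rain_mm": [0.0, 0.0, 4.2, 0.0, 1.1],
    }
)

# Count missing values per column in the raw data.
missing = df.isna().sum()

# Cleaning (one possible choice): drop rows without a city,
# then fill missing temperatures with the column mean.
clean = df.dropna(subset=["city"]).copy()
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())

# Summary statistics for the numeric columns.
summary = clean.describe()

print(missing)
print(summary)

# With matplotlib installed, Pandas can also plot directly, for example:
# clean["temp_c"].plot.hist()
```

Each step here is written by hand; the tools reviewed on this page aim to produce this kind of overview, and much more, automatically.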

We present a small collection of open-source Python tools that make it possible to carry out an Exploratory Data Analysis of a dataset with very little code.

There are many tools of this type; we will review the following:

  • ydata-profiling | Documentation. ydata-profiling (formerly known as pandas-profiling) provides a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. Like the Pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame, and the analysis can be exported in different formats such as HTML and JSON. (Please read the installation notes.)

  • Sweetviz. Sweetviz is an open-source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code. Output is a fully self-contained HTML application.

  • Lux API | Documentation. Lux is a Python library that makes data science easier by automating certain aspects of the data exploration process. Lux is designed to facilitate faster experimentation with data, even when the user does not have a clear idea of what they are looking for. Lux is integrated with an interactive Jupyter widget that allows users to quickly browse through large collections of data directly within their Jupyter notebooks.

  • DataPrep | Documentation. DataPrep.EDA is described by its developers as the fastest and easiest EDA (Exploratory Data Analysis) tool in Python. It allows you to understand a Pandas/Dask DataFrame with a few lines of code in seconds.

  • AutoViz. AutoViz automatically visualizes a dataset of any size with a single line of code. Its interactive charts can be saved automatically as HTML files with the "html" setting.
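
To give a feel for how little code is involved, here is a hedged sketch for two of the tools listed above. The helper names (`available_tools`, `profile_with_ydata`, `profile_with_sweetviz`) and the output file names are hypothetical, chosen for this example. The tool imports are done lazily inside the functions, so the snippet loads even when a tool is not installed. The calls shown follow each library's documented basic usage, but check the documentation of your installed version.

```python
import importlib.util

# Import names of the tools listed above (pip package names may differ,
# e.g. Lux is installed as "lux-api" but imported as "lux").
TOOLS = {
    "ydata-profiling": "ydata_profiling",
    "Sweetviz": "sweetviz",
    "Lux": "lux",
    "DataPrep": "dataprep",
    "AutoViz": "autoviz",
}


def available_tools():
    """Return the names of the tools in TOOLS that can be imported here."""
    return [
        name
        for name, module in TOOLS.items()
        if importlib.util.find_spec(module) is not None
    ]


def profile_with_ydata(df, path="ydata_report.html"):
    """Write a ydata-profiling HTML report for a Pandas DataFrame."""
    from ydata_profiling import ProfileReport  # lazy import

    ProfileReport(df, title="EDA report").to_file(path)
    return path


def profile_with_sweetviz(df, path="sweetviz_report.html"):
    """Write a Sweetviz HTML report for a Pandas DataFrame."""
    import sweetviz as sv  # lazy import

    sv.analyze(df).show_html(path, open_browser=False)
    return path


if __name__ == "__main__":
    # Show which of the tools are installed in the current environment.
    print(available_tools())
```

With a DataFrame `df` already loaded, calling `profile_with_ydata(df)` or `profile_with_sweetviz(df)` writes a self-contained HTML report that can be opened in a browser.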


Please see the Jupyter Notebook with examples.



Created: 03/16/2023; Updated: 03/16/2023

Carlos Lizárraga Data Science Institute University of Arizona

CC BY-NC-SA 4.0