3. Charts Deep Dive - jonboone1/plotlyPowerpoint GitHub Wiki

Charts Overview

This page has everything you need to know when it comes to individual charts. Below you will find information on each chart type that is currently supported, as well as how to define an object for that specific chart type.

All charts are created in Plotly, saved as images, and then loaded into a specific powerpoint slide (except 1). The one exception for this is the table. Because powerpoint support for tables is pretty good, we've decided to insert tables as actual Pptx tables instead of an image. Because of this, defining an object for the table is a bit different than it is for everything else.

The Plotly Chart Object

Below is an overview of the core components you need to define a plotly chart. Some charts have small differences, which you can find below for the description related to that chart type.

Example Code

 { #Line Chart - stock prices
    "data": df2,
    "type": "line",
    "name": "Stock Prices by Company",
    "filename": 'charts/stock-prices-by-company',
    "metrics": [
        {"name": "JohnsonJohnson", "prettyName": "Stock Price", "method": "mean"}
    ],
    "axis": "year",
    "x-axis-title": 'Year',
    "y-axis-title": "Average Stock Price",
    "description": "Grouping by additional variables is easy",
    "filters": [
        {"variable": "year", "operation": ">=", "value": "1970", "type":"int"}
    ],
    "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
},

Object Definition

  • data - A pandas dataframe representing your data
  • type - Defines your chart type.
  • name - The title of the slide
  • filename - The place where the image will be stored. Make sure the directory actually exists
  • metrics - An array defining each metric.
    • name - The name of the column in the dataframe
    • prettyName - The label you would like to show for this metric
    • method - How you want to aggregate this metric. Can use mean, sum, or nunique.
  • axis - The name of the column you want on the categorical axis. Typically this is the x-axis.
  • x-axis-title - A label you want to show for the x-axis
  • y-axis-title - A label you want to show for the y-axis
  • description - Content you want to place in a text box (if your slide has that object)
  • filters - An array of filters you want to place on your dataset for the chart.
    • variable - The name of the column in your pandas dataframe
    • operation - Can be ==, >=, <=, !=, or in
    • value - What value you are using to filter
    • type - What data type this field is. Supports int, str, or arr
  • item-index - A sub-dictionary defining where each element should go in the powerpoint slide template
    • slide - What slide you are using. This number is the slide index in the master layout
    • title - Where the title in the slide template goes
    • chart - Where the chart in the slide goes
    • description - Where the description in the slide goes

Individual Chart Types

Line Chart

A simple line chart. You can do basic stuff with this chart, such as break out by an additional dimenion, add multiple metrics, etc.

image

Example Code

 { #Line Chart - stock prices
    "data": df2,
    "type": "line",
    "name": "Stock Prices by Company",
    "filename": 'charts/stock-prices-by-company',
    "metrics": [
        {"name": "JohnsonJohnson", "prettyName": "Stock Price", "method": "mean"}
    ],
    "axis": "year",
    "x-axis-title": 'Year',
    "y-axis-title": "Average Stock Price",
    "description": "Grouping by additional variables is easy",
    "filters": [
        {"variable": "year", "operation": ">=", "value": "1970", "type":"int"}
    ],
    "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
},

Line Chart Specifics

  • color - Defined in the root of the array. You can use this to select a dimension used to group by color. For instance, if you have a dataframe with a date column, metric column, then dimension column, you can put the date on the x-axis, metric on the y-axis, then group by the dimension to create multiple lines. Currently not supported for multiple metrics.
  • options -
    • horizontal-grid-lines - Set to either true or false. If true, it will set grid-lines in the background (going horizontally) at this color: #ebebeb.
    • vertical-grid-lines - Also set to either true or false. Does the same thing but vertically.
    • x-axis-ticks-angle - Allows you to rotate the x-axis-labels on the line chart. This must be an integer; generally I use -45 to rotate labels that are a little long to fit normally.

Bar Chart

A basic bar chart. Offers most of the same functionality as the basic line chart. image

Example Code

 { #Bar chart of insect sprays
    "data": df4,
    "type": "bar",
    "name": "Avg Spray Effictiveness by Type",
    "filename": 'charts/spray-by-type',
    "metrics": [
        {"name": "count", "prettyName": "Effectiveness", "method": "mean"},
        {"name": "m2", "prettyName": "Effectiveness 2", "method": "mean"}
    ],
    "axis": "spray",
    "color": "category", 
    "x-axis-title": "Effectiveness",
    "description": "this slide has data on it!",
    'options': {
        'orientation': 'horizontal'
    },
     "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
},

Bar Chart Specifics

  • color - Defined in the root of the array. You can use this to select a dimension used to group by color. For instance, if you have a dataframe with a date column, metric column, then dimension column, you can put the date on the x-axis, metric on the y-axis, then group by the dimension to create multiple lines. Currently not supported for multiple metrics.
  • options -
    • horizontal-grid-lines - Set to either true or false. If true, it will set grid-lines in the background (going horizontally) at this color: #ebebeb.
    • vertical-grid-lines - Also set to either true or false. Does the same thing but vertically.
    • x-axis-ticks-angle - Allows you to rotate the x-axis-labels on the line chart. This must be an integer; generally I use -45 to rotate labels that are a little long to fit normally.
    • orientation - Can be set to either horizontal or vertical. If horizontal, the bar chart will be set to go from left to right.

Filled Line Chart

image

Example Code

 { #Filled line chart
    "data": df3,
    "type": "filledLine",
    "name": "Stock Prices by Company - Filled Line Chart",
    "filename": 'charts/stock-prices-by-company-filled',
    "metrics": [
        {"name": "price", "prettyName": "Stock Price", "method": "mean"}
    ],
    "color": "company",
    "axis": "year",
    "x-axis-title": 'Year',
    "y-axis-title": "Average Stock Price",
    "description": "Grouping by additional variables is easy",
    "filters": [
        {"variable": "year", "operation": ">=", "value": "1970", "type":"int"}
    ],
     "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
}

Filled Line Chart Specifics Everything for this chart is the same as a normal line chart.

Facet Line Chart

image

Example Code

 { # Facet Line Chart
    "data": df3,
    "type": "facetLine",
    "name": "Multiple Comanpy Stock Prices",
    "filename": 'charts/many-stock-prices-facet',
    "metrics": [
        {"name": "price", "prettyName": "Stock Price", "method": "mean"}
    ],
    "axis": "year",
    'y-axis-title': 'Stuff',
    "description": "Faceting can be fun",
    "filters": [
        {"variable": "year", "operation": ">=", "value": "1970", "type":"int"}
    ],
    'facet': 'company',
    'facet-direction': 'rows',
    "options": {
        'horizontal-grid-lines': 'true',
        'vertical-grid-lines': 'true',
        'color-grouping': 'facet',
        'facet-spacing': 0.2
    },
     "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
}

Facet Line Chart Specifics

  • facet - Determines the variable used for faceting. This must align to one of the column names in your dataframe.
  • facet-direction - Either set to rows or cols. This tells the function if you want to have multiple rows or multiple columns.
  • options -
    • color-grouping - Either set to metric or facet. If set to metric, colors will be grouped by metric. If set to facet, you will have one metric for each facet.
    • horizontal-grid-lines - Set to either true or false. If true, it will set grid-lines in the background (going horizontally) at this color: #ebebeb.
    • vertical-grid-lines - Also set to either true or false. Does the same thing but vertically.
    • facet-spacing - Used to define spacing between facets. Should be a number between 0.01 and 0.4. Anything above 0.4 is likely going to be too large.

Facet Bar Chart

image

Example Code

 { # Facet Bar chart
    "data": df4,
    "type": "facetBar",
    "name": "Avg Spray Effictiveness by Type",
    "filename": 'charts/spray-by-type',
    "metrics": [
        {"name": "count", "prettyName": "Effectiveness", "method": "mean"}
    ],
    "axis": "spray",
    "facet": 'category',
    "facet-direction": 'rows',
    "x-axis-title": "Effectiveness",
    "description": "this slide has data on it!",
    "options": {
        'color-grouping': 'facet'
    },
    "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
    
}

Facet Bar Specifics

  • facet - Determines the variable used for faceting. This must align to one of the column names in your dataframe.
  • facet-direction - Either set to rows or cols. This tells the function if you want to have multiple rows or multiple columns.
  • options -
    • color-grouping - Either set to metric, axis, or facet. If set to metric, colors will be grouped by metric. If set to axis, colors will be set by each point on the categorical axis (generally x). If set to facet, you will have one metric for each facet.
    • horizontal-grid-lines - Set to either true or false. If true, it will set grid-lines in the background (going horizontally) at this color: #ebebeb.
    • vertical-grid-lines - Also set to either true or false. Does the same thing but vertically.
    • facet-spacing - Used to define spacing between facets. Should be a number between 0.01 and 0.4. Anything above 0.4 is likely going to be too large.

Facet Filled Line Chart

image

Example Code

 { #Facet Filled line chart
    "data": df6,
    "type": "facetFilledLine",
    "name": "Facet Filled Line",
    "filename": 'charts/stock-prices-by-company-filled',
    "metrics": [
        {"name": "price", "prettyName": "Stock Price", "method": "mean"}
    ],
    "facet": 'company',
    "facet_direction": 'rows',
    "axis": "year",
    "x-axis-title": 'Year',
    "y-axis-title": "Average Stock Price",
    "description": "Grouping by additional variables is easy",
    "filters": [
        {"variable": "year", "operation": ">=", "value": "1970", "type":"int"}
    ],
     "item-index": {
        'slide': 0,
        'title': 0,
        'chart': 10,
        'description': 11
    }
}

Facet Filled Line Specifics Everything for this chart is the same as a normal line chart.

Table

image

Example Code

 { # Table
    "data": df5,
    "type": "table",
    "name": "Example of a Table",
    "fill_color": colorDf,
    "column_formats": ['string', 'string', 'money'],
     "item-index": {
        'slide': 2,
        'title': 0,
        'chart': 12,
        'description': 11
    },
    'header_fill_color': '#051e4a',
    'header_text_color': '#ffffff',
    'text_color': '#051e4a',
    'top_offset': 1000
}

Table Specifics

  • data - This is still a pandas dataframe, but this time is an exact copy of what the table will look like. Other charts allow you to aggregate bigger dataframes, where as the table chart needs you to have the dataframe in the exact format before running this code.
  • fill_color - The global fill color for all cells
  • header_fill_color - The fill color specifically for header cells
  • header_text_color - The text color specifically for header cells
  • text_color - The global text color for all cells
  • header_font_size - Should be an integer and will be the font size of the header rows
  • text_font_size - Should be an integer and will be the font size of the rest of the data
  • top_offset - The amount you want to offset the placement of the table. The entire size of a slide is 5,143,500, so I recommned using a number no smaller than 1,000 (although you can).

Formatting Data

In the example above you can see a variable called column_formats. This expects an array and can be given the following values:

  • number - Will format this as a whole number, no commas
  • money - Whole number with a dollar sign as the beginning.
  • percent - Will multiply the number by 100 adn add a percent sign. No decimal places
  • twoDigitNum - A number rounded to the nearest two digits.
  • date - A date formatted as mm/dd/yyyy. This requires your column is formatted as a pandas datetime (pd.to_datetime())

Filtering

Below are some quick notes on how to use the filtering options within the slide object. In general, this library is designed in a way for you to provide an un-aggregated dataset to create a chart. By allowing for this, you can do any type of filtering you want before the dataset gets grouped together.

Below is a quick list of filters and how to use them (these correspond to the type parameter within a filter):

  • int - Used when your value is an integer. Supports operations of >=, <=, and ==
  • str - Used when your value is a string. Supports operations of ==
  • date - Used when your value is a datetype. This is assuming you have formatted your column using pandas datetime (pd.to_datetime()). Supports operations of >=, <=, and ==
  • list - Used when you want to filter on multiple options (can equal A,B, or c). Set the operation to either in or not in. Then create an array and use that for the value parameter. Below is what this could look like:

{'variable': 'year', 'type':'list', 'operation': 'not in', 'value':[1970, 1971, 1972]}