Paul Bernert Research Project - PaulBernert/DBNA GitHub Wiki
University of Arizona - Coding Bootcamp - Final Project
Research Objectives
This project takes the raw data provided by the Doing Business North America report and calculates a score and rank for each location's 'Ease of Doing Business' (see the Methodology for an explanation of what Ease of Doing Business measures). The effectiveness of the ranking and scoring process is then tested in an applied setting: checking whether the calculated 'Ease of Doing Business' ranks correlate with business activity relative to the local population.
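As a rough illustration of that applied test, the sketch below computes a Spearman rank correlation between rank and business starts per 1,000 residents. The numbers are invented for illustration, the variable names are hypothetical, and the choice of Spearman correlation is an assumption, not necessarily the statistic used in this project. It also assumes that rank 1 means the easiest place to do business.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example data for five locations (not taken from the DBNA report).
ease_rank = [1, 2, 3, 4, 5]  # assumption: 1 = easiest place to do business
business_starts = [900, 1000, 600, 700, 300]
population = [100_000, 200_000, 80_000, 175_000, 100_000]

# Normalize business activity by local population.
starts_per_1000 = [1000 * s / p for s, p in zip(business_starts, population)]

rho = spearman(ease_rank, starts_per_1000)
print(f"Spearman correlation: {rho:.2f}")
```

Under the rank-1-is-easiest assumption, a negative coefficient would mean that better-ranked locations tend to have more business starts per resident.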
Another primary goal of the project is to use machine learning (through Scikit-learn) to test multiple clustering algorithms and see which locations are most similar in nature. The two clustering algorithms tested are KMeans and Affinity Propagation. After comparing the two clustering structures, the next step is to test the relationship between the clusters and the 'Ease of Doing Business' ranks to see whether clusters form around ranks (that is, whether ranks are a good representation of how locations are clustered).
These tests were chosen not only to test the effectiveness of the dataset in an analytical environment, but also to see whether regulatory burdens truly have an impact on business starts.
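A minimal sketch of how the two algorithms might be run side by side in Scikit-learn is shown below. The feature matrix is synthetic stand-in data, not the DBNA indicators. The number of KMeans clusters, the standardization step, and the use of the adjusted Rand index to compare the two labelings are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for an indicator matrix: 30 locations x 4 indicators,
# generated around three well-separated group centers (hypothetical data).
centers = np.array([[0, 0, 0, 0], [5, 5, 5, 5], [10, 0, 10, 0]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(10, 4)) for c in centers])
true_groups = np.repeat([0, 1, 2], 10)

# Assumption: put indicators on a common scale before clustering.
X_scaled = StandardScaler().fit_transform(X)

# KMeans needs the number of clusters up front (3 is an assumption here).
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Affinity Propagation chooses the number of clusters itself.
ap_labels = AffinityPropagation(random_state=0).fit_predict(X_scaled)

# Adjusted Rand index: 1.0 means the two labelings group locations identically.
agreement = adjusted_rand_score(kmeans_labels, ap_labels)
print("KMeans clusters:", len(set(kmeans_labels)))
print("Affinity Propagation clusters:", len(set(ap_labels)))
print("Agreement (adjusted Rand index):", round(agreement, 3))
```

The same `adjusted_rand_score` call could in principle compare either labeling against groups derived from the ranks (for example, rank quartiles), which is one way to ask whether clusters form around ranks.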
About the Project
The Doing Business North America (DBNA) project annually provides objective measures of the scale and scope of business regulations in 130 cities across 92 states, provinces, and federal districts of the United States, Canada, and Mexico. It uses these measures to score and rank cities in regard to how easy or difficult it is to set up, operate, and shut down a business.
Over the years, researchers have begun to understand how robust measurement and ranking of regulations that either enhance business activity or constrain it can provide substantial insight into economic outcomes. Objective measurements of those regulations have been vital in this understanding. Unlike many studies that measure regulations at the state level, this annual study measures the impact at the city level and does so for over 100 municipal jurisdictions across North America.
The Doing Business North America team collected data on 63 different regulatory and economic indicators across six different categories. The data collected came entirely from official and publicly-available sources.
About the Data
This project manually collects data on the primary regulatory burdens businesses face throughout the entire life cycle of a business, ranging from Starting a Business to eventually Resolving Insolvency if the business were to shut down or go bankrupt. The report contains data on 63 regulatory indicators within the following six categories:
- Starting a Business
- Employing Workers
- Getting Electricity
- Paying Taxes
- Land and Space Use
- Resolving Insolvency
These six categories are then combined to create a catch-all value known as the 'Ease of Doing Business'. This is the value used in all analysis for this project.
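The exact way the categories are combined is described in the Methodology and is not reproduced here. Purely as an illustration, the sketch below assumes each location already has one score per category and takes an unweighted mean of the six, then ranks locations from highest to lowest overall score. The city names, scores, and function names are all hypothetical.

```python
from statistics import mean

CATEGORIES = [
    "Starting a Business",
    "Employing Workers",
    "Getting Electricity",
    "Paying Taxes",
    "Land and Space Use",
    "Resolving Insolvency",
]


def overall_score(category_scores):
    """Combine six category scores into one value.

    Assumption: a simple unweighted mean. The real weighting is defined
    in the Methodology and may differ.
    """
    return mean(category_scores[c] for c in CATEGORIES)


def rank_cities(scores_by_city):
    """Return {city: rank}, where rank 1 is the highest overall score."""
    overall = {city: overall_score(s) for city, s in scores_by_city.items()}
    ordered = sorted(overall, key=overall.get, reverse=True)
    return {city: position for position, city in enumerate(ordered, start=1)}


# Hypothetical category scores for three made-up cities.
scores_by_city = {
    "City A": dict(zip(CATEGORIES, [80, 70, 90, 60, 75, 85])),
    "City B": dict(zip(CATEGORIES, [60, 65, 70, 55, 80, 50])),
    "City C": dict(zip(CATEGORIES, [90, 85, 80, 95, 70, 90])),
}

ranks = rank_cities(scores_by_city)
print(ranks)
```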