Journal Entry Week 10 20221031 - klmartinez/DSF GitHub Wiki

For Journal Club on Monday we read and discussed A Guide to Data Pipelines. It was an interesting read, but it felt a little beyond my current skill level when it comes to creating pipelines. It will probably be a while before I am at this level, but I feel that I am taking productive steps toward that goal.

Notes from Tuesday's lesson:

Weekly DSF Small Group Challenge for 2022 October 2:

  1. Using the Python computing environment of your choice, the challenge for today is to reproduce the Exploratory Data Analysis with Pandas & Seaborn Jupyter Notebook available on GitHub (you may already have downloaded it).
  2. Then replicate it using either a different dataset included in Seaborn (using both commands: sns.get_dataset_names() and sns.load_dataset()) or one of your own datasets, with at least 2 numeric variables and at least 2 categorical variables. Reproduce similar plots using the latest version of Seaborn (version 0.12 or newer). Please use the !pip install seaborn command from within a Jupyter Notebook code cell, or use conda install -c anaconda seaborn if you are using Anaconda Python.
  3. After this, upload the Jupyter Notebook into one of your GitHub repositories and link it in a GitHub Wiki Journal entry. Please write a brief text describing what your difficulties were in this activity, what you found interesting, what you learned, and where this would be useful for your research.
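
For step 2, one way to check whether a dataset has at least 2 numeric and at least 2 categorical variables is sketched below. This is only a minimal sketch, not the method from the original notebook. The helper names (count_variable_types, meets_challenge_requirements, find_candidate_datasets) and the small example table are made up for illustration, and it assumes that any column which is not numeric or date-like counts as categorical.

```python
import pandas as pd


def count_variable_types(df):
    """Return (number of numeric columns, number of categorical columns)."""
    numeric = df.select_dtypes(include="number")
    # Assumption: anything that is not numeric or date-like is treated as categorical.
    categorical = df.select_dtypes(exclude=["number", "datetime", "timedelta"])
    return numeric.shape[1], categorical.shape[1]


def meets_challenge_requirements(df, min_numeric=2, min_categorical=2):
    """True if df has enough numeric and categorical columns for the challenge."""
    n_numeric, n_categorical = count_variable_types(df)
    return n_numeric >= min_numeric and n_categorical >= min_categorical


def find_candidate_datasets():
    """List the Seaborn example datasets that satisfy the requirement.

    Both Seaborn calls download from the internet, so this function is
    defined here but not called automatically.
    """
    import seaborn as sns  # imported here so the helpers above work without Seaborn

    candidates = []
    for name in sns.get_dataset_names():
        df = sns.load_dataset(name)
        if meets_challenge_requirements(df):
            candidates.append(name)
    return candidates


# Hypothetical example table (made-up values, for illustration only).
example = pd.DataFrame(
    {
        "height": [1.2, 1.5, 1.1, 1.8],
        "weight": [10, 14, 9, 20],
        "group": pd.Categorical(["a", "b", "a", "b"]),
        "site": pd.Categorical(["north", "north", "south", "south"]),
    }
)

print(count_variable_types(example))
print(meets_challenge_requirements(example))
```

In a notebook with internet access, calling find_candidate_datasets() should return the names of the built-in datasets that could be used for the replication step.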

Here is a link to my Jupyter Notebook, which also includes a summary description of this activity.

FOSS: