Installing the data science stack for Python on a MacBook Pro starts with installing the latest
Anaconda build. Once it is installed, I set up some Anaconda environments to handle the different Python versions needed for different purposes.
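Environments are typically created with `conda create -n <name> python=<version>` and switched with `conda activate <name>`. To confirm which environment and interpreter are actually in use, Python itself can be asked. The following is a minimal sketch: it assumes conda's activation has set the `CONDA_DEFAULT_ENV` environment variable, and the helper name `describe_environment` is purely illustrative.

```python
import os
import sys


def describe_environment():
    """Return (conda environment name, Python version) for this interpreter."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated;
    # outside conda the variable is absent, so fall back to a placeholder.
    env_name = os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment)")
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return env_name, version


if __name__ == "__main__":
    name, version = describe_environment()
    print(f"conda environment: {name}")
    print(f"Python version:    {version}")
```

Running this inside each environment is a quick check that the interpreter version matches what the environment was created for.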
To work conveniently with Git (to come) and the different conda environments, some terminal customizations help. To deploy them, I use
Oh My Zsh, a framework for managing Zsh configuration, which themes the prompt in the iTerm2 terminal replacement to show the active conda environment and the Git repo and its status.
Git is then installed and used with a corresponding
GitHub account to enable version control.
An editor such as VS Code (comes with Anaconda) is needed, along with
JupyterLab (comes with Anaconda) and a command line editor (CLE) such as
vim (comes stock with the MacBook Pro).
Rather than dealing with troublesome installs of separate environments for things like various flavors of SQL, NoSQL, Google Cloud or AWS, I install
Docker. Because it uses containers, which share the host's kernel instead of each running a full guest operating system, Docker is lighter than full virtualization. Docker can, for example, be installed on an AWS EC2 instance for quick and compatible installation of big data packages, such as Docker container versions of MongoDB and Spark. Along the way, several Python libraries are needed.
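As an illustration of how little setup a containerized database needs, the sketch below assembles a `docker run` command for the official `mongo` image. It only builds and prints the command and does not start anything. The container name `mongo-dev` and the helper `mongo_run_command` are hypothetical; the port 27017 is MongoDB's default.

```python
import shlex


def mongo_run_command(name="mongo-dev", host_port=27017, image="mongo"):
    """Build the argument list for a detached MongoDB container."""
    return [
        "docker", "run",
        "--detach",                            # run in the background
        "--name", name,                        # hypothetical container name
        "--publish", f"{host_port}:27017",     # host port -> MongoDB default port
        image,                                 # official MongoDB image name
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it in a shell,
    # or passed to subprocess.run() once Docker is installed.
    print(shlex.join(mongo_run_command()))
```

The same command works unchanged on a laptop or on an EC2 instance with Docker installed, which is the compatibility benefit described above.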
I signed up for an
AWS account to learn how to use AWS S3 cloud storage and Elastic Compute Cloud (EC2) server instances at discount rates for working with big data.
For near real-time data collection, several cloud developer APIs sounded attractive, such as those from the
New York Times (NYT) and
SoundCloud. Where APIs are not provided, I installed Python packages for web scraping and developed data-mixing pipelines for later use.
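A rough sketch of what collecting from such an API can look like, using only the standard library. The endpoint URL, the `api-key` query parameter, and the `response.docs[].headline.main` response layout are assumptions based on the NYT Article Search API and should be checked against the current NYT developer documentation; `YOUR_API_KEY` is a placeholder and the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the NYT Article Search API; verify before relying on it.
NYT_SEARCH_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_search_url(query, api_key):
    """Return the request URL for a search query."""
    params = urllib.parse.urlencode({"q": query, "api-key": api_key})
    return f"{NYT_SEARCH_URL}?{params}"


def fetch_headlines(query, api_key):
    """Fetch matching articles and return their main headlines."""
    url = build_search_url(query, api_key)
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    # Assumed response layout: {"response": {"docs": [{"headline": {"main": ...}}]}}
    docs = payload.get("response", {}).get("docs", [])
    return [doc.get("headline", {}).get("main", "") for doc in docs]


if __name__ == "__main__":
    # Only prints the URL; calling fetch_headlines() needs a real API key.
    print(build_search_url("data science", "YOUR_API_KEY"))
```

Keeping the URL construction separate from the network call makes the request easy to inspect and test without spending API quota.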
AWS and its components, EC2 (for the server and its local memory) and S3 (for cloud storage), require a few packages to interface with Python. There is a Docker container, a command line interface, and some Python libraries.
conda install -c conda-forge awscli # To avoid the `pip install`
conda install s3fs
conda install boto3
docker
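Once these packages are installed and credentials are configured (for example with `aws configure`), S3 can be reached from Python. The sketch below is illustrative only: the helper names and the bucket `my-bucket` are hypothetical. `boto3` and `s3fs` are imported inside the functions that use them, so the URI helper still works without them.

```python
import urllib.parse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urllib.parse.urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def upload_file(local_path, s3_uri):
    """Upload a local file to S3 with boto3 (needs AWS credentials)."""
    import boto3  # lazy import: only required when uploading

    bucket, key = split_s3_uri(s3_uri)
    boto3.client("s3").upload_file(local_path, bucket, key)


def list_keys(s3_uri):
    """List objects under an S3 prefix with s3fs (needs AWS credentials)."""
    import s3fs  # lazy import: only required when listing

    bucket, prefix = split_s3_uri(s3_uri)
    fs = s3fs.S3FileSystem()
    return fs.ls(f"{bucket}/{prefix}".rstrip("/"))


if __name__ == "__main__":
    # 'my-bucket' is a hypothetical bucket name used only for illustration.
    print(split_s3_uri("s3://my-bucket/data/file.csv"))
```

`boto3` is the lower-level AWS client, while `s3fs` exposes a bucket through a file-system-like interface, which tends to be handier when reading data into other Python libraries.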