Installing Python for Data Science

Python has some fantastic libraries with powerful data analysis tools, but it can be difficult to install. These are the installation instructions for a Mac.

  1.  You’re going to need to install the version of Xcode that matches your operating system. Xcode comes with the development and command line tools needed for Python. You can find Xcode at https://developer.apple.com/download/more/.
  2.  Homebrew is the best way to install and manage Python, so you’re going to need to install it first. In your terminal run: $ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)". Just copy and paste this command (without the leading $) into your terminal, making sure the quotation marks are plain straight quotes, and Homebrew will download and install automatically.
  3.  Install Python 3 using Homebrew by running: $ brew install python. This installs the latest stable version of Python onto your Mac.
  4.  Check that your command shell is running the version of Python you just installed by running: $ python --version. The latest stable version right now is Python 3.8.5.
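
If you would rather confirm the version from inside Python itself, a short script along these lines can help. It is only a sketch: the 3.8 minimum mirrors the version mentioned above, and MIN_VERSION and is_new_enough are names made up for this example.

```python
import sys

# Assumption: "new enough" means Python 3.8 or later; change this to suit you.
MIN_VERSION = (3, 8)


def is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given (major, minor, ...) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # sys.executable is the full path of the interpreter that is actually running.
    print("Running Python", sys.version.split()[0], "from", sys.executable)
    print("New enough:", is_new_enough())
```

Save it under any file name you like and run it with the same python command your shell uses. The printed path shows which installed copy answered.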

If your command shell is still running an older version of Python, here’s how to change the default version in your command shell. List the installed Python executables and the links that point to them by running the command $ ls -l /usr/local/bin/python*.

You’ll see output like this:

Mar 14 2020 /usr/local/bin/python -> /usr/local/bin/python3.7
Sep 13 14:26 /usr/local/bin/python3 -> ../Cellar/python@3.8/3.8.5/bin/python3
Sep 13 14:26 /usr/local/bin/python3-config -> ../Cellar/python@3.8/3.8.5/bin/python3-config
Sep 13 14:26 /usr/local/bin/python3.8 -> ../Cellar/python@3.8/3.8.5/bin/python3.8
Sep 13 14:26 /usr/local/bin/python3.8-config -> ../Cellar/python@3.8/3.8.5/bin/python3.8-config

You want the command shell to point to the entry with the latest version number that does not end in “-config”, which in my case is /usr/local/bin/python3.8.

I ran the command $ ln -s -f /usr/local/bin/python3.8 /usr/local/bin/python to redirect my command shell. The -s option creates a symbolic link and -f replaces the link that is already there.

If you have a different version installed, replace the X in /usr/local/bin/python3.X with the minor version number you have.
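
To see what the ln -s -f command does without touching /usr/local/bin, here is a rough Python sketch that repeats the same relinking inside a throwaway temporary folder. The helper names force_symlink and demo, the ".tmp" suffix, and the empty stand-in files are all invented for this illustration. It assumes a system that allows symbolic links, as macOS does.

```python
import os
import tempfile


def force_symlink(target, link_path):
    """Point link_path at target, replacing any existing link (like `ln -s -f`)."""
    tmp_path = link_path + ".tmp"  # hypothetical temporary name for the new link
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(target, tmp_path)
    # Renaming the new link over the old one swaps it in a single step.
    os.replace(tmp_path, link_path)


def demo():
    """Relink a fake `python` from 3.7 to 3.8 and report both link targets."""
    with tempfile.TemporaryDirectory() as bin_dir:
        # Empty files stand in for the real interpreters.
        for name in ("python3.7", "python3.8"):
            open(os.path.join(bin_dir, name), "w").close()

        link = os.path.join(bin_dir, "python")
        force_symlink(os.path.join(bin_dir, "python3.7"), link)
        before = os.path.basename(os.readlink(link))

        force_symlink(os.path.join(bin_dir, "python3.8"), link)
        after = os.path.basename(os.readlink(link))
        return before, after


if __name__ == "__main__":
    print(demo())
```

The link named python is only a signpost. Relinking it changes which interpreter the name leads to, and the interpreters themselves are left alone.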

Now you’re going to need to install some data science libraries to get going:

  1. To install Matplotlib, the 2D Python plotting library that can draw bar charts, scatterplots, error charts, histograms, and more with just a few lines of code, paste the command pip3 install matplotlib into your command shell.
  2. To install Pandas, the data analysis and data structure toolkit, use the command pip3 install pandas in your command shell.
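
Once both installs finish, a quick script like this one can confirm that Python can find the libraries. Treat it as a sketch: missing_packages is a helper name made up here, and it only checks that each package can be located, not that it works correctly.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["matplotlib", "pandas"])
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("Matplotlib and Pandas are both ready to use.")
```

If a library shows up as missing, the usual cause is that pip3 and your python command belong to different Python installations.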

There are many more tools available, but this is all you need to start doing some serious data analysis.