Python Packages

Python packages are libraries of code that you can install on your system and import into your code.

There are special libraries for data science like NumPy, pandas, and Matplotlib.

There are also libraries for machine learning like scikit-learn and Keras.

These libraries can save you a ton of work because you don’t have to reinvent the wheel, or logistic regression!

With these libraries you don’t need to be a math whiz to do data science (although knowing math will help you understand what your code is doing and write better code because of it).

To use a package you need to install it.

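Python uses pip to install packages. pip ships with current Python 3 installers, so it is usually already there. If it is missing on your system, you can typically bootstrap it with python3 -m ensurepip or install it through your operating system’s package manager.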

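Then (for Python 3) use “pip3 install <name>” to install a package, for example: pip3 install numpy. If pip3 is not on your path, python3 -m pip install numpy does the same thing for that interpreter.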

For Python 2, you only needed to call pip install … but Python 2 reached end of life in January 2020 and not many people are using it anymore.
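If you are not sure whether a package is already installed, you can ask Python itself before reaching for pip. The snippet below is a minimal sketch using only the standard library. The helper names is_installed and pip_install_command are made up for this example, and the second one only builds the command, it does not run it.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    # Note: this takes the *import* name, which can differ from the name
    # you give to pip (for example, scikit-learn is imported as sklearn).
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package_name):
    """Build (but do not run) a pip command tied to the current interpreter."""
    # Using sys.executable avoids guessing between pip and pip3.
    return [sys.executable, "-m", "pip", "install", package_name]


for name in ("json", "numpy"):
    if is_installed(name):
        print(name, "is already installed")
    else:
        print("to install it, run:", " ".join(pip_install_command(name)))
```

Calling pip through sys.executable means the package lands in the same Python that is running your script, which helps when more than one Python is installed.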

Once a package is installed you can import it into your program, usually at the very top of your .py scripts: import numpy

Or alias it so you can type a shorter name when calling functions from the package: import numpy as np
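Here is a small sketch of the two import forms side by side. It uses the standard-library statistics module as a stand-in so it runs without installing anything; the alias stats is just a name picked for this example. The same syntax applies to numpy and np once numpy is installed.

```python
import statistics            # plain import: use the full module name
import statistics as stats   # aliased import: same module, shorter name

data = [2, 4, 6, 8]

# Call the same function through the full name and through the alias.
full_name_result = statistics.mean(data)
alias_result = stats.mean(data)

# Both names point at the same module object, so the results match.
same_module = statistics is stats
print(full_name_result, alias_result, same_module)
```

The alias does not copy or change the package. It is only a second name for it inside your file.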

Some packages have standard abbreviations that you should know so you will recognize them when you read other people’s code: numpy as np, pandas as pd, and matplotlib.pyplot as plt, for example.

There are technical details and some debate about whether import numpy or from numpy import array is “better”, but that should be its own blog post.
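Without settling that debate, here is a quick look at what the two forms actually do. This is only a sketch, and it uses the standard-library math module in place of numpy so it runs anywhere. The local sqrt function is a deliberately bad idea, added just to show the difference.

```python
import math             # binds the module name; call math.sqrt(...)
from math import sqrt   # binds only the function name; call sqrt(...)

via_module = math.sqrt(16)
via_name = sqrt(16)


# A from-import puts the name straight into this file's namespace, so a
# later definition with the same name silently replaces it here...
def sqrt(x):
    return "shadowed"


shadowed = sqrt(16)

# ...while the module-qualified call is unaffected.
still_fine = math.sqrt(16)

print(via_module, via_name, shadowed, still_fine)
```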