NumPy (Numerical Python) is one of the most popular Python libraries and is used heavily in data science. If you have worked on any data science problem, you have probably heard of it.

PS: The second part has also been released and is linked at the end of this post.

What is NumPy?

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. Source

So what? Why use NumPy when Python lists already exist?

There are several reasons to use NumPy over Python lists.

  1. NumPy is fast, much faster than Python lists for numerical work.
  2. It facilitates advanced mathematical and other scientific and computational operations on large amounts of data. NumPy handles them with less code and executes them more efficiently.

Let's get a glimpse of how handy NumPy is.

Say we have two lists of the same length, and we want to multiply their respective elements and store the results in another list. With plain Python lists we would do something similar to this.

rows = 5
a = [1, 2, 3, 4, 5]
b = [6, 7, 8, 9, 10]
c = []

# c starts empty, so append each product instead of assigning to c[i]
for i in range(rows):
  c.append(a[i]*b[i])

Now assume we have two two-dimensional arrays (matrices) of the same shape, and we want to multiply their respective elements to form a new matrix. Then we would probably write something like this.

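# a and b are nested lists of the same shape: rows x columns
c = [[0] * columns for _ in range(rows)]

for i in range(rows):
  for j in range(columns):
    c[i][j] = a[i][j]*b[i][j]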

Now here is the catch: with NumPy, once a and b are NumPy arrays, we can simply write:

c = a * b

It is cool, isn't it? 😇
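Here is a minimal, runnable sketch of that one-liner for both the one-dimensional and two-dimensional cases. The names m1, m2 and m3 are illustrative and not from the original example. Note that * multiplies element by element; it is not a matrix product (NumPy uses the @ operator for that).

```python
import numpy as np

# 1-D case: element-wise product of two equal-length arrays
a = np.array([1, 2, 3, 4, 5])
b = np.array([6, 7, 8, 9, 10])
c = a * b
print(c.tolist())  # [6, 14, 24, 36, 50]

# 2-D case: the same expression works for matrices of equal shape
m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])
m3 = m1 * m2
print(m3.tolist())  # [[5, 12], [21, 32]]
```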

Difference between NumPy and Python Lists?

| NumPy | Python List |
| --- | --- |
| NumPy provides ndarray, a homogeneous n-dimensional array object, with methods to efficiently operate on it. | Lists are used to store multiple items in a single variable. |
| NumPy arrays have a fixed size at creation. | Python lists can grow dynamically. |
| The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory. | The elements in a list can be of different data types. |
NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.
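A small sketch of two of these differences, using made-up variable names: an array converts its elements to one common data type, and "growing" an array with np.append actually returns a new array while the original keeps its size.

```python
import numpy as np

# A list can mix types; an array converts everything to one common dtype
mixed_list = [1, 2.5, 3]
arr = np.array(mixed_list)
print(arr.dtype)  # float64

# A list grows in place
nums = [1, 2, 3]
nums.append(4)

# np.append does not modify the array; it builds a new, larger one
fixed = np.array([1, 2, 3])
bigger = np.append(fixed, 4)
print(fixed.size, bigger.size)  # 3 4
```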

Install NumPy

If you use pip, you can install NumPy with:

pip install numpy

If you use conda, you can install NumPy from the defaults or conda-forge channels:

# Best practice, use an environment rather than install in the base env
conda create -n my-env
conda activate my-env
# If you want to install from conda-forge
conda config --env --add channels conda-forge
# The actual install command
conda install numpy

Read More

Playing with NumPy

import numpy as np

Let's create the NumPy equivalent of a list.

normal_list = [1, 2, 3, 4, 5]
np_list = np.array(normal_list)

Let's check the speed of NumPy

normal_list = list(range(1, 10000000))
np_list = np.array(normal_list)

Let's subtract 1 from each element

List Comprehension

%%time
a = [x-1 for x in normal_list]

Output:
CPU times: user 426 ms, sys: 132 ms, total: 558 ms
Wall time: 571 ms

With Loop

%%time
a = []
for x in normal_list:
    a.append(x-1)

Output:
CPU times: user 1.12 s, sys: 155 ms, total: 1.28 s
Wall time: 1.28 s

With NumPy

%%time
a = np_list-1

Output:
CPU times: user 109 ms, sys: 20.3 ms, total: 129 ms
Wall time: 129 ms
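The %%time magic only works in Jupyter/IPython. If you want to repeat the comparison in a plain Python script, one possible sketch uses the standard timeit module. The function names and the smaller input size are my own choices, and the timings you get will depend on your machine, so no expected numbers are shown.

```python
import timeit
import numpy as np

n = 1_000_000  # smaller than the 10 million used above, so it runs quickly
normal_list = list(range(1, n))
np_list = np.array(normal_list)

def with_comprehension():
    return [x - 1 for x in normal_list]

def with_loop():
    out = []
    for x in normal_list:
        out.append(x - 1)
    return out

def with_numpy():
    return np_list - 1

# Average each approach over three runs
for fn in (with_comprehension, with_loop, with_numpy):
    seconds = timeit.timeit(fn, number=3) / 3
    print(f"{fn.__name__}: {seconds:.4f} s per run")
```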

Learn More

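Why is NumPy so fast?

  1. Vectorized code: the explicit loop disappears from your Python code and runs "behind the scenes" in optimized, pre-compiled C code. This is the main source of the speedup.
  2. Fewer lines of code, which generally means fewer bugs
  3. Code that more closely resembles standard mathematical notation
  4. More "Pythonic" code

Points 2 to 4 are side benefits of vectorization rather than reasons for the speed itself.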

Indexing of NumPy Arrays

Given a NumPy array, how do we access specific indexes? We can index a NumPy array just as we would a Python list.

a = np.array([1, 2, 3, 4, 5, 6, 7])

#Let's try to access the first, third, last and second last elements
print(a[0], a[2], a[-1], a[-2])

# Let's try it with a multi-dimensional array.

new_a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])

# Let's try to access 2nd row and 2nd column i.e. 4
new_a[1, 1]
# Output: 4

# Let's try to access 3rd row and 1st column i.e. 5
new_a[2, 0]
# Output: 5

# Let's try to access the whole second row i.e. [3, 4]
new_a[1]
# Output: array([3, 4])

# Let's try to access the whole second column i.e. [2, 4, 6]
new_a[:, 1]
# Output: array([2, 4, 6])

It is all cool, right? Every example is self-explanatory except the last one. What is the : we passed in the index? Why did it not throw a syntax error? Well, let's take a look at that as well.

Accessing np_array or lists via indexes

Items can be accessed by referring to their index number. Indexes start at 0, so index 1 is the second row.

a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(a[1])

# Output: [3 4]

Negative Indexing

a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(a[-1])

# Output: [5 6]

Range of Indexes

a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
print(a[1 : 2])

# Output: [[3 4]]

Note: The slice starts at index 1 (included) and ends at index 2 (not included).

If you leave out the start value, the range starts at the first item. If you leave out the end value, the range runs through the last item.
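A short sketch of the omitted start and end values on the same array (the variable names are just for illustration):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

first_two = a[:2]   # start omitted: rows 0 and 1
from_one = a[1:]    # end omitted: rows 1 and 2
everything = a[:]   # both omitted: all rows

print(first_two.tolist())   # [[1, 2], [3, 4]]
print(from_one.tolist())    # [[3, 4], [5, 6]]
print(everything.tolist())  # [[1, 2], [3, 4], [5, 6]]
```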

Range of Negative Indexes

a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])
a[-3:-1]

# Output: array([[1, 2],
#                [3, 4]])

Note: The slice starts at index -3 (included) and ends at index -1 (not included).

So, in a nutshell, start:end means that we want to access a range of indexes from start to end, where the start index is inclusive and the end index is exclusive.

Python - Access List Items

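Coming back to our original question: what does new_a[:, 1] mean in the example above?

A bare : with no start or end selects the whole range along that axis. So the first index, :, means "all rows", and the second index, 1, picks the column at index 1. In other words, the snippet accesses the second column of every row, which is [2, 4, 6].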
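To make this concrete, here is a small sketch (the names col, col_explicit and nested are mine). It shows that : is just shorthand for the full range, and that new_a[:, 1] is valid Python syntax that NumPy arrays understand, while a plain nested list rejects it at run time with a TypeError.

```python
import numpy as np

new_a = np.array([[1, 2], [3, 4], [5, 6]])

# ":" alone is a full slice, so these two expressions select the same elements
col = new_a[:, 1]
col_explicit = new_a[0:3, 1]
print(col.tolist(), col_explicit.tolist())  # [2, 4, 6] [2, 4, 6]

# The same index on a nested Python list is valid syntax but fails at run time
nested = new_a.tolist()
try:
    nested[:, 1]
except TypeError as exc:
    print("TypeError:", exc)
```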

Bonus

There is a difference between a[start:end] and a[row_start:row_end, column_start:column_end].
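A quick sketch of that difference on a 3 x 3 array chosen only for illustration: a[start:end] slices rows and keeps every column, while a[row_start:row_end, column_start:column_end] slices rows and columns at the same time.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

rows_only = a[0:2]     # rows 0 and 1, every column
block = a[0:2, 1:3]    # rows 0 and 1, columns 1 and 2

print(rows_only.tolist())  # [[1, 2, 3], [4, 5, 6]]
print(block.tolist())      # [[2, 3], [5, 6]]
```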

Guess what? We have a video on the same topic. Do check it out.

Here is the part 2

What is NumPy? Why and How to use it? Part 2
This post is a continuation of the previous post about NumPy introduction. If you have not read that, then it is advised to kindly go read them first.