Numpy#

NumPy is a library that adds multidimensional array operations to Python. This page covers its key concepts.

import numpy as np
from IPython.display import HTML
header_template = "<text style='font-size:20px'>{}</text>"

Axis#

NumPy arrays can be nested. Each level of nesting is called an axis, and axes are numbered from 0, starting at the outermost level and ending at the innermost.

Description#

The following cell creates an array made of 3 two-dimensional arrays, each containing 2 one-dimensional arrays of 4 elements:

arr = np.array([
    [
        [1,2,3,4],
        [3,3,2,1]
    ],
    [
        [3,2,1,3],
        [5,3,2,1]
    ],
    [
        [3,2,1,3],
        [5,3,2,1]
    ]
])
arr.shape
(3, 2, 4)

So each position in the shape corresponds to an axis:

  • axis 0 indexes the three two-dimensional arrays (length 3);

  • axis 1 indexes the two one-dimensional arrays inside each of them (length 2);

  • axis 2 indexes the four elements inside each one-dimensional array (length 4).
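As a small sketch of how the axes map onto indexing (the variable names here are only illustrative), fixing an index on axis 0 selects one two-dimensional array, fixing indexes on axes 0 and 1 selects one one-dimensional array, and fixing all three selects a single element:

```python
import numpy as np

arr = np.array([
    [[1, 2, 3, 4], [3, 3, 2, 1]],
    [[3, 2, 1, 3], [5, 3, 2, 1]],
    [[3, 2, 1, 3], [5, 3, 2, 1]],
])

# The number of axes equals the depth of nesting.
n_axes = arr.ndim

# Fixing an index on axis 0 leaves axes 1 and 2.
block = arr[0]

# Fixing indexes on axes 0 and 1 leaves only axis 2.
row = arr[0, 1]

# Fixing indexes on all three axes gives a single element.
element = arr[0, 1, 2]

print(n_axes, block.shape, row.shape, element)
```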

Aggregations#

Many NumPy functions aggregate values along an axis, and they usually take an axis argument. We use the aggregation function numpy.mean as an example.
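To illustrate that the axis argument is not specific to numpy.mean, here is a minimal sketch using np.sum, np.max and np.min on the same array. The choice of these three functions and of axis=2 is just for illustration.

```python
import numpy as np

arr = np.array([
    [[1, 2, 3, 4], [3, 3, 2, 1]],
    [[3, 2, 1, 3], [5, 3, 2, 1]],
    [[3, 2, 1, 3], [5, 3, 2, 1]],
])

# Each call collapses axis 2 (the innermost one), so every
# one-dimensional array of 4 elements becomes a single value.
sums = np.sum(arr, axis=2)
maxima = np.max(arr, axis=2)
minima = np.min(arr, axis=2)

print(sums)
print(maxima)
print(minima)
```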

The following cell applies np.mean along all possible axes.

arr = np.array([
    [
        [1,2,3,4],
        [3,3,2,1]
    ],
    [
        [3,2,1,3],
        [5,3,2,1]
    ],
    [
        [3,2,1,3],
        [5,3,2,1]
    ]
])

display(HTML(header_template.format("axis=0")))
display(np.mean(arr, axis=0))
display(HTML(header_template.format("axis=1")))
display(np.mean(arr, axis=1))
display(HTML(header_template.format("axis=2")))
display(np.mean(arr, axis=2))
axis=0
array([[2.33333333, 2.        , 1.66666667, 3.33333333],
       [4.33333333, 3.        , 2.        , 1.        ]])
axis=1
array([[2. , 2.5, 2.5, 2.5],
       [4. , 2.5, 1.5, 2. ],
       [4. , 2.5, 1.5, 2. ]])
axis=2
array([[2.5 , 2.25],
       [2.25, 2.75],
       [2.25, 2.75]])

When aggregating along an axis with np.mean, the function is applied to each group of elements that share the same position on all other axes and differ only in their position along the specified axis. Each such group is reduced to a single value, so the specified axis disappears and the result has one dimension fewer than the input. So for:

  • axis=0 the output shape is (2, 4) (axis 0 is removed).

  • axis=1 the output shape is (3, 4) (axis 1 is removed).

  • axis=2 the output shape is (3, 2) (axis 2 is removed).
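The rule above can be checked by hand. The following sketch recomputes one entry of the axis=0 result manually and collects the output shape for every axis. It also shows the keepdims argument of np.mean, which keeps the reduced axis with length 1 instead of dropping it. The variable names are illustrative.

```python
import numpy as np

arr = np.array([
    [[1, 2, 3, 4], [3, 3, 2, 1]],
    [[3, 2, 1, 3], [5, 3, 2, 1]],
    [[3, 2, 1, 3], [5, 3, 2, 1]],
])

mean0 = np.mean(arr, axis=0)

# Position (0, 0) of the result averages the elements that share
# position (0, 0) on axes 1 and 2 and vary along axis 0.
manual = (arr[0, 0, 0] + arr[1, 0, 0] + arr[2, 0, 0]) / 3
print(mean0[0, 0], manual)

# Output shape for each axis: the aggregated axis is dropped.
shapes = [np.mean(arr, axis=axis).shape for axis in range(arr.ndim)]
print(shapes)

# keepdims=True keeps the aggregated axis with length 1.
kept = np.mean(arr, axis=0, keepdims=True)
print(kept.shape)
```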

From function#

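You can define a custom rule for creating a NumPy array with np.fromfunction. Its first argument is a function that receives the indexes along each axis (for a two-dimensional shape, the row indexes and the column indexes) and returns the elements, which typically depend on those indexes. The second argument is the shape of the resulting array. The indexes are passed as floating-point arrays by default, which is why the result contains floats.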


The following cell generates a sequential array using this method. The row index is multiplied by the number of columns (4) so that each row continues where the previous one ended.

np.fromfunction(lambda i, j: i*4 + j, (4,4))
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])
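One detail that a short sketch may make clearer: np.fromfunction calls the function once with whole index arrays, one per axis, rather than once per element. Passing dtype=int makes those index arrays integer instead of the default float. The helper name rule and the multiplier 10 below are hypothetical, chosen only for illustration.

```python
import numpy as np

def rule(i, j):
    # i and j are index arrays with the full output shape (2, 3),
    # so this expression is evaluated element-wise in one call.
    return i * 10 + j

# dtype=int sets the type of the index arrays passed to rule.
result = np.fromfunction(rule, (2, 3), dtype=int)

# The same arrays can be built explicitly with np.indices.
i, j = np.indices((2, 3))
same = rule(i, j)

print(result)
print(np.array_equal(result, same))
```

Because the function receives arrays, it should be written with element-wise operations. A function that returns a plain scalar regardless of its arguments would not produce an array of the requested shape.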