Lets import NumPy
Well follow convention and use the name np
.
The crown jewel of NumPy is the ndarray
. The ndarray is a homogeneous n-dimensional array object. What does that mean?
A Python List or a Pandas DataFrame can contain a mix of strings, numbers, or objects (i.e., a mix of different types). Homogenous means all the data have to have the same data type, for example all floating-point numbers.
And n-dimensional means that we can work with everything from a single column (1-dimensional) to the matrix (2-dimensional) to a bunch of matrices stacked on top of each other (n-dimensional).
Lets create a 1-dimensional array (i.e., a vector)
my_array = np.array([1.1, 9.2, 8.1, 4.7])
We can see my_array
is 1 dimensional by looking at its shape
my_array.shape
We access an element in a ndarray similar to how we work with a Python List, namely by that element's index:
my_array[2]
Lets check the dimensions of my_array with the ndim
attribute:
my_array.ndim
Now, lets create a 2-dimensional array (i.e., a matrix)
array_2d = np.array([[1, 2, 3, 9], [5, 6, 7, 8]])
Note we have two pairs of square brackets. This array has 2 rows and 4 columns. NumPy refers to the dimensions as axes, so the first axis has length 2 and the second axis has length 4.
print(f'array_2d has {array_2d.ndim} dimensions') print(f'Its shape is {array_2d.shape}') print(f'It has {array_2d.shape[0]} rows and {array_2d.shape[1]} columns') print(array_2d)
Again, you can access a particular row or a particular value with the square bracket notation. To access a particular value, you have to provide an index for each dimension. We have two dimensions, so we need to provide an index for the row and for the column. Heres how to access the 3rd value in the 2nd row:
array_2d[1,2]
To access an entire row and all the values therein, you can use the :
operator just like you would do with a Python List. Heres the entire first row:
array_2d[0, :]
An array of 3 dimensions (or higher) is often referred to as a tensor. Yes, thats also where Tensorflow, the popular machine learning tool, gets its name. A tensor simply refers to an n-dimensional array. Using what you've learned about 1- and 2-dimensional arrays, can you apply the same techniques to tackle a more complex array?
Challenge
How many dimensions does the array below have?
What is its shape (i.e., how many elements are along each axis)?
Try to access the value 18
in the last line of code.
Try to retrieve a 1-dimensional vector with the values [97, 0, 27, 18]
Try to retrieve a (3,2) matrix with the values [[ 0, 4], [ 7, 5], [ 5, 97]]
mystery_array = np.array([[[0, 1, 2, 3], [4, 5, 6, 7]], [[7, 86, 6, 98], [5, 1, 0, 4]], [[5, 36, 32, 48], [97, 0, 27, 18]]])
.
.
..
...
..
.
.
Solution: Working with Higher Dimensions
This is really where we have to start to wrap our heads around how ndarrays work because it takes some getting used to the notation.
The ndim
and shape
attributes show us the number of dimensions and the length of the axes respectively.
print(f'We have {mystery_array.ndim} dimensions') print(f'The shape is {mystery_array.shape}')
The shape is (3, 2, 4), so we have 3 elements along axis #0, 2 elements along axis #1 and 4 elements along axis #3.
To access the value 18
we, therefore, have to provide three different indices - one for each axis. As such, we locate the number at index 2 for the first axis, index number 1 for the second axis, and index number 3 for the third axis.
mystery_array[2, 1, 3]
The values [97, 0, 27, 18] live on the 3rd axis and are on position 2 for the first axis and position 1 on the second axis. Hence we can retrieve them like so:
mystery_array[2, 1, :]
Finally, to retrieve all the first elements on the third axis, we can use the colon operator for the other two dimensions.
mystery_array[:, :, 0]
With the square brackets serving as your guide, the ndarray is quite difficult to visualise for 3 or more dimensions. So if any of this was unclear or confusing. Pause on this lesson for a minute and play around with the array above. Try selecting different subsets from the array. That way you can get comfortable thinking along the different dimensions of the ndarray.