What the f*ck is Numpy? - A Complete Tutorial
NumPy is a library that can be used to perform a wide variety of mathematical operations on arrays.
Table of contents
- 01 . Why do I even care about Numpy?
- 02 . Hello World
- 03 . Dimensions in an Array
- 04 . Accessing Array Elements
- 05 . Slicing an Array
- 06 . Numpy Methods
- 07 . Numpy Random
- 08 . Concatenating Arrays
- 09 . Mathematics of Arrays
- 10 . Not a Number
- 11 . Where?
- 12 . Fancy Indexing
- 13 . Sorting Arrays
NumPy? It's like the superhero of Python libraries for handling arrays and more. Besides arrays, it's got your back for diving into stuff like linear algebra, Fourier transforms, and matrices.
This cool tool was born in 2005, thanks to Travis Oliphant. The best part? It's open source, and free for you to dive into. And yeah, NumPy? Short for Numerical Python.
01 . Why do I even care about Numpy?
Okay, so Python's got lists that act like arrays, but let's be real, they're kinda sluggish. Enter NumPy - the speed demon of arrays. It can be up to 50 times faster than those basic Python lists, depending on the operation. In NumPy land, they've got this cool thing called ndarray (short for N-dimensional array, pronounced 'N-D array'). It's packed with handy functions that make working with arrays a breeze. And hey, in the world of data science, arrays are the real MVPs. Speed and saving resources? Super crucial there.
Okay, so why are Numpy arrays fast? They chill together in one memory spot, unlike lists that scatter all over. This setup makes them super easy to access and tweak quickly. Computer science folks call this vibe 'locality of reference.' This is why NumPy outpaces lists in speed. Plus, it's fine-tuned to vibe with the latest CPU setups, giving it that extra speed boost.
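To get a rough feel for the difference, here's a minimal sketch that times the same job (squaring a million numbers) with a plain list and with a NumPy array. The array size and the repeat count are arbitrary choices of mine, and the actual speedup depends on your machine and on the operation, so treat the timings as illustrative rather than as a benchmark.

```python
import timeit

import numpy as np

n = 1_000_000  # arbitrary size, chosen only for illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Square every element: a Python-level loop vs. one vectorized NumPy operation
list_time = timeit.timeit(lambda: [x * x for x in py_list], number=5)
numpy_time = timeit.timeit(lambda: np_arr * np_arr, number=5)

print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")

# Both approaches compute the same values (checked on the first 10 elements)
same_result = [x * x for x in py_list[:10]] == (np_arr[:10] * np_arr[:10]).tolist()
print(same_result)
```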
The source code for NumPy is located at this github repository:
https://github.com/numpy/numpy
02 . Hello World
To start with NumPy, open your Python IDE and import the NumPy module.
import numpy
Now NumPy is imported and ready to use. NumPy is usually imported under the np alias.
import numpy as np
NumPy is used to work with arrays. The array object in NumPy is called ndarray. We can create a NumPy ndarray object by using the array() function.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
The output of this is:
[1 2 3 4 5]
type() is a built-in Python function that tells us the type of the object passed to it.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(type(arr))
The output of this would be:
<class 'numpy.ndarray'>
03 . Dimensions in an Array
Talking about dimensions, there can be 0-D arrays, 1-D arrays, 2-D arrays, 3-D arrays, and so on... However, we will be talking only about 0-D, 1-D, 2-D and 3-D Arrays.
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.
import numpy as np
arr = np.array(42)
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
An array that has 1-D arrays as its elements is called a 2-D array.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
An array that has 2-D arrays (matrices) as its elements is called 3-D array. These are often used to represent a 3rd order tensor.
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
NumPy arrays provide the ndim attribute, an integer that tells us how many dimensions the array has.
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
The output of this would be:
0
1
2
3
04 . Accessing Array Elements
You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0, meaning that the first element has an index of 0, the second has an index of 1, etc.
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[0])
The output of this would be 1.
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[1])
The output of this would be 2.
To access elements from 2-D arrays we can use comma-separated integers representing the dimension and the index of the element.
Think of 2-D arrays like a table with rows and columns, where the dimension represents the row and the index represents the column.
import numpy as np
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('2nd element on 1st row: ', arr[0, 1])
The output of this would be:
2nd element on 1st row:  2
import numpy as np
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('5th element on 2nd row: ', arr[1, 4])
The output of this would be:
5th element on 2nd row:  10
Use negative indexing to access an array from the end.
import numpy as np
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('Last element from 2nd dim: ', arr[1, -1])
The output of this would be:
Last element from 2nd dim:  10
05 . Slicing an Array
Slicing in Python means taking elements from one given index to another given index. We pass a slice instead of an index like this: [start:end]. We can also define the step, like this: [start:end:step].
If we don't pass the start, it's considered 0. If we don't pass the end, it's considered the length of the array in that dimension. If we don't pass the step, it's considered 1.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])
The output of this would be:
[2 3 4 5]
Note that the result includes the element at the start index but excludes the element at the end index, so this slice runs from index 1 up to index 4.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[4:])
This will slice elements from index 4 to the end of the array.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[:4])
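This will slice elements from the beginning to index 4 (not included). We can have negative slicing as well, using negative index values. If the step value is negative, the elements come out in reverse order. In the example below, the start index 7 is past the last valid index (6), so NumPy simply starts from the last element.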
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[7:2:-1])
The output of this is:
[7 6 5 4]
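The step and negative indices are easier to see side by side. Here's a small sketch using the same seven-element array; the particular index values are just ones I picked for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# A positive step of 2 takes every second element between index 1 and index 6
print(arr[1:6:2])   # [2 4 6]

# Negative indices count from the end: -3 is the third-last element
print(arr[-3:-1])   # [5 6]

# A negative step with no start or end walks the whole array backwards
print(arr[::-1])    # [7 6 5 4 3 2 1]
```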
We can slice 2D Arrays in the same way.
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])
The output of this would be:
[7 8 9]
Let us take another example.
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 1:4])
The output of this would be:
[[2 3 4]
 [7 8 9]]
06 . Numpy Methods
Instead of coding everything ourselves, we can use NumPy functions to simplify our tasks. The following is a list of common NumPy methods, which we shall be overviewing.
| Type of Operation | Methods |
| --- | --- |
| Array Creation Functions | np.array(), np.zeros(), np.ones(), np.empty(), np.arange(), np.full() |
| Array Manipulation Functions | np.reshape(), np.transpose() |
| Array Mathematical Functions | np.add(), np.subtract(), np.sqrt(), np.power() |
| Array Statistical Functions | np.median(), np.mean(), np.std(), np.var() |
| Array Input and Output Functions | np.save(), np.load(), np.loadtxt() |
06.01. Array Creation Functions
We have already seen np.array(). It takes in a list and converts it into an ndarray. np.zeros() takes in a number or a tuple describing the shape. If we give it a number, say 5, it will create a 1-D array of 5 zeros. If we give it a tuple, say (3,4), it will create a 2-D array of zeros with 3 rows and 4 columns (a longer tuple gives more dimensions). np.ones() works the same way, with the only difference being that instead of zeros, it creates an array of ones.
import numpy as np
# create an array filled with zeros using np.zeros()
array1 = np.zeros(5)
print("\nnp.zeros():\n", array1)
# create an array filled with ones using np.ones()
array2 = np.ones((2, 4))
print("\nnp.ones():\n", array2)
This will give the following output:
np.zeros():
 [0. 0. 0. 0. 0.]
np.ones():
 [[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
Now let's talk about np.arange(). It takes in a start value, a stop value, and a step value, similar to what we have seen during slicing. It then creates an array and populates it with values in that range (the stop value itself is not included).
import numpy as np
array1 = np.arange(5,10,3)
print("\nnp.arange():\n", array1)
The output of this would be:
np.arange():
 [5 8]
Now suppose that we have to create an array that is full of any number we want. Then we can use np.full(). The first argument it takes in is the same as np.zeros() or np.ones() and the second argument is the number we want to fill in.
import numpy as np
array1 = np.full((5,3),5)
print("\nnp.full():\n", array1)
The output of this would be:
np.full():
 [[5 5 5]
 [5 5 5]
 [5 5 5]
 [5 5 5]
 [5 5 5]]
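The table above also lists np.empty(), which doesn't get its own walkthrough in this section. It takes the same kind of shape argument as np.zeros(), but it skips filling in the values, so whatever you see inside is leftover memory content. A minimal sketch, where only the shape and dtype are predictable:

```python
import numpy as np

# np.empty() allocates the array without initializing it,
# so the contents are arbitrary and can differ from run to run
array1 = np.empty((2, 3))

print(array1.shape)   # (2, 3)
print(array1.dtype)   # float64 is the default dtype
```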
06.02. Array Manipulation Functions
First of all, we will talk about np.reshape(). NumPy array reshaping simply means changing the shape of an array without changing its data. Let's say we have a 1D array.
np.array([1, 3, 5, 7, 2, 4, 6, 8])
We can reshape this 1D array into an N-d array as
# reshape 1D into 2D array
# with 2 rows and 4 columns
[[1 3 5 7]
 [2 4 6 8]]
# reshape 1D into 3D array
# with 2 rows, 2 columns, and 2 layers
[[[1 3]
  [5 7]]
 [[2 4]
  [6 8]]]
Here, we can see that the 1D array has been reshaped into 2D and 3D arrays without altering its data.
The syntax of NumPy array reshaping is
np.reshape(array, newshape)
Here,
- array - input array that needs to be reshaped
- newshape - desired new shape of the array. This is an int or a tuple of ints.
We use the reshape() function to reshape a 1D array into a 2D array. For example,
import numpy as np
array1 = np.array([1, 3, 5, 7, 2, 4, 6, 8])
# reshape a 1D array into a 2D array
# with 2 rows and 4 columns
result = np.reshape(array1, (2, 4))
print(result)
The output of this would be:
[[1 3 5 7]
 [2 4 6 8]]
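The 3D case described earlier works the same way. Here's a short sketch that reshapes the same eight-element array into shape (2, 2, 2); the only requirement is that the new shape holds the same total number of elements.

```python
import numpy as np

array1 = np.array([1, 3, 5, 7, 2, 4, 6, 8])

# reshape the 1D array into a 3D array of shape (2, 2, 2);
# 2 * 2 * 2 = 8, which matches the number of elements
result = np.reshape(array1, (2, 2, 2))

print(result)
print(result.ndim)   # 3
```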
We can also flatten an array using np.reshape(). Flattening an array simply means converting a multidimensional array into a 1D array. To flatten an N-d array to a 1-D array we can use reshape() and pass -1 as the shape argument.
import numpy as np
# flatten 2D array to 1D
array1 = np.array([[1, 3], [5, 7], [9, 11]])
result1 = np.reshape(array1, -1)
print("Flattened 2D array:", result1)
The output of this would be:
Flattened 2D array: [ 1  3  5  7  9 11]
Let's now talk about np.transpose(). The numpy.transpose() function comes up all the time in matrix work such as matrix multiplication. It reverses (or, if you ask it to, permutes) the dimensions of the given array and returns the result.
For a 2-D array, numpy.transpose() turns the rows into columns and the columns into rows. The original array is left as it is; the function hands back the transposed version.
import numpy as np
a=np.arange(6).reshape((2,3))
print(a)
b=np.transpose(a)
print(b)
The output of this would be:
[[0 1 2]
 [3 4 5]]
[[0 3]
 [1 4]
 [2 5]]
We will talk about statistical methods and mathematical methods, when we talk about Mathematics of Arrays.
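The table at the top of this section also lists np.save() and np.load(), which don't get their own walkthrough here. As a rough sketch, saving an array to disk and reading it back might look like this. The file name example.npy and the temporary directory are my own choices for illustration.

```python
import os
import tempfile

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Write to a temporary directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")  # hypothetical file name

    np.save(path, arr)       # store the array in NumPy's binary .npy format
    loaded = np.load(path)   # read it back as an ndarray

print(loaded)
print(np.array_equal(arr, loaded))   # True
```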
07 . Numpy Random
random is a module in the NumPy library. It contains functions for generating random numbers: some simple random data generation methods, some permutation and distribution functions, and random generator functions.
07.01. randn( )
This function of the random module returns samples from the "standard normal" distribution. It takes in the size of each dimension of the output; in the example below, the two values are the number of rows and the number of columns.
import numpy as np
a=np.random.randn(2,2)
print(a)
The output of this would be completely random (technically, pseudo-random, but let's not talk about it and confuse ourselves) and output on one device would differ from another. Let's look at a sample output.
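If you ever need the same "random" numbers twice (say, to reproduce a result), you can fix the seed before generating them. This is a small sketch using np.random.seed(); the seed value 42 is an arbitrary pick of mine.

```python
import numpy as np

# Fixing the seed makes the pseudo-random sequence repeatable
np.random.seed(42)   # 42 is an arbitrary choice
first = np.random.randn(2, 2)

np.random.seed(42)   # same seed again
second = np.random.randn(2, 2)

print(np.array_equal(first, second))   # True: same seed, same numbers
```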
07.02. randint( )
This function of the random module is used to generate random integers from low (inclusive) to high (exclusive).
import numpy as np
a=np.random.randint(1,3, (8,10))
print(a)
The output of this would be an 8x10 array in which every entry is randomly either 1 or 2.
07.03. random( )
This function of the random module is used to generate random float numbers in the half-open interval [0.0, 1.0).
import numpy as np
a=np.random.random((5,10))
print(a)
The output of this would be a 5x10 array of random floats, each one at least 0.0 and less than 1.0.
There are lots of other methods in numpy.random. However, we will leave those methods for another day. Perhaps I will write about those methods in some other article.
08 . Concatenating Arrays
Concatenating means putting the contents of two or more arrays in a single array. In SQL we join tables based on a key, whereas in NumPy we join arrays by axes. We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If the axis is not explicitly passed, it is taken as 0.
First of all, let us understand the difference between axis = 0 and axis = 1. For 2-D arrays, axis = 0 implies that we are joining the arrays along the rows, so the second array is stacked below the first.
axis = 1 implies that we are joining the arrays along the columns, so the second array is placed to the right of the first.
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
arr = np.concatenate((arr1, arr2), axis=0)
print(arr)
arr = np.concatenate((arr1, arr2), axis=1)
print(arr)
The output of this would be:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
09 . Mathematics of Arrays
Now that we are finally here - let's discuss the mathematics of arrays. If the shapes of the arrays are the same, then they can be added, subtracted, divided, and multiplied using arithmetic operators. In this case, operations are applied between corresponding elements in each array.
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print(arr1+arr2)
The output of this would be:
[[ 6  8]
 [10 12]]
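The other operators behave the same way. Here's a short sketch with the same two arrays, plus a couple of the function forms from the table in section 06.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Each operator works element by element
print(arr2 - arr1)   # same as np.subtract(arr2, arr1)
print(arr1 * arr2)   # element-wise product, not a matrix product
print(arr2 / arr1)   # true division, so the result holds floats

# Function forms from the table in section 06
print(np.power(arr1, 2))   # every element squared
print(np.sqrt(arr1))       # element-wise square root
```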
NumPy provides a set of standard trigonometric functions to calculate the trigonometric ratios (sine, cosine, tangent, etc.)
import numpy as np
# array of angles in radians
angles = np.array([0, 1, 2])
print("Angles:", angles)
# compute the sine of the angles
sine_values = np.sin(angles)
print("Sine values:", sine_values)
# compute the inverse sine; inputs outside [-1, 1] (like 2) give nan
inverse_sine = np.arcsin(angles)
print("Inverse Sine values:", inverse_sine)
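The output of this would be (NumPy also warns about an invalid value, because 2 is outside the range arcsin accepts):
Angles: [0 1 2]
Sine values: [0.         0.84147098 0.90929743]
Inverse Sine values: [0.         1.57079633        nan]
NumPy provides a wide range of arithmetic functions to perform on arrays. The only condition is that the arrays must have compatible shapes; the simplest case is two arrays of the same shape.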
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print(np.add(arr1,arr2))
This is essentially the same as arr1 + arr2. We can perform statistical operations on NumPy arrays as well. For instance, let's consider np.mean().
import numpy as np
arr1 = np.array([[1, 2], [3, 4]])
print(np.mean(arr1))
This shall give the mean of all values inside the array, i.e., 2.5. However, this function also has an optional parameter called axis. If axis is set to 0, the mean is calculated down each column (across the rows) and the results are returned in an array, here [2. 3.].
If the axis is set to 1, the mean is calculated along each row (across the columns) and the results are returned in an array, here [1.5 3.5].
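Here's a small sketch of the axis argument on the same 2x2 array, so you can check for yourself which direction each axis collapses.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

# axis=0 collapses the rows: one mean per column
print(np.mean(arr1, axis=0))   # [2. 3.]

# axis=1 collapses the columns: one mean per row
print(np.mean(arr1, axis=1))   # [1.5 3.5]
```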
Please note that we can write np.mean(arr,axis=1), as well as arr.mean(axis=1). Both of them imply the same thing. Similarly, we can write np.transpose(a), as well as a.transpose(). In both of these cases, neither form changes the original array. That is not always so, though: np.sort(arr) returns a sorted copy and leaves arr alone, while arr.sort() sorts arr in place.
Now I am not going to talk about every statistical function, as their names make their use cases obvious, and all of them are used in a similar way.
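Just to show the pattern once, here's a quick sketch of the other statistical functions from the table in section 06, applied to the same 2x2 array.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

print(np.median(arr1))   # middle value of the sorted data: 2.5
print(np.var(arr1))      # variance (population variance by default): 1.25
print(np.std(arr1))      # standard deviation, the square root of the variance
```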
10 . Not a Number
If you have a keen eye, you must have noticed in one of the examples, something known as nan was written. NaN stands for Not a Number. np.nan is a constant representing a missing or undefined numerical value in a NumPy array, and it has a float type. Please note that two nan values are never equal to each other. This means that if I evaluate the following, the result will be False.
np.nan == np.nan #Returns False
The question is: if two np.nan values are not equal, how do we programmatically find out where nan values are located? For that, we can use a function called np.isnan(). It returns an ndarray with True in places where nan was present and False in places where nan was not present.
import numpy as np
arr1 = np.array([[1, np.nan], [3, 4]])
print(np.isnan(arr1))
The output of this would be:
[[False  True]
 [False False]]
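That boolean array can also be used as a mask. Here's a minimal sketch that keeps only the values that are not NaN and counts how many NaN values there are; the name mask is just my label for it.

```python
import numpy as np

arr1 = np.array([[1, np.nan], [3, 4]])

mask = np.isnan(arr1)   # True where the value is NaN

# ~ flips the mask, so this keeps only the non-NaN values
print(arr1[~mask])      # [1. 3. 4.]

# True counts as 1, so the sum is the number of NaN values
print(np.sum(mask))     # 1
```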
11 . Where?
Just like isnan() finds the NaN values, is there something that finds values matching any condition we like? Well... we have a special function in NumPy called where. numpy.where() takes in a boolean condition and returns a tuple of index arrays (one per dimension) marking wherever the condition holds true.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x)
print(arr[x])
The output of this would be:
(array([3, 5, 6]),)
[4 4 4]
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 1)
print(x)
The output of this would be:
(array([0, 2, 4, 6]),)
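np.where() also has a three-argument form that goes a step further than finding indices: it builds a new array, picking from one value where the condition is True and from another where it is False. A small sketch on the same array (the "odd"/"even" labels are just for illustration):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# With three arguments, np.where picks from the second argument where the
# condition is True and from the third where it is False
labels = np.where(arr % 2 == 1, "odd", "even")
print(labels)

# The same idea with numbers: replace every odd value with 0
zeroed = np.where(arr % 2 == 1, 0, arr)
print(zeroed)   # [0 2 0 4 0 6 0 8]
```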
12 . Fancy Indexing
If you observed carefully, in the example above, we used an array to extract elements from another array. This is known as Fancy Indexing in NumPy.
import numpy as np
# create a numpy array
array1 = np.array([1, 2, 3, 4, 5, 6, 7, 8])
# select elements at index 1, 2, 5, 7
select_elements = array1[[1, 2, 5, 7]]
print(select_elements)
The output of this would be:
[2 3 6 8]
13 . Sorting Arrays
Sorting means putting elements in an ordered sequence.
An ordered sequence is any sequence that has an order corresponding to elements, like numeric or alphabetical, ascending or descending.
NumPy has a function called sort() that returns a sorted copy of a specified array.
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
The output of this would be:
[0 1 2 3]
To sort an array in descending order, we can just reverse this array using slicing.
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr)[::-1])
The output of this would be:
[3 2 1 0]
We are now done with the basics of NumPy. There are hundreds more operations in NumPy. For instance, we can use np.dot(arr1,arr2) to find the dot product of arr1 and arr2, or we can use np.cross(arr1,arr2) to find the cross product of the two arrays. However, we need not study each and every one of the methods. We can always take help from Google.
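As a parting example, here's a quick sketch of those two functions on a pair of 3-element vectors. The values are ones I made up so the arithmetic is easy to follow by hand.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Dot product: 1*4 + 2*5 + 3*6
print(np.dot(arr1, arr2))     # 32

# Cross product of two 3-element vectors
print(np.cross(arr1, arr2))   # [-3  6 -3]
```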