A Quick Guide to NumPy Functions

NumPy, short for Numerical Python, is like a superhero cape for Python when dealing with numbers. It is a powerful library designed to help you work with numerical data efficiently. Whether you’re dealing with structured or unstructured data (e.g., image data), NumPy has your back. Most importantly, it is easy to use and plays well with other data science tools. So whether you’re a beginner or a seasoned data pro, NumPy is your go-to ally in the world of data science. In this article, we will do our best to give you a quick guide to NumPy functions.

Introduction

In general, the core of NumPy is all about arrays, which act as containers for numbers. Unlike regular Python lists or tuples, the `numpy.ndarray` data structure that NumPy provides lets you efficiently store and manipulate large arrays of data. The name `ndarray` is short for N-dimensional array, meaning these arrays can have one or more dimensions, making them suitable for various types of data.
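To make the idea of an N-dimensional array more concrete, here is a minimal sketch that inspects a small array. The variable name `data` and its values are made up for illustration; `ndim`, `shape` and `dtype` are standard `ndarray` attributes.

```python
import numpy as np

# A hypothetical 2D dataset with 2 rows and 3 columns
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

n_dims = data.ndim       # 2, the number of dimensions
dims = data.shape        # (2, 3), the size along each dimension
element_type = data.dtype  # float64, the type shared by all elements
```

Every element of an `ndarray` shares the same data type, which is one reason NumPy can store and process large arrays compactly.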

Imagine that we have a dataset in 2D, 3D or any other number of dimensions, and we want to manipulate the data to meet our needs. Instead of writing the calculations ourselves, we can use the broad range of mathematical operations that NumPy provides, from basic arithmetic (addition, subtraction, multiplication or division) to more complex data analysis such as statistics or linear algebra.
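As a rough illustration of that range, the sketch below applies a few of these operations to two small, made-up matrices `a` and `b`. It is only a sample of what is available, not a complete tour.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[10.0, 20.0],
              [30.0, 40.0]])

# Basic arithmetic works element by element
total = a + b                # [[11, 22], [33, 44]]
scaled = a * 2               # [[2, 4], [6, 8]]

# Statistics
mean_all = a.mean()          # 2.5
col_means = a.mean(axis=0)   # [2.0, 3.0], one mean per column

# Linear algebra
product = a @ b              # matrix product: [[70, 100], [150, 220]]
det = np.linalg.det(a)       # approximately -2.0 (floating-point result)
```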

NumPy Fundamentals

1. Create NumPy Array

Convert existing data into NumPy Array

Let’s get started with the simplest and most used method, which is converting existing data into a NumPy array. If we have a Python list, we can convert it by simply using the `np.array()` function.

# convert a Python list to a NumPy array
import numpy as np

python_list = [1, 2, 3, 4, 5]
numpy_array = np.array(python_list)  # 1D array: array([1, 2, 3, 4, 5])

In the example above, I’ve only shown how to convert a 1D Python list to a NumPy array, but the same method works for nested lists of any dimension.
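For instance, a nested list can be converted the same way. This is a small sketch with a made-up `nested_list`, and it assumes all inner lists have the same length so that the result is a regular 2D array.

```python
import numpy as np

# Each inner list becomes one row of the 2D array
nested_list = [[1, 2, 3],
               [4, 5, 6]]
matrix = np.array(nested_list)

matrix_shape = matrix.shape  # (2, 3): 2 rows, 3 columns
matrix_ndim = matrix.ndim    # 2
```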

What if we don’t have a Python list, or we want to create a large matrix from scratch? NumPy also provides built-in functions for array generation. These functions include:

1D array creation functions
  • np.arange()
  • np.linspace()

# create an array with evenly spaced values within a given interval
>>> np.arange(0, 5, 0.5)
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])


# create an array with evenly spaced numbers over a specified interval
>>> np.linspace(0, 5, 5)
array([0.  , 1.25, 2.5 , 3.75, 5.  ])

2D array creation functions
  • np.eye()
  • np.diag()

# create an identity matrix
>>> np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])


# create 2D array with given diagonal elements
>>> np.diag([1, 2, 3])
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

General ndarray creation functions
  • np.zeros()
  • np.ones()
  • np.empty()
  • np.random.random()
  • np.indices()

These functions take the desired array shape as a parameter, and np.zeros(), np.ones() and np.empty() also accept an optional data type. For example, np.zeros(2) returns a 1D array with 2 zeros. The default data type is float64, but you can change it with the dtype argument. Passing a tuple (a, b) instead of a single integer returns an a-by-b array. You can also try np.zeros((1, 2, 3)) and see what it returns (it should be a 3D array).

# create an array of given shape and type, filled with zeros
>>> np.zeros((2, 2))
array([[0., 0.],
       [0., 0.]])

# create an array of given shape and type, filled with ones
>>> np.ones((2, 3))
array([[1., 1., 1.],
       [1., 1., 1.]])

# create an array of given shape and type, without initializing entries
# (the values shown are arbitrary and will differ from run to run)
>>> np.empty([2, 2])
array([[ -9.74499359e+001,   6.69583040e-309],
       [  2.13182611e-314,   3.06959433e-309]])  

The arrays created above are all 2D, but you can create arrays of any dimension by changing the shape argument. For example, np.zeros(3) creates a 1D zero array and np.zeros((2, 3, 4)) creates a 3D one.
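The effect of the shape and dtype arguments can be checked with a short sketch like the one below. The variable names are illustrative, and `np.int32` is just one possible choice of data type.

```python
import numpy as np

zeros_1d = np.zeros(3)                        # shape (3,), default dtype float64
zeros_int = np.zeros((2, 2), dtype=np.int32)  # 2-by-2 array of integer zeros
ones_3d = np.ones((1, 2, 3))                  # 3D array with shape (1, 2, 3)

shape_1d = zeros_1d.shape     # (3,)
default_type = zeros_1d.dtype # float64
int_type = zeros_int.dtype    # int32
dims_3d = ones_3d.ndim        # 3
```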

2. Indexing in NumPy

When playing with data, we often only want to access specific elements or subsets of data. NumPy offers various indexing techniques to work with arrays efficiently. Here are some common indexing methods:

Basic Indexing

NumPy arrays can be indexed using square brackets and integer indices, similar to Python lists. The indexing starts from 0.

import numpy as np

arr = np.array([0, 1, 2, 3, 4])

# Accessing a single element
element = arr[2]  # Returns 2

# Slicing to get a subset of elements
subset = arr[1:4]  # Returns [1, 2, 3]

Slicing with Steps

You can use slicing with step values to access elements at regular intervals.

import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Slicing with a step of 2
subset = arr[1:8:2]  # Returns [1, 3, 5, 7]

Multidimensional Indexing

NumPy supports indexing in multiple dimensions for multidimensional arrays (e.g., matrices). You can use comma-separated indices or separate brackets for each dimension.

import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Accessing a single element
element = matrix[1, 2]  # Returns 6

# Slicing rows and columns
row = matrix[0]  # Returns [1, 2, 3]
col = matrix[:, 1]  # Returns [2, 5, 8]

Boolean Indexing

You can use boolean arrays to filter elements based on a condition. This is particularly useful for selecting elements that satisfy specific criteria.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Boolean indexing to select elements greater than 3
filtered = arr[arr > 3]  # Returns [4, 5]

Fancy Indexing

NumPy allows you to use arrays of integers or other sequences as indices to access or modify multiple elements simultaneously.

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Using an array of indices
indices = np.array([0, 2, 4])
selected = arr[indices]  # Returns [10, 30, 50]

Integer Array Indexing

For multidimensional arrays, integer array indexing takes one index array per dimension and pairs the indices up element by element. This allows you to build arbitrary arrays using the data from another array.

import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6]])

# Integer array indexing: picks arr[0, 1] and arr[1, 0]
selected = arr[[0, 1], [1, 0]]  # Returns [2, 3]
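
The examples above only read values, but boolean and fancy indexing can also appear on the left-hand side of an assignment to modify several elements at once. The following is a minimal sketch with made-up values:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Boolean indexing: cap every value above 30 at 30
arr[arr > 30] = 30
# arr is now [10, 20, 30, 30, 30]

# Fancy indexing: set the elements at positions 0 and 2 to zero
arr[[0, 2]] = 0
# arr is now [0, 20, 0, 30, 30]
```

Both assignments change `arr` in place rather than creating a new array.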

Conclusion

NumPy is a powerful numerical computing library in Python that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them. Understanding how NumPy works also helps us understand other array-like data structures, e.g., tensors.