Friday 24 March 2023

NumPy in Python

Introduction

NumPy, short for "Numerical Python", is an open-source Python library used for scientific computing and data analysis. It provides support for large, multi-dimensional arrays and matrices, as well as a large collection of mathematical functions to operate on these arrays. NumPy is an essential tool for any Python programmer working with data, and it is widely used in fields such as data science, machine learning, and scientific computing.

NumPy grew out of an earlier library, Numeric, which Jim Hugunin and others developed in 1995. In 2005 Travis Oliphant created NumPy by incorporating features of the competing Numarray library into Numeric, and NumPy 1.0 was released in 2006. NumPy is now maintained by a team of developers and has a large and active community of users.

One of the main advantages of NumPy is its ability to handle large datasets efficiently. The library is designed to work with arrays of any dimensionality, and provides efficient algorithms for manipulating these arrays. NumPy also provides a range of mathematical functions that operate on these arrays, making it an ideal tool for scientific computing.

Installing NumPy

Before we begin, we need to make sure that NumPy is installed on our system. To install NumPy, we can use the following command in the terminal:

pip install numpy
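
To confirm the installation worked, a quick check like the following can be run in a Python session. It is only a sanity check and assumes the pip command above finished without errors.

```python
import numpy as np

# If the import succeeds, NumPy is installed; print which version we got
print(np.__version__)
```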

Creating NumPy Arrays

The core of NumPy is the ndarray, or n-dimensional array. An ndarray is a collection of elements of the same type that can be accessed and manipulated efficiently. NumPy provides several functions for creating ndarrays, including:

  1. numpy.array(): Convert Python lists or tuples to ndarrays.
  2. numpy.zeros(): Create an array filled with zeros.
  3. numpy.ones(): Create an array filled with ones.
  4. numpy.full(): Create an array filled with a specific value.
  5. numpy.arange(): Create an array with a range of values.
  6. numpy.linspace(): Create an array with a specified number of evenly spaced values.

import numpy as np

# Create a one-dimensional array
a = np.array([1, 2, 3, 4, 5])
print(a)

# Create a two-dimensional array
b = np.array([[1, 2, 3], [4, 5, 6]])
print(b)

# Create an array filled with zeros
c = np.zeros((3, 4))
print(c)

# Create an array filled with ones
d = np.ones((2, 2))
print(d)

# Create an array filled with a specified value
e = np.full((3, 3), 7)
print(e)

# Create an array with a range of values
f = np.arange(0, 10, 2)
print(f)

# Create an array with evenly spaced values
g = np.linspace(0, 1, 5)
print(g)

# Create an array with random values between 0 and 1
h = np.random.rand(3, 3)
print(h)

# Create an array with random values from a normal distribution
i = np.random.randn(2, 2)
print(i)

# OUTPUT (the two random arrays will differ on each run)
# [1 2 3 4 5]
# [[1 2 3]
#  [4 5 6]]
# [[0. 0. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 0.]]
# [[1. 1.]
#  [1. 1.]]
# [[7 7 7]
#  [7 7 7]
#  [7 7 7]]
# [0 2 4 6 8]
# [0.   0.25 0.5  0.75 1.  ]
# [[0.13732157 0.38519747 0.94311832]
#  [0.20336453 0.18802563 0.83540154]
#  [0.31365463 0.77575852 0.75865154]]
# [[-0.83527489  0.41275766]
#  [-0.0156634  -0.22911079]]
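
Since every element of an ndarray shares one type, it can help to inspect an array's basic attributes after creating it. The short sketch below is illustrative only; the variable names are made up for this example.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.shape)   # (2, 3): two rows, three columns
print(a.ndim)    # 2: number of dimensions
print(a.size)    # 6: total number of elements
print(a.dtype)   # an integer type; the exact width depends on the platform

# Mixing integers and floats upcasts every element to one float type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```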



Manipulating NumPy Arrays:

Once an ndarray is created, it can be manipulated using a variety of functions. NumPy provides several functions for manipulating arrays, including:

  1. Indexing and Slicing: Accessing elements or a subset of an array.
  2. Reshaping: Changing the shape of an array.
  3. Concatenation: Joining multiple arrays together.
  4. Splitting: Dividing an array into multiple smaller arrays.

import numpy as np

# Create an array
a = np.array([[1, 2], [3, 4]])

# Reshape the array
b = a.reshape(1, 4)
print(b)

# Transpose the array
c = a.transpose()
print(c)

# Flatten the array
d = a.flatten()
print(d)

# Concatenate two arrays along the rows
e = np.array([[5, 6]])
f = np.concatenate((a, e), axis=0)
print(f)

# Split an array along the rows into two parts
g = np.array_split(f, 2, axis=0)
print(g)

# Sort an array
h = np.array([3, 1, 4, 2, 5])
i = np.sort(h)
print(i)

# Filter an array with a boolean condition
j = np.array([1, 2, 3, 4, 5])
k = j[j > 3]
print(k)

# Perform element-wise multiplication
l = np.array([[1, 2], [3, 4]])
m = np.array([[5, 6], [7, 8]])
n = l * m
print(n)

# Perform matrix multiplication
o = np.dot(l, m)
print(o)

# OUTPUT
# [[1 2 3 4]]
# [[1 3]
#  [2 4]]
# [1 2 3 4]
# [[1 2]
#  [3 4]
#  [5 6]]
# [array([[1, 2],
#        [3, 4]]), array([[5, 6]])]
# [1 2 3 4 5]
# [4 5]
# [[ 5 12]
#  [21 32]]
# [[19 22]
#  [43 50]]
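
The list above mentions indexing and slicing, which the program does not show. The following is a minimal sketch of the most common forms; the array contents are arbitrary values chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

print(a[0, 1])      # single element at row 0, column 1 -> 2
print(a[1])         # the whole second row -> [5 6 7 8]
print(a[:, 2])      # the whole third column -> [ 3  7 11]
print(a[:2, 1:3])   # first two rows, columns 1 and 2 -> rows [2 3] and [6 7]

# Basic slices are views: writing through one changes the original array
view = a[:2, 1:3]
view[0, 0] = 99
print(a[0, 1])      # 99
```

Because slices are views rather than copies, call .copy() on a slice when the original array must stay unchanged.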

Mathematical Operations on NumPy Arrays:

NumPy provides a vast number of mathematical functions that operate on ndarrays. These functions include:

Basic Operations:

Addition, subtraction, multiplication, division, and exponentiation.
import numpy as np

# Create two arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Addition
c = a + b
print(c)

# Subtraction
d = b - a
print(d)

# Multiplication
e = a * b
print(e)

# Division
f = b / a
print(f)

# Exponentiation
g = np.array([[2, 3], [4, 5]])
h = np.array([[3, 2], [1, 4]])
i = np.power(g, h)
print(i)

# OUTPUT
# [[ 6  8]
#  [10 12]]
 
# [[4 4]
#  [4 4]]
 
# [[ 5 12]
#  [21 32]]
 
# [[5.         3.        ]
#  [2.33333333 2.        ]]
 
# [[  8   9]
#  [  4 625]]
  

Aggregation Functions: 

Functions that compute a single value from an array, such as sum, mean, and standard deviation.
import numpy as np

# Create an array
a = np.array([[1, 2], [3, 4]])

# Sum
b = np.sum(a)
print("Sum:", b)

# Mean
c = np.mean(a)
print("Mean:", c)

# Standard deviation
d = np.std(a)
print("Standard Deviation:", d)

# Computing on a specific axis
e = np.array([[1, 2], [3, 4], [5, 6]])
print("Original Array:")
print(e)

# Sum of each column (axis=0 collapses the rows)
f = np.sum(e, axis=0)
print("Sum of each column:")
print(f)

# Mean of each row (axis=1 collapses the columns)
g = np.mean(e, axis=1)
print("Mean of each row:")
print(g)

# Standard deviation of each column
h = np.std(e, axis=0)
print("Standard deviation of each column:")
print(h)

#OUTPUT
Sum: 10
Mean: 2.5
Standard Deviation: 1.118033988749895
Original Array:
[[1 2]
 [3 4]
 [5 6]]
Sum of each column:
[ 9 12]
Mean of each row:
[1.5 3.5 5.5]
Standard deviation of each column:
[1.63299316 1.63299316]
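
The axis argument is easy to misread: it names the dimension that is collapsed, not the one that is kept. A small sketch that makes this visible through the result shapes (the variable names are illustrative only):

```python
import numpy as np

e = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

col_sums = e.sum(axis=0)   # collapses the 3 rows: one value per column
row_sums = e.sum(axis=1)   # collapses the 2 columns: one value per row

print(col_sums.shape, col_sums)   # (2,) [ 9 12]
print(row_sums.shape, row_sums)   # (3,) [ 3  7 11]

# keepdims=True keeps the collapsed axis with length 1
row_means = e.mean(axis=1, keepdims=True)
print(row_means.shape)   # (3, 1)
```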

Universal Functions: 

Mathematical functions that operate element-wise on an array, such as sin, cos, and exp.
import numpy as np

# Create an array
a = np.array([1, 2, 3, 4])

# Square root
b = np.sqrt(a)
print("Square root of", a, "is", b)

# Exponential
c = np.exp(a)
print("Exponential of", a, "is", c)

# Natural logarithm (base e)
d = np.log(a)
print("Logarithm of", a, "is", d)

# Trigonometric functions
e = np.sin(a)
print("Sine of", a, "is", e)

f = np.cos(a)
print("Cosine of", a, "is", f)

g = np.tan(a)
print("Tangent of", a, "is", g)
# OUTPUT
Square root of [1 2 3 4] is [1.         1.41421356 1.73205081 2.        ]
Exponential of [1 2 3 4] is [ 2.71828183  7.3890561  20.08553692 54.59815003]
Logarithm of [1 2 3 4] is [0.         0.69314718 1.09861229 1.38629436]
Sine of [1 2 3 4] is [ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
Cosine of [1 2 3 4] is [ 0.54030231 -0.41614684 -0.9899925  -0.65364362]
Tangent of [1 2 3 4] is [ 1.55740772 -2.18503986 -0.14254654  1.15782128]
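
Note that the trigonometric functions above interpret their input as radians. When the data is in degrees, one possible approach is to convert first with np.deg2rad, as in this small sketch. It also shows that a universal function keeps the shape of its input.

```python
import numpy as np

degrees = np.array([0.0, 30.0, 90.0])

# Convert to radians before calling the trigonometric function
radians = np.deg2rad(degrees)
sines = np.sin(radians)
print(np.round(sines, 6))   # approximately 0, 0.5 and 1

# Universal functions work element-wise on arrays of any shape
grid = np.array([[1.0, 4.0], [9.0, 16.0]])
roots = np.sqrt(grid)
print(roots.shape)          # (2, 2)
```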

Linear Algebra: 


NumPy provides a range of functions for performing linear algebra operations, such as matrix multiplication and decomposition.
import numpy as np

# Create arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Matrix multiplication
c = np.matmul(a, b)
print("Matrix Multiplication:")
print(c)

# Matrix determinant
d = np.linalg.det(a)
print("Matrix Determinant of A:")
print(d)

# Inverse matrix
e = np.linalg.inv(a)
print("Inverse Matrix of A:")
print(e)

# Eigenvalues and eigenvectors
f, g = np.linalg.eig(a)
print("Eigenvalues of A:")
print(f)
print("Eigenvectors of A:")
print(g)
# OUTPUT
# Matrix Multiplication:
# [[19 22]
#  [43 50]]
# Matrix Determinant of A:
# -2.0000000000000004
# Inverse Matrix of A:
# [[-2.   1. ]
#  [ 1.5 -0.5]]
# Eigenvalues of A:
# [-0.37228132  5.37228132]
# Eigenvectors of A:
# [[-0.82456484 -0.41597356]
#  [ 0.56576746 -0.90937671]]
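
A common use of these routines is solving a linear system A x = rhs. The sketch below uses np.linalg.solve, which avoids forming the inverse explicitly, and checks the result with np.allclose because floating-point results are rarely exact. The right-hand side values are arbitrary and chosen only for illustration.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
rhs = np.array([5.0, 6.0])

# Solve a @ x = rhs for x
x = np.linalg.solve(a, rhs)
print(x)

# Verify the solution and the inverse up to floating-point tolerance
print(np.allclose(a @ x, rhs))                        # True
print(np.allclose(a @ np.linalg.inv(a), np.eye(2)))   # True
```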

Broadcasting:

NumPy's broadcasting rules allow element-wise operations on arrays whose shapes are not identical, without making explicit copies of the data. Shapes are compared dimension by dimension, starting from the trailing (rightmost) one. Two dimensions are compatible when they are equal or when one of them is 1, and the smaller array is treated as if it were stretched along any dimension of size 1, or any missing leading dimension, to match the other.
import numpy as np

# Create arrays
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([10, 20, 30])

# Addition with broadcasting
c = a + b
print("Addition with broadcasting:")
print(c)

# Multiplication with broadcasting
d = a * b
print("Multiplication with broadcasting:")
print(d)

# Broadcasting with scalar
e = a * 2
print("Broadcasting with scalar:")
print(e)
# Output:

# Addition with broadcasting:
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]]
# Multiplication with broadcasting:
# [[ 10  40  90]
#  [ 40 100 180]
#  [ 70 160 270]]
# Broadcasting with scalar:
# [[ 2  4  6]
#  [ 8 10 12]
#  [14 16 18]]
  
This program demonstrates broadcasting in three forms: adding and multiplying a (3, 3) array with a (3,) array, where the smaller array is applied to every row, and multiplying by a scalar. Broadcasting lets you combine arrays of different shapes without writing loops or manually repeating data.
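
To see the shape rules at work, the following sketch combines a column of shape (3, 1) with a row of shape (3,), and then tries a pair of shapes that cannot be broadcast. The arrays are made-up values for illustration.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,)

# (3, 1) and (3,) are compatible and broadcast to (3, 3)
table = col + row
print(table.shape)   # (3, 3)
print(table)

# (3, 2) and (3,) are not compatible: the trailing dimensions are 2 and 3
try:
    np.ones((3, 2)) + np.ones(3)
except ValueError as err:
    print("Cannot broadcast:", err)
```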

Performance Optimization:

Several techniques help optimize the performance of code that uses NumPy, including:

Vectorization: 

Writing code that performs operations on entire arrays, rather than on individual elements.
import numpy as np

# Create arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Vector addition
c = a + b
print("Vector Addition:")
print(c)

# Vector multiplication
d = a * b
print("Vector Multiplication:")
print(d)

# Vector dot product
e = np.dot(a, b)
print("Vector Dot Product:")
print(e)

# Vector cross product
f = np.cross(a, b)
print("Vector Cross Product:")
print(f)
# Output:


# Vector Addition:
# [5 7 9]
# Vector Multiplication:
# [ 4 10 18]
# Vector Dot Product:
# 32
# Vector Cross Product:
# [-3  6 -3]
This program demonstrates vectorized operations in NumPy: vector addition, element-wise multiplication, the dot product, and the cross product. Vectorization means expressing a computation on whole arrays, so that the loop over elements runs in NumPy's compiled code rather than in the Python interpreter. The result is usually both shorter to write and much faster.
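
For comparison, here is a small sketch of the same computation written once as an explicit Python loop and once as a vectorized expression. The example (squaring five numbers) is deliberately tiny; the speed difference only becomes noticeable on large arrays.

```python
import numpy as np

values = np.arange(1, 6)   # [1 2 3 4 5]

# Loop version: one Python-level operation per element
squares_loop = np.empty_like(values)
for idx in range(len(values)):
    squares_loop[idx] = values[idx] ** 2

# Vectorized version: one expression over the whole array
squares_vec = values ** 2

print(squares_vec)                                  # [ 1  4  9 16 25]
print(np.array_equal(squares_loop, squares_vec))    # True
```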

Cython:

A programming language that is a superset of Python and can be used to write fast, efficient code that interfaces well with NumPy.
# example.pyx
import numpy as np
cimport numpy as np

def add_arrays(np.ndarray[np.int_t, ndim=1] a, np.ndarray[np.int_t, ndim=1] b):
    cdef np.ndarray[np.int_t, ndim=1] result = np.zeros_like(a)
    # Typing the loop index lets Cython compile the loop to plain C indexing
    cdef Py_ssize_t i
    for i in range(a.shape[0]):
        result[i] = a[i] + b[i]
    return result


# setup.py
from setuptools import setup  # distutils was removed in Python 3.12
from Cython.Build import cythonize
import numpy

setup(
    ext_modules=cythonize("example.pyx"),
    include_dirs=[numpy.get_include()]
)


# main.py
import numpy as np
import example

# Create arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Call the Cython function
c = example.add_arrays(a, b)

# Print the result
print(c)
#Output:


[5 7 9]
This program demonstrates how to use Cython with NumPy to speed up a function that adds two arrays element-wise. Cython is a superset of Python that compiles to C, and the typed NumPy buffers let the loop run without Python-level overhead. The extension has to be compiled before main.py can import it, for example by running python setup.py build_ext --inplace in the directory that contains the three files. For a simple addition like this one, a + b in plain NumPy is already fast; Cython pays off mainly for loops that cannot be expressed as array operations.

Advanced Topics:

NumPy also provides support for more advanced features, including:

Masked Arrays: 

Arrays that allow for the handling of missing data.
import numpy as np
from numpy import ma

# Create an array with missing values
a = np.array([1, 2, -999, 4, -999, 6])

# Create a mask for the missing values
mask = a == -999

# Create a masked array
b = ma.masked_array(a, mask)

# Print the masked array
print("Masked Array:")
print(b)

# Apply an operation to the masked array
c = b * 2
print("Result of operation on Masked Array:")
print(c)
# Output:

# Masked Array:
# [1 2 -- 4 -- 6]
# Result of operation on Masked Array:
# [2 4 -- 8 -- 12]
 
This program demonstrates how to use masked arrays in NumPy to handle missing or invalid data. A masked array pairs the data with a boolean mask that marks certain elements as invalid, and those elements are skipped in operations on the array. In this example, we build a mask for the placeholder value -999 and use it to create a masked array. When we multiply the masked array by 2, the masked elements are left out of the computation and stay masked in the result.
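
Masking matters most for statistics, where a placeholder such as -999 would otherwise distort the result. The sketch below reuses the same data and uses ma.masked_equal as a shortcut for building the mask by hand.

```python
import numpy as np
from numpy import ma

raw = np.array([1, 2, -999, 4, -999, 6])

# Mask every element equal to the placeholder value
masked = ma.masked_equal(raw, -999)

print(masked.count())     # 4 valid elements
print(masked.sum())       # 13
print(masked.mean())      # 3.25
print(raw.mean())         # about -330.83, distorted by the placeholders

# Replace the masked entries with a chosen value to get a plain array back
print(masked.filled(0))   # [1 2 0 4 0 6]
```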

Structured Arrays: 

Arrays whose elements are records made up of named fields, each field with its own data type.
import numpy as np

# Define the data types for the structured array
dt = np.dtype([('name', 'S10'), ('age', np.int32), ('salary', np.float64)])

# Create an empty structured array with three elements
a = np.empty(3, dtype=dt)

# Fill the structured array with data
a['name'] = ['Alice', 'Bob', 'Charlie']
a['age'] = [25, 30, 35]
a['salary'] = [50000.0, 60000.0, 70000.0]

# Print the structured array
print("Structured Array:")
print(a)

# Access a single element of the structured array
print("Accessing a single element:")
print(a[1])

# Access a field of a single element
print("Accessing a field of a single element:")
print(a[1]['name'])
Output:

# Structured Array:
# [(b'Alice', 25, 50000.) (b'Bob', 30, 60000.) (b'Charlie', 35, 70000.)]
# Accessing a single element:
# (b'Bob', 30, 60000.)
# Accessing a field of a single element:
# b'Bob'
This program demonstrates how to create a structured array in NumPy. Each element of a structured array is a record with named fields of different data types, similar to a row in a table or spreadsheet. In this example, we define the fields with np.dtype, create an empty structured array with three elements, fill it field by field, and print it. We then access a single record and a single field of that record. The names print as b'Alice' and so on because the 'S10' field type stores fixed-length byte strings; use 'U10' to store Unicode text instead.
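
Fields of a structured array can also be used for filtering and sorting. The following sketch assumes the same three records as above, but stores the names as Unicode ('U10') so they print without the b prefix.

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', np.int32), ('salary', np.float64)])
people = np.array([('Alice', 25, 50000.0),
                   ('Bob', 30, 60000.0),
                   ('Charlie', 35, 70000.0)], dtype=dt)

# Select the records whose age field is at least 30
older = people[people['age'] >= 30]
print(older['name'])              # ['Bob' 'Charlie']

# Sort by the salary field, then reverse for highest first
by_salary = np.sort(people, order='salary')[::-1]
print(by_salary['name'])          # ['Charlie' 'Bob' 'Alice']

# A single field behaves like an ordinary array
print(people['salary'].mean())    # 60000.0
```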

Integration with Other Libraries:


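NumPy integrates well with other Python libraries: Pandas builds its Series and DataFrame objects on NumPy arrays, SciPy adds higher-level scientific routines that operate on them, and Matplotlib can plot them directly.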

Conclusion

NumPy is a powerful library that provides support for large, multi-dimensional arrays and matrices, as well as a wide range of mathematical functions to operate on them. In this blog post, we explored array creation, reshaping, transposing, concatenation and splitting, arithmetic operations, aggregation functions, universal functions, linear algebra, and broadcasting, along with vectorization, Cython, masked arrays, and structured arrays. These operations are the building blocks for more advanced work in NumPy and are essential for scientific computing.
