NumPy (Numerical Python)
NumPy is a powerful library for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. It is widely used for data analysis, machine learning, and scientific computing.
________________________________________
Key Features of NumPy:
1. Multidimensional Arrays (ndarray):
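o NumPy introduces the ndarray, a fast and flexible multi-dimensional array object. Because all elements share one data type and are typically stored in a single contiguous block of memory, arrays are more compact and faster to process than regular Python lists.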
2. Mathematical Functions:
o NumPy provides a wide range of mathematical operations, such as linear algebra, statistics, Fourier transforms, and random number generation.
3. Broadcasting:
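o Broadcasting allows NumPy to perform element-wise operations on arrays of different but compatible shapes. Shapes are compared from the trailing dimension backwards, and each pair of dimensions must either be equal or contain a 1. This makes array operations efficient without explicit looping.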
4. Array Operations:
o NumPy supports operations like element-wise addition, subtraction, multiplication, division, and more, which can be performed on arrays without using explicit loops.
5. Linear Algebra:
o NumPy provides functions for matrix multiplication, eigenvalue decomposition, determinants, and other linear algebra operations.
6. Random Module:
o NumPy has a random module for generating random numbers and performing statistical sampling.
7. Fast and Efficient:
o NumPy's core is implemented in C and highly optimized, so array operations run much faster than equivalent pure Python code on large data sets.
8. Compatibility with Other Libraries:
o NumPy arrays underpin, or interoperate with, other data science and machine learning libraries such as pandas, Matplotlib, SciPy, and TensorFlow.
________________________________________
Basic NumPy Operations
1. Importing NumPy:
import numpy as np
2. Creating Arrays:
• From a List:
arr = np.array([1, 2, 3, 4])
print(arr)
• Creating a 2D Array:
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d)
• Using Built-in Functions:
zeros = np.zeros((3, 3)) # 3x3 array of zeros
ones = np.ones((2, 4)) # 2x4 array of ones
eye = np.eye(3) # Identity matrix of size 3x3
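To confirm what a constructor produced, an array's shape, number of dimensions, and element type can be inspected. The following is a minimal sketch; `int_zeros` is a name introduced here for illustration only.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])
zeros = np.zeros((3, 3))

print(arr2d.shape)  # (2, 3): 2 rows, 3 columns
print(arr2d.ndim)   # 2
print(arr2d.size)   # 6 elements in total
print(zeros.dtype)  # float64: np.zeros defaults to floating point

# The element type can be requested explicitly with the dtype argument.
# (The exact integer width chosen for dtype=int depends on the platform.)
int_zeros = np.zeros((3, 3), dtype=int)
print(int_zeros.dtype)
```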
3. Array Operations:
• Addition:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = arr1 + arr2
print(result) # Output: [5 7 9]
• Element-wise Multiplication:
result = arr1 * arr2
print(result) # Output: [ 4 10 18]
• Matrix Multiplication (Dot Product):
arr3 = np.array([[1, 2], [3, 4]])
arr4 = np.array([[5, 6], [7, 8]])
result = np.dot(arr3, arr4) # For 2D arrays, np.dot performs matrix multiplication
print(result) # Rows of the result: [19 22] and [43 50]
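The difference between matrix multiplication and element-wise multiplication is easy to miss, so a short side-by-side sketch may help. It uses the `@` operator, which for 2D arrays gives the same result as `np.dot`.

```python
import numpy as np

arr3 = np.array([[1, 2], [3, 4]])
arr4 = np.array([[5, 6], [7, 8]])

matrix_product = arr3 @ arr4  # rows-times-columns matrix product
elementwise = arr3 * arr4     # multiplies matching positions only

print(matrix_product)  # rows: [19 22] and [43 50]
print(elementwise)     # rows: [ 5 12] and [21 32]

# For 2D inputs, @ and np.dot agree.
print(np.array_equal(matrix_product, np.dot(arr3, arr4)))  # True
```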
4. Array Indexing and Slicing:
• Indexing:
arr = np.array([1, 2, 3, 4, 5])
print(arr[2]) # Output: 3
• Slicing:
print(arr[1:4]) # Output: [2 3 4]
• 2D Array Indexing:
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d[1, 2]) # Output: 6
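Indexing and slicing extend naturally to whole rows, whole columns, and condition-based selection. A brief sketch follows; the variable names are chosen here for illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

first_row = arr2d[0]           # [1 2 3]
second_col = arr2d[:, 1]       # [2 5]: every row, column index 1
last_two_cols = arr2d[:, 1:]   # rows: [2 3] and [5 6]

# Boolean mask: keep only the elements where the condition holds.
evens = arr2d[arr2d % 2 == 0]  # [2 4 6]

print(first_row, second_col, evens)
print(last_two_cols)
```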
5. Reshaping Arrays:
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped = arr.reshape(2, 3) # Reshaping to 2x3 array
print(reshaped)
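Two details of reshape are worth a quick sketch: one dimension can be given as -1 so that NumPy infers it, and the total number of elements must stay the same. For a contiguous array like this one, the reshaped result is a view that shares memory with the original.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to infer the dimension: 6 elements / 3 rows = 2 columns.
reshaped = arr.reshape(3, -1)
print(reshaped.shape)  # (3, 2)

# A shape with a different element count is rejected.
try:
    arr.reshape(4, 2)
except ValueError as exc:
    print("reshape failed:", exc)

# The view shares data with arr, so writing to it changes arr as well.
reshaped[0, 0] = 99
print(arr[0])  # 99
```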
6. Broadcasting Example:
Broadcasting allows operations on arrays of different but compatible shapes. For example:
arr1 = np.array([1, 2, 3])
arr2 = np.array([10])
result = arr1 + arr2 # arr2 (shape (1,)) is broadcast to match the shape of arr1 (shape (3,))
print(result) # Output: [11 12 13]
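Broadcasting also applies across dimensions. The sketch below adds a row and then a column to a 2D array, and shows that incompatible shapes raise an error; the names `matrix`, `row`, and `col` are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)
col = np.array([[100], [200]])             # shape (2, 1)

by_row = matrix + row  # row is added to every row of matrix
by_col = matrix + col  # col is added to every column of matrix

print(by_row)  # rows: [11 22 33] and [14 25 36]
print(by_col)  # rows: [101 102 103] and [204 205 206]

# Shapes (2, 3) and (2,) do not line up from the trailing dimension.
try:
    matrix + np.array([1, 2])
except ValueError as exc:
    print("broadcast failed:", exc)
```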
________________________________________
Mathematical Operations in NumPy:
1. Sum:
arr = np.array([1, 2, 3, 4, 5])
sum_arr = np.sum(arr)
print(sum_arr) # Output: 15
2. Mean:
mean = np.mean(arr)
print(mean) # Output: 3.0
3. Standard Deviation:
std_dev = np.std(arr) # Population standard deviation (ddof=0) by default
print(std_dev) # Output: 1.4142135623730951
4. Minimum and Maximum:
min_val = np.min(arr)
max_val = np.max(arr)
print(min_val, max_val) # Output: 1 5
5. Matrix Operations (determinant, inverse):
mat = np.array([[1, 2], [3, 4]])
det = np.linalg.det(mat) # Approximately -2.0 (floating-point result)
inv = np.linalg.inv(mat)
print(det, inv)
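The reductions above also accept an axis argument for 2D data, and floating-point results such as the determinant and inverse are best checked with a tolerance rather than exact equality. A short sketch, with `data` as an illustrative array:

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]])

col_sums = np.sum(data, axis=0)    # [5 7 9]: collapse the rows
row_means = np.mean(data, axis=1)  # [2. 5.]: collapse the columns
print(col_sums, row_means)

mat = np.array([[1, 2], [3, 4]])
det = np.linalg.det(mat)
inv = np.linalg.inv(mat)

# Compare floats with a tolerance instead of ==.
print(np.isclose(det, -2.0))              # True
print(np.allclose(mat @ inv, np.eye(2)))  # True: mat times its inverse is the identity
```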
________________________________________
NumPy Random Module:
NumPy provides a random module to generate random numbers and perform statistical sampling.
1. Random Integer:
random_int = np.random.randint(1, 10) # Random integer from 1 to 9 (the upper bound 10 is exclusive)
print(random_int)
2. Random Float:
random_float = np.random.rand(3, 2) # 3x2 array of uniform random floats in [0, 1)
print(random_float)
3. Random Normal Distribution:
normal_dist = np.random.randn(3, 3) # 3x3 array from the standard normal distribution (mean 0, standard deviation 1)
print(normal_dist)
4. Random Sample from a Given Array:
sample = np.random.choice([10, 20, 30, 40, 50], size=3) # Samples with replacement by default
print(sample)
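For reproducible results, the same kinds of draws can be made through a seeded generator created with `np.random.default_rng`. This is a sketch of that alternative interface, not a replacement for the calls above; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # same seed gives the same sequence

random_int = rng.integers(1, 10)       # integer from 1 to 9 (upper bound exclusive)
random_floats = rng.random((3, 2))     # 3x2 uniform floats in [0, 1)
normal = rng.standard_normal((3, 3))   # 3x3 standard normal draws

# replace=False draws without repetition.
sample = rng.choice([10, 20, 30, 40, 50], size=3, replace=False)

print(random_int)
print(random_floats)
print(normal)
print(sample)
```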
________________________________________
Advanced NumPy Operations:
1. Linear Algebra (Eigenvalues and Eigenvectors):
mat = np.array([[1, 2], [3, 4]])
eigenvalues, eigenvectors = np.linalg.eig(mat) # Eigenvectors are returned as the columns of the second array
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:", eigenvectors)
2. Singular Value Decomposition (SVD):
U, S, Vt = np.linalg.svd(mat) # S is a 1D array of singular values
print(U, S, Vt)
3. Solving Linear Systems:
A = np.array([[3, 1], [1, 2]])
B = np.array([9, 8])
X = np.linalg.solve(A, B) # Solves the linear system A @ X = B for X
print(X) # Approximately [2. 3.]
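Each of these results can be verified against its defining property, which is a useful habit with floating-point linear algebra. The sketch below checks the eigenpairs, rebuilds the matrix from its SVD factors, and substitutes the solution back into the system.

```python
import numpy as np

mat = np.array([[1, 2], [3, 4]])

# Eigenpairs satisfy mat @ v = lambda * v, with v taken column by column.
eigenvalues, eigenvectors = np.linalg.eig(mat)
eig_ok = all(
    np.allclose(mat @ eigenvectors[:, i], eigenvalues[i] * eigenvectors[:, i])
    for i in range(len(eigenvalues))
)
print(eig_ok)  # True

# SVD factors multiply back to the original matrix.
U, S, Vt = np.linalg.svd(mat)
reconstructed = U @ np.diag(S) @ Vt
print(np.allclose(reconstructed, mat))  # True

# The solution of A @ X = B reproduces B when substituted back.
A = np.array([[3, 1], [1, 2]])
B = np.array([9, 8])
X = np.linalg.solve(A, B)
print(np.allclose(A @ X, B))  # True
```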
________________________________________
Performance Considerations:
1. Vectorization:
o Instead of using loops, NumPy lets you apply an operation to an entire array at once. The loop over elements then runs in compiled code rather than in the Python interpreter, which is usually much faster than a native Python loop on large arrays.
2. Memory Efficiency:
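o NumPy arrays are more memory-efficient than Python lists because they store fixed-size elements of a single data type in one contiguous buffer, whereas a list stores references to separate Python objects.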
3. Interfacing with C/C++:
o NumPy exposes a C API, so performance-critical code written in C or C++ can operate directly on array data. This is useful for large datasets or complex computations that are hard to express as vectorized operations.
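The first two points can be observed with a small experiment. This is a rough sketch: the array size and repeat count are arbitrary, timings vary by machine, and the list size is only an estimate built from `sys.getsizeof`.

```python
import sys
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n)

# Same computation, once as a Python loop and once vectorized.
loop_result = [x * 2 for x in py_list]
vec_result = np_arr * 2

loop_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=20)
vec_time = timeit.timeit(lambda: np_arr * 2, number=20)
print(f"loop: {loop_time:.4f}s  vectorized: {vec_time:.4f}s")

# Memory: the array stores raw values; the list stores references to int objects.
array_bytes = np_arr.nbytes
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
print(f"array: {array_bytes} bytes  list (estimate): {list_bytes} bytes")
```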