Predictive Hacks

Tips About Numpy Arrays

Differences Between Numpy Arrays and Python Lists

NumPy arrays and Python lists look alike, but they behave differently under algebraic operators. The following examples show the + and * operators.

‘+’ Operator

import numpy as np

alist = [1, 2, 3, 4, 5]   # Define a python list. It looks like an np array
narray = np.array([1, 2, 3, 4]) # Define a numpy array

print(narray + narray)
print(alist + alist)
 
[2 4 6 8]
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

Note that the ‘+’ operator on NumPy arrays performs an element-wise addition, while the same operator on Python lists concatenates them. Be careful while coding. Knowing this can save many headaches.
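If you need the opposite behavior for either type, it has to be requested explicitly. The sketch below is one possible way to do it: a list comprehension for element-wise addition of lists, and np.concatenate for joining arrays. The names list_sum and array_concat are only illustrative.

```python
import numpy as np

alist = [1, 2, 3, 4, 5]
narray = np.array([1, 2, 3, 4])

# Element-wise addition of two lists needs an explicit comprehension
list_sum = [a + b for a, b in zip(alist, alist)]

# Concatenating two arrays needs an explicit function call
array_concat = np.concatenate([narray, narray])

print(list_sum)      # [2, 4, 6, 8, 10]
print(array_concat)  # [1 2 3 4 1 2 3 4]
```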

‘*’ Operator

The same holds for the product operator, *. On the NumPy array it scales the vector, while on the Python list it concatenates the same list three times.

print(narray * 3)
print(alist * 3)
 
[ 3  6  9 12]
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
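As with addition, the other behavior is available if you ask for it. A minimal sketch, assuming you want to scale a list or repeat an array: a comprehension handles the first and np.tile the second. The names scaled_list and repeated_array are only illustrative.

```python
import numpy as np

alist = [1, 2, 3, 4, 5]
narray = np.array([1, 2, 3, 4])

# Scaling a list element by element
scaled_list = [3 * x for x in alist]

# Repeating an array three times, like list * 3 does for lists
repeated_array = np.tile(narray, 3)

print(scaled_list)     # [3, 6, 9, 12, 15]
print(repeated_array)  # [1 2 3 4 1 2 3 4 1 2 3 4]
```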

Matrix or Array of Arrays

In linear algebra, a matrix is a structure composed of n rows by m columns. That means each row must have the same number of columns. With NumPy, we have two ways to create a matrix:

  • Creating an array of arrays using np.array (recommended).
  • Creating a matrix using np.matrix (still available, but no longer recommended and may be removed in the future).

NumPy arrays or lists can be used to initialize a matrix, but the resulting matrix will be composed of NumPy arrays only.

npmatrix1 = np.array([narray, narray, narray]) # Matrix initialized with NumPy arrays
npmatrix2 = np.array([alist, alist, alist]) # Matrix initialized with lists
npmatrix3 = np.array([narray, [1, 1, 1, 1], narray]) # Matrix initialized with both types

print(npmatrix1)
print(npmatrix2)
print(npmatrix3)
 
[[1 2 3 4]
 [1 2 3 4]
 [1 2 3 4]]
[[1 2 3 4 5]
 [1 2 3 4 5]
 [1 2 3 4 5]]
[[1 2 3 4]
 [1 1 1 1]
 [1 2 3 4]]
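A quick way to confirm that the result is a plain NumPy array regardless of how it was initialized is to inspect its type and shape. The sketch below repeats the three constructions under illustrative names (from_arrays, from_lists, mixed) and checks that even the row given as a list comes back as an array.

```python
import numpy as np

alist = [1, 2, 3, 4, 5]
narray = np.array([1, 2, 3, 4])

from_arrays = np.array([narray, narray, narray])
from_lists = np.array([alist, alist, alist])
mixed = np.array([narray, [1, 1, 1, 1], narray])

for m in (from_arrays, from_lists, mixed):
    # Container type, shape, and the type of the second row
    print(type(m).__name__, m.shape, type(m[1]).__name__)
```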
 

However, when defining a matrix, be sure that all the rows contain the same number of elements. Otherwise, the linear algebra operations could lead to unexpected results.

Analyze the following two examples:

# Example 1:

okmatrix = np.array([[1, 2], [3, 4]]) # Define a 2x2 matrix
print(okmatrix) # Print okmatrix
print(okmatrix * 2) # Print a scaled version of okmatrix
 
[[1 2]
 [3 4]]
[[2 4]
 [6 8]]
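
# Example 2:

# Note the third row contains 3 elements. NumPy 1.24 and later raise a ValueError
# for rows of different lengths unless dtype=object is passed explicitly.
badmatrix = np.array([[1, 2], [3, 4], [5, 6, 7]], dtype=object) # Define a malformed matrix
print(badmatrix) # Print the malformed matrix
print(badmatrix * 2) # Meant to scale the whole matrix, but each row list is repeated instead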
 
[list([1, 2]) list([3, 4]) list([5, 6, 7])]
[list([1, 2, 1, 2]) list([3, 4, 3, 4]) list([5, 6, 7, 5, 6, 7])]
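One way to avoid this trap is to check the row lengths before building the matrix. The helper below, to_matrix, is hypothetical and not part of NumPy. It is a minimal sketch that rejects rows of different lengths instead of silently producing an array of lists.

```python
import numpy as np

def to_matrix(rows):
    """Hypothetical helper: build a 2-D array, rejecting ragged input."""
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"rows have different lengths: {sorted(lengths)}")
    return np.array(rows)

print(to_matrix([[1, 2], [3, 4]]) * 2)  # Scales as expected

try:
    to_matrix([[1, 2], [3, 4], [5, 6, 7]])
except ValueError as err:
    print("rejected:", err)
```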

Get the norm of a NumPy array or matrix

Let’s recall that:

\(norm(\vec a) = ||\vec a|| = \sqrt {\sum_{i=1}^{n} a_i ^ 2}\)

nparray1 = np.array([1, 2, 3, 4]) # Define an array
norm1 = np.linalg.norm(nparray1)

nparray2 = np.array([[1, 2], [3, 4]]) # Define a 2 x 2 matrix. Note the two levels of square brackets
norm2 = np.linalg.norm(nparray2) 

print(norm1)
print(norm2)
 
5.477225575051661
5.477225575051661

Note that without any other parameter, the norm function treats the matrix as a flat array of numbers (for a matrix this is the Frobenius norm), which is why norm1 and norm2 are equal. However, it is possible to get the norm by rows or by columns. The axis parameter controls the form of the operation:

  • axis=0 means get the norm of each column.
  • axis=1 means get the norm of each row.

nparray2 = np.array([[1, 1], [2, 2], [3, 3]]) # Define a 3 x 2 matrix.

normByCols = np.linalg.norm(nparray2, axis=0) # Get the norm for each column. Returns 2 elements
normByRows = np.linalg.norm(nparray2, axis=1) # get the norm for each row. Returns 3 elements

print(normByCols)
print(normByRows)
 
[3.74165739 3.74165739]
[1.41421356 2.82842712 4.24264069]
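To connect the axis parameter back to the formula, the norms can be reproduced by hand: square every element, sum along the chosen axis, and take the square root. This is only a sketch for checking your understanding, and the variable names are illustrative. In practice np.linalg.norm is the simpler choice.

```python
import numpy as np

m = np.array([[1, 1], [2, 2], [3, 3]])  # Same 3 x 2 matrix as above

# Square, sum along the axis, then take the square root
by_cols = np.sqrt(np.sum(m ** 2, axis=0))  # One value per column
by_rows = np.sqrt(np.sum(m ** 2, axis=1))  # One value per row

print(np.allclose(by_cols, np.linalg.norm(m, axis=0)))  # True
print(np.allclose(by_rows, np.linalg.norm(m, axis=1)))  # True
```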

The dot product between numpy arrays: All the flavors

The dot product, also called the scalar product or inner product, between two vectors \(\vec a\) and \(\vec b\) of the same size is defined as:
\(\vec a \cdot \vec b = \sum_{i=1}^{n} a_i b_i\)

The dot product takes two vectors and returns a single number.

nparray1 = np.array([0, 1, 2, 3]) # Define an array
nparray2 = np.array([4, 5, 6, 7]) # Define an array

flavor1 = np.dot(nparray1, nparray2) # Recommended way
print(flavor1)

flavor2 = np.sum(nparray1 * nparray2) # Ok way
print(flavor2)

flavor3 = nparray1 @ nparray2         # Geeks way
print(flavor3)

# What you should never do: an explicit Python loop   # Noobs way
flavor4 = 0
for a, b in zip(nparray1, nparray2):
    flavor4 += a * b
    
print(flavor4)
 
38
38
38
38
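The dot product also ties back to the norm defined earlier: the dot product of a vector with itself is the sum of its squared elements, so it equals the squared norm. A short check, with illustrative variable names:

```python
import numpy as np

a = np.array([0, 1, 2, 3])

self_dot = np.dot(a, a)                # 0 + 1 + 4 + 9 = 14
norm_squared = np.linalg.norm(a) ** 2  # Squared norm, as a float

print(self_dot)                            # 14
print(np.isclose(self_dot, norm_squared))  # True
```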

Get the mean and sum by rows or columns of numpy arrays

Another common operation performed on matrices is the sum by rows or columns. Just as we did for the norm function, the axis parameter controls the form of the operation:

  • axis=0 means to sum the elements of each column together.
  • axis=1 means to sum the elements of each row together.

nparray2 = np.array([[1, -1], [2, -2], [3, -3]]) # Define a 3 x 2 matrix.

sumByCols = np.sum(nparray2, axis=0) # Get the sum for each column. Returns 2 elements
sumByRows = np.sum(nparray2, axis=1) # get the sum for each row. Returns 3 elements

print('Sum by columns: ')
print(sumByCols)
print('Sum by rows:')
print(sumByRows)
 
Sum by columns: 
[ 6 -6]
Sum by rows:
[0 0 0]

nparray2 = np.array([[1, -1], [2, -2], [3, -3]]) # Define a 3 x 2 matrix, chosen so that its overall mean is 0

mean = np.mean(nparray2) # Get the mean for the whole matrix
meanByCols = np.mean(nparray2, axis=0) # Get the mean for each column. Returns 2 elements
meanByRows = np.mean(nparray2, axis=1) # get the mean for each row. Returns 3 elements

print('Matrix mean: ')
print(mean)
print('Mean by columns: ')
print(meanByCols)
print('Mean by rows:')
print(meanByRows)
 
Matrix mean: 
0.0
Mean by columns: 
[ 2. -2.]
Mean by rows:
[0. 0. 0.]
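A typical use of the per-axis mean is centering a matrix, that is, subtracting the mean so that each column (or row) averages to zero. The sketch below assumes that goal and uses illustrative names. Column means broadcast directly against the rows, while row means need keepdims=True so that their shape (3, 1) lines up with the matrix.

```python
import numpy as np

m = np.array([[1, -1], [2, -2], [3, -3]])

col_means = np.mean(m, axis=0)                 # Shape (2,)
centered_cols = m - col_means                  # Broadcasts across the rows

row_means = np.mean(m, axis=1, keepdims=True)  # Shape (3, 1)
centered_rows = m - row_means                  # Broadcasts across the columns

print(centered_cols)
print(np.mean(centered_cols, axis=0))  # Each column now has mean 0
```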

Some other useful commands

  • numpy.squeeze: Remove single-dimensional entries from the shape of an array
  • numpy.expand_dims(a,axis): Expand the shape of an array
  • numpy.broadcast_arrays(*args, subok=False): Broadcast any number of arrays against each other
  • numpy.reshape(a, newshape, order='C'): Gives a new shape to an array without changing its data
  • numpy.ravel(a, order='C'): Return a contiguous flattened array.
  • ndarray.flatten(order='C'): Return a copy of the array collapsed into one dimension.
  • numpy.unique(a, return_counts=True): Count the frequency of unique values in numpy array a.
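The commands above are easier to remember after seeing them once on a small array. The following is a minimal sketch on a made-up 2 x 3 array. The shapes in the comments follow from the definitions above.

```python
import numpy as np

a = np.array([[1, 2, 2], [3, 3, 3]])  # Shape (2, 3)

expanded = np.expand_dims(a, axis=0)  # Shape (1, 2, 3)
squeezed = np.squeeze(expanded)       # Back to shape (2, 3)
reshaped = np.reshape(a, (3, 2))      # Same data, shape (3, 2)
flat_ravel = np.ravel(a)              # 1-D, shape (6,)
flat_copy = a.flatten()               # 1-D copy, shape (6,)

# Unique values and how often each one occurs
values, counts = np.unique(a, return_counts=True)

# Broadcast a length-3 vector against the 2 x 3 array
b1, b2 = np.broadcast_arrays(a, np.array([10, 20, 30]))

print(expanded.shape, squeezed.shape, reshaped.shape)
print(values, counts)
print(b2)
```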
