In Part 1 of the Data Science With Python series, we looked at the basic built-in functions for numerical computing in Python. In this part, we will take a look at the NumPy library.
NumPy is the fundamental package for scientific computing with Python. It contains among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
Great, let's see how to use the NumPy library for basic array manipulation.
The NumPy library
First, we need to import numpy in Python.
import numpy as np
Let's create a NumPy array.
np.array([4,5,6])
Output : array([4,5,6])
Now, let's create a multi-dimensional array.
mul = np.array([[4,5,6],[7,8,9],[10,11,12]])
mul
Output : array([[4, 5, 6], [7, 8, 9],[10,11,12]])
Check the shape (rows and columns of the array).
mul.shape
Output : (3, 3)
Create an evenly spaced array from 1 up to (but not including) 60 with a step of 2.
dif = np.arange(1,60,2)
dif
Output : array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59])
Reshape the above array into a desired shape.
dif.reshape(10,3)
Output : array([[ 1, 3, 5], [ 7, 9, 11], [13, 15, 17], [19, 21, 23], [25, 27, 29], [31, 33, 35], [37, 39, 41], [43, 45, 47], [49, 51, 53], [55, 57, 59]])
Generate 40 evenly spaced values between 1 and 8, inclusive. (Take a minute here to understand the difference between 'linspace' and 'arange'.)
gen = np.linspace(1,8,40)
gen
Output: array([1. , 1.17948718, 1.35897436, 1.53846154, 1.71794872, 1.8974359 , 2.07692308, 2.25641026, 2.43589744, 2.61538462, 2.79487179, 2.97435897, 3.15384615, 3.33333333, 3.51282051, 3.69230769, 3.87179487, 4.05128205, 4.23076923, 4.41025641, 4.58974359, 4.76923077, 4.94871795, 5.12820513, 5.30769231, 5.48717949, 5.66666667, 5.84615385, 6.02564103, 6.20512821, 6.38461538, 6.56410256, 6.74358974, 6.92307692, 7.1025641 , 7.28205128, 7.46153846, 7.64102564, 7.82051282, 8. ])
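To make the difference between the two functions concrete, here is a minimal sketch on a smaller interval. The variable names by_step and by_count are made up for illustration.

```python
import numpy as np

# arange: you choose the step size; the stop value (8) is excluded.
by_step = np.arange(1, 8, 2)

# linspace: you choose how many points you want; the stop value (8)
# is included by default.
by_count = np.linspace(1, 8, 4)

print(by_step)    # the integers 1, 3, 5, 7
print(by_count)   # four floats starting at 1.0 and ending at 8.0
```

As a rule of thumb, arange fits when the spacing matters and linspace fits when the number of points matters.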
Now, change the shape of the array in place (the 'resize' method modifies the array itself, whereas 'reshape' returns a reshaped array and leaves the original unchanged).
gen.resize(10,4)
gen
Output: array([[1. , 1.17948718, 1.35897436, 1.53846154], [1.71794872, 1.8974359 , 2.07692308, 2.25641026], [2.43589744, 2.61538462, 2.79487179, 2.97435897], [3.15384615, 3.33333333, 3.51282051, 3.69230769], [3.87179487, 4.05128205, 4.23076923, 4.41025641], [4.58974359, 4.76923077, 4.94871795, 5.12820513], [5.30769231, 5.48717949, 5.66666667, 5.84615385], [6.02564103, 6.20512821, 6.38461538, 6.56410256], [6.74358974, 6.92307692, 7.1025641 , 7.28205128], [7.46153846, 7.64102564, 7.82051282, 8. ]])
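The contrast between reshape and resize can be sketched on a small array. The names values, reshaped, inferred, target and result are illustrative only.

```python
import numpy as np

values = np.arange(6)

# reshape returns a reshaped array; the original keeps its shape.
reshaped = values.reshape(2, 3)
print(values.shape)    # (6,)
print(reshaped.shape)  # (2, 3)

# Passing -1 asks NumPy to infer that dimension from the total size.
inferred = values.reshape(3, -1)
print(inferred.shape)  # (3, 2)

# resize changes the array itself and returns None.
target = np.arange(6)
result = target.resize(3, 2)
print(target.shape)    # (3, 2)
print(result)          # None
```

Because resize returns None, writing something like x = x.resize(...) would discard the array, which is an easy mistake to make.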
Create an array with all elements as ones.
onarr = np.ones((4,4))
onarr
Output: array([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])
Create an array filled with zeros.
zearr = np.zeros((4,4))
zearr
Output: array([[0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.], [0., 0., 0., 0.]])
Create a diagonal matrix with diagonal values = 1.
dm = np.eye(3)
dm
Output: array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
Extract only diagonal values from an array.
np.diag(dm)
Output: array([1., 1., 1.])
Create an array from a repeated list.
relist = np.array([1,2,3]*7)
relist
Output: array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])
Now, repeat each element of an array n times using the repeat function.
np.repeat([1,2,3],3)
Output : array([1, 1, 1, 2, 2, 2, 3, 3, 3])
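The two kinds of repetition shown above can be put side by side. This sketch also uses np.tile, which is not part of the original walkthrough but repeats a whole sequence in the same way as multiplying a Python list; the variable names are illustrative.

```python
import numpy as np

base = np.array([1, 2, 3])

# tile repeats the whole sequence, like [1, 2, 3] * 2 on a Python list.
tiled = np.tile(base, 2)

# repeat repeats each element in turn.
repeated = np.repeat(base, 2)

print(tiled)     # 1 2 3 1 2 3
print(repeated)  # 1 1 2 2 3 3
```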
Generate two arrays of desired shape filled with random values between 0 and 1.
relist = np.random.rand(2,3)
print(relist)
de = np.random.rand(2,3)
print(de)
Output :
[[0.55523672 0.46815197 0.67590369] [0.5331193 0.62780236 0.45044916]]
[[0.26215572 0.07380256 0.06592746] [0.89782279 0.95603968 0.82052478]]
Stack the two arrays created above vertically.
st = np.vstack([de,relist])
st
Output :
array([[0.26215572, 0.07380256, 0.06592746], [0.89782279, 0.95603968, 0.82052478], [0.55523672, 0.46815197, 0.67590369], [0.5331193 , 0.62780236, 0.45044916]])
Now, let's stack them horizontally.
sh = np.hstack([de,relist])
sh
Output :
array([[0.26215572, 0.07380256, 0.06592746, 0.55523672, 0.46815197, 0.67590369], [0.89782279, 0.95603968, 0.82052478, 0.5331193 , 0.62780236, 0.45044916]])
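With random values it can be hard to see what stacking does to the shape. The sketch below uses arrays of zeros and ones instead so the result is deterministic; the names top, bottom, stacked_v and stacked_h are illustrative, and the comparison with np.concatenate is an addition for context.

```python
import numpy as np

top = np.zeros((2, 3))
bottom = np.ones((2, 3))

# vstack adds rows: two (2, 3) arrays become one (4, 3) array.
stacked_v = np.vstack([top, bottom])
print(stacked_v.shape)  # (4, 3)

# hstack adds columns: two (2, 3) arrays become one (2, 6) array.
stacked_h = np.hstack([top, bottom])
print(stacked_h.shape)  # (2, 6)

# For 2-D inputs these match np.concatenate along axis 0 and axis 1.
same_v = np.array_equal(stacked_v, np.concatenate([top, bottom], axis=0))
same_h = np.array_equal(stacked_h, np.concatenate([top, bottom], axis=1))
print(same_v, same_h)   # True True
```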
Great, now let's perform some array operations. First, let's create two random arrays.
r1 = np.random.rand(2,2)
r2 = np.random.rand(2,2)
print(r1)
print(r2)
Output :
[[ 0.02430146 0.14448542] [ 0.54428337 0.40332494]]
[[ 0.77574886 0.08747577] [ 0.51484157 0.92319888]]
Let's do element-wise addition.
r3 = r1 + r2
r3
Output (approximate, computed from the rounded values printed above) : array([[0.80005032, 0.23196119], [1.05912494, 1.32652382]])
Element-wise subtraction.
r4 = r1 - r2
r4
Output : array([[-0.75144739, 0.05700965], [ 0.02944179, -0.51987394]])
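Element-wise arithmetic is not limited to arrays of identical shape. As a minimal sketch of the broadcasting feature listed at the start of this article, the example below adds a 1-D row to every row of a 2-D array. The names grid and offsets and their values are made up for illustration.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
offsets = np.array([10, 20, 30])

# The 1-D row is broadcast across every row of the grid,
# so no explicit Python loop is needed.
shifted = grid + offsets
print(shifted)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]]

# A scalar is broadcast to every element in the same way.
doubled = grid * 2
```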
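Let's raise each element to the power of 3. (The output shown below comes from a different random draw of r1, so it does not match the cubes of the values printed above.)
r5 = r1**3
r5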
Output : array([[0.65228631, 0.24993365], [0.97976155, 0.71554632]])
Now, instead of an element-wise operation, let's compute the dot product of the two arrays r1 and r2. For 2-D arrays this is matrix multiplication.
r6 = r1.dot(r2)
r6
Output : array([[ 0.09323893, 0.13551456], [ 0.62987564, 0.41996073]])
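Since random inputs make the result hard to check by hand, here is a small sketch with fixed integers that contrasts element-wise multiplication with the dot product. The arrays a and b are illustrative, and the @ operator is shown as an equivalent spelling for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# Element-wise: each entry is multiplied by the entry in the same position.
elementwise = a * b
print(elementwise)  # rows: [5, 12] and [21, 32]

# Dot product: rows of a combined with columns of b (matrix multiplication).
matrix = a.dot(b)
print(matrix)       # rows: [19, 22] and [43, 50]

# For 2-D arrays the @ operator gives the same result as dot.
also_matrix = a @ b
```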
Let's create a new array and transpose it.
sh = np.array([[1,2],[3,4]])
sh
Output :
array([[1, 2], [3, 4]])
sh.T
Output :
array([[1, 3], [2, 4]])
Now, check the datatype of elements in the array.
sh.dtype
Output : dtype('int32') (on many platforms the default integer type is int64 instead)
Change the datatype of the array.
rs = sh.astype('f')
rs.dtype
Output : dtype('float32')
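Two details of astype are worth seeing in a short sketch: it returns a new array rather than changing the original, and converting floats to integers drops the fractional part. The array prices and its values are made up for illustration.

```python
import numpy as np

prices = np.array([1.9, 2.5, -3.7])

# astype returns a converted copy; the fractional part is truncated
# toward zero rather than rounded.
as_int = prices.astype(np.int64)
print(as_int.tolist())   # [1, 2, -3]

# The original array keeps its dtype and values.
print(prices.dtype)      # float64
```

If rounding is what you want, calling np.round on the array before converting is one option.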
Now, let's look at some mathematical functions on an array, starting with the sum of an array.
c = np.array([1,2,3,4,5])
c.sum()
Output : 15
Maximum of the elements of an array.
c.max()
Output : 5
Mean of the elements of the array
c.mean()
Output : 3.0
Now, let's retrieve the indices of the maximum and minimum values of the array.
c.argmax()
Output : 4
c.argmin()
Output : 0
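The same functions also work on multi-dimensional arrays, where an optional axis argument selects the direction of the reduction. This goes slightly beyond the 1-D examples above; the array table is made up for illustration.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

total = table.sum()             # all elements: 21
col_sums = table.sum(axis=0)    # down each column: 5, 7, 9
row_sums = table.sum(axis=1)    # across each row: 6, 15
row_means = table.mean(axis=1)  # 2.0, 5.0

# Without an axis, argmax indexes into the flattened array.
flat_index = table.argmax()     # 5
```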
Create an array consisting of the squares of the first ten whole numbers.
dim = np.arange(10)**2
dim
Output : array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81], dtype=int32)
Access values in the above array using index
dim[2]
Output : 4
dim[1:5]
Output : array([ 1, 4, 9, 16], dtype=int32)
Use a negative index to count from the end of the array.
dim[-1:]
Output : array([81], dtype=int32)
Now, access certain elements of the array based on a step size.
dim[1:10:2] #dim[start:stop:stepsize]
Output : array([ 1, 9, 25, 49, 81], dtype=int32)
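A few more slicing patterns follow the same start:stop:step rule. This is a small sketch on the same squares; the variable names are illustrative.

```python
import numpy as np

squares = np.arange(10) ** 2

last_three = squares[-3:]       # the final three values: 49, 64, 81
reversed_all = squares[::-1]    # a negative step walks backwards
every_other = squares[1:10:2]   # start at index 1, stop before 10, step 2
first_four = squares[:4]        # an omitted start defaults to the beginning
```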
Create a multidimensional array
en = np.arange(36)
en.resize(6,6)
en
Output : array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17], [18, 19, 20, 21, 22, 23], [24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 34, 35]])
Access the second row and third column
en[1,2]
Output : 8
Access the 2nd row and columns 3 to 6. Note that the numbering of rows and columns starts at 0, and the stop index of a slice is excluded.
en[1, 2:6]
Output : array([ 8, 9, 10, 11])
Select the first two rows and all columns except the last column.
en[:2,:-1]
Output : array([[ 0, 1, 2, 3, 4], [ 6, 7, 8, 9, 10]])
Select values from array greater than 20.
en[en>20]
Output : array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])
Assign value of the array elements as 20 if the element value is greater than 20.
en[en>20] = 20
en
Output : array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17], [18, 19, 20, 20, 20, 20], [20, 20, 20, 20, 20, 20], [20, 20, 20, 20, 20, 20]])
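The comparison inside the square brackets produces a boolean array (a mask) of the same shape, and that mask does the selecting. The sketch below makes the mask explicit and also shows np.where as one way to get a capped copy without modifying the original; grid, mask and capped are illustrative names.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# The comparison yields a boolean array with the same shape as grid.
mask = grid > 6
print(mask.dtype)   # bool

# Indexing with the mask returns the matching values as a 1-D array.
selected = grid[mask]
print(selected)     # the values 7 through 11

# np.where builds a new array: 6 where the mask is True, the original
# value elsewhere. grid itself is left untouched.
capped = np.where(mask, 6, grid)
```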
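To copy an array into another variable, always use the copy function. Plain assignment only creates a second name for the same array, so changes made through one name show up in the other.
fun = en.copy()
fun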
Output : array([[ 0, 1, 2, 3, 4, 5], [ 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17], [18, 19, 20, 20, 20, 20], [20, 20, 20, 20, 20, 20], [20, 20, 20, 20, 20, 20]])
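Why copy matters is easiest to see by comparing it with plain assignment and with slicing. The names original, alias, independent and view are made up for this sketch.

```python
import numpy as np

original = np.array([1, 2, 3])

alias = original               # a second name for the same array
independent = original.copy()  # a separate array with its own data

# Writing through the alias changes the original as well.
alias[0] = 99
print(original.tolist())       # [99, 2, 3]

# Writing to the copy leaves the original alone.
independent[1] = -1
print(original.tolist())       # [99, 2, 3]

# A basic slice is a view onto the same data, not a copy.
view = original[1:]
view[0] = 0
print(original.tolist())       # [99, 0, 3]
```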
Create an array of random integers from 1 to 9 (the upper bound, 10, is excluded) with shape 4x4.
gom = np.random.randint(1,10,(4,4))
gom
Output : array([[9, 7, 1, 4], [1, 4, 3, 6], [2, 5, 5, 1], [2, 2, 9, 9]])
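Random output changes on every run, which makes examples like this hard to reproduce. One option, sketched here as an alternative to the call above rather than as the method used in this article, is NumPy's Generator interface with a fixed seed. The seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)

# Like randint, integers excludes the upper bound, so values are 1 to 9.
draws = rng.integers(1, 10, size=(4, 4))
print(draws.shape)  # (4, 4)

# Re-creating the generator with the same seed repeats the draw.
again = np.random.default_rng(seed=42).integers(1, 10, size=(4, 4))
print(np.array_equal(draws, again))  # True
```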
Great, we have looked at creating, accessing and manipulating arrays in NumPy. In the next part of the series, we will look at a library built on top of NumPy: pandas. Pandas makes data manipulation and analysis much easier in Python. It offers data structures and operations for numerical tables and time series.
Resources:
Connect on LinkedIn and check out GitHub (below) for the complete notebook.
harunshimanto/Python-The-Dangerous-Tool-For-ML-Data-Science
Thanks to everyone.
Numpy With Python For Data Science was originally published in Hacker Noon on Medium.