How to use lists, arrays and NumPy to make life easy with Python? Part 1


 In my last post (How to define a function in python?) we learnt how to use functions in Python. What we learnt is useful for many things, but if you want to solve mathematical problems with Python, or analyse large data sets, you need to know more. Today we will learn how to read and write data using Python, and this is a good time to get to know NumPy. I will introduce only the minimum concepts a beginner needs. I will write another post on advanced Python in the future, where we will see how NumPy can be used to increase the efficiency of Python code.

 

What is Numpy?

I will simply quote how NumPy is described on its official webpage:
NumPy is the fundamental package for scientific computing with Python. It contains among other things:
  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • tools for integrating C/C++ and Fortran code
  • useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.


If you are not using Anaconda, you need to install NumPy yourself. See "how to install numpy?".

How to work with python lists?

Before we learn to use NumPy, let us see how you can use lists in Python. I will start with a very simple example: at every iteration of a loop, I want to put a value into a list.

mylist=[]#you can name your list as you wish
a=0
for i in range(10):
    a+=i#this just means a=a+i (there is more to it!)
    mylist+=[a]#this is how to put values of each iteration into the list

print(mylist)#this will print all elements of mylist
print(mylist[0])#this will print the first element of mylist
print(mylist[-1])#this will print the last element of mylist
print(mylist[5:])#this will print all elements from index 5 (the 6th element) to the end
print(mylist[:5])#this will print the first 5 elements (indices 0 to 4)
Now you can think about how to choose a range of elements in a list.
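As a small sketch of choosing a range, the lines below rebuild the list from the example above and take a few slices. The names middle, every_second and last_three are my own and are only there for illustration.

```python
# Rebuild the running-sum list from the example above
mylist = []
a = 0
for i in range(10):
    a += i
    mylist.append(a)  # append does the same job as mylist += [a]

# A slice is written as mylist[start:stop:step]; the stop index is excluded
middle = mylist[2:5]        # elements at indices 2, 3 and 4
every_second = mylist[::2]  # every second element, starting at index 0
last_three = mylist[-3:]    # the last three elements

print(middle)
print(every_second)
print(last_three)
```

Leaving out start, stop or step simply means "from the beginning", "to the end" or "one at a time".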

How to write data to a file in Python?

There are many ways of writing a list to a file. I would suggest the pickle module; you can read about it in "how to use pickle to write list to a file?". One can also use Python's built-in open() function to write data to a file, which is what I will do most of the time in this blog.
First we open a file, then we write one value per line.
mylist=[]
a=0
with open('myfile.dat','w') as f:#the file is closed automatically when this block ends
    for i in range(10):
        a+=i#this just means a=a+i (there is more to it!)
        mylist+=[a]#this is how to put values of each iteration into the list
        f.write(str(a)+"\n")#this is how we could write each value, instead of the list, to a file
There are several other ways and modules for writing data to a file, but I will not discuss them at this point. In the above example a is an integer, so we converted it to a string using str(a) before writing it.
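For the pickle route mentioned at the start of this section, a minimal sketch could look like the one below. The file name mylist.pkl and the temporary folder are my own choices for illustration, so that nothing on your disk gets overwritten.

```python
import os
import pickle
import tempfile

mylist = [0, 1, 3, 6, 10]

# Hypothetical file name inside a fresh temporary folder
path = os.path.join(tempfile.mkdtemp(), "mylist.pkl")

# pickle writes bytes, so the file is opened in binary mode ("wb")
with open(path, "wb") as f:
    pickle.dump(mylist, f)

# Reading it back returns a list again, not strings
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

Unlike the text file written above, a pickle file is not meant to be read by humans, but it brings back the Python object exactly as it was saved.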

How to write a list or numpy array to a file in Python?

Although there are several methods for doing this, I would suggest using a NumPy array instead of a plain Python list. We now rewrite the list example above, replacing the list with a NumPy array. I use NumPy more than lists myself, as it is usually easier and faster for computations.

import numpy as np
mylist=[]#you can name your list as you wish
a=0
for i in range(10):
    a+=i#this just means a=a+i (there is more to it!)
    mylist+=[a]#this is how to put values of each iteration into the list

mynparray=np.array(mylist)#we have now converted the list to a numpy array

In this example we converted a list to a NumPy array, but we do not need to do this. We can initialise a NumPy array directly as follows:

import numpy as np
numbers=np.zeros(20)#array of size 20 initialised with zeros
a=0
for i in range(20):
    numbers[i]=a+i#here we assign the value directly to position i of the array; we do not append as we did for the list in the last example
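As a side note, and only as a sketch, the same array can be built without a loop by using np.arange. The name numbers_direct is my own, and dtype=float is chosen to match what np.zeros gives by default.

```python
import numpy as np

# The loop from the example above
numbers = np.zeros(20)
a = 0
for i in range(20):
    numbers[i] = a + i

# The same values in one call; dtype=float matches the dtype of np.zeros
numbers_direct = np.arange(20, dtype=float)

# True when both arrays have the same shape and the same values
print(np.array_equal(numbers, numbers_direct))
```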

So we have learnt how to get a NumPy array, either by converting a list or by defining an array from the beginning. Now, how do we write a NumPy array to a file? This is explained very well on the official NumPy webpage. Let us go through an example here:

import numpy as np
numbers=np.zeros(20)#array of size 20 initialised with zeros
a=0
for i in range(20):
    numbers[i]=a+i
np.savetxt("myfile.dat", numbers)#writes one value per line

NumPy makes life very easy. In this case we do not need to open the file first; np.savetxt opens the file, writes the data and closes it for us.
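If you open the file written by np.savetxt you will see numbers in scientific notation. The sketch below shows how the fmt and header arguments of np.savetxt change that. The temporary folder and the header text "numbers" are my own choices for illustration.

```python
import os
import tempfile
import numpy as np

numbers = np.arange(5, dtype=float)
path = os.path.join(tempfile.mkdtemp(), "myfile.dat")  # hypothetical location

# Default: every value is written in scientific notation
np.savetxt(path, numbers)
with open(path) as f:
    default_first_line = f.readline().strip()

# fmt sets the number format; header adds a first line starting with "# "
np.savetxt(path, numbers, fmt="%.2f", header="numbers")
with open(path) as f:
    lines = f.read().splitlines()

print(default_first_line)
print(lines)
```

The header line starts with "#", which np.loadtxt treats as a comment by default, so the file can still be read back without extra work.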

How to write multiple numpy arrays to a single file in Python?

This also answers the following question.

How to stack multiple numpy arrays to a multidimensional numpy array?


import numpy as np
numbers=np.zeros(20)#array of size 20 initialised with zeros
myarray1=np.ones(20)#array of size 20 initialised with ones
myarray2=np.empty(20)#array of size 20 left uninitialised; its values are arbitrary until you assign them
a=0
for i in range(20):
    numbers[i]=a+i
dat=np.column_stack((numbers, myarray1, myarray2))#we stacked 3 numpy arrays as the columns of one array
np.savetxt("myfile.dat", dat, delimiter=" ")

In this example we stacked 3 arrays to create a new two-dimensional array dat with 20 rows and 3 columns, which we then wrote to the file. The resulting text file has one line per row, with the three columns separated by spaces.
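To see what the stacked array looks like, here is a short sketch. I use np.zeros in place of np.empty so that the printed values are predictable; the names first_row and first_column are only for illustration.

```python
import numpy as np

numbers = np.arange(20, dtype=float)
myarray1 = np.ones(20)
myarray2 = np.zeros(20)  # zeros instead of np.empty, so the values are predictable

# Each input array becomes one column of the result
dat = np.column_stack((numbers, myarray1, myarray2))

print(dat.shape)          # (rows, columns)

first_row = dat[0]        # one value from each of the 3 input arrays
first_column = dat[:, 0]  # all rows of column 0, i.e. the numbers array again

print(first_row)
print(first_column)
```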

How to read a data file as a numpy array in python?

We have learnt how to write data to a file using NumPy. Now we will learn how to import a data file into a program in order to analyse the data. This is a very easy task in NumPy.
import numpy as np
dataarray=np.loadtxt("myfolder/myfile.dat",dtype=float)#dtype=float reads every value as a floating point number
for i in range(len(dataarray)):
    print(dataarray[i])

Here dataarray is a NumPy array, because np.loadtxt returns one. Now you can do all kinds of calculations using your data.
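When the file has several columns, np.loadtxt can also hand them back one by one. The sketch below first writes a small 3-column file so that it can run on its own; the temporary folder and the names col0, col1, col2 and only_first are assumptions of mine.

```python
import os
import tempfile
import numpy as np

# Write a small 3-column file first, so the example is self-contained
dat = np.column_stack((np.arange(5, dtype=float), np.ones(5), np.zeros(5)))
path = os.path.join(tempfile.mkdtemp(), "myfile.dat")  # hypothetical location
np.savetxt(path, dat, delimiter=" ")

# Read the whole table back: one row per line of the file
dataarray = np.loadtxt(path, dtype=float)

# unpack=True transposes the result, giving one array per column
col0, col1, col2 = np.loadtxt(path, unpack=True)

# usecols reads only the listed columns (here just the first one)
only_first = np.loadtxt(path, usecols=(0,))

print(dataarray.shape)
print(col0)
print(only_first)
```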

How to do basic statistical analysis using numpy in python?

We can calculate the mean, variance and standard deviation very easily in NumPy. Let us use our imported data for the analysis.
import numpy as np
dataarray=np.loadtxt("myfolder/myfile.dat",dtype=float)
mean=np.mean(dataarray)#average of all values
var=np.var(dataarray)#variance
std=np.std(dataarray)#standard deviation (square root of the variance)
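To check these functions on numbers you can verify by hand, here is a small sketch with made-up data (the values are only for illustration). It also shows two optional arguments: ddof=1 for the sample variance and axis=0 for column-wise statistics on a two-dimensional array.

```python
import numpy as np

# Made-up data, chosen so the results are round numbers
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(data)  # sum of the values divided by their count
var = np.var(data)    # average squared distance from the mean (divides by N)
std = np.std(data)    # square root of var

# ddof=1 divides by N-1 instead of N (the "sample" variance)
sample_var = np.var(data, ddof=1)

# For a 2D array, axis=0 gives one result per column
table = np.column_stack((data, 2 * data))
column_means = np.mean(table, axis=0)

print(mean, var, std)
print(sample_var)
print(column_means)
```

By default np.var and np.std divide by N, so pass ddof=1 if you need the sample versions.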


This was our introduction to NumPy. It is a very powerful tool in Python, and I will go through some more examples in my next post. In case you have any questions, email me.



