Monday, June 23, 2014

In and Out Data Operations: Interpolating Measured Data Points with Scipy Splines

Hello, I came up with this example while reading a question on Stack Overflow.
I really like the pandas package, and I am looking forward to digging deeper into this great Python library.
Make_Spline_fr_Data

Interpolation of Data Points between Measured Points using Pandas for IO Data Operations and Scipy Splines

In [1]:
import numpy as np
In [2]:
import scipy as sc
import matplotlib.pyplot as plt
In [3]:
from pandas import read_csv,DataFrame
In [4]:
test=read_csv('test.txt', sep='\t', index_col=False,usecols=[0,1])  # we only take the first two columns to be safe
In [5]:
test
Out[5]:
x y
0 0 0.000
1 1 0.100
2 2 0.120
3 3 0.122
4 4 0.126
5 5 0.130
6 6 0.132
7 7 0.133
8 8 0.138
9 9 0.140
10 10 0.150

11 rows × 2 columns
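
In case test.txt is not at hand, the same table can be rebuilt in memory. This is only a sketch: it assumes the file is tab separated with a header line containing x and y, which is what the output above suggests, and it uses io.StringIO as a stand-in for the real file. The name raw is made up for this illustration.

```python
import io
from pandas import read_csv

# Assumed file layout: tab separated, header line "x<TAB>y";
# the y values are the ones shown in the table above
y_values = [0, 0.1, 0.12, 0.122, 0.126, 0.13, 0.132, 0.133, 0.138, 0.14, 0.15]
raw = "x\ty\n" + "\n".join(f"{xi}\t{yi}" for xi, yi in enumerate(y_values))

# io.StringIO stands in for 'test.txt'; the other arguments match the call above
test = read_csv(io.StringIO(raw), sep='\t', index_col=False, usecols=[0, 1])
print(test.shape)
```
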

In [29]:
test['y']
Out[29]:
0     0.000
1     0.100
2     0.120
3     0.122
4     0.126
5     0.130
6     0.132
7     0.133
8     0.138
9     0.140
10    0.150
Name: y, dtype: float64
In [7]:
x=test['x']
y=test['y']
#x=np.arange(0,11,1)
#y=np.array([0,0.1,0.12,0.122,0.126,0.13,0.132,0.133,0.138,0.14,0.15])
In [8]:
x
Out[8]:
0      0
1      1
2      2
3      3
4      4
5      5
6      6
7      7
8      8
9      9
10    10
Name: x, dtype: int64

Plotted directly, our data does not give a very smooth curve!

In [31]:
plt.scatter(x,y)
plt.plot(x,y)
plt.xlabel(test.columns.values[0]+' measured')
plt.ylabel(test.columns.values[1]+' measured')
plt.show()

Let us use splines to get a smooth curve through the measured points

In [11]:
from scipy.interpolate import splrep, splev

inter= splrep(x, y, w=None, xb=None, xe=None, k=3, task=0, s=None, t=None,full_output=0, per=0, quiet=1)  # cubic spline (k=3) with standard parameters
In [12]:
incr=0.1  # step size of the grid between measured points
x_between=np.arange(x[0],x[10],incr)  # note: np.arange leaves out the end value x[10]
In [13]:
x_between
Out[13]:
array([ 0. ,  0.1,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9,  1. ,
        1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  1.9,  2. ,  2.1,
        2.2,  2.3,  2.4,  2.5,  2.6,  2.7,  2.8,  2.9,  3. ,  3.1,  3.2,
        3.3,  3.4,  3.5,  3.6,  3.7,  3.8,  3.9,  4. ,  4.1,  4.2,  4.3,
        4.4,  4.5,  4.6,  4.7,  4.8,  4.9,  5. ,  5.1,  5.2,  5.3,  5.4,
        5.5,  5.6,  5.7,  5.8,  5.9,  6. ,  6.1,  6.2,  6.3,  6.4,  6.5,
        6.6,  6.7,  6.8,  6.9,  7. ,  7.1,  7.2,  7.3,  7.4,  7.5,  7.6,
        7.7,  7.8,  7.9,  8. ,  8.1,  8.2,  8.3,  8.4,  8.5,  8.6,  8.7,
        8.8,  8.9,  9. ,  9.1,  9.2,  9.3,  9.4,  9.5,  9.6,  9.7,  9.8,
        9.9])
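
The grid above stops at 9.9 because np.arange leaves out its end value, so the last measured point at x = 10 is not part of it. If the end point should be included, np.linspace is one option. The following is a minimal sketch; the names grid_arange, grid_linspace and n_points are only used for this illustration.

```python
import numpy as np

x_start, x_end, incr = 0, 10, 0.1  # same range and step as above

# np.arange leaves out the end value, so the grid stops short of x_end
grid_arange = np.arange(x_start, x_end, incr)

# np.linspace includes both ends; the number of points is derived from the step
n_points = int(round((x_end - x_start) / incr)) + 1
grid_linspace = np.linspace(x_start, x_end, n_points)

print(grid_arange[-1], grid_linspace[-1])
```
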
In [14]:
inter
Out[14]:
(array([  0.,   0.,   0.,   0.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,
         10.,  10.,  10.,  10.]),
 array([ -1.72132773e-18,   1.09261823e-01,   1.21476354e-01,
          1.21118802e-01,   1.26137825e-01,   1.30329897e-01,
          1.32542587e-01,   1.31499755e-01,   1.42111275e-01,
          1.36944363e-01,   1.50000000e-01,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00]),
 3)
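
The tuple shown above consists of the knot positions, the B-spline coefficients and the degree of the spline. As a quick sanity check, the sketch below evaluates the spline at the measured x positions and compares the result with the measured y values. It assumes the data is typed in directly instead of read from test.txt, and it relies on splrep using no smoothing (s = 0) when no weights are passed.

```python
import numpy as np
from scipy.interpolate import splrep, splev

# Same numbers as in the table above, typed in directly
x = np.arange(0, 11, 1)
y = np.array([0, 0.1, 0.12, 0.122, 0.126, 0.13, 0.132, 0.133, 0.138, 0.14, 0.15])

# Unpack the three parts of the spline representation
knots, coeffs, degree = splrep(x, y, k=3)  # no weights given, so no smoothing

# Evaluate the spline at the measured x positions and compare with y
y_at_x = splev(x, (knots, coeffs, degree))
print(degree, len(knots), np.allclose(y_at_x, y))
```
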
In [15]:
extr=splev(x_between, inter, der=0, ext=0)  # 'inter' holds three items (knots, coefficients, degree); we feed our spline evaluation function with them
In [33]:
plt.plot(x_between,extr,label='Spline')
plt.scatter(x,y,label='Data Points')
plt.xlabel(test.columns.values[0])
plt.ylabel(test.columns.values[1])
plt.legend(loc='best')
plt.show()
In [25]:
extr[:10]  # extr holds the spline values, i.e. the interpolation between the x points
Out[25]:
array([ -1.72132773e-18,   1.56428288e-02,   2.98405538e-02,
         4.26648531e-02,   5.41874050e-02,   6.44798877e-02,
         7.36139794e-02,   8.16613582e-02,   8.86937025e-02,
         9.47826904e-02])

Now we want to write our interpolated data back to a file

In [19]:
interp_points=np.concatenate((x_between[None].T,extr[None].T),axis=1)  # turn both 1-D arrays into columns and join them side by side
In [20]:
inter_col_names=test.columns.values.tolist()#making a list from column names
inter_col_names[0]=inter_col_names[0]+'_between'  # and renaming them
inter_col_names[1]=inter_col_names[1]+'_interpolated'
In [21]:
inter_col_names
Out[21]:
['x_between', 'y_interpolated']
In [22]:
DF=DataFrame(interp_points,columns=inter_col_names)#making a DataFrame of our new Data
In [36]:
DF[:5]  # showing that the range between the measured x points is now sampled in steps of 0.1
Out[36]:
x_between y_interpolated
0 0.0 -1.721328e-18
1 0.1 1.564283e-02
2 0.2 2.984055e-02
3 0.3 4.266485e-02
4 0.4 5.418741e-02

5 rows × 2 columns

In [24]:
DF.to_csv('test_interp.txt',sep='\t',index=False)  # rewriting with tab delimiter and new column names
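
To check that a file written this way can be read back, a small round trip may help. This is a sketch under a few assumptions: the table demo and the file name demo_interp.txt are made up for illustration, and the file goes to a temporary directory. Note that to_csv writes commas unless sep is given, so sep='\t' is passed explicitly here.

```python
import os
import tempfile
import numpy as np
from pandas import DataFrame, read_csv

# Small stand-in for the interpolated table (values are illustrative only)
demo = DataFrame({'x_between': [0.0, 0.1, 0.2],
                  'y_interpolated': [0.0, 0.0156, 0.0298]})

# Write to a temporary file with an explicit tab separator
path = os.path.join(tempfile.mkdtemp(), 'demo_interp.txt')
demo.to_csv(path, sep='\t', index=False)

# Read it back with the same separator and compare with the original
back = read_csv(path, sep='\t')
print(np.allclose(back.values, demo.values))
```
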