I: Preparations

First, we need to import the necessary packages and prepare to make plots.

In [1]:
import numpy as np
from matplotlib import pyplot as plt
import fit
In [2]:
%matplotlib tk

II: A First Fit

Before doing a fit, you need to guess the functional form of the data. The best way to do that is to plot the data first, so let's load some data and make a plot.

In [3]:
x1, y1 = np.loadtxt('fittable1.csv', delimiter=',', unpack=True)
plt.plot(x1, y1)
Out[3]:
[<matplotlib.lines.Line2D at 0x7fd248603390>]

The fit module provides a class, Fit, which fits a model function to data. A Fit has three required positional arguments:

  • x_data: an np.ndarray containing the x-data.
  • y_data: an np.ndarray containing the y-data.
  • model: a function representing your guess for the form of the data. The function should return an np.ndarray containing y-values, and it should take the following inputs:

    • an np.ndarray of x-values
    • however many float parameters are needed to describe the curve.

The particular data we have look linear. A model function fit.linear is already defined, so we'll use that.
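
A linear model function presumably looks something like the following sketch (the exact definition of fit.linear may differ):

def linear(x, m, b):
    # straight line: slope m and intercept b, evaluated at an array of x-values
    return m*x + b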

In [4]:
f1 = fit.Fit(x1, y1, fit.linear)

We can print the fit parameters using

Fit.print_results(sig_figs=False)

If sig_figs is True, only significant digits will be printed; otherwise, all digits will be printed.

We can easily get a plot using

Fit.show_plot(new_fig=True)

If new_fig is True, a new figure will be created; otherwise, a line representing the fit will be added to the active pyplot axes.

In [5]:
f1.print_results()
f1.show_plot(False)
-- Best Fit Results -----------------------------------
 m = 2.0074603518145326 ± 0.005582184193047931 (0.278%)
 b = 1.1689747904285182 ±  0.03238949963874929 (2.771%)
-------------------------------------------------------
Out[5]:
(<Figure size 640x480 with 1 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd248ac4f98>)
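
Because show_plot returns the figure and axes it draws on (as in the output above), the plot can be customized and saved afterwards. A quick sketch, where the axis labels and filename are just placeholders:

fig, ax = f1.show_plot()
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.savefig('linear_fit.png')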

If we want to use the fit results in calculations, we can access them through one of three attributes, all of which are dictionary-like objects whose keys are the parameter names from the model function.

  1. Fit.fit_values
  2. Fit.fit_errors
  3. Fit.best_fit_parameters

In the first, the values of the dictionary are the fit parameters:

In [6]:
f1.fit_values
Out[6]:
{'m': 2.0074603518145326, 'b': 1.1689747904285182}

In the second, the values of the dictionary are the errors:

In [7]:
f1.fit_errors
Out[7]:
{'m': 0.005582184193047931, 'b': 0.03238949963874929}

In the third, the values of the dictionary are tuples of (value_of_fit_parameter, error).

In [8]:
f1.best_fit_parameters
Out[8]:
OrderedDict([('m', (2.0074603518145326, 0.005582184193047931)),
             ('b', (1.1689747904285182, 0.03238949963874929))])
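
Since each value in best_fit_parameters is a (value, error) tuple, both numbers can be unpacked in one step, for example:

m, dm = f1.best_fit_parameters['m']
b, db = f1.best_fit_parameters['b']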

Divide the slope by the intercept without typing (or copying and pasting) any numbers.

In [9]:
# CISV
f1.fit_values['m']/f1.fit_values['b']
Out[9]:
1.7172828432670013
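
If we also want the uncertainty on this ratio, the errors are available in the same way. A minimal sketch of standard error propagation for a quotient, assuming the fit parameters are uncorrelated:

m, b = f1.fit_values['m'], f1.fit_values['b']
dm, db = f1.fit_errors['m'], f1.fit_errors['b']
ratio = m/b
# relative errors add in quadrature for a quotient of uncorrelated quantities
ratio_err = abs(ratio) * np.sqrt((dm/m)**2 + (db/b)**2)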

III: Another Fit

Load some more data and make a plot.

In [10]:
x2, y2 = np.loadtxt('fittable2.csv', delimiter=',', unpack=True)
plt.plot(x2, y2)
Out[10]:
[<matplotlib.lines.Line2D at 0x7fd24857dc18>]

This looks quadratic. The fit module does not provide a quadratic model function, so we have to define one ourselves.

In [11]:
# CISV
def quadratic(x, a, b, c):
    return a*x**2 + b*x + c

Now create the fit, print the parameters, and show the results on the plot.

In [12]:
f2 = fit.Fit(x2, y2, quadratic)
f2.print_results()
f2.show_plot()
-- Best Fit Results -------------------------------------
 a =  2.0010899311325625 ± 0.0009519405181984401 (0.048%)
 b =  1.4961402334744212 ±  0.004963983605141094 (0.332%)
 c = -3.0298968321880064 ±   0.04342067436055229 (1.433%)
---------------------------------------------------------
Out[12]:
(<Figure size 640x480 with 1 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd24859c748>)

This looks pretty good.

Here are the results with significant digits.

In [13]:
f2.print_results(sig_figs=True)
-- Best Fit Results ----------------
 a = (2.001 ± 0.001) × 10⁺⁰ (0.048%)
 b = (1.496 ± 0.005) × 10⁺⁰ (0.332%)
 c = (-3.03 ±  0.04) × 10⁺⁰ (1.433%)
------------------------------------

IV: A Few More Fits

In [14]:
x3, y3 = np.loadtxt('fittable3.csv', delimiter=',', unpack=True)
plt.plot(x3, y3)
Out[14]:
[<matplotlib.lines.Line2D at 0x7fd2484fff60>]
In [15]:
# CISV
def gaussian(x, a, b, c):
    # Gaussian peak: amplitude a, center b, width (standard deviation) c
    return a*np.exp( -(x-b)**2/(2*c**2) )
In [16]:
# CISV
f3 = fit.Fit(x3, y3, gaussian)
f3.print_results()
f3.show_plot()
-- Best Fit Results -----------------------------------
 a = 10.539059499361839 ±   0.1271607248286727 (1.207%)
 b = 0.9734286318487347 ± 0.013142830945664347 (1.350%)
 c = 0.9433409459987722 ± 0.013142830888085547 (1.393%)
-------------------------------------------------------
Out[16]:
(<Figure size 640x480 with 1 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd248568b00>)
In [17]:
f3.print_results(True)
-- Best Fit Results ----------------
 a = (1.054 ± 0.013) × 10⁺¹ (1.207%)
 b =  (9.73 ±  0.13) × 10⁻¹ (1.350%)
 c =  (9.43 ±  0.13) × 10⁻¹ (1.393%)
------------------------------------

In [18]:
x4, y4 = np.loadtxt('fittable4.csv', delimiter=',', unpack=True)
plt.plot(x4, y4)
Out[18]:
[<matplotlib.lines.Line2D at 0x7fd24848a2e8>]
In [19]:
# CISV
def decaying_sine(x, A, b, k, ph0):
    # damped sine: amplitude A, decay constant b, angular frequency k, phase ph0
    return A * np.exp(-b*x) * np.sin(k*x + ph0)
In [20]:
# CISV
f4 = fit.Fit(x4, y4, decaying_sine)
f4.print_results()
f4.show_plot(False)
-- Best Fit Results --------------------------------------------
   A =     2.9956252835732307 ±  0.005985062404343675   (0.200%)
   b =     0.4998692642174855 ± 0.0013333431888298878   (0.267%)
   k =      2.299213179616869 ± 0.0011494488734127123   (0.050%)
 ph0 = -0.0003784119436911972 ± 0.0016169935017549706 (427.310%)
----------------------------------------------------------------
Out[20]:
(<Figure size 640x480 with 1 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd248568b00>)
In [21]:
f4.print_results(True)
-- Best Fit Results ----------------------
   A =  (2.996 ±  0.006) × 10⁺⁰   (0.200%)
   b =  (4.999 ±  0.013) × 10⁻¹   (0.267%)
   k = (2.2992 ± 0.0011) × 10⁺⁰   (0.050%)
 ph0 =     (-4 ±     16) × 10⁻⁴ (427.310%)
------------------------------------------

In [22]:
x5, y5 = np.loadtxt('fittable5.csv', delimiter=',', unpack=True)
plt.plot(x5, y5)
Out[22]:
[<matplotlib.lines.Line2D at 0x7fd2484a56d8>]
In [23]:
# CISV
def slit(x, I0, a, d, λ, L):
    # double-slit intensity pattern: peak intensity I0, slit width a,
    # slit separation d, wavelength λ, slit-to-screen distance L
    arg1 = np.pi*a*x/(λ*L)
    arg2 = np.pi*d*x/(λ*L)
    return I0 * (np.sin(arg1)/arg1)**2 * np.cos(arg2)**2
In [24]:
# CISV
f5 = fit.Fit(x5, y5, slit)
f5.show_plot(False)
f5.print_results()
-- Best Fit Results -------------------------------------------------------
 I0 = 0.11685405654155188 ±   0.007827637103272663                 (6.699%)
  a =   78164.87804571819 ± 2.3917583082330165e+19 (30598887480309136.000%)
  d = -6795.0349070973225 ±     257487721020.72696        (3789350967.893%)
  λ =  5441.9499774431015 ±                    nan                   (nan%)
  L =   5441.949977448237 ±  2.369522180570806e+18 (43541785396599488.000%)
---------------------------------------------------------------------------

That's no good! We have to supply initial guesses for the parameters. We create a list containing a guess for each parameter, in the order in which they appear in the model function's signature:

guesses = [I0, a, d, λ, L]

CISV (code line only)

These are supplied to the initial_parameters keyword argument.

In [25]:
# CISV
guesses = [1.2, 0.04e-3, 0.25e-3, 400e-9, 1.0]

f5b = fit.Fit(x5, y5, slit, initial_parameters=guesses)
f5b.print_results()
f5b.show_plot()
-- Best Fit Results ---------------------------------------------
 I0 =     1.2009636260272332 ± 0.0009592480079069257     (0.080%)
  a =  0.0003573077226141113 ±    0.1179269849114221 (33004.320%)
  d =  0.0022317365963676274 ±    0.7365700101966483 (33004.343%)
  λ = 1.5230191123086274e-06 ±                   nan       (nan%)
  L =        3.8101043816391 ±                   nan       (nan%)
-----------------------------------------------------------------
Out[25]:
(<Figure size 640x480 with 1 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd24845a208>)
In [26]:
f5b.print_results(True)
-- Best Fit Results ---------------------
 I0 = (1.201 ± 0.001) × 10⁺⁰     (0.080%)
  a =     (0 ±    12) × 10⁻² (33004.320%)
  d =     (0 ±     7) × 10⁻¹ (33004.343%)
  λ = (1.523 ±   nan) × 10⁻⁶       (nan%)
  L = (3.810 ±   nan) × 10⁺⁰       (nan%)
-----------------------------------------

This is much better, but some of the values are still nonsense, so much so that uncertainties cannot even be estimated for some parameters. We can apply bounds to the guesses by supplying a tuple of three floats, of the form (guess, lower bound, upper bound), instead of a single float. Assume that the wavelength is the only parameter that is not already fairly well known.

In [27]:
# CISV
guesses = [(1.20, 1.19, 1.21),
           (0.040e-3, 0.039e-3, 0.041e-3),
           (0.25e-3, 0.24e-3, 0.26e-3),
           650e-9,
           (1.00, 0.99, 1.11)]
f5c = fit.Fit(x5, y5, slit, initial_parameters=guesses)
f5c.print_results()
f5c.show_plot()
-- Best Fit Results ------------------------------------------
 I0 =    1.2009212551316928 ±  0.0009638415177234494  (0.080%)
  a = 4.001417930441337e-05 ± 1.4095568730459452e-05 (35.226%)
  d = 0.0002498901336648699 ±  8.799346914063234e-05 (35.213%)
  λ = 6.495220141508169e-07 ±  6.655256855630443e-10  (0.102%)
  L =    1.0003421843885336 ±    0.35312864809316014 (35.301%)
--------------------------------------------------------------
Out[27]:
(<Figure size 640x480 with 1 Axes>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd2483b8748>)
In [28]:
f5c.print_results(True)
-- Best Fit Results ------------------
 I0 = (1.201 ± 0.001) × 10⁺⁰  (0.080%)
  a =   (4.0 ±   1.4) × 10⁻⁵ (35.226%)
  d =   (2.5 ±   0.9) × 10⁻⁴ (35.213%)
  λ = (6.495 ± 0.007) × 10⁻⁷  (0.102%)
  L =   (1.0 ±   0.4) × 10⁺⁰ (35.301%)
--------------------------------------
In [ ]: