Hooke’s law of elasticity: F = -kx. The force, F, stored in a spring is linearly related to the distance, x, the spring has been compressed (or stretched); k is called the spring constant. For example, how much does a rider have to weigh to compress the spring on a motorbike by 1 cm? ($k \approx 35{,}000\ \mathrm{N/m}$)
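A quick sanity check of the arithmetic, taking g ≈ 9.81 m/s² (the variable names here are just for illustration):

```python
g = 9.81         # gravitational acceleration, m/s^2
k = 35000        # spring constant from the problem statement, N/m
x = 0.01         # compression of 1 cm, in meters
F = k * x        # force required to compress the spring: 350 N
m = F / g        # rider mass whose weight produces that force
print(round(F), 'N ->', round(m, 1), 'kg')
```

So a rider of roughly 36 kg would compress the spring by 1 cm.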
Finding k
We can’t conduct the experiment perfectly, so a better approach is to hang a series of increasingly heavy weights on the spring, measure the stretch of the spring each time, and plot the results.
First we should know:

F = -kx
k = -F/x
k = 9.81·m/x

(The magnitude of the force exerted by a hanging mass m is 9.81·m newtons, so each measured stretch x gives an estimate of k.)
Then we collected some data; each line of the file pairs a measured distance (in meters) with a mass (in kilograms). The first data line looks like:

```
0.0865 0.1
```
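Plugging this single measurement into k = 9.81·m/x shows why one point isn’t enough; the estimate differs noticeably from the value a full fit produces later in these notes:

```python
# Spring-constant estimate from a single (distance, mass) measurement.
d = 0.0865          # stretch in meters (first data point above)
m = 0.1             # hanging mass in kilograms
k = 9.81 * m / d    # single-point estimate of the spring constant, N/m
print(round(k, 2))
```

A single noisy measurement gives k ≈ 11.3 N/m, which is why we fit a line to many points instead.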
Plotting the data, we get:
The next step is to fit curves to the data.

When we fit a curve to a set of data, we are finding a fit that relates an independent variable (the mass) to an estimated value of a dependent variable (the distance).

To fit curves to data, we need to define an objective function that provides a quantitative assessment of how well the curve fits the data. In this case the curve is a straight line, i.e., a linear function.

Finding the best fit is then an optimization problem. The most commonly used objective function is called least squares: minimize $\sum_{i}(observed_i - predicted_i)^2$ over the observations.
Next, we use a successive approximation algorithm to find the best least-squares fit. PyLab provides a built-in function, polyfit:

```python
pylab.polyfit(observedXVals, observedYVals, n)
```
where n is the degree of the polynomial; the function returns the coefficients of a degree-n polynomial that provides a best least-squares fit for the set of points defined by the arrays observedXVals and observedYVals.

Visualizing the Fit
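To make the call concrete, here is a tiny degree-1 fit; the data values are made up for illustration, and np.polyfit is the same function pylab re-exports:

```python
import numpy as np

# Hypothetical observations (made-up numbers, for illustration only)
observedXVals = np.array([0.0, 1.0, 2.0, 3.0])
observedYVals = np.array([0.1, 2.1, 3.9, 6.2])

# Degree-1 fit: returns (slope, intercept) of the least-squares line
a, b = np.polyfit(observedXVals, observedYVals, 1)
estYVals = a * observedXVals + b   # estimated values to plot against the data
print(a, b)
```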
```python
def getData(fileName):
    ...
```
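The body of getData was elided above. A minimal sketch, assuming the file’s first line is a header and each subsequent line holds a whitespace-separated distance and mass (the return order is an assumption):

```python
def getData(fileName):
    # Read whitespace-separated (distance, mass) pairs, skipping a header line.
    distances, masses = [], []
    with open(fileName) as f:
        f.readline()  # discard the header
        for line in f:
            d, m = line.split()
            distances.append(float(d))
            masses.append(float(m))
    return masses, distances
```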
Fitting a line to the full data set gives the estimate k = 21.53686.
Next, let’s fit curves to these mystery data:
How Good Are These Fits?

We can see that the quadratic model is better than the linear model, but how bad a fit is the line, and how good is the quadratic fit?
One way to answer this is to compare the mean squared error of each model’s predictions:
```python
def aveMeanSquareError(data, predicted):
    ...
```
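The body was elided; a minimal sketch of the mean-squared-error computation:

```python
def aveMeanSquareError(data, predicted):
    # Average of the squared differences between observed and predicted values.
    error = 0.0
    for i in range(len(data)):
        error += (data[i] - predicted[i])**2
    return error / len(data)
```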
$R^{2}=1-\frac{\sum_{i}(y_i-p_i)^2}{\sum_{i}(y_i-\mu)^2}$, where $y_i$ are the observed values, $p_i$ the predicted values, and $\mu$ the mean of the observed values.
```python
def rSquared(observed, predicted):
    ...
```
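A sketch of rSquared, computed as one minus the ratio of mean squared error to the variance of the observed values (equivalent to the formula above):

```python
import numpy as np

def rSquared(observed, predicted):
    # R^2 = 1 - (mean squared residual) / (variance of the observed values)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = ((predicted - observed)**2).sum()
    meanError = error / len(observed)
    return 1 - (meanError / np.var(observed))
```

A perfect fit gives R² = 1; a model no better than always predicting the mean gives R² = 0.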
Testing Goodness of Fits
```python
def genFits(xVals, yVals, degrees):
    ...
```
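A sketch of genFits: fit one polynomial per requested degree and collect the coefficient arrays (np.polyfit stands in for pylab.polyfit, which it re-exports):

```python
import numpy as np

def genFits(xVals, yVals, degrees):
    # Fit one least-squares polynomial per requested degree.
    models = []
    for d in degrees:
        model = np.polyfit(xVals, yVals, d)
        models.append(model)
    return models
```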
Why We Build Models
How Mystery Data Was Generated
```python
def genNoisyParabolicData(a, b, c, xVals, fName):
    ...
```
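A sketch of the generator: evaluate the parabola at each x, add Gaussian noise, and write the points to a file. The noise standard deviation of 35 and the file layout are assumptions:

```python
import random

def genNoisyParabolicData(a, b, c, xVals, fName):
    # Sample y = a*x**2 + b*x + c at each x and perturb it with Gaussian noise
    # (a standard deviation of 35 is an assumed choice, not from these notes).
    yVals = []
    for x in xVals:
        theoreticalVal = a*x**2 + b*x + c
        yVals.append(theoreticalVal + random.gauss(0, 35))
    with open(fName, 'w') as f:
        f.write('x y\n')
        for x, y in zip(xVals, yVals):
            f.write(str(x) + ' ' + str(y) + '\n')
```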
Let’s Look at Two Data Sets
```python
#parameters for generating data
...
```
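The parameter block was truncated; plausible values consistent with the experiment described here (a noisy parabola fit at several degrees, including 16) might look like:

```python
#parameters for generating data (values are plausible guesses, not confirmed)
xVals = range(-10, 11, 1)    # 21 x values from -10 to 10
a, b, c = 3.0, 0.0, 0.0      # underlying curve: y = 3*x**2
degrees = (2, 4, 8, 16)      # polynomial degrees to fit and compare
```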
Hence Degree 16 Is the Tightest Fit, But...
Cross Validate
```python
pylab.figure()
...
```
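The idea behind the cross-validation step: fit models on one data set and measure R² on the other. A minimal, plotting-free sketch under those assumptions (the rSquared helper repeats the definition from earlier):

```python
import numpy as np

def rSquared(observed, predicted):
    # Same R^2 measure as above: 1 - residual error / variance of observed.
    error = ((np.asarray(predicted) - np.asarray(observed))**2).sum()
    return 1 - (error / len(observed)) / np.var(observed)

rng = np.random.default_rng(0)
xVals = np.arange(-10, 11, dtype=float)
# Two independent noisy samples of the same underlying parabola y = 3*x**2
yVals1 = 3*xVals**2 + rng.normal(0, 35, len(xVals))
yVals2 = 3*xVals**2 + rng.normal(0, 35, len(xVals))

r2 = {}
for degree in (2, 16):
    model = np.polyfit(xVals, yVals1, degree)  # train on data set 1
    preds = np.polyval(model, xVals)
    r2[degree] = rSquared(yVals2, preds)       # score against data set 2
print(r2)
```

A tight fit on the training data (degree 16) tends to generalize worse to the second data set than the model matching the true degree.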
Increasing the Complexity
If we fit a quadratic model, y = ax^2 + bx + c, to data that lie exactly on the line y = x, we get y = 0x^2 + 1x + 0, which is y = x.

If we add a little noise to the data and fit y = ax^2 + bx + c again, we get y = 0.025x^2 + 0.955x + 0.005, which is approximately y = x.

Fitting y = ax^2 + bx + c once more and plotting the result, we got:
```python
model = pylab.polyfit(xVals, yVals, 1)
```
Conclusion
Let D be the original data set. In leave-one-out cross-validation, each trial leaves one example out, builds a model on the remaining examples, and tests it on the one left out:

```python
testResults = []
...
```
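The loop body was elided above. A runnable leave-one-out sketch for our curve-fitting setting, where polyfit as the model builder and squared error as the test metric are both assumptions:

```python
import numpy as np

def leaveOneOut(D, degree):
    # D is a list of (x, y) pairs; returns one test error per held-out point.
    testResults = []
    for i in range(len(D)):
        training = D[:i] + D[i+1:]           # leave example i out
        xs, ys = zip(*training)
        model = np.polyfit(xs, ys, degree)   # train on the rest
        x, y = D[i]
        testResults.append((np.polyval(model, x) - y)**2)
    return testResults
```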
Let D again be the original data set, and let n be the number of random samples. In repeated random sampling, each trial randomly partitions D into a training set and a test set:

```python
testResults = []
...
```
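A runnable sketch of the repeated-random-sampling variant; the 25% test fraction and the mean-squared-error metric are assumed choices:

```python
import random
import numpy as np

def repeatedRandomSampling(D, degree, n, testFraction=0.25):
    # n trials: randomly split D into training and test sets each time.
    testResults = []
    for _ in range(n):
        sample = D[:]
        random.shuffle(sample)
        k = int(len(sample) * testFraction)
        testSet, training = sample[:k], sample[k:]
        xs, ys = zip(*training)
        model = np.polyfit(xs, ys, degree)
        mse = sum((np.polyval(model, x) - y)**2 for x, y in testSet) / k
        testResults.append(mse)
    return testResults
```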
Task: Model how the mean daily high temperature in the U.S. varied from 1961 through 2015
The Whole Data Set:
Results:
The line seems to be the winner.