Consumer Expenditure Survey (CEX) data #5
Comments
Option (1) above requires creating an interpolating function to fit the data.
# Import libraries
import numpy as np
import scipy.interpolate as si
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
...
# Read in (create) the data
fert_data = (np.array([0.0, 0.0, 0.3, 12.3, 47.1, 80.7, 105.5, 98.0,
                       49.3, 10.4, 0.8, 0.0, 0.0]) / 2000)
age_midp = np.array([9, 10, 12, 16, 18.5, 22, 27, 32, 37, 42, 47,
                     55, 56])
# Create the scatter plot
fig, ax = plt.subplots()
plt.scatter(age_midp, fert_data, s=70, c='blue', marker='o', label='Data')
minorLocator = MultipleLocator(1)
ax.xaxis.set_minor_locator(minorLocator)
plt.grid(True, which='major', color='0.65', linestyle='-')
plt.title('Average fertility rates by age bin ($f_{s}$)', fontsize=20)
plt.xlabel(r'Age $s$')
plt.ylabel(r'Fertility rate $f_{s}$')
plt.xlim((1, 60))
plt.ylim((-0.01, 1.15*(fert_data.max())))
plt.text(-5, -0.022, "Source: National Vital Statistics Reports, " +
"Volume 64, Number 1, January 15, 2015.", fontsize=9)
plt.tight_layout(rect=(0, 0.03, 1, 1))
Now we can fit a function to these points.
# Generate interpolation function for fertility rates
fert_func = si.interp1d(age_midp, fert_data, kind='cubic')
# Use interpolation function to get interpolated values
age_fine = np.linspace(1, 100, 10000)
age_fine_sub = (age_fine >= age_midp.min()) & (age_fine <= age_midp.max())
fert_rates_fine = np.zeros_like(age_fine)
fert_rates_fine[age_fine_sub] = fert_func(age_fine[age_fine_sub])
# Plot interpolated values and original data
fig, ax = plt.subplots()
plt.scatter(age_midp, fert_data, s=70, c='blue', marker='o', label='Data')
plt.plot(age_fine, fert_rates_fine, label='Cubic spline')
minorLocator = MultipleLocator(1)
ax.xaxis.set_minor_locator(minorLocator)
plt.grid(True, which='major', color='0.65', linestyle='-')
plt.title('Fitted cubic spline fertility rates by age ($f_{s}$)', fontsize=20)
plt.xlabel(r'Age $s$')
plt.ylabel(r'Fertility rate $f_{s}$')
plt.xlim((1, 100))
plt.ylim((-0.01, 1.15*(fert_data.max())))
plt.legend(loc='upper right')
plt.text(-5, -0.022, "Source: National Vital Statistics Reports, " +
"Volume 64, Number 1, January 15, 2015.", fontsize=9)
plt.tight_layout(rect=(0, 0.03, 1, 1))
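The code above zeroes out ages outside the data range with a boolean mask. As a possible alternative (a sketch, not part of the original comment), `interp1d` accepts `bounds_error=False` and `fill_value`, which should give the same result in one call. The names `fert_func_filled`, `fert_rates_filled`, and `in_range` are introduced here for illustration only.

```python
import numpy as np
import scipy.interpolate as si

# Same data as in the comment above
fert_data = (np.array([0.0, 0.0, 0.3, 12.3, 47.1, 80.7, 105.5, 98.0,
                       49.3, 10.4, 0.8, 0.0, 0.0]) / 2000)
age_midp = np.array([9, 10, 12, 16, 18.5, 22, 27, 32, 37, 42, 47,
                     55, 56])
age_fine = np.linspace(1, 100, 10000)

# Alternative: let interp1d return 0.0 outside the range of age_midp
fert_func_filled = si.interp1d(age_midp, fert_data, kind='cubic',
                               bounds_error=False, fill_value=0.0)
fert_rates_filled = fert_func_filled(age_fine)

# Mask-based version from the comment, reproduced for comparison
fert_func = si.interp1d(age_midp, fert_data, kind='cubic')
in_range = (age_fine >= age_midp.min()) & (age_fine <= age_midp.max())
fert_rates_masked = np.zeros_like(age_fine)
fert_rates_masked[in_range] = fert_func(age_fine[in_range])

# Both approaches should agree on the whole grid
same = np.allclose(fert_rates_filled, fert_rates_masked)
```

A cubic fit may dip slightly below zero near flat stretches of the data, so clipping with `np.clip(fert_rates_filled, 0.0, None)` is one option if negative rates would be a problem.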
In calibrating this model, we have to incorporate consumption data. The best source for consumption expenditures in the United States is the Consumer Expenditure Survey (CEX). We need data on household consumer expenditure by age of the primary respondent (head of household). I see two ways that we can do this.
"Consumption over the Life Cycle: Facts from Consumer Expenditure Survey Data" (REStat, 89:3, Aug. 2007). This paper calculates exactly the lifecycle consumption profiles by age that we are interested in using the CEX microdata. For our calibration, we would probably want to average data from two or three of the most recent surveys in order to get rid of any noise that comes with the fine granularity of one-year age bins.
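The averaging step suggested above could look like the following minimal sketch. Everything numeric here is a synthetic placeholder, not CEX data: the age range, the hump-shaped `base_profile`, and the noise level are assumptions chosen only to show the mechanics of averaging one-year age profiles across three surveys.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-year age bins and a made-up hump-shaped profile
ages = np.arange(20, 81)
base_profile = 40.0 + 0.5 * (ages - 20) - 0.01 * (ages - 20) ** 2

# Three synthetic "survey years": the same profile plus independent noise
surveys = base_profile + rng.normal(0.0, 3.0, size=(3, ages.size))

# Average across surveys, age by age
avg_profile = surveys.mean(axis=0)

# Mean absolute deviation from the underlying profile, before and after
err_single = np.abs(surveys - base_profile).mean()
err_avg = np.abs(avg_profile - base_profile).mean()
```

With real data the underlying profile is unknown, so the comparison of `err_single` and `err_avg` is only possible in a synthetic setting like this one.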
Each of these methods has unique advantages and disadvantages. Method 1 is less precise but significantly easier than method 2, although it is tricky to estimate a smooth curve whose integral over a particular portion (or even average across discrete one-year age bins) equals the average in the summary data. Method 2 is more accurate, although it might be significantly harder than method 1 because of the need to access, manipulate, and clean the source data. Method 2 also includes more noise in the averages from year to year.
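One possible way to handle the difficulty noted for method 1 is sketched below. It is a suggestion, not something proposed in this issue: interpolate the cumulative total at the bin edges with a monotone interpolator (`scipy.interpolate.PchipInterpolator`) and take one-year differences, so that the average across the one-year ages in each bin matches the bin average by construction. The bin edges and bin averages are hypothetical placeholders, not CEX summary data.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical age-bin edges and bin averages (placeholders, NOT CEX data)
bin_edges = np.array([20, 25, 35, 45, 55, 65, 75, 85])
bin_avg = np.array([30.0, 45.0, 55.0, 60.0, 52.0, 44.0, 35.0])

# Cumulative total at each bin edge: sum of (bin average * bin width)
widths = np.diff(bin_edges)
cum_total = np.concatenate(([0.0], np.cumsum(bin_avg * widths)))

# Monotone interpolation of the cumulative total
cum_func = PchipInterpolator(bin_edges, cum_total)

# One-year values are differences of the cumulative curve
ages = np.arange(bin_edges[0], bin_edges[-1])
value_by_age = cum_func(ages + 1) - cum_func(ages)

# Check: the mean of the one-year values in each bin recovers the bin average
bin_index = np.digitize(ages, bin_edges) - 1
implied_avg = np.array([value_by_age[bin_index == k].mean()
                        for k in range(bin_avg.size)])
```

Because the interpolated cumulative curve passes through the totals at every bin edge, the differences telescope within each bin and `implied_avg` should match `bin_avg` up to floating-point error. The resulting profile may look less smooth than a cubic spline through the bin midpoints, which is the trade-off for matching the averages.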