Skip to content

Commit

Permalink
fix examples (and update readme)
Browse files Browse the repository at this point in the history
  • Loading branch information
Gkreindler committed Dec 31, 2023
1 parent 2701f6f commit 934d301
Show file tree
Hide file tree
Showing 6 changed files with 57 additions and 115 deletions.
31 changes: 26 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,22 +38,35 @@ The user must provide a moment function `mom_fn(data, theta)` that takes a `data

To estimate a model using two-step optimal GMM, compute the asymptotic variance-covariance matrix, and display a table with the results, run
```julia
myfit = GMMTools.fit(MYDATA, mom_fn, theta0, mode=:twostep)
myfit = GMMTools.fit(MYDATA, mom_fn, theta0) # defaults: identify weights matrix, one-step GMM
GMMTools.vcov_simple(MYDATA, mom_fn, myfit)
regtable(myfit)
GMMTools.regtable(myfit)
```

The parameter `theta0` is either a vector of size $P$, or a $K\times P$ matrix with $K$ sets of initial conditions. The convenience function `GMMTools.random_initial_conditions(theta0::Vector, K::Int; theta_lower=nothing, theta_upper=nothing)` generates `K` random initial conditions around `theta0` (and between `theta_lower` and `theta_upper`, if provided).

For inference using Bayesian (weighted) bootstrap, replace the second line by
```julia
# generate bootstrap weights first and separately. In case the estimation is interrupted, running this code again generates exactly the same weights, so we can continue where we left off.
bweights_matrix = boot_weights(MYDATA, myfit, nboot=500, method=:simple, rng_initial_seed=1234)
vcov_bboot(MYDATA, mom_fn, theta0, myfit, bweights_matrix, opts=myopts)
bweights_matrix = GMMTools.boot_weights(MYDATA, myfit, nboot=500, method=:simple, rng_initial_seed=1234)
GMMTools.vcov_bboot(MYDATA, mom_fn, theta0, myfit, bweights_matrix, opts=myopts)
```


# Options
The fit function accepts the following arguments:
```julia
function fit(
data, # any object that can be passed to mom_fn as the first argument
mom_fn::Function, # mom_fn(data, theta) returns a matrix of moments (N x M)
theta0; # initial conditions (vector of size P or K x P matrix for K sets of initial conditions)
W=I, # weight matrix (N x N) or uniform scaling identity (I)
weights=nothing, # Vector of size N or nothing
mode=:onestep, # :onestep or :twostep
run_parallel=false, # run in parallel (pmap, embarasingly parallel) or not
opts=GMMTools.GMMOptions() # other options
)
```

`fit()` accepts detailed options that control (1) whether and how results are saved to file, and (2) the optimization backend and options.

```julia
Expand Down Expand Up @@ -85,6 +98,13 @@ myfit = GMMTools.read_fit(mypath1)
myfit.vcov = GMMTools.read_vcov(mypath2)
```

## Parallel
Embarassingly parallel using Distributed.jl.

For an example, see [examples/example_ols_parallel.jl](https://github.com/Gkreindler/GMMTools.jl/blob/main/examples/example_ols_parallel.jl).

Note, the `MYDATA` object is copied to all workers. Currently, no support exists for SharedArrays.jl or other options that are less memory intensive.

## Capturing errors
By default, any error during optimization stops the entire estimation (or inference) command.

Expand Down Expand Up @@ -112,6 +132,7 @@ Example. Suppose `theta = [alpha, beta]` and we expect `alpha` to be between 0 a

## Documentation to-do list
1. docs
1. Metrics theory recap
1. Bootstrap options, including custom weihts function and an example
1. walkthrough for re-running failed estimation, stability of the random initial condition and bootstrap weights
1. (easy) example with subset parameters
Expand Down
60 changes: 12 additions & 48 deletions examples/example_ols.jl
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,13 @@ Pkg.instantiate()
using Revise
using LinearAlgebra # for identity matrix "I"
using CSV
using Random

using DataFrames
using FixedEffectModels # for benchmarking
using RegressionTables


using GMMTools


Expand All @@ -25,58 +28,19 @@ using GMMTools
# define moments for OLS regression
# residuals orthogonal to the constant and to the variable (acceleration)
# this must always return a Matrix (rows = observations, columns = moments)
function ols_moments(prob, theta)

resids = @. prob.data.mpg - theta[1] - theta[2] * prob.data.acceleration

# n by 2 matrix of moments
moms = hcat(resids, resids .* prob.data.acceleration)

return moms
function ols_moments_fn(data, theta)
resids = @. data.mpg - theta[1] - theta[2] * data.acceleration
return hcat(resids, resids .* data.acceleration)
end

# initial parameter guess
theta0 = [0.0, 0.0]

# create GMM problem (data + weighting matrix + initial parameter guess)
myprob = create_GMMProblem(data=df, W=I, theta0=theta0)

# estimate model
fit(myprob, ols_moments, mode=:twostep)
Random.seed!(123)
theta0 = random_initial_conditions([10.0, 0.0], 20)


fdsfds
myfit = fit(myprob, ols_moments)

# compute asymptotic variance-covariance matrix and save in myfit.vcov
vcov_simple(myprob, ols_moments, myfit)

# print table with results
### Most parsimonius usage
myfit = GMMTools.fit(df, ols_moments_fn, theta0)
GMMTools.vcov_simple(df, ols_moments_fn, myfit)
GMMTools.regtable(myfit)

# compute Bayesian (weighted) bootstrap inference and save in myfit.vcov
vcov_bboot(myprob, ols_moments, myfit, nboot=500)
GMMTools.regtable(myfit) # print table with new bootstrap SEs

# bootstrap with weightes drawn at the level of clusters defined by the variable df.cylinders
vcov_bboot(myprob, ols_moments, myfit, cluster_var=:cylinders, nboot=500)
GMMTools.regtable(myfit)

### Multiple initial conditions and save to file
theta0 = [0.3, 0.5]

theta0_matrix = random_theta0(theta0, 100) # generate 100 random initial conditions "around" theta0

myprob = create_GMMProblem(data=df, W=I, theta0=theta0_matrix)

myopts = default_gmm_opts(
path = "C:/git-repos/GMMTools.jl/examples/temp/", # replace with target folder on your machine
write_iter=true, # save results to file option
clean_iter=true, # save results to file option
overwrite=false, # save results to file option
trace=1
)

myfit = fit(myprob, ols_moments, opts=myopts)
vcov_simple(myprob, ols_moments, myfit)
GMMTools.regtable(myfit)
### See `examples/example_ols_2step.jl` for more options and functionalities
58 changes: 5 additions & 53 deletions examples/example_ols_parallel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -40,74 +40,26 @@ end
# residuals orthogonal to the constant and to the variable (acceleration)
# this must always return a Matrix (rows = observations, columns = moments)
@everywhere function ols_moments_fn(data, theta)

resids = @. data.mpg - theta[1] - theta[2] * data.acceleration

# n by 2 matrix of moments
moms = hcat(resids, resids .* data.acceleration)

return moms
return hcat(resids, resids .* data.acceleration)
end

# initial parameter guess
Random.seed!(123)
theta0 = randn(20,2)
theta0 = random_initial_conditions([10.0, 0.0], 20)

### using Optim.jl
# estimation options
myopts = GMMTools.GMMOptions(
path="C:/git-repos/GMMTools.jl/examples/temp/",
optimizer=:lsqfit,
optim_algo_bounds=true,
lower_bound=[-Inf, -Inf],
upper_bound=[Inf, Inf],
theta_lower=[-Inf, -Inf],
theta_upper=[Inf, Inf],
optim_autodiff=:forward,
write_iter=true,
clean_iter=true,
overwrite=true,
optim_opts=(show_trace=false,), # additional options for LsqFit in a NamedTuple
trace=1)

# estimate model
myfit = GMMTools.fit(df, ols_moments_fn, theta0, mode=:twostep, opts=myopts, run_parallel=true)

fsdfd

# ### using Optim.jl
# estimation options
myopts = GMMTools.GMMOptions(
path="C:/git-repos/GMMTools.jl/examples/temp/",
optim_algo=LBFGS(),
optim_autodiff=:forward,
write_iter=true,
clean_iter=true,
overwrite=true,
trace=1)

# estimate model
myfit = GMMTools.fit(df, ols_moments_fn, theta0, mode=:twostep, opts=myopts)

# compute asymptotic variance-covariance matrix and save in myfit.vcov
vcov_simple(df, ols_moments_fn, myfit)

# print table with results
regtable(myfit)



# compute Bayesian (weighted) bootstrap inference and save in myfit.vcov
myopts.trace = 1
vcov_bboot(df, ols_moments_fn, theta0, myfit, nboot=500, opts=myopts)
GMMTools.regtable(myfit) # print table with new bootstrap SEs -- very similar to asymptotic SEs in this case. Nice!

sdfsd
# using Plots
# histogram(myfit.vcov[:boot_fits].all_theta_hat[:, 1])

# bootstrap with weightes drawn at the level of clusters defined by the variable df.cylinders
myopts.trace = 0
vcov_bboot(df, ols_moments_fn, theta0, myfit, boot_weights=:cluster, cluster_var=:cylinders, nboot=500, opts=myopts)
myfit.vcov

GMMTools.regtable(myfit)

myfit = GMMTools.fit(df, ols_moments_fn, theta0, opts=myopts, run_parallel=true)
17 changes: 9 additions & 8 deletions src/functions_estimation.jl
Original file line number Diff line number Diff line change
Expand Up @@ -167,14 +167,15 @@ end
gateway function to estimate GMM model
"""
function fit(
data,
mom_fn::Function,
theta0;
W=I,
weights=nothing,
mode=:onestep,
run_parallel=true,
opts=GMMTools.GMMOptions())
data, # any object that can be passed to mom_fn as the first argument
mom_fn::Function, # mom_fn(data, theta) returns a matrix of moments (N x M)
theta0; # initial conditions (vector of size P or K x P matrix for K sets of initial conditions)
W=I, # weight matrix (N x N) or uniform scaling identity (I)
weights=nothing, # Vector of size N or nothing
mode=:onestep, # :onestep or :twostep
run_parallel=false, # run in parallel (pmap, embarasingly parallel) or not
opts=GMMTools.GMMOptions() # other options
)

# checks # TODO: add more
@assert isa(mom_fn, Function) "mom_fn must be a function"
Expand Down
2 changes: 1 addition & 1 deletion src/functions_inference.jl
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ function vcov_simple(
data,
mom_fn::Function,
myfit::GMMFit;
opts::GMMOptions=default_gmm_opts()
opts::GMMOptions=GMMOptions()
)

# cannot compute if estimation errored
Expand Down
4 changes: 4 additions & 0 deletions src/io.jl
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,8 @@ Example: fit object saved under "C:/temp/fit.json" and "C:/temp/fit.csv". Then c
"""
function read_fit(full_path; subpath="fit", show_trace=false)

(full_path == "") && return nothing

(full_path[end] == '/') && (full_path *= '/') # ? platform issues?
full_path *= subpath

Expand Down Expand Up @@ -254,6 +256,8 @@ Example: vcov object saved under "C:/temp/vcov.json". Then call `read_vcov("C:/t
"""
function read_vcov(full_path; subpath="vcov", show_trace=false)

(full_path == "") && return nothing

(full_path[end] == '/') && (full_path *= '/') # ? platform issues?
full_path *= subpath

Expand Down

0 comments on commit 934d301

Please sign in to comment.