---
title: "GMWMX: Estimate linear models with dependent errors and missing observations"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{GMWMX: Estimate linear models with dependent errors and missing observations}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

This vignette demonstrates how to use the GMWMX estimator to estimate linear
models with dependent errors described by a composite stochastic process in
the presence of missing observations. Consider the model defined as:

\begin{equation}
    \mathbf{Y} =  \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}, \quad \boldsymbol{\varepsilon} \sim \mathcal{F}\left\{\mathbf{0}, \boldsymbol{\Sigma}\left(\boldsymbol{\gamma}\right)\right\},
\end{equation}

where $\mathbf{X} \in \mathbb{R}^{n \times p}$ is a design matrix of observed predictors, $\boldsymbol{\beta} \in \mathbb{R}^p$ is the regression parameter vector and $\boldsymbol{\varepsilon} = \{\varepsilon_{i}\}_{i=1,\ldots,n}$ is a zero-mean process following an unspecified joint distribution $\mathcal{F}$ with positive-definite covariance function $\boldsymbol{\Sigma}(\boldsymbol{\gamma}) \in \mathbb{R}^{n\times n}$ characterizing the second-order dependence structure of the process and parameterized by the vector $\boldsymbol{\gamma} \in \boldsymbol{\Gamma} \subset \mathbb{R}^q$.

Missingness is modeled by a binary process \(\mathbf{Z} \in \{0,1\}^n\) with
\(Z_i = 1\) indicating an observation and \(Z_i = 0\) a missing observation. The missingness process is independent of
$\mathbf{Y}$ and assumes the definition:

\begin{equation}
  Z_i =
\begin{cases}
    Z_{i-1}, & \text{with probability } \rho, \\
    W_i \sim \mathrm{Bernoulli}(\mu(\bm{\vartheta})), & \text{with probability } 1-\rho,
\end{cases}
\end{equation}

with expectation $\mu(\boldsymbol{\vartheta}) = \mathbb{E}[Z_i] \in (0, 1]$ for
all $i \in \{1, \ldots, n\}$, with covariance matrix
$\boldsymbol{\Lambda}(\boldsymbol{\vartheta}) = \operatorname{var}\left(\mathbf{Z}\right)
\in \mathbb{R}^{n\times n}$ whose structure is assumed known up to the
parameter vector $\boldsymbol{\vartheta} \in \boldsymbol{\Upsilon} \subset \mathbb{R}^k$.
The observed series is

\[
  \tilde{\mathbf{Y}} = \mathbf{Y} \odot \mathbf{Z}.
\]

The design matrix with rows zeroed out for missing observations
can be written as \(\tilde{\mathbf{X}} = \left(\mathbf{Z} \otimes \mathbf{1}^T\right) \odot \mathbf{X}\).
The least‑squares estimator based on observed data is

\[
  \hat{\boldsymbol{\beta}} =
  \left(\tilde{\mathbf{X}}^T \tilde{\mathbf{X}}\right)^{-1}\tilde{\mathbf{X}}^T \tilde{\mathbf{Y}},
\]

and we compute residuals

\begin{equation}
\hat{\boldsymbol{\varepsilon}} = \tilde{\mathbf{Y}} - \tilde{\mathbf{X}}\hat{\boldsymbol{\beta}}.
\end{equation}

We then obtain $\hat{\boldsymbol{\vartheta}}$, the estimated parameters of the missingness process using the maximum likelihood estimator, assuming a two-state Markov model.

We then estimate the stochastic parameters using the GMWM estimator:

\begin{equation}
    \hat{\boldsymbol{\gamma}} = \underset{\boldsymbol{\gamma} \in \boldsymbol{\Gamma}}{\arg\min} \,\left\{\hat{\boldsymbol{\nu}} - \boldsymbol{\nu}(\boldsymbol{\gamma}, \hat{\boldsymbol{\vartheta}})\right\}^{T} \, \boldsymbol{\Omega} \, \left\{\hat{\boldsymbol{\nu}} - \boldsymbol{\nu}(\boldsymbol{\gamma}, \hat{\boldsymbol{\vartheta}})\right\},
\end{equation}

where $\hat{\boldsymbol{\nu}}$ is the estimated wavelet variance computed on the
estimated residuals $\hat{\boldsymbol{\varepsilon}}$ and
$\hat{\boldsymbol{\vartheta}}$ is the pre-computed estimator of the parameter
$\boldsymbol{\vartheta}$, with $\boldsymbol{\Omega}$ being any positive-definite
matrix.

The variance-covariance matrix of the estimated regression parameters
$\hat{\boldsymbol{\beta}}$ is then obtained as:

\begin{equation}
\mu(\hat{\boldsymbol{\vartheta}})^{-2}(\mathbf{X}^{\top}\mathbf{X})^{-1}
\mathbf{X}^{\top} \boldsymbol{\Sigma}(\hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\vartheta}})
\mathbf{X} (\mathbf{X}^{\top}\mathbf{X})^{-1}.
\end{equation}

where $\boldsymbol{\Sigma}(\hat{\boldsymbol{\gamma}}, \hat{\boldsymbol{\vartheta}}) = \left\{\boldsymbol{\Lambda}(\hat{\boldsymbol{\vartheta}}) + \mu(\hat{\boldsymbol{\vartheta}})^2 \mathbf{1}\mathbf{1}^{\top}\right\} \odot \boldsymbol{\Sigma}(\hat{\boldsymbol{\gamma}})$.


```{r setup, include=FALSE}
knitr::opts_chunk$set(
  fig.width = 10,
  fig.height = 6,
  fig.align = "center"
)

# so as not to execute this vignette during package development
# knitr::opts_chunk$set(eval = FALSE)
```

Overall, these examples illustrate how `gmwmx2()` supports inference for
linear regression models with dependent errors in presence of missing observations,
and how the results compare to a naive OLS fit that ignores dependence.

We load the packages used for simulation, wavelet variance diagnostics, and
summary plots. The helper `boxplot_mean_dot()` overlays Monte Carlo means on
top of each boxplot for easier comparison.

```{r, warning=FALSE, message=FALSE}
library(gmwmx2)
library(wv)
library(dplyr)

boxplot_mean_dot <- function(x, ...) {
  boxplot(x, ...)
  x_mat <- as.matrix(x)
  mean_vals <- colMeans(x_mat, na.rm = TRUE)
  points(seq_along(mean_vals), mean_vals, pch = 16, col = "black")
}
```


We first construct a design matrix with an intercept, linear trend, and annual
seasonal components. This deterministic signal will be used across examples.


# Build an arbitrary design matrix X
```{r}
n = 15*365
X = matrix(NA, nrow = n, ncol = 4)
# intercept
X[, 1] = 1
# trend
X[, 2] = 1:n
# annual sinusoid
omega_1 <- (1 / 365.25) * 2 * pi
X[, 3] <- sin((1:n) * omega_1)
X[, 4] <- cos((1:n) * omega_1)
beta = c(0.5 , 0.00004, 0.005,  0.0008)

# visualize the deterministic signal
plot(x = X[, 2], y = X %*% beta, type = "l",
     main = "Signal", xlab = "Time", ylab = "Signal")
```

Next we simulate dependent errors, add them to the signal, and then introduce
missing observations using a two‑state Markov model. 


# Example 1: White noise + AR(1)

## Generate signal

```{r}
phi_ar1 = 0.975
sigma2_ar1 =  5e-05
sigma2_wn =  7e-04
eps = generate(ar1(phi = phi_ar1, sigma2 = sigma2_ar1) + wn(sigma2_wn), n = n, seed = 123)$series
plot(wv::wvar(eps))
y = X %*% beta + eps

# generate missingness
mod_missing = markov_two_states(p1 = 0.05, p2 = .95)
Z = generate(mod_missing, n = n, seed = 123)
plot(Z)
Z_process = Z$series
y_observed = ifelse(Z_process == 1, y, NA)
plot(X[, 2], y_observed, type = "l", main = "Simulated y (WN + AR1)")


id_not_observed = which(is.na(y_observed))
for(i in id_not_observed){
  abline(v = X[i, 2], col = "grey70", lty =1)
}
```

We then fit the dependent‑error regression using `gmwmx2()`, which accounts
for the stochastic error model and the missing observations.


## Fit model
```{r}
fit = gmwmx2(X = X, y = y_observed, model = wn() + ar1() )
print(fit)
```

## Monte Carlo simulation
We now assess inference quality under missingness by repeating the simulation,
fitting the correct dependent‑error model with `gmwmx2()`, and comparing
against a misspecified OLS fit that treats the errors as i.i.d. and uses only
observed data.

```{r}
B = 100
mat_res = matrix(NA, nrow=B, ncol=19)
for(b in seq(B)){

  eps = generate(ar1(phi=phi_ar1, sigma2=sigma2_ar1) + wn(sigma2_wn), n=n, seed = (123 + b))$series
  y = X %*% beta + eps
  Z = generate(mod_missing, n = n, seed = 123 + b)$series
  y_observed = ifelse(Z == 1, y, NA)
  fit = gmwmx2(X = X, y = y_observed, model = wn() + ar1() )
  # misspecified model assuming white noise as the stochastic model and only on observed data
  fit2 = lm(y_observed~X[,2] + X[,3] + X[,4])

  mat_res[b, ] = c(fit$beta_hat, fit$std_beta_hat,
                   summary(fit2)$coefficients[,1],
                   summary(fit2)$coefficients[,2],
                   fit$theta_domain$`AR(1)_2`,
                   fit$theta_domain$`White Noise_1`)
  # cat("Iteration ", b, " completed.\n")
}
```


## Plot empirical distributions of estimated parameters
```{r}
# compute empirical coverage
mat_res_df = as.data.frame(mat_res)
colnames(mat_res_df) = c("gmwmx_beta0_hat", "gmwmx_beta1_hat",
                         "gmwmx_beta2_hat", "gmwmx_beta3_hat",
                         "gmwmx_std_beta0_hat", "gmwmx_std_beta1_hat",
                         "gmwmx_std_beta2_hat", "gmwmx_std_beta3_hat",
                         "lm_beta0_hat", "lm_beta1_hat", "lm_beta2_hat", "lm_beta3_hat",
                         "lm_std_beta0_hat", "lm_std_beta1_hat", "lm_std_beta2_hat", "lm_std_beta3_hat",
                         "phi_ar1","sigma_2_ar1" ,"sigma_2_wn")


par(mfrow = c(1, 4))
boxplot_mean_dot(mat_res_df[, c("gmwmx_beta0_hat")],las=1,
                 names = c("beta0"),
                 main = expression(beta[0]), ylab = "Estimate")
abline(h = beta[1], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta1_hat")],las=1,
                 names = c("beta1"),
                 main = expression(beta[1]), ylab = "Estimate")
abline(h = beta[2], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta2_hat")],las=1,
                 names = c("beta2"),
                 main = expression(beta[2]), ylab = "Estimate")
abline(h = beta[3], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta3_hat")],las=1,
                 names = c("beta3"),
                 main = expression(beta[3]), ylab = "Estimate")
abline(h = beta[4], col = "black", lwd = 2)


par(mfrow = c(1, 3))

boxplot_mean_dot(mat_res_df$phi_ar1,las=1,
                 names = c("phi_ar1"),
                 main = expression(phi["AR1"]), ylab = "Estimate")
abline(h = phi_ar1, col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df$sigma_2_ar1,las=1,
                 names = c("phi_ar1"),
                 main = expression(sigma["AR1"]^2), ylab = "Estimate")
abline(h = sigma2_ar1, col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df$sigma_2_wn,las=1,
                 names = c("phi_ar1"),
                 main = expression(sigma["WN"]^2), ylab = "Estimate")
abline(h = sigma2_wn, col = "black", lwd = 2)
par(mfrow = c(1, 1))
```

The boxplots summarize the empirical distribution of the regression estimates
and stochastic parameters across Monte Carlo replications, with the true values
shown as horizontal lines.


We report the empirical fraction of times the true \(\boldsymbol{\beta}\)
entries fall inside the estimated intervals for both `gmwmx2()` and OLS.

## Compute empirical coverage of confidence intervals
```{r}
# Compute empirical coverage of confidence intervals for beta
zval = qnorm(0.975)
mat_res_df$upper_ci_gmwmx_beta1 = mat_res_df$gmwmx_beta1_hat + zval * mat_res_df$gmwmx_std_beta1_hat
mat_res_df$lower_ci_gmwmx_beta1 = mat_res_df$gmwmx_beta1_hat - zval * mat_res_df$gmwmx_std_beta1_hat
# empirical coverage of gmwmx beta
dplyr::between(rep(beta[2], B), mat_res_df$lower_ci_gmwmx_beta1, mat_res_df$upper_ci_gmwmx_beta1) %>% mean()

# do the same for lm beta
mat_res_df$upper_ci_lm_beta1 = mat_res_df$lm_beta1_hat + zval * mat_res_df$lm_std_beta1_hat
mat_res_df$lower_ci_lm_beta1 = mat_res_df$lm_beta1_hat - zval * mat_res_df$lm_std_beta1_hat
dplyr::between(rep(beta[2], B), mat_res_df$lower_ci_lm_beta1, mat_res_df$upper_ci_lm_beta1) %>% mean()
```

We now repeat the workflow for a long‑memory error model (white noise +
stationary power‑law). The steps mirror the AR(1) case: simulate with
missingness, fit `gmwmx2()`, and compare to a misspecified OLS fit.

The parameter \(\kappa\) controls the long‑memory behavior of the power‑law
component.

# Example 2: White noise + stationary power-law

## Generate signal

```{r}
kappa_pl <- -0.75
sigma2_wn <- 2e-07
sigma2_pl <- 2.25e-06
eps_pl = generate(wn(sigma2_wn) + pl(kappa = kappa_pl, sigma2 = sigma2_pl),
                  n = n, seed = 1234)
plot(eps_pl$series, type = "l")
plot(wv::wvar(eps_pl$series))
y_pl = X %*% beta + eps_pl$series
Z = generate(mod_missing, n = n, seed = 1234)
plot(Z)
Z_process = Z$series
y_observed = ifelse(Z_process == 1, y_pl, NA)

plot(X[, 2], y_observed, type = "l", main = "Simulated y (WN + PL)")
id_not_observed = which(is.na(y_observed))
for(i in id_not_observed){
  abline(v = X[i, 2], col = "grey70", lty =1)
}
```

## Fit model
```{r}
fit_pl = gmwmx2(X = X, y = y_observed, model = wn() + pl())
fit_pl
```


## Monte Carlo simulation
```{r}
B_pl = 100
mat_res_pl = matrix(NA, nrow = B_pl, ncol = 19)
for(b in seq(B_pl)){
  eps = generate(wn(sigma2_wn) + pl(kappa = kappa_pl, sigma2 = sigma2_pl),
                 n = n, seed = (123 + b))$series
  y = X %*% beta + eps

  Z = generate(mod_missing, n = n, seed = (123 + b))
  Z_process = Z$series
  y_observed = ifelse(Z_process == 1, y, NA)

  fit = gmwmx2(X = X, y = y_observed, model = wn() + pl())
  fit2 = lm(y_observed~X[,2] + X[,3] + X[,4])

  mat_res_pl[b, ] = c(fit$beta_hat, fit$std_beta_hat,
                      summary(fit2)$coefficients[,1],
                      summary(fit2)$coefficients[,2],
                      fit$theta_domain$`Stationary PowerLaw_2`,
                      fit$theta_domain$`White Noise_1`)
  # cat("Iteration ", b, " \n")
}
```

## Plot empirical distributions of estimated parameters

```{r}
mat_res_pl_df = as.data.frame(mat_res_pl)
colnames(mat_res_pl_df) = c("gmwmx_beta0_hat", "gmwmx_beta1_hat",
                            "gmwmx_beta2_hat", "gmwmx_beta3_hat",
                            "gmwmx_std_beta0_hat", "gmwmx_std_beta1_hat",
                            "gmwmx_std_beta2_hat", "gmwmx_std_beta3_hat",
                            "lm_beta0_hat", "lm_beta1_hat", "lm_beta2_hat", "lm_beta3_hat",
                            "lm_std_beta0_hat", "lm_std_beta1_hat",
                            "lm_std_beta2_hat", "lm_std_beta3_hat",
                            "kappa_pl","sigma2_pl" ,"sigma2_wn")


# Plot empirical distributions (beta and stochastic parameters)


par(mfrow = c(1, 4))
boxplot_mean_dot(mat_res_df[, c("gmwmx_beta0_hat")],las=1,
                 names = c("beta0"),
                 main = expression(beta[0]), ylab = "Estimate")
abline(h = beta[1], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta1_hat")],las=1,
                 names = c("beta1"),
                 main = expression(beta[1]), ylab = "Estimate")
abline(h = beta[2], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta2_hat")],las=1,
                 names = c("beta2"),
                 main = expression(beta[2]), ylab = "Estimate")
abline(h = beta[3], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta3_hat")],las=1,
                 names = c("beta3"),
                 main = expression(beta[3]), ylab = "Estimate")
abline(h = beta[4], col = "black", lwd = 2)

par(mfrow = c(1, 1))
```


```{r}
par(mfrow = c(1, 3))

boxplot_mean_dot(mat_res_pl_df$kappa_pl,las=1,
                 main = expression(kappa["PL"]), ylab = "Estimate")
abline(h = kappa_pl, col = "black", lwd = 2)

boxplot_mean_dot(mat_res_pl_df$sigma2_pl,las=1,
                 main = expression(sigma["PL"]^2), ylab = "Estimate")
abline(h = sigma2_pl, col = "black", lwd = 2)

boxplot_mean_dot(mat_res_pl_df$sigma2_wn,las=1,
                 names = c("phi_ar1"),
                 main = expression(sigma["WN"]^2), ylab = "Estimate")
abline(h = sigma2_wn, col = "black", lwd = 2)
par(mfrow = c(1, 1))
```

Compute empirical coverage for the regression coefficients under the power‑law
error model and compare against OLS.

## Compute empirical coverage of confidence intervals
```{r}
zval = qnorm(0.975)
mat_res_pl_df$upper_ci_gmwmx_beta1 = mat_res_pl_df$gmwmx_beta1_hat + zval * mat_res_pl_df$gmwmx_std_beta1_hat
mat_res_pl_df$lower_ci_gmwmx_beta1 = mat_res_pl_df$gmwmx_beta1_hat - zval * mat_res_pl_df$gmwmx_std_beta1_hat
# empirical coverage of gmwmx beta
dplyr::between(rep(beta[2], B_pl), mat_res_pl_df$lower_ci_gmwmx_beta1, mat_res_pl_df$upper_ci_gmwmx_beta1) %>% mean()

# do the same for lm beta
mat_res_pl_df$upper_ci_lm_beta1 = mat_res_pl_df$lm_beta1_hat + zval * mat_res_pl_df$lm_std_beta1_hat
mat_res_pl_df$lower_ci_lm_beta1 = mat_res_pl_df$lm_beta1_hat - zval * mat_res_pl_df$lm_std_beta1_hat

dplyr::between(rep(beta[2], B_pl), mat_res_pl_df$lower_ci_lm_beta1, mat_res_pl_df$upper_ci_lm_beta1) %>% mean()
```

#  Example 3: White noise + flicker

## Generate signal

```{r}
sigma2_wn_fl <- 8e-07
sigma2_fl <- 1e-06
eps_fl = generate(wn(sigma2_wn_fl) + flicker(sigma2 = sigma2_fl),
                  n = n, seed = 123)
plot(eps_fl$series, type = "l")
plot(wv::wvar(eps_fl$series))
Z = generate(mod_missing, n = n, seed = 1234)
plot(Z)

y_fl = X %*% beta + eps_fl$series

Z = generate(mod_missing, n = n, seed = 1234)
plot(Z)
Z_process = Z$series


y_observed = ifelse(Z_process == 1, y_fl, NA)
```


```{r}
plot(X[, 2], y_observed, type = "l", main = "Simulated y (WN + FL)")
id_not_observed = which(is.na(y_observed))
for(i in id_not_observed){
  abline(v = X[i, 2], col = "grey70", lty =1)
}
```


## Fit model
```{r}
fit_fl = gmwmx2(X = X, y = y_observed, model = wn() + flicker())
fit_fl
```

## Monte Carlo simulation
```{r}
B_fl = 100
mat_res_fl = matrix(NA, nrow = B_fl, ncol = 18)
for(b in seq(B_fl)){
  eps = generate(wn(sigma2_wn_fl) + flicker(sigma2 = sigma2_fl),
                 n = n, seed = (123 + b))$series
  y = X %*% beta + eps


  Z = generate(mod_missing, n = n, seed = (123 + b))
  Z_process = Z$series
  y_observed = ifelse(Z_process == 1, y, NA)


  fit = gmwmx2(X = X, y = y_observed, model = wn() + flicker())
  fit2 = lm(y_observed~X[,2] + X[,3] + X[,4])

  mat_res_fl[b, ] = c(fit$beta_hat, fit$std_beta_hat,
                      summary(fit2)$coefficients[,1],
                      summary(fit2)$coefficients[,2],
                      fit$theta_domain$`Flicker_2`,
                      fit$theta_domain$`White Noise_1`)
  # cat("Iteration ", b, " \n")
}
```

## Plot empirical distributions of estimated parameters
```{r}
mat_res_fl_df = as.data.frame(mat_res_fl)
colnames(mat_res_fl_df) = c("gmwmx_beta0_hat", "gmwmx_beta1_hat",
                            "gmwmx_beta2_hat", "gmwmx_beta3_hat",
                            "gmwmx_std_beta0_hat", "gmwmx_std_beta1_hat",
                            "gmwmx_std_beta2_hat", "gmwmx_std_beta3_hat",
                            "lm_beta0_hat", "lm_beta1_hat", "lm_beta2_hat", "lm_beta3_hat",
                            "lm_std_beta0_hat", "lm_std_beta1_hat",
                            "lm_std_beta2_hat", "lm_std_beta3_hat",
                            "sigma2_fl" ,"sigma2_wn")
```


```{r}
par(mfrow = c(1, 4))
boxplot_mean_dot(mat_res_df[, c("gmwmx_beta0_hat")],las=1,
                 names = c("beta0"),
                 main = expression(beta[0]), ylab = "Estimate")
abline(h = beta[1], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta1_hat")],las=1,
                 names = c("beta1"),
                 main = expression(beta[1]), ylab = "Estimate")
abline(h = beta[2], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta2_hat")],las=1,
                 names = c("beta2"),
                 main = expression(beta[2]), ylab = "Estimate")
abline(h = beta[3], col = "black", lwd = 2)

boxplot_mean_dot(mat_res_df[, c("gmwmx_beta3_hat")],las=1,
                 names = c("beta3"),
                 main = expression(beta[3]), ylab = "Estimate")
abline(h = beta[4], col = "black", lwd = 2)

par(mfrow = c(1, 1))
```


```{r}
par(mfrow = c(1, 2))

boxplot_mean_dot(mat_res_fl_df$sigma2_fl,las=1,
                 main = expression(sigma["FL"]^2), ylab = "Estimate")
abline(h = sigma2_fl, col = "black", lwd = 2)

boxplot_mean_dot(mat_res_fl_df$sigma2_wn,las=1,
                 names = c("phi_ar1"),
                 main = expression(sigma["WN"]^2), ylab = "Estimate")
abline(h = sigma2_wn_fl, col = "black", lwd = 2)
par(mfrow = c(1, 1))

```


## Compute empirical coverage of confidence intervals
```{r}
# Compute empirical coverage of confidence intervals for beta

zval = qnorm(0.975)
mat_res_fl_df$upper_ci_gmwmx_beta1 = mat_res_fl_df$gmwmx_beta1_hat + zval * mat_res_fl_df$gmwmx_std_beta1_hat
mat_res_fl_df$lower_ci_gmwmx_beta1 = mat_res_fl_df$gmwmx_beta1_hat - zval * mat_res_fl_df$gmwmx_std_beta1_hat
# empirical coverage of gmwmx beta
dplyr::between(rep(beta[2], B_fl), mat_res_fl_df$lower_ci_gmwmx_beta1, mat_res_fl_df$upper_ci_gmwmx_beta1) %>% mean()

# do the same for lm beta
mat_res_fl_df$upper_ci_lm_beta1 = mat_res_fl_df$lm_beta1_hat + zval * mat_res_fl_df$lm_std_beta1_hat
mat_res_fl_df$lower_ci_lm_beta1 = mat_res_fl_df$lm_beta1_hat - zval * mat_res_fl_df$lm_std_beta1_hat
dplyr::between(rep(beta[2], B_fl), mat_res_fl_df$lower_ci_lm_beta1, mat_res_fl_df$upper_ci_lm_beta1) %>% mean()

```