[NOTE: The R programs that produced the results in this blog are publicly available here: https://osf.io/w2vxg/. “Program1” produced the results for single simulated datasets. “Program2” produced the results when simulating 1000 datasets.]

Introduction

In a previous blog, I wrote about the curious (to me) practice among meta-analysts of focusing on univariate regression tests (Egger’s regression test, Funnel Asymmetry Tests) to reach conclusions about the presence of publication bias. I argued that to the extent the standard error variable is correlated with data, estimation, and study characteristics, these tests suffer from omitted variable bias.

In that blog, I gave some examples from the literature of how controlling for these characteristics can change one’s conclusion about publication bias. This led me to two recommendations:

1) Meta-analysts should always include a multivariate FAT in the section of their paper that is devoted to testing for publication bias.

2) Meta-analysts should always include a multivariate "effect beyond bias" alongside the univariate "effect beyond bias" estimate.

This blog picks up and expands on the second of these recommendations. It provides examples of how multivariate PET-PEESE often, but not always, improves on univariate PET-PEESE. It also illustrates just how poorly PET-PEESE calculations of “effect beyond bias” can perform.

As a final benefit, this blog provides R programming code so that readers can run their own simulations (see note above).

The Problem

Define βhat and se as the estimated effect and its standard error from a primary study, where we assume there is one estimate per study. Then the univariate PET and univariate PEESE specifications are given respectively by:

(1) βhat = α0 + α1 se + ε, and

(2) βhat = α0 + α1 se^2 + ε.

α0 represents the corresponding “effect beyond bias”. Its estimate represents the overall mean effect that the meta-analyst would expect to see in the absence of publication bias.

The key thing to note is that the estimates of α0 and α1 are necessarily negatively correlated. As a result, if the estimate of α1 is biased, say due to omitted variables, that will induce a bias of the opposite sign in the estimated “effect beyond bias”, α0.
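The blog’s own programs are written in R (see the note at the top). As a minimal illustration, here is a Python sketch of the two specifications: both are plain OLS regressions, and the intercept is the “effect beyond bias” estimate. The function name pet_peese is my own, not from the blog’s programs.

```python
import numpy as np

def pet_peese(bhat, se):
    """OLS fits of Eq. (1) (PET: bhat on se) and Eq. (2) (PEESE: bhat on se^2).
    Returns the two intercepts, i.e. the "effect beyond bias" estimates."""
    ones = np.ones_like(se)
    a0_pet = np.linalg.lstsq(np.column_stack([ones, se]), bhat, rcond=None)[0][0]
    a0_peese = np.linalg.lstsq(np.column_stack([ones, se**2]), bhat, rcond=None)[0][0]
    return a0_pet, a0_peese
```

In practice one would also want the standard error of each intercept (e.g. via a regression package), but the point estimates are all that the discussion below requires.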

Simulation #1: No omitted variable bias

I’ll start with a simulated example that illustrates how PET-PEESE works. Let the true data generating process (DGP) for estimated effects βhat and their standard errors se be given by

(3a) se ~ U(0,2), and

(3b) βhat = 1.50 + ε, ε ~ N(0,1).

In this example, the true effect of the treatment is 1.50 and the only reason that estimates differ across primary studies is due to sampling error, ε. βhat is unrelated to se in the population.
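This DGP is easy to simulate. Below is a minimal Python sketch (the blog’s programs are in R; the seed here is arbitrary, so the draws will not match the blog’s figure):

```python
import numpy as np

rng = np.random.default_rng(42)          # seed chosen arbitrarily
n = 150                                  # sample size used in the blog's figure
se = rng.uniform(0, 2, n)                # Eq. (3a)
bhat = 1.50 + rng.normal(0, 1, n)        # Eq. (3b): true effect is 1.50

# Before any selection, bhat is unrelated to se and averages about 1.50.
print(np.corrcoef(bhat, se)[0, 1], bhat.mean())
```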

The scatterplot below shows the joint distribution of βhat and se for a simulated sample of 150 observations.

I now introduce publication bias so that the meta-analyst only observes statistically significant estimated effects: βhat/se ≥ 1.96.

In the figure below, the open dots are unobserved, so that the meta-analyst’s dataset only consists of the black dots. Publication selection has induced a correlation between βhat and se in the meta-analyst’s sample.

Given this publication selection-biased sample, the meta-analyst estimates an overall (unadjusted) mean treatment effect of 2.11 (given by the horizontal blue line below). This is substantially greater than the true effect of 1.50.

Recognizing that her data are impacted by publication bias, the meta-analyst estimates the PET specification of Equation (1). The estimated relationship is represented by the red line in the figure below. Note that the estimated “effect beyond bias” (the “Intercept” in the regression output) equals 1.5364, very close to the true value of 1.50.
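This single-dataset exercise can be sketched in Python as follows (my own translation, not the blog’s R code; the exact numbers will differ from 2.11 and 1.5364 because the random draws differ):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 150
se = rng.uniform(0, 2, n)
bhat = 1.50 + rng.normal(0, 1, n)        # true effect is 1.50

# Publication selection: only statistically significant estimates survive.
keep = bhat / se >= 1.96
se_obs, bhat_obs = se[keep], bhat[keep]

# Unadjusted mean: biased upward by selection.
print(bhat_obs.mean())

# PET (Eq. 1): the intercept is the "effect beyond bias" estimate.
alpha = np.linalg.lstsq(np.column_stack([np.ones_like(se_obs), se_obs]),
                        bhat_obs, rcond=None)[0]
print(alpha[0])
```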

The above displays the results from one simulated dataset. The associated R program is “Program1”. When I repeat the above to get a total of 1000 simulated datasets (see “Program2”), I obtain the results in the table below.

The true effect is 1.50. The average unadjusted effect over 1000 simulations is 2.11 (the same as in the example above). The average “effect beyond bias” using the PET specification is 1.50, and the average “effect beyond bias” using the PEESE specification is 1.76.

While the PEESE specification overestimates the true effect in this case, it is still an improvement on the estimate that makes no allowance for publication bias (“Unadjusted”).
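The Monte Carlo exercise of “Program2” can likewise be sketched in Python (again my own translation; with 1000 replications the averages should land near the blog’s reported values):

```python
import numpy as np

rng = np.random.default_rng(0)
pet, peese, unadj = [], [], []
for _ in range(1000):
    se = rng.uniform(0, 2, 150)
    bhat = 1.50 + rng.normal(0, 1, 150)
    keep = bhat / se >= 1.96             # publication selection
    s, b = se[keep], bhat[keep]
    ones = np.ones_like(s)
    unadj.append(b.mean())
    pet.append(np.linalg.lstsq(np.column_stack([ones, s]), b, rcond=None)[0][0])
    peese.append(np.linalg.lstsq(np.column_stack([ones, s**2]), b, rcond=None)[0][0])

# Blog reports ≈ 2.11 (unadjusted), 1.50 (PET), 1.76 (PEESE) from its R simulations.
print(np.mean(unadj), np.mean(pet), np.mean(peese))
```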

The above is an example where the univariate PET (and to a lesser degree, PEESE) works to eliminate the bias associated with publication selection.

Simulation #2: Standard error is correlated with study characteristics

I now illustrate the problem when data, estimation, and study characteristics are correlated with both the estimated effect and its standard error but omitted from the univariate PET-PEESE specification. I then show how estimating a multivariate PET-PEESE specification can help.

Let the true relationship between estimated effects βhat and their corresponding standard errors now be given by

(4a) se ~ U(0,2),

(4b) z = se + ν, ν ~ N(0,1), and

(4c) βhat = 1 + 0.5 z + ε, ε ~ N(0,1).

In this example, it is still the case that the estimated effects are independent of the standard errors in the DGP. However, the estimated effects are affected by data, estimation, and study characteristics, here represented by the variable z (cf. Equation 4c). And these characteristics are in turn correlated with the studies’ standard errors (cf. Equation 4b).

Equations (4a)-(4c) imply Corr(z,se) = Corr(βhat,z) = 0.5. Together, these induce a correlation between βhat and se (Corr(βhat,se) = 0.25).
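These implied correlations are easy to verify numerically. A Python sketch using a large sample to approximate the population (seed arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000                            # large sample to approximate the population
se = rng.uniform(0, 2, n)                # Eq. (4a)
z = se + rng.normal(0, 1, n)             # Eq. (4b)
bhat = 1 + 0.5*z + rng.normal(0, 1, n)   # Eq. (4c)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

print(corr(z, se), corr(bhat, z), corr(bhat, se))  # ≈ 0.50, 0.50, 0.25
print(bhat.mean())                                 # ≈ 1.50, the true mean effect
```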

In a univariate regression, this induced correlation will cause the estimated effect of publication bias to be overestimated, resulting in an estimate of “effect beyond bias” that is negatively biased.

The following numerical example bears this out. As a benchmark, note that Equations (4a) through (4c) imply that the overall (true) mean value of estimated effects in the population is 1.50.

As before, we impose publication selection so that the meta-analyst only observes estimates where βhat/se ≥ 1.96. To address publication bias, she estimates the univariate PET.

The associated scatterplot of βhat and se values, along with the estimated PET specification, is illustrated in the figure below. The estimated “effect beyond bias” is 1.1199, well below the true value of 1.50. As expected, overestimating the effect of publication bias has caused the estimated “effect beyond bias” to be negatively biased.

I now show how a multivariate PET can provide an improved estimate of “effect beyond bias.” To address omitted variable bias, I include z in the PET regression. This produces the results below.

In the univariate PET, one obtains the estimate of “effect beyond bias” by predicting the value of βhat in the absence of publication selection as given by the condition se = 0.

In the multivariate PET, one also predicts the value of βhat, but that requires setting a value for z. The most natural candidate is the sample mean, zbar. Accordingly, when se = 0 and z = zbar, the predicted value of βhat = 1.38, less than the true value of 1.50, but closer to the true value than the univariate estimate of 1.12.
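The univariate/multivariate comparison can be sketched in Python (my own code, not the blog’s R; the helper name pet_effects is hypothetical, and exact numbers will differ from 1.12 and 1.38 because the draws differ):

```python
import numpy as np

def pet_effects(n, seed=0):
    """Simulate DGP (4a)-(4c), impose publication selection, and return the
    univariate and multivariate PET "effect beyond bias" estimates."""
    rng = np.random.default_rng(seed)
    se = rng.uniform(0, 2, n)                   # Eq. (4a)
    z = se + rng.normal(0, 1, n)                # Eq. (4b)
    bhat = 1 + 0.5*z + rng.normal(0, 1, n)      # Eq. (4c); mean true effect 1.50
    keep = bhat / se >= 1.96                    # only significant estimates observed
    se, z, bhat = se[keep], z[keep], bhat[keep]
    ones = np.ones_like(se)
    # Univariate PET: "effect beyond bias" is the intercept (se = 0).
    ebb_uni = np.linalg.lstsq(np.column_stack([ones, se]), bhat, rcond=None)[0][0]
    # Multivariate PET: include z, then predict bhat at se = 0 and z = zbar.
    b = np.linalg.lstsq(np.column_stack([ones, se, z]), bhat, rcond=None)[0]
    return ebb_uni, b[0] + b[2]*z.mean()

print(pet_effects(150))   # one simulated meta-analysis of 150 studies
```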

To get a more representative result, we repeat the process above on 1000 simulated meta-analysis datasets characterized by the DGP in Equations (4a) through (4c). The table below reports true and unadjusted "effect beyond bias" values, along with univariate PET, univariate PEESE, multivariate PET, and multivariate PEESE estimates of “effect beyond bias”.

The following results are noteworthy:

1) The estimated “effect beyond bias” that is closest to the true value (1.50) comes from the Multivariate PET specification (1.48).

2) While the multivariate PET produces a better estimate of “effect beyond bias” than the univariate PET, the univariate PEESE is superior to the multivariate PEESE (1.68 vs. 1.80).

3) All average, adjusted “effect beyond bias” estimates are closer to the true value than the average, unadjusted estimate (2.20).

Before drawing any conclusions from these results, we do one more simulation.

Simulation #3: Standard error is highly correlated with study characteristics

The third simulation repeats the previous analysis except that it increases the correlation between z and se in the population (compare Equation 5b with Equation 4b).

(5a) se ~ U(0,2),

(5b) z = 5 se + ν, ν ~ N(0,1), and

(5c) βhat = 1 + 0.5 z + ε, ε ~ N(0,1).

As before, the estimated effects are independent of the standard errors in the DGP, but now Corr(z,se) is 0.94 and Corr(βhat,z) is 0.84. This causes βhat and se to also be more strongly correlated (Corr(βhat,se) = 0.79).

Further, while Equation (5c) is unchanged, the mean true effect is now 1 + 0.5·E[z] = 1 + 0.5·(5·1) = 3.50, because z takes on larger values under Equation (5b).
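As before, the implied moments can be checked numerically (Python sketch, arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
se = rng.uniform(0, 2, n)                 # Eq. (5a)
z = 5*se + rng.normal(0, 1, n)            # Eq. (5b): much stronger link to se
bhat = 1 + 0.5*z + rng.normal(0, 1, n)    # Eq. (5c)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

print(corr(z, se), corr(bhat, z), corr(bhat, se))  # ≈ 0.94, 0.84, 0.79
print(bhat.mean())                                 # ≈ 3.50 = 1 + 0.5*(5*1)
```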

The results from the corresponding simulation of 1000 meta-analysis datasets are given below.

The following results are noteworthy:

1) The best “effect beyond bias” estimate, in terms of closeness to the true value of 3.50, is the Multivariate PEESE estimate of 3.57.

2) The worst “effect beyond bias” is the Univariate PET estimate.

3) Both Multivariate PET and PEESE estimates of effect are superior to their Univariate analogues.

4) Not all adjustments for publication bias are better than doing nothing. The unadjusted “effect beyond bias” estimate of 3.77 is closer to the true value than both univariate estimates.

Conclusion

What are we to conclude from all this? First of all, the simulations in this blog are meant to be illustrative. One must be careful about generalizing from these simulations for a variety of reasons, including:

- PET-PEESE specifications were estimated with OLS. Different results might be produced by using different estimators such as Fixed Effects, UWLS, or Bom and Rachinger’s Endogenous Kink (EK) estimator.

- Estimators were compared solely on bias. Different conclusions might result from using MSE or coverage rates as performance measures.

Nevertheless, I believe the results from these simulations are sufficient to support Recommendation #2, as stated in the introduction to this blog:

2) Meta-analysts should always include a multivariate "effect beyond bias" alongside the univariate "effect beyond bias" estimate.

Multivariate “effect beyond bias” estimates will not always dominate univariate ones. However, the logic behind using them is strong, especially when the meta-analyst’s dataset is characterized by significant correlation between standard errors and data, estimation, and study characteristics. In my (admittedly limited) experience, this is frequently the case.

Lastly, I was surprised to see that, sometimes, not adjusting estimates for publication bias can produce better estimates than attempting to correct for publication bias. Until one has a better idea of the conditions that generate this result, one should not always assume that publication bias-corrected estimates of “effect beyond bias” are better than unadjusted estimates.