
Developing "Guidelines for the Use of AI in Meta-Analysis of Economics Research" (GUAI-MAER)

- and -

A Solicitation of Feedback

 

Preamble


During the Open Forum of the 2025 MAER-Net Colloquium held at the University of Ottawa, a proposal was made to develop guidelines for the use of AI in meta-analysis of economics research. While some members noted that the rapid evolution of AI would preclude an authoritative, static document from offering specific guidance, others pointed to the benefits that a tool-agnostic set of guidelines could confer upon the meta-analysis community.


This blog post is meant to be a starting point for the community to engage with and collectively develop these guidelines. My aim is to be comprehensive in soliciting feedback: please feel free to email any thoughts on these guidelines to ncook@wlu.ca in the coming weeks. This is in addition to the ongoing invitation to meta-analysis researchers to engage in the public discussion of AI in meta-analysis begun by an earlier blog post on this website (https://www.maer-net.org/post/ai-tools-for-meta-analysis).


One of the underlying principles I have used in organizing these thoughts is to compare a position against an extreme alternative. For example, when considering the role of AI in the production of meta-analysis: Is it reasonable for the GUAI-MAER to ban the use of AI? I think not. There are potentially significant gains in breadth, speed, and even accuracy to be had from today's (let alone tomorrow's) available AI tools. Conversely, is it reasonable to accept an almost fully AI-conducted meta-analysis? Again, I think not. At the time of writing, there remain concerns such as hallucinated information making its way into a meta-analysis and non-transparent or non-strict adherence to inclusion/exclusion rules. To my thinking, then, there is a non-zero role for AI in meta-analysis, and the GUAI-MAER is meant to make this role clear.

A quick discussion of the potential costs and benefits of guidelines is therefore in order.


Potential costs of GUAI-MAER


The primary cost of adopting a set of guidelines comes from implementation. It is best to avoid implementing guidelines that impede the scientific work of the community. This could happen by stopping a meta-analysis project before it begins, or by restricting a project's access to a useful AI-assistance tool. From this perspective, it is clear to me that the GUAI-MAER should neither prescribe nor proscribe any particular tool. If it were to prescribe a tool and a better tool came along, adherence to GUAI-MAER would hinder the researcher. If GUAI-MAER were to proscribe a tool (or, similarly, the application of AI to a stage of a meta-analysis), adherence to GUAI-MAER would again hinder the researcher if the proscribed tool is objectively better (or even just better in the hands of a particular researcher).


Potential benefits of GUAI-MAER


The primary benefit of adopting a set of guidelines, to my mind, comes from considering the alternative. If we do not adopt a set of guidelines, researchers, editors, and reviewers will be forced to look for guidance elsewhere. In the history of MAER-Net, we have seen the importance of being informed by the greater meta-science community, but we have adopted our own field-specific Reporting Guidelines (https://doi.org/10.1111/joes.12363) and provided a field-specific Practitioner Guide (https://onlinelibrary.wiley.com/doi/full/10.1111/joes.12595) for the specific purpose of both addressing our field's idiosyncrasies as they arise in meta-analysis and providing authoritative guidance to the meta-analysis researcher. From the perspective of the meta-analyst beginning a new AI-assisted meta-analysis, a static document means that the analyst is given a set of guidelines to follow that will not change over the course of their project. A static and authoritative document also means that editors and reviewers of the eventual submitted manuscript (particularly those who may not be familiar with the methodology) need not rely on their own internal evaluation of whether the use of AI assistance (and its degree) is appropriate; the professional community most engaged with meta-analysis in economics research will have provided its evaluation. That is, the introduction of GUAI-MAER should reduce the noise in the publication process for researchers. Indeed (and this is an aside), the introduction of the GUAI-MAER is an opportunity to set the initial state of AI in meta-analysis research ourselves, rather than having it set "for us."


Some proposed components of GUAI-MAER


Small additions to the PRISMA 2020 Flow Diagram. A PRISMA diagram is often included in the presentation of systematic reviews and meta-analyses, and for good reason. The GUAI-MAER will recommend specific but concise information to be included in the PRISMA diagram. One example is below.


Disclosure is critical. As we have seen elsewhere in the profession, disclosure of researchers' use of AI is expected. GUAI-MAER's expectation is that, for every stage of the meta-analysis process, whether AI was used, the specific form of AI, and the role of the human should be disclosed. With the expectation that at some point a comparison between meta-analyses that are human-only and those that are AI-assisted will be conducted, this disclosure should be done in a standardized fashion. See below for an example of a Standard GUAI-MAER Table. It is in its infancy and requires much more thought.


A possible addition to this discussion is the costs, benefits, and best reporting methods for cross-checking with two different AI tools (e.g., Claude and ChatGPT) or cross-checking with two humans using the same AI tool.


What follows is a first draft, which incorporates the Reporting Guidelines for Meta-Analysis in Economics with minimal additions in red text. Again, feedback is welcomed!


Guidelines for the Use of AI in Meta-Analysis of Economics Research (GUAI-MAER)


Research papers in economics that conduct a meta-analysis with the assistance of AI should include the points detailed below.


1. Research Questions and Effect Size

·       A clear statement of the specific economic theories, hypotheses, or effects studied.

·       A precise definition of how effects are measured (the "effect size") and their standard errors or other proxies for precision, accompanied by any relevant formulas if transformations are made.

·       An explicit description of how measured effects are made comparable, including any methods or formulas used to standardize or convert them to a common metric (an illustrative conversion sketch follows this list).
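As a purely illustrative sketch of the last point (and not a prescription of any particular metric), converting reported results to the partial correlation coefficient from a t-statistic and its degrees of freedom could be documented along the following lines; the function name and example values are hypothetical.

```python
import math

def pcc_from_t(t_stat: float, df: int) -> tuple[float, float]:
    """Convert a reported t-statistic and degrees of freedom into a partial
    correlation coefficient (PCC) and an approximate standard error, one
    common way of placing disparate estimates on a common metric."""
    r = t_stat / math.sqrt(t_stat ** 2 + df)   # PCC = t / sqrt(t^2 + df)
    se = math.sqrt((1 - r ** 2) / df)          # common approximation to the SE
    return r, se

# Hypothetical primary-study estimate: t = 2.5 with 120 degrees of freedom
r, se = pcc_from_t(2.5, 120)
print(f"PCC = {r:.3f}, SE = {se:.3f}")
```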

2. Research Literature Searching, Compilation, and Coding

·       A full report of how the research literature was searched. This report should include:

o   the exact databases or other sources used;

o   the precise combination of keywords employed; and

o   the date that the search was completed.

o   if AI was used during any part of the literature search, this should be disclosed in the standard GUAI-MAER table.

·       A full disclosure of the rules for study (or effect size) inclusion/exclusion. This should be accompanied by a PRISMA flow diagram which includes "AI" in superscript immediately after the cell description if AI was used during that stage.

·       A statement addressing who searched, read, and coded the research literature. Two or more reviewers should code the relevant research and disclose a measure of their agreement. An AI may be used as a substitute for one of the reviewers; however, this must be disclosed in the standard GUAI-MAER table, any discrepancies between the AI and the human reviewer must be reviewed by another human, and measures of the initial and final disagreement must be disclosed (a sketch of one such agreement calculation appears after this list).

·       A complete list of the information coded for each study or estimate. At a minimum, we recommend that reviewers conducting a meta-analysis code:

o   the estimated effect size;

o   its standard error, when feasible, and the degrees of freedom (or sample size);

o   If an AI is used during coding, a human should either review each estimate and its associated variables at least once or review sufficient cases to give a reasonable estimate of accuracy (which should be reported, including the total number reviewed as well as the proportion of errors and omissions);

o   Dummy (i.e., 0/1) variables indicating exactly which estimates were reviewed by a human.

·       Reviewers conducting a meta-regression analysis also need to code:

o   variables that distinguish which type of econometric model, methods, and techniques were employed;

o   dummy (i.e., 0/1) variables for the omission of theoretically relevant variables in the research study investigated;

o   empirical setting (e.g., region, market, and industry);

o   data types (panel, cross-sectional, time series,...);

o   alternative ways that effects were measured and reported before being converted to a common effect size;

o   year of the data used and/or publication year;

o   type of publication (journal, working paper, book chapter, etc.); and

o   the primary study, publication, and/or dataset from which an observation is drawn.

·       The rule or method used to identify outliers, leverage points, or influential observations, when any such points are omitted.
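To make the dual-coding and accuracy disclosures above concrete, below is a minimal sketch of how the reported agreement might be computed when an AI substitutes for one human coder. The use of simple proportion agreement and Cohen's kappa, along with all variable names and values, are illustrative assumptions rather than GUAI-MAER requirements.

```python
from collections import Counter

def proportion_agreement(codes_a: list, codes_b: list) -> float:
    """Share of items on which two coders (e.g., an AI and a human) agree."""
    assert len(codes_a) == len(codes_b)
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a: list, codes_b: list) -> float:
    """Chance-corrected agreement for categorical codes."""
    n = len(codes_a)
    p_obs = proportion_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_exp = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical coding of a 0/1 moderator by an AI tool and a human reviewer
ai_codes    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
human_codes = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

print(f"Initial agreement: {proportion_agreement(ai_codes, human_codes):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(ai_codes, human_codes):.2f}")
```

The same functions could be applied again after a second human adjudicates the discrepancies, yielding the initial and final disagreement figures that the guideline asks to be disclosed.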

3. Modeling Issues

·       A table displaying definitions of all the coded variables along with their descriptive statistics (means and standard deviations and, if applicable, proportion coded by AI).

·       A fully reported meta-regression analysis, along with the exact strategy used to simplify it (e.g., Bayesian or frequentist model averaging, general-to-specific, etc.). The use of an AI tool does not preclude the later sharing of reproducible analysis code.

·       An investigation of publication, selection, and misspecification biases unless these biases can reasonably be expected to be absent. When suspected, these should be controlled for in subsequent meta-regression models.

·       Methods to accommodate heteroscedasticity (e.g., inverse-variance weights) and dependence across estimates, such as within-study dependence (e.g., clustered or bootstrapped standard errors and panel or multilevel meta-regression models).
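As an illustration of the items above on reproducible analysis code, publication bias, inverse-variance weighting, and clustered standard errors, shared analysis code might look roughly like the sketch below. The file name, column names, and choice of a FAT-PET-style specification estimated with statsmodels are assumptions for the example, not prescriptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per estimate, coded as described in Section 2
# (the file and column names are placeholders)
df = pd.read_csv("meta_data.csv")   # columns: effect, se, study_id, moderators

# Inverse-variance weights accommodate heteroscedasticity across estimates
df["weight"] = 1.0 / df["se"] ** 2

# FAT-PET-style meta-regression with moderators: the standard-error term probes
# publication bias, and standard errors are clustered by study to accommodate
# within-study dependence
model = smf.wls("effect ~ se + panel_data + published",
                data=df, weights=df["weight"])
results = model.fit(cov_type="cluster", cov_kwds={"groups": df["study_id"]})
print(results.summary())
```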

4. Further Reporting and Interpretation

·       Graph(s) of the effect sizes, such as funnel graphs, forest plots, or other statistical displays of the data; if a graph is produced by AI, this should be disclosed in the figure's caption.

·       Robustness checks for meta-regression models and publication bias methods.

·       A discussion of the economic (or practical) significance of the main findings.

·       "Best practice" estimate(s) and sensible variations from them. An AI should not be used at this stage, as substituting best-practice values into the estimated meta-regression requires professional judgement.

·       A statement about sharing the data, or a link to its public posting, along with the code for the core analyses and, if applicable, sufficient details for a researcher to apply an AI tool in the same manner as in the meta-analysis (a sketch of one such disclosure follows).
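For the final item, one way to provide "sufficient details" is a small machine-readable record shared alongside the data and code, mirroring the Standard GUAI-MAER Table below. The fields and values in this sketch are hypothetical, not a prescribed schema.

```python
import json

# Hypothetical disclosure record for one stage of the meta-analysis;
# every field and value below is a placeholder
ai_disclosure = {
    "stage": "Coding (Estimates)",
    "tool": "ChatGPT 5 Auto",
    "dates": "2025-09-08 through 2025-10-15",
    "prompt_template": ("Extract the effect size, its standard error, and the "
                        "sample size from the attached results table. Return JSON."),
    "settings": {"inputs": "one results table per request", "temperature": "default"},
    "human_review": "discrepancies adjudicated by a second human coder",
}

with open("guai_maer_disclosure.json", "w") as f:
    json.dump(ai_disclosure, f, indent=2)
```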

 

Standard GUAI-MAER Table

| Stage | Which AI Tool (incl. version) | Date | Prop. AI-Only, Human-Only, Hybrid | Prop. AI-Human Agreement |
| --- | --- | --- | --- | --- |
| Identification (Studies) | None | | | |
| Screening (Studies) | None | | | |
| Coding (Studies) | None | | | |
| Coding (Estimates) | ChatGPT 5 Auto | 2025-09-08 through 2025-10-15 | 0.25, 0, 0.75 (n = 3000, 0, 9000) | 0.92 to 0.95 (n = 8280, 8550) |
| Modelling | None | | 0, 1, 0 | NA |
| Reporting (Fig 1) | ChatGPT 5 Auto | 2025-10-16 | 1, 0, 0 | NA |
| Reporting (All Else) | None | | 0, 1, 0 | NA |