AI Disclosure in Systematic Reviews and Meta-Analyses
A practical guide for researchers who use AI tools in systematic reviews and meta-analyses and need clear disclosure language for transparent reporting.
AI can save hours in a review. It can also hide mistakes.
A systematic review stands on its audit trail.
Readers need to know how you searched, screened, extracted, judged, and synthesized evidence. If AI touched any of those steps, you need to say so. Otherwise, nobody can tell where the tool shaped the review, where errors may have entered, or how your team caught them.
That does not mean you should avoid AI.
It means you should document AI use with the same care you already apply to search strategies, screening logs, and extraction forms. PRISMA 2020 asks authors to report review methods in enough detail for readers to assess what happened. AI use fits inside that same rule. (prisma-statement.org)
If you want one clean record of that use, generate an AI Usage Card for the review and adapt it for your manuscript, supplement, protocol, or repository.
Why evidence synthesis needs stricter disclosure
Systematic reviews already run on traceability.
You predefine the question. You specify databases. You set inclusion criteria. You document screening decisions. You explain extraction rules. You justify synthesis methods. AI disclosure is not extra bureaucracy. It is part of the method record.
AI can influence a review long before the writing stage.
A tool might suggest PICO terms, expand search strings, rank abstracts, recommend exclusions, extract effect sizes, summarize outcomes, draft risk-of-bias notes, or write chunks of the manuscript. Each step can change the final review. One bad extraction can alter a pooled estimate. One weak screening shortcut can change the studies that make it into the analysis.
That is why a vague line like "we used AI for writing assistance" often fails in this setting.
Recent guidance from Cochrane and partner evidence-synthesis groups takes the same line. Their joint position on responsible AI use in evidence synthesis calls for transparent reporting, human oversight, justification for use, and author responsibility for the final work. Cochrane also states that AI and automation must be disclosed and that authors remain accountable for final content. (cochrane.org)
If you want the broader case for this, read Why AI Transparency Matters in Research and AI Ethics and Documentation in Academic Research.
The real question is where AI entered the pipeline
Many researchers still treat disclosure as a writing issue.
That is too narrow for evidence synthesis.
In a systematic review or meta-analysis, AI can shape decisions at several points. You should map each one.
Planning the review
Some teams use chatbots to refine the review question, define PICO elements, suggest outcomes, or brainstorm search terms and databases.
That use deserves disclosure because it can shape scope before the first search ever runs.
Searching the literature
AI may expand keywords, translate terms, rewrite Boolean strings, or help identify related records.
Search transparency sits at the center of review quality. If AI changed the search strategy, say so. If it only suggested candidate terms and your team rebuilt the final strings by hand, say that too.
Screening and eligibility
This is one of the most sensitive stages.
AI tools may prioritize records, label likely exclusions, or recommend inclusion decisions. Tools like ASReview can help teams order records for screening, but that does not remove the need to explain how reviewers checked outputs and made final calls. If humans checked every exclusion and inclusion decision, state that plainly. If they did not, readers need to know that.
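If you disclose this stage in your methods, one hedged pattern looks like the sketch below. The tool name, reviewer setup, and wording are illustrative; adapt them to what your team actually did.

\paragraph{Screening.}
Records were ranked for relevance with ASReview; the tool excluded no records.
Two reviewers independently assessed every title and abstract, and the review team made all final eligibility decisions.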
Extraction, appraisal, and synthesis
Tools can pull sample sizes, outcomes, intervention details, confidence intervals, and study characteristics from PDFs. They can also draft evidence tables or summary paragraphs.
This is the stage where speed can fool you. Extraction errors, hallucinated values, and flattened nuance can slip in fast. In a meta-analysis, a plausible-looking number can move straight into the effect estimate if nobody catches it.
Writing and editing
AI may draft methods text, shorten abstracts, rewrite prose, or clean up language.
This still needs disclosure, but it should not overshadow earlier uses that carry more methodological weight.
For related field-specific advice, see AI Disclosure for Qualitative Research, AI Disclosure for Social Science Research, and AI Transparency Requirements for Journal Submissions.
What readers need you to disclose
Your disclosure should let another researcher understand the role of AI without guessing.
In practice, six facts do most of the work.
First, name the tool and version if known. "ChatGPT" is thin. "GPT-4.1 accessed through the ChatGPT web interface in January 2026" gives readers something they can interpret.
Second, state the task. Did the tool help with search term generation, title and abstract screening, data extraction, coding study characteristics, narrative synthesis, or writing support?
Third, describe the inputs in plain language. You do not need to paste every prompt into the main paper, but you should summarize what you gave the model. If prompts shaped results in a material way, keep the full prompt set in a supplement or repository.
Fourth, explain human oversight. Did two reviewers check all AI-assisted screening suggestions? Did one reviewer verify every extracted number against the source PDF? This point matters more than vendor claims.
Fifth, state the limits you imposed. For example, did you bar AI from final inclusion decisions or final effect estimates? Did you avoid uploading copyrighted or sensitive full texts to a public system?
Sixth, say where the full record lives. That may be the Methods section, a supplement, a repository, a protocol appendix, or an AI Usage Card.
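Taken together, those six facts fit into a short template. A minimal LaTeX sketch follows; every bracketed item is a placeholder you must replace, and the structure is a suggestion rather than any journal's required format.

\paragraph{AI use disclosure.}
We used [tool and version], accessed through [interface] in [month and year], for [task].
Inputs consisted of [plain-language summary of prompts or materials]; the full prompt set is archived in [supplement or repository].
[Name the oversight: who verified which outputs, and how.]
AI outputs did not determine [decisions reserved for humans].
The complete record of AI use appears in [location].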
If you are still unsure whether your use crossed the line into reportable assistance, read Do I Need to Disclose AI Usage in My Paper? and How to Disclose ChatGPT Usage in Academic Papers.
PRISMA does not replace AI disclosure, and AI disclosure does not replace PRISMA
PRISMA 2020 remains the baseline reporting framework for systematic reviews. The official PRISMA 2020 materials include the statement, expanded checklist, abstract checklist, and flow diagrams. They ask authors to report the search strategy, selection process, data collection process, and synthesis methods in enough detail for readers to assess the review. (prisma-statement.org)
AI disclosure fits inside those same items.
If AI helped build search strings, report it with the search methods.
If AI ranked records for screening, report it with the selection process.
If AI extracted data, report it with the data collection process.
If AI drafted text, report it in the methods, acknowledgments, or a dedicated disclosure note, depending on the journal's policy.
That placement makes the paper easier to audit because readers can see the tool beside the method it affected.
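In a LaTeX manuscript, that placement can be a single sentence inside the matching methods subsection. A minimal sketch, assuming conventional subsection headings; the tools and wording are illustrative:

\subsection{Search strategy}
An information specialist developed the final search strings.
GPT-4.1 suggested candidate synonyms for the predefined PICO concepts; two authors reviewed every suggestion before any term entered the strings.

\subsection{Selection process}
ASReview ranked records for title and abstract screening.
Two reviewers independently assessed every record; the tool made no exclusion decisions.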
New guidance asks for prompts, versions, and validation
This area moved fast in late 2025 and early 2026.
A December 11, 2025 editorial in Research Synthesis Methods set reporting expectations for manuscripts that test generative AI in systematic review and meta-analysis workflows. The guidance asks authors to describe research design, prompts, model behavior across conditions, validation methods, and task-specific performance metrics. It also notes that outputs can vary with prompt wording, model version, and random variation. (cambridge.org)
That editorial targets evaluation studies, not every ordinary review manuscript.
Still, the logic carries over. If your review used AI for screening, extraction, or synthesis support, you should keep records that let others understand what happened, what the model saw, and how your team checked the output.
Cochrane's public guidance points in the same direction. Cochrane says that AI use in evidence synthesis should be transparent, justified, and subject to human oversight, and that authors remain responsible for the final review. (cochrane.org)
So the safe rule is simple: if AI touched judgment-heavy review steps, document more than you think you need.
A disclosure structure that works in real papers
You do not need a long statement.
You need a statement that answers five questions:
- What tool did you use?
- What task did it perform?
- What input did you provide?
- Who verified the output?
- What decisions did humans keep for themselves?
Here is a short example:
We used GPT-4.1 through the ChatGPT web interface to suggest search term variants for predefined PICO concepts and to draft initial summaries of included studies. Two authors reviewed all suggested search terms before database submission. One author checked all AI-generated study summaries against the source articles and corrected errors before synthesis. AI outputs did not determine inclusion, exclusion, risk-of-bias judgments, or final effect estimates.
That gives readers the minimum they need.
For more wording patterns, see AI Usage Cards Examples and Templates and What Are AI Usage Cards?.
Where teams get this wrong
The first mistake is vague disclosure.
If you write "AI was used during manuscript preparation" but the tool also touched screening or extraction, you hid the part that matters most.
The second mistake is trusting suggestions because they look tidy.
AI tools can miss negation, flatten complex interventions, confuse outcome measures, and invent numbers that look plausible on first read. In a meta-analysis, one bad extraction can flow straight into the effect estimate.
The third mistake is keeping no record.
If AI helped with screening, extraction, or synthesis, save prompts, outputs, access dates, model names, and notes on reviewer corrections. That record helps during peer review, protocol updates, and team handoffs.
The fourth mistake is blurring the boundary between suggestion and decision.
Readers should know what the model proposed and what the review team decided.
The fifth mistake is ignoring confidentiality and licensing.
Many evidence synthesis teams work with subscription PDFs, unpublished data, or sensitive material. Before you paste text into a public AI system, check copyright terms, institutional rules, data protection limits, and any sponsor or repository rules that apply to your project.
A workflow you can actually follow
Start after you lock the protocol.
That gives you a reference point. You can compare AI-assisted actions against methods you planned in advance.
Then log each AI use as it happens.
A plain spreadsheet works. Record the date, tool, version, task, input type, output type, reviewer who checked it, and final action taken. If the tool or model version changed over time, log that too.
Next, verify every output against source material.
Do not let AI make final inclusion decisions on its own. Do not copy extracted values into analysis files without human checking. Do not treat polished prose as proof of accuracy.
Then convert your notes into a disclosure package.
That package may include a short methods paragraph, a supplement table, and an AI Usage Card. The card gives you a reusable summary that you can attach to a manuscript, thesis chapter, grant appendix, protocol appendix, or repository README.
If you work in a team, assign one person to own the log. Shared responsibility sounds good until nobody remembers who saved the prompt history.
Example wording for methods, supplement, and appendix
You can adapt the language below to your own review.
Methods paragraph in LaTeX
\paragraph{Use of AI tools.}
The review team used GPT-4.1 via the ChatGPT web interface for two limited tasks:
(1) generating candidate search term variants based on predefined PICO concepts and
(2) drafting preliminary narrative summaries of included studies.
All search strings were reviewed and finalized by the authors before submission to databases.
AI outputs did not make final screening, eligibility, risk-of-bias, or meta-analytic decisions.
One reviewer checked each AI-assisted summary against the original article, and a second reviewer spot-checked the final evidence table before synthesis.

Supplement table in LaTeX
\begin{table}[ht]
\centering
\begin{tabular}{p{2.6cm}p{2.8cm}p{3.3cm}p{4.2cm}}
\hline
Stage & Tool & Human check & Final use \\
\hline
Search planning & GPT-4.1 & Two authors reviewed all term suggestions & Selected terms entered databases manually \\
Screening triage & ASReview & Reviewers checked all records before final decisions & AI used for prioritization only \\
Study summaries & GPT-4.1 & One author verified each summary against source PDFs & Corrected summaries used in draft narrative synthesis \\
\hline
\end{tabular}
\caption{Summary of AI-assisted steps in the review workflow.}
\end{table}

Short appendix note in LaTeX
\begin{quote}
AI assistance disclosure: The team used GPT-4.1 for search term brainstorming and draft study summaries, and ASReview for screening prioritization.
Human reviewers verified all outputs against the protocol and source articles.
The tools did not make final inclusion, exclusion, risk-of-bias, or effect-size decisions.
\end{quote}

A simple AI log in LaTeX
\begin{table}[ht]
\centering
\begin{tabular}{p{2cm}p{2.3cm}p{2.8cm}p{2.3cm}p{3.7cm}}
\hline
Date & Tool/version & Review stage & Checked by & Notes \\
\hline
2026-01-12 & GPT-4.1 & Search planning & AB, CD & Suggested synonyms for intervention and outcome terms \\
2026-01-25 & ASReview 1.x & Title/abstract screening & EF, GH & Used for record prioritization only \\
2026-02-10 & GPT-4.1 & Study summaries & AB & Draft summaries corrected against source PDFs \\
\hline
\end{tabular}
\caption{Internal log of AI-assisted steps during the review.}
\end{table}

If you write in LaTeX, our LaTeX tutorial for AI Usage Cards and How to Use AI Usage Cards in Overleaf show how to format this cleanly.
What editors and reviewers will look for
Editors want specificity.
Reviewers want to know whether AI touched judgment-heavy steps and whether humans checked every output that entered the review. That scrutiny gets sharper in health, education, policy, and law, where downstream decisions can affect real people.
Some journal groups already require disclosure of AI use in manuscript preparation. JAMA Network guidance says that when authors use traditional or generative AI to create, review, revise, or edit manuscript content, they should report that use in the Acknowledgment section. JAMA's current Instructions for Authors also say that authors should report AI or similar technologies in the Acknowledgment section, or in the Methods section when the AI use is part of the formal research design or methods. (jamanetwork.com)
Systematic review journals may ask sharper questions than general journals because the whole genre depends on method transparency.
For journal-specific context, see AI Disclosure Policies by Major Journals and How to disclose AI use for NeurIPS, ICML, and ACL submissions.
A good disclosure package has three parts
Most teams do better when they stop treating disclosure as one sentence.
A usable package has three parts.
First, put a short methods statement next to the review step that AI affected. If AI changed search strategy development, say that in the search methods. If AI helped with extraction, say that in the data collection methods.
Second, keep a fuller record in a supplement, appendix, or repository. That is where prompt summaries, access dates, model names, validation steps, and correction notes belong.
Third, create an AI Usage Card. You can attach it as a supplement, cite it in the methods, or copy parts of it into your acknowledgments. That gives editors and readers one place to see the whole picture.
This three-part setup also helps when you update the review later. The paper keeps the short version. Your supplement keeps the detail. The card gives you a reusable summary.
Document it while you still remember it
Do not wait until submission week.
By then, you will have forgotten where AI entered the workflow, which model version you used, or how much human correction the output needed. Good disclosure starts during the review, not after it.
Systematic reviews already demand careful records. AI use belongs in that record.
If you used AI to brainstorm search terms, prioritize records, extract study details, draft evidence tables, summarize findings, or polish prose, write it down now. Then turn that record into a short methods statement and an AI Usage Card that readers, editors, and your future self can trust.
Generate your AI Usage Card now. Use it as a supplement, paste parts into your methods section, or adapt it for acknowledgments. In a systematic review, trust comes from the trail you leave behind. AI should be part of that trail, not a gap in it.