Stress-testing research with AI, now super easy and fully automated

Last December we shared a protocol for stress-testing (meta-)research by making AI models argue and keeping what survives. But running it by hand (opening several models, copying outputs back and forth) is a chore, and our original automation via a GPT agent was unreliable.


So we automated the protocol using Claude Code. After a one-time setup, running it takes a single sentence: you describe the task, and the skill (built with Zuzana Irsova) has Claude call OpenAI's Codex, runs the critique and synthesis rounds, and gives you a memo with the full trail of the debate.
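For readers who want a feel for the shape of such a debate, here is a minimal sketch in Python. It is not the skill's actual code: the names `run_debate`, `Memo`, `stub_author`, and `stub_critic` are hypothetical, the number of rounds is an assumption, and the two "models" are plain stub functions standing in for real calls to Claude and Codex.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A "model" is any function mapping a prompt to a reply (an assumption made
# for illustration; real use would wrap actual model calls).
Model = Callable[[str], str]


@dataclass
class Memo:
    task: str
    trail: List[str] = field(default_factory=list)  # full record of the debate
    synthesis: str = ""                              # what survives the critiques


def run_debate(task: str, author: Model, critic: Model, rounds: int = 2) -> Memo:
    """One model drafts, the other critiques, the first revises; repeat."""
    memo = Memo(task=task)
    draft = author(f"TASK: {task}")
    memo.trail.append(f"[draft] {draft}")
    for i in range(1, rounds + 1):
        critique = critic(f"CRITIQUE this answer to '{task}':\n{draft}")
        memo.trail.append(f"[critique {i}] {critique}")
        draft = author(
            f"REVISE given this critique:\n{critique}\n\nPrevious answer:\n{draft}"
        )
        memo.trail.append(f"[revision {i}] {draft}")
    memo.synthesis = draft
    return memo


# Hypothetical stand-ins so the sketch runs without any API access.
def stub_author(prompt: str) -> str:
    return "revised answer" if prompt.startswith("REVISE") else "first answer"


def stub_critic(prompt: str) -> str:
    return "the second claim lacks support"


memo = run_debate("Check the robustness section", stub_author, stub_critic)
for entry in memo.trail:
    print(entry)
print("Synthesis:", memo.synthesis)
```

Keeping every draft, critique, and revision in `trail` is what lets the final memo show how each surviving claim was challenged, not just the end result.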



We built it with meta-analysis in mind, but it works on any paper, proposal, or task. More generally, it is a simple way to have one AI check another's work. A worked example (WAIVE vs. MAIVE) is included so you can see what it produces.


The manual protocol (including a four-model version with Gemini and Grok in addition to Claude and GPT) is still here if you prefer copy-paste:



Try it on something you are working on and tell us how we could improve it!



 
 
 

2 Comments


bob.reed
6 days ago

I wholeheartedly recommend this new AI tool from Tomas and Zuzana. For those who are not used to working with the terminal versions of Claude Code and Codex -- like me -- the setup required some help from ChatGPT. However, well worth it. I used the mad-research skill. The comments I received were good and caused me to make some changes to my paper. I also compared it to two proprietary AI review sites: https://www.refine.ink/sign-up and https://reviewer3.com/ . mad-research was comparable, if not superior. And, of course, it is free. mad-research will be part of the standard toolkit in the future when writing papers.

Tomas Havranek
6 days ago
Thank you Bob!
