AI Documentation Frameworks Compared
A complete comparison of Model Cards, Datasheets, System Cards, FactSheets, Eval Cards, and other AI transparency frameworks, and where AI Usage Cards fit in.
The AI Documentation Ecosystem
Over the past several years, researchers and organizations have proposed a range of documentation frameworks for AI systems. Each one addresses a specific transparency gap. Some document the models themselves. Others document the data used to train them, the systems built around them, or the evaluations applied to them.
This guide covers all the major frameworks, explains what each one does, and shows where AI Usage Cards fit into the picture. If you are a researcher, developer, or policymaker trying to figure out which frameworks matter to you, this is the overview you need.
The Frameworks
Model Cards (Mitchell et al., 2019, Google)
Model Cards were one of the first standardized AI documentation proposals. Introduced by Margaret Mitchell and colleagues at Google, they provide a structured way to report a machine learning model's performance, intended use cases, limitations, and ethical considerations. Think of them as a product label for AI models. Model Cards are filled out by the people who build or train the model and are intended for anyone who might use or evaluate it. Hugging Face has made Model Cards a standard practice for models on its platform. For a direct comparison, see AI Usage Cards vs Model Cards.
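On Hugging Face, a Model Card takes the form of a README.md with YAML metadata followed by the report sections Mitchell et al. proposed. A minimal sketch (the model name, tags, and numbers here are invented for illustration):

```markdown
---
license: apache-2.0
tags:
  - text-classification
---

# sentiment-base (illustrative example)

## Intended use
Sentiment classification of English product reviews. Not intended for
medical, legal, or other high-stakes text.

## Performance
Accuracy 0.91 on an in-domain held-out test set; performance degrades
on social media text and other out-of-domain inputs.

## Limitations and ethical considerations
Trained only on English data; may reproduce biases present in review
corpora.
```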
Datasheets for Datasets (Gebru et al., 2018)
Proposed by Timnit Gebru and colleagues, Datasheets for Datasets apply the electronics industry's concept of datasheets to machine learning datasets. They document a dataset's motivation, composition, collection process, preprocessing steps, intended uses, distribution, and maintenance plan. Dataset creators fill them out so that downstream users can make informed decisions about whether a dataset is appropriate for their work. This framework has been widely adopted and is now encouraged by many conferences and data repositories. See also AI Usage Cards vs Datasheets.
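Datasheets are organized as question-and-answer pairs covering the lifecycle stages listed above. An abbreviated, invented example of how a few answers might read:

```markdown
### Datasheet excerpt (illustrative example)

**Motivation:** Why was the dataset created? To study sentiment in
product reviews.

**Composition:** What do the instances represent? 50,000 English review
texts, each paired with a star rating.

**Collection:** How was the data acquired? Scraped from public review
pages in 2023, then deduplicated.

**Preprocessing:** HTML was stripped; reviews under 10 words were removed.

**Intended uses:** Text classification research. Not suitable for
identifying individuals.

**Distribution and maintenance:** Released under CC BY 4.0; errata are
tracked in the project repository.
```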
System Cards (Gursoy & Kakadiaris, 2022)
System Cards go beyond documenting a single model. They document an entire deployed AI system, including the model, data pipelines, user interfaces, monitoring infrastructure, and organizational safeguards. OpenAI's published System Cards for GPT-4 and ChatGPT are the most prominent examples. These documents cover capabilities, risk assessments, red-teaming results, and safety mitigations. They are created by the deploying organization and aimed at regulators, policymakers, and informed users. For more detail, see AI Usage Cards vs System Cards.
FactSheets (Arnold et al., IBM, 2019)
IBM researchers proposed FactSheets as an approach to AI transparency inspired by product safety data sheets in the chemical industry. A FactSheet documents an AI service's purpose, performance, safety, fairness, and explainability characteristics. What distinguishes FactSheets from Model Cards is their emphasis on the service level rather than the model level and their inclusion of supply chain considerations. IBM envisions FactSheets as living documents that evolve as an AI service is updated.
Eval Cards (Dhar et al., 2025)
Eval Cards are a more recent addition to the ecosystem. They focus specifically on documenting AI evaluation procedures. When researchers evaluate an AI system, an Eval Card captures the evaluation methodology, metrics used, test datasets, conditions of evaluation, and limitations of the evaluation itself. This is important because evaluation results can be misleading without context. An Eval Card helps others understand exactly how performance numbers were generated and whether those numbers are relevant to their own use case.
Audit Cards
Audit Cards document the results of third-party audits of AI systems. When an external organization evaluates an AI system for bias, safety, or compliance, an Audit Card provides a standardized way to report the findings. This includes the scope of the audit, the methods used, the findings, and any recommended actions. Audit Cards are particularly relevant in regulated industries where independent verification of AI system properties is required.
Safety Cases (Clymer et al., 2024)
Safety Cases borrow from high-stakes engineering fields like aviation and nuclear power. A Safety Case is a structured argument, supported by evidence, that an AI system is safe to deploy in a given context. It includes claims about the system's safety properties, the evidence supporting those claims, and an analysis of the conditions under which those claims hold. This framework is gaining traction as AI systems are deployed in increasingly high-stakes settings like healthcare and autonomous vehicles.
Data Provenance Cards (Longpre et al., 2024)
Data Provenance Cards address the growing concern about the origins of training data. As AI models are trained on increasingly large and complex datasets scraped from across the internet, tracking where that data came from and what rights govern its use becomes difficult. A Data Provenance Card documents the sources, licenses, and lineage of training data, helping developers and regulators understand the legal and ethical standing of the data behind a model.
Reward Reports (Gilbert et al., 2023)
Reward Reports focus on reinforcement learning systems and document the reward functions and objectives that guide an AI system's behavior. Since the reward signal shapes everything an RL system does, understanding that signal is important for understanding the system. A Reward Report describes the reward function's design, the assumptions behind it, its known failure modes, and the relationship between the reward signal and the intended behavior.
AI Usage Cards (Wahle et al., JCDL 2023)
AI Usage Cards address a gap that none of the other frameworks cover. All the frameworks listed above document the supply side of AI, covering models, datasets, systems, evaluations, and training processes. AI Usage Cards document the demand side, specifically how a researcher actually used AI tools in their work. An AI Usage Card captures which AI tools were used, what tasks they performed, how outputs were verified, and what level of human oversight was maintained. You can generate one at ai-cards.org.
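The four dimensions above might look like this in practice. This is an illustrative sketch, not the official card layout; the project, tools, and wording are invented for this example:

```markdown
## AI Usage Card (illustrative sketch)

**Project:** Thematic coding of interview transcripts
**AI tools used:** GPT-4 (via ChatGPT)

| Task | AI contribution | Verification | Human oversight |
|---|---|---|---|
| First-pass coding of 40 transcripts | Suggested candidate themes | All codes re-checked against transcripts by two researchers | Final codebook decided by the authors |
| Copyediting the methods section | Grammar and wording suggestions | Every suggested edit reviewed before acceptance | Authors retained full control of content |
```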
The 2025 MIT AI Agent Index observed that "there are no comparable frameworks for documenting agentic AI systems" and cited AI Usage Cards as a notable framework in this space. As AI agents become more autonomous and take on more complex research tasks, the need for researcher-side usage documentation will only grow.
The Big Comparison Table
| Framework | Year | Creator | Documents What | Filled Out By | Primary Audience | Format |
|---|---|---|---|---|---|---|
| Datasheets for Datasets | 2018 | Gebru et al. | Dataset composition, collection, and biases | Dataset creators | Downstream data users | Structured questionnaire |
| Model Cards | 2019 | Mitchell et al. (Google) | Model performance, intended uses, limitations | Model developers | Model users, regulators | Standardized report sections |
| FactSheets | 2019 | Arnold et al. (IBM) | AI service purpose, performance, safety, fairness | Service providers | Enterprise customers, regulators | Living document with supply chain info |
| System Cards | 2022 | Gursoy & Kakadiaris | Deployed AI system capabilities, risks, mitigations | System deployers | Regulators, policymakers, users | Detailed report (often 30 to 60+ pages) |
| AI Usage Cards | 2023 | Wahle et al. (Univ. of Goettingen) | Researcher's use of AI tools in a project | Researchers | Reviewers, readers, editors | Short structured card (1 to 2 pages) |
| Reward Reports | 2023 | Gilbert et al. | RL reward functions and objectives | System developers | Researchers, auditors | Structured report |
| Data Provenance Cards | 2024 | Longpre et al. | Training data sources, licenses, lineage | Data curators, model trainers | Developers, regulators, legal teams | Structured provenance record |
| Safety Cases | 2024 | Clymer et al. | Structured safety arguments for deployment | System developers, safety teams | Regulators, deployment decision-makers | Argument structure with evidence |
| Eval Cards | 2025 | Dhar et al. | AI evaluation methodology and results | Evaluators | Anyone interpreting evaluation results | Structured evaluation report |
| Audit Cards | Varies | Third-party auditors | Independent audit findings for AI systems | External auditors | Regulators, system operators | Audit report with findings |
Finding the Gap AI Usage Cards Fill
Look at the "Filled Out By" column in the table above. Every framework except AI Usage Cards is filled out by someone on the supply side: model developers, dataset creators, system deployers, evaluators, and auditors. All of them are involved in building, testing, or overseeing AI systems.
AI Usage Cards are the only framework filled out by the end user, specifically the researcher who took an existing AI tool and applied it to their work. This is an important distinction. No matter how thoroughly OpenAI documents GPT-4 with a System Card, that documentation cannot tell you how Dr. Chen in the sociology department used GPT-4 to code her interview transcripts last month. Only Dr. Chen can provide that information, and an AI Usage Card gives her a standard way to do it.
Which Frameworks Do You Need?
The answer depends on your role.
If you are a model developer, start with Model Cards. If your model is part of a larger deployed system, add a System Card. If your training data has complex provenance, consider Data Provenance Cards.
If you are a dataset creator, Datasheets for Datasets are your primary framework.
If you are deploying an AI system, System Cards and Safety Cases are most relevant. FactSheets apply if you are offering an AI service to enterprise customers.
If you are evaluating AI systems, Eval Cards help others understand and interpret your results.
If you are a researcher who used AI tools in your work, AI Usage Cards are what you need. They are the framework designed specifically for your situation.
If you are a journal editor or reviewer, asking authors for AI Usage Cards gives you a standardized way to evaluate AI involvement in submitted manuscripts.
Most real-world situations involve multiple frameworks. A research team that trains a new model, creates a dataset, and uses AI tools in their workflow might produce a Model Card, a Datasheet, and an AI Usage Card for a single project. Each one covers a different aspect of transparency.
Start Documenting
The AI documentation ecosystem will continue to grow as new types of AI systems emerge and new transparency needs are identified. The most important step is to start documenting. Pick the frameworks that apply to your role and your work. If you used AI in your research, generate an AI Usage Card today. It takes just a few minutes, and it makes your work more transparent for everyone who reads it.
Generate Your AI Usage Report
Create a standardized AI Usage Card for your research paper in minutes. Free and open source.
Create Your AI Usage Card