AI Usage Cards vs Datasheets for Datasets
Datasheets document dataset composition and collection while AI Usage Cards document how researchers used AI tools. Two different transparency needs.
Different Documents for Different Transparency Needs
Datasheets for Datasets and AI Usage Cards both exist to make research more transparent. But they address entirely different gaps. If you are a researcher trying to figure out which one applies to your work, the short answer is that Datasheets describe the data you are sharing, while AI Usage Cards describe how you used AI tools.
You might need one, the other, or both. Let's look at what each does and where they overlap.
What Are Datasheets for Datasets?
Datasheets for Datasets were proposed by Timnit Gebru et al. in 2018, drawing an analogy from the electronics industry, where every component ships with a datasheet describing its operating characteristics. The idea is simple. Every dataset shared with the research community should come with a standardized document that describes how the data was collected, what it contains, how it should be used, and what its known limitations or biases are.
A Datasheet answers questions organized into seven categories. Motivation describes why the dataset was created. Composition describes what is in it, including the number of instances, data types, and whether it contains sensitive information. Collection process describes how the data was gathered, including who was involved and what consent procedures were followed. Preprocessing, recommended uses, distribution, and maintenance round out the picture, with known limitations and biases recorded along the way.
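To make that structure concrete, here is a minimal Python sketch (3.9+) that treats a Datasheet as a record with one free-text field per section of the questionnaire (motivation, composition, collection process, preprocessing, uses, distribution, maintenance). The class name `Datasheet`, the field names, and the helper `missing_sections` are hypothetical and chosen for illustration; the Datasheets proposal defines questions, not a file format or an API.

```python
from dataclasses import dataclass, fields


@dataclass
class Datasheet:
    """Hypothetical container: one free-text answer per Datasheet section."""
    motivation: str = ""
    composition: str = ""
    collection_process: str = ""
    preprocessing: str = ""
    uses: str = ""
    distribution: str = ""
    maintenance: str = ""


def missing_sections(sheet: Datasheet) -> list[str]:
    """Return the names of sections that are still empty, in declaration order."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name).strip()]


# Illustrative, made-up answers for a draft Datasheet.
sheet = Datasheet(
    motivation="Benchmark for sentiment analysis on short social media posts.",
    composition="Text instances with one sentiment label each.",
)
print(missing_sections(sheet))
# ['collection_process', 'preprocessing', 'uses', 'distribution', 'maintenance']
```

A completeness check like this only tells you that a section was filled in, not that the answer is adequate. That judgment still belongs to the dataset creators and their reviewers.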
The people who create a Datasheet are the people who created or curated the dataset. The primary audience is anyone who might use that dataset for their own research or product development.
This framework has been widely adopted. Major conferences and journals now encourage or require dataset documentation, and platforms like Hugging Face include Datasheet-style information for hosted datasets.
What Are AI Usage Cards?
AI Usage Cards were introduced by Wahle et al. at JCDL 2023 (DOI: 10.1109/JCDL57899.2023.00060) to address a different transparency gap. As AI tools became common in research workflows, there was no standardized way for researchers to report how they had used those tools in their projects.
An AI Usage Card captures which AI tools a researcher used, what tasks those tools performed, how the researcher verified the AI's output, and what level of human oversight was maintained throughout the process. It is filled out by the researcher and aimed at the readers, reviewers, and editors evaluating the work.
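The four things a card captures map naturally onto a small record. The sketch below is one way to model them in Python (3.9+); `AIUsageEntry`, its field names, and `render_card` are assumptions made for illustration and are not the official format produced by ai-cards.org. The example entry is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AIUsageEntry:
    """One AI tool use; hypothetical field names mirroring the description above."""
    tool: str             # which AI tool was used
    task: str             # what the tool did
    verification: str     # how the researcher checked the output
    human_oversight: str  # level of human oversight maintained


def render_card(entries: list[AIUsageEntry]) -> str:
    """Render entries as plain text, one labeled block per tool use."""
    blocks = []
    for entry in entries:
        blocks.append(
            f"AI tool: {entry.tool}\n"
            f"Task: {entry.task}\n"
            f"Verification: {entry.verification}\n"
            f"Human oversight: {entry.human_oversight}"
        )
    return "\n\n".join(blocks)


card = render_card([
    AIUsageEntry(
        tool="GPT-4",
        task="First-pass coding of interview transcripts",
        verification="Every code checked against the transcript by the author",
        human_oversight="Author made all final coding decisions",
    )
])
print(card)
```

Keeping one entry per tool and task, rather than one blob of text per paper, makes it easier for a reviewer to see which uses were verified and how.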
You can create one for free at ai-cards.org.
Head-to-Head Comparison
| Dimension | Datasheets for Datasets | AI Usage Cards |
|---|---|---|
| Purpose | Document the composition, collection, and intended use of a dataset | Document how AI tools were used in a research project |
| Introduced by | Gebru et al. (2018) | Wahle et al. at the University of Göttingen (JCDL 2023) |
| Primary audience | Dataset consumers, downstream researchers, regulators | Journal reviewers, paper readers, fellow researchers |
| Who fills it out | The dataset creators or curators | The researcher who used AI tools |
| Content focus | What is in the data, how it was collected, known biases, recommended uses | Which AI tool, what task, verification methods, human oversight |
| Scope | One dataset | One research project or paper |
| Format | Structured questionnaire with sections on motivation, composition, collection, preprocessing, uses, distribution, and maintenance | Structured card with fields for AI tool, purpose, output verification, and human oversight |
When Do You Need Which?
You need a Datasheet when you are releasing or sharing a dataset. If you collected survey responses, scraped web data, curated a corpus of medical images, or assembled a benchmark, a Datasheet helps future users understand what they are working with. It helps them decide whether your dataset is appropriate for their use case and alerts them to potential biases or limitations.
You need an AI Usage Card when you used AI tools while conducting your research. If you used GPT-4 to help analyze interview transcripts, an AI coding assistant to write your data processing pipeline, or an AI tool to help with a literature review, an AI Usage Card tells your readers about that usage.
You might need both in a single project. This is actually quite common. Imagine you are creating a new sentiment analysis dataset. You used an AI classifier to pre-label a large set of social media posts, then had human annotators review and correct those labels. For the dataset itself, you would create a Datasheet documenting the composition, collection method, and annotation process. For the paper describing this work, you would create an AI Usage Card documenting your use of the AI classifier in the data creation pipeline.
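The rule of thumb in this section reduces to two independent yes/no questions. A minimal Python sketch (3.9+), with the hypothetical function name `required_documents`, could look like this:

```python
def required_documents(releases_dataset: bool, used_ai_tools: bool) -> list[str]:
    """Map the two questions from this section to the documents they call for."""
    documents = []
    if releases_dataset:
        documents.append("Datasheet")      # describes the data being shared
    if used_ai_tools:
        documents.append("AI Usage Card")  # describes how AI tools were used
    return documents


# The sentiment dataset scenario: a released dataset that was pre-labeled by AI.
print(required_documents(releases_dataset=True, used_ai_tools=True))
# ['Datasheet', 'AI Usage Card']
```

The two conditions never exclude each other, which is why needing both documents is a normal outcome rather than an edge case.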
A Practical Example
Professor Kim is building a dataset of annotated radiology reports for the NLP community. He collected 50,000 anonymized reports from a hospital network with proper IRB approval. To speed up annotation, he used GPT-4 to generate initial labels for key findings in each report. A team of radiologists then reviewed each label and made corrections.
Professor Kim needs a Datasheet for the dataset. It would document where the reports came from, the anonymization process, the annotation scheme, inter-annotator agreement, the demographics of the patient population represented, known limitations (for instance, the reports all come from hospitals in one region), and recommended uses.
Professor Kim also needs an AI Usage Card for the paper he is submitting about this dataset. The card would document that GPT-4 was used to generate initial annotation labels, that all labels were reviewed by qualified radiologists, the error rate of the AI-generated labels before human review, and the specific GPT-4 version and prompts used.
The Datasheet tells future users of the dataset what they need to know about the data itself. The AI Usage Card tells readers of the paper what they need to know about how AI was involved in creating it.
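One item on Professor Kim's card, the error rate of the AI-generated labels before human review, is simple to compute once both label sets are stored side by side. The sketch below assumes that the radiologist-reviewed label is treated as correct, so any label a reviewer changed counts as an AI error. The function name `pre_review_error_rate` and the five toy labels are made up for illustration and are not results from any real dataset.

```python
def pre_review_error_rate(ai_labels, reviewed_labels):
    """Fraction of AI-generated labels that reviewers changed.

    Assumes the two sequences are aligned report by report and that the
    reviewed label is the reference.
    """
    if len(ai_labels) != len(reviewed_labels):
        raise ValueError("label lists must have the same length")
    if not ai_labels:
        raise ValueError("at least one label is required")
    corrected = sum(ai != final for ai, final in zip(ai_labels, reviewed_labels))
    return corrected / len(ai_labels)


# Toy example: reviewers changed 2 of 5 AI labels.
ai = ["nodule", "normal", "effusion", "normal", "nodule"]
reviewed = ["nodule", "normal", "normal", "normal", "fracture"]
rate = pre_review_error_rate(ai, reviewed)
print(f"{rate:.0%}")  # 40%
```

Reporting this number alongside the model version and prompts lets readers judge how much of the final annotation quality depends on the human review step.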
Part of a Larger Ecosystem
Both Datasheets and AI Usage Cards belong to a growing family of AI documentation frameworks. Model Cards document models. Datasheets document datasets. System Cards document deployed systems. AI Usage Cards document how researchers used AI in their work.
Each fills a distinct gap. None of them replaces the others. The goal of the whole ecosystem is to create layered transparency, where anyone looking at an AI system, a dataset, or a piece of research can find clear documentation about the part they care about.
For a full overview of how all these frameworks compare, see our guide to AI documentation frameworks. You might also be interested in AI Usage Cards vs Model Cards or AI Usage Cards vs System Cards.
Generate Your AI Usage Report
Create a standardized AI Usage Card for your research paper in minutes. Free and open source.