PDF Explained · April 2, 2026 · 5 min read

What Is PDF Metadata? Author, Title, Keywords and More

PDF metadata is information about the document stored inside the file — author, title, subject, creation date, software used. Learn where it's stored and why it matters.

PDF metadata is structured information about a document stored inside the PDF file — separate from the document's visible content. It includes descriptive properties (title, author, subject, keywords), technical properties (creator application, producer, PDF version), and temporal properties (creation date, modification date). Metadata is used by search engines, document management systems, accessibility tools, and archiving systems to catalog and retrieve PDFs.

Where Metadata Is Stored

PDF has two parallel metadata mechanisms:

  • DocInfo dictionary: a simple key-value dictionary, referenced from the /Info entry of the PDF trailer, with a fixed set of fields: Title, Author, Subject, Keywords, Creator (the originating application), Producer (the PDF-generating library), CreationDate, and ModDate. It has been present since PDF 1.0 and is universally supported, although PDF 2.0 deprecates most of its fields in favor of XMP.
  • XMP (Extensible Metadata Platform): an XML-based metadata format (developed by Adobe) embedded as a stream in the PDF. XMP supports much richer, extensible metadata including Dublin Core, EXIF, IPTC, and custom namespaces. PDF 1.4+ supports XMP. Most modern PDF creation tools write both DocInfo and XMP for maximum compatibility.
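
To make the XMP side more concrete, here is a minimal sketch of reading Dublin Core properties from an XMP packet using only Python's standard library. The sample packet, the names in it, and the `read_dublin_core` helper are illustrative assumptions; real packets carry many more properties and have to be extracted from the PDF's metadata stream first.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core, as used inside XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample packet (hypothetical values, for illustration only).
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt><rdf:li xml:lang="x-default">Quarterly Report</rdf:li></rdf:Alt>
      </dc:title>
      <dc:creator>
        <rdf:Seq><rdf:li>Jane Doe</rdf:li></rdf:Seq>
      </dc:creator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def read_dublin_core(xmp_xml: str) -> dict:
    """Return the first dc:title value and all dc:creator values."""
    root = ET.fromstring(xmp_xml)
    # dc:title is a language-alternative array; take its first entry.
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    # dc:creator is an ordered list of names.
    creators = root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)
    return {
        "title": title.text if title is not None else None,
        "creators": [c.text for c in creators],
    }


print(read_dublin_core(SAMPLE_XMP))
```

The DocInfo Title and Author fields correspond to `dc:title` and `dc:creator` here, which is why tools that write both mechanisms need to keep them in sync.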

Why Metadata Matters

Metadata affects several workflows:

  • Search and discoverability: search engines and document management systems use the PDF title, author, and keywords for indexing.
  • Accessibility: PDF/UA requires a document title in the XMP metadata and a default language declared in the document catalog's /Lang entry; screen readers use the language to select the correct speech synthesizer.
  • Archiving: PDF/A standards require an embedded XMP metadata packet, and accurate values keep long-term catalogs reliable.
  • Privacy: metadata can inadvertently reveal author names, organization names, software versions, and document revision history to recipients.

Privacy Risks in PDF Metadata

Most PDFs carry metadata their authors never noticed: your name (from your operating system user account), your organization's name (from Office settings), the exact version of the software you used (Creator and Producer fields), the original filename, and sometimes the full file path from your computer. When sharing PDFs externally, especially sensitive documents, it's good practice to review and strip metadata first. In Adobe Acrobat, the "Examine Document" function and the "Remove Hidden Information" option can do this. Some organizations require metadata removal before any external distribution.
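
As a rough illustration of what a pre-sharing review might look for, the sketch below flags fields matching the risks listed above. The `review_metadata` helper, its regular expressions, and the sample values (including the `ExampleLib` producer name) are hypothetical; the patterns are simple heuristics, not a complete privacy check, and the function assumes the fields were already read into a dictionary by some other tool.

```python
import re

# Heuristic: Windows drive prefixes or common Unix home-directory prefixes.
PATH_PATTERN = re.compile(r"([A-Za-z]:\\|/Users/|/home/)")
# Heuristic: something that looks like a version number, e.g. "2.1".
VERSION_PATTERN = re.compile(r"\d+\.\d+")


def review_metadata(fields: dict) -> list:
    """Return human-readable findings for fields that may leak information."""
    findings = []
    for name, value in fields.items():
        if not value:
            continue  # empty fields reveal nothing
        if name == "Author":
            findings.append(f"{name}: names a person or organization ({value!r})")
        if name in ("Creator", "Producer") and VERSION_PATTERN.search(value):
            findings.append(f"{name}: reveals a software version ({value!r})")
        if PATH_PATTERN.search(value):
            findings.append(f"{name}: looks like a file path ({value!r})")
    return findings


# Hypothetical field values for illustration.
sample = {
    "Title": "C:\\Users\\jdoe\\Documents\\draft_v3.docx",
    "Author": "jdoe",
    "Producer": "ExampleLib 2.1",
    "Subject": "",
}
for finding in review_metadata(sample):
    print(finding)
```

A check like this only reports; actually removing the fields still requires a PDF tool that rewrites both the DocInfo dictionary and the XMP packet.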

How to View PDF Metadata

  • In Adobe Acrobat: File → Properties → Description tab shows the basic DocInfo fields. The Additional Metadata button opens the full XMP metadata viewer.
  • In macOS Preview: Tools → Show Inspector → the ⓘ tab.
  • Using command-line tools: exiftool filename.pdf shows all extractable metadata.
  • In a text editor: search for /Author, /Title, and <x:xmpmeta. These often sit near the end of the file, but they can appear anywhere and may be unreadable if the file's objects are compressed or encrypted.
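
The text-editor approach can be approximated in a few lines of Python. This is a minimal sketch with a hypothetical `scan_raw_pdf` helper and made-up sample bytes; it only finds DocInfo values written as plain, unescaped literal strings, and it will miss values that are hex-encoded, stored as UTF-16, held in compressed object streams, or encrypted.

```python
import re

# Matches e.g. "/Title (Some text)" where the value has no nested
# parentheses or backslash escapes.
INFO_KEY = re.compile(
    rb"/(Title|Author|Subject|Keywords|Creator|Producer)\s*\(([^()\\]*)\)"
)


def scan_raw_pdf(data: bytes) -> dict:
    """Search raw PDF bytes for DocInfo keys and an XMP packet marker."""
    info = {}
    for key, value in INFO_KEY.findall(data):
        # Keep the first occurrence of each key.
        info.setdefault(key.decode("ascii"), value.decode("latin-1"))
    return {"info": info, "has_xmp": b"<x:xmpmeta" in data}


# Hypothetical fragment of a PDF body, for illustration only.
SAMPLE = (
    b"1 0 obj\n"
    b"<< /Title (Quarterly Report) /Author (Jane Doe) "
    b"/Producer (ExampleLib 2.1) >>\n"
    b"endobj\n"
)
print(scan_raw_pdf(SAMPLE))

# With a real file:
# with open("filename.pdf", "rb") as f:
#     print(scan_raw_pdf(f.read()))
```

For anything beyond a quick look, a dedicated tool such as exiftool is the more reliable route, since it decodes the string formats this scan skips.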

Setting Good Metadata

For PDFs that will be indexed by search engines or document management systems, always set: a descriptive Title (more descriptive than the filename), the correct Author (person or organization), relevant Keywords (5-10 terms), and an appropriate Subject. In Word, set these in File → Info → Properties before exporting. In InDesign, set them in File → File Info. Good metadata can also improve how a PDF appears in Google results: Google may use the Title field as the result's title, so a descriptive title serves readers better than a leftover filename.
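
The checklist above is easy to automate before publishing. The following is a minimal sketch; the `check_metadata` helper and the sample values are hypothetical, and it assumes the fields have already been read into a dictionary and that keywords are separated by commas or semicolons.

```python
import os
import re


def check_metadata(fields: dict, filename: str) -> list:
    """Return a list of problems measured against the checklist above."""
    problems = []
    title = (fields.get("Title") or "").strip()
    # Filename without directory or extension, for comparison with the title.
    stem = os.path.splitext(os.path.basename(filename))[0]
    if not title:
        problems.append("Title is missing")
    elif title.lower() == stem.lower():
        problems.append("Title just repeats the filename")
    if not (fields.get("Author") or "").strip():
        problems.append("Author is missing")
    if not (fields.get("Subject") or "").strip():
        problems.append("Subject is missing")
    raw_keywords = fields.get("Keywords") or ""
    keywords = [k.strip() for k in re.split(r"[,;]", raw_keywords) if k.strip()]
    if not 5 <= len(keywords) <= 10:
        problems.append(f"Expected 5-10 keywords, found {len(keywords)}")
    return problems


# Hypothetical metadata for illustration.
fields = {"Title": "report_final", "Author": "Jane Doe", "Keywords": "pdf, metadata"}
for problem in check_metadata(fields, "report_final.pdf"):
    print(problem)
```

The 5-10 keyword range simply mirrors the guideline in this article; adjust it to whatever your indexing system expects.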
