PDF Explained | April 2, 2026 | 5 min read

What Is a Tagged PDF? Structure Trees and Accessibility

A tagged PDF has a hidden structure tree that defines reading order, headings, images, and tables. Learn why tags matter for accessibility, search, and reflow.

A tagged PDF contains a hidden logical structure tree: a set of tags that describe the semantic role of each piece of content on the page (this is a heading, this is a paragraph, this is an image, this is a table cell). Without tags, a PDF is just a collection of drawing instructions, such as "draw this glyph at coordinate (x, y)." With tags, software knows what the content *means*, which enables accessibility tools, document reflow, and reliable copy-paste.

What the Structure Tree Contains

In a tagged PDF, the Catalog points to a structure tree root (StructTreeRoot), and the tags beneath it typically begin with a single "Document" tag. Under it, the hierarchy mirrors the document's logical organization:

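  • Part / Sect / Art / Div: grouping elements (parts, sections, articles, generic divisions)
  • H1–H6: heading levels
  • P: paragraphs
  • L / LI / LBody: lists, list items, and list item bodies
  • Table / TR / TH / TD: tables, rows, header cells, and data cells
  • Figure: images and graphics, with an Alt attribute that holds the alternative text
  • Form: interactive form fields
  • Artifact: decorative elements and running headers and footers that screen readers should skip. These are typically marked as artifacts in the page content and left out of the structure tree.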
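
As a rough illustration of the hierarchy above, the following Python sketch models a structure tree as nested nodes and prints it as an indented outline. The `Tag` class and the sample document are hypothetical stand-ins; a real PDF stores these as structure element dictionaries, not Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical stand-in for one structure element."""
    role: str                      # standard type such as "H1", "P", "Figure"
    text: str = ""                 # text content, if any
    alt: str = ""                  # Alt attribute, mainly used on Figure
    children: list["Tag"] = field(default_factory=list)


def outline(tag: Tag, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, one per tag."""
    label = tag.role
    if tag.alt:
        label += f' (Alt: "{tag.alt}")'
    lines = ["  " * depth + label]
    for child in tag.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Made-up sample document for illustration only.
doc = Tag("Document", children=[
    Tag("H1", text="Annual Report"),
    Tag("P", text="Revenue grew this year."),
    Tag("Figure", alt="Bar chart of revenue by quarter"),
    Tag("L", children=[
        Tag("LI", children=[Tag("LBody", text="First item")]),
    ]),
])

print("\n".join(outline(doc)))
```

The point of the sketch is only that the tree records roles and nesting, independent of where anything is drawn on the page.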

Why Tags Matter for Screen Readers

Screen readers like JAWS or NVDA read a PDF by traversing the structure tree, not by scanning the visual layout. In an untagged PDF, the screen reader guesses reading order from the positions of text objects on the page — often producing garbled, out-of-order results. With proper tags, the screen reader follows the intended reading sequence: main heading first, then sections in order, alt text for images, form labels before form fields. Tags are what make a PDF usable by someone who cannot see it.
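
To see why position-based guessing can go wrong, consider a two-column page. The sketch below is a deliberately simplified toy, not how any particular screen reader works: the coordinates, the `guess_order` heuristic, and the `tag_order` list are all assumptions made for illustration.

```python
# Each text object is (x, y, text). For simplicity y grows downward here,
# which is an assumption of this toy (PDF's own y axis grows upward).
page = [
    (50, 100, "Left column, line 1."),
    (300, 100, "Right column, line 1."),
    (50, 120, "Left column, line 2."),
    (300, 120, "Right column, line 2."),
]


def guess_order(objects):
    """Naive layout guess: top to bottom, then left to right."""
    ordered = sorted(objects, key=lambda obj: (obj[1], obj[0]))
    return [text for _x, _y, text in ordered]


def tagged_order(objects, order):
    """Follow the order recorded by the tags (indices into `objects`)."""
    return [objects[i][2] for i in order]


# Hypothetical tag order: the whole left column, then the right column.
tag_order = [0, 2, 1, 3]

print(guess_order(page))              # interleaves the two columns
print(tagged_order(page, tag_order))  # reads each column in full
```

The naive guess reads across both columns line by line, while the tag order keeps each column together, which is the sequence the author intended.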

Tags and PDF Reflow

When you open a PDF on a small screen (phone, e-reader) and enable "Reflow" mode, the viewer uses the structure tree to reformat the text into a single column that fits the screen width. Without tags, the viewer has to infer structure from the page layout, and the result is often jumbled or unavailable. With a well-structured tag tree, reflow produces readable single-column text. PDFs that were tagged carefully during production, as many publishers and government agencies aim to do, tend to reflow well for this reason.
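
The idea behind reflow can be sketched in a few lines of Python: once the content is available as tagged text in reading order, the original coordinates can be ignored and the text wrapped to any width. The `tagged_content` list and the `reflow` function are hypothetical, and uppercase headings are just a crude way to mark them in plain text.

```python
import textwrap

# Hypothetical tagged content, already in reading order: (role, text).
tagged_content = [
    ("H1", "Tags and Reflow"),
    ("P", "When a viewer reflows a tagged PDF, it can ignore the original "
          "page coordinates and lay the text out again at any width."),
]


def reflow(content, width):
    """Wrap each tagged block to `width` characters, one block per paragraph."""
    blocks = []
    for role, text in content:
        if role.startswith("H"):
            text = text.upper()  # crude plain-text heading marker
        blocks.append(textwrap.fill(text, width=width))
    return "\n\n".join(blocks)


print(reflow(tagged_content, 40))
```

Calling `reflow(tagged_content, 25)` instead would produce narrower lines from the same tagged content, which is what a viewer does when the screen gets smaller.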

How to Check If a PDF Is Tagged

In Adobe Acrobat Reader, go to File → Properties, open the Description tab, and look for "Tagged PDF: Yes." Chrome's built-in PDF viewer does not clearly report tagging status, so do not rely on it for this check. Programmatically, the PDF's Catalog dictionary contains a MarkInfo dictionary whose Marked entry is true when the document is tagged, along with a StructTreeRoot entry that points to the structure tree. You can also use the free PAC (PDF Accessibility Checker) tool to validate both the presence of tags and whether they are correctly structured.
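
A minimal Python sketch of the programmatic check might look like this. The helper names are hypothetical. `is_tagged` works on any dictionary-like Catalog, and `is_tagged_file` assumes the third-party pypdf package is installed. Requiring a StructTreeRoot entry in addition to Marked is a design choice of this sketch, on the assumption that a flag without a tree is not useful.

```python
def _resolve(obj):
    """Follow an indirect reference if the object offers get_object()."""
    return obj.get_object() if hasattr(obj, "get_object") else obj


def is_tagged(catalog) -> bool:
    """Return True if the Catalog has MarkInfo/Marked true and a StructTreeRoot."""
    mark_info = _resolve(catalog.get("/MarkInfo"))
    if not mark_info:
        return False
    marked = _resolve(mark_info.get("/Marked", False))
    # pypdf wraps booleans in an object with a .value attribute;
    # plain Python bools pass through unchanged.
    marked = getattr(marked, "value", marked)
    return bool(marked) and "/StructTreeRoot" in catalog


def is_tagged_file(path: str) -> bool:
    """Check a file on disk. Assumes pypdf is installed (pip install pypdf)."""
    from pypdf import PdfReader  # imported lazily so is_tagged() has no dependency
    return is_tagged(PdfReader(path).trailer["/Root"])


# Plain dictionaries standing in for a Catalog:
print(is_tagged({"/MarkInfo": {"/Marked": True}, "/StructTreeRoot": {}}))
print(is_tagged({}))
```

A result of True only says that tags are present. It says nothing about whether they are correct, which is what a validator such as PAC checks.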

Tagged PDF vs. PDF/UA

Tagged PDF means the structure tree exists. PDF/UA (ISO 14289) means the structure tree is correctly and completely constructed: all images have alt text, reading order is logical, heading levels are not skipped, and tables have header cells. Being tagged is a necessary condition for PDF/UA compliance, but many tagged PDFs still fail PDF/UA validation because the tags are incomplete or incorrectly applied. For government and regulated industry documents, full PDF/UA compliance is the target, not just "Tagged PDF: Yes."
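
The gap between "tagged" and "correctly tagged" can be illustrated with a small Python sketch that runs three of the checks named above over a nested-dictionary model of a structure tree. The data model, the function names, and the messages are hypothetical, and real PDF/UA validation covers far more than these three rules.

```python
def walk(node):
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find_problems(root):
    """Return a list of human-readable problems found in the tree."""
    problems = []
    last_level = 0
    for node in walk(root):
        role = node["role"]
        if role == "Figure" and not node.get("alt"):
            problems.append("Figure without alt text")
        if len(role) == 2 and role[0] == "H" and role[1].isdigit():
            level = int(role[1])
            if level > last_level + 1:
                problems.append(f"Heading level skipped before H{level}")
            last_level = level
        if role == "Table" and not any(n["role"] == "TH" for n in walk(node)):
            problems.append("Table without header cells")
    return problems


# A made-up document that is tagged, but badly.
tagged_but_flawed = {"role": "Document", "children": [
    {"role": "H1"},
    {"role": "H3"},                      # skips H2
    {"role": "Figure"},                  # no alt text
    {"role": "Table", "children": [
        {"role": "TR", "children": [{"role": "TD"}]},   # no TH cells
    ]},
]}

for problem in find_problems(tagged_but_flawed):
    print(problem)
```

This sample would pass a simple "is it tagged?" test while still reporting three problems, which mirrors why many tagged PDFs fail PDF/UA validation.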
