Problem → Solution · April 2, 2026 · 5 min read

Copying Text From PDF Shows Wrong or Garbled Characters — How to Fix

PDF text that copies as random symbols, question marks, or wrong letters has an encoding problem. Learn why it happens and how to extract the correct text.

If text copied from a PDF pastes as random symbols, reversed characters, single symbols in place of ligatures (copying "fi" yields one character), or completely wrong letters, the PDF has an encoding problem: the ToUnicode mapping in its font definition is missing or incorrect. The text looks correct on screen because the renderer only needs the glyph shapes to draw the page, but the character information used for copying is wrong or missing.
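A quick way to judge whether pasted text suffers from this problem is to count characters that should not appear in ordinary prose. The sketch below is a heuristic only: the function name `suspicious_ratio` and the choice of categories are assumptions made for illustration, not part of any PDF tool.

```python
import unicodedata

def suspicious_ratio(text):
    """Return the share of non-whitespace characters that look like encoding debris."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0

    def is_suspicious(c):
        category = unicodedata.category(c)
        # Co = private use, Cn = unassigned, Cc = control character.
        # U+FFFD is the Unicode replacement character ("unknown").
        return category in ("Co", "Cn", "Cc") or c == "\ufffd"

    return sum(is_suspicious(c) for c in chars) / len(chars)

# Example: two private-use characters out of four non-space characters.
print(suspicious_ratio("\ue001\ue002ab"))
```

A high ratio suggests a missing ToUnicode map. A low ratio does not prove the text is correct: a wrong map can produce ordinary letters that are simply the wrong ones, which this check cannot see.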

Why This Happens

PDF fonts map character codes (numbers) to glyph shapes. A separate ToUnicode map tells the viewer "code 65 corresponds to Unicode character U+0041 (A)." When ToUnicode is missing or wrong, the viewer still draws the correct glyph (so the page looks right) but cannot tell which character that glyph represents when you copy it. This is common in PDFs exported from old InDesign or Quark versions, PDFs from certain Asian-language typesetting systems, PDFs with ligature glyphs (ﬁ, ﬂ, ﬃ) not mapped to their Unicode equivalents, and scanned-then-OCR'd PDFs with imprecise character mapping.
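The split between drawing and copying can be illustrated with a toy model. This is a simplified sketch, not how any real viewer is implemented: the `GLYPHS` table, the function names, and the fallback to U+FFFD are all assumptions chosen for illustration (real viewers use various fallbacks).

```python
# Toy model of a subset font: codes select glyph shapes for drawing.
GLYPHS = {1: "shape-H", 2: "shape-i"}  # hypothetical glyph table

def render(codes):
    """Drawing needs only the glyph table, never the ToUnicode map."""
    return [GLYPHS[code] for code in codes]

def copy_text(codes, to_unicode):
    """Copying needs ToUnicode; unknown codes become U+FFFD in this sketch."""
    return "".join(to_unicode.get(code, "\ufffd") for code in codes)

page = [1, 2]
print(render(page))                          # same shapes in every case
print(copy_text(page, {1: "H", 2: "i"}))     # correct map: prints "Hi"
print(copy_text(page, {1: "K", 2: "l"}))     # wrong map: prints "Kl"
print(copy_text(page, {}))                   # missing map: replacement characters
```

In all three cases `render` returns the same shapes, which is why a broken file can look perfect on screen and still paste as garbage.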

Fix 1: Re-Export From the Source Application

If you have the source document (InDesign, Word, Publisher), re-export the PDF from a current version of the application. In InDesign (CS6+), export to PDF and keep the standard font and encoding options in the Advanced tab enabled. In older InDesign versions, broken text encoding was a known bug that was fixed in CS5.5. Exporting again from a current version of InDesign, Illustrator, or Word produces a PDF with correct ToUnicode maps, making the text fully copyable.

Fix 2: Run OCR to Replace the Text Layer

When re-export is not possible, running OCR on the file replaces the broken font-encoded text with freshly recognized Unicode characters. In Acrobat Pro: Tools → Enhance Scans → Recognize Text → In This File. Choose the correct language and run recognition. The OCR engine reads the visual glyphs (not the broken encoding) and writes Unicode characters, so copy-paste returns readable text afterward. The trade-off: OCR can introduce recognition errors, especially with unusual fonts or small text.

Fix 3: Use a PDF Text Extractor That Handles Encoding

Some PDF text extraction tools handle broken ToUnicode maps better than clipboard copy. Try pdftotext from the Poppler library (command line: pdftotext -enc UTF-8 file.pdf), which can fall back on other font information, such as the font's built-in encoding and glyph names, when the ToUnicode map is missing. Apache PDFBox's ExtractText command also handles some encoding recovery. Neither tool can repair a severely broken encoding, but both often recover more readable text than clipboard copy.
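For batch work, the pdftotext call can be wrapped in a small script. The following is a minimal sketch, assuming Poppler is installed and pdftotext is on the PATH; the function names are hypothetical, and passing "-" as the output file makes pdftotext write to standard output.

```python
import shutil
import subprocess

def pdftotext_command(pdf_path):
    """Build the argument list; "-" sends the extracted text to stdout."""
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]

def extract_text(pdf_path):
    """Run pdftotext and return the extracted text as a Python string."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH; install Poppler first")
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    # Decode defensively so one bad byte does not abort the whole run.
    return result.stdout.decode("utf-8", errors="replace")

# Usage (file.pdf is a placeholder path):
#   text = extract_text("file.pdf")
```

Comparing this output with what the clipboard gives you is a quick way to see whether the extractor recovers anything the viewer does not.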

Fix 4: Ligature and Special Character Lookup

If only certain characters copy wrong, specifically sequences like "fi", "fl", "ffi", and "ffl" showing up as single symbols, this is a ligature encoding issue. The font uses combined ligature glyphs but maps each one to a single code point, often in the private-use area, rather than to its component characters. Acrobat Pro's "Copy with Formatting" sometimes handles ligatures better than plain copy. Alternatively, use search-and-replace on the pasted text: for each ligature, copy the symbol, paste it into Find, type the correct letters in Replace, and replace all.
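The search-and-replace step can be automated once you know which symbols stand for which letters. This sketch covers the standard Unicode ligature code points (U+FB00 to U+FB04); the function name `expand_ligatures` and the `extra` parameter are assumptions for illustration. Private-use symbols differ from font to font, so those mappings have to be supplied by hand after inspecting the pasted text.

```python
# Standard Latin ligatures from the Alphabetic Presentation Forms block.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}

def expand_ligatures(text, extra=None):
    """Replace ligature symbols with their component letters.

    extra: optional dict of single characters (for example private-use
    symbols found in a specific PDF) mapped to replacement strings.
    """
    table = dict(LIGATURES)
    if extra:
        table.update(extra)
    return text.translate(str.maketrans(table))

print(expand_ligatures("\ufb01le o\ufb03ce"))              # prints "file office"
print(expand_ligatures("\ue001oor", {"\ue001": "fl"}))     # prints "floor"
```

For the standard ligatures alone, Unicode NFKC normalization gives the same expansion, but it also rewrites other compatibility characters, so an explicit table is the more conservative choice.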


