
What Is a PDF? Everything You Need to Know

Learn what PDF files are, how they work, and why they're the standard for document sharing. Covers history, features, and common uses.

February 22, 2026 · 9 min read

Convert-To Editorial Team


PDF files are so embedded in modern work that most people never think about what's actually inside them. You click a link, the document opens, and the text looks exactly the way the author intended — regardless of whether you're on a MacBook, an Android phone, or a government workstation running a decade-old operating system. That reliability isn't an accident. It's the result of a carefully designed specification that treats every page as a self-contained canvas with embedded fonts, precise coordinates, and multiple layers of content stacked on top of each other.

The Origin Story: Why PDF Was Invented

In 1991, Adobe co-founder John Warnock circulated an internal memo titled "The Camelot Project." The problem he described was simple: documents looked different on every computer, every printer, and every operating system. A letter formatted in WordPerfect on DOS would look completely different when printed on a Macintosh. Fonts wouldn't match, spacing would shift, and tables would collapse.

Warnock's proposed solution was a universal file format that described documents independent of the software, hardware, or operating system used to view them. In 1993, Adobe released the first version of PDF (Portable Document Format) alongside Acrobat Reader. Initial adoption was slow, partly because Acrobat Reader cost $50 at first, but once Adobe made the reader free in 1994, PDF's dominance began.

PDF became an open ISO standard (ISO 32000) in 2008, meaning anyone can create PDF software without licensing fees from Adobe. Today, PDF is the default format for invoices, contracts, academic papers, government forms, and billions of other documents worldwide.

Inside a PDF: How Content Is Structured

A PDF file isn't a simple text document. Open one in a text editor and you'll see a mix of readable keywords and binary data. The internal structure has four main layers:

Objects and the Cross-Reference Table

Every element in a PDF (text blocks, images, fonts, annotations) is stored as a numbered object. A cross-reference table (xref) near the end of the file maps each object number to its byte position, allowing PDF readers to jump directly to any object without scanning the entire file. This is why large PDFs (hundreds of pages) can open and navigate quickly.
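To make the lookup concrete, here is a minimal sketch in Python. It writes a tiny PDF-like file with a classic cross-reference table, then reads one object back by offset rather than by scanning. The helper names `build_pdf` and `read_object` are hypothetical, and the sketch assumes a single xref section with no incremental updates or cross-reference streams, which real files often have.

```python
def build_pdf(objects):
    """Serialize object bodies (bytes) into a minimal file with a classic xref table."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))                  # byte position where this object starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"               # entry 0 is the head of the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off          # every entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


def read_object(pdf, num):
    """Fetch object `num` via the xref table instead of scanning the whole file."""
    # The last 'startxref' keyword is followed by the byte offset of the table.
    tail = pdf[pdf.rfind(b"startxref"):]
    xref_pos = int(tail.split()[1])
    # Skip the "xref" line and the "0 N" subsection header line.
    entries_start = pdf.index(b"\n", pdf.index(b"\n", xref_pos) + 1) + 1
    entry = pdf[entries_start + 20 * num : entries_start + 20 * (num + 1)]
    offset = int(entry[:10])                      # first 10 digits are the byte offset
    return pdf[offset : pdf.index(b"endobj", offset)]


objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
]
pdf = build_pdf(objects)
print(read_object(pdf, 3).decode())
```

Because every table entry has a fixed width, finding object 3 is a matter of arithmetic plus one seek, regardless of how large the file is.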

The Page Tree

Pages are organized in a tree structure. Each page object defines its dimensions in points (1/72 inch), typically 612 x 792 for US Letter (8.5 x 11 inches) or roughly 595 x 842 for A4 (210 x 297 mm), and references the content streams that contain the visible elements. This tree structure is why PDF readers can jump straight to any page: they don't need to process the pages before it.
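A rough sketch of how a reader might locate a page in such a tree, written in Python. The nodes are modeled as plain dictionaries, which is an assumption made for illustration; the `Label` key is hypothetical and only there so the result is easy to see. The `Count` on each intermediate node lets the search skip whole subtrees.

```python
def find_page(node, index):
    """Return the leaf page at zero-based `index` below `node`."""
    if node["Type"] == "Page":
        if index != 0:
            raise IndexError("page index out of range")
        return node
    for kid in node["Kids"]:
        # An intermediate node stores how many leaf pages sit beneath it.
        count = kid["Count"] if kid["Type"] == "Pages" else 1
        if index < count:
            return find_page(kid, index)
        index -= count                     # skip this whole subtree
    raise IndexError("page index out of range")


def page(label):
    # US Letter: 8.5 x 11 inches at 72 points per inch.
    return {"Type": "Page", "MediaBox": [0, 0, 612, 792], "Label": label}


tree = {
    "Type": "Pages",
    "Count": 5,
    "Kids": [
        {"Type": "Pages", "Count": 3, "Kids": [page("p1"), page("p2"), page("p3")]},
        {"Type": "Pages", "Count": 2, "Kids": [page("p4"), page("p5")]},
    ],
}

print(find_page(tree, 3)["Label"])  # prints "p4"
```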

Content Streams

The actual visible content lives in content streams — compressed sequences of drawing operators. These operators position text at exact coordinates, draw lines and curves, place images, and set colors. A simplified excerpt looks something like this:

BT                          % Begin text block
/F1 12 Tf                   % Use font F1 at 12pt
72 720 Td                   % Move to position (72, 720)
(Hello, World) Tj           % Draw the text string
ET                          % End text block

This coordinate-based approach is fundamentally different from how Word documents work. In Word, paragraphs flow and wrap automatically. In PDF, each run of text is drawn at an explicit position on the page, and nothing reflows when the text changes.
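As an illustration of what a reader does with those operators, the following Python sketch interprets the excerpt above and reports where each string lands. It is a toy: the function name `text_runs` is hypothetical, it handles only `BT`, `Tf`, `Td`, and `Tj`, and it assumes comments never appear inside strings.

```python
import re

STREAM = """\
BT                          % Begin text block
/F1 12 Tf                   % Use font F1 at 12pt
72 720 Td                   % Move to position (72, 720)
(Hello, World) Tj           % Draw the text string
ET                          % End text block
"""

# Strings in parentheses, /Names, numbers, then bare operators.
TOKEN = re.compile(r"\((?:[^()\\]|\\.)*\)|/[^\s()/]+|[-+]?\d*\.?\d+|[A-Za-z*']+")


def text_runs(stream):
    """Return (font, size, x, y, text) for each Tj in a simple content stream."""
    stream = re.sub(r"%.*", "", stream)           # drop comments (simplification)
    stack, runs = [], []
    font, size, x, y = None, None, 0.0, 0.0
    for tok in TOKEN.findall(stream):
        if tok[0] in "(/+-.0123456789":
            stack.append(tok)                     # operands precede their operator
            continue
        if tok == "BT":
            x, y = 0.0, 0.0                       # text position resets per block
        elif tok == "Tf":
            size = float(stack.pop())
            font = stack.pop()
        elif tok == "Td":
            ty = float(stack.pop())
            tx = float(stack.pop())
            x, y = x + tx, y + ty                 # Td is an offset from the line start
        elif tok == "Tj":
            runs.append((font, size, x, y, stack.pop()[1:-1]))
        stack.clear()
    return runs


print(text_runs(STREAM))
```

Note that the output is a list of positioned strings, not paragraphs. Any notion of "this is a heading" or "these lines form a paragraph" has to be inferred afterwards.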

Resource Dictionaries

Fonts, images, and color profiles are stored in resource dictionaries and referenced by content streams. This is how PDFs embed fonts — the actual font outlines (or subsets of them) are included in the file, so the document renders correctly even if the viewer's computer doesn't have the same fonts installed.

PDF Compression: Why File Sizes Vary So Much

A 5-page PDF can be 50 KB or 50 MB depending on its content. PDF supports multiple compression methods, often using different algorithms for different objects within the same file:

| Content Type | Compression Method | Typical Ratio |
|---|---|---|
| Text and vector graphics | Flate (DEFLATE/zlib) | 5:1 to 20:1 |
| Photographs | JPEG or JPEG2000 | 10:1 to 40:1 |
| Scanned pages | JBIG2 (bitonal), JPEG2000 | 5:1 to 100:1 |
| Metadata and structure | Flate | 3:1 to 10:1 |

The biggest factor in PDF file size is embedded images. A single high-resolution photograph (4000x3000 pixels at 24-bit color, uncompressed) adds roughly 36 MB to a PDF. With JPEG compression in the 10:1 to 40:1 range listed above, that drops to roughly 1 to 4 MB, and lower still at more aggressive quality settings. This is why compressing a PDF often reduces file size dramatically: the tool re-compresses embedded images at a lower quality setting.
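The arithmetic behind the uncompressed figure, and the sizes implied by the table's 10:1 to 40:1 range for photographs, can be checked with a few lines of Python. The second half is a small Flate demonstration using the standard `zlib` module. The function names are hypothetical, and the sample stream is far more repetitive than real page content, so its ratio should not be read as typical.

```python
import zlib


def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size of an image; 3 bytes per pixel assumes 24-bit RGB."""
    return width * height * bytes_per_pixel


def size_range(raw_bytes, low_ratio, high_ratio):
    """Smallest and largest compressed size for a given range of ratios."""
    return raw_bytes / high_ratio, raw_bytes / low_ratio


raw = raw_image_bytes(4000, 3000)
smallest, largest = size_range(raw, 10, 40)
print(f"uncompressed: {raw / 1e6:.1f} MB")                               # 36.0 MB
print(f"at 10:1 to 40:1: {smallest / 1e6:.1f} to {largest / 1e6:.1f} MB")  # 0.9 to 3.6 MB

# Flate is lossless: decompressing returns the original bytes exactly.
stream = b"BT /F1 12 Tf 72 720 Td (Hello, World) Tj ET\n" * 200
packed = zlib.compress(stream)
print(f"Flate: {len(stream)} -> {len(packed)} bytes")
```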

Convert-To Tip

If a PDF is unexpectedly large, it usually contains high-resolution images. Before sharing via email (most providers cap attachments at 25 MB), run it through our PDF compressor — a typical 15 MB report with photos compresses down to 3-5 MB with minimal visible quality loss.

Types of PDF: Not All PDFs Are Created Equal

There are three fundamentally different kinds of PDF, and understanding which type you're working with explains why some PDFs are easy to convert and others aren't.

Text-Based PDFs (Native or "Born-Digital")

These are created by saving or printing from applications like Word, InDesign, or Google Docs. The text is stored as actual character data with font information, making it fully searchable, selectable, and convertible. This is the most common type for business documents and the easiest to work with.

Image-Based PDFs (Scanned Documents)

When you scan a paper document, the result is essentially a photograph wrapped in a PDF container. The "text" you see is actually pixels in an image — it can't be selected, searched, or copied. Converting these to editable formats requires OCR (Optical Character Recognition), which analyzes the image and attempts to identify characters.

Hybrid PDFs (OCR-Processed Scans)

After running OCR on a scanned document, the PDF contains both the original scan image and an invisible text layer aligned with it. This allows the text to be searched and selected while preserving the visual appearance of the original scan. Many modern document scanners and scanning apps can produce hybrid PDFs automatically.

| PDF Type | Text Selectable | Searchable | Convertible | Typical Source |
|---|---|---|---|---|
| Text-based | Yes | Yes | High accuracy | Word, InDesign, web browsers |
| Image-based | No | No | Requires OCR first | Scanners, photos of documents |
| Hybrid (OCR'd) | Yes | Yes | Moderate accuracy | Scanned + OCR processed |
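One rough way to tell the three types apart is to look at which operators a page's decoded content stream uses. The Python sketch below is a heuristic under stated assumptions, not a robust detector: `classify_page` is a hypothetical name, it assumes the stream is already decompressed, it treats any `Do` operator as an image (it can also paint other reusable content), and it treats text rendering mode 3 (`3 Tr`, invisible text) as the sign of an OCR layer.

```python
import re


def classify_page(content):
    """Guess the PDF type of one page from its decoded content stream."""
    has_text = re.search(r"[)>\]]\s*T[jJ]\b", content) is not None   # text-showing operators
    has_image = re.search(r"/\w+\s+Do\b", content) is not None       # paints an external object
    invisible = re.search(r"\b3\s+Tr\b", content) is not None        # render mode 3 = invisible
    if has_text and has_image and invisible:
        return "hybrid"
    if has_text:
        return "text-based"
    if has_image:
        return "image-based"
    return "empty"


native = "BT /F1 12 Tf 72 720 Td (Hello) Tj ET"
scan = "q 612 0 0 792 0 0 cm /Im0 Do Q"
ocr_scan = scan + " BT 3 Tr /F1 12 Tf 72 720 Td (Hello) Tj ET"

for sample in (native, scan, ocr_scan):
    print(classify_page(sample))
```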

Common PDF Operations and When They Break

Merging PDFs

Combining multiple PDFs is generally reliable because the operation mostly copies page objects into a new file, renumbers them so they don't collide, and writes a fresh cross-reference table. Problems occur when the source PDFs use conflicting encryption settings, incompatible PDF versions, or when one file has corrupted internal references.
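Merging is a little more than gluing files together, because object numbers in different files overlap. The Python sketch below shows the renumbering step only, with each document modeled as a dictionary of object number to body text. That model, and the names `renumber` and `merge`, are assumptions for illustration; a real merger also rebuilds the page tree and would not rewrite references with a regular expression.

```python
import re

# An indirect reference looks like "12 0 R": object number, generation, keyword R.
REF = re.compile(r"\b(\d+) (\d+) R\b")


def renumber(objects, shift):
    """Shift every object number, and every reference to one, by `shift`."""
    def bump(match):
        return f"{int(match.group(1)) + shift} {match.group(2)} R"
    return {num + shift: REF.sub(bump, body) for num, body in objects.items()}


def merge(doc_a, doc_b):
    """Combine two object tables so that no object numbers collide."""
    shifted = renumber(doc_b, max(doc_a))
    return {**doc_a, **shifted}


doc_a = {1: "<< /Type /Page /Contents 2 0 R >>", 2: "stream A"}
doc_b = {1: "<< /Type /Page /Contents 2 0 R >>", 2: "stream B"}

merged = merge(doc_a, doc_b)
for num, body in sorted(merged.items()):
    print(num, body)
```

A corrupted reference in either source, such as a pointer to an object that doesn't exist, survives the renumbering and ends up pointing at the wrong object or at nothing, which is one way merges go wrong.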

Splitting PDFs

Extracting specific pages works the same way in reverse — isolating page objects and creating a new cross-reference table. The main gotcha: interactive form fields that reference data on other pages may break when those pages are removed.

PDF to Word Conversion

Converting PDF to Word requires reverse-engineering the layout from absolute coordinates back into flowing paragraphs, styles, and tables. Simple text documents convert well. Complex layouts with multiple columns, text boxes, and wrapped images frequently break. In our testing, a standard business letter converts with 98% accuracy, while a two-column academic paper with footnotes converts with roughly 70-80% accuracy, requiring manual cleanup.

When PDF Operations Fail

Merging, splitting, and conversion tend to fail or give unreliable results when:

  • The PDF is encrypted and the password is required for modification
  • The file is corrupted (damaged object references, truncated streams)
  • The PDF uses features from a newer specification version than the tool supports
  • Form fields use XFA (XML Forms Architecture), which many non-Adobe tools can't process
  • The document contains complex layer structures (PDF/X print files with spot colors and overprint)
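Several of these conditions can be spotted with a quick check before attempting an operation. The Python sketch below scans raw bytes for a few markers. It is a heuristic only: `preflight` and its `max_version` parameter are hypothetical, and a clean result proves little, since `/Encrypt` or `/XFA` can sit inside compressed object streams and the header version can be overridden elsewhere in the file.

```python
import re


def preflight(pdf, max_version=(1, 7)):
    """Return a list of likely problems found by scanning the raw bytes."""
    problems = []
    header = re.match(rb"%PDF-(\d)\.(\d)", pdf)
    if header is None:
        problems.append("missing %PDF header")
    elif (int(header.group(1)), int(header.group(2))) > max_version:
        problems.append("newer PDF version than supported")
    if b"/Encrypt" in pdf:
        problems.append("encrypted")
    if b"/XFA" in pdf:
        problems.append("XFA form")
    if b"%%EOF" not in pdf[-1024:]:
        problems.append("no end-of-file marker (possibly truncated)")
    return problems


ok_file = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\nstartxref\n9\n%%EOF\n"
bad_file = b"%PDF-2.0\n<< /Encrypt 5 0 R >>\n<< /XFA 6 0 R >>\n"

print(preflight(ok_file))
print(preflight(bad_file))
```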

PDF Variants You Should Know About

| Variant | Standard | Purpose | Key Restriction |
|---|---|---|---|
| PDF/A | ISO 19005 | Long-term archival | No external dependencies, all fonts embedded |
| PDF/X | ISO 15930 | Print production | Color management required, no transparency in some versions |
| PDF/E | ISO 24517 | Engineering documents | 3D content support, large format pages |
| PDF/UA | ISO 14289 | Accessibility | Full tag structure required for screen readers |
| PDF 2.0 | ISO 32000-2 | Latest general standard | Deprecates XFA forms, adds AES-256 encryption |

For everyday use, standard PDF is sufficient. PDF/A matters if you're in a regulated industry (legal, government, healthcare) where documents must remain readable for decades without depending on specific software.

Security Features and Limitations

PDF supports two kinds of password protection:

  • User password (open password): Prevents opening the document without the password. Uses AES-256 encryption in modern PDFs.
  • Owner password (permissions password): Restricts actions like printing, copying, or editing. The document can be opened without a password, but modifications are restricted.

A critical limitation: owner passwords only enforce restrictions through the PDF viewer's cooperation. Specialized tools can ignore permission restrictions while still opening the document. For genuinely secure document sharing, always use a user (open) password with strong encryption, not just permission restrictions.
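The reason permission restrictions are advisory becomes clearer when you see how they are stored: as bit flags in a single integer (the `/P` entry of the encryption dictionary) that the viewer is expected to check. The Python sketch below decodes that integer. The bit positions follow the permissions table in the PDF specification as best I recall it, so treat the mapping as an assumption to verify; the function name `allowed` and the labels are hypothetical.

```python
# Bit positions are 1-based, as the specification numbers them.
PERMISSION_BITS = {
    3: "print",
    4: "modify",
    5: "copy",
    6: "annotate",
    9: "fill forms",
    10: "extract for accessibility",
    11: "assemble",
    12: "print high quality",
}


def allowed(p):
    """Return the set of actions a /P value permits."""
    bits = p & 0xFFFFFFFF                   # /P is stored as a signed 32-bit integer
    return {name for bit, name in PERMISSION_BITS.items() if bits & (1 << (bit - 1))}


print(allowed(-3904))   # set(): every listed action is restricted
print(allowed(-3900))   # {'print'}: same value with the print bit set
```

Nothing in the file format stops a tool from skipping this check once the document has been opened, which is why an open password is the only restriction that actually protects the content.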

Privacy Note

When you convert a file on Convert-To.co, it is processed by CloudConvert, a GDPR-compliant and ISO 27001 certified service. All files are automatically deleted within 15 minutes after conversion. Convert-To.co does not store your files on its own servers. For documents containing sensitive information (contracts, medical records, financial data), consider whether online processing aligns with your data handling requirements.

Updated 2/22/2026
