Glossary
23 terms, definitions, and explanations covering PDF technology, security, formats, and standards. Bookmark this page: it's a reference you'll come back to.
Portable Document Format
A file format developed by Adobe in 1993 and standardized as ISO 32000. PDF files present documents — including text, fonts, images, and vector graphics — in a way that looks identical on any device, in any application, regardless of the operating system.
PDF for Archival (ISO 19005)
A specialized subset of PDF designed for long-term preservation of digital documents. PDF/A embeds all fonts, color profiles, and metadata needed to render the document identically in 50+ years. It prohibits features that may fail over time (font linking, encryption, JavaScript). Multiple conformance levels exist: 1b (basic), 2b (better compression, layers), 3b (with embedded files).
PDF/Universal Accessibility (ISO 14289)
A standard that ensures PDFs are accessible to users with disabilities, particularly those using screen readers. Requires tagged content, reading order, alternative text for images, and proper form field labeling. PDF/UA compliance is increasingly required by government and education procurement standards.
Optical Character Recognition
The technology that converts images of text (scanned documents, photos, screenshots) into machine-readable, searchable, and editable text. Modern OCR uses deep learning models to recognize characters in 100+ languages, handle multiple fonts and sizes, and deal with poor scan quality. Common OCR libraries include Tesseract (open source), ABBYY FineReader, and Adobe Acrobat's built-in OCR.
PDF structure for accessibility
Adding semantic structure to a PDF that defines the reading order, identifies headings, lists, and tables, and provides alternative text for images. Tagged PDFs are accessible to screen readers and comply with PDF/UA standards. Most modern word processors can export tagged PDFs, but many PDFs from scanned documents or older software lack tags.
Two PDF compression methods
Lossless compression preserves every byte of the original data: the file is smaller but the visual output is identical. Lossy compression achieves smaller files by reducing image quality or discarding data, at the cost of a slight visual difference. The Compress PDF tool uses both: text is lossless, images are intelligently downscaled.
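The difference can be sketched in a few lines of Python. This is only an illustration, not how the Compress PDF tool works internally: the lossless part uses DEFLATE (the algorithm behind PDF's FlateDecode filter) from the standard `zlib` module, while the lossy part uses a simple quantization step chosen for this example. The sample data and the `step` value are hypothetical.

```python
import zlib


def lossless_roundtrip(data: bytes) -> bytes:
    """Compress with DEFLATE, then decompress; the result equals the input."""
    return zlib.decompress(zlib.compress(data, level=9))


def lossy_quantize(samples: bytes, step: int = 16) -> bytes:
    """Round each 8-bit sample down to a multiple of `step`, discarding detail."""
    return bytes((s // step) * step for s in samples)


# Hypothetical 8-bit grayscale samples standing in for image data.
pixels = bytes((i * 7) % 256 for i in range(4096))

restored = lossless_roundtrip(pixels)   # identical to the original bytes
quantized = lossy_quantize(pixels)      # close to the original, but not equal

print("lossless round trip identical:", restored == pixels)
print("lossy result identical:", quantized == pixels)
print("largest per-sample error:", max(a - b for a, b in zip(pixels, quantized)))
```

The lossless path can always be undone exactly; the lossy path cannot, because the discarded detail is no longer stored anywhere.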
Advanced Encryption Standard, 256-bit key
The encryption standard used by PdfPix and most modern PDF tools. AES-256 is considered unbreakable by brute force with current technology: trying every possible 256-bit key would take vastly longer than the age of the universe. The Protect PDF tool uses AES-256 encryption to secure your documents.
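A rough back-of-the-envelope calculation shows the scale of a brute-force search. The attacker speed below (10^18 guesses per second) is an assumption picked for illustration, not a measured figure.

```python
KEYSPACE = 2 ** 256                    # number of possible 256-bit keys
GUESSES_PER_SECOND = 10 ** 18          # assumed attacker speed (hypothetical)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# On average, half of the keyspace is searched before the key is found.
expected_years = (KEYSPACE // 2) // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(f"about 10^{len(str(expected_years)) - 1} years")
```

Even with a much faster attacker, the result stays far beyond any practical time span, which is why attacks on protected PDFs target weak passwords rather than the key itself.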
Permanent removal of sensitive content
The act of removing sensitive information from a document in a way that cannot be reversed. True redaction removes the underlying text or image bytes rather than just drawing a black box on top. The Redact PDF tool performs true redaction: once applied, the original content is unrecoverable even with copy-paste or OCR.
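The distinction between covering and removing content can be sketched with a toy page model. Everything here (`TextRun`, `Box`, `cover_with_box`, `redact`) is a hypothetical illustration and not the Redact PDF tool's implementation; a real PDF stores text in content streams rather than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRun:
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Box:
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, run: TextRun) -> bool:
        return self.x0 <= run.x <= self.x1 and self.y0 <= run.y <= self.y1


def extract_text(runs: list[TextRun]) -> str:
    """What copy-paste or a text extractor would see."""
    return " ".join(run.text for run in runs)


def cover_with_box(runs: list[TextRun], box: Box) -> list[TextRun]:
    """Drawing a rectangle hides the text visually but leaves every run in place."""
    return list(runs)


def redact(runs: list[TextRun], box: Box) -> list[TextRun]:
    """True redaction drops the runs that fall inside the box."""
    return [run for run in runs if not box.contains(run)]


page = [TextRun("SSN:", 72, 700), TextRun("123-45-6789", 110, 700)]
secret_area = Box(100, 690, 200, 710)

covered = cover_with_box(page, secret_area)
redacted = redact(page, secret_area)

print(extract_text(covered))    # the number is still extractable
print(extract_text(redacted))   # the number is gone
```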
XML Forms Architecture
An XML-based form technology used in PDF. XFA forms are dynamic: they can change layout based on user input, perform calculations, and connect to external data sources. XFA is used heavily in government, tax, and insurance forms. Support outside Adobe's own software is limited, so many PDF viewers cannot render XFA forms, and XFA was deprecated in PDF 2.0 in favor of AcroForm.
Standard PDF form technology
The original PDF form technology. AcroForms are static: form fields have fixed positions and behavior. They're simpler than XFA, are supported by nearly every PDF viewer, and are the form technology that PDF 2.0 retains. The PDF Forms tool in PdfPix works with AcroForm fields.
PDF with form fields converted to text
A PDF where interactive form fields (text inputs, checkboxes, dropdowns) have been converted to static content (text, lines, boxes) that can no longer be edited. Flattening is done to 'lock in' form data for distribution. The Edit PDF tool can flatten forms after filling them.
Cryptographic signature for authenticity
A mathematical scheme for verifying the authenticity of digital documents. Unlike simple electronic signatures (which are often just images of handwritten signatures), digital signatures use public-key cryptography to prove who signed the document and that it hasn't been altered since signing. Digital signatures are required for high-value legal and financial documents in many jurisdictions.
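The sign-then-verify idea can be shown with textbook RSA over a SHA-256 digest. This is a deliberately simplified sketch: the primes are far too small for real security, there is no padding scheme, and real PDF signatures wrap the result in certificate-based containers. The names `sign` and `verify` are just for this example.

```python
import hashlib

# Two known Mersenne primes; far too small for real use, fine for a demo.
P = 2 ** 61 - 1
Q = 2 ** 89 - 1
N = P * Q                                # public modulus
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)


def digest(document: bytes) -> int:
    """Hash the document and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Anyone with the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest(document)


contract = b"Pay 100 EUR to Alice"
signature = sign(contract)

print(verify(contract, signature))                   # unchanged document
print(verify(b"Pay 900 EUR to Alice", signature))    # altered document
```

Changing even one byte of the document changes its digest, so the stored signature no longer matches.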
Electronic signature (image-based)
An electronic mark indicating intent to sign. The most common form is drawing or typing a signature and embedding it as an image in the PDF. Legally valid in most jurisdictions for routine agreements (NDAs, contracts under a certain value, internal forms). Not the same as a digital signature with cryptographic verification.
Two ways to represent images in PDF
Vector graphics use mathematical descriptions (lines, curves, paths), so they scale without quality loss. Raster graphics are pixel grids, so they look pixelated when zoomed in. Text set in fonts is vector, while photos and scans (including any text on a scanned page) are raster. The PDF format can mix both: a typical text-heavy PDF is mostly vector content, with raster data only in its images.
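A small sketch makes the scaling difference concrete. The function names and the tiny 2x2 bitmap are made up for this example; nearest-neighbor upscaling is used here as the simplest raster scaling method, not as what any particular viewer does.

```python
def scale_segment(segment, factor):
    """Vector: scaling multiplies the coordinates, so the line stays exact."""
    (x0, y0), (x1, y1) = segment
    return ((x0 * factor, y0 * factor), (x1 * factor, y1 * factor))


def upscale_nearest(grid, factor):
    """Raster: each pixel is repeated factor x factor times, producing blocks."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]


diagonal = ((0, 0), (2, 2))     # a vector line from corner to corner
bitmap = [[1, 0],
          [0, 1]]               # the same diagonal as a 2x2 pixel grid

print(scale_segment(diagonal, 2))
for row in upscale_nearest(bitmap, 2):
    print(row)
```

The scaled vector line is still a perfect diagonal, while the upscaled bitmap turns into two solid 2x2 blocks, which is the staircase effect seen when zooming into a raster image.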
Clickable link in PDF
PDFs support clickable hyperlinks, both internal (links to other pages in the same document — like a table of contents entry) and external (links to web URLs, email addresses, or other files). Hyperlinks are preserved during most PDF operations. The Edit PDF tool can add or modify hyperlinks.
Data about the PDF document
Information stored in the PDF that describes the document: title, author, subject, keywords, creation date, modification date, software used, and more. Metadata helps with document management, search, and organization. The Compress PDF tool can strip metadata to reduce file size; the Edit PDF tool can add or modify metadata.
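Classic PDF metadata lives in a document information dictionary with keys such as `Title`, `Author`, `CreationDate`, and `ModDate`, and dates use a `D:YYYYMMDDHHmmSS` string format. The sketch below renders such a dictionary as text; the helper names are hypothetical, the values are assumed to be plain ASCII, and a real tool would write this through a PDF library rather than by hand.

```python
from datetime import datetime, timezone


def pdf_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as a PDF date string in UTC."""
    return moment.astimezone(timezone.utc).strftime("D:%Y%m%d%H%M%SZ")


def escape_literal(value: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def info_dictionary(fields: dict[str, str]) -> str:
    """Render key/value pairs in PDF dictionary syntax (ASCII values assumed)."""
    entries = " ".join(
        f"/{key} ({escape_literal(value)})" for key, value in fields.items()
    )
    return f"<< {entries} >>"


created = datetime(2024, 1, 15, 10, 30, 0, tzinfo=timezone.utc)
info = info_dictionary({
    "Title": "Quarterly Report (draft)",
    "Author": "Example Author",
    "CreationDate": pdf_date(created),
})
print(info)
```

Stripping metadata amounts to removing or emptying these entries, which is why it saves only a small amount of space compared with image compression.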
Two categories of PDF software
A PDF reader (Adobe Reader, Preview, browser built-in) can view, print, and fill forms. A PDF editor (Adobe Acrobat Pro, Foxit, PdfPix) can also modify content — add text, change images, rearrange pages, redact, etc. PdfPix is an editor — the tools all modify PDFs in addition to viewing.
Digital Rights Management
Technology that controls how a PDF can be used after creation, for example by limiting printing, copying, or editing, or by setting an expiration date. Used by publishers, e-book sellers, and some enterprises. PdfPix does not add DRM, but the Protect PDF tool can add password-based access control, which serves a similar purpose for most use cases.
Compiled code that runs in browsers
A binary instruction format that runs in web browsers at near-native speed. PdfPix combines WebAssembly with JavaScript libraries (pdf-lib, pdfjs-dist, and Tesseract.js, whose OCR engine is compiled to WebAssembly) to process PDFs directly in your browser, without uploading files to a server. This is what enables browser-based PDF processing at the performance and feature level needed for production use.
Mozilla's PDF rendering library
A general-purpose PDF library built in JavaScript and WebAssembly by Mozilla. It renders PDFs in the browser, extracts text, and provides a foundation for PDF tools. PdfPix uses PDF.js (via pdfjs-dist) for page rendering, text extraction, and thumbnail generation.
JavaScript PDF creation and editing library
A JavaScript library for creating and modifying PDF documents in the browser. Used by PdfPix to apply overlays, add watermarks, encrypt, sign, and otherwise modify PDFs. pdf-lib is written in TypeScript and runs entirely in the browser as plain JavaScript, so no upload is required.
Web Content Accessibility Guidelines
International standards (published by W3C) for making web content accessible to people with disabilities. For PDFs, the relevant guidelines are about alternative text, sufficient contrast, keyboard navigability, and screen reader compatibility. WCAG 2.1 has three levels: A, AA, and AAA. PdfPix follows AA as a minimum.
The PDF standard
The international standard that defines the PDF format. ISO 32000-1 (PDF 1.7) was first published in 2008; the current version is ISO 32000-2 (PDF 2.0), first published in 2017 and revised in 2020. ISO 32000 is the definitive reference for PDF behavior. Sub-standards (PDF/A, PDF/UA, PDF/E, PDF/VT, PDF/X) extend the base standard for specific use cases (archival, accessibility, engineering, variable data printing, graphic arts).
This glossary covers the most common PDF terminology. For deeper technical specifications, see the ISO 32000 standard or contact us.