Utilavo

How to Convert Word Documents to PDF

Updated 9 min read

By Utilavo Editorial · Reviewed

The decision to convert Word to PDF is really a decision to *freeze* a document. A `.docx` file is a live OOXML package whose rendering depends on installed fonts, page-break heuristics, and the version of the rendering engine; Word 2019 and Word 365 do not always paginate the same source identically. PDF, standardized as ISO 32000 (PDF 2.0 is ISO 32000-2), stores the resolved layout and embeds the font subsets needed to render it, so every viewer sees the same pages. The cost is that you give up editability.

If long-term preservation is the goal, the right target is not a generic PDF but PDF/A under ISO 19005 — see our PDF/A archival matrix for which conformance level (1a, 1b, 2u, 3a) matches your retention requirements. This guide covers the standard `.docx` to PDF conversion using LibreOffice's headless renderer, the formatting that does and does not survive, and when to choose archival mode.

Why the conversion engine matters

Utilavo's Word to PDF tool runs LibreOffice headless via the `soffice --convert-to pdf` command, which loads the OOXML package, resolves layout, and emits a PDF through LibreOffice's `writer_pdf_Export` filter. LibreOffice's OOXML implementation tracks the ECMA-376 / ISO/IEC 29500 specification closely, but its layout engine is not Word's: pagination of edge-case content (large tables that span columns, complex SmartArt, threaded text frames) can differ by a line.
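
To make the headless call concrete, here is a minimal Python sketch that builds and runs the `soffice` command line. The function names `build_convert_command` and `convert`, the timeout, and the assumption that `soffice` is on `PATH` are illustrative only; the exact flags Utilavo's service passes are not documented here. `--headless`, `--convert-to`, and `--outdir` are standard LibreOffice options.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: Path, outdir: Path, soffice: str = "soffice") -> list[str]:
    """Return the argument list for a headless Word-to-PDF conversion."""
    return [
        soffice,
        "--headless",            # run without a GUI
        "--convert-to", "pdf",   # LibreOffice picks the PDF export filter for the document type
        "--outdir", str(outdir),
        str(source),
    ]


def convert(source: Path, outdir: Path) -> Path:
    """Run the conversion and return the path LibreOffice is expected to write.

    Assumes soffice is installed and on PATH (hypothetical deployment detail).
    """
    subprocess.run(build_convert_command(source, outdir), check=True, timeout=120)
    return outdir / (source.stem + ".pdf")
```

Keeping the command builder separate from the function that runs it makes the argument list easy to inspect or log without LibreOffice installed.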

For most documents this difference is invisible. For documents whose page layout is contractually significant — court filings, tax forms, anything paginated for legal reasons — convert from Word itself when possible, or build the document defensively (avoid threaded text frames, prefer tables to columns, keep font usage to the metrics-compatible set) so it renders identically across engines.

Step-by-step conversion

Open the Word to PDF tool and upload your `.docx` (or `.doc`, `.odt`, `.rtf`; LibreOffice handles all of them). The converter runs `soffice` in a temporary profile directory and emits a PDF through LibreOffice's standard `writer_pdf_Export` filter. Headers, footers, page numbers, footnotes, tables, and images transfer with high fidelity. Hyperlinks survive as PDF link annotations, and the document outline is built from Word's heading styles.
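
The temporary profile directory mentioned above can be sketched as follows. This is an assumption about how such isolation is typically done, not Utilavo's actual code: LibreOffice accepts a `-env:UserInstallation=<file URL>` bootstrap argument that points it at a private profile, and the names `ACCEPTED`, `isolated_command`, and `convert_isolated` are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path

# Input extensions the article lists as accepted.
ACCEPTED = {".docx", ".doc", ".odt", ".rtf"}


def isolated_command(source: Path, outdir: Path, profile_dir: Path) -> list[str]:
    """Build a soffice command that uses a private profile directory."""
    if source.suffix.lower() not in ACCEPTED:
        raise ValueError(f"unsupported input type: {source.suffix!r}")
    return [
        "soffice",
        # A private profile keeps concurrent conversions from sharing state.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(source),
    ]


def convert_isolated(source: Path, outdir: Path) -> Path:
    """Convert one file, discarding the profile directory afterwards."""
    with tempfile.TemporaryDirectory(prefix="lo-profile-") as tmp:
        subprocess.run(isolated_command(source, outdir, Path(tmp)), check=True, timeout=120)
    return outdir / (source.stem + ".pdf")
```

A fresh profile per job costs some start-up time, but it means a crash or stale lock in one conversion cannot affect the next.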

Download the result, open it, and scan the first and last few pages plus any page containing a table or floating element. These are the highest-probability locations for pagination drift. For batch needs, convert files one at a time and merge the results. See our processing model for transport and retention details.
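
The spot-check described above can be expressed as a small helper. This is only a sketch: `pages_to_review` is a hypothetical name, "few" is assumed to mean three pages, and you would supply the page numbers that contain tables or floating elements yourself.

```python
from typing import Iterable


def pages_to_review(total_pages: int, flagged: Iterable[int] = (), edge: int = 3) -> list[int]:
    """Return 1-based page numbers worth checking after conversion.

    Covers the first and last `edge` pages plus any flagged page
    (for example, pages holding tables or floating elements).
    """
    if total_pages < 1:
        return []
    first = range(1, min(edge, total_pages) + 1)
    last = range(max(total_pages - edge + 1, 1), total_pages + 1)
    in_range = {p for p in flagged if 1 <= p <= total_pages}
    return sorted(set(first) | set(last) | in_range)


# Example: a 40-page report with tables on pages 17 and 23.
print(pages_to_review(40, flagged={17, 23}))  # [1, 2, 3, 17, 23, 38, 39, 40]
```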

What survives and what doesn't

Surviving cleanly: paragraph and character formatting, paragraph-level numbered/bulleted lists, tables (including merged cells and shading), headers and footers, page numbers, footnotes and endnotes, embedded images, hyperlinks, and the heading-style outline. Standard fonts (Calibri, Cambria, Arial, Times New Roman) render via metric-compatible substitutes (Carlito, Caladea, Liberation Sans/Serif) preconfigured in the server's fontconfig. Because the substitutes share the originals' advance widths, line breaks and pagination match Word's output on typical pages, even though individual glyph shapes differ slightly.
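
To see in advance which fonts fall outside the metric-compatible set, you can read the font table inside the `.docx` package. The sketch below assumes the usual OOXML layout (a `word/fontTable.xml` part with `w:font` entries); the mapping mirrors the substitutions named above, a real server's fontconfig may differ, and the function names are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Substitutions named in this article; treat as an assumption about the server.
METRIC_SUBSTITUTES = {
    "Calibri": "Carlito",
    "Cambria": "Caladea",
    "Arial": "Liberation Sans",
    "Times New Roman": "Liberation Serif",
}


def declared_fonts(docx) -> list[str]:
    """Font names declared in word/fontTable.xml (docx may be a path or file object)."""
    with zipfile.ZipFile(docx) as zf:
        if "word/fontTable.xml" not in zf.namelist():
            return []
        root = ET.fromstring(zf.read("word/fontTable.xml"))
    names = (font.get(f"{{{W_NS}}}name") for font in root.iter(f"{{{W_NS}}}font"))
    return [name for name in names if name]


def fonts_without_substitute(docx) -> list[str]:
    """Declared fonts that have no metric-compatible substitute in the table above."""
    return [name for name in declared_fonts(docx) if name not in METRIC_SUBSTITUTES]
```

The font table lists every font the document declares, including some that may not be used in visible text, so treat the result as a list of candidates to check rather than a definitive report.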

Lossy or unsupported: macros (VBA does not run in PDF and is silently dropped), tracked changes (only the accepted state renders, the revision history is gone), embedded videos and audio, ActiveX controls, and Word-only field codes that update at open time (`{ DATE }`, `{ TIME }`, `{ AUTHOR }` resolve to their current value at convert time and freeze). Custom fonts not installed on the server fall back to a substitute, which can shift line breaks. If exact typography matters, embed the fonts in the source `.docx` (File > Options > Save > Embed fonts in the file).
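
Several of the lossy features listed above leave recognizable traces in the OOXML package, so a pre-flight check is possible before uploading. The sketch below is an illustration under stated assumptions: it looks only for a `word/vbaProject.bin` part (macros), `w:ins`/`w:del` elements (tracked insertions and deletions, not formatting revisions), and parts under `word/fonts/` (embedded fonts). The name `preflight` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def preflight(docx) -> dict[str, bool]:
    """Report features of a .docx/.docm that affect conversion fidelity."""
    with zipfile.ZipFile(docx) as zf:
        names = set(zf.namelist())
        body = ET.fromstring(zf.read("word/document.xml"))
    # True if at least one tracked insertion or deletion element exists.
    has_revisions = any(
        next(body.iter(f"{{{W_NS}}}{tag}"), None) is not None
        for tag in ("ins", "del")
    )
    return {
        "has_macros": "word/vbaProject.bin" in names,
        "has_tracked_changes": has_revisions,
        "has_embedded_fonts": any(n.startswith("word/fonts/") for n in names),
    }
```

A `True` for tracked changes is a prompt to accept or reject revisions in Word first, so the frozen PDF reflects the state you intend.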

When to use PDF/A instead

If the document needs to remain readable decades from now (regulatory archives, court records, scholarly deposits, compliance retention), convert to PDF/A rather than plain PDF. PDF/A is a constrained profile of PDF defined by ISO 19005 that prohibits external dependencies: every font must be embedded, encryption and JavaScript are forbidden, and PDF/A-1 also forbids transparency (PDF/A-2 and PDF/A-3 permit it). It additionally requires embedded color profiles. Conformance levels 1a/2a/3a additionally require tagged structure for accessibility; 1b/2b/3b only require visual reproduction.

Convert with PDF to PDF/A after the standard Word-to-PDF step, and consult the PDF/A archival matrix for which level matches your retention requirement. Tagged conformance (1a/2a/3a) requires the source `.docx` to use real heading styles and table structure — flat text formatted to look like headings will fail validation.
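
The conformance labels used above can be modeled with a small data structure. This sketch covers only the parts this guide names (ISO 19005 parts 1 to 3); `PdfALevel` and `parse_level` are hypothetical names, and the only rule encoded beyond label syntax is the one stated above, that the "a" levels require tagged structure. Level "u", which requires Unicode-mappable text, exists only in parts 2 and 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfALevel:
    part: int         # ISO 19005 part: 1, 2, or 3
    conformance: str  # "a" (tagged), "b" (visual), or "u" (visual + Unicode text)

    @property
    def requires_tagging(self) -> bool:
        """Only the "a" levels require tagged structure in the source document."""
        return self.conformance == "a"


def parse_level(label: str) -> PdfALevel:
    """Parse a label such as "1b" or "2U" into a PdfALevel."""
    label = label.strip().lower()
    if len(label) != 2 or label[0] not in "123" or label[1] not in "abu":
        raise ValueError(f"not a PDF/A-1/2/3 conformance label: {label!r}")
    part, conformance = int(label[0]), label[1]
    if part == 1 and conformance == "u":
        raise ValueError("PDF/A-1 defines only levels a and b")
    return PdfALevel(part, conformance)
```

For example, `parse_level("2a").requires_tagging` is `True`, which signals that the source `.docx` needs real heading styles and table structure before conversion.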

After conversion: next steps

If the PDF is too large for email, compress it — see the Compress PDF guide for which preset to choose. Add password protection for confidential documents before distribution; the Protect PDF guide explains the user/owner password distinction in detail. To go back to Word later, PDF to Word extracts editable content, with the caveats covered in the PDF-to-Word guide.

Key takeaways

  • Word-to-PDF freezes the layout — pagination, fonts, and visual fidelity become independent of the recipient's environment.
  • LibreOffice's OOXML rendering tracks ISO/IEC 29500 closely but can drift by ~1 line on edge-case pages; convert from Word itself if pagination is contractually significant.
  • Embed fonts in the source `.docx` if exact typography matters — substitute fonts can shift line breaks even when metrics match closely.
  • Use PDF/A for long-term archival, and consult our PDF/A archival matrix for the right conformance level.
  • Compress and password-protect after conversion but before any digital signature is applied; both operations rewrite the file and invalidate signatures added earlier in the chain.

Frequently asked questions

Why does my document have an extra blank page at the end after conversion?

This is almost always a trailing empty paragraph or a section break in the source `.docx`. Word hides empty paragraphs at the end of a document; LibreOffice and many PDF renderers do not. Open the source in Word, turn on Show/Hide Formatting Marks (Ctrl+Shift+8), and delete any trailing paragraph marks or section breaks before reconverting.
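
If you prefer to check the file programmatically, the sketch below counts empty paragraphs at the very end of the document body. It assumes the usual OOXML layout (`word/document.xml` with a `w:body` whose last child is a `w:sectPr`) and treats a paragraph with no runs as empty, which is a simplification; `trailing_empty_paragraphs` is a hypothetical name.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def trailing_empty_paragraphs(docx) -> int:
    """Count run-less paragraphs at the end of the body of a .docx."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    body = root.find(f"{W}body")
    if body is None:
        return 0
    count = 0
    for child in reversed(list(body)):
        if child.tag == f"{W}sectPr":
            continue  # final section properties, not a paragraph
        if child.tag == f"{W}p" and child.find(f".//{W}r") is None:
            count += 1  # paragraph with no runs: nothing visible in it
            continue
        break  # first element with content ends the trailing run
    return count
```

A single trailing empty paragraph is common and often harmless; a larger count, or one combined with a section break, is the usual source of the extra page.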

Why do my Calibri-set documents come out looking slightly different?

Calibri is a Microsoft proprietary font and not redistributable on a Linux conversion server. The conversion uses Carlito, an open-source font with metric-compatible glyph widths, so line breaks and pagination follow Calibri's. Glyph shapes are similar but not identical. For documents where Calibri's exact appearance matters, embed the font in the source `.docx` via File > Options > Save > Embed fonts in the file.

Why don't my SmartArt diagrams render correctly?

SmartArt is a Microsoft Office layout engine (shared by Word, PowerPoint, and Excel) that re-flows shapes at render time based on the host application's logic. LibreOffice converts SmartArt to a static drawing approximation; complex SmartArt with conditional layouts may be simplified or lose styling. If precise SmartArt rendering is critical, export the SmartArt as an image in Word (right-click > Save as Picture) and re-insert it before converting.

Will tracked changes appear in the PDF?

No, not from this tool. The LibreOffice headless filter renders only the document's accepted state, so pending insertions and deletions do not appear as markup. If you want the PDF to *show* tracked changes, set the display to 'All Markup' on Word's Review tab and export to PDF from Word directly.

Why is my converted PDF rejected by my organization's archival system?

Most archival systems require PDF/A conformance, not generic PDF. Standard PDF export output is not guaranteed to embed every font, may include transparency, and lacks the embedded ICC color profile that PDF/A requires; any of these can fail PDF/A validation. Run the result through PDF to PDF/A and check the PDF/A archival matrix to confirm you are emitting the conformance level your archive accepts.
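
A quick way to see whether a file even claims PDF/A conformance is to look for the `pdfaid:part` and `pdfaid:conformance` properties in its XMP metadata. The sketch below is a heuristic byte search, not a validator: it assumes the metadata is stored uncompressed and in an 8-bit encoding, and `claimed_pdfa_level` is a hypothetical name. A claim only tells you what the file says about itself; a dedicated validator such as veraPDF is needed to confirm conformance.

```python
import re
from typing import Optional

# Match both element form (<pdfaid:part>2</pdfaid:part>) and attribute form (pdfaid:part="2").
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def claimed_pdfa_level(pdf_bytes: bytes) -> Optional[str]:
    """Return the PDF/A level the file claims (e.g. "2b"), or None if it claims none."""
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conf = _CONF.search(pdf_bytes)
    if conf is not None:
        level += conf.group(1).decode("ascii").lower()
    return level
```

If this returns `None` for a file your archive rejected, the file was most likely exported as plain PDF and never passed through a PDF/A conversion step.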

Can I convert a password-protected `.docx`?

No. The OOXML package is encrypted and the conversion server does not have the password. Open the file in Word, save a copy without protection, and convert that copy. Re-apply protection via Protect PDF on the output.
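
You can usually tell that a `.docx` is password-protected without opening it: an unprotected `.docx` is a ZIP archive, while an encrypted one is wrapped in an OLE compound file, which starts with a fixed 8-byte signature. The sketch below relies on that difference; the function names are hypothetical, and note that legacy `.doc` files use the same compound-file container, so the check is only meaningful for files named `.docx`.

```python
import io
import zipfile

# Signature at the start of every OLE compound file (CFB) container.
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def container_kind(data: bytes) -> str:
    """Classify file contents as "cfb", "zip", or "unknown"."""
    if data[:8] == CFB_MAGIC:
        return "cfb"
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    return "unknown"


def looks_encrypted_docx(filename: str, data: bytes) -> bool:
    """True when a file named .docx is a compound file rather than a ZIP package."""
    return filename.lower().endswith(".docx") and container_kind(data) == "cfb"
```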