Base64 Encoding Guide for Developers

Base64 is the canonical answer to a recurring problem: a transport channel that accepts only printable text needs to carry arbitrary binary data. The specification that everyone actually implements is RFC 4648 ("The Base16, Base32, and Base64 Data Encodings"), which defines both the standard alphabet and the URL-safe variant. The trade-off is fixed: 33% size expansion in exchange for 7-bit ASCII safety. The decision is which variant to use and where the encoding actually buys you something.

JavaScript's `btoa()` and `atob()` implement standard Base64 from RFC 4648 §4 — but with two notorious caveats: they operate on byte strings, not Unicode strings, and they do not support the URL-safe variant. Misunderstanding either is the source of most production Base64 bugs. This guide covers what RFC 4648 actually requires, where browser APIs diverge from it, and the edge cases that bite real applications.

How Base64 works (per RFC 4648)

The encoder takes the input as a byte stream, groups every 3 bytes (24 bits) into 4 sextets (6 bits each), and maps each sextet to a printable character. The standard alphabet is `A-Z` (0-25), `a-z` (26-51), `0-9` (52-61), `+` (62), `/` (63), with `=` as the padding character used when the input length is not a multiple of 3 — see RFC 4648 §4 for the canonical alphabet table.
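The grouping described above can be sketched by hand. This is an illustrative encoder only, assuming the input is a `Uint8Array`; `ALPHABET` and `encodeBase64` are hypothetical names, and production code should rely on a built-in encoder instead.

```javascript
// Standard alphabet from RFC 4648 §4: index 0-63 maps to one character.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper for illustration; not a replacement for built-in encoders.
function encodeBase64(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // 1, 2, or at least 3 bytes left
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0;
    const b2 = remaining > 2 ? bytes[i + 2] : 0;
    // Pack up to 3 bytes into one 24-bit group, then split it into 4 sextets.
    const group = (b0 << 16) | (b1 << 8) | b2;
    out += ALPHABET[(group >> 18) & 63];
    out += ALPHABET[(group >> 12) & 63];
    out += remaining > 1 ? ALPHABET[(group >> 6) & 63] : "=";
    out += remaining > 2 ? ALPHABET[group & 63] : "=";
  }
  return out;
}

console.log(encodeBase64(new TextEncoder().encode("foobar"))); // Zm9vYmFy
console.log(encodeBase64(new TextEncoder().encode("f")));      // Zg==
```

The two sample inputs are test vectors from RFC 4648 §10.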

The 4:3 ratio is exact per group: every 3 bytes become 4 characters, so encoded length = `4 × ceil(n / 3)` where n is the input byte count. The overhead is exactly one third when n is a multiple of 3 and slightly higher otherwise, because the final group is padded. Padding uses 0, 1, or 2 `=` characters depending on whether `n mod 3` is 0, 2, or 1 respectively. This is why padded Base64 strings always end on a 4-character boundary.
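The length and padding rules are easy to check numerically. A small sketch, where `encodedLength`, `paddingCount`, and `overheadPercent` are hypothetical helper names:

```javascript
// n is the input length in bytes.
function encodedLength(n) {
  return 4 * Math.ceil(n / 3);
}

function paddingCount(n) {
  // n mod 3 = 0 -> 0 pads, n mod 3 = 2 -> 1 pad, n mod 3 = 1 -> 2 pads
  return (3 - (n % 3)) % 3;
}

function overheadPercent(n) {
  return ((encodedLength(n) - n) / n) * 100;
}

for (const n of [1, 2, 3, 100, 3000]) {
  console.log(n, encodedLength(n), paddingCount(n), overheadPercent(n).toFixed(1) + "%");
}
// 1 4 2 300.0%
// 2 4 1 100.0%
// 3 4 0 33.3%
// 100 136 2 36.0%
// 3000 4000 0 33.3%
```

For tiny inputs the padding dominates; for anything beyond a few hundred bytes the overhead is effectively one third.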

Real use cases (and the size threshold that decides them)

Data URIs embed bytes directly in HTML or CSS: `src="data:image/png;base64,..."` eliminates an HTTP round trip at the cost of a larger document. The break-even depends on the connection: for HTTP/2 multiplexed connections, separate requests are usually faster above 1-2 KB; for high-latency mobile connections, the threshold can be 5-10 KB. Above those sizes, the loss of independent caching dominates the round-trip savings.
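A size-gated inlining decision might look like the sketch below. The 2048-byte limit is an assumption picked from the ranges mentioned above, not a measured value, and `toDataUri` and `imageSrc` are hypothetical helpers.

```javascript
// Assumed threshold; tune it against real page-load measurements.
const INLINE_LIMIT_BYTES = 2048;

// Build a data URI from raw bytes (a Uint8Array) and a MIME type.
function toDataUri(bytes, mimeType) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Inline small assets; fall back to a normal, separately cacheable URL otherwise.
function imageSrc(bytes, mimeType, fallbackUrl, limit = INLINE_LIMIT_BYTES) {
  return bytes.length <= limit ? toDataUri(bytes, mimeType) : fallbackUrl;
}

const pngMagic = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
console.log(toDataUri(pngMagic, "image/png")); // data:image/png;base64,iVBORw==
```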

JSON API payloads embed binary fields as Base64 strings because JSON cannot represent raw binary safely. JWT tokens specifically use Base64url (RFC 4648 §5) for header and payload sections — see RFC 7519. MIME email attachments use Base64 per RFC 2045, with the additional requirement of line-breaking every 76 characters. SMTP's 7-bit constraint is the historical reason every email attachment you have ever sent was Base64-encoded.
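For the JSON case, a round trip could be sketched as follows. The field names `filename` and `data` are made up for illustration, as are the helpers `bytesToBase64` and `base64ToBytes`.

```javascript
// Encode a Uint8Array as standard Base64.
function bytesToBase64(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Decode standard Base64 back into a Uint8Array.
function base64ToBytes(b64) {
  return Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
}

const original = new Uint8Array([0, 255, 16, 128]);

// Sender: the binary field travels as a plain JSON string.
const wire = JSON.stringify({ filename: "blob.bin", data: bytesToBase64(original) });
console.log(wire); // {"filename":"blob.bin","data":"AP8QgA=="}

// Receiver: parse the JSON, then decode the field back to bytes.
const received = base64ToBytes(JSON.parse(wire).data);
console.log(Array.from(received)); // [ 0, 255, 16, 128 ]
```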

Standard vs URL-safe: the two-character difference that breaks systems

Standard Base64 uses `+` and `/`. Both are reserved in URL paths and query strings, so a token transmitted as `?t=abc+/de==` is interpreted by the receiving server as `?t=abc /de==` (the `+` becomes a literal space in `application/x-www-form-urlencoded` parsing). Base64url, defined in RFC 4648 §5, substitutes `-` for `+` and `_` for `/`, and conventionally omits padding because the length can be inferred.
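The `+` problem is easy to reproduce with `URLSearchParams`, which applies form-style decoding. A short sketch using the example token from the paragraph above; `toBase64Url` is a hypothetical helper:

```javascript
const standardToken = "abc+/de=="; // example value from the text

// Form-style query parsing turns '+' into a space.
const mangled = new URLSearchParams("t=" + standardToken).get("t");
console.log(JSON.stringify(mangled)); // "abc /de=="

// Convert standard Base64 text to the URL-safe alphabet and drop the padding.
function toBase64Url(standard) {
  return standard.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

const urlSafeToken = toBase64Url(standardToken);
const roundTripped = new URLSearchParams("t=" + urlSafeToken).get("t");
console.log(urlSafeToken, roundTripped === urlSafeToken); // abc-_de true
```

Percent-encoding the standard token with `encodeURIComponent` also survives the trip, but it makes the value longer and is easy to forget at one call site.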

Use Base64 Encoder to encode and decode both variants. The bug to watch for: decoding a Base64url string with a standard decoder can appear to succeed. `-` and `_` are not in the standard alphabet, but many decoders are lenient, skip characters they do not recognize, and return silently corrupted output instead of an error. Always match the variant on both ends.
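One defensive option is to inspect the alphabet before decoding, so a mismatch fails loudly rather than silently. A rough sketch; `detectVariant` is a hypothetical helper and only looks at which characters are present:

```javascript
// Classify a Base64 string by the characters it uses.
function detectVariant(s) {
  // Anything outside both alphabets (plus up to two trailing '=') is invalid.
  if (!/^[A-Za-z0-9+\/_-]*={0,2}$/.test(s)) return "invalid";
  const hasStandardChars = /[+\/]/.test(s);
  const hasUrlSafeChars = /[-_]/.test(s);
  if (hasStandardChars && hasUrlSafeChars) return "mixed";
  if (hasUrlSafeChars) return "base64url";
  if (hasStandardChars) return "standard";
  return "either"; // no distinguishing characters present
}

console.log(detectVariant("abc+/de==")); // standard
console.log(detectVariant("abc-_de"));   // base64url
console.log(detectVariant("ab+_"));      // mixed
console.log(detectVariant("TWFu"));      // either
```

A result of `either` means the string decodes identically under both alphabets, so the check cannot replace agreeing on the variant up front.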

JavaScript-specific gotchas

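`btoa()` and `atob()` are documented at MDN's Window.btoa reference, and they operate on byte strings, not Unicode strings: every character must have a code point of 0xFF or below and is treated as one byte. Calling `btoa("héllo")` does not throw, because `é` is U+00E9, but it encodes the single Latin-1 byte 0xE9 and returns `aOlsbG8=` rather than the UTF-8 encoding `aMOpbGxv` that most receivers expect. Any character above U+00FF, such as `€` or an emoji, throws an `InvalidCharacterError` because `btoa` does not know how to serialize it to bytes. The correct pattern for Unicode-safe encoding is `btoa(new TextEncoder().encode(s).reduce((a, b) => a + String.fromCharCode(b), ""))`; in Node, `Buffer.from(s).toString("base64")` does the same job, since `Buffer.from` encodes strings as UTF-8 by default.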
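A Unicode-safe round trip built on `TextEncoder` and `TextDecoder` might look like this. The helper names `utf8ToBase64` and `base64ToUtf8` are hypothetical; the sketch assumes the receiver expects UTF-8.

```javascript
// Encode a JavaScript string as Base64 of its UTF-8 bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Decode Base64 and interpret the resulting bytes as UTF-8.
function base64ToUtf8(b64) {
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(utf8ToBase64("héllo"));                    // aMOpbGxv
console.log(base64ToUtf8(utf8ToBase64("héllo € 你好"))); // héllo € 你好

// Raw btoa rejects any character above U+00FF.
try {
  btoa("€");
} catch (err) {
  console.log(err.name); // InvalidCharacterError
}
```

Building the binary string in a loop avoids spreading a large array into `String.fromCharCode`, which can exceed the engine's argument limit on big inputs.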

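`atob` and `btoa` do not implement Base64url. Before calling `atob`, replace `-` with `+` and `_` with `/` and re-add padding; after calling `btoa`, make the reverse substitution and strip the padding. Alternatively, use a dedicated library. The Web Crypto API has no JWT helpers, so JWT handling needs either a JWT library, which handles the variant transparently, or the manual substitution described here.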
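The substitution can be wrapped in two small helpers. A minimal sketch on top of `btoa`/`atob`; `base64UrlEncode` and `base64UrlDecode` are hypothetical names, and inputs and outputs are `Uint8Array` values.

```javascript
// Encode bytes as unpadded Base64url (RFC 4648 §5).
function base64UrlEncode(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Decode Base64url, with or without padding, back into bytes.
function base64UrlDecode(text) {
  const standard = text.replace(/-/g, "+").replace(/_/g, "/").replace(/=+$/, "");
  // Re-add padding so the length is a multiple of 4 again.
  const padded = standard + "=".repeat((4 - (standard.length % 4)) % 4);
  return Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
}

const sample = new Uint8Array([0xfb, 0xff]); // standard Base64 would be "+/8="
console.log(base64UrlEncode(sample));                 // -_8
console.log(Array.from(base64UrlDecode("-_8")));      // [ 251, 255 ]
```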

When NOT to use Base64

Base64 is encoding, not encryption — the inverse of `btoa` is one function call away. Using Base64 to "obfuscate" sensitive values provides zero security. For real confidentiality use AES or another cipher; for tamper-resistance, sign with HMAC or use a JWT with a verified signature.

Avoid Base64 for large files in data URIs. The 33% overhead matters less than the loss of independent caching: a 100 KB image inlined as a data URI must be downloaded again every time the parent document changes. Above ~5 KB, separate files almost always win on real-world page-load metrics.

Avoid using Base64 as a database column type for binary blobs. Native `BYTEA` (Postgres) or `BLOB` (MySQL/SQLite) is faster, smaller, and avoids the encode/decode CPU cost on every read.

Key takeaways

RFC 4648 defines Base64 (standard) and Base64url (URL-safe) variants — they differ in two characters and the optional padding.
Encoded length is exactly `4 × ceil(n/3)`; the expansion is never less than 33% and approaches it as the input grows.
Use Base64url for URL parameters, filenames, and JWT tokens; use standard Base64 for MIME attachments and JSON binary fields.
Base64 is not encryption — it provides zero confidentiality and zero integrity. Use AES or signed tokens when security matters.
Browser `btoa`/`atob` operate on byte strings and do not support Base64url: use `TextEncoder` for Unicode and substitute the alphabet manually.

Frequently asked questions

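Why does `btoa()` throw an error, or return the wrong output, for Unicode text in the browser?

`btoa` is defined to operate on byte strings, not Unicode strings (see MDN's btoa reference). Characters above U+00FF cannot be serialized to a single byte and trigger `InvalidCharacterError`. Characters between U+0080 and U+00FF, such as the `é` in `héllo`, do not throw but are encoded as single Latin-1 bytes, which silently differs from the UTF-8 encoding. The fix in both cases is to UTF-8 encode the string first: `btoa(String.fromCharCode(...new TextEncoder().encode(s)))` for short strings (the spread can exceed the engine's argument limit on large inputs), or `Buffer.from(s, 'utf-8').toString('base64')` in Node.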

Why does my JWT verification fail when the token is correct?

Almost always Base64url confusion. JWT specifies Base64url with no padding (RFC 7515 §2). A strict standard Base64 decoder rejects the `-` and `_` characters outright; a lenient one appears to succeed but produces wrong bytes wherever the segment contains `-` or `_`, which corrupts the signature check. Use a JWT library that handles the variant, or substitute `-→+` and `_→/` and re-add padding before calling `atob`.
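For debugging, the header and payload of a token can be inspected with the manual substitution. This sketch only decodes; it does not verify the signature and must not be used to trust a token. `decodeJwtPart` and `peekJwt` are hypothetical helpers.

```javascript
// Decode one Base64url JWT segment into a parsed JSON object.
function decodeJwtPart(segment) {
  const standard = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = standard + "=".repeat((4 - (standard.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Split a compact JWT and decode the first two segments.
// The third segment (the signature) is deliberately left untouched.
function peekJwt(token) {
  const [header, payload] = token.split(".");
  return { header: decodeJwtPart(header), payload: decodeJwtPart(payload) };
}

// Header segment for {"alg":"HS256","typ":"JWT"}
console.log(decodeJwtPart("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"));
```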

Why does my Base64-encoded payload contain newlines?

MIME-encoded Base64 (RFC 2045 §6.8) wraps lines at 76 characters. Email and OpenSSL's default Base64 output include line breaks (OpenSSL wraps at 64 characters); JSON and URL contexts do not. If a Base64 decoder rejects line-wrapped input, strip whitespace before decoding: `input.replace(/\s/g, '')`. Lenient decoders ignore whitespace, but strict ones (and some streaming decoders) do not.
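Wrapping and unwrapping are both a few lines. A sketch with hypothetical helpers `wrapBase64` and `unwrapBase64`; the 76-character width and CRLF separator follow the MIME convention mentioned above.

```javascript
// Break a Base64 string into lines of at most `width` characters.
function wrapBase64(b64, width = 76) {
  const lines = [];
  for (let i = 0; i < b64.length; i += width) {
    lines.push(b64.slice(i, i + width));
  }
  return lines.join("\r\n");
}

// Remove all whitespace so that strict decoders accept the input.
function unwrapBase64(wrapped) {
  return wrapped.replace(/\s/g, "");
}

const encoded = btoa("x".repeat(100)); // 136 characters
const wrapped = wrapBase64(encoded);
console.log(wrapped.split("\r\n").map((line) => line.length)); // [ 76, 60 ]
console.log(unwrapBase64(wrapped) === encoded);                // true
```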

Why do my data-URI images make my page slower, not faster?

Data URIs cannot be cached separately from the parent document. A 50 KB icon embedded as a data URI is re-downloaded every time the HTML changes, even though the icon itself is unchanged. With HTTP/2 or HTTP/3, the round-trip cost of a separate request is small, while the cache loss from inlining is permanent. Above ~5 KB, separate files nearly always win on cache hit rates.

Should I Base64-encode binary data in my Postgres database?

No. Use the native `BYTEA` type. Base64 storage costs 33% more space, requires encode/decode on every read and write, and prevents the database from doing efficient binary comparisons. The only case where Base64 in a text column makes sense is when the storage layer is genuinely text-only (some legacy systems, certain key-value caches without binary support).
