Base64 Encoding Explained: How It Works and When to Use It

2026-04-24 · 6 min read

What is Base64 Encoding?

Base64 is a binary-to-text encoding scheme that converts binary data into printable ASCII characters. It uses 64 characters (A-Z, a-z, 0-9, +, /) to represent binary data, with = used for padding.

A common misunderstanding: Base64 is NOT encryption and NOT compression. Anyone can decode it without a key, and it never makes data smaller: the encoded output is always larger than the input, by about 33%.

How Base64 Encoding Works

The Conversion Process

1. Take every 3 bytes (24 bits) as a group

2. Split the 24 bits into four 6-bit groups

3. Map each 6-bit value (0-63) to the Base64 character table

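4. If the last group has only 1 or 2 bytes, fill the missing bits with zeros, then append = (two for a 1-byte group, one for a 2-byte group) so the output length stays a multiple of 4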

Example

Encoding the string Man to Base64:

M ASCII: 77  ->  01001101
a ASCII: 97  ->  01100001
n ASCII: 110 ->  01101110

Combined 24 bits:  01001101 01100001 01101110
Four 6-bit groups:  010011 010110 000101 101110
Base64 characters:  T  W  F  u

Result: "TWFu"
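
The four steps and the Man example can be sketched in a few lines of JavaScript. This is an illustration of the algorithm, not how you would encode in production (use the built-ins covered later); the names BASE64_ALPHABET and encodeBase64Manually are my own.

```javascript
// The 64-character alphabet: index 0-63 maps to one output character.
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// Encode an array of byte values (0-255) to a Base64 string by hand.
function encodeBase64Manually(bytes) {
  let output = ""
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = Math.min(3, bytes.length - i)
    // Pack up to 3 bytes into one 24-bit number; missing bytes count as zero.
    const group =
      (bytes[i] << 16) | ((bytes[i + 1] ?? 0) << 8) | (bytes[i + 2] ?? 0)
    // Slice the 24 bits into four 6-bit indexes.
    const indexes = [
      (group >> 18) & 63,
      (group >> 12) & 63,
      (group >> 6) & 63,
      group & 63,
    ]
    // n input bytes produce n + 1 real characters; the rest is "=" padding.
    for (let j = 0; j < 4; j++) {
      output += j <= remaining ? BASE64_ALPHABET[indexes[j]] : "="
    }
  }
  return output
}

console.log(encodeBase64Manually([77, 97, 110])) // "TWFu" (M, a, n)
console.log(encodeBase64Manually([77, 97]))      // "TWE="
console.log(encodeBase64Manually([77]))          // "TQ=="
```

The last two calls show the padding rule: a 2-byte tail gets one =, a 1-byte tail gets two.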

Why 33% Size Increase?

3 bytes (24 bits) become 4 Base64 characters, each taking 1 byte in transmission. The ratio is 4:3, a 1/3 increase. If you are sending large files, this overhead adds up fast.
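
To put numbers on that, here is a small sketch that computes the padded output length for a given input size. The function name base64EncodedLength is made up for this example.

```javascript
// Length of the padded Base64 output for a given number of input bytes.
function base64EncodedLength(byteCount) {
  // Every started group of 3 bytes yields exactly 4 characters.
  return 4 * Math.ceil(byteCount / 3)
}

console.log(base64EncodedLength(3))       // 4
console.log(base64EncodedLength(1))       // 4 (2 real characters + "==")
console.log(base64EncodedLength(1000000)) // 1333336
```

So a 1,000,000-byte file costs roughly an extra 333KB once encoded.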

When to Use Base64 (and When Not To)

Image Data URIs

Embedding small images directly in HTML eliminates HTTP requests:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." />

I use this for icons and logos under 10KB. For anything larger, the HTML file size bloat is not worth it — serve the image file normally and let the browser cache it.
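
If you generate data URIs yourself, the logic is roughly the following. Treat it as a sketch: toDataUri, shouldInline and INLINE_LIMIT_BYTES are hypothetical names, and the limit simply mirrors my 10KB habit rather than any standard.

```javascript
// Assumed threshold, taken from the 10KB rule of thumb in this article.
const INLINE_LIMIT_BYTES = 10 * 1024

// Decide whether an asset is small enough to inline.
function shouldInline(byteLength) {
  return byteLength < INLINE_LIMIT_BYTES
}

// Build a data URI from raw bytes and a MIME type.
function toDataUri(bytes, mimeType) {
  let binary = ""
  // btoa() expects a "binary string": one character per byte.
  for (const byte of bytes) binary += String.fromCharCode(byte)
  return `data:${mimeType};base64,${btoa(binary)}`
}

// The 8-byte PNG signature, used here as stand-in image data.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])
console.log(toDataUri(pngSignature, "image/png"))
// "data:image/png;base64,iVBORw0KGgo="
console.log(shouldInline(pngSignature.length)) // true
```

That iVBORw0KGgo prefix is why every Base64-encoded PNG starts with the same characters.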

CSS Background Images

.logo {
  background-image: url("data:image/svg+xml;base64,PHN2ZyB4bWxucz0i...");
}

JWT Tokens

JWTs (JSON Web Tokens) use Base64URL encoding, a variant that replaces + with - and / with _, and strips the = padding. Every time you decode a JWT on jwt.io, you are looking at three Base64URL-encoded sections.

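// Decoding a JWT payload in JavaScript (this reads the payload, it does not verify the signature)
const token = "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOjF9.ZeMqMQ"
// JWT segments are Base64URL: map "-" and "_" back to "+" and "/" before calling atob()
const base64 = token.split(".")[1].replace(/-/g, "+").replace(/_/g, "/")
const payload = JSON.parse(atob(base64))
console.log(payload) // { userId: 1 }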
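
The snippet above treats the decoded bytes as plain ASCII, which is fine for this token. A slightly fuller sketch, assuming you also want the header and claims that may contain non-ASCII text; decodeBase64UrlSegment and readJwt are names I made up, and nothing here checks the signature.

```javascript
// Decode one Base64URL segment to a UTF-8 string.
function decodeBase64UrlSegment(segment) {
  // Map the URL-safe alphabet back to the standard one that atob() expects.
  const base64 = segment.replace(/-/g, "+").replace(/_/g, "/")
  const binary = atob(base64)
  // atob() returns one character per byte; rebuild the bytes, then decode UTF-8.
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0))
  return new TextDecoder().decode(bytes)
}

// Read (not verify) the header and payload of a JWT.
function readJwt(token) {
  const [header, payload] = token.split(".")
  return {
    header: JSON.parse(decodeBase64UrlSegment(header)),
    payload: JSON.parse(decodeBase64UrlSegment(payload)),
  }
}

const parts = readJwt("eyJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOjF9.ZeMqMQ")
console.log(parts.header)  // { alg: 'HS256' }
console.log(parts.payload) // { userId: 1 }
```

Never trust the decoded claims on their own; signature verification belongs on the server with a proper JWT library.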

Email Attachments

Email protocols (SMTP) were designed for 7-bit ASCII text, so binary attachments are Base64-encoded before sending. This is why an attachment takes up about 37% more space inside the message than the original file: the 33% Base64 overhead, plus the line break MIME inserts after every 76 characters, plus a few MIME headers.
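
MIME (RFC 2045) caps Base64 lines at 76 characters and separates them with CRLF, which is where the extra few percent beyond 33% comes from. A rough sketch of that wrapping, with wrapForMime as a hypothetical helper name and a string of x characters standing in for an attachment:

```javascript
// Wrap a Base64 string the way MIME does: at most 76 characters per line,
// lines separated by CRLF.
function wrapForMime(base64) {
  const lines = base64.match(/.{1,76}/g) ?? []
  return lines.join("\r\n")
}

// 5,700 bytes of sample data: 7,600 Base64 characters, 100 full lines.
const sample = btoa("x".repeat(5700))
const wrapped = wrapForMime(sample)
console.log(sample.length)  // 7600
console.log(wrapped.length) // 7798 (99 CRLF pairs added)
console.log((wrapped.length / 5700).toFixed(3)) // "1.368"
```

The headers of a real message add a little more on top of that ratio.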

Browser API: btoa() and atob()

Modern browsers provide two Base64 functions:

// Encode a string to Base64
const encoded = btoa("hello world")
console.log(encoded) // "aGVsbG8gd29ybGQ="

// Decode Base64 to string
const decoded = atob("aGVsbG8gd29ybGQ=")
console.log(decoded) // "hello world"

Important: btoa() throws an InvalidCharacterError if the string contains any character above U+00FF (outside Latin-1). For arbitrary text, convert it to UTF-8 bytes first:

function base64Encode(str) {
  const bytes = new TextEncoder().encode(str)
  let binary = ""
  bytes.forEach(b => binary += String.fromCharCode(b))
  return btoa(binary)
}
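
Decoding needs the reverse trip, because atob() hands back one character per byte rather than real text. A minimal counterpart to base64Encode() might look like this; base64Decode is my own name for it.

```javascript
// Base64 -> bytes -> UTF-8 text.
function base64Decode(base64) {
  // atob() yields a "binary string": each character holds one byte value.
  const binary = atob(base64)
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0))
  return new TextDecoder().decode(bytes)
}

console.log(base64Decode("5L2g5aW9"))         // "你好"
console.log(base64Decode("aGVsbG8gd29ybGQ=")) // "hello world"
```

Calling atob("5L2g5aW9") directly would return six garbled Latin-1 characters instead of the two expected ones.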

Base64 in Other Languages

import base64

# Encode
encoded = base64.b64encode(b"hello world")
print(encoded)  # b'aGVsbG8gd29ybGQ='

# Decode
decoded = base64.b64decode("aGVsbG8gd29ybGQ=")
print(decoded)  # b'hello world'

import "encoding/base64"

// Encode
encoded := base64.StdEncoding.EncodeToString([]byte("hello world"))

// Decode
decoded, _ := base64.StdEncoding.DecodeString("aGVsbG8gd29ybGQ=")

URL-Safe Base64

Standard Base64 uses + and / characters, which have special meaning in URLs. Base64URL encoding replaces them:

function base64UrlEncode(str) {
  return btoa(str)
    .replace(/\+/g, "-")  // "+" must be escaped inside a regex
    .replace(/\//g, "_")  // so must "/"
    .replace(/=+$/, "")   // drop trailing padding
}

JWTs use this variant. If you decode a JWT and get an error or garbled output, check that you are decoding it as Base64URL rather than as standard Base64.
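
Going between the two variants is a pure string operation. A small sketch, assuming the receiving decoder is strict and wants the = padding restored; toBase64Url and fromBase64Url are hypothetical helper names.

```javascript
// Convert standard Base64 to Base64URL: swap the two special characters
// and drop the trailing padding.
function toBase64Url(base64) {
  return base64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "")
}

// Convert Base64URL back to standard Base64, restoring the "=" padding
// so that strict decoders accept the result.
function fromBase64Url(base64url) {
  const base64 = base64url.replace(/-/g, "+").replace(/_/g, "/")
  const padding = (4 - (base64.length % 4)) % 4
  return base64 + "=".repeat(padding)
}

console.log(toBase64Url("+/8="))  // "-_8"
console.log(fromBase64Url("-_8")) // "+/8="
```

The sample value is the encoding of the two bytes 0xfb 0xff, picked because it exercises +, / and = at once.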

Base64 in Node.js

Node.js adds Buffer-based Base64 methods:

// Encode
const encoded = Buffer.from("hello world").toString("base64")

// Decode
const decoded = Buffer.from("aGVsbG8gd29ybGQ=", "base64").toString("utf-8")

This is more reliable than btoa() because Buffer handles UTF-8 and binary data natively.
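
For comparison with the browser helpers, here is roughly what the same tasks look like with Buffer. The function names are mine, and the last one assumes a Node.js release recent enough to support the "base64url" encoding name.

```javascript
// Text -> Base64. The string is converted to UTF-8 bytes first.
function nodeBase64Encode(text) {
  return Buffer.from(text, "utf-8").toString("base64")
}

// Base64 -> text.
function nodeBase64Decode(base64) {
  return Buffer.from(base64, "base64").toString("utf-8")
}

// Bytes -> Base64URL (no padding). Assumes "base64url" is supported.
function nodeBase64UrlEncode(bytes) {
  return Buffer.from(bytes).toString("base64url")
}

console.log(nodeBase64Encode("你好"))           // "5L2g5aW9"
console.log(nodeBase64Decode("5L2g5aW9"))       // "你好"
console.log(nodeBase64UrlEncode([0xfb, 0xff]))  // "-_8"
```

No binary-string workaround is needed, which is the main reason I reach for Buffer on the server.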

A Real Debugging Story

Last month I spent two hours debugging a corrupted image upload. The frontend was encoding a JPEG as Base64 and sending it in a JSON body. The backend decoded it and saved to disk. But the image was unreadable — a pink-tinted mess.

The culprit: the Base64 string had passed through a URL-encoding step in the query-string layer. The server correctly decoded %2B back to +, but the frontend had already replaced some + characters with spaces during serialization, so the decoded bytes no longer matched the original image.

The fix: always use Base64URL for data that passes through URLs, and standard Base64 for data embedded directly in JSON bodies. Never mix them.
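
The underlying problem is easy to reproduce with the standard URLSearchParams API: in a query string, a literal + is read back as a space. This is a sketch of the general failure mode, not the exact code from that project; data is a made-up parameter name.

```javascript
// Read the "data" parameter the way a query-string parser would.
function readDataParam(queryString) {
  return new URLSearchParams(queryString).get("data")
}

const standardBase64 = "+/8=" // standard Base64 of the bytes 0xfb 0xff

// Concatenated into a query string unescaped: "+" comes back as a space.
console.log(JSON.stringify(readDataParam("data=" + standardBase64))) // " /8="

// Serialized properly: "+" becomes %2B and survives the round trip.
const escapedQuery = new URLSearchParams({ data: standardBase64 }).toString()
console.log(escapedQuery)                // "data=%2B%2F8%3D"
console.log(readDataParam(escapedQuery)) // "+/8="

// Base64URL has nothing to escape in the first place.
console.log(readDataParam("data=-_8")) // "-_8"
```

One space in place of a + is enough to shift or corrupt every byte that follows it in the decoded file.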

When NOT to Use Base64

Three scenarios where Base64 is actively harmful:

1. Large files over 1MB: The 33% overhead adds significant bandwidth cost. Send raw binary with proper Content-Type headers.

2. CDN-hosted images: A CDN-delivered image loads faster and caches better than a Base64 data URI in your HTML.

3. Binary APIs: If your API is internal and both client and server control the transport, use protobuf or MessagePack. They are smaller and faster to encode/decode.
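
For the first scenario, the difference is visible in the request itself. A sketch comparing the two upload styles, using fetch-style option objects; both builder functions are hypothetical and the endpoint in the final comment is a placeholder.

```javascript
// Request options for uploading raw bytes.
function buildBinaryUpload(bytes, mimeType) {
  return {
    method: "POST",
    headers: { "Content-Type": mimeType },
    body: bytes, // sent as-is, no Base64 step
  }
}

// The same bytes wrapped in a JSON body, for comparison.
function buildJsonUpload(bytes, mimeType) {
  let binary = ""
  for (const byte of bytes) binary += String.fromCharCode(byte)
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ mimeType, data: btoa(binary) }),
  }
}

const fileBytes = new Uint8Array(3000) // placeholder for real file contents
console.log(buildBinaryUpload(fileBytes, "image/jpeg").body.byteLength) // 3000
console.log(buildJsonUpload(fileBytes, "image/jpeg").body.length)       // 4035

// Usage, with a placeholder endpoint:
// await fetch("https://example.com/upload", buildBinaryUpload(fileBytes, "image/jpeg"))
```

The server also has to parse the JSON and decode the string before it can touch the file, so the raw upload saves work on both ends.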

Performance Considerations

Base64 encoding and decoding are CPU-cheap on modern hardware, but the 33% size increase means more data on the wire. For APIs returning large payloads, consider whether Base64 is necessary. Binary serialization formats like MessagePack or BSON can carry raw bytes natively, so they avoid the overhead entirely.

My rule of thumb: if the encoded data is small (under 1MB), Base64 is fine. For file uploads or streaming data, send raw binary with the correct Content-Type header instead.