TL;DR
I reduced my geo-points API payload from 7.37 MB (uncompressed) to 2.62 MB (gzipped Protobuf) — a 64% reduction in wire transfer — and improved client-side parsing by 40%. This made our interactive map usable on mobile networks and drastically improved the experience for tens of thousands of users during traffic spikes.
The path wasn't obvious: I tested JSON, custom compact JSON, MessagePack, and Protocol Buffers, learning surprising lessons about compression, parsing performance, and when each format shines.
The Problem: A Map That Needed to Scale
I maintain an interactive mapping application that displays over 67,000 points on a Mapbox map. Each point represents a geographic location with associated metadata. When users load the map (the whole-world view is the default), their browser fetches all point coordinates from the /geo_points endpoint.
By early 2025, this had become a problem:
- 7.37 MB uncompressed JSON sent over the wire
- 5-10 second load times on average connections
- 20+ seconds or timeouts on mobile networks
- Poor first impressions for new users
- Mounting bandwidth costs — 7+ MB × thousands of daily requests
The timing was critical. Most of the year, I see 100-300 daily visitors. But during major events, I get 30,000+ visitors in 24 hours. My infrastructure needed to handle both extremes gracefully.
Understanding the Problem: It's Not Just About Size
The Data: Simple but Voluminous
Each point is just four fields:
{
  "uuid": "9652507b-1edb-4286-8447-30461bce299e",
  "status": 0,
  "longitude": -118.2514212138173,
  "latitude": 34.23746375980207
}
With 67,627 points, that's 270,508 data values. The original JSON implementation repeated "uuid":, "status":, "longitude":, "latitude": for every record — 2.84 MB of just key names.
The Hidden Challenge: Traffic Patterns and Caching
Most optimization guides assume consistent traffic. My reality is different:
| Period | Daily Users | Challenge |
|---|---|---|
| Typical days | 100-300 | Cache expires between visits |
| Peak events | 30,000+ | Cache hammered constantly |
| Secondary events | 5,000-10,000 | Moderate load |
Real production logs bear this out. Here's a representative hour from January 2026:
=== REQUEST DISTRIBUTION (1 hour sample) ===
/statistics 220 requests (lightweight polling, ~170 bytes)
/geo_points 3 requests (heavy map data, ~2.7 MB each)
Only 3 map loads in an entire hour. With a standard 3-minute cache TTL and no pre-warming, the cache would expire roughly 20 times over that hour, so nearly every visitor would trigger an expensive database query and serialization.
My Multi-Layered Solution
I needed four optimizations working together:
- Compact JSON — Eliminate redundant field names
- Binary serialization — Switch from text to Protobuf
- Gzip compression — Let Cloudflare CDN compress responses
- Cache pre-warming — Sidekiq job keeps all formats ready
Each optimization multiplies the benefits of the others. You can't cherry-pick just one.
The Four-Stage Evolution
Stage 1: Original JSON — The Baseline
[
  {
    "uuid": "9652507b-1edb-4286-8447-30461bce299e",
    "status": 0,
    "longitude": -118.2514212138173,
    "latitude": 34.23746375980207
  }
]
Size: 7.37 MB (7,725,105 bytes)
The problem jumps out: we're shipping the same four keys 67,627 times. That's 42 bytes per record × 67,627 = 2.84 MB of redundant field names.
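The 42-byte figure is easy to verify in plain Ruby: each record repeats four quoted keys with their colons, plus the three commas separating the fields.

```ruby
# Per-record overhead of the keyed JSON: four quoted keys with colons,
# plus the three commas that separate the fields.
keys = %w[uuid status longitude latitude]
per_record = keys.sum { |k| %("#{k}":).length } + 3

puts per_record           # => 42 bytes of keys per record
puts per_record * 67_627  # => 2840334 bytes, i.e. ~2.84 MB of key names
```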
Stage 2: Compact JSON — Arrays Instead of Objects
My first optimization was conceptually simple but required coordinating backend and frontend: use positional arrays instead of objects.
[
  ["9652507b-1edb-4286-8447-30461bce299e", 0, -118.2514212138173, 34.23746375980207]
]
Size: 5.30 MB (5,561,042 bytes) — 28% reduction
The backend change was a single SQL query modification:
class GeoPointsQuery
  def results
    Point.connection.select_rows(
      "SELECT uuid, CASE WHEN active_at IS NOT NULL THEN 1 ELSE 0 END, longitude, latitude FROM points WHERE longitude IS NOT NULL"
    ).to_json
  end
end
The frontend gained type safety through TypeScript tuples:
type CompactPoint = [string, 0 | 1, number, number];
const points: CompactPoint[] = await response.json();
Tradeoff: Less human-readable in dev tools, but no runtime performance cost and zero new dependencies. This alone saved 2 MB.
Stage 3: MessagePack — Binary JSON
MessagePack is a binary serialization format that supports the same data model as JSON (objects, arrays, numbers, strings) but is more space-efficient: it encodes numbers and strings more compactly than text.
# Backend
gem 'msgpack'
MessagePack.pack(parsed_json_array)
// Frontend
import { decode } from '@msgpack/msgpack';
const buffer = await response.arrayBuffer();
const points = decode(new Uint8Array(buffer));
Size: 3.74 MB (3,922,371 bytes) — 49% reduction from original
MessagePack achieves this by:
- Encoding doubles as 8 bytes instead of 18+ text characters
- Using compact length prefixes for strings
- Eliminating quotes, colons, and commas
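The double-encoding win is visible with nothing but the Ruby standard library, since pack('G') emits the same 8-byte IEEE-754 representation that binary formats put on the wire:

```ruby
# One coordinate, as JSON digits vs. as the 8-byte IEEE-754 double
# that binary formats carry.
lon = -118.2514212138173

puts lon.to_s.bytesize                    # => 18 bytes as text
puts [lon].pack('G').bytesize             # => 8 bytes as a big-endian double
puts [lon].pack('G').unpack1('G') == lon  # => true: the round-trip is lossless
```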
Stage 4: Protocol Buffers — Typed Binary Schema
Protobuf is Google's language-neutral binary format with strict schema definitions. Unlike MessagePack, it requires sharing a schema file between client and server.
// geo_points.proto
syntax = "proto3";
package geo_points;
message GeoPoint {
  string uuid = 1;
  uint32 status = 2;
  double longitude = 3;
  double latitude = 4;
}

message GeoPointList {
  repeated GeoPoint points = 1;
}
Size: 3.74 MB (3,922,470 bytes) — 49% reduction from original
Wait — nearly identical to MessagePack? Yes, and here's why:
My data is primarily:
- UUIDs: Fixed 36-character strings (same size in any format)
- Coordinates: 64-bit doubles (8 bytes in any binary format)
- Status flag: Requires only 1 bit, but both formats use at least 1 byte
For this specific data shape, Protobuf and MessagePack converge to nearly identical raw sizes. The real differences emerge in parsing performance and gzip compression — more on that below.
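You can approximate the convergence by hand. A rough per-record Protobuf estimate, assuming 1-byte field tags and 1-byte length prefixes (and ignoring that proto3 omits a status of 0):

```ruby
# Rough Protobuf wire-size estimate per record. Assumptions: 1-byte field
# tags, 1-byte length prefixes, status always encoded (proto3 really skips 0).
uuid_field   = 1 + 1 + 36   # tag + length prefix + 36-char UUID
status_field = 1 + 1        # tag + 1-byte varint
coord_fields = 2 * (1 + 8)  # tag + 8-byte double, for longitude and latitude
wrapper      = 1 + 1        # tag + length prefix of each repeated GeoPoint

per_record = uuid_field + status_field + coord_fields + wrapper
puts per_record           # => 60
puts per_record * 67_627  # => 4057620, within ~4% of the measured 3,922,470 bytes
```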
The Complete Picture: Raw Size vs Wire Transfer
Here's where the story gets interesting. The original implementation sent uncompressed JSON over the wire. My optimized solution uses Protobuf with gzip compression via Cloudflare CDN:
| Format | Raw Size | Gzipped | Wire Transfer | vs Original |
|---|---|---|---|---|
| Original JSON (no gzip) | 7.37 MB | — | 7.37 MB | baseline |
| Compact JSON (gzipped) | 5.30 MB | 2.67 MB | 2.67 MB | -64% |
| MessagePack (gzipped) | 3.74 MB | 2.63 MB | 2.63 MB | -64% |
| Protobuf (gzipped) | 3.74 MB | 2.62 MB | 2.62 MB | -64% |
The improvement is dramatic: from 7.37 MB transferred to 2.62 MB — saving 4.75 MB per request.
Why Gzip Matters More Than You'd Think
Interestingly, gzip helps JSON more than binary formats:
| Format | Raw | Gzipped | Compression Ratio |
|---|---|---|---|
| Compact JSON | 5.30 MB | 2.67 MB | 50% reduction |
| MessagePack | 3.74 MB | 2.63 MB | 30% reduction |
| Protobuf | 3.74 MB | 2.62 MB | 30% reduction |
Binary formats are already optimized, so they compress less efficiently. JSON's repetitive patterns (quotes, brackets, commas) compress beautifully.
However, binary + gzip still wins because you start from a smaller base.
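The effect reproduces with the standard library's Zlib on synthetic data. Note the UUID pattern, row count, and values below are made up for illustration, not the production dataset:

```ruby
require 'json'
require 'zlib'

# Synthetic stand-ins: repetitive JSON text vs. densely packed binary doubles.
rows = Array.new(10_000) do |i|
  [format('%08x-dead-beef-0000-%012x', i, i), i % 2, -118.25 + i * 1e-4, 34.23 + i * 1e-4]
end

json_text = rows.to_json
binary    = rows.map { |_, s, lon, lat| [s, lon, lat].pack('CGG') }.join

ratio = ->(data) { 1.0 - Zlib::Deflate.deflate(data).bytesize.to_f / data.bytesize }
puts ratio.call(json_text) # JSON's repeated punctuation and digit patterns compress hard
puts ratio.call(binary)    # already-dense doubles leave far less for gzip to squeeze
```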
Real Production Logs
Here's actual data from my Heroku router logs:
# Original (no gzip, uncompressed JSON)
path="/geo_points" service=3500ms status=200 bytes=7725105
# Current implementation (with gzip):
# JSON format (gzipped)
path="/geo_points" service=1089ms status=200 bytes=2799184
# MessagePack format (gzipped)
path="/geo_points" service=1113ms status=200 bytes=2761771
# Protobuf format (gzipped)
path="/geo_points" service=394ms status=200 bytes=2743993
Notice: 7.7 MB uncompressed → 2.7 MB gzipped — almost 3× less data over the wire.
Performance Benchmarks: Where Time Really Goes
Download Times: The User-Facing Win
The most dramatic improvement is what users experience:
| Implementation | Wire Size | Avg Download Time |
|---|---|---|
| Original (uncompressed JSON) | 7.37 MB | 5-10 seconds |
| Optimized (gzipped Protobuf) | 2.62 MB | 1.37 seconds |
That's a 5-7× improvement for visitors — especially critical on mobile networks during traffic spikes.
Format Comparison: JSON vs MessagePack vs Protobuf
I measured 5 consecutive requests to each format (all gzipped via CDN):
| Format | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average |
|---|---|---|---|---|---|---|
| JSON | 2.12s | 1.26s | 1.63s | 1.53s | 1.49s | 1.61s |
| MessagePack | 1.36s | 1.07s | 1.72s | 1.49s | 1.54s | 1.44s |
| Protobuf | 1.43s | 1.02s | 1.54s | 1.44s | 1.44s | 1.37s |
Protobuf is consistently 15% faster to download than JSON.
Client-Side Parsing: The Surprise Winner
This is where things got fascinating. I benchmarked 10 parsing iterations in Node.js:
| Format | Avg Time | Min | Max | vs JSON |
|---|---|---|---|---|
| JSON.parse | 8.74ms | 7.45ms | 11.15ms | baseline |
| MessagePack decode | 11.05ms | 8.21ms | 15.42ms | 27% slower ❌ |
| Protobuf decode | 5.27ms | 4.75ms | 7.08ms | 40% faster ✅ |
Surprising result: MessagePack is slower to parse than native JSON!
Why? JavaScript's JSON.parse() is implemented in highly optimized C++ within V8 and has been tuned for decades. MessagePack parsing requires JavaScript-level binary manipulation, which simply can't compete with native code.
Protobuf wins because protobufjs generates optimized static decoders at build time, and the binary format requires less branching during parsing.
Server-Side Serialization: Ruby's Performance Story
I benchmarked 10 iterations of serializing 67,627 records:
| Format | Total Time | Per-Iteration | vs JSON |
|---|---|---|---|
| Original JSON (with keys) | 2.02s | 202ms | baseline |
| Compact JSON (arrays) | 1.71s | 171ms | 15% faster |
| MessagePack | 0.06s | 6ms | 97% faster ✅ |
| Protobuf | 2.59s | 259ms | 28% slower ❌ |
Key insight: MessagePack serialization is blazingly fast in Ruby — nearly instant. This is because the msgpack gem is a native C extension, while google-protobuf has more Ruby-level object construction overhead.
Takeaway: For server-side performance, MessagePack wins. For client-side parsing, Protobuf wins.
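For context, the server-side numbers came from a loop along these lines. This is a sketch with synthetic rows rather than the production query: SecureRandom UUIDs stand in for the real ones, and the MessagePack and Protobuf runs used the same pattern with their respective gems.

```ruby
require 'benchmark'
require 'json'
require 'securerandom'

# Synthetic rows shaped like the production query results.
rows = Array.new(67_627) do
  [SecureRandom.uuid, 0, -118.2514212138173, 34.23746375980207]
end
keyed = rows.map { |u, s, lon, lat| { uuid: u, status: s, longitude: lon, latitude: lat } }

Benchmark.bm(15) do |x|
  x.report('original JSON:') { 10.times { keyed.to_json } }
  x.report('compact JSON:')  { 10.times { rows.to_json } }
end
```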
Implementation: Supporting All Three Formats
I chose to support all three formats simultaneously via HTTP content negotiation. This enables:
- Gradual frontend migration without breaking changes
- A/B testing different formats
- JSON fallback for debugging
Backend: Rails Controller with Multi-Format Support
# config/initializers/mime_types.rb
Mime::Type.register 'application/msgpack', :msgpack
Mime::Type.register 'application/x-protobuf', :protobuf
# app/controllers/geo_points_controller.rb
class GeoPointsController < ApiController
  def index
    respond_to do |format|
      format.json do
        geo_points = Rails.cache.fetch('geo_points_json', expires_in: cache_ttl) do
          GeoPointsQuery.new.results
        end
        render plain: geo_points, content_type: 'application/json'
      end

      format.msgpack do
        geo_points = Rails.cache.fetch('geo_points_msgpack', expires_in: cache_ttl) do
          MessagePack.pack(JSON.parse(GeoPointsQuery.new.results))
        end
        render body: geo_points, content_type: 'application/msgpack'
      end

      format.protobuf do
        geo_points = Rails.cache.fetch('geo_points_protobuf', expires_in: cache_ttl) do
          GeoPointsProtobufSerializer.new(GeoPointsQuery.new.results).serialize
        end
        render body: geo_points, content_type: 'application/x-protobuf'
      end
    end
  end

  private

  def cache_ttl
    ENV.fetch('CACHE_EXPIRES_IN', 180).to_i.seconds
  end
end
Cache Pre-Warming: The Secret Sauce
Since serialization can be slow (especially Protobuf at 259ms), we pre-warm all three cached formats using Sidekiq:
# app/jobs/warm_geo_points_cache_job.rb
class WarmGeoPointsCacheJob
  include Sidekiq::Job

  def perform
    expires_in = ENV.fetch('CACHE_EXPIRES_IN', 180).to_i.seconds

    # Fetch once from the database
    json_result = GeoPointsQuery.new.results
    parsed_data = JSON.parse(json_result)

    # Serialize the remaining formats (sequentially, within one job run)
    msgpack_result = MessagePack.pack(parsed_data)
    protobuf_result = GeoPointsProtobufSerializer.new(json_result).serialize

    # Cache all three
    Rails.cache.write('geo_points_json', json_result, expires_in: expires_in)
    Rails.cache.write('geo_points_msgpack', msgpack_result, expires_in: expires_in)
    Rails.cache.write('geo_points_protobuf', protobuf_result, expires_in: expires_in)

    Rails.logger.info(
      "[WarmGeoPointsCacheJob] Cache warmed: " \
      "JSON=#{json_result.bytesize}, MsgPack=#{msgpack_result.bytesize}, " \
      "Protobuf=#{protobuf_result.bytesize}"
    )
  end
end
Sidekiq-cron configuration:
warm_geo_points_cache: "*/2 * * * *" # Every 2 minutes
With 2-minute refresh and 3-minute TTL, the cache is always warm. Even during quiet periods with 25-minute gaps between visitors, every request gets instant cached data. No user ever waits for serialization.
Frontend: Type-Safe Protobuf Integration
We use protobufjs-cli to generate TypeScript definitions from the .proto file:
npm run proto:generate
# Runs: pbjs -t static-module -w es6 -o lib/generated/geo_points.js protos/geo_points.proto
# && pbts -o lib/generated/geo_points.d.ts lib/generated/geo_points.js
This generates strongly-typed classes:
// lib/generated/geo_points.d.ts (auto-generated)
export namespace geo_points {
  interface IGeoPoint {
    uuid?: string | null;
    status?: number | null;
    longitude?: number | null;
    latitude?: number | null;
  }

  class GeoPointList {
    static decode(reader: Uint8Array): GeoPointList;
    points: IGeoPoint[];
  }
}
The API client requests Protobuf via the Accept header:
// lib/apiClient.ts
import { geo_points } from "./generated/geo_points";
export async function getAllGeoPoints(): Promise<GeoPointsResponse> {
  const apiHost = process.env.NEXT_PUBLIC_API_HOST;
  const response = await fetch(`${apiHost}/geo_points`, {
    method: "GET",
    headers: {
      ...makeAuthHeaders(),
      Accept: "application/x-protobuf", // Request Protobuf format
    },
    cache: "no-cache",
  });

  if (!response.ok) {
    throw new Error(`Failed to fetch geo_points: ${response.status}`);
  }

  // Parse binary response
  const buffer = await response.arrayBuffer();
  const decoded = geo_points.GeoPointList.decode(new Uint8Array(buffer));

  // Convert to internal format
  return decoded.points.map((p) => [
    p.uuid ?? "",
    (p.status ?? 0) as 0 | 1,
    p.longitude ?? 0,
    p.latitude ?? 0,
  ]);
}
Hard-Won Lessons
1. Proto Naming Conflicts with ActiveRecord Models
My initial .proto file used message Point, which conflicted with our Rails Point model. This caused cryptic test failures in which Point.find(id) resolved Point to the generated Protobuf class instead of the ActiveRecord model.
Solution: Always use a package namespace:
package geo_points;

message GeoPoint {  // Not just "Point"
  ...
}
In Ruby, this generates GeoPoints::GeoPoint which doesn't conflict.
2. Rails Zeitwerk Autoloading Hates Generated Files
Rails 7 uses Zeitwerk for autoloading, which expects files to define constants matching their path. Our generated geo_points_pb.rb didn't follow this convention, causing boot failures.
Solution: Move generated files outside app/ to lib/protobuf/ and require them explicitly:
require_relative '../../lib/protobuf/geo_points_pb'
3. Node.js ESM/CommonJS Module Wars
Generated protobufjs files use ES modules, but our test scripts used CommonJS. Module resolution errors everywhere.
Solution: Use dynamic imports in CommonJS:
async function main() {
  const { geo_points } = await import('./lib/generated/geo_points.js');
  // ...
}
4. Gzip Compression Levels the Playing Field
I expected Protobuf to be dramatically smaller over the wire, but gzip compression reduced the differences to just 2%. This was initially disappointing, but the 40% parsing speed improvement made it absolutely worthwhile.
When to Use Each Format
| Format | Best For | Avoid When |
|---|---|---|
| JSON | Debugging, dev tools, simple APIs, small payloads (<100KB) | Large datasets, mobile networks, high traffic |
| Compact JSON | Quick wins, no dependencies, rapidly changing schemas | Binary data, maximum compression |
| MessagePack | Server-side caching, Ruby/Python backends, schema-less data | Client-heavy parsing in JavaScript |
| Protobuf | Client-heavy parsing, strict schemas, cross-language APIs | Rapidly changing schemas, small teams |
Why We Chose Protobuf
- Fastest client parsing — 40% faster than JSON, critical for mobile users
- Type safety — Catches data shape errors at build time
- Language neutral — Same .proto file generates Ruby and TypeScript code
- Smallest gzipped size — Every byte counts at scale
- Future-proof — Industry standard (Google, Netflix, Uber)
When MessagePack Might Be Better
- You need schema-less flexibility
- Your backend does more serialization work than your frontend does parsing
- You want simpler tooling (no code generation step)
- You're using Ruby and care deeply about serialization speed
Final Results: Before and After
| Metric | Before | After | Improvement |
|---|---|---|---|
| Wire transfer | 7.37 MB (uncompressed) | 2.62 MB (gzipped) | 64% smaller |
| Raw payload | 7.37 MB | 3.74 MB | 49% smaller |
| Download time | 5-10 seconds | ~1.4 seconds | 5-7× faster |
| Parse time | 8.7ms | 5.3ms | 40% faster |
| Cache strategy | None (DB hit) | Pre-warmed Redis | Instant response |
The User Experience Transformation
For our 67,000+ point map, visitors now see data appear 5-7 seconds faster than before. This matters most during traffic spikes when tens of thousands of users visit simultaneously.
But thanks to our pre-warming cache strategy, even the 300 daily visitors during quiet periods get the same instant experience. The cache never goes cold.
Key Takeaways
- Don't optimize in isolation — My wins came from combining four optimizations: compact format, binary serialization, gzip, and cache pre-warming.
- Measure everything — I was surprised by MessagePack's slow JavaScript parsing and Protobuf's slow Ruby serialization. Benchmarks revealed the truth.
- Gzip changes everything — Binary formats compress less than JSON, but starting from a smaller base still wins.
- Client parsing matters — With 67K records, even milliseconds add up. Protobuf's 40% parsing speedup was critical.
- Pre-warming is underrated — For spiky or low traffic, cache pre-warming ensures instant responses even after long gaps.
- Content negotiation enables gradual migration — Supporting multiple formats let me A/B test, debug, and migrate without breaking changes.
Tech Stack
| Component | Technology |
|---|---|
| Backend | Ruby on Rails 7.0.8, Ruby 3.1.4 |
| Protobuf (Ruby) | google-protobuf gem ~4.29 |
| MessagePack (Ruby) | msgpack gem ~1.7 |
| Frontend | Next.js 14, React 18, TypeScript 5.8 |
| Protobuf (JS) | protobufjs ~7.5, protobufjs-cli ~1.2 |
| MessagePack (JS) | @msgpack/msgpack ~3.1 |
| Hosting | Heroku (backend), Vercel (frontend) |
| CDN | Cloudflare (automatic gzip) |
| Cache | Redis (Heroku Redis) |