Why Are 3D Models So Heavy?
Your model is secretly getting heavier
You exported a GLB file: 10MB, feels fine. You open it on your phone and get a white screen, stuttering, or even an outright crash.
No problem on your computer, yet it explodes on your phone. This isn't your code's fault. It's a counterintuitive property of 3D models: on disk and in VRAM, they're completely different sizes.
A JPG image might be only 200KB on disk. But the GPU doesn't understand JPG; it only understands raw pixels. So before it's uploaded to VRAM, the image gets fully decompressed. A 2048x2048 RGBA texture eats about 22MB of VRAM once decompressed: roughly 17MB of raw pixels, plus about a third more for mipmaps (explained below). If you use 6 textures (albedo, normal, roughness, metallic, AO, emissive), a single material costs 132MB.
A phone might only have 2-4GB of VRAM in total, and one model's textures eat 3-6% of it. Now imagine 10 models in the scene.
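To make the 10-model case concrete, here is a rough back-of-the-envelope sketch in Python. It reuses the figures from the text (22MB per texture, 6 textures per material) and assumes, purely for illustration, one material per model, no textures shared between models, and 1GB counted as 1024MB. The function names are made up for this example.

```python
# Figures taken from the text above; the sharing and unit assumptions are ours.
MB_PER_TEXTURE = 22        # one 2048x2048 RGBA texture, decompressed with mipmaps
TEXTURES_PER_MATERIAL = 6  # albedo, normal, roughness, metallic, AO, emissive


def scene_texture_mb(models: int, materials_per_model: int = 1) -> int:
    """Texture memory for a scene in MB, assuming no textures are shared."""
    return models * materials_per_model * TEXTURES_PER_MATERIAL * MB_PER_TEXTURE


def share_of_memory(used_mb: float, total_gb: float) -> float:
    """Fraction of total memory used, counting 1 GB as 1024 MB."""
    return used_mb / (total_gb * 1024)


if __name__ == "__main__":
    for models in (1, 10):
        used = scene_texture_mb(models)
        for total_gb in (2, 4):
            share = share_of_memory(used, total_gb)
            print(f"{models:>2} model(s): {used} MB is {share:.1%} of {total_gb} GB")
```

Under these assumptions, ten such models need well over a gigabyte for textures alone, before any geometry, framebuffers, or the rest of the page are counted.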
Break it down: where the bytes go
A typical GLB model has three main parts: textures, vertex data, and animation data, plus a small amount of metadata.
Take a typical PBR model:
| Component | What it is | Typical share | Notes |
|---|---|---|---|
| Textures | albedo, normal, roughness, metallic, AO, etc. | 70-85% | Almost always the bulk |
| Vertex data | position, normal, UV, tangent, color | 10-20% | Depends on model complexity |
| Animation data | skeletons, skinning, keyframes | 0-15% | Only when animated |
| Other | material defs, scene structure, cameras | < 2% | Negligible |
Textures take up about 80%. Often you think the vertices are what need optimizing, but what's really hogging the space is the textures.
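If you want to check the split for your own file, the sketch below reads a GLB and sums the embedded bytes per category. It relies only on the glTF 2.0 binary layout (a 12-byte header followed by a JSON chunk) and on the `bufferViews`, `images`, `animations`, and `skins` fields of the glTF JSON. The function name and the category labels are hypothetical, and the classification is deliberately coarse: anything that is not an embedded image or animation data is counted as "vertex_and_other", and images referenced by external URI are not counted at all.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF"
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"


def glb_byte_breakdown(glb: bytes) -> dict:
    """Return embedded byte counts per category for a glTF 2.0 binary blob."""
    magic, version, _total = struct.unpack_from("<III", glb, 0)
    if magic != GLB_MAGIC or version != 2:
        raise ValueError("not a glTF 2.0 binary file")

    # The first chunk after the 12-byte header must be the JSON chunk.
    chunk_len, chunk_type = struct.unpack_from("<II", glb, 12)
    if chunk_type != CHUNK_JSON:
        raise ValueError("first chunk is not JSON")
    gltf = json.loads(glb[20:20 + chunk_len].decode("utf-8"))

    views = gltf.get("bufferViews", [])
    accessors = gltf.get("accessors", [])

    def view_of(accessor_index):
        return accessors[accessor_index].get("bufferView")

    # Buffer views that hold embedded image files (PNG, JPG, KTX2 bytes).
    texture_views = {img["bufferView"] for img in gltf.get("images", [])
                     if "bufferView" in img}

    # Buffer views reached through animation samplers or skin matrices.
    animation_views = set()
    for anim in gltf.get("animations", []):
        for sampler in anim.get("samplers", []):
            animation_views.add(view_of(sampler["input"]))
            animation_views.add(view_of(sampler["output"]))
    for skin in gltf.get("skins", []):
        if "inverseBindMatrices" in skin:
            animation_views.add(view_of(skin["inverseBindMatrices"]))

    sizes = {"textures": 0, "animation": 0, "vertex_and_other": 0,
             "json": chunk_len}
    for index, view in enumerate(views):
        if index in texture_views:
            key = "textures"
        elif index in animation_views:
            key = "animation"
        else:
            key = "vertex_and_other"
        sizes[key] += view["byteLength"]
    return sizes
```

To use it, read the file as bytes and pass them in, for example `glb_byte_breakdown(open("model.glb", "rb").read())`, where the path is a placeholder. Note that this measures bytes on disk, not VRAM.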
"Small on disk" doesn't mean "small in VRAM"
This might be the single most important point in understanding 3D performance.
PNG and JPG are designed for network transfer—tiny on disk, fast to download. But the GPU can't use them directly; they must be fully decompressed into raw pixels first. The formula:
VRAM = width * height * 4 bytes (RGBA) * 1.333 (with mipmaps)
A 4096x4096 RGBA texture:
| Metric | Value |
|---|---|
| PNG file size | ~8MB |
| JPG file size | ~1.5MB |
| VRAM usage (with mipmaps) | ~89MB |
A 1.5MB JPG becomes 89MB in VRAM.
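The formula is easy to turn into a helper. The sketch below is a direct transcription in Python, with the 1.333 factor written as 4/3. The function name and parameters are illustrative, and the result is an estimate for an uncompressed RGBA texture, not what any particular driver actually allocates.

```python
def texture_vram_bytes(width: int, height: int,
                       bytes_per_pixel: int = 4, mipmaps: bool = True) -> float:
    """Approximate VRAM for an uncompressed texture, per the formula above."""
    base = width * height * bytes_per_pixel
    # A full mip chain adds roughly one third on top of the base level.
    return base * 4 / 3 if mipmaps else float(base)


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        mb = texture_vram_bytes(size, size) / 1_000_000
        print(f"{size}x{size} RGBA with mipmaps: ~{mb:.0f} MB")
```

Note that the file size on disk never appears in the calculation: only the pixel dimensions matter.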
What are mipmaps? The GPU generates a series of progressively smaller versions of a texture, from the original size down to a 1x1 pixel, each level half the width and height of the previous one. This makes distant objects render faster and with less shimmering, but costs about 33% more VRAM. Nearly every 3D app uses mipmaps, so this overhead is essentially standard.
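The 33% figure comes from summing the chain: each level has a quarter of the pixels of the one before, and 1 + 1/4 + 1/16 + ... approaches 4/3. A small Python sketch, with illustrative function names, that builds the chain and measures the overhead:

```python
def mip_chain(width: int, height: int) -> list[tuple[int, int]]:
    """Return the (width, height) of every mip level, from full size down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        # Halve each dimension, but never go below one pixel.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def mip_overhead(width: int, height: int) -> float:
    """Total pixels in the whole chain divided by pixels in the base level."""
    total = sum(w * h for w, h in mip_chain(width, height))
    return total / (width * height)


if __name__ == "__main__":
    print(len(mip_chain(4096, 4096)), "levels")
    print(f"overhead factor: {mip_overhead(4096, 4096):.4f}")
```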
So PNG and JPG are like vacuum storage bags—compressed small for the trip, but you have to fully expand them at your destination. The download got faster, but VRAM savings: zero.
What happens when VRAM runs out
You won't get a neat "out of memory" dialog. Reality is worse:
- Mobile: white screen, or the OS kills the tab outright
- VR headsets: frame drops. In VR, a dropped frame isn't "a little laggy"—it triggers motion sickness
- Desktop: texture flickering, degradation, slower rendering
On Reddit, a developer built a WebXR gallery and crammed 60 stereoscopic images onto a Quest. It worked at first, then grew increasingly unstable until it crashed. He spent days debugging his code before realizing he'd never seriously thought about VRAM—he'd just kept shoving JPGs at the GPU.
Two paths for compression
3D model compression splits into two directions:
Vertex compression stores geometry data (vertex coordinates, normals, UVs) more compactly. One example is swapping 32-bit floats for 16-bit integers, which is called quantization. Representative solutions: Draco, MeshOpt, KHR_mesh_quantization.
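As a minimal illustration of quantization, the Python sketch below maps a list of floats onto the 0..65535 range and back. It shows the general idea only; it is not the actual encoding used by Draco, MeshOpt, or KHR_mesh_quantization, and the function names and sample coordinates are made up.

```python
import struct


def quantize_u16(values: list[float]) -> tuple[list[int], float, float]:
    """Map floats onto 0..65535 within their own [min, max] range."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 65535 or 1.0  # fall back to 1.0 when all values are equal
    codes = [min(65535, round((v - lo) / scale)) for v in values]
    return codes, lo, scale


def dequantize_u16(codes: list[int], lo: float, scale: float) -> list[float]:
    """Recover approximate floats from the 16-bit codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    xs = [-1.25, -0.4, 0.0, 0.731, 2.5]  # made-up X coordinates
    codes, lo, scale = quantize_u16(xs)
    as_float32 = struct.pack(f"<{len(xs)}f", *xs)        # 4 bytes per value
    as_uint16 = struct.pack(f"<{len(codes)}H", *codes)   # 2 bytes per value
    restored = dequantize_u16(codes, lo, scale)
    worst = max(abs(a - b) for a, b in zip(xs, restored))
    print(len(as_float32), "bytes vs", len(as_uint16), "bytes; worst error", worst)
```

The storage halves, and the price is a small rounding error bounded by half of one quantization step. The offset and scale have to be kept alongside the data so the values can be restored.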
Texture compression keeps textures compressed even inside VRAM. The GPU decodes small blocks of pixels on the fly while sampling, with almost no performance cost. Representative solution: KTX2 + Basis Universal.
| | Vertex compression | Texture compression |
|---|---|---|
| What it reduces | Geometry data | Textures |
| Typical effect | 50-90% smaller | 50-70% smaller on disk, 75% less VRAM |
| Lossy? | Yes, precision loss | Yes, quality loss |
| Best for | Vertex-dense models | Almost every PBR model |
| See | Part 2 | Parts 3 and 4 |
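The "75% less VRAM" figure follows from how GPU block formats store pixels. As a sketch, assume a format that packs each 4x4 pixel block into 16 bytes (8 bits per pixel), which is the layout of BC7, ETC2 RGBA8, and ASTC 4x4. Which format a given device ends up with depends on its GPU, and other block sizes give different ratios. The function names below are illustrative.

```python
import math


def uncompressed_bytes(width: int, height: int) -> int:
    """RGBA8: 4 bytes per pixel, base level only."""
    return width * height * 4


def block_compressed_bytes(width: int, height: int,
                           block_dim: int = 4, bytes_per_block: int = 16) -> int:
    """Size of one mip level stored as fixed-size blocks.

    The defaults describe a 4x4-pixel block stored in 16 bytes (8 bits per pixel).
    """
    blocks_x = math.ceil(width / block_dim)
    blocks_y = math.ceil(height / block_dim)
    return blocks_x * blocks_y * bytes_per_block


if __name__ == "__main__":
    raw = uncompressed_bytes(2048, 2048)
    packed = block_compressed_bytes(2048, 2048)
    print(f"raw {raw} bytes, block-compressed {packed} bytes, "
          f"saving {1 - packed / raw:.0%}")
```

Because every block has the same size, the GPU can jump straight to the block containing a given pixel, which is why the texture can stay compressed while it is being sampled.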
A common misconception: compress the vertices with Draco and call it done. But textures are 80% of the model's size—halve the vertices and the whole thing might only shrink 10%. You have to work both ends.
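The arithmetic behind that claim is a weighted sum: each part's share of the file multiplied by how much that part shrinks. A tiny Python sketch, using an assumed 80/20 split between textures and vertices and hypothetical names:

```python
def overall_reduction(shares: dict[str, float], reductions: dict[str, float]) -> float:
    """Fraction of total size saved when each part shrinks by its own ratio."""
    return sum(share * reductions.get(part, 0.0) for part, share in shares.items())


if __name__ == "__main__":
    shares = {"textures": 0.8, "vertices": 0.2}  # assumed split, for illustration
    vertices_only = overall_reduction(shares, {"vertices": 0.5})
    both = overall_reduction(shares, {"vertices": 0.5, "textures": 0.6})
    print(f"vertices only: {vertices_only:.0%} smaller")
    print(f"vertices and textures: {both:.0%} smaller")
```

The 60% texture reduction in the second call is just a value picked from the range in the table above; the point is that the larger share dominates the result.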
No single method wins everywhere
This is the core thesis of the entire series:
Different platforms, different devices, different use cases call for different compression strategies.
| Scenario | Primary bottleneck | Focus |
|---|---|---|
| Desktop web showcase | Download speed | File size |
| Mobile browser | VRAM | Texture compression |
| VR headset | VRAM + frame rate | Texture compression + vertex simplification |
| WeChat Mini Program | Package size + compatibility | Lightweight options (MeshOpt) |
| Large scenes | VRAM + draw calls | Full-stack compression + LOD |
Each upcoming article won't just say "use X and you're done." It will explain: where X shines, when it actually backfires, and when you should reach for Y instead.
What's next
This article nailed down the problem. The next one gets hands-on—meeting the three tools of vertex compression: quantization, MeshOpt, and Draco.