Any3D Team

Why Are 3D Models So Heavy?

Tags: 3d-compression, gltf, webgl, performance

Your model is secretly getting heavier

You exported a GLB file: 10MB, feels fine. You open it on your phone: white screen, stuttering, or even an outright crash.

No problem on your computer, yet it explodes on your phone. This isn't your code's fault. It's a counterintuitive property of 3D models: on disk and in VRAM, they're completely different sizes.

A JPG image might be only 200KB on disk. But the GPU doesn't understand JPG; it only understands raw pixels. So before it's uploaded to VRAM, the image gets fully decompressed. A 2048x2048 texture eats about 22MB of VRAM once decompressed: roughly 17MB of raw RGBA pixels, plus a third more for mipmaps. If you use 6 textures (albedo, normal, roughness, metallic, AO, emissive), a single material costs about 132MB.

A phone might only have 2-4GB of VRAM in total (on most phones this is memory shared with the CPU), and one model's textures eat 3-6% of it. Now imagine 10 models in the scene.
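The arithmetic above can be sketched in a few lines. This is a rough estimate, not a measurement: it assumes uncompressed RGBA at 4 bytes per pixel, a full mip chain adding about one third, and one material per model. The function names `texture_vram` and `scene_vram` are made up for illustration.

```python
BYTES_PER_PIXEL = 4      # uncompressed RGBA, one byte per channel (assumption)
MIPMAP_FACTOR = 4 / 3    # a full mip chain adds roughly one third
MB = 1_000_000
GB = 1_000_000_000


def texture_vram(width, height):
    """Approximate bytes of VRAM for one uncompressed, mipmapped RGBA texture."""
    return width * height * BYTES_PER_PIXEL * MIPMAP_FACTOR


def scene_vram(texture_size, textures_per_material, model_count):
    """Bytes of texture VRAM for model_count models with one material each."""
    per_material = textures_per_material * texture_vram(texture_size, texture_size)
    return model_count * per_material


one_texture = texture_vram(2048, 2048)   # about 22MB
one_material = scene_vram(2048, 6, 1)    # about 134MB (6 x 22 rounds to 132)
ten_models = scene_vram(2048, 6, 10)     # about 1.3GB

for budget_gb in (2, 4):
    share = ten_models / (budget_gb * GB)
    print(f"10 models use {share:.0%} of a {budget_gb}GB budget")
```

Under these assumptions, ten such models would take roughly two thirds of a 2GB budget and a third of a 4GB one, before counting geometry, framebuffers, or anything else the page needs.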

Break it down: where the bytes go

A typical GLB model has three main parts: vertex data, textures, and metadata plus animations.

Here is how the bytes typically split in a PBR model:

| Component | What it is | Typical share | Notes |
| --- | --- | --- | --- |
| Textures | albedo, normal, roughness, metallic, AO, etc. | 70-85% | Almost always the bulk |
| Vertex data | position, normal, UV, tangent, color | 10-20% | Depends on model complexity |
| Animation data | skeletons, skinning, keyframes | 0-15% | Only when animated |
| Other | material defs, scene structure, cameras | < 2% | Negligible |

Textures take up about 80% of the file. It's tempting to assume the vertices are what need optimizing, but the textures are what's really hogging the space.

"Small on disk" doesn't mean "small in VRAM"

This might be the single most important point in understanding 3D performance.

PNG and JPG are designed for network transfer—tiny on disk, fast to download. But the GPU can't use them directly; they must be fully decompressed into raw pixels first. The formula:

VRAM = width * height * 4 bytes (RGBA) * 1.333 (with mipmaps)

A 4096x4096 RGBA texture:

| Metric | Value |
| --- | --- |
| PNG file size | ~8MB |
| JPG file size | ~1.5MB |
| VRAM usage (with mipmaps) | ~89MB |

A 1.5MB JPG becomes roughly 89MB in VRAM.

What are mipmaps? The GPU generates a series of progressively smaller versions of a texture, from the original size down to a single 1x1 pixel, each level half the width and height of the previous one (so a quarter of the pixels). This lets distant objects render faster and with less shimmering, but costs about 33% more VRAM. Nearly every 3D app uses mipmaps, so this overhead is essentially standard.
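To see where the "about 33%" comes from, the mip chain can be enumerated level by level. This is a minimal sketch assuming uncompressed RGBA at 4 bytes per pixel; `mip_chain` and `chain_bytes` are illustrative helper names, not part of any graphics API.

```python
def mip_chain(width, height):
    """Return the (width, height) of every mip level, down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        # Each level halves both dimensions, never going below one pixel.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def chain_bytes(width, height, bytes_per_pixel=4):
    """Total bytes for the base texture plus all of its mip levels."""
    return sum(w * h * bytes_per_pixel for w, h in mip_chain(width, height))


levels = mip_chain(4096, 4096)
base = 4096 * 4096 * 4             # bytes for the full-size level only
total = chain_bytes(4096, 4096)    # bytes for the whole chain
overhead = total / base - 1        # extra cost of the smaller levels

print(len(levels), "levels,", f"{overhead:.1%} overhead,", f"{total / 1e6:.1f}MB total")
```

Because each level has a quarter of the pixels of the one before it, the extra levels add up to just under one third of the base size, which is where the 1.333 multiplier comes from.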

So PNG and JPG are like vacuum storage bags—compressed small for the trip, but you have to fully expand them at your destination. The download got faster, but VRAM savings: zero.

What happens when VRAM runs out

You won't get a neat "out of memory" dialog. Reality is worse:

  • Mobile: white screen, or the OS kills the tab outright
  • VR headsets: frame drops. In VR, a dropped frame isn't "a little laggy"—it triggers motion sickness
  • Desktop: texture flickering, degradation, slower rendering

One developer on Reddit described building a WebXR gallery that crammed 60 stereoscopic images onto a Quest. It worked at first, then grew increasingly unstable until it crashed. He spent days debugging his code before realizing he'd never seriously thought about VRAM: he'd just kept shoving JPGs at the GPU.

Two paths for compression

3D model compression splits into two directions:

Vertex compression—stores geometry data (vertex coordinates, normals, UVs) more compactly. For example, swapping 32-bit floats for 16-bit integers (this is called quantization). Representative solutions: Draco, MeshOpt, KHR_mesh_quantization.
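As a rough illustration of quantization, the sketch below maps a handful of made-up coordinates from 32-bit floats onto 16-bit unsigned integers spanning their min-max range, then maps them back. It shows the general idea only and is not the exact scheme used by Draco, MeshOpt, or KHR_mesh_quantization; `quantize` and `dequantize` are hypothetical helpers.

```python
from array import array

LEVELS = 65535  # largest value a 16-bit unsigned integer can hold


def quantize(values):
    """Map floats onto 16-bit unsigned integers spanning [min, max]."""
    lo, hi = min(values), max(values)
    scale = LEVELS / (hi - lo) if hi > lo else 0.0
    codes = array("H", (round((v - lo) * scale) for v in values))
    return codes, lo, hi


def dequantize(codes, lo, hi):
    """Recover approximate floats from the integer codes."""
    step = (hi - lo) / LEVELS
    return [lo + c * step for c in codes]


xs = [-1.25, -0.5, 0.0, 0.3333, 2.75]   # made-up x coordinates
as_float32 = array("f", xs)              # 4 bytes per value on typical platforms
codes, lo, hi = quantize(xs)             # 2 bytes per value on typical platforms
restored = dequantize(codes, lo, hi)
max_error = max(abs(a - b) for a, b in zip(xs, restored))

print(as_float32.itemsize, "->", codes.itemsize, "bytes per value")
print("largest round-trip error:", max_error)
```

The storage per value halves, and the price is a small rounding error bounded by half of one quantization step. That error is the "precision loss" that makes this kind of compression lossy.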

Texture compression—keeps textures compressed even inside VRAM. The GPU decodes individual pixels on the fly while sampling, with almost no performance cost. Representative solution: KTX2 + Basis Universal.
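The VRAM saving can be estimated with simple block arithmetic. The sketch below assumes a GPU format that stores each 4x4 pixel block in 16 bytes (8 bits per pixel), which is the layout of formats such as BC7 and ASTC 4x4. Which format a Basis Universal texture actually ends up in depends on the device, so treat the numbers as indicative; `block_compressed_bytes` is an illustrative name.

```python
import math


def block_compressed_bytes(width, height, block=4, bytes_per_block=16):
    """Bytes for one mip level stored as fixed-size compressed blocks."""
    blocks_x = math.ceil(width / block)
    blocks_y = math.ceil(height / block)
    return blocks_x * blocks_y * bytes_per_block


raw = 2048 * 2048 * 4                          # uncompressed RGBA, 4 bytes per pixel
compressed = block_compressed_bytes(2048, 2048)
saving = 1 - compressed / raw

print(f"{raw / 1e6:.1f}MB raw vs {compressed / 1e6:.1f}MB compressed ({saving:.0%} less)")
```

At 8 bits per pixel instead of 32, the texture needs a quarter of the memory. Because both the raw and compressed sizes scale by the same mipmap factor, the ratio stays the same with mipmaps enabled.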

| | Vertex compression | Texture compression |
| --- | --- | --- |
| What it reduces | Geometry data | Textures |
| Typical effect | 50-90% smaller | 50-70% smaller on disk, 75% less VRAM |
| Lossy? | Yes, precision loss | Yes, quality loss |
| Best for | Vertex-dense models | Almost every PBR model |
| See | Part 2 | Parts 3 and 4 |

A common misconception: compress the vertices with Draco and call it done. But textures are 80% of the model's size—halve the vertices and the whole thing might only shrink 10%. You have to work both ends.
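The 10% figure falls out of a weighted sum. The sketch below simplifies the model to 80% textures and 20% vertex data, ignoring animation and metadata, and the 60% texture reduction is an assumed midpoint chosen for illustration rather than a measured result.

```python
def overall_reduction(shares, reductions):
    """Fraction by which the whole file shrinks.

    shares:     each component's share of the original size
    reductions: fraction removed from each component (default: none)
    """
    return sum(share * reductions.get(name, 0.0) for name, share in shares.items())


shares = {"textures": 0.80, "vertices": 0.20}   # simplified split (assumption)

vertices_only = overall_reduction(shares, {"vertices": 0.5})
both_ends = overall_reduction(shares, {"vertices": 0.5, "textures": 0.6})

print(f"halving vertices only: {vertices_only:.0%} smaller overall")
print(f"vertices and textures: {both_ends:.0%} smaller overall")
```

Halving a component that is only a fifth of the file can never save more than a tenth of it, which is why the texture side has to be part of the plan.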

No single method wins everywhere

This is the core thesis of the entire series:

Different platforms, different devices, different use cases call for different compression strategies.

| Scenario | Primary bottleneck | Focus |
| --- | --- | --- |
| Desktop web showcase | Download speed | File size |
| Mobile browser | VRAM | Texture compression |
| VR headset | VRAM + frame rate | Texture compression + vertex simplification |
| WeChat Mini Program | Package size + compatibility | Lightweight options (MeshOpt) |
| Large scenes | VRAM + draw calls | Full-stack compression + LOD |

The upcoming articles won't just say "use X and you're done." Each one will explain where X shines, when it actually backfires, and when you should reach for Y instead.

What's next

This article nailed down the problem. The next one gets hands-on with the three tools of vertex compression: quantization, MeshOpt, and Draco.
