How to Render an Anime Character with WebGPU

You've tried three.js or babylon.js and wanted to understand what's happening under the hood. You've looked at WebGPU tutorials and seen the "Hello Triangle" example, but you still don't know how to render a real 3D character from scratch. This tutorial bridges that gap. In five incremental steps, you'll go from a simple triangle to a fully textured, animated anime character—learning the complete rendering pipeline along the way: geometry buffers, camera transforms, materials and textures, skeletal animation, and the render loop that ties it all together.

We focus on understanding the GPU pipeline itself, not implementation details. The real challenge isn't the math or the shaders—AI can generate those. The challenge is learning a different mental model: you need to know what components exist (buffers, bind groups, pipelines, render passes), how they connect (which data goes where, in what order), and why they're necessary (when to use uniform vs storage buffers, how textures flow from CPU to GPU). By the end, you'll have built a working renderer and understand the architecture behind engines like the Reze Engine. Full source code is available here.

Engine v0: Your First Triangle

Let's start with the classic Hello Triangle—not because it's exciting, but because it's the simplest example that shows every essential component of the WebGPU pipeline. Once you understand how these pieces connect here, scaling up to complex models is just adding more data, not learning new concepts.

Think of the GPU as a separate computer with its own memory and instruction set. Unlike JavaScript where you pass data directly to functions, working with the GPU involves cross-boundary communication—you need to be explicit about:

  • The data to process: vertices
  • Where to get the data from: buffer
  • How to process it: shaders and pipeline
  • The main entry point: render pass

Let's look at the first Engine class, engines/v0.ts. The code follows the standard WebGPU initialization pattern (a condensed sketch appears right after this list):

  1. Request a GPU device and set up a rendering context on the canvas
  2. Allocate a GPU buffer and write the positions of our 3 vertices into it using writeBuffer
  3. Define shaders: the vertex shader processes each vertex, and the fragment shader determines the color of each pixel
  4. Bundle these shaders with metadata about the buffer layout into a pipeline
  5. Create a render pass that executes the pipeline and produces the triangle on screen
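Here's a condensed sketch of those five steps in one place, with error handling skipped and the shader kept trivial. The canvas id and the hardcoded positions and colors are illustrative, not the exact code in engines/v0.ts:

// Runs inside an async function; assumes a <canvas id="canvas"> exists on the page.
const adapter = await navigator.gpu.requestAdapter()
if (!adapter) throw new Error("WebGPU is not supported in this browser")
const device = await adapter.requestDevice()

// 1. Set up the rendering context on the canvas
const canvas = document.getElementById("canvas") as HTMLCanvasElement
const context = canvas.getContext("webgpu") as GPUCanvasContext
const format = navigator.gpu.getPreferredCanvasFormat()
context.configure({ device, format })

// 2. Upload three 2D vertex positions to a GPU buffer
const vertices = new Float32Array([0.0, 0.5, -0.5, -0.5, 0.5, -0.5])
const vertexBuffer = device.createBuffer({
  size: vertices.byteLength,
  usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
})
device.queue.writeBuffer(vertexBuffer, 0, vertices)

// 3. Shaders: the vertex shader forwards positions, the fragment shader outputs a flat color
const shaderModule = device.createShaderModule({
  code: `
    @vertex fn vs(@location(0) pos: vec2f) -> @builtin(position) vec4f {
      return vec4f(pos, 0.0, 1.0);
    }
    @fragment fn fs() -> @location(0) vec4f {
      return vec4f(1.0, 0.0, 0.0, 1.0);
    }
  `,
})

// 4. Bundle the shaders and the vertex buffer layout into a pipeline
const pipeline = device.createRenderPipeline({
  layout: "auto",
  vertex: {
    module: shaderModule,
    entryPoint: "vs",
    buffers: [{ arrayStride: 8, attributes: [{ shaderLocation: 0, offset: 0, format: "float32x2" }] }],
  },
  fragment: { module: shaderModule, entryPoint: "fs", targets: [{ format }] },
})

// 5. One render pass that runs the pipeline and draws the triangle
const encoder = device.createCommandEncoder()
const pass = encoder.beginRenderPass({
  colorAttachments: [
    {
      view: context.getCurrentTexture().createView(),
      clearValue: { r: 0, g: 0, b: 0, a: 1 },
      loadOp: "clear",
      storeOp: "store",
    },
  ],
})
pass.setPipeline(pipeline)
pass.setVertexBuffer(0, vertexBuffer)
pass.draw(3)
pass.end()
device.queue.submit([encoder.finish()])

Every later version of the engine keeps this same skeleton; it only adds more buffers, more bind groups, and a loop around the render pass.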

Engine v1: Add a Camera and Make it 3D

The first example draws a single static frame. To make it 3D, we need two things: a camera and a render loop that generates continuous frames. The camera isn't a 3D object—it's a pair of transformation matrices (view and projection) that convert 3D world coordinates into 2D screen coordinates, creating the illusion of depth. Unlike in three.js or babylon.js, WebGPU doesn't have a built-in camera object, so we manage these matrices ourselves.

Here's the camera class we use throughout the tutorial and in the Reze Engine: lib/camera.ts. The implementation details aren't important (hand them off to AI)—just know that it calculates view and projection matrices that update in response to mouse input (movement, zooming, and panning).

Now look at the second Engine class engines/v1.ts. To pass camera matrices from JavaScript to the shader, we use a uniform buffer—a chunk of GPU memory that acts like a global variable accessible to all shaders. First, we write the camera data to the buffer:

this.device.queue.writeBuffer(this.cameraUniformBuffer, 0, this.cameraMatrixData)
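For this write to land somewhere, the buffer has to be created first, with UNIFORM usage and a size that matches the camera data: two 4×4 matrices, the view position, and one float of padding, 144 bytes in total (the shader struct below spells out this layout). A sketch of how that setup might look during initialization:

// 2 × mat4x4f (64 bytes each) + vec3f viewPos (12 bytes) + 1 float of padding (4 bytes) = 144 bytes
this.cameraMatrixData = new Float32Array(36)
this.cameraUniformBuffer = this.device.createBuffer({
  label: "camera uniform buffer",
  size: this.cameraMatrixData.byteLength, // 144
  usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
})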

Next, we create a bind group that tells the GPU where to find this buffer, and attach it to the render pass:

this.bindGroup = this.device.createBindGroup({
  label: "bind group layout",
  layout: this.pipeline.getBindGroupLayout(0),
  entries: [{ binding: 0, resource: { buffer: this.cameraUniformBuffer } }],
})
pass.setBindGroup(0, this.bindGroup);

Finally, in the shader, we define a struct matching the buffer's memory layout:

struct CameraUniforms {
  view: mat4x4f,
  projection: mat4x4f,
  viewPos: vec3f,
  _padding: f32,
};

@group(0) @binding(0) var<uniform> camera: CameraUniforms;

Now the shader can access camera.view and camera.projection directly. In the vertex shader, we multiply each vertex position by these matrices:

@vertex
fn vs(@location(0) position: vec2<f32>) -> @builtin(position) vec4<f32> {
  return camera.projection * camera.view * vec4f(position, 0.0, 1.0);
}            
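One detail worth internalizing: the Float32Array you upload has to match the WGSL struct's memory layout exactly, including the trailing float that pads viewPos out to 16 bytes. A sketch of how cameraMatrixData might be packed each frame (viewMatrix, projectionMatrix, and position are placeholder names for whatever your camera class exposes):

// Offsets are in floats: view at 0, projection at 16, viewPos at 32, _padding at 35
this.cameraMatrixData.set(this.camera.viewMatrix, 0)        // mat4x4f view
this.cameraMatrixData.set(this.camera.projectionMatrix, 16) // mat4x4f projection
this.cameraMatrixData.set(this.camera.position, 32)         // vec3f viewPos
// index 35 is _padding: the shader never reads it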

This uniform buffer pattern is fundamental in WebGPU—you'll use it to pass any data from CPU to GPU, including lighting parameters, material properties, and transformation matrices.

Engine v2: Render Character Geometry

Now we move from a hardcoded triangle to actual model geometry. We're using pre-parsed data from a PMX model—the standard format for MMD (MikuMikuDance) anime characters. MMD is widely used for anime-style character modeling, with massive fan communities creating models from popular games like Genshin Impact (原神) and Aether Gazer (深空之眼). The parser itself isn't covered here (any model format works; use AI to generate parsers as needed). What matters is understanding the two key data structures: vertices and indices.

Each vertex in the model contains three types of data, stored sequentially in memory (this is called interleaved vertex data):

  • Position: [x, y, z] coordinates in 3D space
  • Normal: [nx, ny, nz] direction perpendicular to the surface (used for lighting)
  • UV coordinates: [u, v] texture mapping coordinates (tells which part of a texture image to display)

The index buffer specifies which vertices form each triangle—instead of duplicating vertex data, we reference existing vertices by their indices. This dramatically reduces memory usage.

In engines/v2.ts, we create both vertex and index buffers from the model data. Look for the initVertexBuffers method:

private initVertexBuffers() {
  const vertices = Float32Array.from(this.model.vertices)
  this.vertexBuffer = this.device.createBuffer({
    label: "model vertex buffer",
    size: vertices.byteLength,
    usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
  })
  this.device.queue.writeBuffer(this.vertexBuffer, 0, vertices.buffer)

  // Create index buffer
  const indices = Uint32Array.from(this.model.indices)
  this.indexBuffer = this.device.createBuffer({
    label: "model index buffer",
    size: indices.byteLength,
    usage: GPUBufferUsage.INDEX | GPUBufferUsage.COPY_DST,
  })
  this.device.queue.writeBuffer(this.indexBuffer, 0, indices.buffer)
}
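These buffers hold raw bytes; it's the pipeline's vertex state that tells the GPU how to slice each 32-byte vertex into attributes. A sketch of the layout matching the interleaved data above (locations 0 and 2 match the shaders shown later in the tutorial; location 1 for the normal is an assumption here):

const vertexBufferLayout: GPUVertexBufferLayout = {
  arrayStride: 8 * 4, // 3 position + 3 normal + 2 uv floats = 32 bytes per vertex
  attributes: [
    { shaderLocation: 0, offset: 0, format: "float32x3" },  // position
    { shaderLocation: 1, offset: 12, format: "float32x3" }, // normal
    { shaderLocation: 2, offset: 24, format: "float32x2" }, // uv
  ],
}

This object goes into the buffers array of the pipeline's vertex stage when the pipeline is created.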

The key change is using indexed drawing instead of direct drawing. The render pass now calls drawIndexed and specifies the index buffer:

pass.setVertexBuffer(0, this.vertexBuffer)
pass.setIndexBuffer(this.indexBuffer, "uint32")
pass.drawIndexed(this.model.indices.length) // draw all triangles using indices

The result is the character rendered as a solid red silhouette. Without textures (coming next), we see only the raw geometry. But this is a major milestone—we've gone from 3 hardcoded vertices to rendering a complex model with thousands of triangles.

Engine v3: Material and Texture

Now we add textures to bring color and detail to the character. This introduces two important concepts: materials and textures.

A material covers a group of triangles (a contiguous range of the index buffer) and specifies which texture and visual parameters to use when drawing them. In a character model, materials typically correspond to parts like the face, hair, and clothes.
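In code, a material boils down to a small record. A sketch limited to the two fields this tutorial actually reads (the real parser exposes more):

interface Material {
  diffuseTextureIndex: number // index into the model's texture list
  vertexCount: number         // how many indices this material covers in the index buffer
}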

A texture is an image file that contains color data. Each vertex has UV coordinates that map it to a location in the texture. The fragment shader samples the texture using these coordinates to determine the color for each pixel.

In engines/v3.ts, we first load texture images and create GPU textures. Look for the initTexture method. We fetch each image file, create an ImageBitmap, then create a GPUTexture and upload the image data:

const response = await fetch(texturePath) // texturePath: resolved from the model's texture list (name is illustrative)
const imageBitmap = await createImageBitmap(await response.blob())
const texture = this.device.createTexture({
  size: [imageBitmap.width, imageBitmap.height],
  format: "rgba8unorm",
  usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST,
})
this.device.queue.copyExternalImageToTexture({ source: imageBitmap }, { texture }, [
  imageBitmap.width,
  imageBitmap.height,
])

Next, we create a sampler that defines how the texture should be sampled (filtering, wrapping, etc.):

this.sampler = this.device.createSampler({
  magFilter: "linear",
  minFilter: "linear",
  addressModeU: "repeat",
  addressModeV: "repeat",
})

In the shader, we need to pass UV coordinates from the vertex shader to the fragment shader. We define a VertexOutput struct to bundle the position and UV together:

struct VertexOutput {
  @builtin(position) position: vec4<f32>,
  @location(0) uv: vec2<f32>,
}

@vertex
fn vs(@location(0) position: vec3<f32>, @location(2) uv: vec2<f32>) -> VertexOutput {
  var output: VertexOutput;
  output.position = camera.projection * camera.view * vec4f(position, 1.0);
  output.uv = uv;
  return output;
}

The fragment shader receives the interpolated UV coordinates and samples the texture with textureSample. The texture and textureSampler it uses are shader resources declared at group 1, which we bind next:

@group(1) @binding(0) var texture: texture_2d<f32>;
@group(1) @binding(1) var textureSampler: sampler;

@fragment
fn fs(input: VertexOutput) -> @location(0) vec4<f32> {
  return vec4<f32>(textureSample(texture, textureSampler, input.uv).rgb, 1.0);
}

To bind textures to the shader, we create a bind group for each material with its texture and sampler. We add this as a second bind group alongside the camera uniform:

for (const material of this.model.materials) {
  const textureIndex = material.diffuseTextureIndex
  const materialBindGroup = this.device.createBindGroup({
    layout: this.pipeline.getBindGroupLayout(1),
    entries: [
      { binding: 0, resource: this.textures[textureIndex].createView() },
      { binding: 1, resource: this.sampler },
    ],
  })
  this.materialBindGroups.push(materialBindGroup)
}

Finally, we render each material separately. Instead of one drawIndexed call for the entire model, we iterate through materials, set each material's bind group, and draw its triangles:

let firstIndex = 0
for (let i = 0; i < this.model.materials.length; i++) {
  const material = this.model.materials[i]
  if (material.vertexCount === 0) continue

  pass.setBindGroup(1, this.materialBindGroups[i])
  pass.drawIndexed(material.vertexCount, 1, firstIndex)
  firstIndex += material.vertexCount
}

The result transforms our red model into a fully textured character.

However, you might notice the character appears transparent or you can see through to the back faces. This happens because without depth testing, the GPU draws triangles in the order they're submitted—far triangles can draw over near ones. The fix is surprisingly simple—just three steps: create a depth texture, add it to the render pass, and configure the pipeline. No shader changes needed:

// Create depth texture
this.depthTexture = this.device.createTexture({
  size: [width, height],
  format: "depth24plus",
  usage: GPUTextureUsage.RENDER_ATTACHMENT,
})

// Add to render pass
depthStencilAttachment: {
  view: this.depthTexture.createView(),
  depthClearValue: 1.0,
  depthLoadOp: "clear",
  depthStoreOp: "store",
}

// Add to pipeline
depthStencil: {
  depthWriteEnabled: true,
  depthCompare: "less",
  format: "depth24plus",
}

The complete implementation is in engines/v3_2.ts. With materials, textures, and depth testing in place, we now have a complete static rendering pipeline. The character is fully textured and looks solid from any angle.

Engine v4: Bones and Skinning

Bones and Hierarchy

A bone is a transform in a hierarchy. Each bone has a parent (except the root), and moving a parent bone moves all its children. In MMD models, a typical arm chain looks like:

センター (center) → 上半身 (upper_body) → 右肩 (shoulder_R) → 右腕 (arm_R) → 右ひじ (elbow_R) → 右手首 (wrist_R) → finger joints

When you rotate 上半身 (upper_body), the entire upper body—shoulders, arms, elbows, wrists, and fingers—all follow. This cascading effect happens because each bone's transform is relative to its parent.

Skinning: Connecting Bones to Vertices

Skinning is how bones deform the mesh. Each vertex stores up to 4 bone indices and 4 weights that sum to 1.0. When bones move, the vertex's final position is a weighted blend:

// Vertex data
joints:  [15, 16, 0, 0]    // Bone indices
weights: [0.7, 0.3, 0, 0]  // 70% from bone 15, 30% from bone 16

// Final position = weighted sum of each bone's transform
finalPosition = (skinMatrix[15] * position) * 0.7 
              + (skinMatrix[16] * position) * 0.3

The skinMatrix for each bone is its current world matrix multiplied by its inverse bind matrix (the inverse of the bone's rest-pose transform). This product moves a vertex from its original modeled position to wherever the posed bone carries it, which is what allows smooth deformation as bones rotate.

CPU-Side Bone Control

Bones live on the CPU. Animations, physics, and user input all update bone rotations here. When you rotate a bone, the engine recalculates the hierarchy (parent-to-child transforms) and uploads the results to GPU:

// Your game code: rotate the neck bone
engine.rotateBone("首", rotation)

// Internally, this triggers:
// 1. evaluatePose() - recalculate all world matrices from hierarchy
// 2. Upload world matrices to GPU
// 3. Compute pass - calculate skin matrices
// 4. Next render uses updated skinning
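A minimal sketch of step 1's hierarchy walk, assuming each bone stores a column-major 4×4 local matrix and a parent index, and that bones are ordered parent-before-child (the engine's real evaluatePose also folds in animation data and other constraints):

interface Bone {
  parentIndex: number       // -1 for the root
  localMatrix: Float32Array // 4x4, column-major, relative to the parent
  worldMatrix: Float32Array // 4x4, filled in by evaluatePose
}

// out = a * b for 4x4 column-major matrices
function mat4Multiply(out: Float32Array, a: Float32Array, b: Float32Array) {
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0
      for (let k = 0; k < 4; k++) sum += a[k * 4 + row] * b[col * 4 + k]
      out[col * 4 + row] = sum
    }
  }
}

// Each bone's world matrix is its parent's world matrix times its own local matrix.
function evaluatePose(bones: Bone[]) {
  for (const bone of bones) {
    if (bone.parentIndex < 0) {
      bone.worldMatrix.set(bone.localMatrix)
    } else {
      mat4Multiply(bone.worldMatrix, bones[bone.parentIndex].worldMatrix, bone.localMatrix)
    }
  }
}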

Compute Shaders: Parallel Matrix Calculations

With hundreds of bones and thousands of vertices, calculating skin matrices on the CPU is too slow. This is where compute shaders shine—a key WebGPU advantage over WebGL. Compute shaders run massively parallel calculations on the GPU, perfect for matrix operations.

We upload bone matrices to storage buffers, then dispatch a compute shader to calculate all skin matrices in parallel. For a model with 471 bones, this means 471 matrix multiplications happening simultaneously on the GPU:

@group(0) @binding(1) var<storage, read> worldMatrices: array<mat4x4f>;
@group(0) @binding(2) var<storage, read> inverseBindMatrices: array<mat4x4f>;
@group(0) @binding(3) var<storage, read_write> skinMatrices: array<mat4x4f>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) globalId: vec3<u32>) {
  let boneIndex = globalId.x;
  if (boneIndex >= boneCount.count) { return; }
  
  skinMatrices[boneIndex] = worldMatrices[boneIndex] * inverseBindMatrices[boneIndex];
}
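On the CPU side, running this shader is a short compute pass recorded each frame. A sketch, where skinningPipeline, skinningBindGroup, and boneCount are placeholder names for objects created during initialization:

const encoder = this.device.createCommandEncoder()

const computePass = encoder.beginComputePass()
computePass.setPipeline(this.skinningPipeline)
computePass.setBindGroup(0, this.skinningBindGroup)
// One invocation per bone; groups of 64 match @workgroup_size(64) in the shader
computePass.dispatchWorkgroups(Math.ceil(this.boneCount / 64))
computePass.end()

// The render pass that consumes skinMatrices can be recorded on the same encoder
this.device.queue.submit([encoder.finish()])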

Putting It Together

Here's the complete flow each frame (see the full implementation in engines/v4.ts):

  1. CPU: Animation or user input updates bone rotations
  2. CPU: evaluatePose() walks the hierarchy to calculate world matrices
  3. CPU → GPU: Upload world matrices to storage buffer
  4. GPU compute pass: Calculate skinMatrix = world × inverseBind for all bones in parallel
  5. GPU render pass: Vertex shader reads skin matrices and blends each vertex by its bone weights

The vertex shader performs the final skinning calculation:

@group(0) @binding(1) var<storage, read> skinMats: array<mat4x4f>;

@vertex
fn vs(
  @location(0) position: vec3<f32>,
  @location(3) joints: vec4<u32>,
  @location(4) weights: vec4<f32>
) -> VertexOutput {
  var output: VertexOutput;

  // Blend position by bone influences
  var skinnedPos = vec4f(0.0);
  for (var i = 0u; i < 4u; i++) {
    skinnedPos += (skinMats[joints[i]] * vec4f(position, 1.0)) * weights[i];
  }

  output.position = camera.projection * camera.view * skinnedPos;
  // uv and the other outputs are forwarded exactly as in v3 (omitted here)
  return output;
}

Try rotating the waist and neck bones with the sliders above to see skeletal skinning in action.

Conclusion

You've now built a complete WebGPU rendering pipeline—from a simple triangle to a fully textured, skeletal-animated character. You understand the core components (buffers, bind groups, pipelines, render passes), how they connect (CPU to GPU data flow, shader interfaces), and why they're designed this way (uniform vs storage buffers, compute shaders for parallel work).

This tutorial focused on WebGPU fundamentals. Advanced features like physics simulation, inverse kinematics, dynamic lighting, bloom, and post-processing build on these same concepts—they're application-level features, not new WebGPU primitives. You can explore these in the Reze Engine source code, which extends what you've learned here into a full-featured anime character renderer.