sip

Small Image Processor

Ultra-low-memory WASM image processing for Cloudflare Workers.

Try it

Upload an image or use the sample below. A Cloudflare Worker will process your image and report back the memory used for the operation.

What is sip?

sip is an image processing library built specifically for Cloudflare Workers. Workers have a 128 MB memory limit, and most image libraries exceed it the moment they decode a large photo: a 25-megapixel JPEG decoded at 4 bytes per pixel becomes ~100 MB of buffered pixels in memory.
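To make the arithmetic concrete, here is a minimal sketch of the buffer-size estimate. It assumes a fully decoded buffer at 4 bytes per pixel (RGBA); `decodedBytes` and the 6250 x 4000 dimensions are illustrative and not part of sip.

```typescript
// Illustrative helper (not part of sip): size of a fully decoded pixel buffer.
function decodedBytes(width: number, height: number, bytesPerPixel = 4): number {
  return width * height * bytesPerPixel
}

const WORKER_LIMIT_BYTES = 128 * 1024 * 1024 // the 128 MB Worker memory limit

// A hypothetical 25-megapixel photo: 6250 x 4000 pixels.
const rgbaBytes = decodedBytes(6250, 4000)    // 100,000,000 bytes
const rgbBytes = decodedBytes(6250, 4000, 3)  // 75,000,000 bytes

console.log(`${rgbaBytes / 1e6} MB as RGBA, ${rgbBytes / 1e6} MB as RGB`)
console.log(`RGBA buffer fits under the limit: ${rgbaBytes < WORKER_LIMIT_BYTES}`)
```

The buffer alone technically fits, but it leaves little room for the compressed input, the encoder, and the rest of the Worker.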

sip avoids that by processing images one row at a time. It never holds the full decoded image in memory. For JPEG inputs it can even decode at a reduced resolution using DCT scaling, so a 6800px-wide photo might decode at only 850px wide (1/8 scale) internally.
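One plausible way to choose the DCT scale is sketched below. This is an assumption about the general technique, not sip's actual code: `pickDctDenominator` and `scaledSize` are hypothetical names, and the rounding follows libjpeg-turbo's round-up behaviour for scaled dimensions.

```typescript
// Illustrative sketch (not sip's actual code): choose a JPEG DCT scale denominator.
type Size = { width: number; height: number }

// Picks the largest reduction (1/8, 1/4, 1/2) whose decoded size still
// covers the target, so the later resize step only ever scales down.
function pickDctDenominator(source: Size, target: Size): 1 | 2 | 4 | 8 {
  for (const denom of [8, 4, 2] as const) {
    const w = Math.ceil(source.width / denom)
    const h = Math.ceil(source.height / denom)
    if (w >= target.width && h >= target.height) return denom
  }
  return 1
}

// Dimensions the decoder would produce at a given denominator.
function scaledSize(source: Size, denom: number): Size {
  return {
    width: Math.ceil(source.width / denom),
    height: Math.ceil(source.height / denom),
  }
}

const photo: Size = { width: 6800, height: 4533 } // hypothetical source dimensions
const denom = pickDctDenominator(photo, { width: 800, height: 533 })
console.log(denom, scaledSize(photo, denom)) // 8 { width: 850, height: 567 }
```

Decoding at 1/8 scale means each decoded row is 850 pixels wide instead of 6800, so every later stage handles one eighth of the data per row.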

The output is always JPEG. You give sip an image (JPEG, PNG, WebP, or AVIF), tell it the max dimensions and quality you want, and it gives you back a resized JPEG.

Why sip?

Cloudflare already has built-in image processing, but it can still be useful to run transforms directly inside your own Worker or Durable Object. That can mean fewer bindings to manage, better isolation inside the code that already owns the request, and easier distribution when you want image processing packaged as part of your application instead of a separate service boundary.

Early Access

From the team behind FormKit, ArrowJS, dmux, Tempo, and AutoAnimate — Standard Agents is an open standard for creating domain-specific agents you can distribute and compose together to form safe, efficient, and effective agents. Join the early access list.

Installation

pnpm add @standardagents/sip

sip ships as ESM with TypeScript types included. You also need the WASM module loaded before processing — see WASM Build for setup.

API

Everything is a named export from @standardagents/sip. Most use cases only need transform and toResponse. The rest are there when you need more control.

ready(options?)

Loads the WASM module. Call this once when your Worker starts up and cache the promise. You can pass a pre-compiled WebAssembly.Module or raw bytes if you need to override the default loader. In Workers and workerd, the normal pattern is just await ready().

The workerd build wires up the bundled WASM for you. ready() is idempotent, so awaiting it at the top of your request handler is also fine: calls after the first reuse the module that is already loaded.

import { ready } from '@standardagents/sip'

// Normal Worker / workerd usage
await ready()

// Optional escape hatches if you need to wire the module yourself
await ready({ wasm: compiledModule })
await ready({ wasm: wasmArrayBuffer })
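The "cache the promise" advice can be sketched as a small memoization wrapper. In this sketch `loadWasm` is a hypothetical stand-in for ready() so the block runs on its own. Since ready() is described as idempotent, a wrapper like this is optional.

```typescript
// Illustrative memoization sketch. `loadWasm` stands in for sip's ready();
// it is a hypothetical placeholder so the block runs on its own.
let loadCount = 0
async function loadWasm(): Promise<void> {
  loadCount++ // a real loader would compile and instantiate the module here
}

let readyPromise: Promise<void> | undefined

// Returns the same promise on every call, so the loader runs at most once
// per isolate even if many requests arrive concurrently.
function ensureReady(): Promise<void> {
  if (!readyPromise) readyPromise = loadWasm()
  return readyPromise
}

const first = ensureReady()
const second = ensureReady()
console.log(first === second, loadCount) // true 1
```

One design note: a cached promise also caches a failure, so a production wrapper might clear `readyPromise` when loading rejects.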

inspect(input)

Reads just enough bytes to figure out the format, dimensions, and whether the image has alpha. Doesn't decode the whole thing. Returns the metadata plus a source you can pass into transform or decode later.

import { inspect } from '@standardagents/sip'

// Accepts any ByteInput: Request, Response, ReadableStream,
// ArrayBuffer, Uint8Array, Blob, or AsyncIterable<Uint8Array>
const { info, source } = await inspect(request)

info.format    // 'jpeg' | 'png' | 'webp' | 'avif'
info.width     // pixel width
info.height    // pixel height
info.hasAlpha  // boolean

Useful when you want to validate or reject images before doing the expensive work. Throws if the format isn't recognized.
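A validation step built on that metadata might look like the sketch below. The `ImageInfo` type is a local copy of the shape listed under Types. `rejectReason` and both megapixel limits are hypothetical policy choices, not part of sip.

```typescript
// Local stand-in for sip's ImageInfo shape, as listed under Types.
type ImageInfo = {
  format: 'jpeg' | 'png' | 'webp' | 'avif'
  width: number
  height: number
  hasAlpha: boolean
}

// Hypothetical policy: returns a rejection reason, or null if the image is acceptable.
function rejectReason(info: ImageInfo, maxMegapixels = 50): string | null {
  const megapixels = (info.width * info.height) / 1_000_000
  if (megapixels > maxMegapixels) {
    return `image is ${megapixels.toFixed(1)} MP, limit is ${maxMegapixels} MP`
  }
  // WebP and AVIF are fully decoded, so apply a tighter budget (assumed value).
  const fullDecode = info.format === 'webp' || info.format === 'avif'
  if (fullDecode && megapixels > 12) {
    return `${info.format} inputs are limited to 12 MP`
  }
  return null
}

console.log(rejectReason({ format: 'jpeg', width: 6800, height: 4533, hasAlpha: false }))
console.log(rejectReason({ format: 'webp', width: 6800, height: 4533, hasAlpha: true }))
```

In a Worker you would call this on the `info` returned by inspect() and respond with an error status before passing `source` to transform().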

transform(input, options?)

The main function. Takes any supported input, decodes it, resizes it, and encodes it as JPEG — all in one call. Returns an EncodedImage, which is an async iterable of JPEG chunks. Nothing actually runs until you start consuming it.

import { transform } from '@standardagents/sip'

// One-shot: decode → resize → encode as JPEG
const image = transform(input, {
  width: 2048,   // max output width (aspect ratio preserved)
  height: 2048,  // max output height
  quality: 82,   // JPEG quality 1–100
})

// image is an EncodedImage (AsyncIterable<Uint8Array>)
// with .info and .stats promises

Options

width? Max output width. Aspect ratio is always preserved. Never upscales.
height? Max output height. Aspect ratio is always preserved. Never upscales.
quality? JPEG quality, 1–100. Defaults to 85.
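The "fit within a box, never upscale" rule for width and height can be sketched as follows. `fitWithin` is a hypothetical helper, and its rounding is an assumption, so sip's exact output dimensions may differ by a pixel.

```typescript
// Illustrative sketch of "fit within a box, never upscale". The rounding
// rule is an assumption; sip's exact rounding may differ.
function fitWithin(
  srcWidth: number,
  srcHeight: number,
  maxWidth = Infinity,
  maxHeight = Infinity,
): { width: number; height: number } {
  // The scale factor is capped at 1 so small images pass through unchanged.
  const scale = Math.min(1, maxWidth / srcWidth, maxHeight / srcHeight)
  return {
    width: Math.max(1, Math.round(srcWidth * scale)),
    height: Math.max(1, Math.round(srcHeight * scale)),
  }
}

console.log(fitWithin(4000, 3000, 800, 800)) // { width: 800, height: 600 }
console.log(fitWithin(400, 300, 800, 800))   // { width: 400, height: 300 }
console.log(fitWithin(3000, 4000, 1024))     // { width: 1024, height: 1365 }
```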

decode(input)

Decodes an image into a PixelStream — an async iterable that yields one row of RGB pixels at a time. Each row is a Scanline with data (a Uint8Array of width * 3 bytes), width, and y.

import { decode } from '@standardagents/sip'

const pixels = decode(input)  // PixelStream (AsyncIterable<Scanline>)

const info = await pixels.info
// { width, height, originalFormat }

for await (const scanline of pixels) {
  scanline.data   // Uint8Array — RGB row (width * 3 bytes)
  scanline.width  // pixel width
  scanline.y      // row index
}
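Working with a scanline comes down to indexing into the packed RGB bytes. The sketch below uses a local `Scanline` type that mirrors the shape described above. `pixelAt` and `meanLuma` are hypothetical helpers, and the sketch assumes bytes are ordered R, G, B with no row padding.

```typescript
// Local stand-in for sip's Scanline shape, as described above.
type Scanline = { data: Uint8Array; width: number; y: number }

// Reads the RGB triple at column x. Pixels are assumed packed as R, G, B.
function pixelAt(line: Scanline, x: number): [number, number, number] {
  if (x < 0 || x >= line.width) {
    throw new RangeError(`x=${x} is outside 0..${line.width - 1}`)
  }
  const i = x * 3
  return [line.data[i], line.data[i + 1], line.data[i + 2]]
}

// Mean brightness of a row using the Rec. 601 luma weights.
function meanLuma(line: Scanline): number {
  let sum = 0
  for (let x = 0; x < line.width; x++) {
    const [r, g, b] = pixelAt(line, x)
    sum += 0.299 * r + 0.587 * g + 0.114 * b
  }
  return sum / line.width
}

// A synthetic two-pixel row: one red pixel, one white pixel.
const row: Scanline = { data: new Uint8Array([255, 0, 0, 255, 255, 255]), width: 2, y: 0 }
console.log(pixelAt(row, 1)) // [ 255, 255, 255 ]
```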

resize(stream, options)

Takes a PixelStream and resizes it row by row using bilinear interpolation. Only keeps two rows in memory at a time. Returns a new PixelStream.

import { decode, resize } from '@standardagents/sip'

const pixels = decode(input)
const resized = resize(pixels, { width: 800, height: 800 })

// resized is a new PixelStream with updated dimensions
const info = await resized.info
// e.g. { width: 800, height: 600, originalFormat: 'jpeg' } for a 4:3 JPEG source
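The two-row claim follows from how bilinear interpolation works: each output row is a weighted blend of the two nearest source rows. The sketch below shows the two building blocks under simple assumptions (packed RGB rows, pixel-centre alignment). `resampleRow` and `blendRows` are hypothetical names and this is not sip's implementation.

```typescript
// Illustrative bilinear building blocks (not sip's actual code).
// Rows are packed RGB, 3 bytes per pixel.

// Horizontal pass: resample one row from srcWidth to dstWidth pixels.
function resampleRow(src: Uint8Array, srcWidth: number, dstWidth: number): Float32Array {
  const out = new Float32Array(dstWidth * 3)
  const ratio = srcWidth / dstWidth
  for (let x = 0; x < dstWidth; x++) {
    // Map the destination pixel centre back into source coordinates.
    const pos = Math.min(srcWidth - 1, Math.max(0, (x + 0.5) * ratio - 0.5))
    const left = Math.floor(pos)
    const right = Math.min(srcWidth - 1, left + 1)
    const t = pos - left
    for (let c = 0; c < 3; c++) {
      out[x * 3 + c] = src[left * 3 + c] * (1 - t) + src[right * 3 + c] * t
    }
  }
  return out
}

// Vertical pass: blend two neighbouring source rows with weight t in [0, 1].
function blendRows(top: Float32Array, bottom: Float32Array, t: number): Uint8Array {
  const out = new Uint8Array(top.length)
  for (let i = 0; i < top.length; i++) {
    out[i] = Math.round(top[i] * (1 - t) + bottom[i] * t)
  }
  return out
}

const black = resampleRow(new Uint8Array([0, 0, 0, 0, 0, 0]), 2, 1)
const white = resampleRow(new Uint8Array([255, 255, 255, 255, 255, 255]), 2, 1)
console.log(blendRows(black, white, 0.5)) // Uint8Array(3) [ 128, 128, 128 ]
```

Only `top` and `bottom` need to be alive at once. When the vertical position moves past `bottom`, it becomes the new `top` and the next source row is pulled from the stream.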

encodeJpeg(stream, options?)

Takes a PixelStream and encodes it as JPEG. Returns an EncodedImage that yields chunks as they're ready.

import { decode, encodeJpeg, resize } from '@standardagents/sip'

const pixels = decode(input)
const resized = resize(pixels, { width: 1024, height: 1024 })
const image = encodeJpeg(resized, { quality: 78 })

// image is an EncodedImage (AsyncIterable<Uint8Array>)
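As noted for transform, nothing runs until the EncodedImage is consumed. The sketch below illustrates that kind of laziness with a plain async generator. `fakeEncode` is a stand-in, and whether sip uses generators internally is an assumption.

```typescript
// Illustrative only: a stand-in async generator, not sip's EncodedImage.
let started = false

async function* fakeEncode(): AsyncGenerator<Uint8Array> {
  started = true // runs only once a consumer asks for the first chunk
  yield new Uint8Array([0xff, 0xd8]) // JPEG start-of-image marker
  yield new Uint8Array([0xff, 0xd9]) // JPEG end-of-image marker
}

const lazyImage = fakeEncode()
const startedBeforeConsume = started
console.log(startedBeforeConsume) // false: creating the iterable did no work

// Draining the iterable is what triggers the work.
async function consume(): Promise<number> {
  let total = 0
  for await (const chunk of lazyImage) total += chunk.byteLength
  return total
}

consume().then((total) => console.log(started, total)) // true 4
```

The practical consequence is that errors from decoding or encoding surface while the image is being consumed, not when transform or encodeJpeg is called.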

collect(image)

Consumes an EncodedImage and gives you the full JPEG as an ArrayBuffer, along with the output dimensions and memory stats. Use this when you need the bytes in memory (e.g. to store in R2). Use toResponse when you just want to send the image back to the client.

import { collect, transform } from '@standardagents/sip'

const image = transform(input, { width: 512, height: 512 })
const { data, info, stats } = await collect(image)

data   // ArrayBuffer — complete JPEG
info   // { width, height, mimeType, originalFormat }
stats  // { peakPipelineBytes, peakCodecBytes, bytesIn, bytesOut, ... }
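Conceptually, collecting means draining the chunks and joining them into one buffer. The sketch below shows that idea with hypothetical helpers `concatChunks` and `collectBytes`. It is not sip's implementation and it omits the info and stats that collect also returns.

```typescript
// Illustrative sketch (not sip's implementation): join streamed chunks
// into one contiguous ArrayBuffer.
function concatChunks(chunks: Uint8Array[]): ArrayBuffer {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0)
  const out = new Uint8Array(total)
  let offset = 0
  for (const chunk of chunks) {
    out.set(chunk, offset)
    offset += chunk.byteLength
  }
  return out.buffer
}

// Drains any async iterable of byte chunks, then concatenates them.
async function collectBytes(source: AsyncIterable<Uint8Array>): Promise<ArrayBuffer> {
  const chunks: Uint8Array[] = []
  for await (const chunk of source) chunks.push(chunk)
  return concatChunks(chunks)
}

const joined = concatChunks([new Uint8Array([1, 2]), new Uint8Array([3])])
console.log(joined.byteLength) // 3
```

Collecting holds the whole compressed JPEG in memory, which is usually far smaller than the decoded pixels but still worth budgeting for.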

toResponse(image, init?)

Streams an EncodedImage straight into a Response. Sets the content type to image/jpeg for you. You can pass extra headers or a status code through the optional ResponseInit.

import { toResponse, transform } from '@standardagents/sip'

const image = transform(request, { width: 1600, height: 1600 })

// Streams JPEG chunks directly into the Response body
return toResponse(image, {
  headers: { 'Cache-Control': 'public, max-age=31536000' },
})

toReadableStream(image)

Converts an EncodedImage into a standard ReadableStream. Useful if you need to pipe the output somewhere other than a Response.

import { toReadableStream, transform } from '@standardagents/sip'

const image = transform(input, { width: 1024 })
const stream = toReadableStream(image) // ReadableStream<Uint8Array>

Types

ByteInput Anything sip can read from: ArrayBuffer, Uint8Array, Blob, Request, Response, ReadableStream, or AsyncIterable<Uint8Array>.
ImageInfo { format, width, height, hasAlpha }
InputSource A handle returned by inspect(). Pass it to transform() or decode() so sip doesn't have to re-read the headers.
InspectResult { info: ImageInfo, source: InputSource }
TransformOptions { width?, height?, quality? }
EncodedImage An async iterable of Uint8Array JPEG chunks with .info and .stats promises.
EncodedImageInfo { width, height, mimeType, originalFormat }
PixelStream An async iterable of Scanline objects with an .info promise.
Scanline { data: Uint8Array, width, y } — one row of RGB pixels.
TransformStats Memory and byte stats: peakPipelineBytes, peakCodecBytes, bytesIn, bytesOut, and more.

Format Support

sip can read four image formats. The output is always JPEG.

| Format | Decoder | Method | Notes |
| --- | --- | --- | --- |
| JPEG | libjpeg-turbo | DCT scaling + scanline decode | Best path. Can decode large images at 1/2, 1/4, or 1/8 scale. |
| PNG | libspng | Row-by-row decode | Decodes one row at a time. More efficient than a full pixel buffer. |
| WebP | @jsquash/webp | Full decode | Works, but decodes the whole image into memory first. Uses more RAM. |
| AVIF | @jsquash/avif | Full decode | Same as WebP: works, but uses more memory than JPEG or PNG. |
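Each of these four formats can be recognized from its first few bytes. The sketch below checks the well-known file signatures. `sniffFormat` is a hypothetical helper, not sip's detector, and its AVIF check is simplified to the major brand only.

```typescript
type Format = 'jpeg' | 'png' | 'webp' | 'avif'

// Compares ASCII text against bytes starting at `offset`.
function hasAscii(bytes: Uint8Array, offset: number, text: string): boolean {
  if (bytes.length < offset + text.length) return false
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false
  }
  return true
}

// Illustrative signature check (not sip's detector). Needs only the first 12 bytes.
function sniffFormat(bytes: Uint8Array): Format | null {
  // JPEG: starts with the SOI marker followed by another marker byte.
  if (bytes.length >= 3 && bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) return 'jpeg'
  // PNG: 0x89 followed by "PNG".
  if (bytes.length >= 4 && bytes[0] === 0x89 && hasAscii(bytes, 1, 'PNG')) return 'png'
  // WebP: a RIFF container whose form type is "WEBP".
  if (hasAscii(bytes, 0, 'RIFF') && hasAscii(bytes, 8, 'WEBP')) return 'webp'
  // AVIF: ISO BMFF box size, then "ftyp", then the major brand.
  if (hasAscii(bytes, 4, 'ftyp') && (hasAscii(bytes, 8, 'avif') || hasAscii(bytes, 8, 'avis'))) return 'avif'
  return null
}

console.log(sniffFormat(new Uint8Array([0xff, 0xd8, 0xff, 0xe0]))) // jpeg
```

A real detector would also scan the compatible-brands list, since some AVIF files declare a different major brand.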

Example

A single-file Cloudflare Worker that serves an upload page and returns the resized JPEG with metadata headers for the demo UI. The deploy button uses the dedicated standardagents/sip-worker-example template repo so it avoids Cloudflare's monorepo import edge cases.

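import { inspect, ready, toResponse, transform } from '@standardagents/sip'

const HTML = '<!doctype html><html ...'

export default {
  async fetch(request: Request): Promise<Response> {
    await ready()
    const url = new URL(request.url)

    // GET / → serve upload page
    if (request.method === 'GET' && url.pathname === '/') {
      return new Response(HTML, {
        headers: { 'Content-Type': 'text/html; charset=utf-8' },
      })
    }

    // Anything other than POST /api/process is not part of this demo
    if (request.method !== 'POST' || url.pathname !== '/api/process') {
      return new Response('Not found', { status: 404 })
    }

    // POST /api/process → resize and stream back JPEG
    // inspect() throws if the format isn't recognized, so map that to a 415
    const inspected = await inspect(request).catch(() => null)
    if (!inspected) return new Response('Unsupported image format', { status: 415 })

    return toResponse(transform(inspected.source, {
      width:   Number(url.searchParams.get('width'))   || 1024,
      height:  Number(url.searchParams.get('height'))  || 1024,
      quality: Number(url.searchParams.get('quality')) || 82,
    }))
  },
}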
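The `Number(...) || fallback` pattern in the example is compact, but it lets negative or very large values through. A stricter parser might look like the sketch below. `parseBounded` and its bounds are hypothetical and not part of sip.

```typescript
// Hypothetical helper: parse a numeric query parameter with a fallback and bounds.
function parseBounded(raw: string | null, fallback: number, min: number, max: number): number {
  if (raw === null || raw.trim() === '') return fallback
  const value = Number(raw)
  if (!Number.isFinite(value)) return fallback
  // Round to an integer, then clamp into [min, max].
  return Math.min(max, Math.max(min, Math.round(value)))
}

console.log(parseBounded('800', 1024, 1, 4096))    // 800
console.log(parseBounded('abc', 1024, 1, 4096))    // 1024
console.log(parseBounded('999999', 1024, 1, 4096)) // 4096
console.log(parseBounded('0', 82, 1, 100))         // 1
console.log(parseBounded(null, 82, 1, 100))        // 82
```

Note one behavioural difference: the example treats `0` as missing and uses the fallback, while this sketch clamps it to the minimum.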

Caveats

Output is always JPEG

sip doesn't produce PNG, WebP, or AVIF output. If the input has transparency, it's discarded.

WebP and AVIF use more memory

JPEG and PNG get the efficient scanline path. WebP and AVIF still need to decode the entire image into memory before sip can process it. They work fine, but they use significantly more RAM. Native WASM decoders for these formats are planned.

Memory numbers in the demo

The demo reports the peak memory that sip itself used during processing. That's not the same as total Worker memory — the runtime, your code, and V8 overhead are separate.