Usage Examples for the Revoize SDK in TypeScript

Example #1: Batch Audio Processing

This example demonstrates how to process pre-recorded audio data with the Revoize SDK. It covers batch processing, which works with complete audio files or buffers.

The main goal of this example is to familiarize you with the basic concepts of using the Revoize SDK for audio enhancement before diving into more complex, real-time processing.

TIP

For a complete working implementation with file I/O, waveform visualization, performance benchmarking, and Playwright tests, see the example-vite package. Reach out to the Revoize support team using our contact form for access.

Architecture Overview

Batch processing follows a straightforward pattern: load audio, split into chunks, process each chunk, and combine results.

Core Concepts

Initialization

The SDK must be initialized once before processing any audio. You select a model (like Capella or Octantis) by importing its configuration:

typescript
import { initialize, processAudio } from "@revoize/sdk";
import { config as capellaConfig } from "@revoize/model-capella";

// Initialize once at application start
await initialize(capellaConfig);

The initialization is asynchronous and loads the model's WebAssembly and weights. This should be done during your application's startup phase.
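
If multiple parts of your application can trigger initialization, one way to guarantee it runs only once is to memoize the returned promise. A minimal sketch (the ensureInitialized helper below is illustrative, not part of the SDK):

typescript
import { initialize } from "@revoize/sdk";
import { config as capellaConfig } from "@revoize/model-capella";

// Illustrative helper: memoize initialization so the model is loaded
// only once, even when several call sites request it concurrently
let initPromise: ReturnType<typeof initialize> | null = null;

function ensureInitialized() {
  initPromise ??= initialize(capellaConfig);
  return initPromise;
}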

Audio Format Requirements

The SDK expects audio in a specific format:

  • Sample Rate: 48 kHz (fixed requirement)
  • Chunk Size: 480 samples (10ms at 48 kHz)
  • Format: Number array with values between -1.0 and 1.0 (32-bit float PCM)
  • Channels: Mono (single channel)

If your source audio is in a different format (e.g., 44.1 kHz stereo), you'll need to convert it first to match these requirements.
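
In the browser, an OfflineAudioContext can handle both resampling and downmixing. A minimal sketch, assuming you already have a decoded AudioBuffer:

typescript
// Sketch: convert a decoded AudioBuffer (e.g., 44.1 kHz stereo)
// into 48 kHz mono samples suitable for the SDK
async function to48kMono(source: AudioBuffer): Promise<number[]> {
  const targetRate = 48000;
  const length = Math.ceil(source.duration * targetRate);

  // A single-channel offline context downmixes and resamples on render
  const offline = new OfflineAudioContext(1, length, targetRate);
  const node = offline.createBufferSource();
  node.buffer = source;
  node.connect(offline.destination);
  node.start();

  const rendered = await offline.startRendering();
  return Array.from(rendered.getChannelData(0));
}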

Chunking Strategy

The model requires exactly 480 samples per chunk. You need to split longer audio into these fixed-size chunks:

typescript
const CHUNK_SAMPLES = 480;

function* audioChunks(audioData: number[]): Generator<number[]> {
  for (let i = 0; i < audioData.length; i += CHUNK_SAMPLES) {
    const chunk = audioData.slice(i, i + CHUNK_SAMPLES);

    // If the chunk is smaller than 480 samples, pad with zeros. This will usually
    // happen at the end of the file.
    if (chunk.length < CHUNK_SAMPLES) {
      const padded = [...chunk, ...Array(CHUNK_SAMPLES - chunk.length).fill(0)];
      yield padded;
    } else {
      yield chunk;
    }
  }
}
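
The generator can then be consumed in a simple loop. For example, assuming the SDK has already been initialized:

typescript
import { processAudio } from "@revoize/sdk";

const enhanced: number[] = [];
for (const chunk of audioChunks(audioData)) {
  // Each chunk is exactly 480 samples, as the model requires
  enhanced.push(...processAudio(chunk));
}
// Note: if the final chunk was zero-padded, the tail of `enhanced`
// contains the output for that padding as well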

Synchronous Processing

Unlike the initialization, audio processing is synchronous:

typescript
const outputChunk = processAudio(inputChunk);
console.log(`Processed ${outputChunk.length} samples`);

The exact processing time will depend on the selected model and target hardware. Please refer to the model technical datasheets for more information.
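
To gauge throughput on your own hardware, you can time the call with performance.now(); a minimal sketch:

typescript
const start = performance.now();
const outputChunk = processAudio(inputChunk);
const elapsedMs = performance.now() - start;

// A 480-sample chunk represents 10 ms of audio, so real-time use
// requires processing to stay well under 10 ms per chunk
console.log(`Processed 10 ms of audio in ${elapsedMs.toFixed(2)} ms`);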

Complete Processing Example

Here's a complete function that processes an entire audio file:

typescript
import { initialize, processAudio } from "@revoize/sdk";
import { config as capellaConfig } from "@revoize/model-capella";

const CHUNK_SAMPLES = 480;

function processAudioSamples(inputAudioData: number[]): number[] {
  const processedChunks: number[][] = [];

  // Process audio in 480-sample chunks
  for (let i = 0; i < inputAudioData.length; i += CHUNK_SAMPLES) {
    const chunk = inputAudioData.slice(i, i + CHUNK_SAMPLES);

    // Handle incomplete final chunk
    if (chunk.length < CHUNK_SAMPLES) {
      // Pad with zeros to reach 480 samples
      const padded = [...chunk, ...Array(CHUNK_SAMPLES - chunk.length).fill(0)];
      const outputChunk = processAudio(padded);
      // Only keep the non-padded portion of the output
      processedChunks.push(outputChunk.slice(0, chunk.length));
    } else {
      const outputChunk = processAudio(chunk);
      processedChunks.push(outputChunk);
    }
  }

  // Flatten all chunks into single array
  return processedChunks.flat();
}

async function loadWavFile(url: string): Promise<{
  audioData: number[];
  sampleRate: number;
}> {
  // Fetch the file
  const response = await fetch(url);
  const arrayBuffer = await response.arrayBuffer();

  // Decode using the Web Audio API; the 48 kHz context resamples
  // the decoded audio to 48 kHz automatically
  const audioContext = new AudioContext({ sampleRate: 48000 });
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);

  // Use the first channel only (for stereo sources, consider downmixing)
  const channelData = audioBuffer.getChannelData(0);

  return {
    audioData: Array.from(channelData),
    sampleRate: audioBuffer.sampleRate,
  };
}

// Initialize SDK (do this once)
await initialize(capellaConfig);

const fileUrl = "/audio/input.wav"; // Example path; replace with your own file
const { audioData, sampleRate } = await loadWavFile(fileUrl);
console.log(`Loaded ${audioData.length} samples at ${sampleRate}Hz`);

const enhancedAudio = processAudioSamples(audioData);
console.log(`Enhanced ${enhancedAudio.length} samples`);
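
To hear the result, you can copy the enhanced samples into an AudioBuffer and play it back. A minimal sketch (the playEnhanced helper is illustrative):

typescript
// Play enhanced samples through the Web Audio API
function playEnhanced(samples: number[], audioContext: AudioContext) {
  const buffer = audioContext.createBuffer(1, samples.length, 48000);
  buffer.copyToChannel(Float32Array.from(samples), 0);

  const source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.connect(audioContext.destination);
  source.start();
}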

Use Cases

Batch processing is ideal for:

  • File Enhancement: Clean up recorded audio files before distribution
  • Batch Pipelines: Process multiple files in a queue (e.g., podcast post-production)
  • Audio Analysis: Enhance audio before running speech recognition or analysis
  • Preview/Demo: Process sample audio to demonstrate enhancement quality
  • Testing: Validate model performance with known test files

TIP

The example-vite package provides a complete implementation with automatic initialization, file loading, waveform visualization using Chart.js, performance benchmarking, and automated testing. It's a great starting point for batch processing applications.

Example #2: Real-Time Audio Processing

This example demonstrates how to build a real-time audio enhancement application using the Revoize SDK in a browser environment. Unlike batch processing, real-time processing requires capturing live microphone input, buffering it to the correct chunk size, and processing it with minimal latency.

The examples below use simplified pseudo-code to focus on core concepts and keep the documentation readable. For a complete, production-ready implementation featuring React hooks, device management, real-time visualization, and comprehensive Playwright tests, check out the example-react-real-time package. Contact our support team through our contact form to get access.

Key Differences from Batch Processing

| Aspect | Batch Processing | Real-Time Processing |
| --- | --- | --- |
| Input Source | Pre-recorded files/buffers | Live microphone stream |
| Buffering | Simple chunking | Requires sample accumulation to match SDK requirements |
| Use Case | Offline enhancement, pipelines | Live calls, streaming, demos |
| Complexity | Simple | Requires Web Audio API setup |

Architecture Overview

Real-time audio processing with the Revoize SDK involves several key components working together: an AudioWorklet that captures microphone input on the audio thread, a sample buffer that accumulates 480-sample batches, and the SDK itself, which processes those batches on the main thread.

Core Concepts

Web Audio API and AudioWorklets

Real-time audio processing in the browser uses the Web Audio API with AudioWorkletProcessor. The worklet runs on a dedicated audio thread (separate from the main thread) to ensure uninterrupted audio capture even if the UI is busy.

In this example, we only use the audio worklet to send audio samples to the main thread.

typescript
// AudioWorkletProcessor runs on the audio thread and posts audio
// chunks to the main thread via message passing
class MicrophoneProcessor extends AudioWorkletProcessor {
  process(inputs) {
    const audioSamples = inputs[0][0]; // First channel of first input
    if (audioSamples) {
      this.port.postMessage({ audioSamples });
    }
    return true; // Keep processing
  }
}

// Register under the name used to create the AudioWorkletNode later
registerProcessor("microphone-processor", MicrophoneProcessor);

Sample Buffering

The Web Audio API delivers audio in 128-sample chunks by default, but the Revoize SDK requires 480 samples (10ms at 48 kHz). You need a buffering mechanism to accumulate smaller chunks until you have enough samples to process.

In this example, we introduce a simple SampleBuffer class that accumulates samples and, once a batch is ready, triggers a callback function:

typescript
// Minimal runnable version of the sample buffering concept
class SampleBuffer {
  private buffer: number[] = [];

  constructor(
    private batchSize: number,
    private onBatchReady: (batch: number[]) => void,
  ) {}

  addSamples(newSamples: ArrayLike<number>) {
    // Accumulate incoming samples in the internal buffer
    this.buffer.push(...Array.from(newSamples));

    // Emit as many full batches as we have accumulated
    while (this.buffer.length >= this.batchSize) {
      const chunk = this.buffer.slice(0, this.batchSize);
      this.onBatchReady(chunk);
      this.buffer = this.buffer.slice(this.batchSize); // Keep remaining
    }
  }
}

Processing

As with batch processing, we process the audio chunk by chunk. The SDK's processAudio function returns the enhanced audio:

typescript
// Real-time processing loop
function onAudioBatchReady(samples: Float32Array) {
  // Synchronous call - returns enhanced audio immediately
  const enhancedAudio = processAudio(Array.from(samples));

  // Use enhanced audio for playback, visualization, etc.
  onEnhancedAudioReady(enhancedAudio);
}

Algorithmic Latency

Each of our models introduces some algorithmic latency; this is inherent to the AI model's architecture. Please refer to the technical datasheet of the model you are using. When mixing the enhanced signal with the original input for partial enhancement, you must delay the input signal by the same amount to keep the two signals synchronized.

In these examples, we will assume an algorithmic latency of 3 frames (30 ms total).

typescript
// Conceptual example of handling latency
const MODEL_DELAY_FRAMES = 3;
const DELAY_SAMPLES = MODEL_DELAY_FRAMES * 480;

// Use a circular buffer to delay input
delayBuffer.push(inputSamples);
const delayedInput = delayBuffer.getDelayed(DELAY_SAMPLES);

// Now mix the aligned signals (element-wise across samples)
const mixed =
  enhancedOutput * enhancementLevel + delayedInput * (1 - enhancementLevel);
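
A runnable version of the same idea, using a simple FIFO delay line (the DelayLine class and enhancementLevel parameter are illustrative, not part of the SDK):

typescript
// Illustrative FIFO delay line for aligning the dry input with the
// enhanced output (3 frames of 480 samples = 1440 samples at 48 kHz)
class DelayLine {
  private queue: number[];

  constructor(delaySamples: number) {
    // Pre-fill with silence so reads lag writes by `delaySamples`
    this.queue = Array(delaySamples).fill(0);
  }

  // Push new input samples and pop the same number of delayed samples
  process(input: number[]): number[] {
    this.queue.push(...input);
    return this.queue.splice(0, input.length);
  }
}

const delayLine = new DelayLine(3 * 480);

function mix(enhanced: number[], input: number[], enhancementLevel: number) {
  const delayedInput = delayLine.process(input);
  // Per-sample crossfade between enhanced and delayed original signal
  return enhanced.map(
    (s, i) => s * enhancementLevel + delayedInput[i] * (1 - enhancementLevel),
  );
}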

Typical Implementation Flow

A complete real-time application typically follows this pattern:

  1. Initialize the SDK with your selected model configuration
  2. Set up a SampleBuffer to accumulate 480-sample batches and provide an onAudioBatchReady callback
  3. Create an AudioContext at 48 kHz
  4. Load the AudioWorkletProcessor module and create an AudioWorkletNode
  5. Register a message handler to receive audio samples from the worklet and forward them to the SampleBuffer
  6. Request microphone permissions (getUserMedia) with echo cancellation, noise suppression, and auto gain control disabled
  7. Create a MediaStreamSource from the microphone stream
  8. Connect the source to the AudioWorkletNode
  9. Start the AudioContext
  10. Handle enhanced output in onAudioBatchReady (playback, visualization, recording, transmission)

Steps 2-9 are implemented in the useAudioEngine hook in the example application; step 10 happens inside your onAudioBatchReady callback.

ts
// Conceptual example: start audio worklet and buffer to 480-sample batches

const DEFAULT_BATCH_SIZE = 480;

// 2) Configure buffering to 480-sample batches.
//    `onAudioBatchReady` should process with the SDK and handle output.
const sampleBuffer = new SampleBuffer(
  DEFAULT_BATCH_SIZE,
  onAudioBatchReady, // This callback implements speech enhancement
);

// 3) Create an AudioContext at 48 kHz (required by the SDK)
const audioContext = new AudioContext({ sampleRate: 48000 });

// 4) Load the MicrophoneProcessor worklet and create an AudioWorkletNode
//    ./microphone-processor.js implements the `MicrophoneProcessor` described above
const workletUrl = new URL("./microphone-processor.js", import.meta.url);
await audioContext.audioWorklet.addModule(workletUrl);
const micNode = new AudioWorkletNode(audioContext, "microphone-processor");

// 5) Receive ~128-sample frames from the audio thread and forward to the buffer
micNode.port.onmessage = (evt) => {
  const { audioSamples } = evt.data;
  sampleBuffer.addSamples(audioSamples);
};

// 6) Request microphone stream with system DSP disabled
const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: false,
    noiseSuppression: false, // Disable system-level processing to avoid conflicts
    autoGainControl: false,
  },
});

// 7) Create a MediaStreamSource from the microphone stream
const source = audioContext.createMediaStreamSource(stream);

// 8) Connect the mic source to the worklet node
source.connect(micNode);

// 9) Start the audio graph
await audioContext.resume();
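
When tearing down (for example, on page navigation), release the audio resources created above; a minimal sketch:

ts
// Teardown: stop microphone capture and release the audio graph
function stopCapture() {
  stream.getTracks().forEach((track) => track.stop()); // Release the mic
  micNode.port.onmessage = null;
  source.disconnect();
  micNode.disconnect();
  void audioContext.close(); // Frees audio hardware resources
}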

Key Technical Requirements

  • Sample Rate: Must be 48 kHz (fixed)
  • Chunk Size: Must be 480 samples (10ms at 48 kHz)
  • Format: 32-bit float PCM, mono channel
  • Latency: model-dependent (see the model's technical datasheet)

Performance Considerations

Real-time audio processing is latency-sensitive. Here are key considerations:

  • Process on main thread: The SDK runs in the main JavaScript thread, not in an AudioWorklet. This is because WebAssembly with SIMD (used by the SDK) performs better in the main thread.
  • Sample buffering: Accumulate Web Audio API chunks (128 samples) to SDK chunk size (480 samples) to minimize processing overhead.
  • Avoid blocking operations: Keep other main-thread operations lightweight to ensure consistent audio processing.
  • Device configuration: Disable system-level audio processing (echo cancellation, noise suppression, AGC) as they can interfere with the SDK.

Integration Pattern

For React applications, the typical integration uses two custom hooks:

  • useRevoizeSDK: Wraps SDK initialization and audio processing
  • useAudioEngine: Manages Web Audio API, device enumeration, worklet setup, and sample buffering

The example application provides production-ready implementations of these hooks, along with utilities for:

  • Audio device enumeration and selection
  • Real-time waveform visualization
  • Recording and exporting enhanced audio
  • Handling microphone permissions
  • Proper cleanup of audio resources

TIP

The example-react-real-time package provides a complete, tested implementation you can adapt for your needs. It includes Playwright E2E tests to verify audio processing works correctly across browsers.