Usage Examples for the Revoize SDK in TypeScript
Example #1: Batch Audio Processing
This example demonstrates how to process pre-recorded audio data using the Revoize SDK. It covers batch processing, which works with complete audio files or buffers.
The main goal of this example is to familiarize you with the basic concepts of using the Revoize SDK for audio enhancement before diving into more complex, real-time processing.
TIP
For a complete working implementation with file I/O, waveform visualization, performance benchmarking, and Playwright tests, see the example-vite package. Reach out to the Revoize support team using our contact form for access.
Architecture Overview
Batch processing follows a straightforward pattern: load audio, split into chunks, process each chunk, and combine results.
Core Concepts
Initialization
The SDK must be initialized once before processing any audio. You select a model (like Capella or Octantis) by importing its configuration:
import { initialize, processAudio } from "@revoize/sdk";
import { config as capellaConfig } from "@revoize/model-capella";
// Initialize once at application start
await initialize(capellaConfig);
The initialization is asynchronous and loads the model's WebAssembly and weights. This should be done during your application's startup phase.
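Because initialization must happen exactly once, you may want to guard against repeated or concurrent calls. Here is a minimal sketch reusing the imports above; the ensureInitialized helper is our own convention, not an SDK API:
// Sketch: ensure initialize() runs only once, even if called from several places
let sdkReady: Promise<unknown> | null = null;
function ensureInitialized(): Promise<unknown> {
  if (!sdkReady) {
    sdkReady = initialize(capellaConfig).catch((err) => {
      sdkReady = null; // Allow a retry after a failed load
      throw err;
    });
  }
  return sdkReady;
}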
Audio Format Requirements
The SDK expects audio in a specific format:
- Sample Rate: 48 kHz (fixed requirement)
- Chunk Size: 480 samples (10ms at 48 kHz)
- Format: Number array with values between -1.0 and 1.0 (32-bit float PCM)
- Channels: Mono (single channel)
If your source audio is in a different format (e.g., 44.1 kHz stereo), you'll need to convert it first to match these requirements.
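In the browser, one convenient way to convert is to render through an OfflineAudioContext, which resamples and downmixes in a single pass. The helper below is a minimal sketch (the name convertTo48kMono is our own, not part of the SDK):
// Sketch: convert a decoded AudioBuffer (any rate/channel count) to 48 kHz mono
async function convertTo48kMono(input: AudioBuffer): Promise<number[]> {
  const targetRate = 48000;
  const offline = new OfflineAudioContext(
    1, // mono destination; multi-channel input is downmixed automatically
    Math.ceil(input.duration * targetRate),
    targetRate,
  );
  const source = offline.createBufferSource();
  source.buffer = input;
  source.connect(offline.destination);
  source.start();
  const rendered = await offline.startRendering();
  return Array.from(rendered.getChannelData(0));
}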
Chunking Strategy
The model requires exactly 480 samples per chunk. You need to split longer audio into these fixed-size chunks:
const CHUNK_SAMPLES = 480;
function* audioChunks(audioData: number[]): Generator<number[]> {
for (let i = 0; i < audioData.length; i += CHUNK_SAMPLES) {
const chunk = audioData.slice(i, i + CHUNK_SAMPLES);
// If the chunk is smaller than 480 samples, pad with zeros. This will usually
// happen at the end of the file.
if (chunk.length < CHUNK_SAMPLES) {
const padded = [...chunk, ...Array(CHUNK_SAMPLES - chunk.length).fill(0)];
yield padded;
} else {
yield chunk;
}
}
}
Synchronous Processing
Unlike the initialization, audio processing is synchronous:
const outputChunk = processAudio(inputChunk);
console.log(`Processed ${outputChunk.length} samples`);
The exact processing time will depend on the selected model and target hardware. Please refer to the model technical datasheets for more information.
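If you want a rough sense of throughput on your own hardware, a minimal timing sketch using the standard performance.now() API (illustrative only, not an official benchmark):
// Measure how long one 480-sample chunk takes to process
const start = performance.now();
const output = processAudio(inputChunk);
const elapsedMs = performance.now() - start;
// For real-time use, this must stay comfortably below the 10ms chunk duration
console.log(`Processed ${output.length} samples in ${elapsedMs.toFixed(2)}ms`);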
Complete Processing Example
Here's a complete function that processes an entire audio file:
import { initialize, processAudio } from "@revoize/sdk";
import { config as capellaConfig } from "@revoize/model-capella";
const CHUNK_SAMPLES = 480;
function processAudioSamples(inputAudioData: number[]): number[] {
const processedChunks: number[][] = [];
// Process audio in 480-sample chunks
for (let i = 0; i < inputAudioData.length; i += CHUNK_SAMPLES) {
const chunk = inputAudioData.slice(i, i + CHUNK_SAMPLES);
// Handle incomplete final chunk
if (chunk.length < CHUNK_SAMPLES) {
// Pad with zeros to reach 480 samples
const padded = [...chunk, ...Array(CHUNK_SAMPLES - chunk.length).fill(0)];
const outputChunk = processAudio(padded);
// Only keep the non-padded portion of the output
processedChunks.push(outputChunk.slice(0, chunk.length));
} else {
const outputChunk = processAudio(chunk);
processedChunks.push(outputChunk);
}
}
// Flatten all chunks into single array
return processedChunks.flat();
}
async function loadWavFile(url: string): Promise<{
audioData: number[];
sampleRate: number;
}> {
// Fetch the file
const response = await fetch(url);
const arrayBuffer = await response.arrayBuffer();
// Decode using Web Audio API
const audioContext = new AudioContext({ sampleRate: 48000 });
const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);
// Extract mono channel
const channelData = audioBuffer.getChannelData(0);
return {
audioData: Array.from(channelData),
sampleRate: audioBuffer.sampleRate,
};
}
// Initialize SDK (do this once)
await initialize(capellaConfig);
const fileUrl = "/audio/input.wav"; // Placeholder path; point this at your own WAV file
const { audioData, sampleRate } = await loadWavFile(fileUrl);
console.log(`Loaded ${audioData.length} samples at ${sampleRate}Hz`);
const enhancedAudio = processAudioSamples(audioData);
console.log(`Enhanced ${enhancedAudio.length} samples`);
Use Cases
Batch processing is ideal for:
- File Enhancement: Clean up recorded audio files before distribution
- Batch Pipelines: Process multiple files in a queue (e.g., podcast post-production)
- Audio Analysis: Enhance audio before running speech recognition or analysis
- Preview/Demo: Process sample audio to demonstrate enhancement quality
- Testing: Validate model performance with known test files
TIP
The example-vite package provides a complete implementation with automatic initialization, file loading, waveform visualization using Chart.js, performance benchmarking, and automated testing. It's a great starting point for batch processing applications.
Example #2: Real-Time Audio Processing
This example demonstrates how to build a real-time audio enhancement application using the Revoize SDK in a browser environment. Unlike batch processing, real-time processing requires capturing live microphone input, buffering it to the correct chunk size, and processing it with minimal latency.
The examples below use simplified pseudo-code to focus on core concepts and keep the documentation readable. For a complete, production-ready implementation featuring React hooks, device management, real-time visualization, and comprehensive Playwright tests, check out the example-react-real-time package. Contact our support team through our contact form to get access.
Key Differences from Batch Processing
| Aspect | Batch Processing | Real-Time Processing |
|---|---|---|
| Input Source | Pre-recorded files/buffers | Live microphone stream |
| Buffering | Simple chunking | Requires sample accumulation to match SDK requirements |
| Use Case | Offline enhancement, pipelines | Live calls, streaming, demos |
| Complexity | Simple | Requires Web Audio API setup |
Architecture Overview
Real-time audio processing with the Revoize SDK involves several key components working together.
Core Concepts
Web Audio API and AudioWorklets
Real-time audio processing in the browser uses the Web Audio API with AudioWorkletProcessor. The worklet runs on a dedicated audio thread (separate from the main thread) to ensure uninterrupted audio capture even if the UI is busy.
In this example, we only use the audio worklet to send audio samples to the main thread.
// AudioWorkletProcessor runs on the audio thread
// Posts audio chunks to the main thread via message passing
class MicrophoneProcessor extends AudioWorkletProcessor {
  process(inputs) {
    const audioSamples = inputs[0][0]; // First channel
    if (audioSamples) {
      this.port.postMessage({ audioSamples });
    }
    return true; // Keep processing
  }
}
// Register under the name used later when creating the AudioWorkletNode
registerProcessor("microphone-processor", MicrophoneProcessor);
Sample Buffering
The Web Audio API delivers audio in 128-sample chunks by default, but the Revoize SDK requires 480 samples (10ms at 48 kHz). You need a buffering mechanism to accumulate smaller chunks until you have enough samples to process.
In this example, we introduce a simple SampleBuffer class that accumulates samples and, once a full batch is ready, triggers a callback function:
// Accumulates incoming samples and emits fixed-size batches via a callback
class SampleBuffer {
  private buffer: number[] = [];
  constructor(
    private batchSize: number,
    private onBatchReady: (chunk: number[]) => void,
  ) {}
  addSamples(newSamples: number[]) {
    // Accumulate samples in internal buffer
    this.buffer.push(...newSamples);
    // Whenever we have a full batch (480 samples), trigger processing
    while (this.buffer.length >= this.batchSize) {
      const chunk = this.buffer.slice(0, this.batchSize);
      this.onBatchReady(chunk);
      this.buffer = this.buffer.slice(this.batchSize); // Keep remaining
    }
  }
}
Processing
As with batch processing, we process the audio chunk by chunk. The SDK's processAudio function returns the enhanced audio:
// Real-time processing loop
function onAudioBatchReady(samples: Float32Array) {
// Synchronous call - returns enhanced audio immediately
const enhancedAudio = processAudio(Array.from(samples));
// Use enhanced audio for playback, visualization, etc.
onEnhancedAudioReady(enhancedAudio);
}
Algorithmic Latency
Each of our models introduces some algorithmic latency; this is inherent to the AI model's architecture. Please refer to the technical datasheet of the model you are using for the exact value. When mixing the enhanced signal with the original input for partial enhancement, you must delay the input signal by the same amount to keep the two signals synchronized.
In these examples, we will assume an algorithmic latency of 3 frames (30 ms total).
// Conceptual example of handling latency
const MODEL_DELAY_FRAMES = 3;
const DELAY_SAMPLES = MODEL_DELAY_FRAMES * 480;
// Simple FIFO delay line, pre-filled with silence, to delay the input
const delayBuffer: number[] = Array(DELAY_SAMPLES).fill(0);
function mixWithDelayedInput(input: number[], enhanced: number[], level: number): number[] {
  delayBuffer.push(...input);
  const delayedInput = delayBuffer.splice(0, input.length);
  // Mix the aligned signals sample by sample
  return enhanced.map((s, i) => s * level + delayedInput[i] * (1 - level));
}
Typical Implementation Flow
A complete real-time application typically follows this pattern:
1. Initialize the SDK with your selected model configuration
2. Set up a SampleBuffer to accumulate 480-sample batches and provide an onAudioBatchReady callback
3. Create an AudioContext at 48 kHz
4. Load the AudioWorkletProcessor module and create an AudioWorkletNode
5. Register a message handler to receive audio samples from the worklet and forward them to the SampleBuffer
6. Request microphone permissions (getUserMedia) with echo cancellation, noise suppression, and auto gain control disabled
7. Create a MediaStreamSource from the microphone stream
8. Connect the source to the AudioWorkletNode
9. Start the AudioContext
10. Handle enhanced output in onAudioBatchReady (playback, visualization, recording, transmission)
Steps 2-9 are implemented in the useAudioEngine hook in the example application; step 10 happens inside your onAudioBatchReady callback.
// Conceptual example: start audio worklet and buffer to 480-sample batches
const DEFAULT_BATCH_SIZE = 480;
// 2) Configure buffering to 480-sample batches.
// `onAudioBatchReady` should process with the SDK and handle output.
const sampleBuffer = new SampleBuffer(
DEFAULT_BATCH_SIZE,
onAudioBatchReady, // This callback implements speech enhancement
);
// 3) Create an AudioContext at 48 kHz (required by the SDK)
const audioContext = new AudioContext({ sampleRate: 48000 });
// 4) Load the MicrophoneProcessor worklet and create an AudioWorkletNode
// ./microphone-processor.js implements the `MicrophoneProcessor` described above
const workletUrl = new URL("./microphone-processor.js", import.meta.url);
await audioContext.audioWorklet.addModule(workletUrl);
const micNode = new AudioWorkletNode(audioContext, "microphone-processor");
// 5) Receive ~128-sample frames from the audio thread and forward to the buffer
micNode.port.onmessage = (evt) => {
const { audioSamples } = evt.data;
sampleBuffer.addSamples(audioSamples);
};
// 6) Request microphone stream with system DSP disabled
const stream = await navigator.mediaDevices.getUserMedia({
audio: {
echoCancellation: false,
noiseSuppression: false, // Disable system-level processing to avoid conflicts
autoGainControl: false,
},
});
// 7) Create a MediaStreamSource from the microphone stream
const source = audioContext.createMediaStreamSource(stream);
// 8) Connect the mic source to the worklet node
source.connect(micNode);
// 9) Start the audio graph
await audioContext.resume();
Key Technical Requirements
- Sample Rate: Must be 48 kHz (fixed)
- Chunk Size: Must be 480 samples (10ms at 48 kHz)
- Format: 32-bit float PCM, mono channel
- Latency: model-dependent
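If you want to fail fast on malformed input, a small validation helper along these lines (the helper name and error messages are our own, not an SDK API) can enforce these requirements before calling processAudio:
// Sketch: validate a chunk against the SDK's fixed input requirements
function assertValidChunk(chunk: number[], sampleRate: number): void {
  if (sampleRate !== 48000) {
    throw new Error(`Expected 48 kHz audio, got ${sampleRate} Hz`);
  }
  if (chunk.length !== 480) {
    throw new Error(`Expected a 480-sample chunk, got ${chunk.length}`);
  }
  if (chunk.some((s) => s < -1.0 || s > 1.0)) {
    throw new Error("Samples must be 32-bit float PCM in [-1.0, 1.0]");
  }
}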
Performance Considerations
Real-time audio processing is latency-sensitive. Here are key considerations:
- Process on main thread: The SDK runs in the main JavaScript thread, not in an AudioWorklet. This is because WebAssembly with SIMD (used by the SDK) performs better in the main thread.
- Sample buffering: Accumulate Web Audio API chunks (128 samples) to SDK chunk size (480 samples) to minimize processing overhead.
- Avoid blocking operations: Keep other main-thread operations lightweight to ensure consistent audio processing.
- Device configuration: Disable system-level audio processing (echo cancellation, noise suppression, AGC) as they can interfere with the SDK.
Integration Pattern
For React applications, the typical integration uses two custom hooks:
- useRevoizeSDK: Wraps SDK initialization and audio processing
- useAudioEngine: Manages Web Audio API, device enumeration, worklet setup, and sample buffering
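For orientation, here is a minimal sketch of what a useRevoizeSDK-style hook could look like (an assumed shape for illustration; the example application's actual implementation may differ):
import { useEffect, useState } from "react";
import { initialize, processAudio } from "@revoize/sdk";
import { config as capellaConfig } from "@revoize/model-capella";
// Wraps one-time SDK initialization and exposes readiness plus processAudio
function useRevoizeSDK() {
  const [ready, setReady] = useState(false);
  useEffect(() => {
    let cancelled = false;
    initialize(capellaConfig).then(() => {
      if (!cancelled) setReady(true);
    });
    return () => {
      cancelled = true; // Avoid state updates after unmount
    };
  }, []);
  return { ready, processAudio };
}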
The example application provides production-ready implementations of these hooks, along with utilities for:
- Audio device enumeration and selection
- Real-time waveform visualization
- Recording and exporting enhanced audio
- Handling microphone permissions
- Proper cleanup of audio resources
TIP
The example-react-real-time package provides a complete, tested implementation you can adapt for your needs. It includes Playwright E2E tests to verify audio processing works correctly across browsers.