Getting Started with the Revoize SDK

What is the Revoize SDK?

The Revoize SDK is a Rust-based library that brings state-of-the-art speech enhancement to your applications. Whether you're building real-time voice products or cleaning up recorded audio, the SDK gives you a simple API and a set of models you can plug in and use.

Typical use cases:

  • Real-time speech enhancement (e.g. calls, conferencing, live streams)
  • Offline or batch enhancement of recordings

What it can do:

  • Remove background noise and reduce reverberation
  • Extend speech bandwidth for clearer, fuller sound
  • Mitigate codec artifacts and fix gaps or glitches in the signal

We ship multiple models so you can match performance to your constraints: lighter models for low-latency or resource-limited environments, and more capable models when you need the best quality (including bandwidth extension and speech reconstruction).

Language Support

The SDK is implemented in Rust, and we provide first-class bindings so you can use it from your stack of choice:

  • Rust: The native implementation with full access to all features
  • Python: Native Python bindings for easy integration with Python applications
  • TypeScript: TypeScript/JavaScript bindings for web applications and Node.js
  • C: C-compatible interface for maximum compatibility
  • C++: Modern C++ interface with RAII, exceptions, and standard C++ containers
  • Java: JNI (Java Native Interface) wrapper for Java applications, which can also be used in Kotlin and Android development

Each binding exposes the same capabilities as the Rust API, with an idiomatic interface for that language.

See your language's API reference for full details and examples.

Usage Examples

We've put together practical examples for every supported language so you can see the SDK in action and copy what you need.

How model selection works

The SDK lets you choose a model by name and then work with fixed chunk sizes and sample rates. Here's the idea:

  1. List available models – Each language has a function to list model names for your build (e.g. list_names(), revoize_list_names(), SDK.models.listNames()).
  2. Get parameters by name – For the model you want (e.g. Capella, Octantis, Hadar), get a params object (e.g. get_params("Capella")). It tells you:
    • Input chunk size (samples) – Every process call must receive exactly this many samples.
    • Output chunk size (samples) – Every process call returns this many samples.
    • Input and output sample rates (Hz) – Some models change sample rate (e.g. bandwidth extension (BWE) from 8 kHz to 24 kHz). Your pipeline should feed input at the model's input rate and expect output at its output rate.
  3. Initialize with params – Call init(params) once before any processing.
  4. Process fixed-size chunks – Feed chunks of exactly input_chunk_size_samples into process; you get back output_chunk_size_samples (which can differ for BWE models).

Which models show up in the list depends on how you built or packaged the SDK (e.g. Cargo features in Rust, or which model packages you use in TypeScript). Your language's API reference has the exact names and types.
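The selection steps above can be sketched with a mock standing in for the real SDK. The function names (list_names, get_params) and parameter fields follow the description above, but every concrete value here (chunk sizes, sample rates, which model does BWE) is made up for illustration; your build's actual names and numbers come from the API reference:

```python
from dataclasses import dataclass

# Mock of the parameter object described above. Field names follow the
# text; check your language's API reference for the real ones.
@dataclass(frozen=True)
class ModelParams:
    name: str
    input_chunk_size_samples: int
    output_chunk_size_samples: int
    input_sample_rate_hz: int
    output_sample_rate_hz: int

# Stand-in for the SDK's model registry. All values are invented:
# "Octantis" is shown as a hypothetical BWE model (8 kHz in, 24 kHz out),
# so its output chunk is 3x the input chunk for the same 10 ms of audio.
_MODELS = {
    "Capella": ModelParams("Capella", 480, 480, 48_000, 48_000),
    "Octantis": ModelParams("Octantis", 80, 240, 8_000, 24_000),
}

def list_names() -> list[str]:
    """Mock of step 1: list the model names available in this build."""
    return sorted(_MODELS)

def get_params(name: str) -> ModelParams:
    """Mock of step 2: get the fixed parameters for a named model."""
    return _MODELS[name]

if __name__ == "__main__":
    for name in list_names():
        p = get_params(name)
        print(f"{name}: {p.input_chunk_size_samples} samples @ "
              f"{p.input_sample_rate_hz} Hz -> {p.output_chunk_size_samples} "
              f"samples @ {p.output_sample_rate_hz} Hz")
```

Note how the BWE-style entry keeps the chunk *duration* constant: 80 samples at 8 kHz and 240 samples at 24 kHz are both 10 ms, which is why input and output chunk sizes can differ while the stream stays aligned.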

Overall flow

Whether you're doing real-time enhancement or batch processing, the steps are the same:

  1. Get model parameters by name (from the list of available names).
  2. Initialize the SDK with those parameters.
  3. Feed in audio at the model's input sample rate, in chunks of exactly input_chunk_size_samples.
  4. Call process on each chunk and use the enhanced output (length output_chunk_size_samples at the model's output sample rate) however you need: playback, recording, streaming, etc.
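The four steps above amount to a simple loop. Here is a runnable sketch in which init and process are mocked (the real SDK enhances audio; this stand-in only enforces the chunk-size contract), and the chunk sizes and sample rates are assumed values, not real model parameters:

```python
# Sketch of the enhancement loop with a mocked SDK. IN_CHUNK/OUT_CHUNK are
# illustrative values for a BWE-style model (output chunk larger than input).

IN_CHUNK = 80    # assumed input_chunk_size_samples (e.g. 10 ms at 8 kHz)
OUT_CHUNK = 240  # assumed output_chunk_size_samples (e.g. 10 ms at 24 kHz)

class MockSdk:
    def init(self, in_chunk: int, out_chunk: int) -> None:
        # Step 2: initialize once, before any processing.
        self.in_chunk, self.out_chunk = in_chunk, out_chunk

    def process(self, chunk: list[float]) -> list[float]:
        # The real SDK enhances the chunk; here we only check the size
        # contract and return a correctly sized dummy output.
        assert len(chunk) == self.in_chunk, "process needs exact-size chunks"
        return [0.0] * self.out_chunk

def enhance(sdk: MockSdk, audio: list[float]) -> list[float]:
    out: list[float] = []
    # Steps 3-4: feed exact-size chunks and collect the output. A real
    # pipeline would buffer any trailing partial chunk until it fills.
    for start in range(0, len(audio) - IN_CHUNK + 1, IN_CHUNK):
        out.extend(sdk.process(audio[start:start + IN_CHUNK]))
    return out

if __name__ == "__main__":
    sdk = MockSdk()
    sdk.init(IN_CHUNK, OUT_CHUNK)
    one_second_8k = [0.0] * 8_000   # 1 s of silence at the input rate
    enhanced = enhance(sdk, one_second_8k)
    print(len(enhanced))            # 100 chunks of 240 samples = 24,000
```

The same loop serves both real-time use (chunks arrive from a capture callback) and batch use (chunks are sliced from a file); only where the audio comes from and goes to changes.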

In short: list the available models, get the parameters for the one you want, initialize once, then process fixed-size chunks.

Get the SDK

Ready to integrate? Reach out via our contact form and we'll get you set up with the SDK and the right model assets for your use case.