Usage Examples for the SDK

Introduction

Welcome to the Revoize SDK usage examples! This document is here to help you get started with the SDK by providing practical examples that demonstrate its capabilities in various scenarios.

Example #1: Processing a Single Audio File

This example shows how to process a single audio file with the Revoize SDK in a minimal setup: we load a WAV file from disk, process it with the SDK, and save the enhanced audio to a new file.

WARNING

The input and output file paths, model type, and chunk size are hardcoded in this example. You may need to modify the input file path to match the location of your audio file. The input WAV file must be recorded at 48 kHz. Audio is processed in 480-sample chunks.
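
To make the numbers in the warning concrete, here is a small sketch of the chunk arithmetic. The helper names chunk_duration_ms and chunk_layout are hypothetical and are not part of the Revoize SDK; they only illustrate what a 480-sample chunk means at 48 kHz.

```rust
/// Duration of one chunk in milliseconds for a given chunk size and sample rate.
fn chunk_duration_ms(chunk_size: usize, sample_rate: u32) -> f64 {
    chunk_size as f64 * 1000.0 / sample_rate as f64
}

/// Number of full chunks in `total_samples`, plus the leftover samples
/// that do not fill a complete chunk.
fn chunk_layout(total_samples: usize, chunk_size: usize) -> (usize, usize) {
    (total_samples / chunk_size, total_samples % chunk_size)
}
```

With these helpers, chunk_duration_ms(480, 48_000) evaluates to 10.0, so each chunk covers 10 ms of audio, and chunk_layout(48_100, 480) gives 100 full chunks with 100 samples left over.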

[Figure: general sequence diagram for this example]


Import Statements

First, we import the items needed to use the Revoize SDK and process audio files. From the Rust standard library we use the std::error::Error trait to handle errors and the std::path::Path type to work with file paths. From the hound crate we use the types needed to read and write WAV files.

rust
use std::error::Error;
use std::path::Path;
use revoize_sdk::{init, process, ModelType};
use hound::{WavReader, WavWriter, WavSpec, SampleFormat};

The most important line from the Revoize SDK usage perspective is:

rust
use revoize_sdk::{init, process, ModelType};

This brings the Revoize SDK functions init and process, as well as the ModelType enum, into scope.


Main Function Signature

This example is simple, so we only need a main function to run the code.

rust
fn main() -> Result<(), Box<dyn Error>> {
    ...

This function does not take any arguments and returns a Result type that can either be Ok(()) if the operation was successful or an error if something went wrong. In a real-life application, you would probably want to pass a list of input arguments like the path to the input file, the path to the output file, etc., but this is out of scope for this example.
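
For readers who do want command-line arguments, the following is one possible sketch using only the standard library. The helper name parse_paths and the usage message are illustrative assumptions and are not part of the original example or the SDK.

```rust
/// Hypothetical helper: reads `<input.wav> <output.wav>` from an argument iterator.
/// The first item is expected to be the program name, as with `std::env::args()`.
fn parse_paths(mut args: impl Iterator<Item = String>) -> Result<(String, String), String> {
    let program = args.next().unwrap_or_else(|| "example".to_string());
    match (args.next(), args.next()) {
        (Some(input), Some(output)) => Ok((input, output)),
        // Fewer than two paths were supplied.
        _ => Err(format!("usage: {} <input.wav> <output.wav>", program)),
    }
}
```

Inside main, this could be called as parse_paths(std::env::args())? because a String error converts into Box<dyn Error>.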


Define Some Hardcoded Values

To keep this example minimal, we hardcode the following values:

  • the input WAV file path
  • the output WAV file path
  • the chunk size

rust
    // Input WAV file path
    let input_wav = "input.wav";
    // Output WAV file path
    let output_wav = "output.wav";
    // Chunk size (480 samples)
    let chunk_size = 480;

Initialize the Revoize SDK

Before we can start processing audio, we need to initialize the Revoize SDK by calling the init function with the desired model type.

rust
    // Model Type is hardcoded to Capella.
    init(ModelType::Capella)?;

There are various model types available in the Revoize SDK, but for this example, we are using the Capella model, which is a lightweight discriminative model suitable for general denoising tasks.


Load the Input WAV File

Next, we need to load the input WAV file from disk. We use the WavReader from the hound crate to read the WAV file.

rust
    let mut reader = WavReader::open(&input_wav)?;
    let spec = reader.spec();

The spec variable contains the specifications of the input WAV file, such as the number of channels, sample rate, and sample format. We can use it to get some basic information about the audio file.
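
Since the example requires 48 kHz input, one use of spec is to reject other sample rates before processing. The sketch below is an optional guard, not something the original example does; check_sample_rate and EXPECTED_SAMPLE_RATE are hypothetical names.

```rust
/// Sample rate the example assumes for its input (48 kHz).
const EXPECTED_SAMPLE_RATE: u32 = 48_000;

/// Hypothetical guard: returns an error for audio that is not 48 kHz.
fn check_sample_rate(sample_rate: u32) -> Result<(), String> {
    if sample_rate == EXPECTED_SAMPLE_RATE {
        Ok(())
    } else {
        Err(format!(
            "expected {} Hz input, got {} Hz",
            EXPECTED_SAMPLE_RATE, sample_rate
        ))
    }
}
```

In the example, this could be called as check_sample_rate(spec.sample_rate)? right after reading the spec.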

Convert Audio Samples to f32

The audio samples are read from the WAV file as integers (either 16-bit or 32-bit) or as floats (32-bit). We need to convert them to f32 format to make them compatible with the Revoize SDK's process function.

rust
    let audio_samples: Vec<f32> = match (spec.sample_format, spec.bits_per_sample) {
        (SampleFormat::Int, 16) => {
            let max_val = i16::MAX as f32;
            reader.samples::<i16>().map(|s| s.unwrap() as f32 / max_val).collect()
        }
        (SampleFormat::Int, 32) => {
            let max_val = i32::MAX as f32;
            reader.samples::<i32>().map(|s| s.unwrap() as f32 / max_val).collect()
        }
        (SampleFormat::Float, 32) => {
            reader.samples::<f32>().map(|s| s.unwrap()).collect()
        }
        _ => return Err("Unsupported WAV format".into())
    };

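This code snippet reads the audio samples from the WAV file and converts them to f32 format. It handles 16-bit and 32-bit integer samples as well as 32-bit float samples. Integer samples are divided by the maximum value of their type, so they fall approximately within the range -1.0 to 1.0. Any other format returns an error. Note that s.unwrap() panics if a sample cannot be decoded.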
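
As an alternative to unwrap, decoding errors can be propagated to the caller. The sketch below shows the idea for 16-bit samples with a generic error type; i16_to_f32 is a hypothetical helper, and it assumes the sample iterator yields Result values, as hound's samples iterator does.

```rust
/// Converts decoded 16-bit samples to f32 by dividing by i16::MAX,
/// mirroring the scaling used in the example.
/// Returns the first decoding error instead of panicking.
fn i16_to_f32<I, E>(samples: I) -> Result<Vec<f32>, E>
where
    I: Iterator<Item = Result<i16, E>>,
{
    let max_val = i16::MAX as f32;
    // Collecting into Result stops at the first Err and returns it.
    samples.map(|s| s.map(|v| v as f32 / max_val)).collect()
}
```

In the example, the 16-bit arm could then read i16_to_f32(reader.samples::<i16>())?. Because the divisor is i16::MAX, the most negative sample value (i16::MIN) maps to slightly below -1.0.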


Process the Audio in Chunks

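Now that we have the audio samples in f32 format, we can process them in chunks using the Revoize SDK. We iterate over the audio samples in chunks of chunk_size and process each chunk using the process function. Because num_chunks is computed with integer division, any trailing samples that do not fill a complete chunk are not processed and do not appear in the output.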

rust
    let mut processed_audio = Vec::new();
    let num_chunks = audio_samples.len() / chunk_size;
    for i in 0..num_chunks {
        let start = i * chunk_size;
        let end = start + chunk_size;
        let chunk = &audio_samples[start..end];
        // Process each chunk using Revoize SDK
        let output_chunk = process(chunk)?;
        processed_audio.extend(output_chunk);
    }

The process function takes an input audio chunk and returns the processed audio chunk.

rust
        let output_chunk = process(chunk)?;

We store the processed audio chunks in a vector called processed_audio.

rust
        processed_audio.extend(output_chunk);

This is not the most memory-efficient approach, because all processed audio stays in memory until the entire WAV file has been processed. In a real application, you would probably write each processed chunk to the output file as soon as it is ready.
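
A minimal sketch of that streaming idea follows, using only the standard library. The function process_streaming is hypothetical: process_chunk stands in for the SDK's process function (its exact signature is an assumption here), and sink stands in for whatever consumes the output, such as a WAV writer. The sketch also zero-pads a short final chunk rather than dropping it. Whether zero-padding suits the SDK is an assumption you should verify.

```rust
/// Runs `process_chunk` over `samples` in blocks of `chunk_size` (must be > 0)
/// and hands each result to `sink` immediately, so no full output buffer is kept.
/// A short final block is zero-padded to `chunk_size` before processing.
fn process_streaming<P, S, E>(
    samples: &[f32],
    chunk_size: usize,
    mut process_chunk: P,
    mut sink: S,
) -> Result<(), E>
where
    P: FnMut(&[f32]) -> Result<Vec<f32>, E>,
    S: FnMut(&[f32]) -> Result<(), E>,
{
    for chunk in samples.chunks(chunk_size) {
        let output = if chunk.len() == chunk_size {
            process_chunk(chunk)?
        } else {
            // Only the last chunk can be short: copy it into a zeroed buffer.
            let mut padded = vec![0.0f32; chunk_size];
            padded[..chunk.len()].copy_from_slice(chunk);
            process_chunk(&padded)?
        };
        // Forward the processed block right away instead of accumulating it.
        sink(&output)?;
    }
    Ok(())
}
```

With padding, the output can be longer than the input by up to chunk_size - 1 samples. If that matters, the caller can truncate the final block before writing it.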


Save the Processed Audio to a New WAV File

Finally, we save the processed audio to a new WAV file. We create a WavWriter that keeps the channel count and sample rate of the input file but always writes 32-bit float samples, and then write the processed audio samples to the output file.

rust
    let out_spec = WavSpec {
        channels: spec.channels,
        sample_rate: spec.sample_rate,
        bits_per_sample: 32,
        sample_format: SampleFormat::Float,
    };

    let mut writer = WavWriter::create(&output_wav, out_spec)?;
    for sample in processed_audio {
        writer.write_sample(sample)?;
    }
    writer.finalize()?;
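
The example always writes 32-bit float output. If 16-bit PCM output is preferred instead (an assumption, not something the original example does), each sample has to be clamped and scaled back to the integer range first. The helper f32_to_i16 below is a hypothetical sketch of that conversion.

```rust
/// Converts a normalized f32 sample to 16-bit PCM.
/// Values outside [-1.0, 1.0] are clamped before scaling by i16::MAX.
fn f32_to_i16(sample: f32) -> i16 {
    let clamped = sample.clamp(-1.0, 1.0);
    (clamped * i16::MAX as f32).round() as i16
}
```

To use it, the output spec would set bits_per_sample to 16 and sample_format to SampleFormat::Int, and the loop would call writer.write_sample(f32_to_i16(sample))?.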

Full Code Example

Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.

rust
use std::error::Error;
use std::path::Path;
use revoize_sdk::{init, process, ModelType};
use hound::{WavReader, WavWriter, WavSpec, SampleFormat};

fn main() -> Result<(), Box<dyn Error>> {
    // -------------------------------------------------
    // 1. Hardcoded parameters and initialization
    // -------------------------------------------------
    // Input WAV file path
    let input_wav = "input.wav";
    // Output WAV file path
    let output_wav = "output.wav";
    // Chunk size (480 samples)
    let chunk_size = 480;
    // Model Type is hardcoded to Capella.
    init(ModelType::Capella)?;

    // -------------------------------------------------
    // 2. Load the input WAV file
    // -------------------------------------------------
    let mut reader = WavReader::open(&input_wav)?;
    let spec = reader.spec();

    // Convert samples to f32 (Assumes sample_rate is 48000)
    let audio_samples: Vec<f32> = match (spec.sample_format, spec.bits_per_sample) {
        (SampleFormat::Int, 16) => {
            let max_val = i16::MAX as f32;
            reader.samples::<i16>().map(|s| s.unwrap() as f32 / max_val).collect()
        }
        (SampleFormat::Int, 32) => {
            let max_val = i32::MAX as f32;
            reader.samples::<i32>().map(|s| s.unwrap() as f32 / max_val).collect()
        }
        (SampleFormat::Float, 32) => {
            reader.samples::<f32>().map(|s| s.unwrap()).collect()
        }
        _ => return Err("Unsupported WAV format".into())
    };

    // -------------------------------------------------
    // 3. Process the audio in chunks
    // -------------------------------------------------
    let mut processed_audio = Vec::new();
    let num_chunks = audio_samples.len() / chunk_size;
    for i in 0..num_chunks {
        let start = i * chunk_size;
        let end = start + chunk_size;
        let chunk = &audio_samples[start..end];
        // Process each chunk using Revoize SDK
        let output_chunk = process(chunk)?;
        processed_audio.extend(output_chunk);
    }

    // -------------------------------------------------
    // 4. Save the processed audio to a new WAV file
    // -------------------------------------------------
    let out_spec = WavSpec {
        channels: spec.channels,
        sample_rate: spec.sample_rate,
        bits_per_sample: 32,
        sample_format: SampleFormat::Float,
    };

    let mut writer = WavWriter::create(&output_wav, out_spec)?;
    for sample in processed_audio {
        writer.write_sample(sample)?;
    }
    writer.finalize()?;

    Ok(())
}

Example #2: Real-time Speech Enhancement

Coming Soon.