Usage Examples for the SDK
Introduction
Welcome to the Revoize SDK usage examples! This document is here to help you get started with the SDK by providing practical examples that demonstrate its capabilities in various scenarios.
Example #1: Processing a Single Audio File
In this example, we'll show you how to process a single WAV file with the Revoize SDK in a minimal setup. We'll load the audio file from disk, process it with the SDK, and save the enhanced audio to a new file.
WARNING
The input and output file paths, model type, and chunk size are hardcoded in this example. You may need to modify the input file path to match the location of your audio file. The input WAV file must be recorded at 48 kHz. Audio is processed in 480-sample chunks.
Here's a general sequence diagram for this example:
Import Statements
First, we need to import the necessary modules and libraries to use the Revoize SDK and process audio files. We'll use the standard library's std::error::Error trait to handle errors, the standard library's std::path::Path type to work with file paths, and the hound crate to read and write WAV files.
use std::error::Error;
use std::path::Path;
use revoize_sdk::{init, process, ModelType};
use hound::{WavReader, WavWriter, WavSpec, SampleFormat};
The most important line from the Revoize SDK usage perspective is:
use revoize_sdk::{init, process, ModelType};
This ensures we can use the Revoize SDK functions init and process as well as the ModelType enum.
Main Function Signature
This example is simple, so we only need a main function to run the code.
fn main() -> Result<(), Box<dyn Error>> {
...
This function does not take any arguments and returns a Result type that is either Ok(()) if the operation was successful or an error if something went wrong. In a real-life application, you would probably want to pass input arguments such as the path to the input file and the path to the output file, but this is out of scope for this example.
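For readers who want to go one step further, reading the two paths from the command line could be sketched with the standard library alone. The helper name parse_paths and the convention of two positional arguments are illustrative assumptions, not part of the Revoize SDK.

```rust
/// Hypothetical helper (not part of the Revoize SDK): extracts the input and
/// output paths from an argument list whose first item is the program name.
fn parse_paths<I>(args: I) -> Result<(String, String), String>
where
    I: Iterator<Item = String>,
{
    // Fuse the iterator so that it keeps returning None once exhausted.
    let mut args = args.fuse();
    // The first item is the program name; fall back to a placeholder if absent.
    let program = args.next().unwrap_or_else(|| "example".to_string());
    match (args.next(), args.next(), args.next()) {
        // Exactly two positional arguments: input path, then output path.
        (Some(input), Some(output), None) => Ok((input, output)),
        _ => Err(format!("usage: {} <input.wav> <output.wav>", program)),
    }
}
```

Inside main, a call such as parse_paths(std::env::args())? would then replace the hardcoded paths, since a String error converts into Box<dyn Error>.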
Define Some Hardcoded Values
To keep this example minimal, we hardcode the following values:
- the input WAV file
- the output WAV file
- the chunk size
// Input WAV file path
let input_wav = "input.wav";
// Output WAV file path
let output_wav = "output.wav";
// Chunk size (480 samples)
let chunk_size = 480;
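To put the chunk size in perspective: at the 48 kHz sample rate this example requires, 480 samples correspond to 10 ms of mono audio. The sketch below works through that arithmetic. The helper names are illustrative and not part of the SDK.

```rust
/// Duration of one chunk in milliseconds for a given sample rate (mono audio).
fn chunk_duration_ms(chunk_size: usize, sample_rate: u32) -> f64 {
    chunk_size as f64 * 1000.0 / sample_rate as f64
}

/// Number of whole chunks that fit in `total_samples`.
/// Samples left over after the last whole chunk are not counted.
fn whole_chunks(total_samples: usize, chunk_size: usize) -> usize {
    total_samples / chunk_size
}
```

With these definitions, chunk_duration_ms(480, 48_000) evaluates to 10.0 and whole_chunks(48_000, 480) to 100, so one second of mono input is handled as 100 chunks.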
Initialize the Revoize SDK
Before we can start processing audio, we need to initialize the Revoize SDK by calling the init function with the desired model type.
// Model Type is hardcoded to Capella.
init(ModelType::Capella)?;
There are various model types available in the Revoize SDK, but for this example, we are using the Capella model, which is a lightweight discriminative model suitable for general denoising tasks.
Load the Input WAV File
Next, we need to load the input WAV file from disk. We use the WavReader from the hound crate to read the WAV file.
let mut reader = WavReader::open(&input_wav)?;
let spec = reader.spec();
The spec variable contains the specifications of the input WAV file, such as the number of channels, sample rate, and sample format. We can use it to get some basic information about the audio file.
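The example itself never verifies the 48 kHz requirement stated in the warning, so one use of spec is an early check. The following is a minimal sketch; the function and constant names are hypothetical and not part of the SDK or of hound.

```rust
/// Sample rate this example expects, per the warning at the top of the example.
const EXPECTED_SAMPLE_RATE: u32 = 48_000;

/// Hypothetical guard (not an SDK function): rejects input recorded at any other rate.
fn check_sample_rate(sample_rate: u32) -> Result<(), String> {
    if sample_rate == EXPECTED_SAMPLE_RATE {
        Ok(())
    } else {
        Err(format!(
            "expected a {} Hz WAV file, got {} Hz",
            EXPECTED_SAMPLE_RATE, sample_rate
        ))
    }
}
```

In main, calling check_sample_rate(spec.sample_rate)? right after reading spec would stop the program before any audio is processed.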
Convert Audio Samples to f32
The audio samples are read from the WAV file as integers (either 16-bit or 32-bit) or as floats (32-bit). We need to convert them to f32 format to make them compatible with the Revoize SDK's process function.
let audio_samples: Vec<f32> = match (spec.sample_format, spec.bits_per_sample) {
    (SampleFormat::Int, 16) => {
        let max_val = i16::MAX as f32;
        reader.samples::<i16>().map(|s| s.unwrap() as f32 / max_val).collect()
    }
    (SampleFormat::Int, 32) => {
        let max_val = i32::MAX as f32;
        reader.samples::<i32>().map(|s| s.unwrap() as f32 / max_val).collect()
    }
    (SampleFormat::Float, 32) => {
        reader.samples::<f32>().map(|s| s.unwrap()).collect()
    }
    _ => return Err("Unsupported WAV format".into())
};
This code snippet reads the audio samples from the WAV file and converts them to f32 format. It handles different sample formats (integer or float) and different bit depths (16-bit or 32-bit).
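The scaling used for integer samples can be looked at in isolation. The sketch below repeats the division by i16::MAX from the snippet above and, for comparison, a common alternative that divides by 2^15. The function names are illustrative, and which convention the SDK expects is not stated here, so treat the second function as an assumption.

```rust
/// Scales a 16-bit sample the same way as the example: divide by i16::MAX.
/// i16::MAX maps to exactly 1.0, while i16::MIN lands slightly below -1.0.
fn i16_to_f32(sample: i16) -> f32 {
    sample as f32 / i16::MAX as f32
}

/// Alternative convention (an assumption, not taken from the SDK):
/// divide by 2^15 so that every i16 value lands in [-1.0, 1.0).
fn i16_to_f32_pow2(sample: i16) -> f32 {
    sample as f32 / 32768.0
}
```

The difference only shows at the extremes: the first variant reaches 1.0 but can dip just under -1.0, while the second reaches -1.0 but never quite 1.0.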
Process the Audio in Chunks
Now that we have the audio samples in f32
format, we can process them in chunks using the Revoize SDK. We iterate over the audio samples in chunks of chunk_size
and process each chunk using the process
function.
let mut processed_audio = Vec::new();
let num_chunks = audio_samples.len() / chunk_size;
for i in 0..num_chunks {
    let start = i * chunk_size;
    let end = start + chunk_size;
    let chunk = &audio_samples[start..end];
    // Process each chunk using Revoize SDK
    let output_chunk = process(chunk)?;
    processed_audio.extend(output_chunk);
}
The process function takes an input audio chunk and returns the processed audio chunk.
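Because num_chunks is computed with integer division, the loop above skips any trailing samples that do not fill a whole chunk. One possible way to keep them is to zero-pad the last chunk, sketched below. Here process_stub is a stand-in that merely copies its input so the sketch compiles on its own; it is not the SDK's process function, and the sketch assumes that each output chunk has the same length as its input chunk.

```rust
/// Stand-in for the SDK's `process` so the sketch compiles on its own.
/// It simply copies the chunk; the real function would enhance the audio.
fn process_stub(chunk: &[f32]) -> Result<Vec<f32>, String> {
    Ok(chunk.to_vec())
}

/// Processes every sample, zero-padding the final chunk when the input length
/// is not a multiple of `chunk_size`, then trims the output to the input length.
/// Panics if `chunk_size` is 0 (a requirement of `chunks_exact`).
fn process_all(samples: &[f32], chunk_size: usize) -> Result<Vec<f32>, String> {
    let mut out = Vec::with_capacity(samples.len());
    let mut chunks = samples.chunks_exact(chunk_size);
    // Whole chunks first, exactly as in the original loop.
    for chunk in &mut chunks {
        out.extend(process_stub(chunk)?);
    }
    // Whatever is left is shorter than `chunk_size`.
    let rest = chunks.remainder();
    if !rest.is_empty() {
        let mut padded = rest.to_vec();
        padded.resize(chunk_size, 0.0);
        out.extend(process_stub(&padded)?);
        // Drop the samples that correspond to the padding.
        out.truncate(samples.len());
    }
    Ok(out)
}
```

Whether zero-padding is acceptable depends on the model; if the tail of the recording does not matter, dropping it as the original loop does is simpler.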
We store the processed audio chunks in a vector called processed_audio.
This may not be the most efficient way to process audio since we will be keeping all the processed audio in memory until we go through the entire WAV file. In real-life scenarios, you would probably want to save the processed audio chunks to the output file as soon as they are processed.
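The streaming approach described above could be sketched as follows. To keep the block self-contained, the SDK call and the file writer are passed in as closures: process stands in for the SDK's process function and sink stands in for writing samples to the output file. Both parameter names and the function name are illustrative.

```rust
/// Processes `samples` chunk by chunk and hands each processed chunk to `sink`
/// immediately, so only one chunk of output is held in memory at a time.
/// Returns the number of samples passed to `sink`.
fn process_streaming<P, S, E>(
    samples: &[f32],
    chunk_size: usize,
    mut process: P,
    mut sink: S,
) -> Result<usize, E>
where
    P: FnMut(&[f32]) -> Result<Vec<f32>, E>,
    S: FnMut(&[f32]) -> Result<(), E>,
{
    let mut written = 0;
    // Like the original loop, this visits whole chunks only.
    for chunk in samples.chunks_exact(chunk_size) {
        let output_chunk = process(chunk)?;
        sink(&output_chunk)?;
        written += output_chunk.len();
    }
    Ok(written)
}
```

With hound, the sink closure would call writer.write_sample for each sample of the chunk. Note that this sketch still receives the whole input as one slice; reading the input incrementally would be a separate change.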
Save the Processed Audio to a New WAV File
Finally, we save the processed audio to a new WAV file. We create a WavWriter that keeps the channel count and sample rate of the input WAV file but always stores samples as 32-bit floats, and write the processed audio samples to the output file.
let out_spec = WavSpec {
    channels: spec.channels,
    sample_rate: spec.sample_rate,
    bits_per_sample: 32,
    sample_format: SampleFormat::Float,
};
let mut writer = WavWriter::create(&output_wav, out_spec)?;
for sample in processed_audio {
    writer.write_sample(sample)?;
}
writer.finalize()?;
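The example always writes 32-bit float samples. If a 16-bit PCM output file were preferred instead, each sample would need converting back to an integer first. The helper below is a hypothetical sketch of that conversion and is not part of the SDK; the output WavSpec would also need bits_per_sample: 16 and sample_format: SampleFormat::Int.

```rust
/// Converts a float sample to 16-bit PCM. The value is clamped to [-1.0, 1.0]
/// first so that out-of-range samples saturate instead of wrapping around.
fn f32_to_i16(sample: f32) -> i16 {
    let clamped = sample.clamp(-1.0, 1.0);
    (clamped * i16::MAX as f32).round() as i16
}
```

This mirrors the division by i16::MAX used when reading 16-bit input, so a sample that passes through unchanged maps back to its original integer value.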
Full Code Example
Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.
use std::error::Error;
use std::path::Path;
use revoize_sdk::{init, process, ModelType};
use hound::{WavReader, WavWriter, WavSpec, SampleFormat};
fn main() -> Result<(), Box<dyn Error>> {
    // -------------------------------------------------
    // 1. Hardcoded parameters and initialization
    // -------------------------------------------------
    // Input WAV file path
    let input_wav = "input.wav";
    // Output WAV file path
    let output_wav = "output.wav";
    // Chunk size (480 samples)
    let chunk_size = 480;
    // Model Type is hardcoded to Capella.
    init(ModelType::Capella)?;
    // -------------------------------------------------
    // 2. Load the input WAV file
    // -------------------------------------------------
    let mut reader = WavReader::open(&input_wav)?;
    let spec = reader.spec();
    // Convert samples to f32 (Assumes sample_rate is 48000)
    let audio_samples: Vec<f32> = match (spec.sample_format, spec.bits_per_sample) {
        (SampleFormat::Int, 16) => {
            let max_val = i16::MAX as f32;
            reader.samples::<i16>().map(|s| s.unwrap() as f32 / max_val).collect()
        }
        (SampleFormat::Int, 32) => {
            let max_val = i32::MAX as f32;
            reader.samples::<i32>().map(|s| s.unwrap() as f32 / max_val).collect()
        }
        (SampleFormat::Float, 32) => {
            reader.samples::<f32>().map(|s| s.unwrap()).collect()
        }
        _ => return Err("Unsupported WAV format".into())
    };
    // -------------------------------------------------
    // 3. Process the audio in chunks
    // -------------------------------------------------
    let mut processed_audio = Vec::new();
    let num_chunks = audio_samples.len() / chunk_size;
    for i in 0..num_chunks {
        let start = i * chunk_size;
        let end = start + chunk_size;
        let chunk = &audio_samples[start..end];
        // Process each chunk using Revoize SDK
        let output_chunk = process(chunk)?;
        processed_audio.extend(output_chunk);
    }
    // -------------------------------------------------
    // 4. Save the processed audio to a new WAV file
    // -------------------------------------------------
    let out_spec = WavSpec {
        channels: spec.channels,
        sample_rate: spec.sample_rate,
        bits_per_sample: 32,
        sample_format: SampleFormat::Float,
    };
    let mut writer = WavWriter::create(&output_wav, out_spec)?;
    for sample in processed_audio {
        writer.write_sample(sample)?;
    }
    writer.finalize()?;
    Ok(())
}
Example #2: Real-time Speech Enhancement
Coming Soon.