Usage Examples for the Revoize SDK in C++
Introduction
Welcome to the Revoize SDK usage examples! This document helps you get started with the SDK through practical examples that demonstrate its capabilities in different scenarios.
Example #1: Processing a Single Audio File
In this example, we'll show you how to process a single WAV file using the Revoize SDK in a minimal setup. We'll load the audio file from disk, process it with the SDK, and save the enhanced audio to a new file.
WARNING
The input and output file paths, model type, and chunk size are hardcoded in this example. You may need to modify the input file path to match the location of your audio file. The input WAV file must be recorded at 48 kHz. The helper functions in this example also assume a mono (single-channel) file. Audio is processed in 480-sample chunks.
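To make these constraints concrete, the arithmetic can be sketched as follows. At 48 kHz, a 480-sample chunk covers 10 ms of audio. The helper names (`chunkDurationMs`, `fullChunkCount`, `leftoverSamples`) are illustrative assumptions and are not part of the Revoize SDK.

```cpp
#include <cstddef>

// Illustrative helpers only; none of these names come from the Revoize SDK.

// Duration of one chunk in milliseconds: chunk_size / sample_rate * 1000.
constexpr double chunkDurationMs(std::size_t chunk_size, std::size_t sample_rate) {
    return 1000.0 * static_cast<double>(chunk_size) / static_cast<double>(sample_rate);
}

// Number of complete chunks that fit into total_samples.
constexpr std::size_t fullChunkCount(std::size_t total_samples, std::size_t chunk_size) {
    return total_samples / chunk_size;
}

// Samples left over after the last complete chunk.
constexpr std::size_t leftoverSamples(std::size_t total_samples, std::size_t chunk_size) {
    return total_samples % chunk_size;
}
```

For a one-second file (48,000 samples), `fullChunkCount(48000, 480)` is 100 and `leftoverSamples(48000, 480)` is 0. A file of 48,100 samples leaves 100 samples over, which this example skips.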
Here's a general sequence diagram for this example:
Include Statements
First, we need to include the header files required to use the Revoize SDK and to process audio files. We'll use sndfile.h (from libsndfile) for reading and writing WAV files.
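#include "revoize_sdk.hpp"
#include <iostream>   // std::cerr, used for error reporting in main
#include <stdexcept>  // std::runtime_error, thrown by the helper functions
#include <vector>
#include <string>
#include <sndfile.h>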
The most important line from the Revoize SDK usage perspective is:
#include "revoize_sdk.hpp"
This ensures we can use the Revoize SDK functions init and process, as well as the ModelType enum.
Helper Functions
Before we start processing audio, we need some helper functions to read and write WAV files.
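std::vector<float> readWavFile(const std::string& filename) {
    SF_INFO sf_info = {};  // libsndfile expects a zeroed SF_INFO when opening for reading
    SNDFILE* file = sf_open(filename.c_str(), SFM_READ, &sf_info);
    if (!file) {
        throw std::runtime_error(std::string("Could not open file: ") + filename +
                                 " Error: " + sf_strerror(NULL));
    }
    // The buffer below holds one float per frame, so only mono input is supported
    if (sf_info.channels != 1) {
        sf_close(file);
        throw std::runtime_error("Expected a mono WAV file: " + filename);
    }
    std::vector<float> samples(static_cast<size_t>(sf_info.frames));
    sf_count_t frames_read = sf_readf_float(file, samples.data(), sf_info.frames);
    if (frames_read != sf_info.frames) {
        sf_close(file);
        throw std::runtime_error("Failed to read audio data");
    }
    sf_close(file);
    return samples;
}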
void writeWavFile(const std::string& filename, const std::vector<float>& samples) {
    SF_INFO sf_info = {};  // zero-initialize so unused fields are not left indeterminate
    sf_info.samplerate = 48000;
    sf_info.channels = 1;
    sf_info.format = SF_FORMAT_WAV | SF_FORMAT_FLOAT;
    SNDFILE* file = sf_open(filename.c_str(), SFM_WRITE, &sf_info);
    if (!file) {
        throw std::runtime_error(std::string("Could not create file: ") + filename +
                                 " Error: " + sf_strerror(NULL));
    }
    // Compare like with like: sf_writef_float returns a signed sf_count_t
    const sf_count_t frame_count = static_cast<sf_count_t>(samples.size());
    sf_count_t frames_written = sf_writef_float(file, samples.data(), frame_count);
    if (frames_written != frame_count) {
        sf_close(file);
        throw std::runtime_error("Failed to write audio data");
    }
    sf_close(file);
}
Define Some Hardcoded Values
To keep this example minimal, we hardcode the following values:
- the path to the input WAV file
- the path to the output WAV file
- the chunk size
const std::string input_file = "input.wav";
const std::string output_file = "output.wav";
const size_t chunk_size = 480;
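If you would rather not edit the source to change the file paths, one option is to take them from the command line and fall back to the hardcoded defaults. The sketch below is only an illustration: `ExampleSettings` and `settingsFromArgs` are hypothetical names, not part of the Revoize SDK, and the chunk size is deliberately left fixed at 480.

```cpp
#include <cstddef>
#include <string>

// Hypothetical settings holder; the names are illustrative, not SDK API.
struct ExampleSettings {
    std::string input_file = "input.wav";
    std::string output_file = "output.wav";
    std::size_t chunk_size = 480;  // fixed by the example, not read from argv
};

// Takes the input path from argv[1] and the output path from argv[2] when
// they are present; otherwise the hardcoded defaults are kept.
ExampleSettings settingsFromArgs(int argc, const char* const argv[]) {
    ExampleSettings settings;
    if (argc > 1) settings.input_file = argv[1];
    if (argc > 2) settings.output_file = argv[2];
    return settings;
}
```

With this helper, `main` would be declared as `int main(int argc, char* argv[])` and would call `settingsFromArgs(argc, argv)` before loading the input file.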
Initialize the Revoize SDK
Before we can start processing audio, we need to initialize the Revoize SDK by calling the init function with the desired model type.
// Initialize with Capella
revoize_sdk::init(revoize_sdk::ModelType::Capella);
There are various model types available in the Revoize SDK, but for this example, we are using the Capella model, which is a lightweight discriminative model suitable for general denoising tasks.
Load the Input WAV File
Next, we need to load the input WAV file from disk. We use the readWavFile helper function to read the WAV file.
// Read input WAV file
std::vector<float> input_samples = readWavFile(input_file);
The readWavFile function returns a vector of audio samples. The samples are stored as float values to make them compatible with the Revoize SDK's process function.
Process the Audio in Chunks
Now that we have the audio samples, we can process them in chunks using the Revoize SDK. We iterate over the audio samples in chunks of 480 samples and process each chunk using the process function. Any trailing samples that do not fill a complete chunk are skipped.
// Process the audio in chunks
std::vector<float> processed_audio;
processed_audio.reserve(input_samples.size());
for (size_t i = 0; i < input_samples.size(); i += chunk_size) {
    // If we have fewer than 480 samples left, skip them
    if (i + chunk_size > input_samples.size()) {
        break;
    }
    // Process this chunk
    std::vector<float> chunk(input_samples.begin() + i,
                             input_samples.begin() + i + chunk_size);
    std::vector<float> output_chunk = revoize_sdk::process(chunk);
    // Append to output
    processed_audio.insert(processed_audio.end(),
                           output_chunk.begin(),
                           output_chunk.end());
}
The process function takes an input audio chunk, processes it, and returns the enhanced samples. We store all processed chunks in a vector called processed_audio.
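The loop above skips any trailing samples that do not fill a complete 480-sample chunk. If you would rather keep them, one possible approach is to zero-pad the last chunk and trim the output back to the input length. The sketch below is an assumption-laden illustration, not the SDK's recommended method: `processWithPadding` and `ChunkProcessor` are hypothetical names, the processing step is passed in as a callable so the sketch compiles without the SDK, and it assumes the processor returns as many samples as it receives.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in type for a chunk processor such as revoize_sdk::process.
using ChunkProcessor = std::function<std::vector<float>(const std::vector<float>&)>;

// Hypothetical helper: processes `input` in chunks of `chunk_size` samples and
// zero-pads the last chunk instead of skipping it.
// Assumption: the processor returns as many samples as it receives.
std::vector<float> processWithPadding(const std::vector<float>& input,
                                      std::size_t chunk_size,
                                      const ChunkProcessor& process_chunk) {
    std::vector<float> output;
    if (chunk_size == 0) {
        return output;  // nothing sensible to do with an empty chunk size
    }
    output.reserve(input.size() + chunk_size);

    for (std::size_t i = 0; i < input.size(); i += chunk_size) {
        const std::size_t available = std::min(chunk_size, input.size() - i);

        // Copy the available samples; the rest of the chunk stays at 0.0f.
        std::vector<float> chunk(chunk_size, 0.0f);
        std::copy(input.begin() + i, input.begin() + i + available, chunk.begin());

        const std::vector<float> processed = process_chunk(chunk);
        output.insert(output.end(), processed.begin(), processed.end());
    }

    // Drop the samples that correspond to the padding, if any.
    if (output.size() > input.size()) {
        output.resize(input.size());
    }
    return output;
}
```

In the example program, the callable could be a lambda that forwards each chunk to `revoize_sdk::process`, assuming it accepts and returns a `std::vector<float>` as in the loop above. Whether zero-padding the final chunk is acceptable for a given model is something to check against the SDK documentation.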
Save the Processed Audio to a New WAV File
Finally, we save the processed audio to a new WAV file using the writeWavFile helper function.
// Write output WAV file
writeWavFile(output_file, processed_audio);
Full Code Example
Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.
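#include "revoize_sdk.hpp"
#include <iostream>   // std::cerr, used for error reporting in main
#include <stdexcept>  // std::runtime_error, thrown by the helper functions
#include <vector>
#include <string>
#include <sndfile.h>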
std::vector<float> readWavFile(const std::string& filename) {
    SF_INFO sf_info = {};  // libsndfile expects a zeroed SF_INFO when opening for reading
    SNDFILE* file = sf_open(filename.c_str(), SFM_READ, &sf_info);
    if (!file) {
        throw std::runtime_error(std::string("Could not open file: ") + filename +
                                 " Error: " + sf_strerror(NULL));
    }
    // The buffer below holds one float per frame, so only mono input is supported
    if (sf_info.channels != 1) {
        sf_close(file);
        throw std::runtime_error("Expected a mono WAV file: " + filename);
    }
    std::vector<float> samples(static_cast<size_t>(sf_info.frames));
    sf_count_t frames_read = sf_readf_float(file, samples.data(), sf_info.frames);
    if (frames_read != sf_info.frames) {
        sf_close(file);
        throw std::runtime_error("Failed to read audio data");
    }
    sf_close(file);
    return samples;
}

void writeWavFile(const std::string& filename, const std::vector<float>& samples) {
    SF_INFO sf_info = {};  // zero-initialize so unused fields are not left indeterminate
    sf_info.samplerate = 48000;
    sf_info.channels = 1;
    sf_info.format = SF_FORMAT_WAV | SF_FORMAT_FLOAT;
    SNDFILE* file = sf_open(filename.c_str(), SFM_WRITE, &sf_info);
    if (!file) {
        throw std::runtime_error(std::string("Could not create file: ") + filename +
                                 " Error: " + sf_strerror(NULL));
    }
    // Compare like with like: sf_writef_float returns a signed sf_count_t
    const sf_count_t frame_count = static_cast<sf_count_t>(samples.size());
    sf_count_t frames_written = sf_writef_float(file, samples.data(), frame_count);
    if (frames_written != frame_count) {
        sf_close(file);
        throw std::runtime_error("Failed to write audio data");
    }
    sf_close(file);
}

int main() {
    try {
        // -------------------------------------------------
        // 1. Hardcoded parameters and initialization
        // -------------------------------------------------
        const std::string input_file = "input.wav";
        const std::string output_file = "output.wav";
        const size_t chunk_size = 480;
        // Initialize with Capella
        revoize_sdk::init(revoize_sdk::ModelType::Capella);
        // -------------------------------------------------
        // 2. Load the input WAV file
        // -------------------------------------------------
        std::vector<float> input_samples = readWavFile(input_file);
        // -------------------------------------------------
        // 3. Process the audio in chunks
        // -------------------------------------------------
        std::vector<float> processed_audio;
        processed_audio.reserve(input_samples.size());
        for (size_t i = 0; i < input_samples.size(); i += chunk_size) {
            // If we have fewer than 480 samples left, skip them
            if (i + chunk_size > input_samples.size()) {
                break;
            }
            // Process this chunk
            std::vector<float> chunk(input_samples.begin() + i,
                                     input_samples.begin() + i + chunk_size);
            std::vector<float> output_chunk = revoize_sdk::process(chunk);
            // Append to output
            processed_audio.insert(processed_audio.end(),
                                   output_chunk.begin(),
                                   output_chunk.end());
        }
        // -------------------------------------------------
        // 4. Save the processed audio to a new WAV file
        // -------------------------------------------------
        writeWavFile(output_file, processed_audio);
        return 0;
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << std::endl;
        return 1;
    }
}
Example #2: Real-time Speech Enhancement
Coming Soon.