Usage Examples for the Revoize SDK in Python

Example #1: Processing a Single Audio File

This example shows how to process a single WAV file with the Revoize SDK in a minimal setup: we load the audio from disk, process it with the SDK, and save the enhanced audio to a new file.

WARNING

The input and output file paths, model type, and chunk size are hardcoded in this example. You may need to modify the input file path to match the location of your audio file. The input WAV file must be recorded at 48 kHz. Audio is processed in 480-sample chunks.

[Figure: general sequence diagram for this example]

Import Statements

First, we need to import the necessary modules and libraries to use the Revoize SDK and process audio files. We'll use numpy for array operations and soundfile for reading and writing WAV files.

python
import numpy as np
import soundfile as sf
import revoize_sdk

The most important line from the Revoize SDK usage perspective is:

python
import revoize_sdk

This ensures we can use the Revoize SDK functions init and process as well as the ModelType enum.


Main Function

This example is simple, so we'll write it as a main function that can be run directly.

python
def main():
    ...

This function does not take any arguments and returns nothing. In a real-life application, you would probably want to pass a list of input arguments like the path to the input file, the path to the output file, etc., but this is out of scope for this example.
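
As a rough sketch of what passing such arguments could look like, the snippet below uses the standard-library argparse module. The argument names (input_wav, output_wav, --chunk-size) and the parse_args helper are illustrative assumptions and are not part of the Revoize SDK.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments (hypothetical names, for illustration only)."""
    parser = argparse.ArgumentParser(
        description="Enhance a WAV file (illustrative CLI, not part of the SDK)."
    )
    parser.add_argument("input_wav", help="path to the 48 kHz input WAV file")
    parser.add_argument("output_wav", help="path for the enhanced output WAV file")
    parser.add_argument(
        "--chunk-size",
        type=int,
        default=480,
        help="samples per processing chunk (this example uses 480)",
    )
    return parser.parse_args(argv)


# Passing an explicit list makes the parser easy to try without a shell.
args = parse_args(["input.wav", "output.wav"])
print(args.input_wav, args.output_wav, args.chunk_size)
```

With something like this in place, main could accept the parsed values instead of relying on hardcoded ones.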


Define Some Hardcoded Values

To keep this example minimal, we hardcode the following values:

  • the input WAV file path
  • the output WAV file path
  • the chunk size

python
    # Input WAV file path
    input_wav = "input.wav"
    # Output WAV file path
    output_wav = "output.wav"
    # Chunk size (480 samples)
    chunk_size = 480
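
To put the chunk size in perspective, the short calculation below relates 480 samples to the 48 kHz input rate this example requires. The constant names and the chunks_in helper are made up for illustration; only the two numbers come from the example itself.

```python
SAMPLE_RATE_HZ = 48_000  # input sample rate required by this example
CHUNK_SIZE = 480         # samples per chunk used in this example

# Duration of one chunk in milliseconds: 480 / 48000 s = 10 ms
chunk_duration_ms = CHUNK_SIZE * 1000 / SAMPLE_RATE_HZ


def chunks_in(num_samples, chunk_size=CHUNK_SIZE):
    """Return (full_chunks, leftover_samples) for a signal of num_samples."""
    return divmod(num_samples, chunk_size)


print(chunk_duration_ms)   # 10.0
print(chunks_in(48_000))   # (100, 0): one second of audio is exactly 100 chunks
print(chunks_in(48_100))   # (100, 100): 100 samples do not fill a chunk
```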

Initialize the Revoize SDK

Before we can start processing audio, we need to initialize the Revoize SDK by calling the init function with the desired model type.

python
    # Model Type is hardcoded to Capella
    revoize_sdk.init(revoize_sdk.ModelType.CAPELLA)

There are various model types available in the Revoize SDK, but for this example, we are using the CAPELLA model, which is a lightweight discriminative model suitable for general denoising tasks.


Load the Input WAV File

Next, we need to load the input WAV file from disk. We use the soundfile library to read the WAV file.

python
    # Read the WAV file
    audio_samples, sample_rate = sf.read(input_wav)

    # Ensure the audio is mono and float32
    if len(audio_samples.shape) > 1:
        audio_samples = audio_samples.mean(axis=1)
    audio_samples = audio_samples.astype(np.float32)

The soundfile.read() function returns both the audio samples and the sample rate. We ensure the audio is mono by averaging channels if necessary, and convert the samples to float32 format to make them compatible with the Revoize SDK's process function.
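
The loading step does not check the sample rate, even though the example requires 48 kHz input. A minimal sketch of a helper that adds this check to the same mono and float32 preparation might look as follows. The prepare_samples name and the EXPECTED_SAMPLE_RATE constant are hypothetical and not part of the Revoize SDK; the snippet runs on a synthetic array so it needs neither the SDK nor a WAV file.

```python
import numpy as np

EXPECTED_SAMPLE_RATE = 48_000  # the example requires 48 kHz input


def prepare_samples(samples, sample_rate):
    """Validate the sample rate, downmix to mono, and convert to float32.

    Hypothetical helper for illustration; not part of the Revoize SDK.
    """
    if sample_rate != EXPECTED_SAMPLE_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLE_RATE} Hz audio, got {sample_rate} Hz"
        )
    samples = np.asarray(samples)
    if samples.ndim > 1:
        # soundfile returns multichannel audio as (frames, channels),
        # so averaging over axis 1 collapses the channels.
        samples = samples.mean(axis=1)
    return samples.astype(np.float32)


# Synthetic stereo signal: 4 frames, 2 channels.
stereo = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, -1.0], [-0.5, 0.5]])
mono = prepare_samples(stereo, 48_000)
print(mono.shape, mono.dtype)  # (4,) float32
print(mono.tolist())           # [0.5, 0.5, -0.5, 0.0]
```

In the example, the values returned by sf.read(input_wav) could be passed straight into such a helper.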


Process the Audio in Chunks

Now that we have the audio samples in float32 format, we can process them in chunks using the Revoize SDK. We iterate over the audio samples in chunks of chunk_size and process each chunk using the process function.

python
    processed_audio = []
    num_chunks = len(audio_samples) // chunk_size

    for i in range(num_chunks):
        start = i * chunk_size
        end = start + chunk_size
        chunk = audio_samples[start:end]
        # Process each chunk using Revoize SDK
        output_chunk = revoize_sdk.process(chunk)
        processed_audio.extend(output_chunk)

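The process function takes an input audio chunk and returns the processed audio chunk. We collect the processed samples in a list called processed_audio. Because num_chunks is computed with integer division, any trailing samples that do not fill a complete chunk are not processed and do not appear in the output.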
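
One possible way to keep trailing samples that do not fill a whole chunk is to zero-pad the final chunk and trim the result back to the original length. The sketch below assumes the processing function returns as many samples as it receives, which this page does not confirm. The process_with_padding helper and its process_fn parameter are hypothetical; process_fn is a stand-in so the snippet runs without the SDK, and in the example you would pass revoize_sdk.process.

```python
import numpy as np


def process_with_padding(samples, process_fn, chunk_size=480):
    """Process all samples, zero-padding the last chunk if it is short.

    Assumes process_fn returns as many samples as it receives (an assumption,
    not something confirmed by the SDK documentation).
    """
    samples = np.asarray(samples, dtype=np.float32)
    remainder = len(samples) % chunk_size
    if remainder:
        # Pad with silence so the length is a multiple of chunk_size.
        padding = np.zeros(chunk_size - remainder, dtype=np.float32)
        padded = np.concatenate([samples, padding])
    else:
        padded = samples

    outputs = []
    for start in range(0, len(padded), chunk_size):
        chunk = padded[start:start + chunk_size]
        outputs.append(np.asarray(process_fn(chunk), dtype=np.float32))

    if not outputs:
        return np.zeros(0, dtype=np.float32)
    # Drop the samples that correspond to the padding.
    return np.concatenate(outputs)[:len(samples)]


# Identity stand-in for the real processing call, for demonstration only.
result = process_with_padding(np.ones(1000, dtype=np.float32), lambda c: c)
print(len(result))  # 1000
```

Collecting whole chunks and concatenating them once also avoids growing a Python list one sample at a time.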


Save the Processed Audio to a New WAV File

Finally, we save the processed audio to a new WAV file. We convert the list of processed samples to a NumPy array and use soundfile.write() to save it as a WAV file.

python
    processed_audio = np.array(processed_audio, dtype=np.float32)
    sf.write(output_wav, processed_audio, sample_rate)

Full Code Example

Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.

python
import numpy as np
import soundfile as sf
import revoize_sdk

def main():
    # -------------------------------------------------
    # 1. Hardcoded parameters and initialization
    # -------------------------------------------------
    # Input WAV file path
    input_wav = "input.wav"
    # Output WAV file path
    output_wav = "output.wav"
    # Chunk size (480 samples)
    chunk_size = 480
    # Model Type is hardcoded to Capella
    revoize_sdk.init(revoize_sdk.ModelType.CAPELLA)

    # -------------------------------------------------
    # 2. Load the input WAV file
    # -------------------------------------------------
    # Read the WAV file
    audio_samples, sample_rate = sf.read(input_wav)

    # Ensure the audio is mono and float32
    if len(audio_samples.shape) > 1:
        audio_samples = audio_samples.mean(axis=1)
    audio_samples = audio_samples.astype(np.float32)

    # -------------------------------------------------
    # 3. Process the audio in chunks
    # -------------------------------------------------
    processed_audio = []
    num_chunks = len(audio_samples) // chunk_size

    for i in range(num_chunks):
        start = i * chunk_size
        end = start + chunk_size
        chunk = audio_samples[start:end]
        # Process each chunk using Revoize SDK
        output_chunk = revoize_sdk.process(chunk)
        processed_audio.extend(output_chunk)

    # -------------------------------------------------
    # 4. Save the processed audio to a new WAV file
    # -------------------------------------------------
    processed_audio = np.array(processed_audio, dtype=np.float32)
    sf.write(output_wav, processed_audio, sample_rate)

if __name__ == "__main__":
    main()

Example #2: Real-time Speech Enhancement

Coming Soon.