Usage Examples for the Revoize SDK in Java

Example #1: Processing a Single Audio File

This example shows how to process a single WAV file with the Revoize SDK in a minimal setup. We'll load an audio file from disk, process it with the SDK, and save the enhanced audio to a new file.

WARNING

The input and output file paths, model type, and chunk size are hardcoded in this example. You may need to change the input file path to point at your audio file. The input WAV file must be recorded at 48 kHz, and the helper functions in this example assume 16-bit samples. Audio is processed in 480-sample chunks, which is 10 ms of audio at 48 kHz.
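
Because the example does not verify these requirements itself, it can help to check the file's format before processing. The following is a minimal sketch, not part of the Revoize SDK: the class `WavFormatCheck` and its methods are hypothetical names, and the 16-bit little-endian check reflects the assumption made by this example's helper functions rather than a documented SDK requirement.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

// Hypothetical helper (not part of the Revoize SDK): validates a WAV format
// against the constraints stated in this example.
final class WavFormatCheck {
    static final float REQUIRED_SAMPLE_RATE = 48_000f;
    static final int CHUNK_SIZE = 480;

    /** Returns null if the format is usable, otherwise a description of the first problem. */
    static String findProblem(AudioFormat format) {
        if (format.getSampleRate() != REQUIRED_SAMPLE_RATE) {
            return "expected 48000 Hz, got " + format.getSampleRate() + " Hz";
        }
        // Assumption taken from this example's helpers: 16-bit signed PCM
        if (!AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                || format.getSampleSizeInBits() != 16) {
            return "expected 16-bit signed PCM, got " + format;
        }
        if (format.isBigEndian()) {
            return "expected little-endian samples";
        }
        return null;
    }

    /** Duration of one chunk in milliseconds: 480 samples at 48 kHz. */
    static double chunkDurationMillis() {
        return 1000.0 * CHUNK_SIZE / REQUIRED_SAMPLE_RATE;
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "input.wav";
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File(path))) {
            String problem = findProblem(in.getFormat());
            System.out.println(problem == null ? "Format OK" : "Unsupported input: " + problem);
        }
    }
}
```

Running the check before initializing the SDK gives a clear error message for, say, a 44.1 kHz file, instead of silently producing audio at the wrong speed.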

Here's a general sequence diagram for this example:

[Figure: sequence diagram]


Import Statements

First, we need to import the necessary classes to use the Revoize SDK and process audio files. We'll use javax.sound.sampled for reading and writing WAV files.

java
import com.revoize.sdk.RevoizeSDK;
import com.revoize.sdk.ModelType;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*; // List, ArrayList, and Arrays are used in the processing loop

The most important lines from the Revoize SDK usage perspective are:

java
import com.revoize.sdk.RevoizeSDK;
import com.revoize.sdk.ModelType;

This ensures we can use the Revoize SDK functions init and process as well as the ModelType enum.


Helper Functions

Before we start processing audio, we need some helper functions to read and write WAV files.

java
private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
    AudioFormat format = audioInputStream.getFormat();

    // Convert to mono if necessary
    if (format.getChannels() > 1) {
        AudioFormat monoFormat = new AudioFormat(
            format.getSampleRate(),
            format.getSampleSizeInBits(),
            1,
            true,
            format.isBigEndian()
        );
        audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
    }

    // Read the audio data
    byte[] audioData = audioInputStream.readAllBytes();
    float[] samples = new float[audioData.length / 2]; // Assuming 16-bit audio

    // Convert bytes to float samples
    for (int i = 0; i < samples.length; i++) {
        short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
        samples[i] = sample / 32768.0f;
    }

    audioInputStream.close();
    return samples;
}

private static void writeWavFile(String filename, float[] samples) throws IOException {
    AudioFormat format = new AudioFormat(48000, 16, 1, true, false);
    byte[] audioData = new byte[samples.length * 2];

    // Convert float samples to bytes
    for (int i = 0; i < samples.length; i++) {
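        // Clamp to [-1, 1] so out-of-range values saturate instead of wrapping around
        float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
        short sample = (short) (clamped * 32767);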
        audioData[i * 2] = (byte) (sample & 0xFF);
        audioData[i * 2 + 1] = (byte) (sample >> 8);
    }

    ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
    AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize());
    AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
    ais.close();
}
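
The sample conversion inside these helpers can be exercised without touching the file system. The sketch below isolates the 16-bit little-endian conversion using `java.nio.ByteBuffer` instead of manual bit shifting. The class name `Pcm16` and its method names are hypothetical, and the clamping step is a defensive assumption rather than something the SDK is documented to require.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: converts between 16-bit little-endian PCM bytes
// and float samples in the range [-1, 1].
final class Pcm16 {
    /** Decodes little-endian 16-bit PCM bytes into floats in [-1, 1). */
    static float[] toFloats(byte[] pcm) {
        ByteBuffer buffer = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        float[] samples = new float[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buffer.getShort() / 32768.0f;
        }
        return samples;
    }

    /** Encodes float samples as little-endian 16-bit PCM, saturating out-of-range values. */
    static byte[] toBytes(float[] samples) {
        ByteBuffer buffer = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (float sample : samples) {
            // Without clamping, a value such as 1.5f would wrap around when narrowed to short
            float clamped = Math.max(-1.0f, Math.min(1.0f, sample));
            buffer.putShort((short) (clamped * 32767));
        }
        return buffer.array();
    }
}
```

A round trip through `toBytes` and `toFloats` reproduces each in-range sample to within one 16-bit quantization step, which is a quick sanity check for the byte order and scaling.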

Define Some Hardcoded Values

To keep this example minimal, we hardcode the following values:

  • the input WAV file path
  • the output WAV file path
  • the chunk size

java
    private static final String INPUT_FILE = "input.wav";
    private static final String OUTPUT_FILE = "output.wav";
    private static final int CHUNK_SIZE = 480;

Initialize the Revoize SDK

Before we can start processing audio, we need to initialize the Revoize SDK by calling the init function with the desired model type.

java
    // Initialize with Capella
    RevoizeSDK.init(ModelType.CAPELLA);

There are various model types available in the Revoize SDK, but for this example, we are using the CAPELLA model, which is a lightweight discriminative model suitable for general denoising tasks.


Load the Input WAV File

Next, we need to load the input WAV file from disk. We use the readWavFile helper function to read the WAV file.

java
    // Read input WAV file
    float[] inputSamples = readWavFile(INPUT_FILE);

The readWavFile function returns an array of audio samples. The samples are stored as float values to make them compatible with the Revoize SDK's process function.


Process the Audio in Chunks

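Now that we have the audio samples, we can process them with the Revoize SDK. We iterate over the samples in chunks of 480 samples (CHUNK_SIZE) and pass each chunk to the process function. If the number of samples is not a multiple of 480, the final partial chunk is skipped, so the output can be up to 479 samples (just under 10 ms) shorter than the input.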

java
    // Process the audio in chunks
    List<Float> processedAudio = new ArrayList<>();

    for (int i = 0; i < inputSamples.length; i += CHUNK_SIZE) {
        // If fewer than CHUNK_SIZE samples remain, skip them
        if (i + CHUNK_SIZE > inputSamples.length) {
            break;
        }

        // Process this chunk
        float[] chunk = Arrays.copyOfRange(inputSamples, i, i + CHUNK_SIZE);
        float[] outputChunk = RevoizeSDK.process(chunk);

        // Append to output
        for (float sample : outputChunk) {
            processedAudio.add(sample);
        }
    }

The process function takes an input audio chunk, processes it, and returns the enhanced samples. We store all processed chunks in a list called processedAudio.
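
The loop above drops any samples left over after the last full chunk. If you would rather keep them, one option is to zero-pad the final chunk and trim the result. The sketch below illustrates that idea under two assumptions that the source does not confirm: that the processing function returns as many samples as it receives, and that padding with silence is acceptable for the model. `PaddedChunker` is a hypothetical name, and the `UnaryOperator<float[]>` parameter stands in for the SDK call.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical helper: processes every sample, zero-padding the last chunk
// instead of skipping it.
final class PaddedChunker {
    static float[] processAll(float[] input, int chunkSize, UnaryOperator<float[]> processChunk) {
        float[] output = new float[input.length];
        for (int start = 0; start < input.length; start += chunkSize) {
            // Number of real (non-padding) samples in this chunk
            int valid = Math.min(chunkSize, input.length - start);

            // Arrays.copyOfRange pads with zeros when the end index is past the array end
            float[] chunk = Arrays.copyOfRange(input, start, start + chunkSize);

            // Assumption: the result has at least as many samples as the input chunk
            float[] processed = processChunk.apply(chunk);

            // Keep only the samples that correspond to real input
            System.arraycopy(processed, 0, output, start, valid);
        }
        return output;
    }
}
```

In this example, the call would look something like `PaddedChunker.processAll(inputSamples, CHUNK_SIZE, RevoizeSDK::process)`, provided the process function can be used as a method reference. Writing into a preallocated array also avoids boxing each sample into a `Float`.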


Save the Processed Audio to a New WAV File

Finally, we save the processed audio to a new WAV file using the writeWavFile helper function.

java
    // Convert List<Float> to float[]
    float[] processedAudioArray = new float[processedAudio.size()];
    for (int i = 0; i < processedAudio.size(); i++) {
        processedAudioArray[i] = processedAudio.get(i);
    }

    // Write output WAV file
    writeWavFile(OUTPUT_FILE, processedAudioArray);
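
Storing every sample in a `List<Float>` boxes each value, which adds noticeable memory overhead for long recordings. As an alternative sketch, the processed chunks can be collected as arrays and concatenated once at the end. `ChunkCollector` is a hypothetical class, not part of the SDK. It makes no assumption about the length of each output chunk, and it copies each chunk defensively because the source does not say whether the SDK reuses its output buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: accumulates processed chunks and joins them into one array.
final class ChunkCollector {
    private final List<float[]> chunks = new ArrayList<>();
    private int totalSamples = 0;

    /** Stores a copy of the chunk so later changes to the caller's array have no effect. */
    void add(float[] chunk) {
        chunks.add(chunk.clone());
        totalSamples += chunk.length;
    }

    /** Concatenates all collected chunks in insertion order. */
    float[] toArray() {
        float[] result = new float[totalSamples];
        int offset = 0;
        for (float[] chunk : chunks) {
            System.arraycopy(chunk, 0, result, offset, chunk.length);
            offset += chunk.length;
        }
        return result;
    }
}
```

With this helper, the processing loop would call `collector.add(outputChunk)` for each chunk and pass `collector.toArray()` straight to writeWavFile, with no conversion loop.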

Full Code Example

Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.

java
import com.revoize.sdk.RevoizeSDK;
import com.revoize.sdk.ModelType;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*;

public class AudioProcessor {
    private static final String INPUT_FILE = "input.wav";
    private static final String OUTPUT_FILE = "output.wav";
    private static final int CHUNK_SIZE = 480;

    private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
        AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
        AudioFormat format = audioInputStream.getFormat();

        // Convert to mono if necessary
        if (format.getChannels() > 1) {
            AudioFormat monoFormat = new AudioFormat(
                format.getSampleRate(),
                format.getSampleSizeInBits(),
                1,
                true,
                format.isBigEndian()
            );
            audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
        }

        // Read the audio data
        byte[] audioData = audioInputStream.readAllBytes();
        float[] samples = new float[audioData.length / 2]; // Assuming 16-bit audio

        // Convert bytes to float samples
        for (int i = 0; i < samples.length; i++) {
            short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
            samples[i] = sample / 32768.0f;
        }

        audioInputStream.close();
        return samples;
    }

    private static void writeWavFile(String filename, float[] samples) throws IOException {
        AudioFormat format = new AudioFormat(48000, 16, 1, true, false);
        byte[] audioData = new byte[samples.length * 2];

        // Convert float samples to bytes
        for (int i = 0; i < samples.length; i++) {
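            // Clamp to [-1, 1] so out-of-range values saturate instead of wrapping around
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            short sample = (short) (clamped * 32767);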
            audioData[i * 2] = (byte) (sample & 0xFF);
            audioData[i * 2 + 1] = (byte) (sample >> 8);
        }

        ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
        AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize());
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
        ais.close();
    }

    public static void main(String[] args) {
        try {
            // -------------------------------------------------
            // 1. Hardcoded parameters and initialization
            // -------------------------------------------------
            // Initialize with Capella
            RevoizeSDK.init(ModelType.CAPELLA);

            // -------------------------------------------------
            // 2. Load the input WAV file
            // -------------------------------------------------
            float[] inputSamples = readWavFile(INPUT_FILE);

            // -------------------------------------------------
            // 3. Process the audio in chunks
            // -------------------------------------------------
            List<Float> processedAudio = new ArrayList<>();

            for (int i = 0; i < inputSamples.length; i += CHUNK_SIZE) {
                // If fewer than CHUNK_SIZE samples remain, skip them
                if (i + CHUNK_SIZE > inputSamples.length) {
                    break;
                }

                // Process this chunk
                float[] chunk = Arrays.copyOfRange(inputSamples, i, i + CHUNK_SIZE);
                float[] outputChunk = RevoizeSDK.process(chunk);

                // Append to output
                for (float sample : outputChunk) {
                    processedAudio.add(sample);
                }
            }

            // -------------------------------------------------
            // 4. Save the processed audio to a new WAV file
            // -------------------------------------------------
            // Convert List<Float> to float[]
            float[] processedAudioArray = new float[processedAudio.size()];
            for (int i = 0; i < processedAudio.size(); i++) {
                processedAudioArray[i] = processedAudio.get(i);
            }

            // Write output WAV file
            writeWavFile(OUTPUT_FILE, processedAudioArray);

        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            e.printStackTrace();
            System.exit(1);
        }
    }
}

Example #2: Real-time Speech Enhancement

Coming Soon.