Usage Examples for the Revoize SDK in Java

Example #1: Processing a Single Audio File

This example walks through the basic Java flow: get the model parameters with SDK.models.getParams("Capella"), initialize the SDK, load a WAV file, process it in chunks, and save the enhanced audio. The setup is deliberately minimal so you can substitute your own file paths and model choice.

WARNING

The input and output file paths and the model name are hardcoded in this example; change them to match your setup. The input WAV file must already be at the sample rate required by the chosen model (see params.getInputSampleRate()), because the example does not resample. Audio is processed in chunks of params.getInputChunkSizeSamples() samples; this value varies by model (for example, Capella uses 480).
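
Since the example does not resample, it can help to verify the file's sample rate before processing. The following is a minimal sketch using only javax.sound.sampled; the class and method names (SampleRateCheck, matchesRate, requireRate) are hypothetical helpers and not part of the Revoize SDK.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of the Revoize SDK
final class SampleRateCheck {

    /** Returns true when the format's sample rate equals the required rate in Hz. */
    static boolean matchesRate(AudioFormat format, int requiredRate) {
        // AudioFormat reports the rate as a float, so round before comparing
        return Math.round(format.getSampleRate()) == requiredRate;
    }

    /** Throws IllegalArgumentException if the file is not at the required sample rate. */
    static void requireRate(String path, int requiredRate)
            throws UnsupportedAudioFileException, IOException {
        // getAudioFileFormat inspects the file header and returns its format description
        AudioFormat format = AudioSystem.getAudioFileFormat(new File(path)).getFormat();
        if (!matchesRate(format, requiredRate)) {
            throw new IllegalArgumentException(
                "Expected " + requiredRate + " Hz but file is " + format.getSampleRate() + " Hz");
        }
    }
}
```

In this example, a call such as SampleRateCheck.requireRate("input.wav", params.getInputSampleRate()) could be placed just before the file is loaded.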

[Sequence diagram: general flow of this example]


Import Statements

First, we need to import the necessary classes to use the Revoize SDK and process audio files. We'll use javax.sound.sampled for reading and writing WAV files.

java
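import revoize.SDK;            // Revoize SDK entry point
import javax.sound.sampled.*;  // WAV reading and writing
import java.io.*;              // File, IOException, ByteArrayInputStream
import java.util.*;            // List, ArrayList, Arrays (used when chunking)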

The most important line from the Revoize SDK usage perspective is:

java
import revoize.SDK;  

We use SDK.models.getParams(name) to get the model parameters (chunk sizes and sample rates), then SDK.init(params) to initialize the SDK and SDK.process(input) to enhance each chunk.


Helper Functions

Before we start processing audio, we need two helper functions to read and write WAV files. Both assume 16-bit PCM audio.

java
private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
    AudioFormat format = audioInputStream.getFormat();

    // Convert to mono if necessary
    if (format.getChannels() > 1) {
        AudioFormat monoFormat = new AudioFormat(
            format.getSampleRate(),
            format.getSampleSizeInBits(),
            1,
            true,
            format.isBigEndian()
        );
        audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
    }

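    // The conversion below assumes 16-bit little-endian PCM, so reject anything else
    if (format.getSampleSizeInBits() != 16 || format.isBigEndian()) {
        audioInputStream.close();
        throw new UnsupportedAudioFileException("Expected 16-bit little-endian PCM audio");
    }

    // Read the audio data
    byte[] audioData = audioInputStream.readAllBytes();
    float[] samples = new float[audioData.length / 2]; // 2 bytes per 16-bit sample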

    // Convert bytes to float samples
    for (int i = 0; i < samples.length; i++) {
        short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
        samples[i] = sample / 32768.0f;
    }

    audioInputStream.close();
    return samples;
}

private static void writeWavFile(String filename, float[] samples, int sampleRate) throws IOException {
    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
    byte[] audioData = new byte[samples.length * 2];

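    // Convert float samples to 16-bit little-endian bytes
    for (int i = 0; i < samples.length; i++) {
        // Clamp to [-1, 1] so out-of-range values do not wrap around when cast to short
        float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
        short sample = (short) (clamped * 32767);
        audioData[i * 2] = (byte) (sample & 0xFF);
        audioData[i * 2 + 1] = (byte) (sample >> 8);
    }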

    ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
    AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize());
    AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
    ais.close();
}

Define Some Hardcoded Values

We get model parameters by name with SDK.models.getParams("Capella"). You can use SDK.models.listNames() to see available names. The parameters define input/output chunk sizes and sample rates.

java
    private static final String INPUT_FILE = "input.wav";
    private static final String OUTPUT_FILE = "output.wav";
    // Get model parameters by name
    SDK.ModelParams params = SDK.models.getParams("Capella");
    if (params == null) {
        throw new IllegalStateException("Model not available");
    }
    int chunkSize = params.getInputChunkSizeSamples();

Initialize the Revoize SDK

We initialize the SDK with the model parameters.

java
    if (SDK.init(params) != 0) {
        throw new IllegalStateException("Failed to initialize SDK");
    }

The input WAV file should be sampled at params.getInputSampleRate(). Each processed chunk contains params.getOutputChunkSizeSamples() samples at a sample rate of params.getOutputSampleRate().
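
To make the relationship between chunk sizes and sample rates concrete, here is a small arithmetic sketch. ChunkMath is a hypothetical helper, and the 48000 Hz rate mentioned in the usage note is an assumed value chosen purely for illustration; use the values your model's parameters actually report.

```java
// Hypothetical helper for reasoning about chunk sizes; not part of the Revoize SDK
final class ChunkMath {

    /** Duration of one chunk in milliseconds. */
    static double chunkDurationMs(int chunkSizeSamples, int sampleRate) {
        return 1000.0 * chunkSizeSamples / sampleRate;
    }

    /** Number of output samples produced when only full input chunks are processed. */
    static long expectedOutputSamples(long inputSamples, int inputChunkSize, int outputChunkSize) {
        // Integer division: a trailing partial chunk is not counted
        long fullChunks = inputSamples / inputChunkSize;
        return fullChunks * outputChunkSize;
    }
}
```

For instance, a 480-sample chunk at an assumed 48000 Hz lasts ChunkMath.chunkDurationMs(480, 48000) = 10 ms, and 1000 input samples with 480-sample input and output chunks yield 960 output samples.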


Load the Input WAV File

Next, we load the input WAV file from disk with the readWavFile helper function.

java
    // Read input WAV file
    float[] inputSamples = readWavFile(INPUT_FILE);

The readWavFile function returns an array of audio samples. The samples are stored as float values to make them compatible with the Revoize SDK's process function.
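
The conversion between 16-bit PCM and float samples used by the helper functions can be isolated as follows. This is a sketch of the scaling only; Pcm16 is a hypothetical class name, and the clamp in fromFloat is a safeguard against out-of-range values.

```java
// Hypothetical helper illustrating the 16-bit PCM <-> float scaling
final class Pcm16 {

    /** Maps a signed 16-bit sample to a float in [-1.0, 1.0). */
    static float toFloat(short pcm) {
        return pcm / 32768.0f;
    }

    /** Maps a float sample back to signed 16-bit, clamping values outside [-1.0, 1.0]. */
    static short fromFloat(float sample) {
        float clamped = Math.max(-1.0f, Math.min(1.0f, sample));
        return (short) (clamped * 32767);
    }
}
```

Reading divides by 32768 and writing multiplies by 32767, so the most negative PCM value maps exactly to -1.0 while full-scale positive float input maps to 32767.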


Process the Audio in Chunks

We process the audio in chunks of params.getInputChunkSizeSamples(). Each chunk produces params.getOutputChunkSizeSamples() output samples.

java
    // Process the audio in chunks
    List<Float> processedAudio = new ArrayList<>();

    for (int i = 0; i < inputSamples.length; i += chunkSize) {
        if (i + chunkSize > inputSamples.length) {
            break;
        }

        float[] chunk = Arrays.copyOfRange(inputSamples, i, i + chunkSize);
        float[] outputChunk = SDK.process(chunk);

        // Append to output
        for (float sample : outputChunk) {
            processedAudio.add(sample);
        }
    }

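The process function takes an input audio chunk, processes it, and returns the enhanced samples. We append every processed chunk to a list called processedAudio. Note that the loop stops at the last full chunk, so any trailing samples that do not fill a whole chunk are dropped.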
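
If the trailing samples that do not fill a whole chunk should be kept, one option is to zero-pad the final chunk. The sketch below is not the SDK's documented approach: PaddedChunking is a hypothetical helper, and the processor argument stands in for whatever function enhances a chunk.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical helper: processes every sample by zero-padding the last chunk
final class PaddedChunking {

    static float[] processAll(float[] input, int chunkSize, UnaryOperator<float[]> processor) {
        int chunkCount = (input.length + chunkSize - 1) / chunkSize; // ceiling division
        float[][] outputs = new float[chunkCount][];
        int total = 0;

        for (int c = 0; c < chunkCount; c++) {
            int start = c * chunkSize;
            // copyOfRange pads with zeros when the end index runs past the array
            float[] chunk = Arrays.copyOfRange(input, start, start + chunkSize);
            outputs[c] = processor.apply(chunk);
            total += outputs[c].length;
        }

        // Concatenate the per-chunk outputs into one primitive array
        float[] result = new float[total];
        int pos = 0;
        for (float[] out : outputs) {
            System.arraycopy(out, 0, result, pos, out.length);
            pos += out.length;
        }
        return result;
    }
}
```

Assuming SDK.process accepts a float[] and declares no checked exceptions, it could be passed as the processor, for example PaddedChunking.processAll(inputSamples, chunkSize, SDK::process). The padded output ends with enhanced silence, which you may want to trim.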


Save the Processed Audio to a New WAV File

Finally, we save the processed audio to a new WAV file using the writeWavFile helper function.

java
    // Convert List<Float> to float[]
    float[] processedAudioArray = new float[processedAudio.size()];
    for (int i = 0; i < processedAudio.size(); i++) {
        processedAudioArray[i] = processedAudio.get(i);
    }

    // Write output WAV file (use model's output sample rate)
    writeWavFile(OUTPUT_FILE, processedAudioArray, params.getOutputSampleRate());
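
For long recordings, boxing every sample into a List<Float> costs noticeably more memory than a primitive array. As an alternative sketch, a small growable buffer can collect the output chunks directly; FloatAccumulator is a hypothetical class written for illustration, not something the SDK provides.

```java
import java.util.Arrays;

// Hypothetical growable buffer of primitive floats, avoiding Float boxing
final class FloatAccumulator {
    private float[] data = new float[1024];
    private int size = 0;

    /** Appends all samples from chunk, growing the backing array when needed. */
    void append(float[] chunk) {
        int needed = size + chunk.length;
        if (needed > data.length) {
            // At least double the capacity so repeated appends stay cheap
            data = Arrays.copyOf(data, Math.max(needed, data.length * 2));
        }
        System.arraycopy(chunk, 0, data, size, chunk.length);
        size = needed;
    }

    /** Returns a copy trimmed to the number of samples appended so far. */
    float[] toArray() {
        return Arrays.copyOf(data, size);
    }
}
```

With this in place, each output chunk would be passed to append inside the processing loop, and toArray() would supply the float[] for writeWavFile without a separate conversion step.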

Full Code Example

Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.

java
import revoize.SDK;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*;

public class AudioProcessor {
    private static final String INPUT_FILE = "input.wav";
    private static final String OUTPUT_FILE = "output.wav";

    private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
        AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
        AudioFormat format = audioInputStream.getFormat();

        // Convert to mono if necessary
        if (format.getChannels() > 1) {
            AudioFormat monoFormat = new AudioFormat(
                format.getSampleRate(),
                format.getSampleSizeInBits(),
                1,
                true,
                format.isBigEndian()
            );
            audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
        }

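        // The conversion below assumes 16-bit little-endian PCM, so reject anything else
        if (format.getSampleSizeInBits() != 16 || format.isBigEndian()) {
            audioInputStream.close();
            throw new UnsupportedAudioFileException("Expected 16-bit little-endian PCM audio");
        }

        // Read the audio data
        byte[] audioData = audioInputStream.readAllBytes();
        float[] samples = new float[audioData.length / 2]; // 2 bytes per 16-bit sample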

        // Convert bytes to float samples
        for (int i = 0; i < samples.length; i++) {
            short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
            samples[i] = sample / 32768.0f;
        }

        audioInputStream.close();
        return samples;
    }

    private static void writeWavFile(String filename, float[] samples, int sampleRate) throws IOException {
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        byte[] audioData = new byte[samples.length * 2];

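        // Convert float samples to 16-bit little-endian bytes
        for (int i = 0; i < samples.length; i++) {
            // Clamp to [-1, 1] so out-of-range values do not wrap around when cast to short
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            short sample = (short) (clamped * 32767);
            audioData[i * 2] = (byte) (sample & 0xFF);
            audioData[i * 2 + 1] = (byte) (sample >> 8);
        }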

        ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
        AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize());
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
        ais.close();
    }

    public static void main(String[] args) {
        try {
            // -------------------------------------------------
            // 1. Get model parameters and initialize
            // -------------------------------------------------
            SDK.ModelParams params = SDK.models.getParams("Capella");
            if (params == null || SDK.init(params) != 0) {
                throw new IllegalStateException("Failed to initialize SDK");
            }
            int chunkSize = params.getInputChunkSizeSamples();

            // -------------------------------------------------
            // 2. Load the input WAV file
            // -------------------------------------------------
            float[] inputSamples = readWavFile(INPUT_FILE);

            // -------------------------------------------------
            // 3. Process the audio in chunks
            // -------------------------------------------------
            List<Float> processedAudio = new ArrayList<>();

            for (int i = 0; i < inputSamples.length; i += chunkSize) {
                if (i + chunkSize > inputSamples.length) {
                    break;
                }

                float[] chunk = Arrays.copyOfRange(inputSamples, i, i + chunkSize);
                float[] outputChunk = SDK.process(chunk);

                // Append to output
                for (float sample : outputChunk) {
                    processedAudio.add(sample);
                }
            }

            // -------------------------------------------------
            // 4. Save the processed audio to a new WAV file
            // -------------------------------------------------
            // Convert List<Float> to float[]
            float[] processedAudioArray = new float[processedAudio.size()];
            for (int i = 0; i < processedAudio.size(); i++) {
                processedAudioArray[i] = processedAudio.get(i);
            }

            writeWavFile(OUTPUT_FILE, processedAudioArray, params.getOutputSampleRate());

        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            e.printStackTrace();
            System.exit(1);
        }
    }
}

Example #2: Real-time Speech Enhancement

Coming Soon.