Usage Examples for the Revoize SDK in Java
Example #1: Processing a Single Audio File
This example shows the Java flow: get model params with SDK.models.getParams("Capella"), initialize the SDK, load a WAV file, process in chunks, and save the enhanced audio. Minimal setup so you can plug in your own file paths and model choice.
WARNING
The input and output file paths and the model name are hardcoded in this example; change the input path to point at your own audio file. The input WAV file must use the sample rate required by the chosen model (see params.getInputSampleRate()). Audio is processed in chunks of params.getInputChunkSizeSamples() samples; this size varies by model (for example, Capella uses 480).
Here's a general sequence diagram for this example:
Import Statements
First, we need to import the necessary classes to use the Revoize SDK and process audio files. We'll use javax.sound.sampled for reading and writing WAV files.
import revoize.SDK;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*; // List, ArrayList, and Arrays are used in the processing loop
The most important line from the Revoize SDK usage perspective is:
import revoize.SDK;
We use SDK.models.getParams(name) to get model parameters (chunk sizes and sample rates), then SDK.init(params) and SDK.process(input).
Helper Functions
Before we start processing audio, we need some helper functions to read and write WAV files.
private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
    AudioFormat format = audioInputStream.getFormat();
    // Convert to mono if necessary
    if (format.getChannels() > 1) {
        AudioFormat monoFormat = new AudioFormat(
            format.getSampleRate(),
            format.getSampleSizeInBits(),
            1,
            true,
            format.isBigEndian()
        );
        audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
    }
    // Read the audio data
    byte[] audioData = audioInputStream.readAllBytes();
    float[] samples = new float[audioData.length / 2]; // Assuming 16-bit audio
    // Convert bytes to float samples (little-endian byte order)
    for (int i = 0; i < samples.length; i++) {
        short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
        samples[i] = sample / 32768.0f;
    }
    audioInputStream.close();
    return samples;
}
private static void writeWavFile(String filename, float[] samples, int sampleRate) throws IOException {
    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
    byte[] audioData = new byte[samples.length * 2];
    // Convert float samples to bytes
    for (int i = 0; i < samples.length; i++) {
        // Clamp to [-1.0, 1.0] so out-of-range values do not wrap around in the short cast
        float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
        short sample = (short) (clamped * 32767);
        audioData[i * 2] = (byte) (sample & 0xFF);
        audioData[i * 2 + 1] = (byte) (sample >> 8);
    }
    ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
    AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize());
    AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
    ais.close();
}
Define Some Hardcoded Values
We get model parameters by name with SDK.models.getParams("Capella"). You can use SDK.models.listNames() to see available names. The parameters define input/output chunk sizes and sample rates.
private static final String INPUT_FILE = "input.wav";
private static final String OUTPUT_FILE = "output.wav";
// Get model parameters by name
SDK.ModelParams params = SDK.models.getParams("Capella");
if (params == null) {
    throw new IllegalStateException("Model not available");
}
int chunkSize = params.getInputChunkSizeSamples();
Initialize the Revoize SDK
We initialize the SDK with the model parameters.
if (SDK.init(params) != 0) {
    throw new IllegalStateException("Failed to initialize SDK");
}
The input WAV file should be sampled at params.getInputSampleRate(). Each processed chunk contains params.getOutputChunkSizeSamples() samples at a sample rate of params.getOutputSampleRate().
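Because the input file has to match the model's sample rate, and the readWavFile helper assumes 16-bit samples, it can help to reject unsuitable files before processing. The following is a minimal sketch using only javax.sound.sampled; the class name WavFormatCheck, the method requirePcm16, and the 48000 Hz rate in main are hypothetical and not part of the Revoize SDK. In real code the required rate would come from params.getInputSampleRate().

```java
import javax.sound.sampled.AudioFormat;

/** Hypothetical helper: checks that a WAV format is what the example code expects. */
final class WavFormatCheck {

    /** Throws if the format is not signed 16-bit little-endian PCM at requiredRate Hz. */
    static void requirePcm16(AudioFormat format, int requiredRate) {
        int actualRate = Math.round(format.getSampleRate());
        if (actualRate != requiredRate) {
            throw new IllegalArgumentException(
                "Expected " + requiredRate + " Hz but the file is " + actualRate + " Hz");
        }
        boolean signedPcm = AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding());
        if (!signedPcm || format.getSampleSizeInBits() != 16 || format.isBigEndian()) {
            throw new IllegalArgumentException(
                "Expected signed 16-bit little-endian PCM but got " + format);
        }
    }

    public static void main(String[] args) {
        // 48000 Hz is an illustrative value, not a documented model requirement.
        AudioFormat format = new AudioFormat(48000f, 16, 1, true, false);
        requirePcm16(format, 48000);
        System.out.println("Format accepted: " + format);
    }
}
```

Calling a check like this on audioInputStream.getFormat() right after opening the file turns a silent mismatch into an early, descriptive error.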
Load the Input WAV File
Next, we need to load the input WAV file from disk. We use the readWavFile helper function to read the WAV file.
// Read input WAV file
float[] inputSamples = readWavFile(INPUT_FILE);
The readWavFile function returns an array of audio samples. The samples are stored as float values to make them compatible with the Revoize SDK's process function.
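The byte-to-float conversion inside the helpers can be looked at in isolation. The sketch below assumes signed 16-bit little-endian PCM, which is what the helper functions in this example read and write; the class and method names (PcmConversion, pcm16ToFloat, floatToPcm16) are illustrative and not part of the Revoize SDK.

```java
import java.util.Arrays;

/** Sketch: signed 16-bit little-endian PCM <-> normalized float conversion. */
final class PcmConversion {

    /** Decodes PCM bytes into floats in the range [-1.0, 1.0). */
    static float[] pcm16ToFloat(byte[] pcm) {
        float[] samples = new float[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = pcm[2 * i] & 0xFF; // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];    // high byte keeps its sign
            samples[i] = (short) ((hi << 8) | lo) / 32768.0f;
        }
        return samples;
    }

    /** Encodes floats as PCM bytes, clamping each value to [-1.0, 1.0] first. */
    static byte[] floatToPcm16(float[] samples) {
        byte[] pcm = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            short value = (short) Math.round(clamped * 32767.0f);
            pcm[2 * i] = (byte) (value & 0xFF);
            pcm[2 * i + 1] = (byte) (value >> 8);
        }
        return pcm;
    }

    public static void main(String[] args) {
        byte[] pcm = {0x00, 0x40, 0x00, (byte) 0x80}; // 16384 and -32768
        System.out.println(Arrays.toString(pcm16ToFloat(pcm))); // prints [0.5, -1.0]
    }
}
```

Clamping before the cast matters on the write side: a float slightly above 1.0 would otherwise overflow the short and wrap around to a large negative value, which is audible as a click.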
Process the Audio in Chunks
We process the audio in chunks of params.getInputChunkSizeSamples(). Each chunk produces params.getOutputChunkSizeSamples() output samples.
// Process the audio in chunks
List<Float> processedAudio = new ArrayList<>();
for (int i = 0; i < inputSamples.length; i += chunkSize) {
    // Stop at the last full chunk; a trailing partial chunk is not processed
    if (i + chunkSize > inputSamples.length) {
        break;
    }
    float[] chunk = Arrays.copyOfRange(inputSamples, i, i + chunkSize);
    float[] outputChunk = SDK.process(chunk);
    // Append to output
    for (float sample : outputChunk) {
        processedAudio.add(sample);
    }
}
The process function takes an input audio chunk, processes it, and returns the enhanced samples. We store all processed chunks in a list called processedAudio. Because the loop stops at the last full chunk, up to chunkSize - 1 trailing samples are left unprocessed.
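The loop above skips any trailing partial chunk. One possible alternative, sketched here as an assumption rather than documented SDK behavior, is to zero-pad the final chunk so that every input sample is processed. To keep the sketch runnable without the SDK, the per-chunk call is passed in as a UnaryOperator<float[]>; in the real flow that argument would be SDK::process. The class and method names are hypothetical. Collecting float[] chunks also avoids boxing every sample into a List<Float>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Sketch: fixed-size chunking with a zero-padded final chunk. */
final class ChunkedProcessing {

    static float[] processAll(float[] input, int chunkSize, UnaryOperator<float[]> processor) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<float[]> processedChunks = new ArrayList<>();
        int totalSamples = 0;
        for (int start = 0; start < input.length; start += chunkSize) {
            // copyOfRange fills with zeros when the range runs past the end of the array
            float[] chunk = Arrays.copyOfRange(input, start, start + chunkSize);
            float[] processed = processor.apply(chunk);
            processedChunks.add(processed);
            totalSamples += processed.length;
        }
        // Concatenate the processed chunks into one primitive array
        float[] output = new float[totalSamples];
        int offset = 0;
        for (float[] processed : processedChunks) {
            System.arraycopy(processed, 0, output, offset, processed.length);
            offset += processed.length;
        }
        return output;
    }

    public static void main(String[] args) {
        float[] input = {1f, 2f, 3f, 4f, 5f};
        // Identity processor stands in for the real per-chunk call
        float[] output = processAll(input, 2, chunk -> chunk.clone());
        System.out.println(Arrays.toString(output)); // prints [1.0, 2.0, 3.0, 4.0, 5.0, 0.0]
    }
}
```

With padding, the output ends with samples derived from the added zeros, so you may want to trim it back to the expected length before writing the file. Whether padding or dropping the tail is preferable depends on the model and the use case.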
Save the Processed Audio to a New WAV File
Finally, we save the processed audio to a new WAV file using the writeWavFile helper function.
// Convert List<Float> to float[]
float[] processedAudioArray = new float[processedAudio.size()];
for (int i = 0; i < processedAudio.size(); i++) {
    processedAudioArray[i] = processedAudio.get(i);
}
// Write output WAV file (use model's output sample rate)
writeWavFile(OUTPUT_FILE, processedAudioArray, params.getOutputSampleRate());
Full Code Example
Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.
import revoize.SDK;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*;

public class AudioProcessor {
    private static final String INPUT_FILE = "input.wav";
    private static final String OUTPUT_FILE = "output.wav";

    private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
        AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
        AudioFormat format = audioInputStream.getFormat();
        // Convert to mono if necessary
        if (format.getChannels() > 1) {
            AudioFormat monoFormat = new AudioFormat(
                format.getSampleRate(),
                format.getSampleSizeInBits(),
                1,
                true,
                format.isBigEndian()
            );
            audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
        }
        // Read the audio data
        byte[] audioData = audioInputStream.readAllBytes();
        float[] samples = new float[audioData.length / 2]; // Assuming 16-bit audio
        // Convert bytes to float samples (little-endian byte order)
        for (int i = 0; i < samples.length; i++) {
            short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
            samples[i] = sample / 32768.0f;
        }
        audioInputStream.close();
        return samples;
    }

    private static void writeWavFile(String filename, float[] samples, int sampleRate) throws IOException {
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        byte[] audioData = new byte[samples.length * 2];
        // Convert float samples to bytes
        for (int i = 0; i < samples.length; i++) {
            // Clamp to [-1.0, 1.0] so out-of-range values do not wrap around in the short cast
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            short sample = (short) (clamped * 32767);
            audioData[i * 2] = (byte) (sample & 0xFF);
            audioData[i * 2 + 1] = (byte) (sample >> 8);
        }
        ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
        AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize());
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
        ais.close();
    }

    public static void main(String[] args) {
        try {
            // -------------------------------------------------
            // 1. Get model parameters and initialize
            // -------------------------------------------------
            SDK.ModelParams params = SDK.models.getParams("Capella");
            if (params == null || SDK.init(params) != 0) {
                throw new IllegalStateException("Failed to initialize SDK");
            }
            int chunkSize = params.getInputChunkSizeSamples();
            // -------------------------------------------------
            // 2. Load the input WAV file
            // -------------------------------------------------
            float[] inputSamples = readWavFile(INPUT_FILE);
            // -------------------------------------------------
            // 3. Process the audio in chunks
            // -------------------------------------------------
            List<Float> processedAudio = new ArrayList<>();
            for (int i = 0; i < inputSamples.length; i += chunkSize) {
                // Stop at the last full chunk; a trailing partial chunk is not processed
                if (i + chunkSize > inputSamples.length) {
                    break;
                }
                float[] chunk = Arrays.copyOfRange(inputSamples, i, i + chunkSize);
                float[] outputChunk = SDK.process(chunk);
                // Append to output
                for (float sample : outputChunk) {
                    processedAudio.add(sample);
                }
            }
            // -------------------------------------------------
            // 4. Save the processed audio to a new WAV file
            // -------------------------------------------------
            // Convert List<Float> to float[]
            float[] processedAudioArray = new float[processedAudio.size()];
            for (int i = 0; i < processedAudio.size(); i++) {
                processedAudioArray[i] = processedAudio.get(i);
            }
            writeWavFile(OUTPUT_FILE, processedAudioArray, params.getOutputSampleRate());
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            e.printStackTrace();
            System.exit(1);
        }
    }
}
Example #2: Real-time Speech Enhancement
Coming Soon.