Usage Examples for the Revoize SDK in Java
Example #1: Processing a Single Audio File
This example shows how to process a single WAV file with the Revoize SDK in a minimal setup: we load the file from disk, process it with the SDK, and save the enhanced audio to a new file.
WARNING
The input and output file paths, model type, and chunk size are hardcoded in this example. You may need to modify the input file path to match the location of your audio file. The input WAV file must be recorded at 48 kHz. Audio is processed in 480-sample chunks (10 ms at 48 kHz), and any trailing samples that do not fill a complete chunk are skipped.
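Because the example expects a 48 kHz input, it can help to check a file's format before processing it. The following is a minimal sketch that uses only javax.sound.sampled. The class InputFormatCheck and its methods are hypothetical helpers, not part of the Revoize SDK, and the 16-bit signed PCM check reflects what this example's helper functions assume rather than a documented SDK requirement.

```java
import javax.sound.sampled.AudioFormat;

// Hypothetical helper for illustration; not part of the Revoize SDK.
final class InputFormatCheck {
    static final float REQUIRED_SAMPLE_RATE = 48000f; // 48 kHz, as stated in the warning above
    static final int CHUNK_SIZE = 480;                // samples per chunk, as used in this example

    /** Returns true if the format is 16-bit signed PCM at 48 kHz (any channel count). */
    static boolean isSupported(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleSizeInBits() == 16
                && format.getSampleRate() == REQUIRED_SAMPLE_RATE;
    }

    /** Duration of one chunk in milliseconds at the required sample rate. */
    static double chunkDurationMs() {
        return 1000.0 * CHUNK_SIZE / REQUIRED_SAMPLE_RATE;
    }
}
```

The format of a file on disk can be obtained with AudioSystem.getAudioFileFormat(new File("input.wav")).getFormat() and passed to isSupported before any audio is read.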
Here's a general sequence diagram for this example:
Import Statements
First, we need to import the necessary classes to use the Revoize SDK and process audio files. We'll use javax.sound.sampled for reading and writing WAV files.
import com.revoize.sdk.RevoizeSDK;
import com.revoize.sdk.ModelType;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*; // List, ArrayList, and Arrays are used when processing chunks
The most important lines from the Revoize SDK usage perspective are:
import com.revoize.sdk.RevoizeSDK;
import com.revoize.sdk.ModelType;
This ensures we can use the Revoize SDK functions init and process as well as the ModelType enum.
Helper Functions
Before we start processing audio, we need some helper functions to read and write WAV files.
private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
    try {
        AudioFormat format = audioInputStream.getFormat();
        // Convert to mono if necessary
        if (format.getChannels() > 1) {
            AudioFormat monoFormat = new AudioFormat(
                format.getSampleRate(),
                format.getSampleSizeInBits(),
                1,
                true,
                format.isBigEndian()
            );
            audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
        }
        // Read the audio data
        byte[] audioData = audioInputStream.readAllBytes();
        float[] samples = new float[audioData.length / 2]; // Assumes 16-bit little-endian PCM
        // Convert each little-endian byte pair to a float sample in [-1.0, 1.0)
        for (int i = 0; i < samples.length; i++) {
            short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
            samples[i] = sample / 32768.0f;
        }
        return samples;
    } finally {
        // Close the stream even if reading or conversion fails
        audioInputStream.close();
    }
}
private static void writeWavFile(String filename, float[] samples) throws IOException {
    // 48 kHz, 16-bit, mono, signed, little-endian
    AudioFormat format = new AudioFormat(48000, 16, 1, true, false);
    byte[] audioData = new byte[samples.length * 2];
    // Convert float samples to little-endian 16-bit PCM bytes
    for (int i = 0; i < samples.length; i++) {
        // Clamp to [-1.0, 1.0] so out-of-range values do not wrap around
        float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
        short sample = (short) (clamped * 32767);
        audioData[i * 2] = (byte) (sample & 0xFF);
        audioData[i * 2 + 1] = (byte) (sample >> 8);
    }
    ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
    try (AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize())) {
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
    }
}
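The helper functions above convert between 16-bit little-endian PCM bytes and float samples. The following sketch isolates that arithmetic for a single sample so it can be checked on its own. The class and method names are hypothetical, and the sketch clamps out-of-range values before encoding as a precaution against integer wrap-around.

```java
// Hypothetical illustration of the 16-bit PCM <-> float conversion; not part of the Revoize SDK.
final class PcmConversionSketch {
    /** Decodes one little-endian signed 16-bit sample to a float in [-1.0, 1.0). */
    static float decode(byte low, byte high) {
        // Mask the low byte to avoid sign extension; the high byte carries the sign.
        short sample = (short) ((low & 0xFF) | (high << 8));
        return sample / 32768.0f;
    }

    /** Encodes a float as little-endian signed 16-bit PCM, clamping to [-1.0, 1.0] first. */
    static byte[] encode(float value) {
        float clamped = Math.max(-1.0f, Math.min(1.0f, value));
        short sample = (short) (clamped * 32767);
        return new byte[] { (byte) (sample & 0xFF), (byte) (sample >> 8) };
    }
}
```

Decoding divides by 32768 while encoding multiplies by 32767, so a round trip is accurate to within about one quantization step rather than bit-exact.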
Define Some Hardcoded Values
To keep this example minimal, we hardcode:
- the input WAV file path
- the output WAV file path
- the chunk size
private static final String INPUT_FILE = "input.wav";
private static final String OUTPUT_FILE = "output.wav";
private static final int CHUNK_SIZE = 480;
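If no suitable recording is at hand, a synthetic input file can be generated for a first run. The sketch below writes a mono, 16-bit, 48 kHz sine tone using only javax.sound.sampled. The class TestToneWriter, the 440 Hz frequency, and the 0.5 amplitude are arbitrary choices for illustration; a pure tone only verifies that the pipeline runs and says nothing about enhancement quality.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;

// Hypothetical utility for producing a test input; not part of the Revoize SDK.
final class TestToneWriter {
    /** Writes a mono, 16-bit, 48 kHz sine tone of the given frequency and duration. */
    static void writeTone(File file, double frequencyHz, double seconds) throws IOException {
        final float sampleRate = 48000f;
        int frameCount = (int) Math.round(seconds * sampleRate);
        byte[] pcm = new byte[frameCount * 2];
        for (int i = 0; i < frameCount; i++) {
            // Half-scale amplitude leaves headroom below full scale.
            double value = 0.5 * Math.sin(2 * Math.PI * frequencyHz * i / sampleRate);
            short sample = (short) Math.round(value * 32767);
            pcm[i * 2] = (byte) (sample & 0xFF);   // low byte first (little-endian)
            pcm[i * 2 + 1] = (byte) (sample >> 8); // high byte second
        }
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        try (AudioInputStream stream =
                 new AudioInputStream(new ByteArrayInputStream(pcm), format, frameCount)) {
            AudioSystem.write(stream, AudioFileFormat.Type.WAVE, file);
        }
    }

    public static void main(String[] args) throws IOException {
        // Produces a one-second tone at the path this example reads from.
        writeTone(new File("input.wav"), 440.0, 1.0);
    }
}
```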
Initialize the Revoize SDK
Before we can start processing audio, we need to initialize the Revoize SDK by calling the init function with the desired model type.
// Initialize with Capella
RevoizeSDK.init(ModelType.CAPELLA);
There are various model types available in the Revoize SDK, but for this example, we are using the CAPELLA model, which is a lightweight discriminative model suitable for general denoising tasks.
Load the Input WAV File
Next, we need to load the input WAV file from disk. We use the readWavFile helper function to read the WAV file.
// Read input WAV file
float[] inputSamples = readWavFile(INPUT_FILE);
The readWavFile function returns an array of audio samples. The samples are stored as float values to make them compatible with the Revoize SDK's process function.
Process the Audio in Chunks
Now that we have the audio samples, we can process them in chunks using the Revoize SDK. We iterate over the audio samples in chunks of 480 samples and process each chunk using the process function. Any trailing samples that do not fill a complete chunk are skipped.
// Process the audio in chunks
List<Float> processedAudio = new ArrayList<>();
for (int i = 0; i < inputSamples.length; i += CHUNK_SIZE) {
    // If fewer than CHUNK_SIZE (480) samples are left, skip them
    if (i + CHUNK_SIZE > inputSamples.length) {
        break;
    }
    // Process this chunk
    float[] chunk = Arrays.copyOfRange(inputSamples, i, i + CHUNK_SIZE);
    float[] outputChunk = RevoizeSDK.process(chunk);
    // Append to output
    for (float sample : outputChunk) {
        processedAudio.add(sample);
    }
}
The process function takes an input audio chunk, processes it, and returns the enhanced samples. We store all processed chunks in a list called processedAudio.
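The loop in this example drops any samples left over after the last full chunk. One possible alternative, sketched below, zero-pads the final partial chunk so that the output has the same length as the input. To keep the sketch runnable without the SDK, the per-chunk call is represented by a hypothetical processor parameter of type UnaryOperator<float[]>. The sketch assumes that each processed chunk has the same length as its input and lines up with it sample for sample, which this document does not confirm for the SDK.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical sketch; `processor` stands in for the SDK's per-chunk call.
final class ChunkedProcessingSketch {
    static final int CHUNK_SIZE = 480;

    /** Processes input in fixed-size chunks, zero-padding the final partial chunk. */
    static float[] processAll(float[] input, UnaryOperator<float[]> processor) {
        float[] output = new float[input.length];
        for (int start = 0; start < input.length; start += CHUNK_SIZE) {
            // Number of real (non-padding) samples in this chunk.
            int valid = Math.min(CHUNK_SIZE, input.length - start);
            // Arrays.copyOfRange pads with zeros when the range runs past the end of the array.
            float[] chunk = Arrays.copyOfRange(input, start, start + CHUNK_SIZE);
            float[] processed = processor.apply(chunk);
            // Keep only the samples that correspond to real input.
            System.arraycopy(processed, 0, output, start, valid);
        }
        return output;
    }
}
```

Writing into a preallocated float[] also avoids boxing every sample into a Float, at the cost of fixing the output length up front.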
Save the Processed Audio to a New WAV File
Finally, we save the processed audio to a new WAV file using the writeWavFile helper function.
// Convert List<Float> to float[]
float[] processedAudioArray = new float[processedAudio.size()];
for (int i = 0; i < processedAudio.size(); i++) {
    processedAudioArray[i] = processedAudio.get(i);
}
// Write output WAV file
writeWavFile(OUTPUT_FILE, processedAudioArray);
Full Code Example
Below is the complete minimal source code example that demonstrates how to process a single audio file using the Revoize SDK.
import com.revoize.sdk.RevoizeSDK;
import com.revoize.sdk.ModelType;
import javax.sound.sampled.*;
import java.io.*;
import java.util.*;
public class AudioProcessor {
    private static final String INPUT_FILE = "input.wav";
    private static final String OUTPUT_FILE = "output.wav";
    private static final int CHUNK_SIZE = 480;
    private static float[] readWavFile(String filename) throws UnsupportedAudioFileException, IOException {
        AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filename));
        try {
            AudioFormat format = audioInputStream.getFormat();
            // Convert to mono if necessary
            if (format.getChannels() > 1) {
                AudioFormat monoFormat = new AudioFormat(
                    format.getSampleRate(),
                    format.getSampleSizeInBits(),
                    1,
                    true,
                    format.isBigEndian()
                );
                audioInputStream = AudioSystem.getAudioInputStream(monoFormat, audioInputStream);
            }
            // Read the audio data
            byte[] audioData = audioInputStream.readAllBytes();
            float[] samples = new float[audioData.length / 2]; // Assumes 16-bit little-endian PCM
            // Convert each little-endian byte pair to a float sample in [-1.0, 1.0)
            for (int i = 0; i < samples.length; i++) {
                short sample = (short) ((audioData[i * 2] & 0xFF) | (audioData[i * 2 + 1] << 8));
                samples[i] = sample / 32768.0f;
            }
            return samples;
        } finally {
            // Close the stream even if reading or conversion fails
            audioInputStream.close();
        }
    }
    private static void writeWavFile(String filename, float[] samples) throws IOException {
        // 48 kHz, 16-bit, mono, signed, little-endian
        AudioFormat format = new AudioFormat(48000, 16, 1, true, false);
        byte[] audioData = new byte[samples.length * 2];
        // Convert float samples to little-endian 16-bit PCM bytes
        for (int i = 0; i < samples.length; i++) {
            // Clamp to [-1.0, 1.0] so out-of-range values do not wrap around
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            short sample = (short) (clamped * 32767);
            audioData[i * 2] = (byte) (sample & 0xFF);
            audioData[i * 2 + 1] = (byte) (sample >> 8);
        }
        ByteArrayInputStream bais = new ByteArrayInputStream(audioData);
        try (AudioInputStream ais = new AudioInputStream(bais, format, audioData.length / format.getFrameSize())) {
            AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File(filename));
        }
    }
    public static void main(String[] args) {
        try {
            // -------------------------------------------------
            // 1. Hardcoded parameters and initialization
            // -------------------------------------------------
            // Initialize with Capella
            RevoizeSDK.init(ModelType.CAPELLA);
            // -------------------------------------------------
            // 2. Load the input WAV file
            // -------------------------------------------------
            float[] inputSamples = readWavFile(INPUT_FILE);
            // -------------------------------------------------
            // 3. Process the audio in chunks
            // -------------------------------------------------
            List<Float> processedAudio = new ArrayList<>();
            for (int i = 0; i < inputSamples.length; i += CHUNK_SIZE) {
                // If fewer than CHUNK_SIZE (480) samples are left, skip them
                if (i + CHUNK_SIZE > inputSamples.length) {
                    break;
                }
                // Process this chunk
                float[] chunk = Arrays.copyOfRange(inputSamples, i, i + CHUNK_SIZE);
                float[] outputChunk = RevoizeSDK.process(chunk);
                // Append to output
                for (float sample : outputChunk) {
                    processedAudio.add(sample);
                }
            }
            // -------------------------------------------------
            // 4. Save the processed audio to a new WAV file
            // -------------------------------------------------
            // Convert List<Float> to float[]
            float[] processedAudioArray = new float[processedAudio.size()];
            for (int i = 0; i < processedAudio.size(); i++) {
                processedAudioArray[i] = processedAudio.get(i);
            }
            // Write output WAV file
            writeWavFile(OUTPUT_FILE, processedAudioArray);
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            e.printStackTrace();
            System.exit(1);
        }
    }
}
Example #2: Real-time Speech Enhancement
Coming Soon.