Stable Audio Open Online

An open source text-to-audio model developed by Stability AI for generating audio samples up to 47 seconds long. Use Stable Audio Open online for free.


Stable Audio Open Overview

Generates variable-length stereo audio at 44.1 kHz from text prompts, up to 47 seconds long
Specialized for creating drum beats, instrument riffs, ambient sounds, foley recordings, and other audio samples for music production and sound design
Not optimized for generating full songs, melodies, or vocals
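To give a sense of scale, the stated limits (stereo, 44.1 kHz, up to 47 seconds) can be turned into sample counts. This is a minimal sketch: the function names are illustrative and not part of any Stable Audio API, and the 16-bit PCM size is an assumption about how a clip might be stored.

```python
SAMPLE_RATE_HZ = 44_100  # output sample rate stated above
MAX_SECONDS = 47         # maximum clip length stated above
CHANNELS = 2             # stereo


def frames_for(seconds: float) -> int:
    """Number of sample frames (one frame = one sample per channel) in a clip."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be in (0, {MAX_SECONDS}]")
    return round(seconds * SAMPLE_RATE_HZ)


def pcm16_bytes(seconds: float) -> int:
    """Raw size of the clip if stored as 16-bit PCM (assumed: 2 bytes per sample)."""
    return frames_for(seconds) * CHANNELS * 2


print(frames_for(47))   # 2072700 frames for a full-length clip
print(pcm16_bytes(47))  # 8290800 bytes, roughly 8.3 MB uncompressed
```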

Stable Audio Open Sample Audio

Rock beat played in a treated studio, session drumming on an acoustic kit

Blackbird song, summer, dusk in the forest

Warm arpeggios on an analog synthesizer with a gradually rising filter cutoff and a reverb tail

Stable Audio Open Model Architecture

Comprises three components: an autoencoder that compresses waveforms into a shorter latent sequence, a T5-based text embedding for conditioning, and a transformer-based diffusion model operating in the autoencoder's latent space.

Because the diffusion model is trained on these compressed latents rather than on raw waveforms, the overall approach is latent diffusion with a transformer backbone.
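The effect of working in the autoencoder's latent space can be illustrated with a shape calculation. The sketch below is not taken from this page: the downsampling ratio (2048) and latent channel count (64) are assumptions based on the publicly described 1.0 model, and the helper names are hypothetical.

```python
AUDIO_CHANNELS = 2         # stereo waveform
DOWNSAMPLING_RATIO = 2048  # assumption: time-axis compression of the autoencoder
LATENT_CHANNELS = 64       # assumption: channels in the latent sequence


def latent_shape(num_frames: int) -> tuple[int, int]:
    """(channels, length) of the latent sequence for a waveform of num_frames."""
    if num_frames < DOWNSAMPLING_RATIO:
        raise ValueError("waveform is shorter than one latent step")
    # Assumed floor division; a real implementation may pad instead.
    return (LATENT_CHANNELS, num_frames // DOWNSAMPLING_RATIO)


def compression_factor(num_frames: int) -> float:
    """How many waveform values correspond to one latent value."""
    channels, length = latent_shape(num_frames)
    return (AUDIO_CHANNELS * num_frames) / (channels * length)


full_clip = 47 * 44_100            # frames in a 47-second clip at 44.1 kHz
print(latent_shape(full_clip))     # (64, 1012)
print(compression_factor(full_clip))
```

Under these assumptions the diffusion transformer sees a sequence of about a thousand steps instead of about two million audio frames, which is what makes transformer-based generation at this length practical.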

Stable Audio Open Training Data

Trained on 486,492 audio recordings from Freesound (472,618) and Free Music Archive (13,874), all licensed under CC0, CC BY, or CC Sampling+.

The dataset went through a rigorous screening process to ensure that no unauthorized copyrighted music was included in the training data.

Stable Audio Open Usage and Fine-tuning

Designed to be used with the open source stable-audio-tools library for inference and fine-tuning.

Users can fine-tune the model on their own custom audio data, e.g., a drummer fine-tuning on their own drum recordings.
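For inference, a script along the following lines is a reasonable starting point. It is a sketch under stated assumptions, not a definitive recipe: the model ID `stabilityai/stable-audio-open-1.0` and the sampler settings follow the publicly documented example for the 1.0 checkpoint, while `build_conditioning` and `generate_clip` are hypothetical helper names introduced here. It assumes `stable-audio-tools`, `torch`, `torchaudio`, and `einops` are installed and that the license has been accepted on Hugging Face.

```python
MAX_SECONDS = 47  # maximum clip length stated for the model


def build_conditioning(prompt, seconds_total, seconds_start=0):
    """Build the text-and-timing conditioning list for a single clip."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0 < seconds_total <= MAX_SECONDS:
        raise ValueError(f"seconds_total must be in (0, {MAX_SECONDS}]")
    return [{
        "prompt": prompt,
        "seconds_start": seconds_start,
        "seconds_total": seconds_total,
    }]


def generate_clip(prompt, seconds_total, out_path="output.wav",
                  steps=100, cfg_scale=7):
    """Generate one clip from a text prompt and save it as a 16-bit WAV file."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without the full GPU stack installed.
    import torch
    import torchaudio
    from einops import rearrange
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Assumed model ID; downloading requires an authenticated Hugging Face session.
    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)
    sample_rate = model_config["sample_rate"]

    output = generate_diffusion_cond(
        model,
        steps=steps,
        cfg_scale=cfg_scale,
        conditioning=build_conditioning(prompt, seconds_total),
        sample_size=model_config["sample_size"],
        sigma_min=0.3,
        sigma_max=500,
        sampler_type="dpmpp-3m-sde",
        device=device,
    )

    # Collapse the batch into one (channels, samples) tensor, then trim to length.
    output = rearrange(output, "b d n -> d (b n)").to(torch.float32)
    output = output[:, : int(seconds_total * sample_rate)]

    # Peak-normalize, clip, and convert to 16-bit integers before saving.
    peak = output.abs().max().clamp_min(1e-8)
    pcm = (output / peak).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
    torchaudio.save(out_path, pcm, sample_rate)
    return out_path
```

Calling `generate_clip("Rock beat played in a treated studio, session drumming on an acoustic kit", 30)` would then write a 30-second stereo file to `output.wav`. Validating the duration before loading the model keeps a bad request from costing a full model download.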

Stable Audio Open Licensing and Access

Available under Stability AI's non-commercial research community agreement license.

Model weights accessible on Hugging Face after agreeing to the license.