HTML5 Audio – Implementation notes.

Recently I’ve been messing around with the idea of adding sound capabilities to CAAT. The idea is pretty straightforward: play a sound at any given time, or simply have a looping audio channel playing forever. Here are some lessons I’ve learnt from my sound implementation.

First of all, not all major browser vendors support audio tags. Microsoft’s Internet Explorer, while offering neat and steadily improving canvas support, completely lacks audio support at the time of writing. Even worse, neither the Audio object nor HTMLAudioElement is defined there, so either wrap your code in try/catch to avoid thrown exceptions, or call document.createElement('audio') and check that the returned element actually has a canPlayType method.
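
A minimal sketch of such a check could look like this. The helper name createAudioOrNull is made up for illustration and is not part of CAAT; it takes the document object as a parameter only so that the check can also be exercised outside a browser.

```javascript
// Hypothetical helper (not part of CAAT): returns an audio element, or
// null when the browser has no usable HTML5 audio support.
function createAudioOrNull(doc) {
    try {
        var audio = doc.createElement('audio');
        // Browsers without audio support hand back a generic element
        // that has no canPlayType method.
        if (audio && typeof audio.canPlayType === 'function') {
            return audio;
        }
    } catch (e) {
        // createElement itself failed; treat it as "no audio".
    }
    return null;
}
```

In a page this would be called as createAudioOrNull(document), and a null result means sound should simply be disabled.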

The other audio-supporting browsers do not support all sound formats. There’s a function to find out whether your browser will be able to play a given sound format. Well, not really. Depending on the browser, return values from this method could be 'probably', 'maybe', an empty string or, in some older implementations, 'no'. Yes, 'maybe' is a possible value. So the bad news is that you’d better have your sounds in all available formats, and feed them to your audio elements until the call to audio.canPlayType( mimeType {string} ) returns something other than '' or 'no'. My implementation deals with these formats:

  • 'mp3': 'audio/mpeg;'
  • 'ogg': 'audio/ogg; codecs="vorbis"'
  • 'wav': 'audio/wav; codecs="1"'
  • 'mp4': 'audio/mp4; codecs="mp4a.40.2"'

For example, Safari is well known for not supporting the ogg/vorbis sound format. Something unacceptable, btw.
The wav format seems to be accepted by every sound-capable browser I’ve tested so far.
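
The "feed formats until one is accepted" idea can be sketched roughly as follows. The function pickPlayableUrl and the file names in the usage note are invented for illustration; only the MIME type table comes from the list above, and the element is passed in so that any object with a canPlayType method will do.

```javascript
// MIME types per file extension, copied from the list above.
var audioTypes = {
    mp3: 'audio/mpeg;',
    ogg: 'audio/ogg; codecs="vorbis"',
    wav: 'audio/wav; codecs="1"',
    mp4: 'audio/mp4; codecs="mp4a.40.2"'
};

// Hypothetical helper: returns the first url the element claims it can
// play, or null if none of them qualifies.
function pickPlayableUrl(audio, urls) {
    for (var i = 0; i < urls.length; i++) {
        var extension = urls[i].slice(urls[i].lastIndexOf('.') + 1);
        // Skip extensions we have no MIME type for.
        if (!Object.prototype.hasOwnProperty.call(audioTypes, extension)) {
            continue;
        }
        var answer = audio.canPlayType(audioTypes[extension]);
        // Anything other than '' or 'no' is taken as "worth a try".
        if (answer !== '' && answer !== 'no') {
            return urls[i];
        }
    }
    return null;
}
```

Called with something like pickPlayableUrl(audio, ['boom.ogg', 'boom.mp3', 'boom.wav']), it hands back the first candidate the browser does not reject outright.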

Another problem arises when trying to loop a sound. On any given audio tag or Audio/HTMLAudioElement object you can set the 'loop' property to true, and the sound is supposed to loop forever. Not really, at least in Firefox 3.x and 4.x Beta: Firefox will play the sound exactly once and then stop. Fortunately, there are some out-of-the-box audio events which come to the rescue. So to make this audio loop, I execute the following code:

  audio.addEventListener(
    'ended',
    // on sound end, trigger this function
    function(audioEvent) {
      // get the audioNode element
      var target= audioEvent.target;
      // and make it loop by setting audio time to 0.
      target.currentTime= 0;
    },
    false );

This code sets the audio time to 0 when the audio ends, which has (more or less) the effect of looping the sound. I attach that function only for Firefox browsers, though. The other browsers, which honor the loop attribute, loop the sound with different degrees of accuracy. On my Mac OS X 10.6+, Chrome seems to loop a little before the sound finishes playing, and FF4+ sometimes has a noticeable pause before looping with the ended-event function method.
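
Both looping strategies can be put behind one small function, sketched here under a couple of assumptions: the name enableLoop and its useEndedFallback flag are hypothetical, and the explicit play() call after rewinding is my own addition, since rewinding alone may not resume playback in every browser.

```javascript
// Hypothetical helper: make an audio element loop, either natively or
// by rewinding it whenever it reaches its end.
function enableLoop(audio, useEndedFallback) {
    if (!useEndedFallback) {
        // Browsers that honor the loop property need nothing else.
        audio.loop = true;
        return;
    }
    audio.addEventListener('ended', function (audioEvent) {
        var target = audioEvent.target;
        // Rewind to the start of the sound...
        target.currentTime = 0;
        // ...and restart it explicitly (assumption, see the note above).
        target.play();
    }, false);
}
```

The caller decides which path to take, for example from a browser check like the one used for Firefox here.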

Another issue I found was how to play multiple instances of the same audio element. To do so, I had to create new Audio elements on the fly, assign sources and related resources to them, and then play the sound. This led to some misbehavior, making the browser (Chrome specifically) lag and pause to unacceptable levels. So I decided to pre-create a pool of Audio elements and borrow from those pre-cached elements. The drawback is that this pool caps the number of concurrent sounds; I’ve set the number to 8. I think eight concurrent sounds should be enough.
When playing a sound, an element is borrowed from this pool and added to a temporary pool of working audio elements. When the audio ends, the element is moved back from the working pool to the available pool for future reuse. Looping audio elements create their own audio channels, and are stored and managed in a totally different structure from that of regular audio elements.
This is a code snippet from my AudioManager class initialization:

// Allocate 'numChannels' audio channels. An audio channel is just an Audio or 
// HTMLAudioElement object.
for( var i=0; i<numChannels; i++ ) {
    // create an audio channel.
    var channel= document.createElement('audio');

    if ( null!=channel ) {
        // store this channel in the available audio channels.
        this.channels.push( channel );
        // hack to reference AudioManager instance inside audio's ended 
        // event listener function.
        // scope in javascript deserves a whole thesis by itself.
        var me= this;

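        channel.addEventListener(
                'ended',
                // on audio end
                function(audioEvent) {
                    // identify ending audio element
                    var target= audioEvent.target;
                    var i;

                    // identify and remove from workingChannels
                    for( i=0; i<me.workingChannels.length; i++ ) {
                        if (me.workingChannels[i]==target ) {
                            me.workingChannels.splice(i,1);
                            break;
                        }
                    }

                    // and get it back to available channels for reuse.
                    me.channels.push(target);
                },
                false );
    }
}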

With this audio element reuse scheme, I can easily assign the same sound to different channels and play the very same sound concurrently. I found out that, for example, Chrome would crash or become unresponsive when creating many concurrent audio elements. With this pooling strategy, I managed to get a consistent AudioManager.
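
Stripped of the DOM, the borrow/return bookkeeping of such a pool boils down to two arrays. The following is only an illustrative sketch: ChannelPool, borrow and release are made-up names, not CAAT’s API, and the channels can be any objects.

```javascript
// Hypothetical, DOM-free sketch of the pool bookkeeping.
function ChannelPool(channels) {
    this.available = channels.slice(); // idle channels, ready to be borrowed
    this.working = [];                 // channels currently playing
}

// Take an idle channel, or return null when all of them are busy
// (the play request is then simply dropped).
ChannelPool.prototype.borrow = function () {
    if (this.available.length === 0) {
        return null;
    }
    var channel = this.available.shift();
    this.working.push(channel);
    return channel;
};

// Move a channel back from the working list to the available list.
// Returns false if the channel was not borrowed in the first place.
ChannelPool.prototype.release = function (channel) {
    var index = this.working.indexOf(channel);
    if (index === -1) {
        return false;
    }
    this.working.splice(index, 1);
    this.available.push(channel);
    return true;
};
```

In a real manager, release would be called from each channel’s 'ended' listener, and borrow from the play method.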

Another thing to take into account is whether to create Audio elements from script (either by executing new Audio() or by calling document.createElement('audio')) or to integrate them in the html file by using <audio> tags. Personally I’d prefer embedding the audio tags into the html and marking them as preload='auto' so that the browser starts preloading sounds right away. This saves time, since by the time your javascript has been loaded some audio may have been preloaded as well. As this is just a preference, in CAAT’s AudioManager implementation you can pass either a url or a DOM audio node to the function that adds sounds. The function needs an id, a url_or_node and an optional end-of-playback callback function. It won’t add the sound to the catalog of playable audio files unless the browser answers something other than 'no' to a call to the audio object’s canPlayType( audioTypes[extension] ) function. This is how my AudioManager implementation handles the two audio sources:

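addAudio : function( id, url_or_node, endplaying_callback ) {
    var audio= null;
    // if a string, interpret the url_or_node as a url.
    if ( typeof url_or_node == "string" ) {
        audio= document.createElement('audio');
        if ( null!=audio ) {
            // check canPlayType for the url's extension, then set
            // audio.src and store the element in the audio cache.
        }
    } else {
        // not all browsers have the HTMLAudioElement or Audio object
        // present, so better wrap it up.
        try {
            // url_or_node is a html node ?. Then treat it as an embedded
            // audio tag.
            if ( url_or_node instanceof HTMLAudioElement ) {
                // check canPlayType for the node's src extension, then
                // store the node itself in the audio cache.
            }
        } catch(e) {
            // no audio support: fall through.
        }
    }
    // the elided steps return true on success; see the full listing
    // at the end of the post.
    return false;
},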

Apart from all that hacking with the sound API, the API itself has some strange behaviors. For example, there’s no stop method, just a pause method. If you embed an audio tag into the html, you can set up some properties like autoplay and loop (not all browsers will honor that) as attributes, while others such as volume have to be set from script.
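
A stop operation is easy enough to emulate by pausing and rewinding. This is just a sketch, and stopAudio is a name I made up; the try/catch is there on the assumption that some browsers refuse to seek on an element that has not loaded anything yet.

```javascript
// Hypothetical helper: emulate the missing stop() by pausing the
// element and rewinding it to the beginning.
function stopAudio(audio) {
    audio.pause();
    try {
        audio.currentTime = 0;
    } catch (e) {
        // Seeking was rejected (for example, nothing loaded yet);
        // the element is paused anyway, so there is nothing to rewind.
    }
}
```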

So in the end, and as a conclusion imho, the audio API is buggy and incomplete, with very different flavours of support, so better wrap it all into a hacky class, and pray for improved support in the future.

Here is my final AudioManager implementation. If you see something to improve, please let me know.

/**
 * @author  Hyperandroid  ||
 * Sound implementation.
 */

(function() {

    /**
     * This class is a sound manager implementation which can play up to 'numChannels' sounds at the same time.
     * By default, CAAT.Director instances will set eight channels to play sound.
     *
     * If more than 'numChannels' sounds want to be played at the same time the requests will be dropped,
     * so no more than 'numChannels' sounds can be concurrently played.
     *
     * Available sounds to be played must be supplied to every CAAT.Director instance by calling addSound
     * method. The default implementation will accept a URL/URI or a HTMLAudioElement as source.
     *
     * The cached elements can be played, or looped. The loop method will return a handler to
     * give the opportunity of cancelling the sound.
     *
     * Be aware that Audio.canPlayType is able to return 'probably', 'maybe', '', 'no', ..., so anything
     * different from '' and 'no' will do.
     *
     * @constructor
     *
     */
    CAAT.AudioManager= function() {
        this.browserInfo= new CAAT.BrowserDetect();
        return this;
    };

    CAAT.AudioManager.prototype= {

        browserInfo:        null,
        musicEnabled:       true,
        fxEnabled:          true,
        audioCache:         null,   // audio elements.
        channels:           null,   // available playing channels.
        workingChannels:    null,   // currently playing channels.
        loopingChannels:    [],
        audioTypes: {               // supported audio formats. Don't remember where i took them from :S
            "mp3": "audio/mpeg;",
            "ogg": "audio/ogg; codecs='vorbis'",
            "wav": "audio/wav; codecs='1'",
            "mp4": "audio/mp4; codecs='mp4a.40.2'"
        },

        /**
         * Initializes the sound subsystem by creating a fixed number of Audio channels.
         * Every channel registers a handler for sound playing finalization. If a callback is set, the
         * callback function will be called with the associated sound id in the cache.
         *
         * @param numChannels {number} number of channels to pre-create. 8 by default.
         *
         * @return this.
         */
        initialize : function(numChannels) {

            this.audioCache=      [];
            this.channels=        [];
            this.workingChannels= [];

            for( var i=0; i<numChannels; i++ ) {
                var channel= document.createElement('audio');

                if ( null!=channel ) {
                    channel.finished= -1;
                    this.channels.push( channel );
                    var me= this;
                    channel.addEventListener(
                            'ended',
                            // on sound end, set channel to available channels list.
                            function(audioEvent) {
                                var target= audioEvent.target;
                                var i;

                                // remove from workingChannels
                                for( i=0; i<me.workingChannels.length; i++ ) {
                                    if (me.workingChannels[i]==target ) {
                                        me.workingChannels.splice(i,1);
                                        break;
                                    }
                                }

                                if ( target.caat_callback ) {
                                    target.caat_callback(target.caat_id);
                                }

                                // set back to channels.
                                me.channels.push(target);
                            },
                            false
                    );
                }
            }

            return this;
        },

        /**
         * creates an Audio object and adds it to the audio cache in case the url points to a
         * suitable audio file to be played.
         *
         * If this method returns false, that sound won't be available for play, and you should try with another
         * sound format.
         *
         * @param id {object} an object to reference the audio object. Typically a string.
         * @param url {string|HTMLElement} an url pointing to an audio resource or an HTMLAudioElement
         * object.
         * @param endplaying_callback {function} a callback function to notify on audio finalization. The
         * function is of the form function(id{string}). The id parameter is the associated id
         * in the cache.
         *
         * @return {boolean} a boolean value indicating whether the sound's been added to the catalog of playable
         * audio elements. A value of false means either the browser does not support audio elements, or responds 'no'
         * to audio object's function 'canPlayType()'.
         */
        addAudio : function( id, url, endplaying_callback ) {
            var audio= null;
            var extension= null;

            if ( typeof url == "string" ) {
                audio= document.createElement("audio");
                if ( null!=audio ) {

                    if(!audio.canPlayType) {
                        return false;
                    }

                    extension= url.substr(url.lastIndexOf(".")+1);
                    var canplay= audio.canPlayType(this.audioTypes[extension]);

                    if(canplay!=="" && canplay!=="no") {
                        audio.src= url;
                        audio.preload = "auto";
                        audio.load();
                        if ( endplaying_callback ) {
                            audio.caat_callback= endplaying_callback;
                            audio.caat_id= id;
                        }
                        this.audioCache.push( { id:id, audio:audio } );

                        return true;
                    }
                }
            } else {
                try {
                    if ( url instanceof HTMLAudioElement ) {
                        audio= url;
                        extension= audio.src.substr(audio.src.lastIndexOf(".")+1);
                        if ( audio.canPlayType(this.audioTypes[extension]) ) {
                            if ( endplaying_callback ) {
                                audio.caat_callback= endplaying_callback;
                                audio.caat_id= id;
                            }
                            this.audioCache.push( { id:id, audio:audio } );

                            return true;
                        }
                    }
                } catch(e) {
                }
            }

            return false;
        },

        /**
         * Returns an audio object.
         * @param aId {object} the id associated to the target Audio object.
         * @return {object} the HTMLAudioElement associated to the given id.
         */
        getAudio : function(aId) {
            for( var i=0; i<this.audioCache.length; i++ ) {
                if ( this.audioCache[i].id==aId ) {
                    return this.audioCache[i].audio;
                }
            }

            return null;
        },

        /**
         * Plays an audio file from the cache if any sound channel is available.
         * The playing sound will occupy a sound channel and when ends playing will leave
         * the channel free for any other sound to be played in.
         * @param id {object} an object identifying a sound in the sound cache.
         * @return this.
         */
        play : function( id ) {
            if ( !this.fxEnabled ) {
                return this;
            }

            var audio= this.getAudio(id);
            // the audio exists, and there is also an audio channel available.
            if ( null!=audio && this.channels.length>0 ) {
                var channel= this.channels.shift();
                channel.src= audio.src;
                channel.load();
                channel.play();
                this.workingChannels.push(channel);
            }

            return this;
        },

        /**
         * This method creates a new AudioChannel to loop the sound with.
         * It returns an Audio object so that the developer can cancel the sound loop at will.
         * The user must call pause() method to stop playing a loop.
         *
         * Firefox does not honor the loop property, so looping is performed by attending end playing
         * event on audio elements.
         *
         * @return {HTMLElement|null} an Audio instance if a valid sound id is supplied. Null otherwise
         */
        loop : function( id ) {

            if (!this.musicEnabled) {
                return this;
            }

            var audio_in_cache= this.getAudio(id);
            // the audio exists in the cache.
            if ( null!=audio_in_cache ) {
                var audio= document.createElement('audio');
                if ( null!=audio ) {
                    audio.src= audio_in_cache.src;
                    audio.preload = "auto";

                    if ( this.browserInfo.browser=='Firefox') {
                        audio.addEventListener(
                            'ended',
                            // on sound end, rewind so that the sound loops.
                            function(audioEvent) {
                                var target= audioEvent.target;
                                target.currentTime=0;
                            },
                            false
                        );
                    } else {
                        audio.loop= true;
                    }
                    audio.load();
                    audio.play();
                    this.loopingChannels.push(audio);
                    return audio;
                }
            }

            return null;
        },

        /**
         * Cancel all playing audio channels
         * Get back the playing channels to available channel list.
         *
         * @return this
         */
        endSound : function() {
            var i;
            for( i=0; i<this.workingChannels.length; i++ ) {
                this.workingChannels[i].pause();
                this.channels.push( this.workingChannels[i] );
            }
            // the channels are available again, so none of them is working.
            this.workingChannels= [];

            for( i=0; i<this.loopingChannels.length; i++ ) {
                this.loopingChannels[i].pause();
            }

            return this;
        },

        setSoundEffectsEnabled : function( enable ) {
            this.fxEnabled= enable;
            return this;
        },

        isSoundEffectsEnabled : function() {
            return this.fxEnabled;
        },

        setMusicEnabled : function( enable ) {
            this.musicEnabled= enable;
            for( var i=0; i<this.loopingChannels.length; i++ ) {
                if ( enable ) {
                    this.loopingChannels[i].play();
                } else {
                    this.loopingChannels[i].pause();
                }
            }
            return this;
        },

        isMusicEnabled : function() {
            return this.musicEnabled;
        }
    };
})();

