7. Adding Audio to Your Ebitengine Game

Watch the associated video on YouTube or MakerTube

Introduction #

In this tutorial we will learn how to play sound effects and music using the Ebitengine game engine.

The code for this tutorial is available here. Please consider donating if you find this tutorial helpful.

Pulse-Code Modulation (PCM) #

Pulse-code modulation is a method of encoding an analog signal into a digital representation.

The signal is recorded (and replayed) at a specific sample rate, such as 44100Hz.

The range of values available for each individual sample is the bit depth.

The sample rate and bit depth of an audio stream both influence the fidelity of the reproduced sound.
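
To make sample rate and bit depth concrete, here is a minimal sketch that samples a sine wave and quantizes each sample to a signed 16-bit value. The function name sineSamples and its parameters are illustrative assumptions and are not part of Ebitengine.

```go
package main

import "math"

// sineSamples samples a sine wave of the given frequency (in Hz) at the
// given sample rate and quantizes each sample to a signed 16-bit integer.
// Hypothetical helper for illustration only.
func sineSamples(freq float64, sampleRate int, n int) []int16 {
	samples := make([]int16, n)
	for i := range samples {
		t := float64(i) / float64(sampleRate) // Time of sample i, in seconds.
		v := math.Sin(2 * math.Pi * freq * t) // Analog value in [-1, 1].
		samples[i] = int16(math.Round(v * math.MaxInt16))
	}
	return samples
}
```

A higher sample rate takes more snapshots of the wave per second, while a higher bit depth gives each snapshot more possible values.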

Audio Encoding Formats #

Raw PCM data can be stored as an uncompressed WAV file, which will reproduce the signal exactly.

As a result, lossless audio files tend to be much larger than lossy files.

It’s best to make the size of the game as small as possible to speed up downloading and initializing.

Lossy formats can help us achieve this, but care must be taken to preserve the quality of the sound.
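
As a rough illustration of why size matters, the sketch below computes how many bytes raw PCM audio occupies. The helper name pcmSize is an assumption made for this example.

```go
package main

// pcmSize returns the number of bytes needed to store uncompressed PCM audio.
// Hypothetical helper for illustration only.
func pcmSize(sampleRate, bitDepth, channels, seconds int) int {
	bytesPerSample := bitDepth / 8
	return sampleRate * bytesPerSample * channels * seconds
}
```

For example, pcmSize(44100, 16, 2, 60) is 10584000, so one minute of 16-bit stereo audio at 44100Hz takes roughly 10.6 MB before any compression.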

Common audio formats include:

  • Lossy:
  • Lossless:
    • FLAC (Free Lossless Audio Codec)
    • WAV

Typically I save audio for use in games in OGG format at several quality levels between 4 and 7.

Then I listen to the files from best to worst quality, comparing each with the original, and keep the lowest quality level that still sounds the same.

This method produces small audio files without noticeable losses in fidelity.

Ebitengine Audio Context #

The Ebitengine audio context specifies the sample rate of the application. Only one context may be created per application, and it is shared by all players.

The sample rate of the context and audio assets must all match. If different rates are used, garbled output will result.
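
A small sketch shows what goes wrong when rates differ: the same frames take a different amount of time to play depending on the rate at which they are consumed. The helper name playbackDuration is an assumption made for this example.

```go
package main

import "time"

// playbackDuration returns how long the given number of frames takes to play
// when consumed at sampleRate frames per second.
// Hypothetical helper for illustration only.
func playbackDuration(frames, sampleRate int) time.Duration {
	return time.Duration(frames) * time.Second / time.Duration(sampleRate)
}
```

A clip of 44100 frames recorded at 22050Hz lasts two seconds. If those same frames are consumed at 44100Hz, they last only one second, so the clip plays at double speed and at a higher pitch.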

Ebitengine Audio Streams #

Ebitengine audio streams are PCM data streams encoded using either of the following data formats:

  • 32 bit float, 2 channel stereo
  • Signed 16 bit integer, 2 channel stereo
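
The two formats carry the same information at different precision. As a minimal sketch, assuming little-endian byte order, signed 16-bit samples can be converted to 32-bit floats as follows. The function name int16ToFloat32 is hypothetical and not part of Ebitengine.

```go
package main

import "encoding/binary"

// int16ToFloat32 converts signed 16-bit little-endian PCM bytes into float32
// samples in the range [-1, 1). A trailing odd byte is ignored.
// Hypothetical helper for illustration only.
func int16ToFloat32(pcm []byte) []float32 {
	out := make([]float32, len(pcm)/2)
	for i := range out {
		v := int16(binary.LittleEndian.Uint16(pcm[2*i:]))
		out[i] = float32(v) / 32768
	}
	return out
}
```

In a stereo stream the samples alternate between the left and right channels, so each frame consists of two consecutive samples.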

Any package which decodes files into a compatible stream may be used with Ebitengine. Common packages include:

Here we use the vorbis package to decode an OGG file, supplied as a byte slice, into a PCM stream:

import (
	"bytes"

	"github.com/hajimehoshi/ebiten/v2/audio/vorbis"
)

const sampleRate = 44100

// loadOGG decodes OGG audio data into a PCM stream at the context's sample rate.
func loadOGG(buf []byte) (*vorbis.Stream, error) {
	stream, err := vorbis.DecodeWithSampleRate(sampleRate, bytes.NewReader(buf))
	if err != nil {
		return nil, err
	}

	return stream, nil
}

Ebitengine Audio Players #

Ebitengine audio players play audio streams. Players are lightweight and may be created on the fly.

Once a player is started, we do not need to maintain a reference to it, unless we intend to later pause or seek the stream.

To play a sound twice using one player we must pause, rewind and then restart the player.
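
The pause, rewind and restart sequence can be sketched as below. The interface rewindablePlayer and the stand-in type fakePlayer are assumptions made for this example; the sketch assumes *audio.Player provides Pause, Rewind and Play methods with these signatures.

```go
package main

// rewindablePlayer lists the player methods this sketch relies on.
// It is assumed that *audio.Player satisfies this interface.
type rewindablePlayer interface {
	Pause()
	Rewind() error
	Play()
}

// replay restarts a sound from the beginning using an existing player.
func replay(p rewindablePlayer) error {
	p.Pause()
	if err := p.Rewind(); err != nil {
		return err
	}
	p.Play()
	return nil
}

// fakePlayer is a stand-in which records the calls it receives, so that the
// call order can be inspected without an audio device.
type fakePlayer struct {
	calls []string
}

func (f *fakePlayer) Pause() { f.calls = append(f.calls, "Pause") }

func (f *fakePlayer) Rewind() error {
	f.calls = append(f.calls, "Rewind")
	return nil
}

func (f *fakePlayer) Play() { f.calls = append(f.calls, "Play") }
```

Because players are lightweight, creating a fresh player for each playback is often simpler than reusing one this way.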

To loop a sound infinitely we wrap the sound’s stream with audio.NewInfiniteLoop:

type game struct {
	// Audio context.
	audioContext *audio.Context

	// Sound files.
	sounds map[string][]byte

	// Sound players.
	players map[string]chan *audio.Player

	// ...
}

// newPlayer returns a new player for the specified sound.
func (g *game) newPlayer(name string, loop bool) (*audio.Player, error) {
	// Create new stream.
	stream, err := g.newStream(name)
	if err != nil {
		return nil, err
	}

	// Loop stream when enabled.
	var s io.ReadSeeker
	if loop {
		s = audio.NewInfiniteLoop(stream, stream.Length())
	} else {
		s = stream
	}

	// Create new player.
	player, err := g.audioContext.NewPlayer(s)
	if err != nil {
		return nil, err
	}

	// ...
	return player, nil
}
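
To build intuition for what an infinitely looping stream does, here is a simplified sketch of a reader that seeks back to the start whenever its source runs out. This is not Ebitengine's implementation; the names loopReader and readLooped are assumptions made for this example.

```go
package main

import (
	"io"
	"strings"
)

// loopReader restarts its source from the beginning whenever it reaches the
// end. Simplified illustration only.
type loopReader struct {
	src io.ReadSeeker
}

// Read reads from the source, seeking back to the start on io.EOF.
func (l *loopReader) Read(p []byte) (int, error) {
	n, err := l.src.Read(p)
	if err != io.EOF {
		return n, err
	}
	// End of source: rewind so that reading continues from the beginning.
	if _, err := l.src.Seek(0, io.SeekStart); err != nil {
		return n, err
	}
	if n == 0 && len(p) > 0 {
		// Nothing was read this time, so read again from the start.
		// An empty source returns io.EOF here instead of looping forever.
		return l.src.Read(p)
	}
	return n, nil
}

// readLooped reads n bytes from a looping view of s.
func readLooped(s string, n int) (string, error) {
	buf := make([]byte, n)
	_, err := io.ReadFull(&loopReader{src: strings.NewReader(s)}, buf)
	return string(buf), err
}
```

Reading ten bytes from a looping view of "abcd" yields "abcdabcdab": the source never appears to end, which is why a looping player keeps playing until it is paused or closed.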

Reducing Audio Player Latency #

To reduce audio latency we will design our game to do the following:

  • Perform as much work upfront as possible
  • Run using as few resources as possible

To perform as much work upfront as possible, we will store direct references to the byte slice of each sound file:

// loadAudio loads all audio assets and starts goroutines which generate sound
// players on-demand.
func (g *game) loadAudio() {
	// Create 44100Hz audio context.
	g.audioContext = audio.NewContext(sampleRate)

	// Load audio assets.
	g.sounds = make(map[string][]byte)
	// io/fs paths always use forward slashes, regardless of operating system.
	err := fs.WalkDir(assetFS, "asset/audio", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		ext := filepath.Ext(d.Name())
		if ext != ".ogg" {
			return nil
		}
		buf, err := assetFS.ReadFile(path)
		if err != nil {
			return err
		}
		name := strings.TrimSuffix(filepath.Base(d.Name()), ext)
		g.sounds[name] = buf
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}

	// ...
}
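
The asset-loading logic can be tried without any real files by walking an in-memory file system. The sketch below is a standalone variation on the walk above; collectOGG, demoFS and the file contents are hypothetical names and data made up for this example.

```go
package main

import (
	"io/fs"
	"path"
	"strings"
	"testing/fstest"
)

// collectOGG walks dir in fsys and returns the contents of every .ogg file,
// keyed by file name without the extension.
func collectOGG(fsys fs.FS, dir string) (map[string][]byte, error) {
	sounds := make(map[string][]byte)
	err := fs.WalkDir(fsys, dir, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || path.Ext(p) != ".ogg" {
			return nil
		}
		buf, err := fs.ReadFile(fsys, p)
		if err != nil {
			return err
		}
		sounds[strings.TrimSuffix(path.Base(p), ".ogg")] = buf
		return nil
	})
	return sounds, err
}

// demoFS returns a made-up in-memory asset tree for trying out collectOGG.
func demoFS() fs.FS {
	return fstest.MapFS{
		"asset/audio/flap.ogg":  {Data: []byte("flap-data")},
		"asset/audio/music.ogg": {Data: []byte("music-data")},
		"asset/audio/notes.txt": {Data: []byte("ignored")},
	}
}
```

Calling collectOGG(demoFS(), "asset/audio") returns a map with the keys "flap" and "music", and the text file is skipped.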

Instead of creating sound players on the fly, we will generate multiple ready-to-use players in the background:

// generatePlayers generates ready-to-use players for the specified sound.
func (g *game) generatePlayers(name string, loop bool, players chan *audio.Player) {
	for {
		player, err := g.newPlayer(name, loop)
		if err != nil {
			log.Fatal(err)
		}
		players <- player
	}
}

// loadAudio loads all audio assets and starts goroutines which generate sound
// players on-demand.
func (g *game) loadAudio() {
	// ...

	// Create a player buffer for each audio asset holding ready-to-use players.
	const oneShotBufferSize = 2 // Generate two ready-to-use players for all non-looping sound effects.
	g.players = make(map[string]chan *audio.Player)
	for name := range g.sounds {
		loop := name == "music"
		bufferSize := oneShotBufferSize
		if loop {
			bufferSize = 0 // Reduce player buffer size for looping sounds, because we will only play a single looping sound.
		}
		g.players[name] = make(chan *audio.Player, bufferSize)
		go g.generatePlayers(name, loop, g.players[name])
	}
}
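
The underlying pattern is a producer goroutine that keeps a buffered channel topped up. Here is a generic sketch of that pattern on its own, independent of audio. The names prefill and counter are assumptions made for this example, and generics require Go 1.18 or later.

```go
package main

// prefill starts a goroutine that keeps a buffered channel topped up with
// values produced by create, and returns the receiving end of the channel.
// The goroutine runs for the lifetime of the program.
func prefill[T any](size int, create func() T) <-chan T {
	ch := make(chan T, size)
	go func() {
		for {
			ch <- create() // Blocks while the buffer is full.
		}
	}()
	return ch
}

// counter returns a function that yields 1, 2, 3, ... on successive calls.
func counter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}
```

Each receive from the channel frees a slot, which unblocks the producer so that it can prepare the next value in the background. With a size of zero the producer still prepares one value ahead and then waits for a receiver.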

To further reduce latency, we also adjust the audio buffer size of non-looping players:

// newPlayer returns a new player for the specified sound.
func (g *game) newPlayer(name string, loop bool) (*audio.Player, error) {
	// ...

	// Create new player.
	player, err := g.audioContext.NewPlayer(s)
	if err != nil {
		return nil, err
	}

	// Set buffer size. Non-looping sounds use a small buffer for instant playback.
	// Looping sounds use a large buffer for better performance.
	if !loop {
		player.SetBufferSize(50 * time.Millisecond)
	} else {
		player.SetBufferSize(500 * time.Millisecond)
	}
	return player, nil
}
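
To get a feel for what these durations mean in terms of data, the sketch below converts a buffer duration into a byte count. It assumes 16-bit stereo audio (4 bytes per frame); the helper name bufferBytes is hypothetical.

```go
package main

import "time"

// bufferBytes returns the number of bytes of PCM data covering duration d.
// Hypothetical helper for illustration only.
func bufferBytes(d time.Duration, sampleRate, bytesPerFrame int) int {
	frames := int(d * time.Duration(sampleRate) / time.Second)
	return frames * bytesPerFrame
}
```

At 44100Hz a 50 millisecond buffer covers 2205 frames (8820 bytes), while a 500 millisecond buffer covers 22050 frames (88200 bytes). A smaller buffer means less audio is queued ahead of the sound we want to hear, at the cost of more frequent refills.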

Using as few resources as possible will require optimizing the performance of our code and the size of our assets.

We are just getting started, so we don’t need to worry about performance too much for now.

However, the habits we build now will carry over as we build increasingly large and complex games.

It is therefore important to seriously examine and understand the size and performance of our games sooner rather than later.

Playing Sounds #

We define a playSound method which retrieves a ready-to-use player and starts playing it:

// playSound plays a sound and returns the sound player.
func (g *game) playSound(name string, volume float64) *audio.Player {
	// Obtain a pre-loaded player from the associated Players channel.
	player := <-g.players[name]

	// Set volume.
	player.SetVolume(volume)

	// Start playing sound.
	player.Play()
	return player
}
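
Note that receiving from the channel blocks until a player is ready. One possible variation, which is not part of the tutorial code, is to receive without blocking and skip the sound when no player is available. The helper name tryTake is an assumption made for this sketch.

```go
package main

// tryTake receives a value from ch without blocking. It reports false when
// no value is currently available.
// Hypothetical helper for illustration only.
func tryTake[T any](ch <-chan T) (T, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		var zero T
		return zero, false
	}
}
```

Whether dropping a sound is preferable to waiting briefly depends on the game, so treat this as a design option rather than a recommendation.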

In initialize we call playSound to start playing the background music:

// initialize sets up the initial state of the game.
func (g *game) initialize() {
	// Load sound files and initialize players.
	g.loadAudio()

	// ...

	// Start playing background music.
	g.playSound("music", 0.25)

	// ...
}

And in Update we call playSound when the player presses jump/flap to play the “flap” sound effect:

// Update is where we update the game state.
func (g *game) Update() error {
	// ...

	// Handle jump.
	if inpututil.IsKeyJustPressed(ebiten.KeySpace) {
		g.playSound("flap", 1)
		g.playerVelocity = -0.8
	}

	// ...
}

Stay tuned for the next tutorial, Handling Mouse Input Using Ebitengine.

Please consider donating if you found this tutorial helpful.