Raytraced Audio System

To improve audio realism, raytracing is used to determine the size of the room around the listener, as well as how occluded each sound is from the listener's position. Sounds on the other side of a wall should sound muffled, gunfire in direct line of sight should be clear, and sounds in a long tunnel should echo.

High Level Overview

Hundreds of rays are sent outwards from the listener, bouncing off up to eight surfaces while checking for line of sight with each active sound source.

A low pass filter is used to muffle sound sources that only a few rays can reach, while sounds that are easily found remain clear. This is referred to as occlusion.

The number of blocks between the sound source and each ray is also used to determine the strength of the low pass filter. This is referred to as permeation: a thin wall will slightly muffle sounds, while six layers of concrete will muffle them strongly.

The total distance that each ray travels after bouncing eight times is used to determine the size of the room, which affects the strength of reverb effects.

Each time a ray bounces off a surface the game checks if there is open sky directly above, or if the surface is underwater. This determines the direction that ambient wind and flowing water sounds come from.

Note that this system does not require an RTX graphics card. A multi-core CPU is all you need!

Muffling / Occlusion

When an enemy fires a gun, rays are cast outwards from the listener to determine how occluded the enemy is from the listener. This happens before the sound starts playing, so the correct low pass filter can be applied from the very beginning.

The scenario below is a top-down view of an indoor room, which has:

  • A listener (blue)
  • Three enemies (red) that are making noise, e.g. shooting a gun, walking around or talking in proximity voice chat
  • Walls (light grey)

Enemy 1 should be clearly audible, enemy 2 should be slightly muffled, and enemy 3 should be moderately muffled.

A single line-of-sight test would muffle enemies 2 and 3 equally, which is not accurate in this scenario. Instead, 484 rays are cast outwards in different directions to determine a more accurate occlusion value.
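
The article does not say how the 484 directions are chosen. As one possible sketch, the snippet below spreads them roughly evenly over a sphere using a Fibonacci spiral. The function name `ray_directions` and the spiral itself are assumptions made for illustration, not the game's actual method.

```python
import math

def ray_directions(count=484):
    """Return `count` unit vectors spread roughly evenly over a sphere.

    Assumption: a Fibonacci spiral is used here only as an example;
    the original system may distribute its rays differently.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for i in range(count):
        # y runs from just below +1 down to just above -1
        y = 1.0 - 2.0 * (i + 0.5) / count
        radius = math.sqrt(1.0 - y * y)   # radius of the circle at height y
        theta = golden_angle * i          # rotate a fixed angle each step
        directions.append((radius * math.cos(theta), y, radius * math.sin(theta)))
    return directions

rays = ray_directions()
print(len(rays), rays[0])
```

Any roughly uniform distribution would serve the same purpose: the more evenly the rays cover the surroundings, the less the occlusion value depends on which way the listener happens to face.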

The image below shows the full path of one ray (white) as it bounces three times, each time checking for line of sight (LOS) with the third enemy. The red lines indicate failed LOS checks, and the green line indicates a successful LOS check.

Note that it does not check LOS on the 3rd bounce (indicated by the dotted yellow line) since the surface of the wall is facing away from the enemy, which guarantees that a LOS check will fail.
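
The 'facing away' test described above can be sketched with a dot product between the surface normal and the direction from the bounce point to the sound source. The function name `should_check_los` and the tuple-based vectors are hypothetical and only illustrate the idea.

```python
def should_check_los(hit_point, surface_normal, source_pos):
    """Return True if the surface at hit_point faces the sound source.

    Assumption: surface_normal points away from the solid wall, into
    the open space that the ray is travelling through.
    """
    to_source = tuple(s - h for s, h in zip(source_pos, hit_point))
    facing = sum(n * d for n, d in zip(surface_normal, to_source))
    # A non-positive dot product means the source is behind the surface,
    # so a line-of-sight ray would immediately hit the wall itself.
    return facing > 0.0

# A wall at x = 5 whose surface faces the -x direction.
wall_hit = (5.0, 0.0, 0.0)
wall_normal = (-1.0, 0.0, 0.0)
print(should_check_los(wall_hit, wall_normal, (0.0, 2.0, 0.0)))  # source in front of the wall
print(should_check_los(wall_hit, wall_normal, (9.0, 2.0, 0.0)))  # source behind the wall
```

Skipping the check in the second case saves one raycast per bounce without changing the result.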

Calculating Muffling Strength

It is rare that all rays will achieve LOS with a sound source, so the occlusion value must scale exponentially with the total amount of ray 'energy' that reaches the sound source. Rays lose energy each time they bounce off a surface:

energy = sqrt(1.0 / max(1, bounceCount))
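
As a quick check of how fast energy falls off, the formula above can be written as a small function. The name `ray_energy` is chosen here for illustration.

```python
import math

def ray_energy(bounce_count):
    """Energy left in a ray after bounce_count bounces (formula above)."""
    return math.sqrt(1.0 / max(1, bounce_count))

for bounces in (0, 1, 2, 4, 8):
    print(bounces, round(ray_energy(bounces), 3))
```

Because of the max(1, ...) term, a ray keeps full energy for both zero and one bounce. After two, four and eight bounces it has roughly 0.71, 0.5 and 0.35 of its initial energy.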

The percentage of energy that reaches the sound source is transformed into an occlusion value using this formula:

occlusion = 2.718 ^ (40 * (1 - energy) - 40)

This graph shows that a sound source only needs to receive 20% of the initial energy of all rays to sound clear. Below 20%, as less energy reaches the sound source, the sound becomes more muffled.
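
To make the shape of that curve concrete, the sketch below evaluates the occlusion formula at a few energy fractions. The function name `occlusion` is illustrative, and `math.exp` is used in place of the rounded constant 2.718, so results differ very slightly from the formula as written.

```python
import math

def occlusion(energy_fraction):
    """Occlusion for a given fraction (0..1) of ray energy reaching the source.

    The exponent 40 * (1 - energy) - 40 simplifies to -40 * energy.
    """
    return math.exp(40.0 * (1.0 - energy_fraction) - 40.0)

for fraction in (0.0, 0.05, 0.1, 0.2, 1.0):
    print(fraction, occlusion(fraction))
```

With no energy reaching the source the value is 1 (fully muffled). It drops to about 0.135 at 5%, about 0.018 at 10% and about 0.0003 at 20%, which matches the observation that 20% is enough for a sound to be clear.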

This occlusion value is stored as a rolling average and is continually updated on another thread as the environment around the player changes. This means a long explosion sound will become clear when it destroys a wall that was previously blocking line of sight with the listener.
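
The article does not specify which kind of rolling average is used. One simple option is an exponential moving average, sketched below. The class name `RollingAverage` and the `smoothing` value of 0.1 are assumptions, and the background thread is left out to keep the example short.

```python
class RollingAverage:
    """Exponential moving average (an assumed stand-in for the real one)."""

    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing  # hypothetical tuning value
        self.value = None

    def update(self, sample):
        if self.value is None:
            self.value = sample     # first sample seeds the average
        else:
            self.value += self.smoothing * (sample - self.value)
        return self.value

# A wall blocks the explosion (occlusion 1.0), then is destroyed (occlusion 0.0).
average = RollingAverage()
for sample in [1.0] * 3 + [0.0] * 50:
    average.update(sample)
print(average.value)
```

The smoothed value drifts towards zero over successive updates rather than snapping, so the low pass filter opens up gradually once the wall is gone.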

Reverb

The size of the room around the player affects the strength of reverb, which applies to all sounds. As the room gets smaller, the echo becomes louder and lasts longer.

To determine the room size, the distance between the position of each ray's 8th bounce and the listener is calculated, then raised to the power of 4, then stored in a rolling average. The average of these fourth-power distances is then divided by 6,000,000 (the maximum room size) to give a value between 0 and 1, where 0 means maximum echo.
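
The calculation above can be sketched as follows. The function name `room_size_factor` is hypothetical, a plain mean stands in for the rolling average, and clamping the result to 1 is an assumption made so the value stays in the stated range.

```python
MAX_ROOM_SIZE = 6_000_000.0  # from the article: the maximum room size

def room_size_factor(eighth_bounce_distances):
    """Map distances from the listener to each ray's 8th bounce onto 0..1.

    0 means the smallest room (maximum echo), 1 means the largest.
    """
    powered = [d ** 4 for d in eighth_bounce_distances]
    average = sum(powered) / len(powered)     # plain mean, not a rolling one
    return min(1.0, average / MAX_ROOM_SIZE)  # clamp is an assumption

print(room_size_factor([5.0] * 484))   # small room
print(room_size_factor([50.0] * 484))  # large hall
```

Since the fourth root of 6,000,000 is about 49.5, any room where the rays end up around 50 units from the listener already counts as the maximum size.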

The distance is raised to the power of 4 to prevent false 'small room' positives, e.g. a small room with a doorway shouldn't have strong reverb. If a few rays leave through the doorway and travel a large distance, they will increase the calculated room size, which reduces the final reverb strength.
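
A short worked example shows how much difference the fourth power makes. The ray counts and distances below are made up for illustration: 480 rays stay in a small room about 5 units from the listener, and 4 rays escape through a doorway to 60 units.

```python
sealed_room = [5.0] * 484
room_with_doorway = [5.0] * 480 + [60.0] * 4  # hypothetical escape distances

def mean(values):
    return sum(values) / len(values)

# Averaging raw distances: the escaped rays barely register.
linear_ratio = mean(room_with_doorway) / mean(sealed_room)

# Averaging fourth powers: the escaped rays dominate the result.
sealed_p4 = mean([d ** 4 for d in sealed_room])
doorway_p4 = mean([d ** 4 for d in room_with_doorway])
power4_ratio = doorway_p4 / sealed_p4

print(linear_ratio, power4_ratio)
```

With raw distances the doorway makes the room look only about 9% larger. With fourth powers the same four rays make it look more than 100 times larger, which is enough to noticeably weaken the reverb.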

Directional Ambience

Each time a ray bounces off a surface, the game checks if the surface is 'outside' by casting another ray upwards. If it reaches the skybox (the top of the map), the player should hear ambient wind from the direction that the ray was initially fired.
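
A minimal sketch of the upward check is shown below. It assumes a block-based world where solid blocks are stored as a set of integer (x, y, z) coordinates with y pointing up. The name `is_open_to_sky` and this data layout are illustrative, not taken from the game.

```python
def is_open_to_sky(position, solid_blocks, map_top):
    """Return True if nothing solid lies directly above position.

    Assumptions: integer block coordinates, y is up, and map_top is the
    height of the skybox.
    """
    x, y, z = position
    for height in range(y + 1, map_top + 1):
        if (x, height, z) in solid_blocks:
            return False  # blocked by a roof or overhang
    return True

roof = {(0, 3, 0)}  # a single roof block above the origin
print(is_open_to_sky((0, 0, 0), roof, map_top=10))  # under the roof
print(is_open_to_sky((1, 0, 0), roof, map_top=10))  # next to the house
```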

The more times a ray bounces and the further it travels before reaching the sky, the smaller an impact it should have on the final direction of ambient sounds.

The diagram below shows a side-view of a player inside a house. The player has direct line-of-sight to the ground on the left (green), but the ground on the right requires one bounce to reach (yellow):

Although two rays on either side reached the skybox, the player should hear ambient sounds slightly louder on the left because the green rays travelled a shorter distance and didn't bounce.
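
One way to express this weighting is to sum each ray's initial direction, scaled down by its bounce count and travel distance. The article does not give the weight formula, so the one below is an assumption, as are the function name `ambient_direction` and the example distances.

```python
def ambient_direction(sky_hits):
    """Sum the initial ray directions, weighted by how directly each reached the sky.

    sky_hits: list of (initial_direction, bounce_count, distance) tuples.
    The weight formula is hypothetical; it only needs to shrink as
    bounces and distance grow.
    """
    total = [0.0, 0.0, 0.0]
    for direction, bounces, distance in sky_hits:
        weight = 1.0 / ((1 + bounces) * (1.0 + distance))
        for axis in range(3):
            total[axis] += weight * direction[axis]
    return tuple(total)

hits = [
    ((-1.0, 0.0, 0.0), 0, 4.0),  # left: direct line of sight, short path
    ((1.0, 0.0, 0.0), 1, 7.0),   # right: one bounce, longer path
]
print(ambient_direction(hits))
```

The x component of the result is negative, so the combined direction points left, matching the scenario above where the wind is louder on that side.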

In the example below, the rays only achieve line of sight with the skybox on the left, meaning the player will hear wind from that direction only.

Next Steps

The next articles in this series will cover:

  • loading sound files from disk in background threads
  • linking OpenAL sound sources with moving game objects
  • directional ambience - calculating rolling averages of weighted vectors
  • multithreading - ensuring all heavy processing occurs on background threads
  • the raycasting function