XNA Game Studio 4.0 : Dynamic Sound Effects - Recording Audio with a Microphone, Generating Dynamic Sound Effects

7/24/2012 5:48:50 PM

Not all sounds have to come from a prerecorded file. New to XNA Game Studio 4.0 is the ability to record microphone data using the new Microphone class and to play back dynamic sound effect data using the new DynamicSoundEffectInstance class.

Recording Audio with a Microphone

The new Microphone type is supported on all of the XNA Game Studio 4.0 platforms. On Windows, any recording device recognized by Windows can be used, including the Xbox 360 headset. Make sure the microphone is set up in Windows before trying to record using the XNA APIs. Xbox 360 supports the headsets along with the Xbox 360 wireless microphone. Windows Phone 7 supports reading the built-in microphone as well as Bluetooth headsets.

Enumerating Microphones

To enumerate all of the recording devices available to your game, you can use the Microphone.All property. This returns a ReadOnlyCollection of Microphone objects. Use the Microphone.Name property to determine the friendly name of the recording device. For example, the Xbox 360 wired headset connected to the Xbox 360 wired controller on a Windows 7 machine returns Headset Microphone (Headset (XBOX 360 For Windows)).

You can use the Microphone.IsHeadset property to determine whether a recording device is a wired headset or a Bluetooth hands-free headset.

Most of the time, you will not want to enumerate all of the recording devices and will just want to use the default Microphone that the user or platform has selected. You can use the Microphone.Default property to easily select the preferred default recording device for the particular platform.

Microphone microphone = Microphone.Default;

The Microphone has a specific state, which is returned by the Microphone.State property as a MicrophoneState enumeration value of Started or Stopped. A Microphone begins in the Stopped state. To start the recording process, call the Microphone.Start method. After the Microphone starts recording, it gathers buffered data that can then be used for playback. Use the Microphone.Stop method to stop gathering recorded data.

Reading Microphone Data

After the Microphone starts, the internal buffer starts to fill up with the recorded values received from the recording device. The internal buffer size of the Microphone is exposed by the Microphone.BufferDuration property. This property can be used to read and set the buffer duration to use for recording. The value is specified as a TimeSpan and must be between 100 and 1000 milliseconds.

Note

The Microphone.GetSampleDuration method can be used to determine the TimeSpan a given sample buffer size supports.

The game needs a byte array buffer to store the data returned from the Microphone. To determine the size of the buffer, the Microphone.GetSampleSizeInBytes method calculates the required buffer size given a TimeSpan. The buffer should be large enough to hold all of the data the Microphone returns. If you use the Microphone.BufferDuration property as the parameter passed to GetSampleSizeInBytes, the size returned is equal to the internal buffer size:

int bufferSize = microphone.GetSampleSizeInBytes(microphone.BufferDuration);
byte[] buffer = new byte[bufferSize];
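
The arithmetic that relates a duration to a byte count can be sketched in plain C#. This is a minimal sketch, not the XNA implementation: it assumes 16-bit PCM samples (two bytes per sample per channel), and PcmSizing and its method names are hypothetical helpers introduced only for illustration. The real GetSampleSizeInBytes and GetSampleDuration methods may apply additional alignment rules.

```csharp
using System;

// Hypothetical helper, not part of XNA: mirrors the arithmetic that relates
// a duration to a byte count for 16-bit PCM audio.
public static class PcmSizing
{
    // Assumption: each sample is a 16-bit (2-byte) value per channel.
    public const int BytesPerSample = 2;

    // Number of bytes needed to hold 'duration' of audio.
    public static int SampleSizeInBytes(TimeSpan duration, int sampleRate, int channels)
    {
        // Integer tick math avoids floating-point rounding surprises.
        long frames = duration.Ticks * sampleRate / TimeSpan.TicksPerSecond;
        return checked((int)(frames * channels * BytesPerSample));
    }

    // Duration of audio that fits in 'sizeInBytes' bytes.
    public static TimeSpan SampleDuration(int sizeInBytes, int sampleRate, int channels)
    {
        long frames = sizeInBytes / (channels * BytesPerSample);
        return TimeSpan.FromTicks(frames * TimeSpan.TicksPerSecond / sampleRate);
    }
}
```

Under these assumptions, 100 milliseconds of mono audio at 44,100 Hz needs 4,410 samples of two bytes each, or 8,820 bytes.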

Now you have a buffer to receive the Microphone data. While the Microphone is recording, an internal buffer continually updates with the latest data from the physical recording device. Access this internal buffer by using the Microphone.GetData method, which returns the data captured since the last GetData call. You can access the buffer in two different ways. The first is to read the current buffer each frame, which returns a varying amount of data depending on how long the last frame took:

int returnedData = microphone.GetData(buffer);

In this case, the GetData method returns the number of bytes it wrote to the buffer.

Note

A GetData overload enables you to specify an offset and size to use when filling the provided buffer.

The other way to use the GetData method is to only request the data after the internal buffer is full. The Microphone.BufferReady event is raised after the internal buffer is full and a full read can occur. In the BufferReady event handler, you can call GetData to access the full internal buffer, which is the same size as your created buffer if you used the BufferDuration as the parameter to the GetSampleSizeInBytes method:

microphone.BufferReady += new EventHandler<EventArgs>(OnBufferReady);
public void OnBufferReady(object sender, EventArgs args)
{
    // Read entire buffer
    microphone.GetData(buffer);

    // Process data
}
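
What the "Process data" step does is up to the game. As one possibility, the sketch below measures how loud the captured audio is. It assumes the bytes are signed 16-bit little-endian PCM samples; MicLevel and Peak are hypothetical names introduced for illustration and are not part of XNA.

```csharp
using System;

// Hypothetical helper, not part of XNA.
public static class MicLevel
{
    // Returns the peak absolute amplitude, in the range 0..1, of the first
    // byteCount bytes of buffer, read as little-endian 16-bit samples.
    public static double Peak(byte[] buffer, int byteCount)
    {
        int peak = 0;
        for (int i = 0; i + 1 < byteCount; i += 2)
        {
            // Rebuild the 16-bit sample from its low and high bytes.
            short sample = (short)(buffer[i] | (buffer[i + 1] << 8));
            int magnitude = Math.Abs((int)sample);
            if (magnitude > peak)
            {
                peak = magnitude;
            }
        }
        return peak / 32768.0;
    }
}
```

Inside the handler, the value returned by GetData could be passed as byteCount, for example MicLevel.Peak(buffer, microphone.GetData(buffer)), so that only freshly written bytes are examined.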

Playback Using DynamicSoundEffectInstance

Now that you know how to select a microphone, start recording, and read the recorded buffer data, let’s play back the audio as it is recorded from the Microphone.

To play the recorded buffer data, use the new DynamicSoundEffectInstance class. DynamicSoundEffectInstance is derived from SoundEffectInstance. Like SoundEffectInstance, it allows for basic playback controls such as Play, Pause, Stop, and Resume. It also has properties for the Pitch, Pan, and Volume.

Add the following member variables to your game class to hold the Microphone, the buffer for the recorded data, and the DynamicSoundEffectInstance that plays the recorded data:

Microphone microphone;
byte[] buffer;
DynamicSoundEffectInstance dynamicSoundEffectInstance;

In the game’s Initialize method, add the following lines of code:

// Request the default recording device
microphone = Microphone.Default;
// Calculate the size of the recording buffer
int bufferSize = microphone.GetSampleSizeInBytes(microphone.BufferDuration);
buffer = new byte[bufferSize];
// Subscribe to ready event
microphone.BufferReady += new EventHandler<EventArgs>(OnBufferReady);
// Start microphone recording
microphone.Start();

// Create new dynamic sound effect to playback microphone data
dynamicSoundEffectInstance =
    new DynamicSoundEffectInstance(microphone.SampleRate,
                                    AudioChannels.Mono);
// Start playing
dynamicSoundEffectInstance.Play();

The DynamicSoundEffectInstance constructor takes two parameters. The first, sampleRate, is the sample rate in hertz of the audio content that is passed to the DynamicSoundEffectInstance. The second takes an AudioChannels enumeration value of Mono or Stereo. Because the microphone records in mono, use the Mono enumeration value.

After you create the DynamicSoundEffectInstance, play the sound. The sound continues to play as you keep passing new buffer data to the instance.

Finally, you need to implement the OnBufferReady method that you passed to the Microphone.BufferReady event:

public void OnBufferReady(object sender, EventArgs args)
{
    // Read entire buffer
    microphone.GetData(buffer);
    // Send latest buffer to the dynamic sound effect
    dynamicSoundEffectInstance.SubmitBuffer(buffer);
}

This event handler is called whenever the internal Microphone buffer is full and ready to be read. The buffer is then read using the Microphone.GetData method, which fills the buffer you created. The filled buffer then needs to be passed to the DynamicSoundEffectInstance.SubmitBuffer method so playback continues.

Note

If you change the Pitch of the DynamicSoundEffectInstance while using the microphone as the source, the recording rate differs from the playback rate. Playback falls further and further behind if you lower the Pitch, or it stops and waits for more data, causing a pop sound if you raise the Pitch. Either way, it makes your voice sound cool—so try it out!
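
The size of that mismatch can be estimated with a little arithmetic. The sketch below assumes that Pitch is expressed in octaves, with -1 meaning one octave down and 1 meaning one octave up, so that playback speed scales by 2 raised to the Pitch value. PitchDrift and its methods are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper, not part of XNA.
public static class PitchDrift
{
    // Playback speed relative to normal, assuming pitch is in octaves.
    public static double PlaybackRate(float pitch)
    {
        return Math.Pow(2.0, pitch);
    }

    // Seconds of recorded audio left unplayed for each second of recording.
    // Positive: playback falls behind. Negative: playback runs out of data.
    public static double BacklogGrowthPerSecond(float pitch)
    {
        return 1.0 - PlaybackRate(pitch);
    }
}
```

Under this assumption, a Pitch of -1 plays at half speed, so half a second of audio piles up for every second recorded, while a Pitch of 1 tries to consume two seconds of audio per second and starves.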

Generating Dynamic Sound Effects

Along with using the Microphone to generate dynamic data to play back, you can also create the buffer dynamically. This means your game can create new and interactive sounds that are driven by user input or interesting algorithms. Because creating dynamic audio data could be a book by itself, we cover just the basics of how to generate some simple data. Feel free to experiment and research how to create new and interesting dynamic sound effects.

What we perceive as sound is actually changes in air pressure that cause the eardrum and the small bones of the middle ear to vibrate. This vibration is what we call sound. To generate sound, we need to create this change in air pressure. Most electronics use speakers to generate the change. Speakers convert an electrical signal into sound by vibrating a diaphragm, which in turn vibrates the air around us.

The data that you generate also needs to cause a vibration. There are many different ways to generate vibration waves, but one of the easiest is using the sine wave.

Now let’s generate some tones. Add the following member variables to your game:

const int SampleRate = 48000;
DynamicSoundEffectInstance dynamicSoundEffectInstance;
byte[] buffer;
int bufferSize;

// Frequency to generate
double frequency = 200;
// Counter to mark where we are in a wave
int totalTime = 0;

The SampleRate is the number of samples that play in a second. The frequency is the number of times the speaker vibrates per second. The higher the frequency, the higher the pitch of the tone. Use the totalTime value to track where you currently are in the sine wave. The sine wave oscillates between -1 and 1 over the course of a period of 2π.

Next, in your game's Initialize method, add the following lines of code:

// Create new dynamic sound effect and start playback
dynamicSoundEffectInstance =
    new DynamicSoundEffectInstance(SampleRate, AudioChannels.Mono);
dynamicSoundEffectInstance.BufferNeeded +=
    new EventHandler<EventArgs>(OnBufferNeeded);
dynamicSoundEffectInstance.Play();

// Calculate the buffer size to hold 1 second
bufferSize =
    dynamicSoundEffectInstance.GetSampleSizeInBytes(TimeSpan.FromSeconds(1));
buffer = new byte[bufferSize];

The previous code creates a new DynamicSoundEffectInstance with the previously defined SampleRate and a single Mono audio channel. The BufferNeeded event signals when the DynamicSoundEffectInstance needs more data to play back. Finally, the instance is started by using the Play method.

Note

Stereo requires twice the amount of data to feed the left and right channels.
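
To make the doubling concrete, here is a minimal sketch of packing two channels into one byte buffer. It assumes the usual layout for two-channel 16-bit PCM, with left and right samples interleaved (L, R, L, R, and so on) in little-endian byte order. StereoPacking and Interleave are hypothetical names, not part of XNA.

```csharp
using System;

// Hypothetical helper, not part of XNA.
public static class StereoPacking
{
    // Interleaves two mono 16-bit sample arrays as L, R, L, R, ...
    // Each sample frame takes 4 bytes: 2 for the left, 2 for the right.
    public static byte[] Interleave(short[] left, short[] right)
    {
        if (left.Length != right.Length)
        {
            throw new ArgumentException("Channels must have the same length.");
        }

        byte[] output = new byte[left.Length * 4];
        for (int i = 0; i < left.Length; i++)
        {
            int o = i * 4;
            output[o] = (byte)(left[i] & 0xFF);             // left, low byte
            output[o + 1] = (byte)((left[i] >> 8) & 0xFF);  // left, high byte
            output[o + 2] = (byte)(right[i] & 0xFF);        // right, low byte
            output[o + 3] = (byte)((right[i] >> 8) & 0xFF); // right, high byte
        }
        return output;
    }
}
```

One second of stereo audio at 48,000 Hz therefore occupies 192,000 bytes, compared with 96,000 bytes for mono.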

Calculate the bufferSize by using the GetSampleSizeInBytes method to determine how much data is needed for a single second.

The final section of code is what generates the tone buffer to play. Add the following method to your game:

// Generate sound when needed
void OnBufferNeeded(object sender, EventArgs e)
{
    // Loop over the entire buffer, two bytes (one 16-bit sample) at a time
    for (int i = 0; i < bufferSize - 1; i += 2)
    {
        // Calculate where we are in the wave, in seconds
        double time = (double)totalTime / (double)SampleRate;
        // Generate the tone using a sine wave
        short currentSample = (short)(Math.Sin(2 * Math.PI * frequency * time) *
                                     (double)short.MaxValue);

        // Store the generated short value in the byte array, low byte first
        buffer[i] = (byte)(currentSample & 0xFF);
        buffer[i + 1] = (byte)(currentSample >> 8);

        // Advance the sample counter by one sample
        totalTime++;
    }

    // Submit the buffer for playback
    dynamicSoundEffectInstance.SubmitBuffer(buffer);
}

To generate the tone, loop over the entire buffer. Although the buffer is in bytes, the data for each channel of the DynamicSoundEffectInstance is 16 bits, or a short. Because you use Mono, each iteration of the loop covers two of the bytes in the buffer.

In each iteration of the loop, the short value for the sine wave is calculated. This calculation takes into account both the current time and the frequency. The time is the sample counter divided by the number of samples per second, which gives a value in seconds, so the frequency value is in hertz. The value returned by the Math.Sin method is in the range of -1 to 1. Because you want the value to span the minimum and maximum values for a short, multiply it by short.MaxValue.

The generated short value then needs to be broken into its low-order and high-order bytes so it can be added to the buffer array. The totalTime counter is then incremented by one, because each iteration produces one sample even though it moves through the array by two bytes. Incrementing the counter by two per sample would double the frequency of the tone.

The final step is to call SubmitBuffer on the DynamicSoundEffectInstance to supply it with the latest buffer. If you run the sample code, you hear a nice solid tone.

Although the sine wave is a simple example of how to generate dynamic sound effects, it is the building block for many more complex effects. Spend some time trying to generate other types of sound waves and even mix them together.
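
As a starting point for that experimentation, the following sketch adds a square wave and a simple two-wave mix. It is an illustration in plain C# under the same 16-bit mono little-endian assumptions as the tone example. Waveforms and its methods are hypothetical names, and averaging is only one of several ways to mix signals.

```csharp
using System;

// Hypothetical helper, not part of XNA.
public static class Waveforms
{
    // Sine value in -1..1 for sample index n.
    public static double Sine(long n, double frequency, int sampleRate)
    {
        return Math.Sin(2 * Math.PI * frequency * n / sampleRate);
    }

    // Square wave: +1 for the first half of each cycle, -1 for the second.
    public static double Square(long n, double frequency, int sampleRate)
    {
        double phase = (frequency * n / sampleRate) % 1.0;
        return phase < 0.5 ? 1.0 : -1.0;
    }

    // Averages two values so the mix stays within -1..1, then scales to short.
    public static short Mix(double a, double b)
    {
        return (short)((a + b) / 2.0 * short.MaxValue);
    }

    // Fills buffer with a sine and square mix, starting at sample startSample.
    public static void Fill(byte[] buffer, long startSample,
                            double sineHz, double squareHz, int sampleRate)
    {
        for (int i = 0; i + 1 < buffer.Length; i += 2)
        {
            long n = startSample + i / 2;
            short sample = Mix(Sine(n, sineHz, sampleRate),
                               Square(n, squareHz, sampleRate));
            buffer[i] = (byte)(sample & 0xFF);            // low byte
            buffer[i + 1] = (byte)((sample >> 8) & 0xFF); // high byte
        }
    }
}
```

A BufferNeeded handler could call Fill with a running sample count in place of the sine-only loop and then submit the buffer as before.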
