Friday 17 October 2008

Sample to Sample Accuracy

In audio terms a sample refers to a single element of an audio signal: an amplitude value measured at one instant, with the sample rate determining how long each value represents. This Harmony Central article explains the need for sample to sample accuracy in great detail.

To summarise it quickly I have put together a simple example.


The image above shows two audio signals, one on top in green and one on the bottom in yellow. Let's say these signals play together as a section of a song, the top being the drums and the bottom being the bass, continuously looping. It is obvious from the image that the bottom signal is shorter than the top, but if you look at the timeline it is actually only shorter by five hundredths of a millisecond - inaudible!

While not noticeable on its own, after these signals have looped a hundred times that discrepancy (shown in red) will have grown to five milliseconds. With every loop the two signals drift further and further out of time. If the signals contained exactly the same number of samples they would NEVER go out of time. This is why sample to sample accuracy is required when dealing with looping audio.
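To put some rough numbers on it, here is a quick back-of-the-envelope sketch in C#. The 44.1 kHz sample rate and the loop lengths are made-up example values, not taken from the image above.

using System;

// Rough sketch of loop drift with hypothetical numbers (44.1 kHz audio assumed).
class LoopDriftExample
{
    static void Main()
    {
        const int sampleRate = 44100;        // samples per second (CD quality)

        // Hypothetical loop lengths in samples: the bass loop comes up a
        // couple of samples short of the drum loop.
        const long drumLoopSamples = 176400; // exactly 4 seconds at 44.1 kHz
        const long bassLoopSamples = 176398; // two samples short

        double driftPerLoopMs = (drumLoopSamples - bassLoopSamples) * 1000.0 / sampleRate;
        Console.WriteLine("Drift per loop: {0:F4} ms", driftPerLoopMs);

        for (int loops = 100; loops <= 100000; loops *= 10)
        {
            Console.WriteLine("After {0} loops the bass is {1:F2} ms behind the drums",
                              loops, driftPerLoopMs * loops);
        }

        // If both loops contained exactly the same number of samples the drift
        // would be exactly zero, no matter how many times they repeated.
    }
}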

Wednesday 15 October 2008

Programming Progress

Over the past couple of days I've been tinkering with some programming, trying to get to grips with certain processes. Since I decided to look into tool development for this project I turned to C# and Windows Forms. I had a bash with them and it was so easy to knock some stuff together - the new guide system that helps you position your elements on the form is fantastic!

I know some stuff about C#, but to develop any sort of audio application I really want to use FMOD as I'm so comfortable with it. There is an FMOD wrapper for C#, but my familiarity with the library lies in C++. After having a glance over FMOD# and its appalling lack of documentation I started to go off the idea of using C# at all, until I read about using unmanaged code from it (Visual C# Unleashed 2005). The book explains how to load an unmanaged DLL into C#, which would allow me to write my audio engine completely in C++ and expose only the necessary data through the exported DLL functions.
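As a minimal sketch of the idea, this is roughly what the C# side of such a wrapper looks like. The DLL name and the exported functions here are invented for illustration, not the actual engine interface.

using System;
using System.Runtime.InteropServices;

// Minimal sketch of calling into an unmanaged C++ DLL from C# via P/Invoke.
// "AudioEngine.dll" and the function names are made up for illustration; the
// real engine would export its own set of extern "C" functions wrapping FMOD.
static class AudioEngineNative
{
    [DllImport("AudioEngine.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern int InitEngine(int sampleRate);

    [DllImport("AudioEngine.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    public static extern int PlaySound(string filename);

    [DllImport("AudioEngine.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void ShutdownEngine();
}

class Program
{
    static void Main()
    {
        // On the C++ side each of these would be declared with
        // extern "C" __declspec(dllexport), keeping the FMOD calls behind the DLL.
        if (AudioEngineNative.InitEngine(44100) == 0)
        {
            AudioEngineNative.PlaySound("drums.wav");
            AudioEngineNative.ShutdownEngine();
        }
    }
}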

It sounded too good to be true, so I began with some simple tests. While I learned the basics quickly, these simple tests didn't give me much to work with, so I decided to take my growing networking module code and wrap it up in a DLL. This was perfect: the code was complicated enough to give a good result (multiple threads, dynamic memory allocation etc.) and didn't need any extra work, so I could get straight into C#.

So I defined my DLL export functions, loaded it into C# and, Bob's your uncle, we have C++ being called from C#. I even managed to figure out how to debug in mixed mode, so I could see the values in both the C# and C++ code in the same project. All good things must come to an end though, as I soon discovered a threading problem. Windows Forms is very picky about threads. From what I've discovered, any code that needs access to a form's UI elements must run on the thread that created them, so other threads have to marshal their calls across to it (see here). This is down to the UI thread running in a single-threaded apartment.
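Here is a stripped-down sketch of that rule in action, with a placeholder worker thread standing in for my packet checker; the form, label and timings are all made up.

using System;
using System.Threading;
using System.Windows.Forms;

// Stripped-down sketch of the cross-thread rule in Windows Forms.
// The form, label and fake "packet" loop are placeholders, not my actual code.
public class PacketForm : Form
{
    private readonly Label statusLabel;

    public PacketForm()
    {
        statusLabel = new Label();
        statusLabel.Dock = DockStyle.Fill;
        Controls.Add(statusLabel);

        // Start the worker only once the form is visible, so the window
        // handle exists before we try to marshal calls onto the UI thread.
        Shown += delegate
        {
            Thread worker = new Thread(CheckForPackets);
            worker.IsBackground = true;
            worker.Start();
        };
    }

    private void CheckForPackets()
    {
        int count = 0;
        while (true)
        {
            Thread.Sleep(1000); // pretend we just received a packet
            count++;

            // This thread must not touch statusLabel directly. Invoke hands the
            // update to the UI thread and blocks until it has run; BeginInvoke
            // would just queue it and return immediately.
            int snapshot = count;
            Invoke(new MethodInvoker(delegate
            {
                statusLabel.Text = "Packets received: " + snapshot;
            }));
        }
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new PacketForm());
    }
}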

I deduced that this must be related to the thread I was running in my DLL to check for received packets over the network. I was wrong, obviously, as this thread doesn't need access to the form's UI. It turned out to be an issue with using BeginInvoke instead of plain Invoke in the C# check-for-new-packets thread. I don't quite understand yet why BeginInvoke causes a deadlock until a new packet is received, as all the research I've done on the two seems to indicate that it shouldn't. Hopefully I will find the answer someday.

Worksheet 2 Explained

I like the way this year is set up in regards to peer review. It really helps you to refine an idea when you get so much feedback. On the other hand, giving your own feedback to others certainly opens the mind letting you think "outside the box".

So what was worksheet 2 about? In worksheet 2 I had to formulate a research question for my project and detail how I would go about answering it. Now, surprisingly (or perhaps unsurprisingly, depending on your age and outlook), adaptive music in computer games has been around for a long time. Looking back at what people have produced, not just in the industry but also in previous years of CGT, I have decided it would be wise to approach this subject indirectly: still keeping the theme of adaptive music, but not having it as the only focus. Without further delay, here is my second initial question (the first was rubbish :P)

"What aspects present in traditional audio engineering applications can be tailored for use in tools specifically designed for developing adaptive music in computer games?"

... say what?

Let's run through it.

"What aspects present in traditional audio engineering applications ... ". Skipping the aspects portion for a minute let us look at the traditional audio engineering applications bit. What is a "traditional" audio engineering application? Back in the day, audio engineers would use giant mixing desks, complicated systems of wiring and controllers etc. but with modern processing power the average joe can simulate this with a digital audio workstation. These workstations would run all the hardware related tasks of audio engineering but in software. According to Wikipedia this concept was made popular in 1987 with a package created by Digidesign that later went on to become the industry standard, Pro Tools.

What aspects of applications such as Pro Tools, Sound Forge, ACID Pro etc. am I talking about? There are lots of features in these programs that make them successful (too many to look into, list or even think about), so which ones would benefit the development of adaptive music? It would be silly to try and recreate something like Logic Audio, considering that it has been around for years, gone through many iterations and is incredibly stable. These applications are generally used for music production and, because of their long-running development, are very effective at it. This leads me to believe that computer game development tools tend not to be a replacement for these programs but more of an extension, trying to make the content-to-engine pipeline a lot smoother. People still use Max and Maya, even though they are not solely geared towards games. So, to ascertain which aspects of audio engineering applications would benefit the development of adaptive music, a look into current game audio tools is needed.

There are a fair few tools out there for game audio design, although from what I've seen very few actually "create" the music from scratch (an extension of existing applications, remember). One that I hope to single out for my project is a tool mentioned in the previous post, Wwise. Wwise is a fantastic design tool: once the programmer has integrated its framework into the engine, the audio content creator has full control over every part of the in-game audio. Wwise is a complete audio design tool, but the majority of it will be disregarded as I am only interested in the adaptive music section. Here you can load in wavs, structure them, assign their playback to game triggers, control their volume and so on - all really cool stuff. Aside from assigning playback to game triggers, this functionality will be very familiar from programs such as ACID Pro, Pro Tools and Sound Forge. The crux here is: what can tools like Wwise learn from these well-developed applications to make producing adaptive music more accessible and efficient?

I cannot say exactly at this moment in time what would or wouldn't benefit tools like Wwise, but from personal experience here is what I feel is a drawback. Currently, any sample brought into Wwise must be of the same length as any other sample it is to loop with, to ensure sample to sample accuracy. Reason 3 certainly does not have this problem: as long as the samples share the same time signature and bpm, and their length is a whole multiple of a bar, then you're good to go. Surely this should be able to work in Wwise. ACID Pro bypasses the bpm constraint by dynamically time stretching the sample to fit, and Pro Tools can even bypass the time signature by dynamically finding the beats and moving them! Surely these aspects could be put to good use in adaptive audio development, as they would remove a fair number of constraints on the composer. These are just personal examples that I feel are important, but obviously more research will need to be completed before a more educated example can be given. There might be a very good reason for the sample to sample accuracy requirement :D
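To make the bar-length idea concrete, here is a quick calculation of how many samples a bar takes up and how whole-bar loops line up. The sample rate, tempos and time signature are example values only, not anything pulled from Wwise or Reason.

using System;

// Quick sketch of the bar-length arithmetic behind Reason-style loop alignment.
// The sample rate, tempos and time signature here are just example values.
class BarLengthExample
{
    static long SamplesPerBar(int sampleRate, double bpm, int beatsPerBar)
    {
        double secondsPerBeat = 60.0 / bpm;
        return (long)Math.Round(sampleRate * secondsPerBeat * beatsPerBar);
    }

    static void Main()
    {
        const int sampleRate = 44100;
        const double sessionBpm = 120.0;
        const int beatsPerBar = 4;   // 4/4 time

        long bar = SamplesPerBar(sampleRate, sessionBpm, beatsPerBar);
        Console.WriteLine("One bar at {0} bpm in {1}/4 time = {2} samples", sessionBpm, beatsPerBar, bar);

        // Any loop whose length is a whole number of bars lines up with any
        // other such loop, however many bars long each of them is.
        Console.WriteLine("2-bar drum loop = {0} samples, 4-bar bass loop = {1} samples", 2 * bar, 4 * bar);

        // A loop recorded at a different tempo has to be time stretched to fit;
        // its duration scales by recordedBpm / sessionBpm (what ACID Pro automates).
        double recordedBpm = 100.0;
        Console.WriteLine("Duration scale factor for a {0} bpm loop = {1:F3}", recordedBpm, recordedBpm / sessionBpm);
    }
}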