Being a cowgirl at heart, I know that in some areas of the country ranchers can run their cattle herds anywhere they want unless a property owner fences them out, a practice known as ‘open range.’ So you may be thinking to yourself: what in the heck does this have to do with NPCs, better known as Non-Player Characters? Well, in a virtual world one can have free-roaming (‘open range,’ if you will) NPCs wandering around. A controlled, intelligent virtual world environment instead uses Non-Player Characters to guide the trainee through the simulation exercises while NOT allowing those NPCs to wander the open range. Rather, these NPCs have specific purposes behind their actions and interactions with the trainees. Virtual world training inside a smart fence, so to speak.
By Ed Heinbockel, President and CEO, Visual Purple, LLC
I keep seeing a myriad of speculation on what Philip Rosedale is working on within his newly founded company, The Love Machine. Of the three projects mentioned on the company website, perhaps the most intriguing is the third: “The Brain. Can 10,000 computers become a person?” With little other information forthcoming, one can only conjecture what Philip and his team are actually up to with respect to Artificial Intelligence (AI) in virtual worlds.
The following are a few short remarks I would make on the subject. I hope more personality can be infused into virtual worlds than the run-of-the-mill, boring NPCs and bots that exist out there today. Some NPCs stationed in virtual worlds today provide no more stimulus to the player than talking to a brick wall. I say AI very well could be the next masterpiece to be infused into the virtual world; the caliber of content could be brought to new levels by what artificially intelligent NPCs could bring to the table. Still, many skeptics remain. Would AI in a virtual world feel more like a glorified ‘smart’ chat room? Is Artificial Intelligence really possible in a virtual world environment? Is there a real possibility of mistaking a non-human player for a real human being? Will these potential AI characters actually be believable enough? Is there a timeline for when implementation may take place?
Sure, the future of AI in virtual worlds may lie in the hands of Non-Player Characters (NPCs), but what this will actually look and play like remains unknown. This currently untapped market will more than likely see some interesting developments in the near future. It is my belief that some efforts to bring AI into virtual worlds are already underway covertly and just aren’t ready to be publicized yet. Will this new form of ‘humanoid’ be believable enough? Or will humans and AI “lifeforms” simply exist in the same worldly sphere? Although the inherent complexities of a virtual world are nothing like the real world’s, it still brings a sense of awe to realize that Artificial Intelligence may in fact be possible, ultimately increasing the learning opportunities of virtual human interaction. We may someday ask ourselves: is that a real person or a ‘smart’ NPC? Hmmm, now wouldn’t THAT be interesting. Stay tuned…
By Rudy Helm, Audio and Quality Assurance Tech, Visual Purple, LLC.
Another element to this task is to lengthen or shorten the TTS words to match the blobs of the human model. Figure 5 depicts the effort to make the TTS utterance of ‘…was a…’ (pronounced as though a contraction, ‘whuzza’) line up on the timeline with John’s clip. Use your DAW’s stretch tool to accomplish this.
Figure 5a- First, make your split points
Figure 5b- Next, use a stretch tool
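For readers curious what stretching does numerically, here is a minimal NumPy sketch of resampling a split-off word to a new duration. Note the caveat: naive interpolation like this also shifts pitch, whereas a real DAW stretch tool uses pitch-preserving algorithms (phase vocoder, WSOLA and the like). The durations below are made-up illustration values, not measurements from John’s clip.

```python
import numpy as np

def stretch_segment(samples: np.ndarray, target_len: int) -> np.ndarray:
    """Naively stretch or shrink a clip to target_len samples via linear
    interpolation. Caveat: this also shifts pitch; DAW stretch tools use
    pitch-preserving algorithms instead."""
    src = np.arange(len(samples))
    dst = np.linspace(0, len(samples) - 1, target_len)
    return np.interp(dst, src, samples)

# Hypothetical example: make a 300 ms TTS '...was a...' ('whuzza') span
# the same 450 ms as the human actor's version.
sr = 44100
tts_word = np.sin(2 * np.pi * 220 * np.arange(int(0.30 * sr)) / sr)
human_len = int(0.45 * sr)
aligned = stretch_segment(tts_word, human_len)
```

After this, `aligned` occupies the same stretch of timeline as the human blob, which is all the alignment step needs.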
Let’s continue splitting the TTS clip’s timeline so that we can move each corresponding sound blob to match, stretching the words right down to the syllable (Figure 6 shows what it looks like when all words have been synced). Listen to the whole joke with both voices lined up properly.
Here’s where some of you are thinking: well, the blobs are lined up very nicely, but what about nuances of stress and pitch? The word ‘lawyer’ as expressed by our human friend, John, isn’t being expressed similarly by the TTS voice; John’s ‘lawyer’ blob is larger (i.e., louder) than the TTS blob. Likewise, the word ‘seen’ as John delivers it (in this case the stress comes not from volume but from its pitch being higher, relatively, than the rest of the phrase) isn’t being emulated by the synthetic actor.
Yes, indeed, so let’s try to fix these two issues. We’ll tackle the loudness point first. Figure 7 shows a Volume Envelope (the horizontal blue-ish line running through the center of the TTS clip in the timeline). In most DAWs that offer this feature, you can bend the volume envelope to raise or lower the audio level.
Figure 7 – Creating break points within the line bends the envelope
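Under the hood, a volume envelope is just a piecewise-linear gain curve multiplied into the samples. Here is a small NumPy sketch; the break-point positions and the 1.6x boost for the ‘lawyer’ region are hypothetical illustration values, not readings from Figure 7.

```python
import numpy as np

def apply_envelope(samples, break_points):
    """break_points: list of (sample_index, gain) pairs. Gain between
    points is linearly interpolated, mirroring a DAW's bent envelope line,
    then multiplied into the audio."""
    idx, gain = zip(*break_points)
    env = np.interp(np.arange(len(samples)), idx, gain)
    return samples * env

# Hypothetical: boost the 'lawyer' region of a 1-second TTS clip to match
# John's louder delivery, leaving the rest at unity gain.
sr = 44100
clip = np.ones(sr)  # placeholder audio at constant level
louder = apply_envelope(clip, [(0, 1.0), (20000, 1.0), (25000, 1.6),
                               (35000, 1.6), (40000, 1.0), (sr - 1, 1.0)])
```

The ramps on either side of the boosted region are what the extra break points in Figure 7 create: without them the gain change would be an audible click rather than a smooth swell.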
Now let’s tackle the pitch issue with that word, ‘seen’. Figure 8 shows the clip properties dialog box specific to the split-off region of our seen-blob. The highlighted value indicates that the word’s pitch has been raised four half steps.
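Raising a pitch by four half steps means multiplying the fundamental frequency by 2^(4/12), roughly 1.26, in equal temperament. As a rough, hypothetical sketch of one way a pitch shifter can work (naive resampling; a real DAW pitch tool preserves the clip’s duration, which this does not):

```python
import numpy as np

SEMITONE = 2 ** (1 / 12)  # frequency ratio of one equal-temperament half step

def pitch_shift_naive(samples, semitones):
    """Resample so playback pitch rises by `semitones`. This naive version
    also shortens the clip; DAWs re-stretch it to restore the duration."""
    ratio = SEMITONE ** semitones
    dst = np.arange(0, len(samples), ratio)  # read faster -> higher pitch
    return np.interp(dst, np.arange(len(samples)), samples)

# Raising 'seen' by four half steps multiplies its frequency by:
ratio = SEMITONE ** 4
print(round(ratio, 4))  # prints 1.2599
```

That ~26% frequency lift is what makes the stressed ‘seen’ stand out from the rest of the phrase without touching its volume.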
Listen to the resulting TTS clip with the treatments per ‘lawyer’ and ‘seen’.
Earlier I mentioned that this is a voice for a talking fish. This fish is contained within a fish tank in a hotel bar. Listen to our talking fish enveloped in a bubbling sound effect. Figure 9 shows the TTS clip, sans John’s clip, and with the fish tank noise clip added.
Figure 9 – Note that a volume envelope has been applied to the bubbles as well.
So, is that it, then? Maybe, maybe not. If we really wanted to add some realism to a talking-fish environment, we might consider what we know about how a fish tank affects sound. Occlusion happens: there is a glass barrier between the sound emitter (the talking fish) and the sound receiver (the avatar). So we could elect to shave off some of the high frequencies from our talking fish. We can accomplish this by choosing an appropriate reverb effect; if you have presets at your disposal, start with a bathroom preset or similar. Try placing the reverb effect before any equalization effects (EQ). We use EQ here to bring out the high-mid frequencies of the voice to ensure that it is intelligible (you may also need to reduce high frequencies if you choose a reverb preset that sounds too bright). In this case we are also deploying EQ to remove extreme low-frequency rumble (artifacts that commonly get introduced by accident when using filters in the digital domain). Figure 10 shows this idea. Have a listen to the result.
Figure 10a – Software ‘bathroom’ reverb
Figure 10b – Software EQ module
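As a minimal sketch of the occlusion idea, here is a one-pole low-pass filter in NumPy, a crude stand-in for the reverb/EQ chain in the figures above. The 1,500 Hz cutoff and the test tones are arbitrary illustration values, not settings from the actual project.

```python
import numpy as np

def one_pole_lowpass(samples, sr, cutoff_hz):
    """Simple one-pole low-pass: rolls off frequencies above cutoff_hz,
    roughly mimicking sound muffled by the tank glass."""
    dt = 1.0 / sr
    rc = 1.0 / (2 * np.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = np.empty(len(samples), dtype=float)
    acc = 0.0
    for i, x in enumerate(samples):
        acc += alpha * (x - acc)  # smooth toward the input
        out[i] = acc
    return out

# Hypothetical check: an 8 kHz tone loses far more energy than a 300 Hz
# tone after the filter, which is the 'behind glass' effect we want.
sr = 44100
t = np.arange(sr) / sr
low = one_pole_lowpass(np.sin(2 * np.pi * 300 * t), sr, 1500.0)
high = one_pole_lowpass(np.sin(2 * np.pi * 8000 * t), sr, 1500.0)
```

In practice you would reach for the DAW’s own reverb and EQ modules as described above; the point of the sketch is only that occlusion boils down to attenuating the highs.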
Can synthetic voice-actors make funny? Humor is a very subjective aspect of human emotion. What’s funny to Samuel isn’t so funny to Mary, and so forth. So maybe the jury is still out on that one. To improve our NPC’s delivery, we had to rely on third-party software to carefully deploy these techniques. For specialized, offline tasks like this one, where real-time interaction is not a requirement, markup languages alone probably won’t be sufficient. That’s my best guess, anyway.
You may wonder what to do if you have a project that requires an ensemble of funny voices. Well, as long as you have at least one funny human available to you, that person can be your model for all voices. Then your cast of synthetic actors can be molded to conform to your model’s comedic timing.
How about this scenario: you have a cinematic cut-scene with several actors in the movie (or trailer), but your budget can only afford one human voice-actor. Consider recording your one voice actor performing the roles of the entire cast. Then, using the techniques discussed above, create an ensemble of TTS voices and synchronize them in your video editor (NLE) to the phrasings and expressions of your one human actor.
In fact, maybe we’ll try to tackle an example of that in my next blog entry. Stay tuned!