Tutorial: Other ways to chipmunkify your voice

There are dozens of “Talking” apps on the iPhone app store, as I’ve mentioned before. Basically what those apps do is, you say something, and then the animal will repeat it, in this chipmunk like voice. But even they are different apps, and they are certainly different animals (hippo, bird, cat, giraffe, duh), some of them share the same voices! Why does that adorable hippo sound like the cat?!

The solution I posted in my previous blog is simply use CocosDenshion to manipulate the recorded voice (your voice), to a higher or lower pitch to produce the voice of the animal (chipmunk). But the flaw of tha solution is that if you change the pitch, you are also changing the speed. So if you lower the pitch, you get this really low voice that is being played really sloooow.

And I don’t want that I want to change the pitch but not change the speed. So I need a different solution. And the solution is Dirac3.

According to its website:

DIRAC redefines the limits of what todays’ technology can do for your application if you want to change the speed and pitch of music independently without sacrificing quality. Used by leading hi-end audio processing applications and custom built solutions in studios around the world, DIRAC is our critically acclaimed time stretching and pitch shifting technology that can be applied both musically monophonic and polyphonic signals with equal ease and success. Its straightforward API makes it an ideal tool for any software project that requires time stretching and pitch shifting.

Basically Dirac allows you to change the pitch of your audio, without speeding it up or slowing it down.

Dirac 3 has a free version, called Dirac LE, which you can simply download from their website: http://dirac.dspdimension.com/files/DIRAC3LE.zip Dirac LE is also available iPhone/iPad ARM 6 and 7 compliant (Xcode, iOS 3.2+ and iOS4+).

Okay, download Dirac LE, and then let’s get started (oh, I am setting up mine as I write this blog post, as well).

According to the “iPhone ReadMe (or suffer).rtf” that came with the zip file, we need to include the vecLib/Accelerate frameworks to your project. Go to Frameworks, right click, add Existing Frameworks, and then look for “Accelerate.framework”, add. Oh, and any file that will contain Dirac calls need to be .mm, instead of .m.

And then Add Exsiting Files, add “Dirac.h” and “libDIRAC_iOS4-fat.a” to your project.

I will be using the Time Stretching Example as my guide, (it’s also in the zip file). Oh, the zip file also contains a 32 page pdf file explaining Dirac.

From the Time Stretching Example, copy the files in ExtAudioFile folder, EAFRead.h, EAFRead.mm, EAFWrite.h, EAFWrite.mm. These are the files Dirac will use to read and write audio files.

And then we create a new file, I’m calling it AudioProcessor.mm, take note it’s .mm, because it will be calling Dirac. And basically I just copy pasted most of the code from iPhoneTestAppDelegate of the example. (Guilty for being a copy paster coder).

And then edit some stuff, so AudioProcessor.h:

#import <Foundation/Foundation.h>
#import “EAFRead.h”
#import “EAFWrite.h”

@interface AudioProcessor : NSObject <AVAudioPlayerDelegate>
{    AVAudioPlayer *player;
    float percent;

    NSURL *inUrl;
    NSURL *outUrl;
    EAFRead *reader;
    EAFWrite *writer;
}

@property (readonly) EAFRead *reader;

@end

And then edit some more stuff, AudioProcessor.mm:

#include “Dirac.h”
#include <stdio.h>
#include <sys/time.h>

#import <AVFoundation/AVAudioPlayer.h>
#import <AVFoundation/AVFoundation.h>

#import “AudioProcessor.h”
#import “EAFRead.h”
#import “EAFWrite.h”

double gExecTimeTotal = 0.;

void DeallocateAudioBuffer(float **audio, int numChannels)
{
    if (!audio) return;
    for (long v = 0; v < numChannels; v++) {
        if (audio[v]) {
            free(audio[v]);
            audio[v] = NULL;
        }
    }
    free(audio);
    audio = NULL;
}

float **AllocateAudioBuffer(int numChannels, int numFrames)
{
    // Allocate buffer for output
    float **audio = (float**)malloc(numChannels*sizeof(float*));
    if (!audio) return NULL;
    memset(audio, 0, numChannels*sizeof(float*));
    for (long v = 0; v < numChannels; v++) {
        audio[v] = (float*)malloc(numFrames*sizeof(float));
        if (!audio[v]) {
            DeallocateAudioBuffer(audio, numChannels);
            return NULL;
        }
        else memset(audio[v], 0, numFrames*sizeof(float));
    }
    return audio;
}

/*
This is the callback function that supplies data from the input stream/file whenever needed.
It should be implemented in your software by a routine that gets data from the input/buffers.
The read requests are *always* consecutive, ie. the routine will never have to supply data out
of order.
*/
long myReadData(float **chdata, long numFrames, void *userData)
{
    // The userData parameter can be used to pass information about the caller (for example, “self”) to
    // the callback so it can manage its audio streams.
    if (!chdata)    return 0;

    AudioProcessor *Self = (AudioProcessor*)userData;
    if (!Self)    return 0;

    // we want to exclude the time it takes to read in the data from disk or memory, so we stop the clock until
    // we’ve read in the requested amount of data
    gExecTimeTotal += DiracClockTimeSeconds();

// ……………………….. stop timer ……………………………………

    OSStatus err = [Self.reader readFloatsConsecutive:numFrames intoArray:chdata];

    DiracStartClock();

// ……………………….. start timer ……………………………………

    return err;

}

@implementation AudioProcessor

@synthesize reader;

-(void)playOnMainThread:(id)param
{
    NSError *error = nil;

    player = [[AVAudioPlayer alloc] initWithContentsOfURL:outUrl error:&error];
    if (error)
        NSLog(@”AVAudioPlayer error %@, %@”, error, [error userInfo]);

    player.delegate = self;
    [player play];
}

-(void)processThread:(id)param
{
    NSLog(@”processThread”);
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    long numChannels = 1;        // DIRAC LE allows mono only
    float sampleRate = 44100.;

    // open input file
    [reader openFileForRead:inUrl sr:sampleRate channels:numChannels];

    // create output file (overwrite if exists)
    [writer openFileForWrite:outUrl sr:sampleRate channels:numChannels wordLength:16 type:kAudioFileAIFFType];

    // DIRAC parameters
    // Here we set our time an pitch manipulation values
    float time      = 1.0;
    float pitch     = 1.0;
    float formant   = 1.0;

    // First we set up DIRAC to process numChannels of audio at 44.1kHz
    // N.b.: The fastest option is kDiracLambdaPreview / kDiracQualityPreview, best is kDiracLambda3, kDiracQualityBest
    // The probably best *default* option for general purpose signals is kDiracLambda3 / kDiracQualityGood
    void *dirac = DiracCreate(kDiracLambdaPreview, kDiracQualityPreview, numChannels, 44100., &myReadData, (void*)self);
    //    void *dirac = DiracCreate(kDiracLambda3, kDiracQualityBest, numChannels, 44100., &myReadData);
    if (!dirac) {
        printf(“!! ERROR !!nntCould not create DIRAC instancentCheck number of channels and sample rate!n”);
        printf(“ntNote that the free DIRAC LE library supports onlyntone channel per instancennn”);
        exit(-1);
    }

    // Pass the values to our DIRAC instance
    DiracSetProperty(kDiracPropertyTimeFactor, time, dirac);
    DiracSetProperty(kDiracPropertyPitchFactor, pitch, dirac);
    DiracSetProperty(kDiracPropertyFormantFactor, formant, dirac);

    // upshifting pitch will be slower, so in this case we’ll enable constant CPU pitch shifting
    if (pitch > 1.0)
        DiracSetProperty(kDiracPropertyUseConstantCpuPitchShift, 1, dirac);

    // Print our settings to the console
    DiracPrintSettings(dirac);

    NSLog(@”Running DIRAC version %snStarting processing”, DiracVersion());

    // Get the number of frames from the file to display our simplistic progress bar
    SInt64 numf = [reader fileNumFrames];
    SInt64 outframes = 0;
    SInt64 newOutframe = numf*time;
    long lastPercent = -1;
    percent = 0;

    // This is an arbitrary number of frames per call. Change as you see fit
    long numFrames = 8192;

    // Allocate buffer for output
    float **audio = AllocateAudioBuffer(numChannels, numFrames);

    double bavg = 0;

    // MAIN PROCESSING LOOP STARTS HERE
    for(;;) {

        // Display ASCII style “progress bar”
        percent = 100.f*(double)outframes / (double)newOutframe;
        long ipercent = percent;
        if (lastPercent != percent) {
            //[self performSelectorOnMainThread:@selector(updateBarOnMainThread:) withObject:self waitUntilDone:NO];
            printf(“rProgress: %3i%% [%-40s] “, ipercent, &”||||||||||||||||||||||||||||||||||||||||”[40 – ((ipercent>100)?40:(2*ipercent/5))] );
            lastPercent = ipercent;
            fflush(stdout);
        }

        DiracStartClock();

// ……………………….. start timer ……………………………………

        // Call the DIRAC process function with current time and pitch settings
        // Returns: the number of frames in audio
        long ret = DiracProcess(audio, numFrames, dirac);
        bavg += (numFrames/sampleRate);
        gExecTimeTotal += DiracClockTimeSeconds();

// ……………………….. stop timer ……………………………………

        printf(“x realtime = %3.3f : 1 (DSP only), CPU load (peak, DSP+disk): %3.2f%%n”, bavg/gExecTimeTotal, DiracPeakCpuUsagePercent(dirac));

        // Process only as many frames as needed
        long framesToWrite = numFrames;
        unsigned long nextWrite = outframes + numFrames;
        if (nextWrite > newOutframe) framesToWrite = numFrames – nextWrite + newOutframe;
        if (framesToWrite < 0) framesToWrite = 0;

        // Write the data to the output file
        [writer writeFloats:framesToWrite fromArray:audio];

        // Increase our counter for the progress bar
        outframes += numFrames;

        // As soon as we’ve written enough frames we exit the main loop
        if (ret <= 0) break;
    }

    percent = 100;
    //[self performSelectorOnMainThread:@selector(updateBarOnMainThread:) withObject:self waitUntilDone:NO];


    // Free buffer for output
    DeallocateAudioBuffer(audio, numChannels);

    // destroy DIRAC instance
    DiracDestroy( dirac );

    // Done!
    NSLog(@”nDone!”);

    [reader release];
    [writer release]; // important – flushes data to file

    // start playback on main thread
    [self performSelectorOnMainThread:@selector(playOnMainThread:) withObject:self waitUntilDone:NO];

    [pool release];
}

– (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag
{
}

– (void) initAudioProcessor: (NSURL*) filePath
{   NSLog(@”initAudioProcessor”);

    //NSString *inputSound = [[[NSBundle mainBundle] pathForResource: @”voice” ofType: @”aif”] retain];
    NSString *outputSound = [[[NSHomeDirectory() stringByAppendingString:@”/Documents/”] stringByAppendingString:@”out.aif”] retain];
    inUrl = [filePath retain];
    outUrl = [[NSURL fileURLWithPath:outputSound] retain];
    reader = [[EAFRead alloc] init];
    writer = [[EAFWrite alloc] init];

    // this thread does the processing
    [NSThread detachNewThreadSelector:@selector(processThread:) toTarget:self withObject:nil];
}

– (void)dealloc
{
    [player release];
    [inUrl release];
    [outUrl release];

    [super dealloc];
}

In my AudioController (code in previous blog), create a AudioProcessor object:

AudioProcessor *audioProcessor = [[AudioProcessor alloc] init];
[audioProcessor initAudioProcessor: [NSString stringWithFormat: @”%@”, recordedTmpFiles[recordedTmpFileIdx]]];

recordedTmpFiles[recordedTmpFileIdx] is the filepath of the recorded audio.

Just adjust float pitch, to change the well, pitch of your voice. For a chipmunky voice, set the pitch to > 1, for a low voice set the pitch to < 1.

And that’s it 🙂

Tutorial: Other ways to chipmunkify your voice

Published by Michelle Chen

Leave a comment Cancel reply

Share this:

Related

Published by Michelle Chen

Leave a comment Cancel reply