Android Drum machine

For iOS (iPhone, iPad & iPod), Android, Windows Phone, etc. App and Hardware talk

Post

So what does AudioTrack do to improve performance? I am looking at this thread, and apparently this guy got bad performance using AudioTrack as well.

http://stackoverflow.com/questions/1879 ... ack-arrghh

EDIT: More important question!

Regarding this....
The only proper way to sequence audio samples is to measure the time by the processed samples, or in other words by the audio buffers you're filling. Then send the final byte buffers to the AudioTrack class (when you're using java). E.g. when you process 192 samples in an environment that runs on 44100hz, you can calculate the passed time by 1000ms / 44100 * 192 = 4.354ms
How can I write this 192-sample buffer every 4.354 ms when I can only retrieve the system time as a long? Do I use System.nanoTime()?

I wrote this code to play a sample continuously using AudioTrack. I calculate the time in nanoseconds. It plays the sample back quite distorted.

Code: Select all

public void run() {

    int bufferSize = 1024;
    byte[] output = new byte[bufferSize];
    long period = bufferSize * (1000000000 / 44100); // in nanoseconds
    int pos1 = 0;
    int pos2 = 0;

    time = System.nanoTime();
    while (play)
    {
        if (pos1 < bufferSize)
        {
            output[pos1] = sample[pos2];
            pos1++;
            pos2++;

            if (pos2 >= sample.length)
            {
                pos2 = 0;
            }
        }

        if (System.nanoTime() - time > period)
        {
            time = System.nanoTime();
            track.write(output, 0, bufferSize);
            pos1 = 0;
        }
    }
}

Post

Ap0C552 wrote: How can I write this 192-sample buffer every 4.354 ms when I can only retrieve the system time as a long? Do I use System.nanoTime()?
Forget about the system time entirely. You cannot use the system time (either in millis or nanos) to calculate the sequencer timing.

What you're trying to do is to loop over the time (processing in every cycle) and then calculate the number of samples to process from the elapsed time. That is the wrong approach and doesn't work on any platform.

What you need to do is to define an internal buffer size (let's say 192 samples; that's just for your application and has nothing to do with the AudioTrack). Then create an audio loop, and in every cycle process your internal buffer size (192 samples), calculate the time from those samples (1000 ms / 44100 * 192 = 4.354 ms), and update your sequencer playback time. Once you have the time calculated, you can check if you want to play anything (triggers, notes, whatever).

As you can see, it's exactly the opposite of what you're doing now. And yes, the sequencer core gets a lot more complicated that way, but it's the only way to go. The one and only accurate timing source is your audio stream, NOT the system time.

Finally, push your internal buffer to your output byte buffer (splitting your samples into bytes in the process) and then write the byte buffer, once it's filled, to the AudioTrack (its size may be much bigger than your internal buffer).

It doesn't matter if your processing cycle finishes in less time than these 4.354 ms; AudioTrack only cares if your cycle takes longer than these 4.354 ms.
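
A rough sketch of that audio-driven loop, just to illustrate the idea (renderStep() and fireDueEvents() are made-up placeholders for your own mixing and trigger code, not anything from your project):

Code: Select all

// Sketch only: audio-driven sequencer loop.
final int SAMPLE_RATE = 44100;
final int INTERNAL_BUFFER = 192;                  // samples processed per cycle
short[] mixBuffer = new short[INTERNAL_BUFFER];   // internal 16-bit mix buffer
byte[] outBytes = new byte[INTERNAL_BUFFER * 2];  // 2 bytes per 16-bit mono sample
double playbackTimeMs = 0.0;                      // sequencer clock, advanced by samples only

while (play) {
    renderStep(mixBuffer);                                      // render the next 192 samples of the mix
    playbackTimeMs += 1000.0 * INTERNAL_BUFFER / SAMPLE_RATE;   // ~4.354 ms per cycle
    fireDueEvents(playbackTimeMs);                              // check triggers/notes against this clock

    // split the 16-bit samples into little-endian bytes and write them
    for (int i = 0; i < INTERNAL_BUFFER; i++) {
        outBytes[2 * i]     = (byte) (mixBuffer[i] & 0xFF);          // low byte
        outBytes[2 * i + 1] = (byte) ((mixBuffer[i] >> 8) & 0xFF);   // high byte
    }
    track.write(outBytes, 0, outBytes.length);    // blocking write paces the loop
}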

I wish someone had told me that when I started :ud:

Post

As for my example code above, I was just trying to get familiar with AudioTrack using a contrived example. But we can forget about that.

As for the sequencer method..

I am super appreciative of your help. I think I might understand LOL

So I want to process a buffer of some size, say 192 samples. And when you say process, you mean calculate the sound waveform based on whatever section of samples should be playing then.

Then I write that buffer to the AudioTrack. Once that buffer is finished playing, I will know 4.354 ms have passed? Then I check what should be played 4.354 ms later?

PS: I previously started a thread asking a similar question in the DSP plugin development forum, since I thought it belonged there.

Post

Ap0C552 wrote: So I want to process a buffer of some size, say 192 samples. And when you say process, you mean calculate the sound waveform based on whatever section of samples should be playing then.
Exactly! Whatever section of all the samples together, i.e. the complete mix.
Ap0C552 wrote: Then I write that buffer to the AudioTrack. Once that buffer is finished playing, I will know 4.354 ms have passed?
Yes. You'll know how much time has passed based on the number of samples you've processed.
Ap0C552 wrote: Then I check what should be played 4.354 ms later?
Simply said, yes.
But it's a bit more than that. If you have a sequencer running, you usually have triggers or notes at specific time ticks (ideally calculated in nanos rather than millis). You'll definitely hit the situation where, after these 192 samples, the next sequencer event (e.g. a trigger) is due just 2.224 ms later, for example. If that is the case, don't process the usual buffer size; process only the number of samples needed to reach the next event tick, in this case "samplerate / 1000 * 2.224". If you do so, you'll never have events delayed or missed by overrunning an event tick.
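
A small sketch of that idea, reusing the made-up names from the earlier loop sketch: if the next trigger is due in 2.224 ms, render only that many samples instead of the full internal buffer.

Code: Select all

// Sketch: render only up to the next sequencer event instead of a full cycle.
double msUntilNextEvent = nextEventTimeMs - playbackTimeMs;         // e.g. 2.224 ms
int samplesToRender = (int) Math.round(SAMPLE_RATE / 1000.0 * msUntilNextEvent);
samplesToRender = Math.min(samplesToRender, INTERNAL_BUFFER);       // never exceed the cycle size
renderStep(mixBuffer, samplesToRender);                             // hypothetical partial render
playbackTimeMs += 1000.0 * samplesToRender / SAMPLE_RATE;
fireDueEvents(playbackTimeMs);                                      // the trigger now lands exactly on its tick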

Post

What about writing such a small buffer to the AudioTrack? Is this going to sound fine? Just wondering because AudioTrack has getMinBufferSize(), which returns the minimum buffer size needed to perform properly... I think.

Post

Ap0C552 wrote: What about writing such a small buffer to the AudioTrack? Is this going to sound fine? Just wondering because AudioTrack has getMinBufferSize(), which returns the minimum buffer size needed to perform properly... I think.
The AudioTrack itself usually has a longer buffer (its own buffer, internally), namely the size returned by AudioTrack.getMinBufferSize(...) or bigger. That's the size you pass when you initialize it.

Your internal buffer can have a different size; the AudioTrack class takes care of that. Just write your portions, and when the AudioTrack buffer (the one inside AudioTrack) is full, it gets flushed to the native audio driver or whatever sits below it.
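
For illustration, a minimal sketch of that relationship (assuming 44.1 kHz mono 16-bit, as elsewhere in this thread): the track is created with at least getMinBufferSize() bytes, while the application writes much smaller portions.

Code: Select all

int minBuf = AudioTrack.getMinBufferSize(44100,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);

AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
        minBuf, AudioTrack.MODE_STREAM);
track.play();

byte[] portion = new byte[192 * 2];        // your internal buffer, already split into bytes
// ... fill portion ...
track.write(portion, 0, portion.length);   // AudioTrack buffers it internally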

Post

Here is my first attempt to implement the proper technique. I have one boolean array of length 16 full of true values, and one sample. So it should be playing 16th notes.

TEMPO=120

My period is one 16th note in nanoseconds..

Code: Select all

private void setPeriod()
	{
		
		period=(int)((1/(((double)TEMPO)/60))*1000);
		period=(period*1000000)/4;
		
		Log.i("test",String.valueOf(period));
		
	}
My bufferTime in nanoseconds is:

Code: Select all

long bufferTime=(1000000000/SAMPLE_RATE)*buffSize;
Yet my tempo playback is extremely fast. Like 4-8 times faster (obviously hard to tell exactly).

This is what the play loop looks like

Code: Select all

@Override
public void run() {

    int buffSize = 256;
    byte[] output = new byte[buffSize];
    int pos1 = 0; // index into the output array
    int pos2 = 0; // index into the sample array
    long bufferTime = (1000000000 / SAMPLE_RATE) * buffSize;
    long elapsed = 0;

    currTrigger = trigger[triggerPointer];

    while (play)
    {
        // fill up the buffer
        while (pos1 < buffSize)
        {
            output[pos1] = 0;

            if (currTrigger && pos2 < sample.length)
            {
                output[pos1] = sample[pos2];
                pos2++;
            }
            pos1++;
        }
        track.write(output, 0, buffSize);
        elapsed = elapsed + bufferTime;

        // time passed is more than one 16th note
        if (elapsed >= period)
        {
            Log.i("test", String.valueOf(elapsed));
            elapsed = 0;
            triggerPointer++;
            if (triggerPointer == 16)
                triggerPointer = 0;
            currTrigger = trigger[triggerPointer];
            pos2 = 0;
        }

        pos1 = 0;
    }
}
}).start();

EDIT:

I have been tracking the system time in nanoseconds just to debug, and there is a massive discrepancy between the two.

Here is what the log statement is showing
01-28 19:40:47.394: I/test(20896): elapsed A.T.=126254400 elapsed S.T.=59942012


I have set my AudioTrack to run at 44100, and deriving the time it should take to play one buffer seems correct:

Code: Select all

long bufferTime=(1000000000/SAMPLE_RATE)*buffSize;
I instantiate the AudioTrack as follows:

Code: Select all

buffSize = AudioTrack.getMinBufferSize(SAMPLE_RATE, AudioFormat.CHANNEL_OUT_MONO, 
                AudioFormat.ENCODING_PCM_16BIT);
		track = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLE_RATE, 
                AudioFormat.CHANNEL_OUT_MONO, 
                AudioFormat.ENCODING_PCM_16BIT, 
                buffSize, 
                AudioTrack.MODE_STREAM);

Post

Your period calculation results in the right value: 125000000
I personally would calculate it without converting to a floating point number; that's much less CPU intensive. Something like this:

long periodInNanos = 1000L * 1000000L * 60 / 120 / 4; // ms -> nanos, 60 sec/min, 120 bpm, 4 steps per beat

Also your calculated bufferTime is correct: 5804988.66 nanos

The only mistake I see so far is that you're storing your sample directly into the output byte buffer without any conversion. Keep in mind that you have 16-bit samples, which are stored as two's complement values split over 2 bytes per sample (on a mono stream): 16 bit == 2 bytes.

So if you have your samples internally stored as short values (meaning from -32768 to +32767), you have to split them into bytes first, something like this:

Code: Select all

byte low0 = (byte) (sample & 0xFF);
byte high0 = (byte) ((sample >> 8) & 0xFF);
            
outputBuffer[posInOutBuffer++] = low0;
outputBuffer[posInOutBuffer++] = high0;
I guess this is what makes it play too fast.

Check out this site, I'm sure it'll help you understand:
http://www.jsresources.org/faq_audio.html#short_to_byte

Post

Can I just store it in a short array then?

I would love to see a simple tutorial for reading in a 16-bit PCM file. When I search for that, all I get is converting wave to PCM. I pieced together a lot of my code from examples I could find and I don't understand it all, mainly the part about reading in the file.

Post

Ap0C552 wrote:Can I just store it in a short array then?
I don't think so. I don't know what the write(short[], ...) method in AudioTrack is for; there's no proper description of it in the docs.
You need to split your samples into bytes, and I think there's no way around it.
Ap0C552 wrote: I would love to see a simple tutorial for reading in a 16-bit PCM file. When I search for that, all I get is converting wave to PCM. I pieced together a lot of my code from examples I could find and I don't understand it all, mainly the part about reading in the file.
Actually, since this is a complex task, there is no simple tutorial. I mean, of course, if we were in a blue-sky scenario where all wav and aif files met the standards, it might be simple. But unfortunately there are thousands of variations of wav and aif file headers (which you have to take care of while reading the files).

Simply said, you have to read the wave header first (there are plenty of resources about the header specs). Then, at the end, you read the PCM data (the samples in the form of bytes).
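
Just to show the shape of it, here's a minimal sketch that walks a wav file's chunks until the "data" chunk. It assumes a plain, uncompressed RIFF/WAVE file and ignores odd-size chunk padding; real-world files can be messier, as said above.

Code: Select all

import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public final class SimpleWavReader {

    /** Reads a canonical RIFF/WAVE stream and returns the raw PCM bytes. */
    public static byte[] readPcmData(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        byte[] riff = new byte[12];
        dis.readFully(riff);                        // "RIFF" + size + "WAVE"

        while (true) {
            byte[] id = new byte[4];
            dis.readFully(id);                      // chunk id, e.g. "fmt " or "data"
            int size = readLE32(dis);               // chunk size, little endian
            String chunkId = new String(id, "US-ASCII");

            if (chunkId.equals("fmt ")) {
                int audioFormat = readLE16(dis);    // 1 == uncompressed PCM
                int channels = readLE16(dis);
                int sampleRate = readLE32(dis);
                dis.readFully(new byte[6]);         // byte rate (4) + block align (2)
                int bitsPerSample = readLE16(dis);
                dis.readFully(new byte[size - 16]); // skip any format extension
                // verify audioFormat / channels / sampleRate / bitsPerSample here
            } else if (chunkId.equals("data")) {
                byte[] pcm = new byte[size];        // little-endian 16-bit samples
                dis.readFully(pcm);
                return pcm;
            } else {
                dis.readFully(new byte[size]);      // skip "LIST", "fact", etc.
            }
        }
    }

    private static int readLE16(DataInputStream dis) throws IOException {
        return dis.readUnsignedByte() | (dis.readUnsignedByte() << 8);
    }

    private static int readLE32(DataInputStream dis) throws IOException {
        return readLE16(dis) | (readLE16(dis) << 16);
    }
}

The bytes it returns are still little-endian pairs; reconstructing them into shorts is shown further down.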

Then you have to do the opposite of what you do when writing samples to the AudioTrack: you must reconstruct the 16-bit samples from the bytes.
http://www.jsresources.org/faq_audio.ht ... ct_samples
Take your time to understand it; this isn't something you learn within an hour, I guess. Just go forward step by step.

There's a library available for Java called Tritonus; I'm not sure if you can use it on Android, but reading its code will help you write a proper wav reader.

Post

But I thought that if you convert your file to 16-bit PCM, it has no wave file header or anything. I used Audacity to convert to 16-bit PCM first.

Also, looking at the resource, am I to assume "high byte" means the upper 8 bits?

So in a 16 bit sample = 0000 0000 1111 1111

it would be stored as

byteArray[0]=1111 1111
byteArray[1]=0000 0000

EDIT: I am not completely sure if the sample data is the culprit in this fast playback.

This is the code I have for reading in the sample. Why would I have to convert it to short when AudioTrack takes a byte[]?

Code: Select all

in1 = getResources().openRawResource(R.raw.snare);
sample = convertStreamToByteArray(in1);
in1.close();

Code: Select all

public static byte[] convertStreamToByteArray(InputStream is) throws IOException {

	    ByteArrayOutputStream baos = new ByteArrayOutputStream();
	    byte[] buff = new byte[10240];
	    int i = Integer.MAX_VALUE;
	    while ((i = is.read(buff, 0, buff.length)) > 0) {
	        baos.write(buff, 0, i);
	    }

	    return baos.toByteArray(); // be sure to close InputStream in calling function

	}

Post

But I thought that if you convert your file to 16-bit PCM, it has no wave file header or anything. I used Audacity to convert to 16-bit PCM first.
No, there "should" always a header, which provides the file type, encoding, channels, bitrate, samplerate, etc.

http://www-mmsp.ece.mcgill.ca/Documents ... /WAVE.html
Also, looking at the resource, am I to assume "high byte" means the upper 8 bits?

So in a 16 bit sample = 0000 0000 1111 1111

it would be stored as

byteArray[0]=1111 1111
byteArray[1]=0000 0000
Yes, exactly: the high byte means the upper 8 bits. So what you say is correct, since you have to write the bytes in little-endian order (low byte first).

When you read PCM data from a wav file, it's also little endian (unlike aiff, which is big endian with some rare exceptions). So you have to reconstruct a sample like this (once you have the bytes):

Code: Select all

short sample = (short)
    (  (byteArray[0] & 0xFF)
     | (byteArray[1] << 8) );
Your code to read the bytes is correct, but you have to read the header first, since the PCM data does not start at the beginning of the file. Where it starts and where it ends (often not at the end of the file) is stored in the header.

I recommend analyzing the wave reader of the Tritonus library. Or just check out the source code of the latest JDK by Oracle; there you'll find a wave reader as well. As far as I know, in the JDK it's AudioSystem.getAudioInputStream(...) that parses the header and reads the format.

Post

OK, I will look into properly reading the header.

But I don't think any of this would be causing this weird tempo problem.

Why would I change the endianness of the bytes? As long as I write them out in the same order I read them in, AudioTrack is getting the same format... right? I mean the sample sounds correct to me on playback.

EDIT: Actually the timbre of the sample changes a little when I play it back, compared to playing it in the sample pool.

Post

Ap0C552 wrote:But I don't think any of this would be causing this weird tempo problem
...
EDIT: Actually the timbre of the sample changes a little when I play it back, compared to playing it in the sample pool.
I would say these are both caused by the same thing.

Could it be that either your sample is not 44.1 kHz or your AudioTrack is not initialized at 44.1 kHz?
Or is the number of channels different (file vs. AudioTrack)? Or maybe the bit rate?
I'm pretty sure it's something like that. That's why you need to read the header ;)
Check your file in Audacity for its header information (sample rate, bit rate, channels, and encoding, which must be uncompressed PCM), and also check the configuration of your AudioTrack. Some Android devices like the Nexus 7 have a native sample rate of 48 kHz rather than 44.1.
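
If it helps, the native rate can also be queried at runtime with the standard AudioTrack API, something like this:

Code: Select all

// Query the device's native output sample rate for the music stream.
int nativeRate = AudioTrack.getNativeOutputSampleRate(AudioManager.STREAM_MUSIC);
Log.i("test", "native output sample rate: " + nativeRate);   // e.g. 48000 on a Nexus 7
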
Ap0C552 wrote: Why would I change the endianness of the bytes? As long as I write them out in the same order I read them in, AudioTrack is getting the same format... right? I mean the sample sounds correct to me on playback.
What if you read an aiff file?
And apart from that, in a drum sequencer like the one you're creating, you'll never stream the single (one-shot) samples directly from the file system. Rather, you'll load them into memory and then stream them from there. And since you have to mix them in some way, or maybe pass them through some DSP modules (even simple stuff like adjusting the volume), you have to keep the samples as 16-bit short[] or 32-bit float[] in memory. Therefore you must know what endianness a particular file has when you read it.

Once you have passed all the playing samples through your DSP modules, you have to split them back into bytes and write them to the AudioTrack. Writing to the AudioTrack is always little endian.
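
A rough sketch of that last step (the method and variable names are made up for illustration, not from your project): two one-shot samples held in memory as short[], summed with simple clipping, then split back into little-endian bytes for AudioTrack.write().

Code: Select all

// Sketch: mix two in-memory 16-bit samples and write the result to the track.
static void mixAndWrite(AudioTrack track, short[] a, int posA, short[] b, int posB, int frames) {
    byte[] out = new byte[frames * 2];
    for (int i = 0; i < frames; i++) {
        int sa = (posA + i < a.length) ? a[posA + i] : 0;
        int sb = (posB + i < b.length) ? b[posB + i] : 0;
        int mixed = sa + sb;                               // simple sum; scale or limit as needed
        if (mixed > Short.MAX_VALUE) mixed = Short.MAX_VALUE;
        if (mixed < Short.MIN_VALUE) mixed = Short.MIN_VALUE;
        out[2 * i]     = (byte) (mixed & 0xFF);            // low byte first (little endian)
        out[2 * i + 1] = (byte) ((mixed >> 8) & 0xFF);     // high byte
    }
    track.write(out, 0, out.length);
}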

Post

What if you read an aiff file?
And apart from that, in a drum sequencer like the one you're creating, you'll never stream the single (one-shot) samples directly from the file system. Rather, you'll load them into memory and then stream them from there. And since you have to mix them in some way, or maybe pass them through some DSP modules (even simple stuff like adjusting the volume), you have to keep the samples as 16-bit short[] or 32-bit float[] in memory. Therefore you must know what endianness a particular file has when you read it.
OK, I meant with regard to this specific contrived example. But yes, ultimately I will need to be able to do that. I like to take things one step at a time.

As for the sample rate, my Nexus has a native sample rate of 48000, but I don't know if that is the problem. It wouldn't explain the halving of the time it takes for AudioTrack to play the sample, and I highly doubt the AudioTrack is playing at 96 kHz.

Also, my sample's timbre change is not really a pitch change. It might be caused by distortion, which obviously blends nicely into a noise-heavy snare sample.

I am going to create a sine-tone PCM sample, then play it into a mic and check if there is any pitch change. That should be conclusive!

EDIT: Using a sine sample, it was easy to discern that the sample was being played back at half the frequency. The sample is confirmed to be at 44100 Hz.

So this suggests the AudioTrack is playing at half the rate. But if that were the case, the tempo would be slower, not faster. I manually entered 22050 as the sample rate in my bufferTime calculation, and now the tempo is 4 times faster than it should be. I am so confused lol

It seems like:

a. AudioTrack is taking half the time the calculations suggest it should take to play a sample, which suggests it's playing at twice the rate of 44100.

b. The sound being played indicates that AudioTrack is playing at half the rate of 44100. :S
