macOS, Core Audio, AVFoundation
-
- KVRist
- 76 posts since 7 Dec, 2020
Hi folks - can I ask an off-topic question, or is that discouraged? I'm trying to read an output stream from my audio interface in a homegrown swift app for fun, and I've been stuck for a few days and unable to google up any solutions and/or sample code. I thought there might be people reading here with the know-how but wasn't sure it would be welcome. Please delete the post if this is frowned upon.
- u-he
- 30188 posts since 8 Aug, 2002 from Berlin
All good, I just have no idea if anyone at u-he has ever written a line of code in Swift. It sure looks like a great language, but we're pretty much all old school C/C++ with a few peeps sprinkled in who like Python.
-
- KVRist
- 66 posts since 4 Sep, 2011
You can use AVAudioEngine on macOS. Apple's docs: https://developer.apple.com/documentati ... udioengine
Goes a little like this:
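import AVFoundation

let audioEngine = AVAudioEngine()
let input = audioEngine.inputNode
let bus = 0
let inputFormat = input.inputFormat(forBus: bus)
input.installTap(onBus: bus, bufferSize: 512, format: inputFormat) { (buffer, time) in
    // buffer is an AVAudioPCMBuffer, time is an AVAudioTime
    // do something with the data
}
// the tap only receives buffers once the engine is running
try audioEngine.start()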
You can use AVAudioFile to write data to a file if you need to.
-
- KVRist
- Topic Starter
- 76 posts since 7 Dec, 2020
Thanks jooster, Urs. I'll give that a try, jooster.
I've almost got it working with AVCaptureSession() - it's recognizing my audio interface and calling the callback delegate with the audio buffers, but I'm having trouble interpreting/accessing that data. I want to read it as an array of Floats or whatever (so that I can e.g. use it to manipulate textures or draw an oscilloscope), but all the samples I've found somewhat reasonably assume you're going to continue to treat it as audio.
I've made it this far: https://developer.apple.com/documentati ... dataoutput, but I've got some problems. I can't tell if the sampleBufferDelegate needs to clear the received buffer. It seems to work for a few cycles and then stop. The challenges I'm facing on peeking into that buffer are preventing me from theorizing about what the problem is.
Are all USB standard-driver audio interfaces interleaving their channels in the same way? I'm trying to read 16 channels from an Expert Sleepers ES-9 and it would help if I knew whether it was one sample per channel, sequentially, or what.
And would the stream be stored as Floats? I can't find any reference to the word length, making me think it's decided on a system level. (Whereas I can verify from the device properties that the sample rate is 48kHz.)
-
- KVRist
- 66 posts since 4 Sep, 2011
Many of the ADC/DAC chips used these days define an interleaved format. That then gets packaged up by the driver code. But in general Apple's libraries hide all that hardware-specific detail, so just read their docs.
-
- KVRist
- Topic Starter
- 76 posts since 7 Dec, 2020
Making progress. It's 24 bit. AVAudioPCMBuffer has floatChannelData, int16ChannelData and int32ChannelData. Given that none of those are 3-byte, how does that work? (I will continue to dig...)
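On the 3-byte question: the buffer that reaches the app has already been widened to 32 bits per channel (see the BPC=32 dump further down), so app code never sees packed 24-bit samples. Purely as an illustration of what such a widening step involves, here is a minimal sketch. It assumes little-endian, signed, packed 24-bit input; `floatsFromPacked24` is a made-up name, not a system call.

```swift
/// Converts packed, little-endian, signed 24-bit samples to Float in [-1, 1).
/// Trailing bytes that do not form a full 3-byte sample are ignored.
func floatsFromPacked24(_ bytes: [UInt8]) -> [Float] {
    var out: [Float] = []
    out.reserveCapacity(bytes.count / 3)
    var i = 0
    while i + 2 < bytes.count {
        // Put the three bytes into the top 24 bits of a 32-bit word...
        let word = (UInt32(bytes[i]) << 8)
                 | (UInt32(bytes[i + 1]) << 16)
                 | (UInt32(bytes[i + 2]) << 24)
        // ...then an arithmetic right shift sign-extends the 24-bit value.
        let value = Int32(bitPattern: word) >> 8
        out.append(Float(value) / 8_388_608.0) // divide by 2^23
        i += 3
    }
    return out
}
```

For example, the bytes `00 00 40` (0x400000) come out as 0.5 and `00 00 80` as -1.0.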
Hmm, I see how it's interleaved for 24bit DVD audio...
Okay, I'm getting this from the audio device:
'soun'/'lpcm'SR=48000, FF=12, BPP=48, FPP=1, BPF=48, CH=16, BPC=24
and this from the buffer:
'soun'/'lpcm'SR=48000, FF=0x29, BPP=4, FPP=1, BPF=4, CH=16, BPC=32
(with the abbreviations spelled out, so that I know that's "format flags, bytes per packet, frames per packet, bytes per frame, channels per frame, and bits per channel")
So the audio device's parameters make sense to me - 3 bytes per channel times 16 channels is 48 bytes per frame. And I expected they might convert 24 bit to 32 bit so I'm not surprised to see bits per channel be 32 in the buffer ... but why isn't the bytes per frame then 64? Is it only a single channel per frame?
Okay, I'm off to look up these flags, that might help.
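For reference, the two flag values above can be decoded bit by bit. This is a small sketch: the bit positions mirror the kAudioFormatFlag... constants from CoreAudioTypes.h, but `FormatFlags` and `describe` are names invented here so the snippet runs without importing CoreAudio.

```swift
// Bit values mirror the kAudioFormatFlag... constants in CoreAudioTypes.h,
// redeclared locally (hypothetical type name) to keep the snippet standalone.
struct FormatFlags: OptionSet {
    let rawValue: UInt32
    static let isFloat          = FormatFlags(rawValue: 1 << 0)
    static let isBigEndian      = FormatFlags(rawValue: 1 << 1)
    static let isSignedInteger  = FormatFlags(rawValue: 1 << 2)
    static let isPacked         = FormatFlags(rawValue: 1 << 3)
    static let isAlignedHigh    = FormatFlags(rawValue: 1 << 4)
    static let isNonInterleaved = FormatFlags(rawValue: 1 << 5)
    static let isNonMixable     = FormatFlags(rawValue: 1 << 6)
}

/// Lists the names of the flags set in a raw format-flags value.
func describe(_ raw: UInt32) -> [String] {
    let flags = FormatFlags(rawValue: raw)
    let names: [(FormatFlags, String)] = [
        (.isFloat, "float"),
        (.isBigEndian, "big-endian"),
        (.isSignedInteger, "signed integer"),
        (.isPacked, "packed"),
        (.isAlignedHigh, "aligned high"),
        (.isNonInterleaved, "non-interleaved"),
        (.isNonMixable, "non-mixable"),
    ]
    return names.filter { flags.contains($0.0) }.map { $0.1 }
}

let deviceFlags = describe(12)   // ["signed integer", "packed"]
let bufferFlags = describe(0x29) // ["float", "packed", "non-interleaved"]
```

The non-interleaved bit is what accounts for BPF=4: when it is set, the per-frame fields describe a single channel, so 4 bytes per frame is one 32-bit float, and each of the 16 channels arrives as its own run of samples rather than as part of a 64-byte interleaved frame.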
-
- KVRist
- Topic Starter
- 76 posts since 7 Dec, 2020
Ah, I think I got your method working, jooster. That was a bit tricky - it's delivering the channels in contiguous blocks (so when the buffer is 4800 frames, it's returning 0-4799 ch1, 4800-9599 ch2, etc.). It took me a minute to figure that out.
I'm convinced the AVFoundation method I was trying above is borked, it would change formats after a certain number of calls...
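The contiguous-block layout described above (all of ch1, then all of ch2, and so on) can be split into per-channel arrays along these lines. This is a sketch on a plain `[Float]` so it runs anywhere; `splitPlanar` is a hypothetical helper, and with a real AVAudioPCMBuffer the `floatChannelData` property mentioned earlier gives one pointer per channel instead.

```swift
/// Splits a planar (channel-after-channel) sample array into one array per channel.
/// Assumes every channel holds the same number of frames.
func splitPlanar(_ samples: [Float], channelCount: Int) -> [[Float]] {
    precondition(channelCount > 0 && samples.count % channelCount == 0,
                 "sample count must be a multiple of the channel count")
    let frames = samples.count / channelCount
    return (0..<channelCount).map { channel in
        let start = channel * frames
        return Array(samples[start..<(start + frames)])
    }
}

// Two channels of three frames each, stored as ch1 ch1 ch1 ch2 ch2 ch2.
let channels = splitPlanar([1, 2, 3, 4, 5, 6], channelCount: 2)
// channels == [[1, 2, 3], [4, 5, 6]]
```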
-
- KVRist
- 66 posts since 4 Sep, 2011
There are lots of projects on GitHub that work with audio. This link takes you to an extension for AVAudioPCMBuffer you might like: https://github.com/AudioKit/AudioKit/bl ... sing.swift. The function "toFloatChannelData()" seems handy and shows how the audio data is unwrapped.
-
- KVRist
- Topic Starter
- 76 posts since 7 Dec, 2020
I think I'm going to adapt this code, actually:
https://developer.apple.com/documentati ... audio-taps
The AVAudioEngine method was producing audible glitches when starting and stopping, and the relationship between buffer size and frequency of delivery seemed obfuscated and difficult to parse. This code doesn't produce glitches, I'm successfully peeking into the buffers and seeing the values I'm expecting, and it's delivering buffers slightly faster than the 60fps I'm aiming to draw at, so this seems like the way. Tonight I'll start trying to strip away everything in the demo code that is superfluous to my purposes. I'm having fun!
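On buffer size versus delivery rate: the bufferSize passed to installTap is documented as a request that the implementation may not honor, which may be part of why the relationship looked opaque. The arithmetic itself is simple, as this sketch shows (the function names are made up for illustration):

```swift
/// How many buffers arrive per second for a given buffer length.
func buffersPerSecond(sampleRate: Double, framesPerBuffer: Int) -> Double {
    return sampleRate / Double(framesPerBuffer)
}

/// How many new frames accumulate between two draws.
func framesPerDraw(sampleRate: Double, drawsPerSecond: Double) -> Int {
    return Int((sampleRate / drawsPerSecond).rounded())
}

let rate = buffersPerSecond(sampleRate: 48_000, framesPerBuffer: 512) // 93.75
let perDraw = framesPerDraw(sampleRate: 48_000, drawsPerSecond: 60)   // 800
```

So at 48 kHz, any buffer shorter than 800 frames is delivered more often than a 60 fps draw loop runs, and each draw has roughly 800 new frames to work with.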
-
- KVRist
- Topic Starter
- 76 posts since 7 Dec, 2020
Okay, oscilloscope implementation question! I've got a naive (i.e. draw X most recent samples) oscilloscope working, and of course it looks a bit messy. And what I've been able to find is I need some edge detection and maybe a variable threshold or at least zero-crossing to shift the window around and hopefully bring subsequent draws into something like on-screen phase coherence, but when I look at the oscilloscopes that I have in various plug-ins, I can't see that they're definitely doing that. Zero-crossings or what have you don't seem to be in the same on-screen location (I had expected to maybe always see a rising-from-zero aligned with the right edge of the draw area, or something like that). If anyone has any tips for me, that would be amazing.
(The background is some signed distance field geometry parameterized by various eurorack signals. Because this is a single frame it of course does not demonstrate the issue in question, but jooster suggested I share, so here's some WIP...)
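The zero-crossing idea from the question could be sketched like this: search backwards for the most recent rising crossing that still leaves a full window of samples after it, and start drawing there. This is only one simple trigger scheme, not a claim about how any particular plug-in does it; `triggerIndex` and its parameters are hypothetical.

```swift
/// Returns the index of the latest rising crossing of `level` that still has
/// `windowLength` samples available from that index onward, or nil if none exists.
func triggerIndex(in samples: [Float], windowLength: Int, level: Float = 0) -> Int? {
    let lastStart = samples.count - windowLength
    guard windowLength > 0, lastStart >= 1 else { return nil }
    var i = lastStart
    while i >= 1 {
        // Rising crossing: previous sample below the level, this one at or above it.
        if samples[i - 1] < level && samples[i] >= level {
            return i
        }
        i -= 1
    }
    return nil
}

let recent: [Float] = [-1, 1, -1, 1, -1, 1, 1, 1]
let start = triggerIndex(in: recent, windowLength: 3) // 5
```

Drawing `samples[start..<start + windowLength]` then puts a rising crossing at the left edge of every frame. Raising `level` above zero is a crude way to skip crossings caused by low-level noise.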
- u-he
- 30188 posts since 8 Aug, 2002 from Berlin
I'm quite happy using autocorrelation. You basically try to guess the pitch of the incoming signal, and then you move your read pointer for the next update only in multiples of the cycle length (in samples) of the detected pitch. So if your pitch is 440 Hz and your sample rate is 44.1 kHz, your cycle length is about 100 samples (44100 / 440 ≈ 100.2). Hopefully you have buffered a good number of input samples, so you can move your read position as close to "now" as it gets by increasing it in steps of 100 samples.
Autocorrelation often gets the wrong octave, but that does not matter for an oscilloscope.
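A minimal sketch of that approach, assuming a brute-force autocorrelation over a small lag range is fast enough for an oscilloscope (a real implementation might use an FFT or Accelerate instead). `estimatePeriod` and `alignedReadStart` are hypothetical names.

```swift
/// Estimates the period in samples as the lag in minLag...maxLag with the
/// highest average product x[i] * x[i + lag]. Returns nil if no lag scores above zero.
func estimatePeriod(_ x: [Float], minLag: Int, maxLag: Int) -> Int? {
    guard minLag >= 1, maxLag >= minLag, x.count > maxLag else { return nil }
    var bestLag: Int? = nil
    var bestScore: Float = 0
    for lag in minLag...maxLag {
        let n = x.count - lag
        var sum: Float = 0
        for i in 0..<n {
            sum += x[i] * x[i + lag]
        }
        // Normalize by the overlap length so long lags are not penalized.
        let score = sum / Float(n)
        if score > bestScore {
            bestScore = score
            bestLag = lag
        }
    }
    return bestLag
}

/// Advances the read position as close to `newest` as possible while moving
/// only in whole periods, so successive draws stay phase-aligned.
func alignedReadStart(previous: Int, newest: Int, period: Int) -> Int {
    guard period > 0, newest > previous else { return previous }
    return previous + ((newest - previous) / period) * period
}

let next = alignedReadStart(previous: 0, newest: 1234, period: 100) // 1200
```

Because the read position only ever moves by whole multiples of the estimated period, landing on twice or half the true period still produces a stable picture in most cases, which fits the point about octave errors.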
