Don't sync your threads; your audio thread can never wait for a synchronization object anyway.
What you want to do is use wait-free structures (e.g. message queues) to send the data around.
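As a rough illustration of the kind of wait-free message queue meant here, below is a minimal single-producer / single-consumer ring buffer sketch. The class name `SpscQueue`, its methods, and the power-of-two capacity requirement are all made up for this example and are not from any particular library. It assumes exactly one thread pushes (audio) and exactly one thread pops (GUI).

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical SPSC queue: one producer thread, one consumer thread.
// Neither push() nor pop() ever blocks or spins.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer (audio thread) only. Returns false if the queue is full,
    // in which case the caller simply drops the message.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        slots_[head & (Capacity - 1)] = value;
        // Release: the slot write above becomes visible before the new head.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer (GUI thread) only. Returns false if the queue is empty.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;
        out = slots_[tail & (Capacity - 1)];
        // Release: the slot read above completes before the slot is freed.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // total number of pushes so far
    std::atomic<std::size_t> tail_{0};  // total number of pops so far
};
```

The audio thread would call `push()` once per block and carry on regardless of the result; the paint code would call `pop()` in a loop until it returns false, handling however many messages have piled up since the last repaint.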
Code:
// in process
volatile size_t* ptr_ring_csr_proc = &(disp->ring_csr_proc); // pointer to volatile value
volatile size_t* ptr_flag_copy = &(disp->flag_copy);
const size_t local_ring_csr = *ptr_ring_csr_proc; // hopefully this loads the volatile value as a local copy?
// barriers here??
*ptr_flag_copy = 1; // pointer to volatile value
... memcpy using local_ring_csr ...
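// advance the cursor with a single store; *ptr_ring_csr_proc++ would move the pointer, not the value
*ptr_ring_csr_proc = (local_ring_csr + 1) % RING_SIZE;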
*ptr_flag_copy = 0;
// in paint
volatile size_t* ptr_flag_copy = &(disp->flag_copy);
// this has to be guaranteed to execute at only this position in the function
COMPILER_BARRIER();
while(*ptr_flag_copy); // blocking loop, flag set in run_processes
COMPILER_BARRIER();
// then make access to memory with these cursors
volatile size_t* ptr_ring_csr_proc = &(disp->ring_csr_proc);
volatile size_t* ptr_ring_csr_disp = &(disp->ring_csr_disp);
const size_t proc = *ptr_ring_csr_proc;
const size_t disp = *ptr_ring_csr_disp;
camsr wrote: Mon Sep 16, 2019 8:21 pm
There seems to be a contention no matter what I try. To counter it I made the copy operation as localized as possible, and set a flag that indicates to the paint thread to spin while it's being copied. But it's still not 100% guaranteed to grab the correct values... (because of the spin flag not being synchronized with anything? The copy op in process is not blocked.)
First, using "volatile" on modern compilers (for anything other than MMIO access) is asking for trouble, since it almost certainly won't do what you want. What you really want are "compiler fences", which on x86 don't emit any instructions but tell the compiler's optimiser that you really need a particular memory ordering.
BertKoor wrote: Tue Sep 17, 2019 7:57 am
Fact 1: your audio thread is continuously delivering audio packages at a rate of sample_rate / buffer_size, say 48000 / 64 = 750 "frames" per second.
Fact 2: your GUI thread will update the screen maybe 60 or 100 times per second.
Derived from that: for each GUI update the audio thread has produced 7 or 8 (maybe some dozens, not hundreds) data packages that it might have to reflect on the screen. So the GUI has a similar latency to the audio itself.
Question for you: is it really problematic for the GUI thread to process 7 or 8 queued data packages delivered by the audio thread? Or did I not understand the underlying problem?
Not at all. The problem is how to synchronize all the events in a meaningful way.
I think it's still mandatory to use std::atomic to get the necessary behavior without resorting to compiler-dependent mechanisms (_ReadWriteBarrier is actually deprecated). Atomic stores with the default memory order (std::memory_order_seq_cst) may be costly on x64, but acquire and release semantics are free there: an atomic load with std::memory_order_acquire and an atomic store with std::memory_order_release compile to ordinary moves.
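To make the acquire/release suggestion concrete, here is a small sketch of publishing a ring cursor with std::atomic instead of volatile plus barriers. The names `DisplayRing`, `publish_block`, `published_blocks`, `kRingSize` and `kBlockFrames` are invented for illustration and do not come from the code earlier in the thread. It assumes a single audio thread writing and a single GUI thread reading.

```cpp
#include <atomic>
#include <cstddef>
#include <cstring>

constexpr std::size_t kRingSize = 8;      // hypothetical number of blocks in the ring
constexpr std::size_t kBlockFrames = 64;  // hypothetical block length in frames

struct DisplayRing {
    float blocks[kRingSize][kBlockFrames] = {};
    std::atomic<std::size_t> write_cursor{0};  // number of blocks published so far
};

// Audio thread: copy the block first, then publish it with a release store.
// Only the audio thread writes write_cursor, so its own load can be relaxed.
inline void publish_block(DisplayRing& ring, const float* in) {
    const std::size_t w = ring.write_cursor.load(std::memory_order_relaxed);
    std::memcpy(ring.blocks[w % kRingSize], in, sizeof(float) * kBlockFrames);
    ring.write_cursor.store(w + 1, std::memory_order_release);
}

// GUI thread: the acquire load pairs with the release store above, so every
// block with an index below the returned count has been fully written.
inline std::size_t published_blocks(const DisplayRing& ring) {
    return ring.write_cursor.load(std::memory_order_acquire);
}
```

Note that this sketch only orders the copy against the cursor update. It does not stop the audio thread from lapping a GUI thread that falls more than kRingSize blocks behind; handling that would need a read cursor as well, so the writer can detect a full ring and drop data instead of overwriting it.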
Code:
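void process(float** in, float** out, VstInt32 frames)
{
    // copy the input into a buffer owned by the plugin
    memcpy(a_plugin_alloced_buffer, in[0], sizeof(float)*frames);
    // repoint the first output channel at the plugin's buffer
    out[0] = a_plugin_alloced_buffer;
}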