KVR Audio

Touch The Universe · Post by **Touch The Universe** » Mon May 25, 2026 6:33 pm

[Attached image: 95b649c0-df60-43c6-965d-4fcb8ca57471 (1).png]

Also just noticing that there's code out there that can actually help me get better CPU performance. So that's crazy.

5. Madronalib (SIMD-Accelerated Audio Math)
The Repository: madrona-labs / madronalib
Why it has huge stars: Maintained by Madrona Labs (creators of high-end synths like Aalto), this library provides a highly modern C++ framework optimized specifically for running math on SIMD processors (Vector processing). It contains highly tuned filters and delay smoothers.
Best for your synth: Polishing your raw oscillator processing code so you can run twice as many voices without CPU spikes.

Touch The Universe · Post by **Touch The Universe** » Mon May 25, 2026 7:14 pm

Shocked how good the modal synthesis sounds.

Touch The Universe · Post by **Touch The Universe** » Tue May 26, 2026 1:36 am

[Attached image: Screenshot 2026-05-25 at 9.32.36 PM.jpg]

Just spent a crazy amount of time getting the Cocoa Delay implemented and sounding identical to the HTML prototype.

This is about half the passes. GPT only thought to add the real MIT-licensed source code after 3-4 passes, after first attempting to add facsimiles.

That was only the beginning of the troubles. The HTML knobs' parameter values were different from the implementation, so it was hard to get a good A/B. I don't even know what finally got it to sound right, but it finally does.

Touch The Universe · Post by **Touch The Universe** » Tue May 26, 2026 3:47 am

CPU IS crazy high and I don't have a clue what I'm doing.

tldr - what do i need to know to have a CPU-efficient synth: the routing, bus, architecture layout, etc.

A Surge osc, 1 poly, no filter, envelopes, is like 5% CPU on my M1 Tahoe, same thing 25% on M1. Playing 5 poly unison 8 brings it down at a 4096-sample buffer. Clearly, I haven't a clue on these basic fundamental things.

What follows is a shareable architecture snapshot of the synth as it currently appears to be structured, plus a diagnosis of what is solid, what is fragile, and what other agents should look at.

1) High-level architecture map
A. Big-picture signal flow
MIDI IN / Preview MIDI / HTML-MIDI mirror
│
▼
Voice allocation / polyphony / mono-legato-glide
│
▼
Per-voice sound generation
├─ OSC 1 / OSC 2 / OSC 3
├─ Vital oscillator
├─ Attack 2 sample source
├─ Granular / Sampler paths
├─ Noise source
├─ Thorn / Monk / other experimental engines
└─ Per-voice modulation state (LFOs, vibrato, envelopes, portamento)
│
▼
Per-voice routing branches
├─ Main voice path
├─ Vital bypass path → oscVitalBypassBuffer
├─ Attack 2 bypass path → attack2RouteBuffer
├─ Sampler/granular side routes
└─ Other per-engine special routes
│
▼
Per-voice envelope / gain stage
├─ Amp 1
├─ Amp 2
├─ Blend Amp1/Amp2
└─ Amp2 Gate mode
│
▼
Per-voice filtering / tone shaping
├─ Main Filter 1 / Dual / bypass logic
├─ Filter envelope
├─ Special filter engines
└─ Some post-source coloring blocks
│
▼
Per-voice mixdown
├─ Main filtered voice mix
├─ Re-injected Vital bypass signal
├─ Re-injected Attack 2 bypass signal
└─ Other branch returns
│
▼
Global / post-voice stage
├─ Delay Reference Player injection (non-MIDI audition source)
├─ Delay engine(s)
├─ Reverb / Verb Lab / Nebula / other FX
├─ Distortion / disperser / etc.
├─ Master clipper / bus clipper / limiter
└─ Output trim
│
▼
Audio Out

2) The important routing reality

The synth is not one clean linear chain.

It is more like:

VOICE SOURCES
│
├─► normal voice chain ─► amp/filter/mix
│
├─► special bypass branch A (Vital)
│ └─► post-filter re-entry
│
├─► special bypass branch B (Attack 2)
│ └─► post-filter re-entry
│
├─► sampler / granular / engine-specific paths
│
└─► global audition/ref path (Delay Ref)
└─► injected before delay DSP

That means the synth is currently a branch-heavy architecture, not a strictly centralized bus design.

That is the core reason routing bugs have been happening.

3) The current amp / gate structure

This is one of the most important parts for outside review.

Current intended logic

There are two amp envelopes:

Amp 1 = original/main amp envelope
Amp 2 = alternate amp envelope for routing/CPU/behavior tests

And the current route modes include:

Off / Main Amp 1
Replace Amp 1
Blend Amp1/Amp2
Amp2 Gate
Delay Ref Test
Future Matrix

What the codebase has been trying to do

The recent passes were trying to force all voice-owned branches to obey the same selected amp gate, so Amp 1 would not secretly still gate audio when Amp 2 was selected.

That means the intended architecture now looks like this:

voice output from source
│
▼
selectedAmpGain
├─ Amp 1 only
├─ Amp 2 only
├─ Blend
└─ Gate mode
│
▼
voice continues through filter / mix / FX
Why it was failing before

Previously, some side routes were still consulting Amp 1 behavior even when Amp 2 was selected.

The big offenders were things like:

Vital bypass route
Attack 2 route
sampler/filter-bypass routes
voice lifetime / “is this voice still active?” logic

That is why you were seeing:

tails getting chopped,
gate behavior changing when only Amp 1 release changed,
inconsistent behavior between normal voice path and special branches.

4) The Vital routing situation

This is also a big one.

What existed

Vital had a special bypass path, stored in:

oscVitalBypassBuffer

That path was intended to allow Vital to bypass certain filter coloration.

Why it became confusing

Because once a source gets a special bypass lane, it is easy for it to:

avoid the main filter path,
avoid the selected amp path,
or re-enter later in the chain in a way that does not match the other engines.

Current status

The latest cleanup pass was meant to remove the weird hidden behavior where turning Filter 1 on for Vital effectively routed it back through amplitude behavior in a way that was not transparent.

So the current intended rule is:

Vital normal route → common voice path
Vital bypass route → post-filter injection only
No hidden “Filter 1 secretly changes amp behavior” side-door

That was the right move.

5) Delay / reverb tail behavior

This was a major structural discovery.

What was happening

There was a global “no voices active” early return behavior.

That means if the synth decided there were no active voices, it could stop processing too early.

That is bad because:

delay tails may still be ringing,
reverb tails may still be ringing,
but the processor would think “no voices active = safe to stop.”

What the AW pass added

A keepalive variable:

fxTailKeepAlive

This is effectively a little memory of recent post-FX output level.

So the current intended logic is closer to:

If no MIDI voices are active:
do NOT immediately stop processing
keep processing while FX tail energy is still audible
then return to silence-saving mode after tail decays

That is a good patch.
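
A keepalive of this kind could look roughly like the sketch below. Only the name `fxTailKeepAlive` comes from the report; the struct layout, the per-block decay factor and the silence threshold are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical sketch: a decaying memory of recent post-FX output level.
struct FxTailKeepAlive {
    float level = 0.0f;          // remembered peak of recent post-FX output
    float decayPerBlock = 0.9f;  // assumed decay applied once per block
    float silence = 1.0e-4f;     // assumed "inaudible" threshold (about -80 dB)

    // Call once per block with the post-FX output buffer.
    void update(const float* out, std::size_t n) {
        float peak = 0.0f;
        for (std::size_t i = 0; i < n; ++i)
            peak = std::max(peak, std::fabs(out[i]));
        level = std::max(peak, level * decayPerBlock);
    }

    bool tailAudible() const { return level > silence; }
};

// The early-return test: skip the block only when nothing can still sound.
inline bool canSkipBlock(int activeVoices, const FxTailKeepAlive& tail) {
    return activeVoices == 0 && !tail.tailAudible();
}
```

The important property is that the early return depends on both conditions, so a ringing delay or reverb keeps the processor running after the last voice has ended.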

Important note

The host-reported tail length is still very short in the code:

getTailLengthSeconds() const { return 0.5; }

That may not reflect the real synth behavior if you have longer delay/reverb tails.

That is something worth flagging to other agents.
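
As a rough guide to what a more honest tail length might be, the decay time of a simple feedback delay can be estimated from its delay time and feedback amount. The helper below is a sketch under that assumption (a single feedback delay, with a tail defined as the echoes falling 60 dB below the first one); `delayTailSeconds` and its parameters are illustrative names, and a real reverb would need its own estimate.

```cpp
#include <cmath>
#include <limits>

// Seconds until a feedback delay's echoes fall 60 dB below the first echo.
// Echo k arrives at k * delaySeconds with amplitude feedback^(k-1).
inline double delayTailSeconds(double delaySeconds, double feedback) {
    if (feedback <= 0.0) return delaySeconds;  // a single echo only
    if (feedback >= 1.0)                       // echoes never decay
        return std::numeric_limits<double>::infinity();
    const double repeats = std::ceil(std::log(0.001) / std::log(feedback));
    return delaySeconds * (1.0 + repeats);
}
```

For example, a 0.5 s delay at 50% feedback needs 10 further repeats to drop 60 dB, which gives 5.5 s, far longer than the 0.5 s currently reported to the host.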

6) Delay Lab architecture

This part is especially important because it is now a hybrid.

There are really three delay layers in play
A. Legacy/core delay engine

This appears to be the stable existing delay engine with types like:

Digital
PingPong
Dual
Dub
Tape-ish
Reverse Delay
Reverse Swell

B. Delay Lab model selector

This is the newer front-end model concept:

Clean
Gesture Echo
Motion
Ring/Flange
Wet Delay
Cocoa Delay

C. Imported/source-inspired module layer

There are dedicated imported module concepts like:

Wet Delay
Cocoa Delay
later other keepers

Structural risk here

This means the synth is not just “a delay.”

It is a system with:

a legacy delay engine,
a lab model layer,
and imported module sockets.

That is powerful, but also easy to make messy if not separated.

The right principle

Only one delay architecture should be active at a time for a given path.

You do not want:

legacy delay core running,
plus imported module delay running,
plus a lab model reinterpretation,
all at once unless explicitly designed.

That is exactly the kind of overlap that causes “model bleed.”
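
One way to make "only one delay architecture active at a time" explicit is a single selector that owns the dispatch. The sketch below is hypothetical: `DelayOwner` and `DelayRouter` are invented names, and the three engines are stand-ins that only apply a gain.

```cpp
#include <cstddef>

// Hypothetical names: the three owners mirror the layers listed above.
enum class DelayOwner { LegacyCore, LabModel, ImportedModule };

struct DelayRouter {
    DelayOwner owner = DelayOwner::LegacyCore;
    int legacyBlocks = 0, labBlocks = 0, importedBlocks = 0;  // demo counters

    // Exactly one engine touches the buffer per block.
    void process(float* buf, std::size_t n) {
        switch (owner) {
            case DelayOwner::LegacyCore:     ++legacyBlocks;   runLegacy(buf, n);   break;
            case DelayOwner::LabModel:       ++labBlocks;      runLab(buf, n);      break;
            case DelayOwner::ImportedModule: ++importedBlocks; runImported(buf, n); break;
        }
    }

private:
    // Stand-ins for the real engines: each just applies a different gain.
    static void scale(float* b, std::size_t n, float g) {
        for (std::size_t i = 0; i < n; ++i) b[i] *= g;
    }
    void runLegacy(float* b, std::size_t n)   { scale(b, n, 0.5f); }
    void runLab(float* b, std::size_t n)      { scale(b, n, 0.25f); }
    void runImported(float* b, std::size_t n) { scale(b, n, 0.125f); }
};
```

Because the switch has no fall-through and no shared code path, two engines cannot process the same block unless someone deliberately adds a mode for that.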

7) The Delay Reference Player path

This is actually a very useful feature.

There is a dedicated non-MIDI reference audio player for Delay Lab.

It is a separate audio source injected just before the delay DSP so you can audition delay behavior without needing to hold a MIDI note.

That means this path looks like:

Imported reference audio
│
▼
Delay Reference Player
│
▼
Injected pre-delay
│
▼
Delay Lab / delay module

That is good for A/B testing, but it is also another parallel audio route that needs clear ownership.

8) Why CPU is probably rising while tweaking

This is one of the most important diagnosis points.

Based on the editor code, this does not look like only a DSP issue.

It strongly looks like a significant part of the problem is editor-side activity.

What I found in the editor code

There are approximately:

11 timer callbacks
12 startTimerHz(...) usages
214 repaint() calls in the editor source

And a lot of components repaint directly on:

slider changes,
timer callbacks,
page changes,
visual refresh events.

What that suggests

The CPU rise you described — where it starts okay, then grows while tweaking, then drops after closing/reopening — is very consistent with:

repaint churn,
timer-driven UI work,
parameter listener activity,
multiple animated components refreshing too often.

Not necessarily a classic DSP runaway.

In plain English

The likely issue is:

Audio side = some complexity, yes
Editor side = probably too chatty / too many repaints / too much live visual work

That is why reopening the plugin window helps.
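
If repaint churn is part of the problem, one common remedy is to let components mark themselves dirty and have a single timer drain the flags once per frame. The class below is a framework-free sketch of that idea; `RepaintScheduler`, `markDirty` and `takeDirty` are hypothetical names, and the limit of 64 components per scheduler is an arbitrary simplification.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical central repaint scheduler: callbacks set a bit,
// one UI timer tick collects all bits and repaints each component once.
class RepaintScheduler {
public:
    // Called from slider/listener callbacks instead of repainting directly.
    void markDirty(unsigned componentId) {
        dirty_.fetch_or(std::uint64_t{1} << (componentId & 63u));
    }

    // Called from one UI timer. Returns the bitmask of components
    // that need a repaint and clears it in the same step.
    std::uint64_t takeDirty() { return dirty_.exchange(0); }

private:
    std::atomic<std::uint64_t> dirty_{0};
};
```

In a JUCE editor the draining side would presumably be one `juce::Timer` callback that calls `repaint()` on each flagged component, so a hundred slider callbacks within one frame cost one repaint instead of a hundred.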

9) Current strengths of the architecture

These are the things that are actually good:

Good things already present
MIDI sample offsets are preserved properly.
There is a real per-voice structure.
There is explicit polyphony / mono / legato / glide logic.
There are multiple experimental engines and sources already integrated.
Delay Lab has a genuinely useful reference-audio audition path.
Amp 2 testing has already forced useful routing cleanup.
FX tail keepalive is the correct direction.
There is already some effort to isolate imported delay models from legacy delay behavior.

10) Current weaknesses / fragilities

These are the things I would tell other agents very plainly.

A. The architecture is monolithic

The synth is still largely living inside a very large processBlock / processor file.

That makes it harder to reason about routing and easier for special-case branches to sneak in.

B. There are too many side paths

Instead of a fully explicit bus system, there are several “escape routes” and reinjection points.

That makes bugs likely.

C. GUI and DSP concerns are too intertwined

There are enough timers and repaints that the UI can likely distort CPU diagnosis.

D. Delay Lab is conceptually powerful but structurally easy to tangle

Legacy delay + model layer + imported sockets = must be made explicit.

E. Host tail reporting is probably too short

Returning 0.5s tail length may be misleading for long delay/reverb behavior.

F. Routing intent is not fully documented in the codebase

The synth needs a formal routing sheet / bus map.

11) What the synth is doing now vs. what it is not doing
What it is doing now
Branch-heavy manual signal routing
Per-voice synth generation
Special-case per-engine bypass handling
Mixed dry/wet FX processing
Reference-audio pre-delay injection
Experimental imported delay modules
APVTS-driven parameterization
Animated GUI with many repaint sources

What it is not yet doing cleanly
A formal graph-based audio architecture
A single explicit routing matrix shared by all engines
A strict “one source enters one common voice chain” rule
A clearly separated pre-filter bus / post-filter bus / FX send bus / FX return bus architecture
A centralized UI repaint scheduler with a strict budget
A formal profiling layer that clearly distinguishes UI CPU from DSP CPU

12) Recommended redesign target

If I were briefing other agents, I would say the medium-term goal should be this:

Desired cleaner architecture

[INPUT LAYER]
MIDI / Preview MIDI / Ref Audio

[VOICE LAYER]
Voice Allocation
└─ Source Modules
   ├─ OSC engines
   ├─ Sample engines
   ├─ Granular
   └─ Experimental engines

[VOICE BUS LAYER]
Pre-Amp Bus
└─ Selected Amp Stage (Amp1/Amp2/Blend/Gate)

[VOICE TONE LAYER]
Pre-Filter Bus
└─ Filter Stage
Post-Filter Bus

[SPECIAL RETURN LAYER]
Bypass Return Bus
├─ Vital bypass return
├─ Attack2 bypass return
└─ other post-filter returns

[MIX LAYER]
Voice Sum Bus
Dry Bus
FX Send Bus

[FX LAYER]
Delay Bus
Reverb Bus
Other FX Bus

[MASTER LAYER]
Master processing
Clip/Limit/Trim
Output

That would make the synth much easier to debug and explain.
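
To make the bus idea concrete, here is a small sketch of the voice-level part of that layout. Everything in it is an assumption for illustration: the fixed block size, the `Bus` alias, the `VoiceBuses` struct and the one-pole low-pass standing in for the real filter stage.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kBlock = 64;  // assumed fixed block size for the sketch
using Bus = std::array<float, kBlock>;

// Hypothetical per-voice buses matching the layers in the diagram.
struct VoiceBuses {
    Bus preFilter{};     // sources after the selected amp stage
    Bus postFilter{};    // output of the filter stage
    Bus bypassReturn{};  // Vital / Attack 2 style post-filter returns
};

inline void addTo(Bus& dst, const Bus& src, float gain = 1.0f) {
    for (std::size_t i = 0; i < kBlock; ++i) dst[i] += gain * src[i];
}

// Placeholder filter stage: a one-pole low-pass with state z.
inline void filterStage(const Bus& in, Bus& out, float& z, float coeff) {
    for (std::size_t i = 0; i < kBlock; ++i) {
        z += coeff * (in[i] - z);
        out[i] = z;
    }
}

// Every voice leaves through exactly this function: the filtered path
// and the bypass returns meet at one documented point.
inline void sumVoice(const VoiceBuses& v, Bus& voiceSum) {
    addTo(voiceSum, v.postFilter);
    addTo(voiceSum, v.bypassReturn);
}
```

With this shape, a source that wants to skip the filter writes to `bypassReturn` rather than inventing its own re-entry point, which is the property the redesign is after.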

13) Questions to give other agents

If you share this with other people or other agents, these are good questions to ask:

Should this synth be refactored into explicit buses instead of ad hoc branch reinjection?
Should Amp 1 / Amp 2 become one unified “amp stage” abstraction rather than two separate envelope objects with route modes?
Should Vital bypass, Attack2 bypass, and sampler bypass all become formal post-filter return buses?
Should Delay Lab be separated into a self-contained module host rather than sharing the legacy delay path?
Should the editor move to a central repaint scheduler instead of many per-component timers?
Should getTailLengthSeconds() reflect real FX tail capability?
Should the synth be split into subsystem files: voices, oscillators, routing, delay lab, editor visuals, etc.?

14) Copy/paste brief for other agents

Here is a short version you can paste directly:

This JUCE synth currently uses a branch-heavy monolithic architecture rather than a formal bus graph. It has multiple per-voice source engines (OSC1/2/3, Vital, sample/granular, etc.) plus special bypass routes such as Vital bypass and Attack2 bypass that re-enter later in the chain. Amp 2 testing exposed that not all branches were obeying the same amp gate logic; recent fixes tried to unify this. Delay/reverb tails were previously being chopped by a no-voices-active early return, now patched with an FX tail keepalive. Delay Lab currently sits on top of a legacy delay architecture plus imported module concepts, so there is risk of overlap/model bleed if not isolated. CPU spikes seem likely to be substantially editor-side: the GUI contains many timers and a high number of repaint triggers, and CPU drops after closing/reopening the editor. Main evaluation requests: recommend a cleaner routing/bus design, evaluate amp/filter/bypass architecture, suggest UI repaint/timer refactor, and advise how to isolate imported delay modules from the legacy delay core.

15) My bottom-line diagnosis

If I had to summarize the synth in one paragraph:

The synth is powerful and already has many serious systems in place, but the architecture is currently “feature-grown” rather than “bus-planned.” The biggest structural issue is not that the synth lacks capability, but that several engines and special routes bypass or re-enter the common signal path in different ways. That made amp, filter, and tail behavior inconsistent. The recent fixes are moving in the right direction, but the next major win would come from a routing refactor and a UI repaint/timer cleanup rather than adding more features.

Touch The Universe · Post by **Touch The Universe** » Tue May 26, 2026 5:53 am

Alright, think i have my head around what i need to do for the routing, layout, structure, rules and clear audio signal paths. Will likely make a blank synth and transfer over element by element the synth into a clear audio path structure. I can test cpu cost per module, voice, or anything else, to identify what is working or poorly thought out, so i can find ways to minimize the cpu footprint.

Touch The Universe · Post by **Touch The Universe** » Tue May 26, 2026 5:54 am

rafa1981 · Post by **rafa1981** » Tue May 26, 2026 7:23 am

Have you measured CPU with the UI closed?

Tiles · Post by **Tiles** » Tue May 26, 2026 8:24 am

Welcome to where the fun begins. Refactoring ^^

Touch The Universe · Post by **Touch The Universe** » Wed May 27, 2026 4:19 am

So besides having a clear audio path for each voice element and refactoring around that structure, are there any other worthwhile tricks I can employ? I came across free open-source code on GitHub focused on minimizing CPU usage for synths, and it kind of blew my mind for a second that I can study or integrate open-source techniques to make my synth more CPU efficient.

like below....

1. Vector Processing via SIMD (Single Instruction, Multiple Data)

A standard C++ loop calculates your audio samples one by one. If your DAW sends a block of 64 audio samples, your computer has to do 64 separate math calculations in a row.

SIMD uses the CPU's wide vector registers (SSE/AVX on x86, NEON on ARM) to apply one instruction to several audio samples at once, instead of one sample per instruction.

The Code Solution: Instead of writing assembly or intrinsics by hand, use an open-source library like madrona-labs/madronalib on GitHub. It provides modern C++ wrappers that run your synthesizer's audio math on the computer's SIMD hardware.

The Result: If you run a polyphonic wavetable synth with 16 voices, utilizing SIMD architecture can easily cut your plugin's total CPU footprint right in half. [1]
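
To show what "SIMD-friendly" code looks like without depending on any particular library, here is a minimal plain C++ sketch. It does not use madronalib's API. It lays four voices side by side (structure of arrays) so the same arithmetic runs on every lane, which is the shape compilers can turn into vector instructions. Whether that actually happens depends on the compiler, flags and target, and `SawBank` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // four floats fit one 128-bit SSE/NEON register

// Structure of arrays: one field, all voices side by side.
struct SawBank {
    std::array<float, kLanes> phase{};  // 0..1 per voice
    std::array<float, kLanes> inc{};    // frequency / sampleRate per voice

    // Produces one output sample per voice and advances every phase.
    std::array<float, kLanes> tick() {
        std::array<float, kLanes> out{};
        for (std::size_t v = 0; v < kLanes; ++v) {  // same operation on each lane
            out[v] = 2.0f * phase[v] - 1.0f;         // naive (aliasing) saw
            const float p = phase[v] + inc[v];
            phase[v] = p - static_cast<float>(p >= 1.0f);  // branch-free wrap
        }
        return out;
    }
};
```

The loop body has no branches and no dependency between lanes, which is the precondition for vectorizing it, whether by the compiler or by a SIMD wrapper type.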

2. Intelligent "Voice Stealing" & Zero-Amplitude Sleep Cycles

The absolute biggest mistake beginner VSTi developers make is letting their synthesizer engine run calculations for notes that aren't even making sound. If a user presses a key and lets go, the sound fades out—but the oscillator code keeps blindly running silent numbers in the background.

The Code Solution: You must write an Active Flag condition directly into your voice tracking framework. Your envelope generator (daisysp::Adsr) must report when it has fully finished its "Release" stage and hit absolute zero volume.
The Result: The moment a voice's volume hits zero, your code completely bypasses that entire block of math for that voice. If a user is only playing a 3-note chord, your CPU only calculates 3 voices instead of calculating all 16 voices all the time. [1]
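
A minimal sketch of the active-flag idea follows. It does not use `daisysp::Adsr`; the envelope here is a hypothetical linear release written just for the example, and `Voice`, `VoicePool` and the `rendered` counter are illustrative names.

```cpp
#include <array>
#include <cstddef>

// Hypothetical minimal voice: a linear release envelope and an active flag.
struct Voice {
    bool active = false;
    bool released = false;
    float level = 0.0f;        // envelope level
    float releaseStep = 0.0f;  // amount subtracted per sample once released

    void noteOn() { active = true; released = false; level = 1.0f; }
    void noteOff(float step) { released = true; releaseStep = step; }

    float render() {
        const float out = level;  // stand-in for oscillator * envelope
        if (released) {
            level -= releaseStep;
            if (level <= 0.0f) { level = 0.0f; active = false; }  // envelope finished
        }
        return out;
    }
};

template <std::size_t N>
struct VoicePool {
    std::array<Voice, N> voices{};
    std::size_t rendered = 0;  // demo counter: voices actually processed

    float renderSample() {
        float sum = 0.0f;
        for (auto& v : voices) {
            if (!v.active) continue;  // an idle voice costs one branch
            sum += v.render();
            ++rendered;
        }
        return sum;
    }
};
```

With 16 voices in the pool and a 3-note chord held, only three `render()` calls run per sample, and a voice drops out of the loop on the sample where its release reaches zero.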

Touch The Universe · Post by **Touch The Universe** » Wed May 27, 2026 4:20 am

rafa1981 wrote: Tue May 26, 2026 7:23 am Have you measured CPU with the UI closed?

Yes, it doesn’t seem to make a difference whether the GUI is open or closed, though I wonder if I should get a better diagnostic tool. I’m also wondering whether I can implement some type of tool inside the synth itself that shows which element is using how much CPU. I think that should be possible.

I believe I tried this before, or maybe I already have some code somewhere in the synth that was supposed to do this, but it either broke or I got sidetracked before finishing it.

Touch The Universe · Post by **Touch The Universe** » Wed May 27, 2026 12:38 pm

Refactoring. I have a new test synth with proper architecture, and 8 chords get me to 9% CPU, 1 note 2%. This is a 10-15x improvement already. Sweet. Hopefully this continues.

Added unison. 8 voices, 8 unison: 15%.

Added a per-voice unison detail mode with filtering, so each unison voice gets its own filter mode. It goes from 15% to 31%, much better. I couldn't do any poly in the last build at all.

This is at a 4096 buffer, however. At 1024 it's quite a bit higher, about 50%. I wonder if that is decent. It's improved, but that still seems quite high to me.

audiojunkie · Post by **audiojunkie** » Wed May 27, 2026 3:01 pm

Touch The Universe wrote: Tue May 26, 2026 5:53 am Alright, think i have my head around what i need to do for the routing, layout, structure, rules and clear audio signal paths. Will likely make a blank synth and transfer over element by element the synth into a clear audio path structure. I can test cpu cost per module, voice, or anything else, to identify what is working or poorly thought out, so i can find ways to minimize the cpu footprint.

It's been interesting watching this. I am learning things about software design, without having to do anything.

Touch The Universe · Post by **Touch The Universe** » Wed May 27, 2026 6:47 pm

This process is not going as smoothly as I hoped. Transferring the project piecemeal, one page tab at a time, only makes three or four knobs functional per pass. Because there are so many menus and knobs (around 30 per page), it requires too much back-and-forth context switching.

Instead, I need to do a full-page transplant to the GUI, maintaining an identical visual layout. Once the DSP code is activated and working in a single pass, I can isolate that code from the larger file into a separate .h header file.

From there, I can break out the FM components, feedback loops, and SVF filters into reusable modules for future oscillators and synths.

Doing a single-pass transplant is the immediate priority. The current method requires 30 passes just to transfer one page, meaning it would be faster to build a new page from scratch. With 30 more pages left to do, instructing the team to manually recreate the GUI layouts is simply not viable.

I worked too hard on the existing GUI system to redo it. If we cannot transplant everything in one go, I might as well refactor directly inside the existing document and embed the view structures there. While that sounds risky, the current piecemeal method is equally problematic.

Touch The Universe · Post by **Touch The Universe** » Thu May 28, 2026 4:12 am

I'm 80-90 passes in and importing into the fresh synth build with proper structuring is not going well. So instead of doing an organ transplant, I'm now attempting to build arteries around the existing, working, linked system so the outputs are explicit. Hopefully that works; otherwise I might as well code a new synth from scratch, as it would require fewer passes. I realize the page I'm trying to transplant might have been a couple of days' work, though, so I can't be too sure about that last statement. Nonetheless, I still have 30 more pages to do, so this is not feasible. I wonder if I can just remove everything else from the synth besides the page I want to focus on and, once it's all removed and it works, then do the transplant? Because then only useful, functioning code is there. ALL of it needs transplanting. This is quite the headache, however. I was able to verify that the CPU can be drastically reduced if you structure it right, yet fixing something that wasn't structured right is NOT an easy task, apparently.

Tiles · Post by **Tiles** » Thu May 28, 2026 5:17 am

Your analysis is correct. And congrats, some people take years to reach that realization. Starting over is probably the right move now. The main issue is not the individual code quality anymore, but the lack of clear architecture and system boundaries. And that now comes back to bite you. At this point, a rewrite is likely cheaper than a refactor.

In game development there is a saying that your first ten games will be bad. Software development is not much different. You first need to learn the fundamentals properly. Vibecoding shortens the learning curve a lot, but it does not replace understanding architecture and system design. That’s why I usually recommend starting with the smallest possible projects and building up from there. That was legit advice long before AI.

AI tends to generate locally working solutions, but audio software needs global coherence. If you do not explicitly enforce structure, you often end up with one giant implicitly coupled system. And AI usually will not warn you about that unless you already know enough to ask the right questions.

My advice would be: stop doing “passes” and “transplants”. Freeze the current project as reference only and treat it as the learning phase.

Then rebuild a minimal vertical slice with proper structure:
• oscillator
• envelope
• filter
• audio out

Define strict boundaries early:
• DSP separated from UI
• no hidden state
• explicit parameter routing
• realtime safe audio thread
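
As an illustration of what such a vertical slice might look like in code, here is a minimal sketch. It is one possible shape, not a prescribed design: `Params` and `MiniVoice` are hypothetical names, the oscillator is a naive saw, the envelope has a linear attack and an instant release (which would click in practice), and the filter is a one-pole low-pass.

```cpp
#include <cmath>
#include <cstddef>

// Explicit parameter routing: everything the DSP needs arrives in one
// plain struct, so the UI never reaches into the voice.
struct Params {
    float frequencyHz = 220.0f;
    float cutoffCoeff = 0.2f;  // one-pole coefficient, 0..1
    float attackStep = 0.01f;  // envelope rise per sample
    bool gate = false;
};

class MiniVoice {
public:
    explicit MiniVoice(float sampleRate) : sampleRate_(sampleRate) {}

    // Realtime-safe by construction: no allocation, locks or I/O,
    // only arithmetic on members. All state is visible below.
    void process(const Params& p, float* out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            // oscillator
            const float osc = 2.0f * phase_ - 1.0f;
            phase_ += p.frequencyHz / sampleRate_;
            if (phase_ >= 1.0f) phase_ -= 1.0f;
            // envelope
            env_ = p.gate ? std::fmin(1.0f, env_ + p.attackStep) : 0.0f;
            // filter
            lp_ += p.cutoffCoeff * (osc * env_ - lp_);
            // audio out
            out[i] = lp_;
        }
    }

private:
    float sampleRate_;
    float phase_ = 0.0f;
    float env_ = 0.0f;
    float lp_ = 0.0f;
};
```

The whole signal path is readable top to bottom in one loop, and each stage can later be pulled out into its own module without changing how parameters reach it.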

In short: make sure the architecture fits from the beginning by building the smallest possible version first and polishing it properly. Once that foundation is solid, expand from there.

Trying to untangle a giant implicitly coupled synth after 80+ AI passes is usually harder than rebuilding it cleanly.

Vibe Coding Log - Sharing Journey - GUI Fine Tuning - Merging Harmonic Synth Branch