SUNO is killer!

Explore how Machine Learning and AI can expand musical creativity while keeping the human in the creative workflow. This forum is dedicated to respectful dialogue where diverse perspectives are welcomed.

Post

Kenmac wrote: Wed Feb 25, 2026 6:17 pm
audiojunkie wrote: Wed Feb 25, 2026 4:38 pm I just read through the whole thread..........whew! What I find interesting potential that no one has mentioned is multisampling. I love samples, and it seems to me that these tools would be ideal for creating multisamples for samplers. Want a saxophone? Tell AI to output samples across the map--one for each key. Doesn't sound right? Upload a sample similar to what you want to hear, and tell it to make the multisample set more like the sample you uploaded. Want articulations? Done. Want loop points? Done. You could even tell AI to output the instrument in the sample format of your choice. This is the kind of thing that I think AI could be really good for. Instrument sounds are usually pretty good in isolation at this technological point. Imagine being able to create your own instrument libraries. This is stuff that could be done already if we wanted to.

There are other useful things it can do right now that don't take away from the creativity and that leave it in the artist's hands:

Imagine taking your vocal track and inputting it into an AI and having it correct the flaws in timing and intonation and polish. This can already be done.

Imagine humming a tune, and then describing the rhythm desired for guitar, including up strokes, downstrokes, mutes, bends, etc. This can already be done.

AI is a tool. Nothing more. To an artist, it can be a great help already at the technological point it is at today. Or, it can be a slop machine. But make no mistake. It is a tool.
Regarding AI sampling, I don't know if you saw this, but EastWest has partnered with ACE Studio, which is an AI DAW.

https://www.kvraudio.com/focus/ace-stud ... tion-65954
Interesting!
Vendor‑Dependent Copy Protection: Customers lose. Pirates win.:mad:
(Also: I'm Accused of lying about Linux—it boots, runs my pro audio workflow, stays stable, updates--though yearly dismissed as “niche”. Yet I'm the deluded one.)
:roll:

Post

zerocrossing wrote: Wed Feb 25, 2026 8:59 pm
audiojunkie wrote: Wed Feb 25, 2026 4:38 pm I just read through the whole thread..........whew! What I find interesting potential that no one has mentioned is multisampling. I love samples, and it seems to me that these tools would be ideal for creating multisamples for samplers. Want a saxophone? Tell AI to output samples across the map--one for each key. Doesn't sound right? Upload a sample similar to what you want to hear, and tell it to make the multisample set more like the sample you uploaded. Want articulations? Done. Want loop points? Done. You could even tell AI to output the instrument in the sample format of your choice. This is the kind of thing that I think AI could be really good for. Instrument sounds are usually pretty good in isolation at this technological point. Imagine being able to create your own instrument libraries. This is stuff that could be done already if we wanted to.

There are other useful things it can do right now that don't take away from the creativity and that leave it in the artist's hands:

Imagine taking your vocal track and inputting it into an AI and having it correct the flaws in timing and intonation and polish. This can already be done.

Imagine humming a tune, and then describing the rhythm desired for guitar, including up strokes, downstrokes, mutes, bends, etc. This can already be done.

AI is a tool. Nothing more. To an artist, it can be a great help already at the technological point it is at today. Or, it can be a slop machine. But make no mistake. It is a tool.
You're unaware of what AI can actually do. Multisamples are soon going to be completely obsolete. Kontakt... will probably go away. That's probably why NI is in bankruptcy proceedings right now.

With something like ACE Studio, all you have to do is play your MIDI part, or hum it, and tell it to get played back as one of many synthetic instrument models that sound as convincing as your multisampled version, and didn't take you days of putting in all the articulation cues. It doesn't have every instrument, yet, but I don't see myself ever using a sample based instrument to get orchestral sounds ever again. What a tedious PITA. RIP.
Wow! I guess I AM unaware. I'll look into ACE Studio.
Vendor‑Dependent Copy Protection: Customers lose. Pirates win.:mad:
(Also: I'm Accused of lying about Linux—it boots, runs my pro audio workflow, stays stable, updates--though yearly dismissed as “niche”. Yet I'm the deluded one.)
:roll:

Post

BONES wrote: Thu Feb 26, 2026 12:09 am ACE Studio uses East-West multisamples, so you're using samples anyway. It also looks like you have to put your articulation cues in, too. Have you got a license or are you just speculating?
this is what it looks like to me as well...that ace studio/eastwest partnership probably means ace will AI-ify the eastwest catalogue of instruments in the cloud so that you can sing them or feed them midi...may ease ur hard drive and wallet but now ur dependent on the cloud...and the AI may improve ur vocal or midi sketch and make assumptions about performance styles and articulations,..but ur still going to have to go and edit articulation/performance style cues...how is that all that different from things like cubase expression maps?...may be a lil less dicking around in the piano roll, but still editing instead of performing nonetheless for those who dont have the performance chops to convincingly use keyswitches and midi cc with breath controllers/expression pedals etc...the key feature is training ur own models for ur own custom sound design...but IKM promises that in ReSing locally without the cloud and without a subscription...and east west wasn't best in class in any instrument category that i know of and misses out on a lot of stylistic motifs other sample developers cover...and as far as the docs go there are only half a dozen instruments available so far...so to me, hardly a revolution at this stage
Music had a one night stand with sound design.....And the condom broke

Post

audiojunkie wrote: Wed Feb 25, 2026 4:38 pm what I find interesting potential that no one has mentioned is multisampling. I love samples, and it seems to me that these tools would be ideal for creating multisamples for samplers. Want a saxophone? Tell AI to output samples across the map--one for each key. Doesn't sound right? Upload a sample similar to what you want to hear, and tell it to make the multisample set more like the sample you uploaded. Want articulations? Done. Want loop points? Done. You could even tell AI to output the instrument in the sample format of your choice. This is the kind of thing that I think AI could be really good for. Instrument sounds are usually pretty good in isolation at this technological point. Imagine being able to create your own instrument libraries. This is stuff that could be done already if we wanted to.
there is probably a lot of opportunity here to automate a lot of what samplists have done for decades in pcm sound design...offloading all the tedium and busy work and focusing on the creative aspects...problem is always the same tradeoff...the more creative decision making you offload and rely on outside assumptions,...the less creative control and satisfaction ur likely to have with the resulting output...but if the process is lightweight enough and interactive enough...users can just efficiently iterate until they get close to what they want through trial and error in a less time consuming and less ear fatiguing way than traditional cradle to grave deterministic methods...this could result in some cool new tools for pcm experimental sound design...but as far as replacing the sample library developer ecosystem like zerocrossing wishes,...I don't see that happening anytime soon...there is so much complexity and nuance that goes into a great library that separates them from each other...the differences in the real world instruments/sound sources themselves, the rooms they are recorded in, the skills of the performers, the configuration of the mics, selection of mics, outboard recorded through...so many variables affect the resulting timbres, and their musical use cases...the best libraries are not as simple as just making sure you have at least 5 velocity levels and 2 articulations...the first generation of AI instruments are going to be the modern equivalent of general midi
Last edited by bermudagold on Thu Feb 26, 2026 2:06 am, edited 2 times in total.
Music had a one night stand with sound design.....And the condom broke

Post

No, you are not "using" multisamples when you use ACE Studio. It may have been trained on them, but it generates the results from the training, not by playing back the samples.
Zerocrossing Media

4th Law of Robotics: When turning evil, display a red indicator light. ~[ ●_● ]~

Post

zerocrossing wrote: Thu Feb 26, 2026 2:00 am No, you are not "using" multisamples when you use ACE Studio. It may have been trained on them, but it generates the results from the training, not by playing back the samples.
what does that even mean?...the signal source has to come from somewhere...im pretty sure its playing the samples...its just making assumptions on velocity and keyswitch articulation triggering and performing smoothing in a void of that input...u think its resynthesized the whole ew catalogue and dynamically generating voicing of that on demand by synthesis?...then it wouldn't sound the same as the ew libraries

in fact the ace studio strings "coming soon" will be recorded by budapest arts orchestra...so sounds like samples to me...it sounds like its rendering the multisamples in the cloud and streaming you the output
Music had a one night stand with sound design.....And the condom broke

Post

bermudagold wrote: Thu Feb 26, 2026 2:12 am
zerocrossing wrote: Thu Feb 26, 2026 2:00 am No, you are not "using" multisamples when you use ACE Studio. It may have been trained on them, but it generates the results from the training, not by playing back the samples.
what does that even mean?...the signal source has to come from somewhere...im pretty sure its playing the samples...its just making assumptions on velocity and keyswitch articulation triggering and performing smoothing in a void of that input...u think its resynthesized the whole ew catalogue and dynamically generating voicing of that on demand by synthesis?...then it wouldn't sound the same as the ew libraries

in fact the ace studio strings "coming soon" will be recorded by budapest arts orchestra...so sounds like samples to me...it sounds like its rendering the multisamples in the cloud and streaming you the output
It's not. If you go look at some of the demos, you'll see that they've built a model, kind of like Synthesizer V or SWAM builds a model, except it's doing the processing offline.
Zerocrossing Media

4th Law of Robotics: When turning evil, display a red indicator light. ~[ ●_● ]~

Post

bermudagold wrote: Thu Feb 26, 2026 2:12 am...im pretty sure its playing the samples...
The test would be how large the install is, or if you have to be online to use it, which is why I asked zerocrossing if he had it or not.
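A back-of-the-envelope version of that install-size test, as a rough Python sketch. Every number in it is a hypothetical placeholder (key count, velocity layers, round robins, sample length, parameter count), not a measurement of ACE Studio or of any EastWest library:

```python
def multisample_library_bytes(keys=88, velocity_layers=5, articulations=2,
                              round_robins=3, seconds_per_sample=6.0,
                              sample_rate=48_000, bit_depth=24, channels=2):
    """Rough size of an uncompressed multisample set (all defaults are guesses)."""
    sample_count = keys * velocity_layers * articulations * round_robins
    bytes_per_second = sample_rate * (bit_depth // 8) * channels
    return int(sample_count * seconds_per_sample * bytes_per_second)


def model_weights_bytes(parameters=200_000_000, bytes_per_parameter=2):
    """Rough size of a trained model stored as 16-bit weights (also a guess)."""
    return parameters * bytes_per_parameter


library = multisample_library_bytes()
model = model_weights_bytes()
print(f"multisample set: {library / 1e9:.2f} GB")
print(f"model weights:   {model / 1e9:.2f} GB")
```

With these placeholder figures the sample set comes out around 4.6 GB and the weights around 0.4 GB, so a small install would at least hint at a model rather than a sample pool. If the rendering happens on a server, though, the local install size says little either way.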
NOVAkILL : Legion GO, AMD Z1x, 16GB RAM, Win11 | Audient EVO 8 | Lumi Keys | Studio Pro 8
Korg Odyssey, bx-oberhausen, Proxima, PolyMax, GR8, JP6K, Union, Atomika,
Invader 2, Flow Motion, Olga, TRK 01, Thorn, Spire, VG Iron

Post

wagtunes wrote: Wed Feb 25, 2026 2:47 pm FWIW, this is my take on Suno or whatever.

First off, if people want to use it, power to them. I don't condemn them for it.
As for the sound, all I know is the current #1 country song was completely made with AI. You may not like it but it obviously appeals to a lot of people.

I don't know what the future of AI is nor do I care, outside of my one little site that I use to do my AI vocals. Everything else I do myself.

Anyway, that's all I got.
Really? Radio hits are AI generated now?

This is interesting, but does not change any deep feelings I have about art. Nope, not at all!

Post

Abomination.
“The Generals sat, and the lines on the map, moved from side to side.”
― Pink Floyd

Post

BONES wrote: Thu Feb 26, 2026 10:06 am
bermudagold wrote: Thu Feb 26, 2026 2:12 am...im pretty sure its playing the samples...
The test would be how large the install is, or if you have to be on line to use it, which is why I asked zerocrossing if he had it or not.
the website says it is rendered in the cloud and streamed to you so I'm pretty sure you have to be online anytime you make an edit...and they say they (or ew) are recording budapest arts orchestra for their new string sections instrument...if they aren't using multisamples and are SWAM type hybrids where they are using samples for the attacks and synthesizing the rest of the envelope,...why would they need to spend big money on new sampling?...they could "train" on ew existing catalog...I haven't used it and would need to research more, but I'd be surprised if its not using multisamples...that's why I think zerocrossing's original desire is a long way off and has similar challenges to today's workflows as u and I stated above...and again neither synthesis nor hybrid has ever been able to achieve highest realism...they only have a violin, a cello, a trumpet, a saxophone, and a duduk right now...and none sound anywhere near as real or as versatile as current high end sample libraries imo
Music had a one night stand with sound design.....And the condom broke

Post

Zeisner wrote: Fri Feb 20, 2026 5:08 pm I'm still waiting for that killer track. Where is the sheriff?
It's never going to come because idea curation is a big part of being a producer. Anyone who goes to Suno's site, listens to the demos and goes “wow, I could never do that” is just way behind the curve already.

Post

bermudagold wrote: Thu Feb 26, 2026 2:12 am
zerocrossing wrote: Thu Feb 26, 2026 2:00 am No, you are not "using" multisamples when you use ACE Studio. It may have been trained on them, but it generates the results from the training, not by playing back the samples.
what does that even mean?...the signal source has to come from somewhere...im pretty sure its playing the samples...its just making assumptions on velocity and keyswitch articulation triggering and performing smoothing in a void of that input.
Ace Studio is not using samples. There are no sample sets or articulations. It generates the audio the same way an AI model generates an image or a video.

Post

pdxindy wrote: Wed Mar 04, 2026 2:48 am
bermudagold wrote: Thu Feb 26, 2026 2:12 am
zerocrossing wrote: Thu Feb 26, 2026 2:00 am No, you are not "using" multisamples when you use ACE Studio. It may have been trained on them, but it generates the results from the training, not by playing back the samples.
what does that even mean?...the signal source has to come from somewhere...im pretty sure its playing the samples...its just making assumptions on velocity and keyswitch articulation triggering and performing smoothing in a void of that input.
Ace Studio is not using samples. There are no sample sets or articulations. It generates the audio the same way an AI model generates an image or a video.
but generates it from what?...ur saying the sound is completely synthesized by additive synthesis?...folks above said it's hybrid like SWAM...that's still using samples for at least the attack...why would they be recording an orchestra in budapest for their new string instrument if it doesn't use samples?...and how come no one in over 20 yrs has been able to make additive convincingly real?...google's tone transfer AI project went on for years and couldn't do it...u believe ace studio has made that type of breakthrough?...until they say explicitly I have a hard time believing it

EDIT:
so I haven't done ALL my homework, but they are using SVS (Singing Voice Synthesis) and/or RVC (Retrieval-based Voice Conversion)...all using statistical parametric synthesis ...using statistical models to reproduce the features of a voice, and unit selection, when snippets of vocal recordings are recombined on the fly...been around since the 50s...now scientists are using generic versions of the following to achieve higher accuracy
deep neural networks (DNN)
convolutional neural networks (CNN)
recurrent neural networks with long short-term memory (LSTM)
generative adversarial networks (GAN)
diffusion models
flow-matching generative models
latent audio codecs

They are using samples of their vocals/instruments both to train and synthesize...In total, about an hour of an individual’s vocals are needed to construct an initial model, and 10-15 minutes of recording will be used for the synthesizing process itself

sounds like sampling is still involved in the final sound...although some researchers say ace studio claims with their verse25 engine,...that once trained, the system no longer needs the original PCM samples. It uses its inference engine to "hallucinate" a brand-new audio signal that matches the style it learned...by what mechanism is the "hallucination" generated?...so looks like jury still out
Music had a one night stand with sound design.....And the condom broke

Post

bermudagold wrote: Wed Mar 04, 2026 5:26 am but generates it from what?
I know it's hard to understand, but not from a built-in synthesizer. Modern AI vocal systems don’t run a synth engine in the background. They generate the waveform directly using neural networks trained on real vocal recordings.

The model predicts acoustic features from text and pitch, then a neural vocoder converts that into audio. There are no oscillators, no filters, no subtractive synthesis involved. A synthesizer generates sound from mathematical waveforms. AI voice models generate sound from learned acoustic representations. Different paradigm entirely.
sounds like sampling is still involved in the final sound...although some researchers say ace studio claims with their verse25 engine,...that once trained, the system no longer needs the original PCM samples. It uses its inference engine to "hallucinate" a brand-new audio signal that matches the style it learned...by what mechanism is the "hallucination" generated?...so looks like jury still out
It’s not sampling.
During training, the model learns statistical patterns from PCM recordings. After training, the original audio isn’t used anymore. At inference, a neural acoustic model predicts a spectrogram from text and pitch, and a neural vocoder generates a brand-new waveform from that.
No stored samples are triggered or blended*. The “hallucination” is just probabilistic signal generation based on learned weights.

*It’s a similar misunderstanding as with image models. The system isn’t literally compressing hundreds of terabytes of recordings into a few gigabytes of VRAM. The training data isn’t stored inside the model. That's simply not technically possible.
What remains after training are learned statistical patterns encoded in the weights, not the original audio files themselves.
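To make the "weights, not recordings" idea concrete, here is a deliberately tiny Python sketch. It is an analogy only: it fits a two-coefficient linear predictor (classic AR(2) linear prediction, not a neural network and nothing like ACE Studio's actual engine), throws the "recording" away, and then generates new audio from the two learned numbers plus random excitation. All function names and parameters are made up for illustration.

```python
import math
import random


def make_training_recording(freq_hz=220.0, sample_rate=8000, n=2000, decay=0.999):
    """Stand-in for a PCM recording: a decaying sine, like a plucked note."""
    w = 2 * math.pi * freq_hz / sample_rate
    return [decay ** i * math.sin(w * i) for i in range(n)]


def fit_ar2(signal):
    """'Training': least-squares fit of x[n] ~ a1*x[n-1] + a2*x[n-2]."""
    r11 = r12 = r22 = b1 = b2 = 0.0
    for n in range(2, len(signal)):
        x1, x2, y = signal[n - 1], signal[n - 2], signal[n]
        r11 += x1 * x1
        r12 += x1 * x2
        r22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = r11 * r22 - r12 * r12  # solve the 2x2 normal equations directly
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (r11 * b2 - r12 * b1) / det
    return a1, a2


def generate(weights, n, seed=0):
    """'Inference': new samples from the weights plus random excitation."""
    a1, a2 = weights
    rng = random.Random(seed)
    out = [0.0, 0.0]
    for _ in range(n):
        out.append(a1 * out[-1] + a2 * out[-2] + rng.gauss(0.0, 0.01))
    return out[2:]


recording = make_training_recording()
weights = fit_ar2(recording)  # two numbers summarise the note's pitch and decay
del recording                 # the original audio is gone from here on
new_audio = generate(weights, 1000)
```

The generated signal rings at the pitch the predictor learned, yet no stored sample is ever read back, because after `del recording` there is nothing left to read. A real system swaps the two coefficients for millions of network weights, but the train-then-discard shape is the same.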
Last edited by Tiles on Wed Mar 04, 2026 6:42 am, edited 1 time in total.
“The biggest crime of a musician is to play notes instead of making music.”
Isaac Stern
