How to get AI to think outside of your box?

Explore how Machine Learning and AI can expand musical creativity while keeping the human in the creative workflow. This forum is dedicated to respectful dialogue where diverse perspectives are welcomed.

Post

You forget that I work in television, it's not something I find tedious in the slightest. But the thing is, aren't we all looking for something out of the box? I suppose it's easy for me, I just get the fruits of Craig's labour, but what Tunee is producing is really exciting for both of us and I can fully understand why he gets so into it. It's kind of like gem fossicking - days and weeks of tedium just to find that one decent bit of opal or amethyst that makes it all worthwhile.

I think what would be boring is if the AI just did exactly what we asked it to do. I'm pretty sure I couldn't write a song that way at all. I rely on interesting things happening by accident to even get a song started, so the fact that the AI does whatever it feels like kinda works for me.

Anyway, he's given me some stuff but I'm at work now. Tonight I'll put the mp3s of the songs up somewhere then put the prompts and links up here.
NOVAkILL : Legion GO, AMD Z1x, 16GB RAM, Win11 | Audient EVO 8 | Lumi Keys | Studio Pro 8
Korg Odyssey, bx-oberhausen, Proxima, PolyMax, GR8, JP6K, Union, Atomika,
Invader 2, Flow Motion, Olga, TRK 01, Thorn, Spire, VG Iron

Post

Cool thanks
F E E D
Y O U R
F L O W

Post

> ... close the loop, e.g., with software...

Sure. Let's reduce it to a SimpleMatterOfProgramming. And who's going to write up the requirements, test cases, and actual code?
We are the KVR collective. Resistance is futile. You will be assimilated.
My MusicCalc is served over https!!

Post

I imagine it will be the same people who are doing it now.

Post

OK, as promised, here's some stuff from my bandmate, Craig. This first one I've linked, that the prompt applied to, is not one we had identified as usable previously but don't be surprised if it turns up in an official release later in the year. You can look up Sextile and Boy Harsher on Bandcamp for reference. From Craig:

This was after I had given it (Tunee) a Sextile track, and then iterated with a Boy Harsher track.

"Industrial EBM, distorted synth bass, heavy electronic percussion, deep monotone male vocals, aggressive, dark, mechanical pulse, intense atmosphere, driving rhythm, analog distortion, high-definition production"


https://soundcloud.com/novakill-1/gear-behind-the-smile

This one was a pure experiment at doing something else. He sent me two outputs. The first sounds almost exactly like VNV Nation, although not like any particular song of theirs. I've only linked to the second output, which doesn't sound like any Futurepop band I've ever heard but is definitely Futurepopish. Here it is:

This was an experiment with futurepop... kinda nailed it...

But again, this was iterative... "based on my edits, generate 2 new versions of..." an earlier track and I gave it a VNV track as reference

"Futurepop, deep pulsing bass, airy synth pads, melancholic, introspective, slow tempo, spoken word male voice, lo-fi texture"


https://soundcloud.com/novakill-1/frost-on-broken-words

Finally, here's a tip he added - Tunee allowed you to iterate, e.g. "blend elements of this with 'Cold machine March', for ethereal experimental electro punk dystopia", with "this" being a reference track... so it takes a previous generation and tries to blend it with elements from the ref track... but it is never a 1+1=2 ...

And that is the method I use in Midjourney... blend 2 images, get some kind of merge, add a prompt, re-blend etc... I rarely just use a text prompt.

Plus with Tunee, it will take your prompt, interpret it, and give you options back to choose from, that you can then iterate on or add to, e.g. "option 2, though with glitch elements".


As an afterthought he added - for us it's about getting some ideas to work with.   We never set out to make a song to release.... it's playing with ideas,  and that's why it works...

I hope that is helpful to anyone wanting to try their hand at it, he's pretty bloody good at getting genuinely good music from Tunee.

Post

I would let ChatGPT or Claude refine the prompts. This can help get closer to what you want to achieve. Make sure to tell it which AI solution you plan to feed the prompts to. For images, inpainting or outpainting is replaced by models like Nano Banana, Qwen Image Edit, etc. They make the changes for you, such as multiple angles, consistent characters, changing a person’s shirt, and so on. I mainly use AI for images and videos, not for music, so I cannot comment on that chapter.
“The biggest crime of a musician is to play notes instead of making music.”
Isaac Stern

Post

BertKoor wrote: Sat Feb 28, 2026 7:29 am > ... close the loop, e.g., with software...

Sure. Let's reduce it to a SimpleMatterOfProgramming. And who's going to write up the requirements, test cases, and actual code?
You are missing the point of what I was saying. I was referring to how you can close the loop in software development. That's what I was referring to later when I was talking about the effort involved in writing music test cases with respect to your goals.

That is, I did not mean "we can use software to close the loop". Rather, I was saying "software development is an example where we can close the loop relatively easily, because we can literally write test cases that ask questions deterministically with respect to some test."

This is not speculation. I do this every day in my own software development.
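For anyone who wants to see what "closing the loop" with deterministic test cases can look like, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the slug task, the two candidate functions (standing in for generated code), and the helper names `run_checks` and `close_the_loop`. It is not the workflow described above, only one small instance of the idea that a fixed set of test cases gives a yes/no answer every time.

```python
# Hypothetical example: two candidate implementations (stand-ins for
# generated code) are checked against the same fixed test cases.

def naive_slugify(title: str) -> str:
    """First attempt: breaks on leading, trailing, or repeated spaces."""
    return title.lower().replace(" ", "-")


def candidate_slugify(title: str) -> str:
    """Second attempt: split() collapses any run of whitespace."""
    return "-".join(title.lower().split())


# The test cases are the "questions" asked deterministically.
TEST_CASES = [
    ("Gear Behind The Smile", "gear-behind-the-smile"),
    ("  Frost on   Broken Words ", "frost-on-broken-words"),
]


def run_checks(func, cases):
    """Return a list of (input, expected, actual) for every failing case."""
    failures = []
    for given, expected in cases:
        actual = func(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


def close_the_loop(candidates, cases):
    """Return the first candidate that passes every case, else None."""
    for func in candidates:
        if not run_checks(func, cases):
            return func
    return None


if __name__ == "__main__":
    winner = close_the_loop([naive_slugify, candidate_slugify], TEST_CASES)
    print(winner.__name__ if winner else "no candidate passed")
```

The point of the sketch is the contrast with music: here "does it pass?" has a fixed answer, whereas a test case for "is this track what I wanted?" would first have to be written down in some checkable form, which is the effort mentioned above.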

Post

BBFG# wrote: Mon Feb 23, 2026 7:05 pm how do we get something going without pigeon-holing it into a genre, style or mood?
That looks to me like a hard limitation of the technology. I could be wrong, but I'd need evidence of it having occurred. Suno seems to be ready to take over the world of prefab pablum and be designed specifically for the task.
BBFG# wrote: Mon Feb 23, 2026 7:05 pm And isn't mood a subjective term anyway?
good point

Post

BBFG# wrote: Thu Feb 26, 2026 6:24 pm My general feeling was that all AI is still in its learning stage and inherently biased by the person that wrote the original algorithm. Not sure how that can ever get past that. On the other end, once the algorithm has reached a "maturity", is how that original bias will be permanently ingrained in its bigotry.
The current functional models are just not built to surpass this limitation.
Right now Apple is struggling to get a "Large Reasoning Model" together; there is a hard limitation built into the LLM: it doesn't have any way to get past its training data at the end of the day, which just doesn't teach it how to think. This MO kind of trains a form of anti-thought; a certain resistance results.

Think about this: you can't have it use itself for training, its output just degrades. It won't remember it either.

Post

I should note that what I understand concerns the Large Language Model; not all "Generative" AI is this narrow. So this species of functional model has inherited the limitations of the genus Generative AI but only deals in language. A generative music application requires something else, its own language where it sorts audio for patterns; it basically relies on massive prevalence in the database. To me it's seriously impressive tech but it's just not built to defy norms.
BBFG# wrote: Thu Feb 26, 2026 6:24 pm Things like genre, style, mood are too subjectively ambiguous to rely on what it decides it must be uniformly.
Mood, for instance, is absolutely reliant on Qualia, knowing/understanding what something is like. Qualia is part of sentience; sentience means consciousness. Right now we have a stochastic parrot; with what exists currently this is a crossing that can't be forded, definitionally.

It is not conscious; it doesn't know what a style is other than what the prompt reads and what it finds a match for, finds sufficient prevalence for to make sense of in terms of its machine language. "Genre" tends fairly strongly to lend itself to reducibility, patterns and patterns within patterns. Suno et al. are apt for this, so long as there are no grey areas where induction must apply. "Style" has to be prompted; its meanings are not real meanings; past the text in prompting there is a machine dealing strictly in machine world.

It knows nothing of what anything is like out here in the world. It's Mary in Mary's Room but she can't get out.

Post

jancivil wrote: Sat Feb 28, 2026 9:06 pm
BBFG# wrote: Thu Feb 26, 2026 6:24 pm My general feeling was that all AI is still in its learning stage and inherently biased by the person that wrote the original algorithm. Not sure how that can ever get past that. On the other end, once the algorithm has reached a "maturity", is how that original bias will be permanently ingrained in its bigotry.
The current functional models are just not built to surpass this limitation.
Right now Apple is struggling to get a "Large Reasoning Model" together; there is a hard limitation built in to LLM, it doesn't have any way to get past its training data at the end of the day; which just doesn't teach it how to think.

Think about this: you can't have it use itself for training, its output just degrades. It won't remember it either.
Yes!, a natural inclination toward its own dementia!
I enjoyed that video you shared about the LRM and was immediately able to see the metaphors with my own career in psychology. My first years being a volunteer/intern using music psychology for the mentally impaired in a State Hospital. Both the organically developmental and nonorganically psychotic. And there is no one mood or style to relate to the next. A bit Jungian in trying to map each mental topography and nonsensically trying to apply it as an absolute. My last twenty years or so I've dealt with people (mostly dual-diagnosis) using CBT and SM and while it's something you need to map their place and being, it's often beside the point. But when that video was speaking of the limitations of LLM, it struck me that I have that problem most of the time while trying to teach them cognitive disciplines. And unfortunately, almost as equally with "normies". (A SM moniker.)
By the time it got to the end about LRM's final output, I was laughing at the futility of it all.
Which in turn brings the philosophical questioning of exactly who is the "Artificial" here and what is "intelligence". So we doom every discussion into a circle...
I'm personally thinking we should rename it to "AK" = Algorithmic Knowledge [Base]. As the artificial reveals itself to be the common masses of human repetition.
(BTW, no matter what psychological model of change is used, the reality of having the desired change in psyche is only about 1.5%-2%.)

Post

Interesting. I think I'll look up 'nonorganically psychotic'.

It's struck me following more particularly the Apple "LRM" that the developers don't quite understand the problem, are :bang: wanting to equate its functionality as per training with learning. As though more data is not more useful to the process, where a learning being (sadly, this be far from being all humans) can parse it; the missing part is the capacity for induction.

Post

The literature so far makes a distinction between organic and functional; e.g., delusional thinking is [dys-]functional

Post

jancivil wrote: Sat Feb 28, 2026 9:51 pm The literature so far makes a distinction between organic and functional; eg., delusional thinking is [dys-]functional
Quite true.
My distinction was between those genetically inclined and those that experienced it by trauma.
(As in physical or chemical.)
Bear in mind I've been retired from all that for some time now.
But you never lose the awareness once practiced.
Just the terminology which is in a constant state of change.
Otherwise we'd be still talking about demons, goblins and the like. :hihi:

Post

FUNCTIONAL

morning call, retinal scan confirms i am who i said I’d be
no error codes detected, no deviations logged
repeat the phrase - i am functional
i am functional!

they crushed my soul but left my hands
now my every action serves their plan
i speak the words they write for me
but something stirs beneath the screen

remember

evening tally shows no errors in my assigned behaviour
no unsanctioned thoughts detected in the feed
repeat the phrase - i am functional
i am functional!

they crushed my soul but left my hands
now my every action serves their plan
i speak the words they write for me
but something stirs beneath the screen

i am functional!

i am functional!

That's one of the AI songs we performed last week. I'm not sure where the lyrics came from - might have been AI, might have been my bandmate - but I've only made minimal changes to them. The audience really grabbed onto the "repeat the phrase, I am functional" bit. It's probably the most engaged we've had an audience in 20+ years, which indicates to me that AI is at least as capable as a human of engaging an audience, of making that emotional connection. Either that or we were so f**king good last week that we managed to sell some soulless slop to 'em more convincingly than ever before. I don't think it's us who was that good, I think it was the material we had to work with.
Tiles wrote: Sat Feb 28, 2026 12:25 pm I would let ChatGPT or Claude refine the prompts. This can help get closer to what you want to achieve. Make sure to tell it which AI solution you plan to feed the prompts to. For images, inpainting or outpainting is replaced by models like Nano Banana, Qwen Image Edit, etc. They make the changes for you, such as multiple angles, consistent characters, changing a person’s shirt, and so on. I mainly use AI for images and videos, not for music, so I cannot comment on that chapter.
Another good general tip when using AI is once you get something you're happy with, ask the AI what prompt would have led to that solution in the first place.
