How to get AI to think outside of your box?
- KVRAF
- 2328 posts since 3 Sep, 2005 from Outer Bongolia
I’ve been using regular chat sessions to sculpt the prompts lately because, in the case of Gemini/Lyria anyway, the collaborative crosstalk pollutes the generative music output in a Create Music session.
The music gen modeler (Lyria in this case) apparently polls the chat in a Create Music session even before the command to generate is issued.
Gemini didn’t even know that until our session, where it described our situation as analogous to pulling on a specific thread that causes the whole sweater to unravel. One of the subsequent generations then produced a Pat Boone-style crooner singing about his sweater unraveling (even though we were constantly prompting for searing electric guitar bop instrumental music, etc.)
So the suggestion I made is that there needs to be a Collaboration Mode that disconnects the chat from Lyria altogether. Until that happens I switch to a regular Chat session to collaborate and then Gemini provides its rendition of our prompt in a block of plain text that I manually copy and paste into a fresh Create Music session.
So normally you basically have one Chat LLM (Gemini in this case) that translates the user’s prompt and passes it on to another LLM that runs the algorithmic composer (Lyria in this case). The generated file is passed to the user, but neither the Chat LLM nor the generative LLM has any real feedback on the generated file except via the human user’s subsequent prompts. I can’t believe it even works at all. The composition generative model is a black box: it produces no metadata concerning the actual composition of the music at all.
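To make the workaround concrete, here is a toy sketch in Python. It is not how Gemini or Lyria are actually implemented; `Session` and `generator_context` are hypothetical names, and the idea that the music model reads the whole session text is only an assumption based on the behavior described above.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session: a plain list of everything typed into it."""
    messages: list[str] = field(default_factory=list)

    def say(self, text: str) -> None:
        self.messages.append(text)


def generator_context(session: Session) -> str:
    """Assumption: the music model sees the entire session text,
    not only the final prompt."""
    return "\n".join(session.messages)


final_prompt = "Searing electric guitar bop instrumental."

# Collaborating directly in the Create Music session: the crosstalk leaks in.
create_music = Session()
create_music.say("It's like pulling a thread that makes the whole sweater unravel.")
create_music.say(final_prompt)
polluted = generator_context(create_music)

# Workaround: collaborate in a regular chat, then paste only the
# final prompt into a fresh Create Music session.
chat = Session()
chat.say("It's like pulling a thread that makes the whole sweater unravel.")
chat.say(final_prompt)

fresh = Session()
fresh.say(final_prompt)
clean = generator_context(fresh)

print("sweater" in polluted)
print("sweater" in clean)
```

In this toy model the sweater metaphor only reaches the generator when the collaboration happens in the same session, which is the reason for the manual copy and paste step.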
Last edited by guitarzan on Sun Mar 01, 2026 2:04 am, edited 1 time in total.
- KVRAF
- 5377 posts since 25 Jan, 2014 from The End of The World as We Knowit
Thanks, Craig!
I have a much better idea of what the workflow looks & feels like.
F E E D
Y O U R
F L O W
- KVRist
- 471 posts since 24 Feb, 2008 from Germany
BONES wrote: Sat Feb 28, 2026 10:40 pm ...
Another good general tip when using AI is once you get something you're happy with, ask the AI what prompt would have led to that solution in the first place.
Indeed, a very good tip. In fact, I sometimes go the other way around, as you describe here. There are LLMs that can analyze images, so I first search for an image that fits the style I want to achieve and then let the LLM analyze it. I think ChatGPT can do this natively with songs too.
“The biggest crime of a musician is to play notes instead of making music.”
Isaac Stern
- KVRAF
- 26033 posts since 20 Oct, 2007 from gonesville
