Anyone interested in building an online AI music DAW for implementing on a FaaS cloud vendor?

...and how to do so...

Post

WatchTheGuitar wrote: Mon May 08, 2023 6:39 pm
I've run a prediction budget model on Stanford's Alpaca, now a part of Open Assistant, and I've got it to indicate that an individual could run this on an ongoing basis by scaling all the music AI offered on the platform based on usage and on the subscription model (also developed by AI).

Because Stanford excels at business historically, it was kind of neat to hear their statistics reputation of Stanford in the business formulations.
Yeah I’ve always wanted to do all the work for a word salad talking dreamer.
Thanks, you can get to cleaning while I go on the modular.

Post

DJ Warmonger wrote: Wed May 03, 2023 8:57 am Hmmhmmhmm. I'm torn between "That's exactly what I want to do" and "I hate SaaS and cloud-based business, as it's another form of capitalism".

Care to explain what kind of product / business model you have in mind, precisely?
A DAW is essentially a collection of audio processors.

So we would probably toss a coin as to what we feel should be the most important first section of the DAW.

The Logic Pro manual features three essential sections:

1. User guide
2. Instruments
3. Effects

My vote would go to the arrangement window first. The multitrack interface.

If we were asking AI itself to come along for the ride, I imagine one of our first important questions would be "What are the best first-step questions to ask an AI in this use-case scenario?"

As we got to know each other, one person may have a proclivity, a bent towards "timbre", but timbre specifically in the sense of synthesizer morphology.

Like the Alchemy synth.
...business model you have in mind, precisely?
Admittedly, I got FaaS from asking AI itself.

And to your comment about being "mixed" on the 2 philosophies, I personally keep in mind the roots of A.I. itself.

I B M... Built for those who have money to make even more.

Philosophically speaking, the best AI product would be one that grows exponentially in success and is free to use and free to run.

A tall order.

Personally, I am a member of the Spectrum and my whole life, money has come to me through the eye of a needle.

I would like a job at this.

But I have biases towards mycology/T. McKenna, singularitynet.io and blockchain, respectively.

If I am going to be honest.

Additionally, I like "modularity".... like Berkeley's "Caffe" AI.

Modularity as a motto even.

It resembles neural networks and even the fact that (sound)waves have nodes.


Gotta answer another reply in this thread.

I hope this helps.


Sincerely,

KV

Post

EmRysRa wrote: Wed May 03, 2023 9:20 pm Does it have no database? Ooooo :o so gets the information out of his pocket :hyper: :tu:
Databases are good thinking.

Vector databases are being built to augment the limitations of one-shot MMLU results in ChatGPT-style rubrics.
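For what it's worth, a vector store at its core is just embeddings plus nearest-neighbour search. Here is a minimal sketch of that idea, assuming the sentence-transformers package and plain numpy for the similarity search (a real vector database would replace the numpy part); the snippets and the query are made up for illustration:

```python
# Rough sketch of retrieval augmentation: embed some docs, embed the query,
# pull the closest snippets and paste them into the prompt.
# Assumes the sentence-transformers package; snippets and query are illustrative only.
import numpy as np
from sentence_transformers import SentenceTransformer

snippets = [
    "Alchemy is an additive/spectral/granular synth bundled with Logic Pro.",
    "A DAW arrangement window shows tracks on a horizontal timeline.",
    "Sidechain compression ducks one signal using the level of another.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(snippets, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k snippets whose embeddings are closest to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q            # cosine similarity (vectors are normalized)
    best = np.argsort(scores)[::-1][:k]
    return [snippets[i] for i in best]

context = "\n".join(retrieve("How do I lay out tracks in the arrangement view?"))
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
print(prompt)
```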

But either a combo of Auto-GPT/BabyAGI loops and/or functions across multiple Hugging Face models (transformers, generative models, or the best-suited large language models), OR ONE well-understood neural network, may be the right foot to lead with.

On chatGPT (Generative Pretrained Transformer, a neural network) there are already great recent suggestions for prompt engineering.

Enter "Smart GPT" https://www.youtube.com/watch?v=wVzuvf9D9BU
Last edited by kivadour on Mon May 08, 2023 7:28 pm, edited 2 times in total.

Post

No_Use wrote: Wed May 03, 2023 10:05 am
kivadour wrote: Wed May 03, 2023 2:21 am It isn't necessary to know "everything".

With chatGPT today, one can teach themselves by simply knowing good prompt engineering.
In the Reaper forum, more and more scripts pop up 'written' by ChatGPT (asked for by people who are not into scripting/programming), and most of the time the scripts are ridiculous: ChatGPT makes up functions which simply don't exist, mixes up programming languages, and so on.
I believe this, because of the way people look at AI itself... and also because of resource management.

Even if your local machine has achieved an amenable collection of offline AI agents to help you release AI tools... replete with awesome GitHub repo documentation... if I am not thinking about what I'm thinking... like expressing myself with 5 words instead of twelve... then I'm down a rabbit hole.

This is why it's important for me to keep certain philosophies in check, like the old Texas A&M slogan

"Uniting the right brain dreamers with the left brain doers".

In order for an optimal "snap-on-tools" moment to happen with a novel music AI tool, a premeditated "online AI music DAW/website" would suit this use case best.

THEN you'd benevolently offer the offline version of your AI music DAW, but without the bells and whistles of the [name brand goes here] Online Music AI DAW universe.

I HAVE to assume that Apple, Avid, Ableton etc. are already about to release this... however the question remains, "How good is it for the songwriter in the DAW, really?"



-KV

Post

kivadour wrote: Mon May 08, 2023 6:22 pm Because Stanford excels at business historically, it was kind of neat to hear their statistics reputation of Stanford in the business formulations.
It's a shame it doesn't excel at proper use of human language.
my other modular synth is a bugbrand

Post

whyterabbyt wrote: Mon May 08, 2023 7:40 pm
kivadour wrote: Mon May 08, 2023 6:22 pm Because Stanford excels at business historically, it was kind of neat to hear their statistics reputation of Stanford in the business formulations.
It's a shame it doesn't excel at proper use of human language.
Just need to connectify the databole to the computipode

Are you safe?
"For now… a bit like a fish on the floor"
https://tidal.com/artist/33798849

Post

I think chatGPT1-4 is supposed to be a better "conversational AI" than in that case.

Post

kivadour wrote: Mon May 08, 2023 8:01 pm I think chatGPT1-4 is supposed to be a better "conversational AI" than in that case.

Post

wtf

Post

Some random thoughts...

Serverless / FaaS needs some proper thinking about the overall architecture.

I guess it's OK if the AI aspect sits in a serverless environment. Give it a prompt, it thinks for a while and it gives back an answer.
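That part maps naturally onto a single stateless function: prompt in, answer out. A minimal sketch of what I mean, using an AWS-Lambda-style handler; generate_answer() is a hypothetical stand-in for whatever model call you would actually make:

```python
# Minimal serverless sketch: one prompt in, one answer out, no state kept between calls.
# handler() follows the AWS Lambda Python signature; generate_answer() is a stand-in
# for whatever model inference you would actually call.
import json

def generate_answer(prompt: str) -> str:
    # Placeholder: call your hosted or local model here.
    return f"(model output for: {prompt})"

def handler(event, context):
    body = json.loads(event.get("body") or "{}")
    prompt = body.get("prompt", "")
    answer = generate_answer(prompt)
    return {
        "statusCode": 200,
        "body": json.dumps({"answer": answer}),
    }
```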

But the data flow of a traditional DAW, it's bonkers to put that on the cloud as nano-services. Here's why.

Audio gets processed in small buffers of anywhere between 32 and 512 samples, depending on how much latency the user is willing to tolerate. Larger buffers are more efficient, but latency is higher as well; smaller buffer sizes carry much more processing overhead. Each full buffer of audio needs to flow through the whole processing chain.

Synth -> zero or many effects -> mixer channel -> effects -> mixer bus -> mastering effects.

Do this for 30 tracks in parallel with 6 components each, on a buffer size of 128 samples at 48 kHz: that's 180 serverless function calls, 375 times per second. Nearly 70,000 calls for one second of audio; on a track of 3:30 that's roughly 14 million calls. And it usually takes a musician a few hundred playbacks before they're even remotely satisfied. Now look up the cloud computing costs.
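Back-of-envelope, in code, so you can plug in your own numbers (the per-call price is a made-up placeholder, not any vendor's actual rate):

```python
# Back-of-envelope: how many serverless calls would a naive per-processor setup need?
tracks = 30            # parallel tracks
components = 6         # processors per track (synth, effects, mixer, ...)
buffer_size = 128      # samples per buffer
sample_rate = 48_000   # Hz
track_seconds = 3 * 60 + 30

calls_per_buffer = tracks * components          # 180
buffers_per_second = sample_rate / buffer_size  # 375
calls_per_second = calls_per_buffer * buffers_per_second   # 67,500
calls_per_playback = calls_per_second * track_seconds      # ~14.2 million

price_per_million = 0.20   # made-up placeholder, USD per million invocations
print(f"{calls_per_second:,.0f} calls/s, {calls_per_playback:,.0f} calls per playback")
print(f"~${calls_per_playback / 1e6 * price_per_million:.2f} per playback, before compute time")
```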

Also, musicians are rather protective of their work. What if the platform they used to create things goes bankrupt? They lose the projects they were working on, and even with a backup they cannot continue. Vendor lock-in. The market hates it.
We are the KVR collective. Resistance is futile. You will be assimilated.
My MusicCalc is served over https!!

Post

whyterabbyt wrote: Mon May 08, 2023 7:40 pm
kivadour wrote: Mon May 08, 2023 6:22 pm Because Stanford excels at business historically, it was kind of neat to hear their statistics reputation of Stanford in the business formulations.
It's a shame it doesn't excel at proper use of human language.
Please get ChatGPT to rephrase these posts, as I can hardly understand WTF your point is in all of this.

BTW, recent news shows that local open-source models are catching up with ChatGPT 4 (which isn't even released to a wider audience yet) for a fraction of the cost, and can run locally on a decent GPU.
https://www.semianalysis.com/p/google-w ... Wx5kFbg0ZE
Think about it: GPUs are in general underutilized by DAW users. And for music specifically, we may do well with an even smaller specialized model, trained on a curated data set.
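For illustration, loading a small open model onto a local GPU is already only a few lines with the Hugging Face transformers package; the checkpoint name here is just an example, swap in whatever small open model you prefer:

```python
# Rough sketch: run a small open LLM on the local GPU with Hugging Face transformers.
# The checkpoint name is only an example; substitute any small open model you like.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "openlm-research/open_llama_3b"   # example only
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=dtype).to(device)

prompt = "Suggest a chord progression for a synthwave track in C minor:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```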
Blog ------------- YouTube channel
Tricky-Loops wrote: (...)someone like Armin van Buuren who claims to make a track in half an hour and all his songs sound somewhat boring(...)

Post

EmRysRa wrote: Wed May 03, 2023 9:58 am AI music... hihihiii, AI will never be able to sing like a human and will never compose good music. Also, AI audio coding is rather funny, full of wrong code.
Have you heard this one?
https://youtu.be/m04LOZ3ps9A :lol: :lol:

Post

It seems linear, here it is more obvious https://www.youtube.com/watch?v=JwISRB9bouQ
Creator of EmRysRa and sound designer at SaschArt and BWP

Post

An important premise: I'm not interested in anything remotely close to an AI DAW or software as a service.

With that said, I don't want to be *that* person, but there's something I can't really work out about this thread. You said
kivadour wrote: Wed May 03, 2023 2:21 am I'm looking to shift the music producer experience from "think like a computer" in your DAW... to "dream in color."
Do you already have any idea of how the user will interact with the software, or is it just a vision? Would you keep some elements of current DAWs, or are you thinking about a completely different paradigm? How will the user provide instructions to the DAW so the AI can execute them? Will the user be able to "do it the old way" ("thinking like a computer"), bypassing the AI when it doesn't cut it? If so, how will you keep it "user friendly"? Conversely, if you don't allow bypassing the AI, how will you enable the user to make their ideas a reality when the AI doesn't output what's expected?

These are just some random questions I came up with in 5 minutes...


From what I can gather from this thread (maybe there is more information you didn't share?), the project seems to be at the brainstorming stage, not ready even for the initial requirements/functional analysis... I hope I'm not being too hard, but at this stage "dream in color" and "AI" seem like just marketing words to me; I can't even remotely envision a tool/product...


I'm not interested at all in such a product, I'm just trying to provide some (hopefully) constructive criticism...



About "AI writing code"... let's say it produces some working code (it doesn't seems to be always the case from what I gather online, even on simple things a programmer would write without any trouble), but can you really trust it does what it needs to to? does it handles correctly any special case? and what about any evolution of the code? how about the software architecture? who grants you the code suggested by AI won't put you in a dead end down the line (or even just a situation where any evolution of the product is expensive)? will the code perform as good as the code written by a human being (you may reach the same results in various ways and some will be faster and a good programmer cares about performances)? Of course a human is not an insurance for these things to be handled properly, but they usually do a decent job once they have some experience.

To sum it up, do you feel confident having AI write really complex software like a DAW, where you / your staff have no real control over the code? How will you succeed in this task (especially given the current state of AI when it comes to coding)?


By the way, I'm a Java programmer (I started programming in C as a student) and I work in software maintenance (so my expertise is on "things going wrong")...


Just my two cents, of course!
free multisamples (last upd: 22nd May 2021).
-------------------------
I vote with my wallet.

Post

I've been thinking about possibilities and I envisioned this:

Autonomous AI agent tightly integrated with DAW

- Takes user input as a task: "Create a 5-minute synthwave track in C minor"
- Proceeds to execute the task autonomously, while still allowing user intervention or interruption, like this: https://agentgpt.reworkd.ai/pl
- Uses AudioGPT to analyze the generated audio: https://github.com/AIGC-Audio/AudioGPT and to understand features described in natural language ("make this bass darker and fatter").
- Might need a dedicated LLM / LoRA tuned specifically to music production, trained over a large dataset of audio, including audio snippets, one-shots, presets (?).
- Not sure how to train it on using synthesizers, though. We could stick to "preset only" for now?
- Tight DAW integration (or just a brand new DAW) gives the agent access to all controllable parameters of plugins, and to all the features of the DAW in general.
! Using a commercial LLM (ChatGPT) is not free, and might actually get quite pricey if the workload is high. An alternative is to use a smaller LLM running on the user's local GPU, which is an otherwise wasted resource for music production anyway.

This seems doable with the current state of knowledge and technology (a rough sketch of the agent loop follows below), but certainly not something a single person could create in their basement.
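Here is the kind of loop I have in mind, sketched in Python. Everything DAW-side (Project, add_track, the action names) and plan_next_step() are hypothetical stand-ins; a real integration would bind them to an actual DAW API and an LLM:

```python
# Sketch of the agent loop: plan a step, act on the DAW, inspect the result, repeat.
# All DAW-side functions (add_track, ...) and plan_next_step() are hypothetical;
# a real integration would bind them to an actual DAW API and an LLM.
from dataclasses import dataclass, field

@dataclass
class Project:
    tracks: list = field(default_factory=list)
    log: list = field(default_factory=list)

def plan_next_step(task: str, project: Project) -> dict:
    """Stand-in for the LLM: returns the next action as a small dict."""
    if not project.tracks:
        return {"action": "add_track", "name": "bass", "preset": "dark analog bass"}
    return {"action": "done"}

def add_track(project: Project, name: str, preset: str) -> None:
    project.tracks.append({"name": name, "preset": preset})
    project.log.append(f"added track {name} with preset '{preset}'")

def run_agent(task: str, max_steps: int = 20) -> Project:
    project = Project()
    for _ in range(max_steps):
        step = plan_next_step(task, project)
        if step["action"] == "done":
            break
        if step["action"] == "add_track":
            add_track(project, step["name"], step["preset"])
        # ...more actions: set_parameter, render, analyze with AudioGPT, etc.
    return project

print(run_agent("Create a 5-minute synthwave track in C minor").log)
```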
Blog ------------- YouTube channel
Tricky-Loops wrote: (...)someone like Armin van Buuren who claims to make a track in half an hour and all his songs sound somewhat boring(...)
