OK, here's my situation. For some unpredictable life reasons, every time I really get on with someone musically, they move half a continent away.
Found this thread here on KVR: viewtopic.php?t=501875
There were some amazing tips: Dante over VPN, Jimjam, Ohm Studio. Out of these I'm tempted to try Ohm Studio, but I'm afraid that the "new DAW" learning curve might be quite demotivating. I'd like to continue that topic with a fresh start here, though.
See ... the other day I was messing around with OBS. I use it to occasionally stream a video game or to record a little tutorial here and there. I've even found a way to mix a mic and the output of my DAW and feed it into OBS (either externally in the analog domain or with a fancy soundcard). I can do a 1:1 stream of Ableton, but Twitch transcoding adds huge latency, and it would be a rather one-sided collab.
I'm not really after real-time jamming. That's impossible over the internet. But I'd love to be able to open a synth and nerd out about it the same way we would in front of one screen. For that, I think anything up to half a second of delay might work. And if OBS + Twitch transcoding server + whatever CDN they use + client decoder can do 3 seconds of end-to-end delay (tested while streaming a game to one of those friends), then if I cut out the middleman somehow, it might be technically possible to get close to that 500 ms.
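To make that reasoning a bit more concrete, here's a rough back-of-the-envelope sketch in Python. Only the 3-second total and the 500 ms target come from my test above. The per-stage split, the stage names and the "network_hop" figure are pure guesses I made up for illustration, so swap in real measurements if you have them.

```python
# Latency budget sketch. Only the 3000 ms total and the 500 ms target are
# from the post; every per-stage number below is a hypothetical placeholder.
TARGET_MS = 500


def total_latency_ms(stages):
    """Sum the delay contributed by each stage of the chain."""
    return sum(stages.values())


def over_budget_ms(stages, target_ms=TARGET_MS):
    """Positive result means the chain is slower than the target."""
    return total_latency_ms(stages) - target_ms


# Guessed breakdown of the OBS -> Twitch -> viewer chain (adds up to 3000 ms).
streaming_path = {
    "capture_and_encode": 150,
    "transcoding_server": 1500,
    "cdn_delivery": 1000,
    "client_buffer_and_decode": 350,
}

# Same chain with the middleman removed and a guessed direct network hop.
MIDDLEMAN = ("transcoding_server", "cdn_delivery")
direct_path = {k: v for k, v in streaming_path.items() if k not in MIDDLEMAN}
direct_path["network_hop"] = 50  # hypothetical figure

if __name__ == "__main__":
    for name, path in (("streaming", streaming_path), ("direct", direct_path)):
        print(name, total_latency_ms(path), "ms total,",
              over_budget_ms(path), "ms over target")
```

With these made-up numbers, dropping the transcoder and the CDN gets the chain into the neighbourhood of the target, which is all I'm really hoping for. The real split could look very different.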
What I really need for this to work is pretty much a remote access app with voice chat that I could route my mixed signal into. So I did some digging, pinpointed four candidates and went testing: TeamViewer, Zoom.us, Parsec and Chrome Remote Desktop. The first two were slower, but feature-rich. Parsec and Chrome Remote Desktop had far fewer features, but I liked how promising the speed was. The problem was, oddly enough, the audio, not so much the video. Parsec and Chrome Remote Desktop have no voice chat functionality at all. TeamViewer and Zoom.us do, but there is a massive glitch there. Both of these platforms apply heavy echo-rejection algorithms to the voice chat, and those just screw the audio over big time. If the algorithm doesn't think the signal is a voice, it won't pass it through, so even a metronome tick was extremely confusing for it.
So ... when I get back to this, I will try a Discord (for audio) + Parsec/Chrome Remote Desktop combo, because that's the only idea I have left. I'm afraid that two different delays, one for audio and one for video, might be uncomfortable at best. But I thought I'd ask you guys. There must be someone in this wonderful community who has been solving the same problem and doing the same research. Let's brainstorm a bit more.