What are your methods for transcribing individual instruments from mixed audio?

KVRist
92 posts since 26 Mar, 2017

Post Mon Oct 18, 2021 11:32 am

There's been some discussion about this in other threads, so here's a dedicated topic.


When you have to transcribe separate instruments from mixed audio (stereo or mono audio to MIDI or staff), how do you [try to] ensure that the pitches and notes you are writing down correspond to what is actually being played/sequenced?



To what extent is that kind of transcription even possible for a complex spectrum, like symphonic orchestral timbres or dense electronic mixes?

How do you deal with problematic situations like:
- the risk of mistaking upper bass partials for fundamentals of treble sounds?
- loud rhythmic delays which are mixed with played notes?
- tracing the changes in timbre (filter/wavetable/phaser/flanger etc.)?

Do you use any software tools for demixing / audio to MIDI?

Do you recall any particular examples (individual tracks or entire genres) that are very hard to transcribe?


Bonus questions: :)
In case of complex orchestral works, do you even try to transcribe yourself if the score is available on IMSLP (or, for recent/copyrighted ones, via $$ purchase of the official score)?
Have you ever wished that original DAW project files (and/or MIDI, plugin presets etc.) could be bought alongside mp3s, WAVs etc. at stores like Beatport?

addled muppet weed
86198 posts since 26 Jan, 2003 from through the looking glass

Post Mon Oct 18, 2021 11:41 am

N__K wrote:
Mon Oct 18, 2021 11:32 am

Do you recall any particular examples (individual tracks or entire genres) that are very hard to transcribe?
i hate transcription, with a passion, takes forever!
mainly because im crap at it.

im reasonably ok (slow) with clean piano recordings or guitar, other instruments not a chance. hardest i ever tried personally, was ozric tentacles, ed, is a f**king wizard with snakes for fingers, and also uses a lot of delays and such, so can get a bit, was that a note or a delay?
more often than not, i ignore the scription part and just learn to play, rather than write it down.
still takes a while, because i didnt practice enough as a teen, because we had a friend, who could hear a piece once, and write it out for us :shrug:

KVRAF
1786 posts since 14 Sep, 2004 from $HOME

Post Mon Oct 18, 2021 11:54 am

I’ll call my people and tell them I need it tomorrow…

KVRist
232 posts since 4 Aug, 2020 from Montreal, Canada

Post Mon Oct 18, 2021 4:39 pm

Funny enough, I had to do this once, on my own stuff... T'was a mid-project decision that I wanted to hire a real horns musician. He very much preferred having a sheet of paper, but I didn't have a MIDI track of my thing (the fake brass section was key'ed on the fly and recorded as audio on the way in)

So I ended up spending half an hour discerning the exact notes and voicings I had just fiddled with a week ago myself, before drawing them on sheets of paper... Nothing fancy content-wise, really. 30~40 vertical moments at most, easy chords, straight rhythm, three and a half motives, three pages of letter paper but hey there were four parts. It was honestly full of tension to listen and compare many passes to make sure the transcription was accurate enough. Oh, and if it's not obvious enough, I was clever enough to listen to the solo of that stereo track, and have a rough recall of the virtual instrument patch for the sake of comparison.

I'd not do this again :)

KVRAF
23311 posts since 20 Oct, 2007 from not here

Post Mon Oct 18, 2021 6:23 pm

I don't have tips 'n tricks regarding transcribing, I can rely on my ear and am trained to know what things are away from any instrument. When I started out I figured the first thing I had to do was be able to replicate this thing I was stumped on at first via just singing it (this way my body would know, it wasn't abstract). I understood that I didn't grok how it appeared to change tonic (no, I did not have the lingo at 13 like that, or know anything) from verse to this bridge riff (Proud Mary by Creedence Clearwater Revival. It goes te te te sol, te te te sol, te te te sol, fa me, me me me me, do. It turns around back to do from sol.) So I did the thing and had a much more solid aural picture of things behind it. This was huge.

Now, some things I could not sing at all, obviously, too many notes or leaps or too much movement. These were guitar solo kind of objects and idiomatic so the shapes of the fretboard were key to knowing what it was (also I watched everybody I could). But, the m.o. here was replicate notes by the voice to whatever extent and on the fretboard, the latter being a pretty strong clue (w. Proud Mary, I didn't actually play the guitar at the time, which is fortunate because I'd have had that crutch and I really did sort myself finally. I coulda guessed the notes, but I didn't play.).

But the presentation here seems much more interested in privileging sound design terms over music, as a comfort zone: "complex spectrum, like symphonic orchestral timbres or dense electronic mixes"
problematic situations like:
- the risk of mistaking upper bass partials for fundamentals of treble sounds?
I don't even recognize that one, and tbh doesn't pass my smell test.

I have to say one who has gotten to know the instruments of an orchestra won't tend to feel so lost as that.
- tracing the changes in timbre
(filter/wavetable/phaser/flanger etc.)?
What? The best I can do with this is, one of the early things I wanted to pick up from the guitar seat was Hendrix, Foxy Lady. I never quite got that beginning with the G vibrating on the 12th fret going into feedback, up in Reliable Music uptown, yo. Not for lack of trying.
(At the time nobody knew what a flanger was. I asked the guy at Reliable what that airplane taking off deal was on those records and he tried to sell me a phase shifter. It didn't sound at all right. :lol:)

We're writing flanger settings down 'by ear' are we? Transcribing, sure. :scared:
N__K wrote:
Mon Oct 18, 2021 11:32 am
Have you ever wished that original DAW project files (and/or MIDI...
I have projects of my own, who has time for that.

When I adapted these 6 or 7 Satie things, I didn't bother, I don't need the practice. I went to IMSLP like any schmuck and downloaded pdfs and MIDIs. This one was from a straight notation MIDI, hard quantized:
https://y2u.be/sGdKS769C8U
I had the tune by ear pretty much at once (sol te la sol FI, FI; sol te la sol DO), I'd not heard it before, but I heard a reggae scritch on this two-step and got to work making the MIDI swing a bit. I was so buzzed I forgot to play most of the written tune in the "B" section but ultimately who gives a f**k anyway. It's a take.
Last edited by jancivil on Mon Oct 18, 2021 6:50 pm, edited 1 time in total.

KVRAF
23311 posts since 20 Oct, 2007 from not here

Post Mon Oct 18, 2021 6:48 pm

No, the MIDI comes from https://www.kunstderfuge.com/satie.htm
some of it is "live" ie, not hard quantized, and they're copyrighting those ones.

the one more challenging thing I remember transcribing ALL of was Brazieal and I did was a fair patch of Fred Steiner's music for Rocky and Bullwinkle for his band Cartoon.

was it accurate? y'all tell me:
Cartoon live at KPFA, Rocky & Bullwinkle
https://youtu.be/IUTqqotOtvk?t=111
it wasn't the tune that was tricky, it was the harmonies on the upbeats. You have to know one or two things to get that.
big fun

KVRist
341 posts since 18 Jun, 2010

Post Mon Oct 18, 2021 7:46 pm

Ear training - being able to identify intervals/chords/harmonic progressions, based on their sound - is the biggest aspect of this. But, a few things that I find helpful:

- Looping the audio (and transcribing it) one measure at a time

- Timestretching things to far slower than the original tempo (Ableton's Complex Pro algorithm is great for this, and YouTube's little-known, built-in 25/50/75/125/150 timestretch settings can be surprisingly good)

- For lower-register things, transposing the pitch up (while retaining the original time) can be really helpful (again, Ableton FTW)

KVRAF
1887 posts since 2 Jul, 2010

Post Tue Oct 19, 2021 3:33 am

By ear, with some singing/humming, maybe a piano if checking a whole chord or something. Gets easier with practice.

It is slow and difficult, but also educational. It's not such a waste of time if you are doing some of the intended analysis/learning in the process.

The wrong instrument/octave problem should be fairly easy to notice/correct later in practice. But certainly I would have trouble with novel band arrangements involving unknown instruments and effects; you do have to know what parts you are separating.

KVRist

Topic Starter

92 posts since 26 Mar, 2017

Post Tue Oct 19, 2021 11:52 am

Thanks to everyone for replies so far!



jancivil wrote:
Mon Oct 18, 2021 6:23 pm
N__K wrote: problematic situations like:
- the risk of mistaking upper bass partials for fundamentals of treble sounds?
I don't even recognize that one, and tbh doesn't pass my smell test.

I have to say one who has gotten to know the instruments of an orchestra won't tend to feel so lost as that.
Sorry, I might have phrased that in a confusing way.

What I mean is a situation where partials of lower notes mask the fundamentals (and/or other partials) of higher notes, especially when the lower sound is harmonically rich and the higher one is not.


For example, a loud sustained note on a cello one octave below Middle C, and a quiet sustained flute note at G above Middle C. The flute's fundamental (about 392 Hz) lands almost exactly on the cello's third partial, so if the flute (magenta) is not loud enough in the mix, it will be masked by the cello (yellow):

cello C2 (color marked).png
cello C2 + flute G3 (color marked).PNG


A more complex example with one cello note and two flute notes:

cello C2 + flute G3 and Bb4 (color marked).PNG


Those are contrived examples, admittedly, but such situations do happen.

In my experience, it can be a problem especially in transcribing electronic music when:
1) the bass has a harmonic spectrum like a saw wave (possibly with distortion, filling the treble range further), and
2) treble instruments are mixed so that not all notes stand out from the bass

In that case all instruments do contribute to the general timbre, but it can be very hard (or outright impossible) to figure out whether a particular frequency peak belongs to the bass instrument or is one of the first partials of some treble instrument.

And of course it gets more complicated if sounds do not follow the harmonic series in a recognizable enough way.
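The cello/flute case can be checked with a few lines of arithmetic. This is only a sketch under stated assumptions: equal temperament with A4 = 440 Hz, ideal partials at exact integer multiples of the fundamental, and scientific pitch names where Middle C is C4 (so the cello note is C3 and the flute note is G4; the attachment names appear to follow the other naming convention). The function names and the 10-cent tolerance are illustrative choices, not anything from a particular tool.

```python
import math

A4_HZ = 440.0  # assumed tuning reference


def midi_to_hz(midi_note):
    """Equal-tempered frequency of a MIDI note number (69 = A4)."""
    return A4_HZ * 2.0 ** ((midi_note - 69) / 12.0)


def colliding_partials(low_hz, high_hz, max_partial=16, tolerance_cents=10.0):
    """List (low_partial, high_partial, cents) where two partials nearly coincide."""
    hits = []
    for m in range(1, max_partial + 1):
        for n in range(1, max_partial + 1):
            # Distance in cents between partial m of the low note
            # and partial n of the high note.
            cents = 1200.0 * math.log2((m * low_hz) / (n * high_hz))
            if abs(cents) <= tolerance_cents:
                hits.append((m, n, round(cents, 2)))
    return hits


cello = midi_to_hz(48)  # C3, one octave below Middle C
flute = midi_to_hz(67)  # G4, the G above Middle C

for m, n, cents in colliding_partials(cello, flute):
    print(f"cello partial {m} vs flute partial {n}: {cents:+.2f} cents apart")
```

Every flute partial found this way sits on top of a cello partial, which is why a spectrum peak alone cannot say which instrument produced it. Real instruments are slightly inharmonic, so the actual offsets will differ a little.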

KVRist

Topic Starter

92 posts since 26 Mar, 2017

Post Tue Oct 19, 2021 12:04 pm

Here are my current methods:

- select/synthesize similar sounds (timbre, envelopes, etc.) and compare by ear

- in case of heavily effected (delay, phaser etc.) sounds, try to find similar effects and settings

- do momentary A/B comparison (via solo/mute buttons) between the mix and the individual instrument I'm transcribing

...while checking what's heard on a spectrum analyzer with semitone grid (such as this: https://forum.cockos.com/showthread.php?t=258602 ), especially if unsure.
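For reading peaks off an analyzer that has no semitone grid, the same mapping can be done by hand. A minimal sketch, assuming equal temperament at A4 = 440 Hz and scientific pitch names (MIDI 60 = C4); `nearest_note` is a made-up helper name and the peak values are arbitrary examples, not measurements.

```python
import math

A4_HZ = 440.0  # assumed tuning reference
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz):
    """Return (note_name, cents_offset) for the semitone nearest to freq_hz."""
    midi_float = 69.0 + 12.0 * math.log2(freq_hz / A4_HZ)
    midi_note = round(midi_float)
    octave = midi_note // 12 - 1  # scientific pitch: MIDI 60 = C4
    cents = 100.0 * (midi_float - midi_note)
    return f"{NOTE_NAMES[midi_note % 12]}{octave}", round(cents, 1)


# Arbitrary example peaks, as they might be read off a spectrum display.
for peak_hz in (55.0, 261.6, 392.4, 1000.0):
    name, cents = nearest_note(peak_hz)
    print(f"{peak_hz:8.1f} Hz -> {name} ({cents:+.1f} cents)")
```

A large cents offset is a hint that the peak may be a partial of some lower note rather than a played fundamental, since higher harmonics drift away from the equal-tempered grid.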



Sometimes I also use:

- spectral filtering tools in Subtract mode (such as REAPER's ReaFIR) to try isolating the most relevant peaks from the rest of the spectrum

- EQs with notch filters in harmonic series to attenuate partials of specific notes from the spectrum (usually to remove bass). SplineEQ ( https://photosounder.com/splineeq/ ) is good for this due to its transposing control.
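The harmonic-series notch idea can be sketched in plain Python. To be clear about assumptions: this is not how ReaFIR or SplineEQ work internally. It substitutes the standard biquad notch from the widely used Audio EQ Cookbook formulas, and the function names, the Q of 30, and the 55 Hz bass note are illustrative choices. The `semitones` argument stands in for a transposing control.

```python
import cmath
import math


def harmonic_notch_freqs(fundamental_hz, sample_rate_hz, max_partials=32, semitones=0.0):
    """Partial frequencies of one note (optionally transposed), below Nyquist."""
    base = fundamental_hz * 2.0 ** (semitones / 12.0)
    nyquist = sample_rate_hz / 2.0
    freqs = []
    for k in range(1, max_partials + 1):
        if k * base >= nyquist:
            break
        freqs.append(k * base)
    return freqs


def notch_coeffs(center_hz, sample_rate_hz, q=30.0):
    """Biquad notch coefficients (b, a), normalised so that a[0] == 1."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def gain_at(b, a, freq_hz, sample_rate_hz):
    """Magnitude response of one biquad at freq_hz."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)


SAMPLE_RATE = 44100.0
bass_partials = harmonic_notch_freqs(55.0, SAMPLE_RATE, max_partials=8)  # A1 bass note
filters = [notch_coeffs(f, SAMPLE_RATE) for f in bass_partials]


def cascade_gain(freq_hz):
    """Combined gain of all notches in series at freq_hz."""
    gain = 1.0
    for b, a in filters:
        gain *= gain_at(b, a, freq_hz, SAMPLE_RATE)
    return gain


print(bass_partials)
print(cascade_gain(110.0), cascade_gain(392.0))
```

Comparing the cascade gain on a bass partial with the gain at a treble note of interest shows how much of that note survives the notching. Narrower notches (higher Q) spare more of the neighbours but tolerate less pitch drift in the bass.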




In sequenced genres with sample-based drums and mix-in intros (with only drums and bass playing), I sometimes take a loop from the intro, invert its polarity and play it alongside other sections of the track. That results in phase-cancelling bass and/or drums from the mix, making other instruments easier to discern.

I also use the "phase-cancelling trick" in cases when near-identical sections of music exist with different instruments present. This obviously works only for precisely sequenced electronic music (without randomized values in synthesis) or sections with the exact same audio copied from elsewhere in the track.
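A toy version of the polarity-inversion trick, using synthetic sine waves in place of real audio. It is a sketch under ideal assumptions: the reference loop is sample-identical to the bass in the full section and perfectly aligned, which is exactly the condition described above. All names, frequencies and levels are made up for illustration.

```python
import math

SAMPLE_RATE = 8000  # low rate keeps the example small


def sine(freq_hz, n_samples, amplitude=1.0):
    """Generate a sine wave as a list of float samples."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def cancel(section, reference_loop):
    """Add the polarity-inverted reference to the section, sample by sample."""
    return [s + (-r) for s, r in zip(section, reference_loop)]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(x) for x in samples)


n = 800
bass = sine(55.0, n, amplitude=0.8)   # stands in for the loop taken from the intro
lead = sine(392.0, n, amplitude=0.1)  # quiet part that only appears later
full_section = [b + l for b, l in zip(bass, lead)]

residual = cancel(full_section, bass)
print("peak before:", peak(full_section))
print("peak after: ", peak(residual))
```

After cancellation only the lead remains, up to floating-point rounding. With real recordings, anything that differs between the two sections (randomized synthesis, time-varying effects, lossy encoding) stays in the residual.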




In case of fast sequences, I sometimes split the audio into 8th or 16th notes. This can also help FFT analysis to "grab onto" actual frequencies better, but not always.

I also speed up or slow down the audio - sometimes with timestretching (maintaining pitch), sometimes without, as timestretching may create artifacts. Speeding up with a corresponding change in pitch works for material with a dense spectrum in the bass, effectively pushing the bass into the treble range, where FFT bins are narrow relative to a semitone and neighbouring notes are easier to separate. Slowing down similarly brings the treble down into the bass range, where it may (or may not) help in detecting pitches by ear.
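The reason speeding up helps the analyzer is simple arithmetic: FFT bins have a fixed width in Hz, while a semitone gets wider in Hz as pitch rises. A small sketch, assuming a 44.1 kHz sample rate, a 4096-point FFT and a 55 Hz bass note; these are example settings, not values taken from any particular analyzer.

```python
def bins_per_semitone(freq_hz, sample_rate_hz, fft_size):
    """How many FFT bins fit between freq_hz and the semitone above it."""
    bin_width_hz = sample_rate_hz / fft_size
    semitone_width_hz = freq_hz * (2.0 ** (1.0 / 12.0) - 1.0)
    return semitone_width_hz / bin_width_hz


SAMPLE_RATE = 44100.0
FFT_SIZE = 4096  # assumed analyzer setting

for speed in (1.0, 2.0, 4.0):
    shifted_hz = 55.0 * speed  # where the bass note lands after varispeed playback
    ratio = bins_per_semitone(shifted_hz, SAMPLE_RATE, FFT_SIZE)
    print(f"{speed:.0f}x speed: {shifted_hz:6.1f} Hz, {ratio:.2f} bins per semitone")
```

Below one bin per semitone, adjacent notes fall into the same bin and cannot be told apart; each doubling of playback speed doubles the figure. The cost is that the sped-up audio is shorter, so each analysis window covers more musical time.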

I never expect 100% success in transcription, but - depending on material - oftentimes get close enough to get a decent idea of what's playing.

I'm vaguely aware that nowadays there are AI-based demixing and transcription tools. Do tell if you know of such tools and how to use them :)
Last edited by N__K on Tue Oct 19, 2021 1:10 pm, edited 1 time in total.

Boss Lovin' DR
9220 posts since 15 Mar, 2002 from the grimness of yorkshire

Post Tue Oct 19, 2021 12:51 pm

N__K wrote:
Tue Oct 19, 2021 12:04 pm


I'm vaguely aware that nowadays there are AI-based demixing and transcription tools. Do tell if you know of such tools and how to use them :)
I know knack all about music theory, but on the de-mixing side this is free and seems to do a job (Windows only), if a bit rough and warbly. Only messed about with it out of curiosity so I'm sure there are tricks you can use to get better results etc.

https://makenweb.com/SpleeterGUI

KVRAF
23311 posts since 20 Oct, 2007 from not here

Post Tue Oct 19, 2021 4:06 pm

"AI-based demixing and transcription tools."
suffice to say: :roll:

There's a forum here for general bs, Everything Else Music Related, btw
Last edited by jancivil on Tue Oct 19, 2021 5:02 pm, edited 1 time in total.

KVRAF
23311 posts since 20 Oct, 2007 from not here

Post Tue Oct 19, 2021 4:28 pm

I used Cubase's audio-to-MIDI once. Not because I needed to know what a thing was, but because I figured it would seriously screw with the dense polyphonic texture, being built pretty strictly to sort monophonic tunes. It was fun, it gave me something I wouldn't have done, it wasn't predictable, but it retained integrity with the object and gave me brand new material. I broke its function but good.

oh sorry, it's F# that begins Foxy Lady. D string, 16th fret. my memory is shot, false memories conflicting. Checking this for fact, I found JH didn't get the feedback at Miami Pop 1968 at all either.

KVRAF
23311 posts since 20 Oct, 2007 from not here

Post Tue Oct 19, 2021 4:56 pm

now THIS is a transcription job:

https://youtu.be/uncRyVo2iFg

if I'da wanted that there bassline the score (basically a lead sheet) I bought from ZFT doesn't have it, just root notes and chord symbols. let alone the solos

in pdf: https://www.dropbox.com/s/57svmrh7cdgwb ... e.pdf?dl=0

Rad Grandad
34946 posts since 6 Sep, 2003 from Downeast Maine

Post Tue Oct 19, 2021 5:29 pm

jancivil wrote:
Tue Oct 19, 2021 4:06 pm
"AI-based demixing and transcription tools."
suffice to say: :roll:

There's a forum here for general bs, Everything Else Music Related, btw
Welcome to the KVR music theory forum!

This forum was created for 2 reasons:

1: As a response to the many hundreds of requests for theoretical advice and information that the KVR community receives each month. It was thought that creating a dedicated forum to deal with these questions, and the rather lengthy responses that they can sometimes elicit, would make it easier for people to find the information that they want.

2. As a place for more general discussions about the nature of music as a whole. Not just harmony and melody and rhythm, but psychoacoustics, ethnomusicology, musical history, audio history, or any other musico-philosophical issue.

So that means that this is the forum to come to if you have or are interested in:

Questions regarding scales, intervals, chords, and pitch structures generally (tonal and atonal).

Questions regarding time signatures, counting patterns, and metrical structure.

Questions regarding the creation and harmonization of a melody.

Questions regarding counterpoint.

Questions regarding polyrhythms and polymetrical techniques generally.

Questions about alternative tuning systems (Just intonation, Mean tone, etc.) and their use in electronic music.

Questions about acoustics: room acoustics, Helmholtz bass traps, room equalization, etc.

Questions about the relationship between music and the field of sound as a whole, from acoustically pure concord to noise.

Questions about the underlying principles of any music or musical tradition. (e.g. raga and tala in Indian classical music, slendro and pelog tunings in Balinese music, the standard waveforms of analog oscillators and their properties, etc.)

Questions about the use of plugins that use and apply theoretical concepts.
viewtopic.php?p=2287683#p2287683
Words are a barrier to help-seeking and a motivator for making discrimination acceptable.
Mad Pride
If there is a direction to mankind, it ought to be a coming together Brian May
