Is the human voice 'special'


Post

There is a thread in the instruments forum about a hypothetical virtual instrument that could convincingly mimic a human voice.

That thread has lots of different kinds of drama in it. Some of it personality driven, which is something I make an effort to avoid. But some of the contention is more interesting.

Specifically, many people seem apprehensive about the idea of convincingly mimicking a human voice as opposed to convincingly mimicking drummers or violinists or pianists.

I will admit that this surprised me. Not that people would feel that way, but that so many people in this community would feel this way.

Anyway, I thought it would be interesting to explore these issues in a separate thread that is unencumbered by off topic personality issues.

Discuss.

Post

I did think the same as you...on all accounts of both your description of the thread and possible hypocrisy of what could be the pot calling the kettle black - re instruments vs. voice samples. In the Cafe section a couple of weeks ago, there was a "female" vocalist who solicited comments from "her" song, and it ended up, apparently, being a sampled voice that actually fooled a few people (me included). I think as long as you don't copy someone "known" (otherwise, everyone could have John Lennon as their lead vocalist), then making a song from a really good sampled voice would be fine.

Post

I think music with high-quality synthesized voices could fit within its own genre. Each musical genre has its own conventions, and standards of quality. (For example, characters in animated movies often don't sing like humans, yet their performances are quality in that genre.)

Also, there are many sample-based vocal construction kits as in the above example, so it is only a matter of time before a group like Sample Modeling creates a physical-modeling hybrid SWAM engine that you can play with a breath controller and a Linnstrument together, to mimic a singer for music in that genre.

If you have ever tried to get a singer to relax her breath to give an expressive performance, you know the subtle meanings we can detect in a real voice, so I don't think that spontaneity will ever be professionally equaled using synthesis, a controller or automation. But perhaps AI can study millions of syllables and create song interpretations, the way AI now mimics Bach compositions (that sound similar but lack spontaneity and soul).

I asked Siri and she says, "I can't sing."

Post

Bodhisan wrote:I did think the same as you...on all accounts of both your description of the thread and possible hypocrisy of what could be the pot calling the kettle black - re instruments vs. voice samples. In the Cafe section a couple of weeks ago, there was a "female" vocalist who solicited comments from "her" song, and it ended up, apparently, being a sampled voice that actually fooled a few people (me included). I think as long as you don't copy someone "known" (otherwise, everyone could have John Lennon as their lead vocalist), then making a song from a really good sampled voice would be fine.
Just to be clear, I am not expressing an opinion either way. I have never thought about the issue before, and it is deeply complicated.

I am more interested in exploring other people's ideas. They can hash it all out, and then I can decide which of the dead bodies was right afterwards. :hihi:

Post

If you are into this outcome, then not really

https://www.youtube.com/watch?v=JjTV8i_KjXM
This entire forum is wading through predictions, opinions, barely formed thoughts, drama, and whining. If you don't enjoy that, why are you here? :D ShawnG

Post

They probably had the same discussions and concerns when the first synthesised spoken words were created. How foolish they would look now.

The first speech synth used no recorded or processed human speech, and you really had to play it with skill:

https://www.youtube.com/watch?v=0rAyrmm7vv0
The Voder synthesized human speech by imitating the effects of the human vocal tract. The operator could select one of two basic sounds by using a wrist bar. A buzz tone generated by a relaxation oscillator produced the voiced vowels and nasal sounds, with the pitch controlled by a foot pedal. A hissing noise produced by a gas discharge tube created the sibilants (voiceless fricative sounds). These initial sounds were passed through a bank of 10 band pass filters that were selected by keys; their outputs were combined, amplified and fed to a loudspeaker. The filters were controlled by a set of keys and a foot pedal to convert the hisses and tones into vowels, consonants, and inflections. Additional special keys were provided to make the plosive sounds such as "p" or "d", and the affricate sounds of the "j" in "jaw" and the "ch" in "cheese". This was a complex machine to operate. After months of practice, a trained operator could produce recognizable speech.
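For anyone who wants to hear the idea rather than read about it, here is a rough sketch of the signal path described above: a buzz or hiss source, ten band-pass filters switched by "keys", and a summed output. It is only an illustration under loud assumptions. The sample rate, the band centre frequencies, the filter Q, and every function name are made up for the example and are not taken from the real Voder, and the filters are ordinary digital biquads rather than anything the original hardware used.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed for this sketch

# Ten hypothetical band centres in Hz. The quoted description does not give
# the Voder's actual band layout, so these values are placeholders.
BAND_CENTRES = [150, 300, 450, 700, 1000, 1400, 1900, 2400, 3000, 3600]


def buzz(pitch_hz, n):
    """Sawtooth wave standing in for the relaxation oscillator (voiced source)."""
    return [2.0 * ((i * pitch_hz / SAMPLE_RATE) % 1.0) - 1.0 for i in range(n)]


def hiss(n, seed=0):
    """White noise standing in for the gas discharge tube (unvoiced source)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def bandpass(signal, centre_hz, q=5.0):
    """Second-order band-pass filter (standard biquad form, unity peak gain)."""
    w0 = 2.0 * math.pi * centre_hz / SAMPLE_RATE
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the b1 coefficient is zero for this filter type
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def voder_frame(key_gains, voiced=True, pitch_hz=120.0, n=2000):
    """Mix the selected bands of one source into a block of n samples.

    key_gains: ten numbers, one per band ("how hard each key is pressed").
    voiced:    True picks the buzz, False picks the hiss (the wrist bar).
    pitch_hz:  buzz pitch (the foot pedal); ignored for the hiss.
    """
    if len(key_gains) != len(BAND_CENTRES):
        raise ValueError("expected one gain per band")
    source = buzz(pitch_hz, n) if voiced else hiss(n)
    mix = [0.0] * n
    for gain, centre in zip(key_gains, BAND_CENTRES):
        if gain == 0.0:
            continue  # key not pressed, band contributes nothing
        for i, sample in enumerate(bandpass(source, centre)):
            mix[i] += gain * sample
    return mix
```

Calling something like `voder_frame([0, 0, 0, 1, 1, 0, 0, 0, 0, 0])` gives a block of samples with energy concentrated around the two "pressed" bands. Turning that into anything resembling a vowel, let alone a word, is the part that took the operators months.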
Amazon: why not use an alternative

Post

Is the human voice 'special'
Absolutely! Even if you have the tone sorted out. Convincing speech/song needs an amalgamation of dynamics, diction, intonation, phrasing, and emotion. Most people could not pick out a virtual piano in a mix (or even standalone), but we're a long way off from people not being able to pick out virtual speech/song.

Post

el-bo (formerly ebow) wrote:
Is the human voice 'special'
Absolutely! Even if you have the tone sorted out. Convincing speech/song needs an amalgamation of dynamics, diction, intonation, phrasing, and emotion. Most people could not pick out a virtual piano in a mix (or even standalone), but we're a long way off from people not being able to pick out virtual speech/song.
Yeah, also vocals sit dead center most of the time as the star of the show, where you can dissect every nuance. You can't say that for most other instruments, which are usually in the background (or fighting for space with plenty of others) and quickly slip from the listener's attention. (There are always exceptions, but in the majority of cases it's like this.)

Post

when it is possible it will be telephone scammers who invent it :hihi:
The highest form of knowledge is empathy, for it requires us to suspend our egos and live in another's world. It requires profound, purpose‐larger‐than‐the‐self kind of understanding.

Post

Hink wrote:when it is possible it will be telephone scammers who invent it :hihi:
https://www.youtube.com/watch?v=WkiaILsshb8

Post

el-bo (formerly ebow) wrote:
Is the human voice 'special'
Absolutely! Even if you have the tone sorted out. Convincing speech/song needs an amalgamation of dynamics, diction, intonation, phrasing, and emotion. Most people could not pick out a virtual piano in a mix (or even standalone), but we're a long way off from people not being able to pick out virtual speech/song.
I completely agree that the project would be difficult. The fact that people have mouths is the biggest problem, because mouths change shape all of the time. It is hard enough to model a clarinet, and it is a precisely designed object that doesn't even have a variably shaped mouth.

But I saw numerous responses that seemed to indicate that convincingly mimicking human voices would (were it possible) represent some kind of creepy singularity. This is a separate issue, and I am wondering why people feel this way.

Post

herodotus wrote:
But I saw numerous responses that seemed to indicate that convincingly mimicking human voices would (were it possible) represent some kind of creepy singularity. This is a separate issue, and I am wondering why people feel this way.
It's similar to the dislike some have of AI: it's the idea that a machine can mimic a human and can be used for negative rather than positive outcomes.

Post

I'm not against the idea. If nothing else, it would make phoning places more pleasurable if those automated announcements sounded like humans.
Realistically speaking, though, we're a fair bit off; what we have now is basically an artefact-free autotune.
I think people's initial horror was based on the idea of an AI listening to an artist's recordings and allowing any one of us to sound like that particular singer. While, again, it doesn't bother me either way, I can see why this might have legal problems down the line, so it would probably end up being somewhat more expensive than was offered :hihi:

Post

herodotus wrote: Specifically, many people seem apprehensive about the idea of convincingly mimicking a human voice as opposed to convincingly mimicking drummers or violinists or pianists.

I will admit that this surprised me. Not that people would feel that way, but that so many people in this community would feel this way.
My problem is with the absolute copy of this and the next personality, a premise given in the OP of that thread.
I was as clear as I can be about that distinction.

I doubt I read every post but I didn't notice other strong objection *to the idea*. My objection is not an idea in a vacuum. Music which is a carbon copy as close as can be to its model has been having an effect for some time, and I don't tend to think that's my imagination.

Post

BUT: absolutely the human voice is special. The authenticity of human individual - individuated - thought is special. I have thought for a long time about the notion of copies, of people that are never fully individuated from what they've decided to be 'just like' from their given set of influences. So the thought in that post disturbed me.

So there is a personality type that prefers to copy, in music; and so does hack work. Unoriginality in music. That's their ceiling and there is nothing one can do to change others, so... but where that kind of thought 'originates' can't really be ignored if you think about the problem.

I went into my criticisms as a thought experiment. It's not an argument about reality, yet. I think technically it isn't going to happen in any of our lifetimes, including the youngest of us.

So I said "read The Time Machine", a thought experiment in extremis on a parallel problem.
