Ambience CPU usage
-
- Banned
- 22457 posts since 5 Sep, 2001
[DELETED]
-
- KVRist
- 133 posts since 19 Jan, 2003
You are forgetting that AMD only really deserves its rating up to 2400+. An Athlon 2400+ *will* still be better than a P4 2.4 in most cases, but with higher models this isn't true anymore (I'm talking about the original Athlons here; Athlon64 is improved).
Part of this is that the P4 needs a high clock speed to compensate for its low IPC, and it looks like around 2.5GHz is the point where its NetBurst architecture starts to show its advantages. The other part is that the new Athlons with the 400MHz bus have lower clock speeds at the same rating than older Athlons. Unfortunately, bus speed isn't the bottleneck for the Athlon the way it is for the P4, so they actually perform worse than one would expect when looking at their rating and comparing them to older models.
But it is true that with Ambience the difference between P4 and Athlon is even greater than with other plugs. It looks like Ambience makes heavy use of something that is a much bigger bottleneck on the P4 than on the Athlon.
-
- Banned
- 22457 posts since 5 Sep, 2001
[DELETED]
-
- KVRist
- 263 posts since 24 Oct, 2000 from Germany
let me explain 
Ambience does a LOT of memory access, all at random places all over the memory it allocates... Neither Magnus nor I really know how to speed up Ambience for P4, and even if we did, it'd be very hard: neither of us has a P4!!!
It's extra problematic as we don't really understand why the P4 should be slower than AMD, as its memory is most likely FASTER! But this might not help with the random-access thing.
AFAIC it's not a 'bug' in Ambience; it's rather the way P4s work internally that makes it soak up so much more CPU. It shouldn't be denormals either, as Magnus has gone to GREAT lengths to fix the denormal problems in Ambience...
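For readers who haven't run into denormals: when a reverb tail decays, the values in its feedback paths can become extremely small floating-point numbers, which some CPUs process far more slowly than normal ones. A common workaround is to snap tiny values to zero. The sketch below is only an illustration of that general idea, not Ambience's actual code; `flushTiny`, `DecayingTail` and the `kSilence` threshold are made-up names and values.

```cpp
#include <cmath>

// Anything smaller than this is treated as silence. The threshold is an
// assumption picked for illustration; it sits far above the float denormal
// range (roughly 1e-38 and below).
constexpr float kSilence = 1.0e-15f;

// Snap very small values to exactly zero so they never reach denormal range.
inline float flushTiny(float x) {
    return std::fabs(x) < kSilence ? 0.0f : x;
}

// Hypothetical one-pole feedback "tail": the state decays toward zero once
// the input stops, which is where denormals would normally show up.
struct DecayingTail {
    float state = 0.0f;
    float feedback = 0.5f;

    float process(float in) {
        state = flushTiny(in + state * feedback);
        return state;
    }
};
```

With the flush in place, the state reaches exactly zero after a few dozen silent samples instead of lingering at ever smaller values.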
Ambience's code isn't "optimised" for any particular platform (neither AMD nor P4), FYI, it's just optimised as is.
So, I guess the simple answer would be: "no until further notice"
FYI, Magnus is working on yet another reverb, which soaks up a heck of a lot more CPU than Ambience
What can I say, the man is nuts 
- bram
-
- KVRAF
- 7886 posts since 24 Feb, 2003 from Earth, USA
Bram wrote: It's extra problematic as we don't really understand why P4 should be slower than AMD as its memory is most likely FASTER! But, this might not help with the random-access thing.
Don't think it's RAM, as mine is just as bad, and I have 1 gig of PC1066 RDRAM in my box. Certainly something between the AMD and Intel architecture though....
Devon
-
- KVRist
- 263 posts since 24 Oct, 2000 from Germany
I should have said 'cache' and not 'ram', ...
It might be related directly to the size of the cache...
As said: VERY hard to tell!
- bram
-
- KVRist
- 133 posts since 19 Jan, 2003
The P4 has much higher latency when accessing memory than AMD; that's probably the reason. Higher bandwidth doesn't help much here. Anyway, we'll see if this is true when someone tries Ambience on an Athlon64, which has even lower latency than a normal Athlon (its integrated memory controller provides blazing fast memory access).
-
- KVRist
- 263 posts since 24 Oct, 2000 from Germany
-
- Banned
- 22457 posts since 5 Sep, 2001
[DELETED]
- KVRAF
- Topic Starter
- 37408 posts since 14 Sep, 2002 from In teh net
What about my original suggestion of making CPU usage a set-and-forget feature so it doesn't change back every time you switch presets (even better, between sessions too)? Any progress in that direction?
btw - I'm on an Athlon 500 so it struggles, but even then I find most patches are playable at around 75% quality (and a few at 100%) - it's just annoying that it forgets where I've set it.
-
- KVRist
- 263 posts since 24 Oct, 2000 from Germany
ttoz : SSE/SSE2 optimisations cannot be done for just any algorithm. They work best when you can do parallel instructions. For example:
for (i = 0; i < numSamples; i++)
    out[i] = in[i] * gain
is "easily" optimised by doing:
for (i = 0; i < numSamples; i += 4)
{
    out[i+0] = in[i+0] * gain
    out[i+1] = in[i+1] * gain
    out[i+2] = in[i+2] * gain
    out[i+3] = in[i+3] * gain
}
Now, when coding in SSE you could write this with one instruction:
for (i = 0; i < numSamples; i += 4)
    multiply 4 in's by gain, store in 4 out's
However, it's not easy to apply this to any random algorithm!! Ambience is especially ill-suited for this kind of parallelisation without rewriting it completely from the bottom up, which isn't going to happen.
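For anyone curious what the SSE version of the gain loop looks like in real code, here is a minimal C++ sketch using the standard SSE intrinsics from `<xmmintrin.h>`. It is only an illustration of the gain example, not code from Ambience; the function names are made up, and it falls back to the plain loop when SSE isn't available at compile time.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

// Plain scalar version: one multiply per sample.
void applyGainScalar(const float* in, float* out, std::size_t n, float gain) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] * gain;
}

// SSE version: four multiplies per instruction, scalar loop for the leftovers.
void applyGainSse(const float* in, float* out, std::size_t n, float gain) {
    std::size_t i = 0;
#if HAVE_SSE
    const __m128 g = _mm_set1_ps(gain);            // gain copied into all 4 lanes
    for (; i + 4 <= n; i += 4) {
        const __m128 x = _mm_loadu_ps(in + i);     // unaligned load of 4 floats
        _mm_storeu_ps(out + i, _mm_mul_ps(x, g));  // multiply and store 4 results
    }
#endif
    for (; i < n; ++i)                             // remaining samples (all of them without SSE)
        out[i] = in[i] * gain;
}
```

This works because every output sample depends only on its own input sample. An algorithm where each step depends on the previous result, or reads from scattered memory locations, doesn't map onto four-at-a-time operations this neatly.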
But really, I still think the "problems" come from different cache/memory usage on P4 vs AMD. We'll see after checking it out on an Athlon64 (someone with an Athlon64 contacted me about this).
------------------------------
aMUSEd : yes, we should be able to at least do that!
I'll look into it myself. I'm thinking it'll be a GUI-only option where you can set it yourself, but it doesn't get stored. If we stored it, it would break the current preset format, which would create quite a bit more work for us... But something you can set after loading Ambience ("now keep the cpu at 50%, sucker") would be interesting...
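One way such a GUI-only override could work, sketched in C++ under stated assumptions: the preset format stays untouched, and a lock flag held outside the preset decides whether loading a preset may change the quality value. `Preset`, `ReverbSettings` and all member names are hypothetical and not taken from Ambience's source.

```cpp
// Hypothetical preset: carries its own quality value (0..1).
struct Preset {
    float roomSize;
    float quality;
};

// Hypothetical plugin state. The lock lives outside the preset data,
// so the preset format itself does not change.
class ReverbSettings {
public:
    // GUI-only override; never written into a preset.
    void lockQuality(float q) { locked_ = true; quality_ = q; }
    void unlockQuality() { locked_ = false; }

    void loadPreset(const Preset& p) {
        roomSize_ = p.roomSize;
        if (!locked_)               // a locked quality survives preset changes
            quality_ = p.quality;
    }

    float quality() const { return quality_; }
    float roomSize() const { return roomSize_; }

private:
    float roomSize_ = 0.5f;
    float quality_ = 1.0f;
    bool locked_ = false;
};
```

Because the lock is never saved, it would reset between sessions, which matches the "doesn't get stored" limitation described above.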
cheers,
- bram
- KVRAF
- Topic Starter
- 37408 posts since 14 Sep, 2002 from In teh net
Bram wrote: aMUSEd : yes, we should be able to at least do that!
I'll look into it myself. I'm thinking it'll be a GUI-only option where you can set it yourself, but it doesn't get stored. If we'd store it it would break the current preset-format, which would create quite a bit more work for us... But something u can set after loading ambience ("now keep the cpu at 50%, sucker") would be interesting...
cheers,
- bram
ta
-
- KVRist
- 263 posts since 24 Oct, 2000 from Germany
could everyone readin' this thread have a look at:
http://www.kvr-vst.com/forum/viewtopic.php?t=30610
and try the benchmarking...
- bram
-
- Banned
- 22457 posts since 5 Sep, 2001
[DELETED]
-
- Banned
- 22457 posts since 5 Sep, 2001
[DELETED]
