Optimizer bug or threading bug?
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
When a local variable value inside a loop is printed the code works correctly, if it is not printed, it does not. The value (or anything related) is not being accessed by any other thread, only its copy (a tbb::atomic<double> variable set by the same thread) is being read by another thread. What would you think? File a bug report to Microsoft? Only happens with /O2 optimization (/O1 and /Od, and gcc/linux versions are all OK)
~stratum~
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
Would be interesting to see the actual code, as there as many subtle pitfalls when it comes to multi-threading and it's easy to blame the compiler, only to realise later than it's actually just a simple user mistake. If printing the value (or rather more likely just calling another non-inlined function) fixes the problem, then it's probably something to do with the compiler keeping the variable in registers and the function call happens to make it flush it into memory.stratum wrote:When a local variable value inside a loop is printed the code works correctly, if it is not printed, it does not. The value (or anything related) is not being accessed by any other thread, only its copy (a tbb::atomic<double> variable set by the same thread) is being read by another thread. What would you think? File a bug report to Microsoft? Only happens with /O2 optimization (/O1 and /Od, and gcc/linux versions are all OK)
Try putting a _ReadWriteBarrier() call (ie. a full compiler memory fence) in place of the print-call and see if that fixes the problem. If it does, then your code probably expects stricter memory ordering guarantees that what you are currently getting out of your use of tbb::atomic; that's supposed to guarantee some memory ordering, but maybe not enough?
-
- KVRAF
- 7400 posts since 17 Feb, 2005
Go through this list of O2 flags and eliminate them with the -fno- prepend (ex. -fno-gcse).
https://gcc.gnu.org/onlinedocs/gcc/Opti ... tions.html
https://gcc.gnu.org/onlinedocs/gcc/Opti ... tions.html
-
- KVRian
- 1379 posts since 26 Apr, 2004 from UK
Run address sanitizer on your code!
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
Hi,
Thanks you your replies. I'm not able to reproduce the problem with a shorter code extracted from the actual one, so for the time being this looks like a threading or memory corruption problem. _ReadWriteBarrier has no effect.
The compiler is Visual studio 2017, version 15.7.1 and the problem appears only with optimization option /O2.
Thanks you your replies. I'm not able to reproduce the problem with a shorter code extracted from the actual one, so for the time being this looks like a threading or memory corruption problem. _ReadWriteBarrier has no effect.
The compiler is Visual studio 2017, version 15.7.1 and the problem appears only with optimization option /O2.
~stratum~
-
- KVRAF
- 7400 posts since 17 Feb, 2005
This MSVC specific pragma may help you to locate the block of code causing the issue (by switching optimization form or disabling them)
https://msdn.microsoft.com/en-us/library/chh3fb0k.aspx
https://msdn.microsoft.com/en-us/library/chh3fb0k.aspx
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
Interestingly disabling optimizations only for the function that works incorrectly solves the problem as in
Code: Select all
#pragma optimize( "", off )
void buggy_function()
{
}
#pragma optimize( "", on)
~stratum~
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
The same process had a deadlock bug in a related thread pool implementation that appeared only with /O2 optimizations and disabling optimizations with the pragmas above for the implementation of the whole thread pool class code solved the deadlock bug too.
All this looks a bit suspicious because the Visual studio team has been making very frequent updates recently but I'm not able to reproduce the bugs with shorter code fragments.
All this looks a bit suspicious because the Visual studio team has been making very frequent updates recently but I'm not able to reproduce the bugs with shorter code fragments.
~stratum~
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
Here is something similar
https://developercommunity.visualstudio ... gives.html
My own code also had a variable whose calculation was optimized out (or perhaps written over due to a memory bug as I cannot prove this to be a compiler bug)
https://developercommunity.visualstudio ... gives.html
My own code also had a variable whose calculation was optimized out (or perhaps written over due to a memory bug as I cannot prove this to be a compiler bug)
~stratum~
-
- KVRAF
- 7400 posts since 17 Feb, 2005
The way to avoid having the optimizer eliminate unused results is to just store the result, like in an allocated address. Don't expect stack variables to not be eliminated if they are unused.
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
This is exactly what I do, and what happens is much weirder than that.camsr wrote:The way to avoid having the optimizer eliminate unused results is to just store the result, like in an allocated address. Don't expect stack variables to not be eliminated if they are unused.
p.s. I don't really believe that this code makes sure that there is no memory corruption around last_percent as the compiler might reorder these variables and assign some of them to registers, but still I think a stack corruption or something like that would probably destroy other variables also.
p.s.2 The code below is simplified version of the actual version it's not the same but this doesn't matter because the problem is not reproducible anywhere other than its actual execution context anyway.
p.s.3 None of the related values below are actually initialized to zero. Zero is a very strange result value for this code.
Code: Select all
// problem: some other thread reads m_Percentage and when optimizations are enabled m_Percentage value stays fixed at zero because last_percent value does not change
// and stays fixed at zero in spite of the fact that prog_info.percentComplete does change from 0 to 100 and there is no evidence of memory corruption around last_percent.
// note: the value of m_Percentage set by the constructor is 50.
#pragma optimize( "", off )
void Worker::WaitTask(int taskIdent)
{
TimerClass timer;
ProgressInfo prog_info;
prog_info.percentComplete = 70;// that's intentionally initialized to a nonzero value to expose a bug
std::string electric_fence1("K");// we need this to understand whether there is memory corruption around last_percent
double last_percent = 40; // that's intentionally initialized to a nonzero value to expose a bug
std::string electric_fence2("L");// we need this to understand whether there is memory corruption around last_percent
TaskStatus last_status(TASKSTATCODE_NOTSTARTED);
try
{
do
{
m_LastError = m_SoapClient->GetTaskStatus(taskIdent, prog_info);
if (FailCodes(m_LastError) != FAILCODE_SUCCESS)
{
m_LastStatus = STATUS_FAILED;
break;
}
else
{
// BEGIN I think optimizer messes something here
std::cerr << electric_fence1;
last_status = prog_info.status;
if (last_status == TASKSTATCODE_FINISHED ||
last_status == TASKSTATCODE_RUNNING)
{
last_percent = prog_info.percentComplete;
if (last_percent < 0)
last_percent = 0;
else if (last_percent > 100)
last_percent = 100;
m_Percentage = last_percent;
}
std::cerr << electric_fence2;
// END I think optimizer messes something here
}
// if we don't print last_percent here vs2017 produces incorrect code with /O2
//std::cerr << "m_Percentage is now " << (double)m_Percentage << " vs last_percent = " <<last_percent << std::endl;
//_ReadWriteBarrier();
if (last_status == TASKSTATCODE_NOTSTARTED)
{
if (timer.Elapse() > s_StartDeadlineInseconds)
{
m_LastStatus = STATUS_FAILED;
break;
}
}
else if (last_status == TASKSTATCODE_FINISHED)
{
m_LastStatus = STATUS_COMPLETED;
m_LastError = FAILCODE_SUCCESS;
break;
}
else if (last_status == TASKSTATCODE_ERROR)
{
m_LastStatus = STATUS_FAILED;
break;
}
else
{
// invalid status code
m_LastStatus = STATUS_FAILED;
break;
}
Sleep(1);
if (m_QuitRequest)
{
m_LastStatus = STATUS_CANCEL;
break;
}
} while (true);
}
catch (...)
{
m_LastStatus = STATUS_FAILED;
}
}
#pragma optimize( "", on )
~stratum~
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
Using a Nullable class template for "double last_percent" as "Nullable<double> last_percent" causes the result to be 100 instead of 0, which makes me think that these if statement comparisons are evaluated incorrectly:
Because the result is either 0 or 100, but never anything in between with /O2.
Code: Select all
last_percent = prog_info.percentComplete;
if (last_percent.value() < 0)
last_percent = 0;
else if (last_percent.value() > 100)
last_percent = 100;
m_Percentage = last_percent.value();
~stratum~
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
And if we change the declaration to
, causing the actual value of last_percent to be on heap, then the behavior is the same. (If there was a stack corruption around last_percent this would cause the process to crash but it does not. (I guess I know what you are thinking about this hacking style but unfortunately this kind of trial and error is the best I can do on an executable which lacks debug symbols - I don't know any address sanitizer or debugger that works on an optimized executable without debug symbols).)Nullable<double> *last_percent = new Nullable<double>();
~stratum~
-
- KVRAF
- Topic Starter
- 2256 posts since 29 May, 2012
Correction: The idea that Nullable<double> results 100 instead of 0 was wrong, as temporarily modified code incorrectly was:
Anyway. I'm not sure this has any significance as there is no stack corruption here and the value of last_percent is wrong unless it is printed or optimizations are disabled.
Code: Select all
last_percent = prog_info.percentComplete;
if (last_percent.value() < 0)
last_percent = 100; // *** wrong ***
else if (last_percent.value() > 100)
last_percent = 100;
m_Percentage = last_percent.value();
~stratum~