Optimizer bug or threading bug?

DSP, Plugin and Host development discussion.
RELATED
PRODUCTS

Post

When a local variable value inside a loop is printed the code works correctly, if it is not printed, it does not. The value (or anything related) is not being accessed by any other thread, only its copy (a tbb::atomic<double> variable set by the same thread) is being read by another thread. What would you think? File a bug report to Microsoft? Only happens with /O2 optimization (/O1 and /Od, and gcc/linux versions are all OK) :?
~stratum~

Post

stratum wrote:When a local variable value inside a loop is printed the code works correctly, if it is not printed, it does not. The value (or anything related) is not being accessed by any other thread, only its copy (a tbb::atomic<double> variable set by the same thread) is being read by another thread. What would you think? File a bug report to Microsoft? Only happens with /O2 optimization (/O1 and /Od, and gcc/linux versions are all OK) :?
Would be interesting to see the actual code, as there as many subtle pitfalls when it comes to multi-threading and it's easy to blame the compiler, only to realise later than it's actually just a simple user mistake. If printing the value (or rather more likely just calling another non-inlined function) fixes the problem, then it's probably something to do with the compiler keeping the variable in registers and the function call happens to make it flush it into memory.

Try putting a _ReadWriteBarrier() call (ie. a full compiler memory fence) in place of the print-call and see if that fixes the problem. If it does, then your code probably expects stricter memory ordering guarantees that what you are currently getting out of your use of tbb::atomic; that's supposed to guarantee some memory ordering, but maybe not enough?

Post

Go through this list of O2 flags and eliminate them with the -fno- prepend (ex. -fno-gcse).
https://gcc.gnu.org/onlinedocs/gcc/Opti ... tions.html

Post

Run address sanitizer on your code!

Post

What compiler are you using?

Post

Hi,

Thanks you your replies. I'm not able to reproduce the problem with a shorter code extracted from the actual one, so for the time being this looks like a threading or memory corruption problem. _ReadWriteBarrier has no effect.
The compiler is Visual studio 2017, version 15.7.1 and the problem appears only with optimization option /O2.
~stratum~

Post

This MSVC specific pragma may help you to locate the block of code causing the issue (by switching optimization form or disabling them)
https://msdn.microsoft.com/en-us/library/chh3fb0k.aspx

Post

Interestingly disabling optimizations only for the function that works incorrectly solves the problem as in

Code: Select all

#pragma optimize( "", off )  
void buggy_function()
{
}
#pragma optimize( "", on)
~stratum~

Post

The same process had a deadlock bug in a related thread pool implementation that appeared only with /O2 optimizations and disabling optimizations with the pragmas above for the implementation of the whole thread pool class code solved the deadlock bug too.
All this looks a bit suspicious because the Visual studio team has been making very frequent updates recently but I'm not able to reproduce the bugs with shorter code fragments.
~stratum~

Post

Here is something similar

https://developercommunity.visualstudio ... gives.html

My own code also had a variable whose calculation was optimized out (or perhaps written over due to a memory bug as I cannot prove this to be a compiler bug)
~stratum~

Post

The way to avoid having the optimizer eliminate unused results is to just store the result, like in an allocated address. Don't expect stack variables to not be eliminated if they are unused.

Post

camsr wrote:The way to avoid having the optimizer eliminate unused results is to just store the result, like in an allocated address. Don't expect stack variables to not be eliminated if they are unused.
This is exactly what I do, and what happens is much weirder than that.

p.s. I don't really believe that this code makes sure that there is no memory corruption around last_percent as the compiler might reorder these variables and assign some of them to registers, but still I think a stack corruption or something like that would probably destroy other variables also.

p.s.2 The code below is simplified version of the actual version it's not the same but this doesn't matter because the problem is not reproducible anywhere other than its actual execution context anyway.

p.s.3 None of the related values below are actually initialized to zero. Zero is a very strange result value for this code.

Code: Select all

// problem: some other thread reads m_Percentage and when optimizations are enabled m_Percentage value stays fixed at zero because last_percent value does not change
// and stays fixed at zero in spite of the fact that prog_info.percentComplete does change from 0 to 100 and there is no evidence of memory corruption around last_percent.
// note: the value of m_Percentage set by the constructor is 50.
#pragma optimize( "", off )  
void Worker::WaitTask(int taskIdent)
{
	TimerClass timer;
	ProgressInfo prog_info;
	prog_info.percentComplete = 70;// that's intentionally initialized to a nonzero value to expose a bug
	std::string electric_fence1("K");// we need this to understand whether there is memory corruption around last_percent 
	double last_percent = 40; // that's intentionally initialized to a nonzero value to expose a bug
	std::string electric_fence2("L");// we need this to understand whether there is memory corruption around last_percent 
	TaskStatus last_status(TASKSTATCODE_NOTSTARTED);
	try
	{
		do
		{
			m_LastError = m_SoapClient->GetTaskStatus(taskIdent, prog_info);
			if (FailCodes(m_LastError) != FAILCODE_SUCCESS)
			{
				m_LastStatus = STATUS_FAILED;
				break;
			}
			else
			{
				// BEGIN I think optimizer messes something here
				std::cerr << electric_fence1;
				last_status = prog_info.status;
				if (last_status == TASKSTATCODE_FINISHED ||
					last_status == TASKSTATCODE_RUNNING)
				{
					last_percent = prog_info.percentComplete;
					if (last_percent < 0)
						last_percent = 0;
					else if (last_percent > 100)
						last_percent = 100;
					m_Percentage = last_percent;
				}
				std::cerr << electric_fence2;
				// END I think optimizer messes something here
			}
			// if we don't print last_percent here vs2017 produces incorrect code with /O2 
			//std::cerr << "m_Percentage is now " << (double)m_Percentage << " vs last_percent = " <<last_percent << std::endl;
			//_ReadWriteBarrier();
			if (last_status == TASKSTATCODE_NOTSTARTED)
			{
				if (timer.Elapse() > s_StartDeadlineInseconds)
				{
					m_LastStatus = STATUS_FAILED;
					break;
				}
			}
			else if (last_status == TASKSTATCODE_FINISHED)
			{
				m_LastStatus = STATUS_COMPLETED;
				m_LastError = FAILCODE_SUCCESS;
				break;
			}
			else if (last_status == TASKSTATCODE_ERROR)
			{
				m_LastStatus = STATUS_FAILED;
				break;
			}
			else
			{
				// invalid status code
				m_LastStatus = STATUS_FAILED;
				break;
			}

			Sleep(1);

			if (m_QuitRequest)
			{
				m_LastStatus = STATUS_CANCEL;
				break;
			}
		} while (true);
	}
	catch (...)
	{
		m_LastStatus = STATUS_FAILED;
	}
}
#pragma optimize( "", on )
~stratum~

Post

Using a Nullable class template for "double last_percent" as "Nullable<double> last_percent" causes the result to be 100 instead of 0, which makes me think that these if statement comparisons are evaluated incorrectly:

Code: Select all

last_percent = prog_info.percentComplete;
if (last_percent.value() < 0)
                  last_percent = 0;
else if (last_percent.value() > 100)
                  last_percent = 100;
m_Percentage = last_percent.value();
Because the result is either 0 or 100, but never anything in between with /O2.
~stratum~

Post

And if we change the declaration to
Nullable<double> *last_percent = new Nullable<double>();
, causing the actual value of last_percent to be on heap, then the behavior is the same. (If there was a stack corruption around last_percent this would cause the process to crash but it does not. (I guess I know what you are thinking about this hacking style but unfortunately this kind of trial and error is the best I can do on an executable which lacks debug symbols - I don't know any address sanitizer or debugger that works on an optimized executable without debug symbols).)
~stratum~

Post

Correction: The idea that Nullable<double> results 100 instead of 0 was wrong, as temporarily modified code incorrectly was:

Code: Select all

last_percent = prog_info.percentComplete;
if (last_percent.value() < 0)
                  last_percent = 100; // *** wrong ***
else if (last_percent.value() > 100)
                  last_percent = 100;
m_Percentage = last_percent.value();
Anyway. I'm not sure this has any significance as there is no stack corruption here and the value of last_percent is wrong unless it is printed or optimizations are disabled.
~stratum~

Post Reply

Return to “DSP and Plugin Development”