Pitfalls in the use of Synchronous (blocking) methods

Synchronous vs Asynchronous

SCAPI contains both synchronous and asynchronous methods.

For our purposes, we define a synchronous operation as one in which a function is called to perform some task and does not return until the task is complete.

We define an asynchronous operation as one which is started with a call to some function. The operation then continues independently for some period of time, and when it is complete, an event may be raised to inform the caller. The start method typically takes very little time and returns immediately. While the operation continues, the calling thread may function normally, though it should take care not to interfere with anything involved in the ongoing asynchronous operation. A completion event signals that the operation has finished (or failed), that results are available, and that another operation involving the same object(s) may be started.
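The difference between the two styles can be sketched in plain C++. This is only an illustration of the definitions above, not SCAPI code: EventQueue, MeasureSync and StartMeasure are hypothetical names, and the "event" is modelled as a queued callback that a simulated event loop delivers later.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-ins for illustration only; none of these names are SCAPI.
struct EventQueue
{
    std::deque<std::function<void()>> pending;

    // Deliver every queued event; returns how many were delivered.
    int Pump()
    {
        int delivered = 0;
        while (!pending.empty())
        {
            std::function<void()> ev = std::move(pending.front());
            pending.pop_front();
            assert(ev && "queued events must be callable");
            ev();
            ++delivered;
        }
        return delivered;
    }
};

// Synchronous style: the result is the return value, and the caller
// cannot do anything else until the function returns.
inline double MeasureSync()
{
    return 42.0;   // pretend this took a long time
}

// Asynchronous style: the start call only schedules the work and returns
// at once; the result arrives later through the completion callback.
inline void StartMeasure(EventQueue& q, std::function<void(double)> onComplete)
{
    q.pending.push_back([onComplete] { onComplete(42.0); });
}
```

With the asynchronous form, the caller's own method has already returned by the time the completion callback runs, so nothing is left waiting on the call stack.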

Synchronous operations are generally a far simpler way to do things, provided the operation takes only a short time.  Most SCAPI methods are almost instant and cause no problem.  A small number of methods do take considerable time; tsc_TsModes::Measure(), for example, may take a minute or more.  tsc_SurveyCore::Sleep() is another.

Measure and Sleep are examples of interruptible methods which must be treated very carefully, since other nested events may occur while they wait to complete.  

Synchronous methods that are interruptible

Aside from Measure and Sleep, there are a number of other methods which must be treated with care.  The tsc_TsModes class has a large number of such methods; however, asynchronous equivalents are available for all of them.

Another notable example is tsc_Form::ShowDialog(). ShowDialog is necessarily a synchronous method, in order to make its dialog modal. If a form creates a new form and displays it by calling ShowDialog, the user cannot interact with the first form until the second form's ShowDialog has returned.

One scenario

Let us say that a plugin method called RunMeasure has called tsc_TsModes::Measure to take a measurement. While the thread is inside the call to Measure, the system enters a wait, which is placed on the thread's call stack along with Measure and the plugin's RunMeasure method that called it. During the wait, any event may be raised, which will call the appropriate event handler in the plugin. This event handler is nested inside the existing RunMeasure, Measure, and wait calls - it is deeper on the stack - and so it is not possible for the wait, Measure, or RunMeasure methods to return and continue their processing until the nested event handler has exited, even if the Measure has completed.

Of course the inner event method may also call a long-running synchronous API method which will deepen the stack even further, possibly leading to serious recursion problems such as a stack overflow.  A very simple example of this follows:

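class MyForm : public tsc_Form
{
public:
    MyForm ()
    {
        this->Controls.Add (new tsc_SoftkeyControl (PX_MakeItCrash));
        this->Controls.Add (...);
    }
    virtual tsc_DialogResult OnSoftkeyClick (x_Code sk)
    {
        // Sleep is interruptible: further softkey events can be raised,
        // and handled by nested calls to this method, while it waits.
        if (sk == PX_MakeItCrash)
            tsc_SurveyCore::Sleep(5.0);
        return tsc_DialogResult_Ok;     // returning Ok exits the form
    }
};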

In this code snippet, pressing the softkey causes a 5 second delay, and then the form exits. This seems innocent enough, although not especially useful. But what happens if the softkey is pressed several times before the 5 seconds have elapsed?  Each new softkey event interrupts the sleep to make a new call to OnSoftkeyClick, which puts new items on the call stack. These calls nest recursively because each previous call is still waiting inside Sleep.

Even after the first 5 seconds have elapsed, nothing happens, because the first call to OnSoftkeyClick cannot continue: it is buried on the stack underneath the later calls to OnSoftkeyClick, Sleep, etc.  It is not until the last softkey press has slept for 5 seconds that it can exit, after which all the buried calls can also exit, and then finally the form can exit.  If the softkey is pressed enough times, the application will crash with a stack overflow.
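The nesting described above can be reproduced without SCAPI in a small simulation. Everything here is a hypothetical stand-in: FakeUi::Sleep models an interruptible wait by dispatching queued events from inside the call, and the press ids exist only so the order in which the handlers return can be recorded. Real presses arrive over time rather than from a pre-filled queue, but the shape of the call stack is the same.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical simulation; FakeUi, Sleep and PressSoftkey are not SCAPI names.
struct FakeUi
{
    std::deque<std::function<void()>> pending;  // events waiting to be raised
    int depth = 0;                               // live OnSoftkeyClick frames
    int maxDepth = 0;                            // deepest nesting seen
    std::vector<int> returnOrder;                // press ids, in order of return

    // Stands in for an interruptible wait: while "sleeping" it dispatches
    // pending events, each of which runs nested inside this call.
    void Sleep()
    {
        while (!pending.empty())
        {
            std::function<void()> ev = pending.front();
            pending.pop_front();
            ev();
        }
    }

    // Same shape as the handler in the example above: it blocks in Sleep.
    void OnSoftkeyClick(int pressId)
    {
        ++depth;
        if (depth > maxDepth) maxDepth = depth;
        Sleep();                         // later presses are handled in here
        returnOrder.push_back(pressId);  // reached only after nested calls exit
        --depth;
    }

    // Queue a softkey press that arrives while an earlier one is sleeping.
    void PressSoftkey(int pressId)
    {
        pending.push_back([this, pressId] { OnSoftkeyClick(pressId); });
    }
};

// Simulate `presses` taps: the first is delivered directly, the rest
// arrive during its Sleep. Requires presses >= 1.
inline void SimulatePresses(FakeUi& ui, int presses)
{
    assert(presses >= 1);
    for (int id = 2; id <= presses; ++id)
        ui.PressSoftkey(id);
    ui.OnSoftkeyClick(1);
}
```

Calling SimulatePresses with three presses leaves maxDepth at 3 and records the return order 3, 2, 1: the first press is the last to finish, and the stack grows by one set of frames per press.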

Solutions

There are two solutions.  The simple solution is to add a boolean variable to the form which is used to bypass all calls to Sleep after the first.  A more elegant solution is to use an asynchronous design - the softkey handler starts a one-shot timer (but only once; the boolean is still required), and the timer completion event closes the form.
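A minimal sketch of the asynchronous variant, including the boolean guard, might look like the following. It is written against hypothetical types rather than the SCAPI form and timer classes, whose interfaces are not shown here: OneShotTimer, GuardedForm, Fire and the member names are all assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical one-shot timer; the real SCAPI timer class is not shown here.
class OneShotTimer
{
public:
    // Arm the timer. Returns immediately; nothing blocks.
    void Start(std::function<void()> onElapsed)
    {
        assert(onElapsed && "a completion callback is required");
        onElapsed_ = std::move(onElapsed);
        armed_ = true;
    }

    // Called by the (simulated) event loop when the interval has elapsed.
    void Fire()
    {
        if (!armed_) return;
        armed_ = false;                // one-shot: disarm before the callback
        std::function<void()> cb = std::move(onElapsed_);
        cb();
    }

private:
    std::function<void()> onElapsed_;
    bool armed_ = false;
};

// Hypothetical form showing the guarded, asynchronous handler.
class GuardedForm
{
public:
    // The handler returns at once, so no wait is ever left on the stack.
    void OnSoftkeyClick()
    {
        if (closing_) return;          // boolean guard: ignore repeat presses
        closing_ = true;
        ++timersStarted_;
        timer_.Start([this] { closed_ = true; });  // completion closes the form
    }

    OneShotTimer& Timer() { return timer_; }
    bool IsClosed() const { return closed_; }
    int TimersStarted() const { return timersStarted_; }

private:
    OneShotTimer timer_;
    bool closing_ = false;
    bool closed_ = false;
    int timersStarted_ = 0;
};
```

However many times the softkey is pressed, only one timer is started, and each call to the handler returns before the next event is delivered.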

While asynchronous methods are generally more complex to program, they are almost always less prone to hidden problems like the one in this simple example.

Use of synchronous methods that use internal waits should be avoided, except where the operation occurs on a separate non-UI thread - which effectively makes it an asynchronous operation.
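As a rough illustration of the separate-thread approach, the sketch below uses only the C++ standard library. SlowMeasure is a hypothetical stand-in for a long-running blocking call, and StartMeasureOnWorker and IsReady are likewise invented names; which SCAPI methods may safely be called from a non-UI thread is not covered here and should be checked before applying this pattern.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical blocking operation standing in for a long synchronous call.
inline double SlowMeasure()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42.0;
}

// Start the blocking call on its own thread and hand back a future,
// so the calling (UI) thread is free to keep processing events.
inline std::future<double> StartMeasureOnWorker()
{
    return std::async(std::launch::async, SlowMeasure);
}

// What a UI thread might do on a periodic tick: check without blocking.
inline bool IsReady(std::future<double>& f)
{
    assert(f.valid());
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

The blocking wait still happens, but on the worker's stack, so events on the UI thread are handled at their normal depth rather than nested inside the wait.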