Ellipsis (...) and __VA_ARGS__

A Fascination for Variable Arguments

I hold a fascination for functions with a variable number of arguments, especially in C/C++ :). Of course, it stems from the unavoidable printf and its likes from the standard C library. I am not going to delve into how to write your own functions with variable argument lists, as there are plenty of resources that cover it. This article assumes that the reader is quite familiar with using va_list, va_start, va_arg and va_end to write functions that handle a variable argument list. An understanding of how these macros are implemented would also be handy for following the aspects covered in this Idiom.

Every time I adopt a new Idiom, my prime concern revolves around tailoring it to be very easy to use and hard to misuse ;). After all, much of a programmer's time is spent debugging misbehaving code. And with most maintainers / bug-fixers being reliant on their copy-pasting skills, it is wise to put in place an infrastructure that makes an error less likely to happen.

As alluring as the "..." (ellipsis) is, it comes with potential for misuse. A mismatch between the format specifiers and the arguments (in either count or type) of a printf or its likes is a good example of how things can go wrong. It is quite possible to use C++ RTTI to make it more robust, but that only gets an inch further while incurring a substantial decrease in performance (and performance is why C/C++ are preferred in the first place).

The Issue of argument count

A function that accepts a variable number of arguments has to determine the number of arguments and the type of each argument in the list. Just as "Hello World" is used to illustrate a basic program in any language, the sum/average of a list of values makes the easiest example to illustrate the use of a variable argument list. It may be observed that all implementations resort either to passing the number of arguments in the variable argument list as a fixed parameter, or to using a fixed value as a sentinel (indicating the end of the list).

The problem with specifying the number of arguments in the variable argument list is the possibility of a mismatch. This doesn't hurt when making only a few calls to such a function. But when the function is called a huge number of times, the potential for a mistake increases, not to mention the added burden on the programmer to remember and diligently update the count every time the parameter list changes. Although specifying the number of arguments makes things easier for the implementer, it doesn't make life any simpler for the user. And such a call, repeated over and over, IMO makes the code simply UGLY (the count appearing awfully redundant). Okie! I admit I have a compulsion to write code that is beautiful to look at *blush*, but hey! that's me.
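To make the hazard concrete, here is a minimal sketch of the count-based flavour. The name sum_counted is made up for this illustration and does not come from any library; the code only assumes the standard <stdarg.h> macros.

```c
#include <stdarg.h>

/* Hypothetical count-prefixed sum: the caller states how many ints follow. */
int sum_counted(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);   /* trusts the caller's count blindly */
    va_end(args);

    return total;
}
```

A call such as sum_counted(2, 10, 20, 30) compiles cleanly and silently ignores the 30. Worse, a count that is larger than the actual list makes va_arg read arguments that were never passed, which is undefined behaviour.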

The latter approach, where the function determines the number of arguments in the list, seems sweeter, freeing the user from having to do any counting. After all, if standard library functions take that approach, then it makes sense to assume that some good thought has gone into that decision ;) (why reinvent the wheel?). But wait, this requires that the user never forgets to specify the sentinel value :(. Doesn't seem any better than the previous approach, does it?

It so happens that I bumped into a simple and neat way of freeing the user from having to specify the sentinel value :). The thought occurred to me when I came across the predefined macro __VA_ARGS__, used in what are known as variadic macros (macros with a variable number of arguments, hallelujah!). Here is a simple, illustrative example I found:

#define eprintf(...) fprintf(stderr, __VA_ARGS__)
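The same forwarding trick works with fixed parameters in front of the ellipsis. The sketch below is only an illustration: FORMAT_INTO is a hypothetical macro name, and it assumes the first argument is a real char array (not a pointer) so that sizeof yields the buffer size.

```c
#include <stdio.h>

/* Hypothetical macro, not from the article: formats into a char array.
   Everything after `buffer` is pasted in place of __VA_ARGS__. */
#define FORMAT_INTO(buffer, ...) snprintf((buffer), sizeof(buffer), __VA_ARGS__)
```

So FORMAT_INTO(buf, "%d + %d", 2, 3) expands to snprintf((buf), sizeof(buf), "%d + %d", 2, 3); the preprocessor simply substitutes the argument list as text, commas included.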

It is quite simple to write a macro that wraps your variadic function and appends the sentinel value. The following code illustrates this with an example.

#define SENTINEL (-32768) // Marks the end of the variable argument list
int sum(int first, ...) { /* code to handle variable argument list */ }
#define SumIntegers(...) sum(__VA_ARGS__, SENTINEL)
.
.
.
int aggregate = SumIntegers(10, 20, 30, 40, 50, 60, 70);
int average = SumIntegers(425, 982, -976, 5861, -329) / 5;
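For completeness, a runnable sketch of what the elided body of sum might look like. This is one possible implementation under the assumptions of the snippet above (int arguments only, SENTINEL as the terminator), not necessarily the one the original code used.

```c
#include <stdarg.h>

#define SENTINEL (-32768)   /* marks the end of the variable argument list */

/* Adds `first` and every following int until SENTINEL is read. */
int sum(int first, ...)
{
    va_list args;
    int total = 0;

    va_start(args, first);
    for (int value = first; value != SENTINEL; value = va_arg(args, int))
        total += value;
    va_end(args);

    return total;
}

/* The macro appends the sentinel so the caller never has to. */
#define SumIntegers(...) sum(__VA_ARGS__, SENTINEL)
```

Two caveats follow from this design: -32768 can never be passed as a data value, and SumIntegers() with no arguments does not compile, because it expands to sum(, SENTINEL).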

Note: The approach is practicable only when all the items in the variable argument list are of the same type (or follow a particular pattern of types in sequence). Variadic macros are standard in C99 and C++11; compiler support for __VA_ARGS__ has been available since GNU Compiler v3.0 and Visual Studio 2005.

Passing a Variable Argument List to another Variadic-function

This is another pursuit I ran into when I ended up implementing a series of functions with variable argument lists and overlapping functionality. For the sake of discussion, let's assume we are implementing an average function (an extension to the sum function shown above) that returns the average of a list of integers. And since the sum function is already available, we would like to reuse it.

Being aware of, and having studied, some implementations of printf / vprintf etc., I was wondering if there would be a more elegant solution for achieving this. That's how I ended up reading this article on va_pass. I even went on to write a more intuitive version of my own, and this is what I came up with.

Disclaimer: The example is quite ridiculous, considering that it's easier for the average function to compute the sum by itself than to reuse the sum function. It is written for illustrative purposes only.

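#include <assert.h>
#include <stdarg.h>
#include <string.h>

#define MAX_PARAMS 10

// Holds a by-value copy of the variable argument list, sentinel included.
// NOTE: this assumes va_list is a plain pointer to ints laid out
// contiguously on the stack, which the language does not guarantee.
struct va_pass_intlist
{
    int list[MAX_PARAMS];

    va_pass_intlist(va_list & args)
    {
        va_list countArgs;
        va_copy(countArgs, args);   // scan a copy so that args is left untouched
        int paramCount = 1;         // starts at 1 to leave room for the sentinel

        while ( SENTINEL != va_arg(countArgs, int) )
            ++paramCount;
        va_end(countArgs);

        assert(paramCount <= MAX_PARAMS);
        memcpy(list, args, sizeof(int) * paramCount);
    }
};

int average(int first, ...)
{
    va_list args;
    va_start(args, first);          // va_start returns nothing

    va_pass_intlist myparams(args);
    int count = 1,                      // first is part of the list too
        total = sum(first, myparams);   // struct passed by value to mimic the list

    while ( SENTINEL != va_arg(args, int) )
        ++count;
    va_end(args);

    return total / count;
}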

At first this seems like a much neater option. But as I began using this Idiom, I came to realize that the mechanism has many pitfalls. Let me briefly explain each in turn:

  • Limiting - this implementation imposes a limit (MAX_PARAMS) on the number of arguments that can be specified in the variable argument list of the function. Anything above this is likely to break the function. Although certain uses may be content with such a limitation, the generic nature of the variable argument list is lost.
  • Profuse in resource usage - When one variadic function (average) calls another (sum), there are 3 copies of the variable argument list. The first is the list passed to average on the stack. The second is the local va_pass_intlist instance (which always occupies the space required for the maximum number of arguments, in addition to the local va_list). The third is the copy of that instance passed to sum on the stack (again sized for the maximum number of arguments).

    In contrast, the implementation pattern used by the printf family of functions needs only a local va_list (which is typically just a pointer) and a copy of it on the called function's stack. Since that va_list points at the first item of the variable argument list, the inner function accesses the list directly on the caller's stack.
  • Inferior performance - With this implementation, the variable argument list on the caller's stack is replicated onto the callee's stack, in addition to the local copy of the variable argument list. The memory copy operations and the additional memory can incur a considerable performance loss. Further, every inner call behaves as if the maximum number of arguments were passed, which would often not be the case and is hence overkill.
  • Implementation dependent - The above implementation depends on the arguments of the variable argument list being arranged in order on the stack. Neither the C nor the C++ standard specifies this layout, so va_pass_intlist is likely to fail on an implementation that passes arguments differently (in registers, for instance). Such a possibility doesn't arise when using the standard va_* macros.

Considering all these factors, when a function is likely to be reused it makes sense to have another function (such as vsum) that takes the fixed parameters followed by a va_list (initialized from the original function's variable parameter list). This is the same pattern used by the printf / fprintf / vprintf family of standard functions, where printf and fprintf can in turn be implemented on top of vprintf / vfprintf.
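Applied to the running example, the pattern might look like the sketch below. The names vsum and AverageIntegers and the exact signatures are assumptions made for illustration; the code relies only on <stdarg.h>, including va_copy (C99 / C++11).

```c
#include <stdarg.h>

#define SENTINEL (-32768)   /* same end-of-list marker as before */

/* Core routine: sums `first` plus every int in `args` up to SENTINEL. */
int vsum(int first, va_list args)
{
    int total = 0;
    for (int value = first; value != SENTINEL; value = va_arg(args, int))
        total += value;
    return total;
}

/* Thin variadic wrapper, in the spirit of printf calling vprintf. */
int sum(int first, ...)
{
    va_list args;
    va_start(args, first);
    int total = vsum(first, args);
    va_end(args);
    return total;
}

/* average reuses vsum; it counts the items on a va_copy of the list. */
int average(int first, ...)
{
    va_list args, counting;
    va_start(args, first);
    va_copy(counting, args);

    int count = 0;
    for (int value = first; value != SENTINEL; value = va_arg(counting, int))
        ++count;
    va_end(counting);

    int total = vsum(first, args);   /* args must not be read again after this */
    va_end(args);

    return count ? total / count : 0;
}

#define SumIntegers(...)     sum(__VA_ARGS__, SENTINEL)
#define AverageIntegers(...) average(__VA_ARGS__, SENTINEL)
```

No argument is ever copied here: vsum walks the caller's original list through the va_list it receives, and there is no MAX_PARAMS ceiling. The only extra cost is the va_copy needed because average traverses the list twice.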

This exercise has led me to conclude that the pattern used by the printf family of functions is a simple yet efficient way of passing a variable argument list from one variadic function to another, requiring only an additional function to be written. Of course, that is unless I bump into some other elegant way of doing it ;).