Eight Rules Of Using BSTR

Some anonymous good soul's advice (found online) about using BSTRs.

Knowing what the BSTR functions do doesn't mean you know how to use them. Just as the BSTR type is more than its typedef implies, the BSTR functions require more knowledge than the documentation states. Those who obey the rules live in peace and happiness. Those who violate them live in fear, plagued by the ghosts of bugs past and future. The trouble is, these rules are passed on in the oral tradition; they are not carved in stone. You're just supposed to know. The following list is an educated attempt (based on scraps of ancient manuscripts, and revised through trial and error) to codify the oral tradition. Remember, it is just an attempt.

Rule 1: Allocate, destroy, and measure BSTRs only through the OLE API (the Sys functions). Those who use their supposed knowledge of BSTR internals are doomed to an unknowable but horrible fate in future versions. (You have to follow the rules if you don't want bugs.)

Rule 2: You may have your way with all the characters of strings you own. The last character you own is the last character reported by SysStringLen, not the last non-null character. You may fool functions that believe in null-terminated strings by inserting null characters in BSTRs, but don't fool yourself.
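The difference between the last character you own and the last non-null character is easy to see in a small sketch. It assumes the real Sys functions from oleauto.h on Windows; elsewhere it falls back to minimal stand-ins written only so the sketch builds anywhere. The stand-ins, the Lengths struct, and MeasureEmbeddedNull are all invented for this illustration.

```cpp
#include <cstddef>
#include <cwchar>
#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#else
// Stand-ins so the sketch builds off Windows. Illustration only: real code
// must call the OLE versions and never assume this layout (Rule 1).
#include <cstdlib>
#include <cstring>
typedef wchar_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;
inline BSTR SysAllocStringLen(const OLECHAR* src, UINT len) {
    UINT bytes = static_cast<UINT>(len * sizeof(OLECHAR));
    char* raw = static_cast<char*>(std::calloc(1, sizeof(UINT) + bytes + sizeof(OLECHAR)));
    if (!raw) return nullptr;
    std::memcpy(raw, &bytes, sizeof(UINT));        // length prefix
    BSTR bs = reinterpret_cast<BSTR>(raw + sizeof(UINT));
    if (src) std::memcpy(bs, src, bytes);          // terminator is already zero
    return bs;
}
inline UINT SysStringLen(BSTR bs) {
    if (!bs) return 0;
    UINT bytes;
    std::memcpy(&bytes, reinterpret_cast<char*>(bs) - sizeof(UINT), sizeof(UINT));
    return static_cast<UINT>(bytes / sizeof(OLECHAR));
}
inline void SysFreeString(BSTR bs) {
    if (bs) std::free(reinterpret_cast<char*>(bs) - sizeof(UINT));
}
#endif

// Hypothetical result type: the same string measured two ways.
struct Lengths {
    UINT owned;           // what SysStringLen reports
    std::size_t visible;  // what a null-terminated function sees
};

// Hypothetical demo: builds a 5-character BSTR with a null in the middle.
Lengths MeasureEmbeddedNull() {
    const OLECHAR raw[] = L"ab\0cd";       // a, b, null, c, d
    Lengths r = {0, 0};
    BSTR bs = SysAllocStringLen(raw, 5);   // allocate through the Sys API
    if (bs) {
        r.owned = SysStringLen(bs);        // counts every character you own
        r.visible = std::wcslen(bs);       // stops at the embedded null
        SysFreeString(bs);                 // destroy through the Sys API
    }
    return r;
}
```

Under these assumptions the owned length is 5 while the null-terminated view sees only 2 characters, which is exactly the trap the rule warns about.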

Rule 3: You may change the pointers to strings you own, but only by following the rules. In other words, you can change those pointers with SysReAllocString or SysReAllocStringLen. The trick with this rule (and rule 2) is determining whether you own the strings.
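Here is one way the pointer change might look for a string you already know you own. This is only a sketch: it uses the real Sys functions on Windows and invented stand-ins elsewhere, and the ReplaceOwned helper is a hypothetical name, not part of OLE.

```cpp
#include <cwchar>
#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#else
// Stand-ins so the sketch builds off Windows. Illustration only: real code
// must call the OLE versions and never assume this layout (Rule 1).
#include <cstdlib>
#include <cstring>
typedef wchar_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;
typedef int INT;
inline BSTR SysAllocString(const OLECHAR* psz) {
    if (!psz) return nullptr;
    UINT bytes = static_cast<UINT>(std::wcslen(psz) * sizeof(OLECHAR));
    char* raw = static_cast<char*>(std::calloc(1, sizeof(UINT) + bytes + sizeof(OLECHAR)));
    if (!raw) return nullptr;
    std::memcpy(raw, &bytes, sizeof(UINT));        // length prefix
    BSTR bs = reinterpret_cast<BSTR>(raw + sizeof(UINT));
    std::memcpy(bs, psz, bytes);                   // terminator is already zero
    return bs;
}
inline UINT SysStringLen(BSTR bs) {
    if (!bs) return 0;
    UINT bytes;
    std::memcpy(&bytes, reinterpret_cast<char*>(bs) - sizeof(UINT), sizeof(UINT));
    return static_cast<UINT>(bytes / sizeof(OLECHAR));
}
inline void SysFreeString(BSTR bs) {
    if (bs) std::free(reinterpret_cast<char*>(bs) - sizeof(UINT));
}
inline INT SysReAllocString(BSTR* pbs, const OLECHAR* psz) {
    BSTR fresh = SysAllocString(psz);
    if (psz && !fresh) return 0;                   // out of memory
    SysFreeString(*pbs);
    *pbs = fresh;
    return 1;
}
#endif

// Hypothetical helper: replaces the contents of a string the caller owns.
// Returns the new length, or 0 if the reallocation failed.
UINT ReplaceOwned(BSTR* pbs, const OLECHAR* text) {
    if (!SysReAllocString(pbs, text)) return 0;
    return SysStringLen(*pbs);   // *pbs may now point somewhere else
}
```

The point of going through SysReAllocString is that the owner never assigns to the pointer directly; any copy of the old pointer held elsewhere should be treated as stale after the call.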

Rule 4: You do not own any BSTR passed to you by value. The only thing you can do with such a string is copy it or pass it on to other functions that won't modify it. The caller owns the string and will dispose of it according to its whims. A BSTR passed by value looks like this in C++:

void DLLAPI TakeThisStringAndCopyIt(BCSTR bsIn);

The BCSTR is a typedef that should have been defined by OLE, but wasn't. I define it like this in OleType.H:

typedef const wchar_t * const BCSTR;

If you declare input parameters for your functions this way, the C++ compiler will enforce the law by failing on most attempts to change either the contents or the pointer. The Object Description Language (ODL) statement for the same function looks like this:

void WINAPI TakeThisStringAndCopyIt([in] BCSTR bsIn);

The BCSTR type is simply an alias for BSTR because MKTYPLIB doesn't recognize const. The [in] attribute allows MKTYPLIB to compile type information indicating the unchangeable nature of the BSTR. OLE clients such as Visual Basic will see this type information and assume you aren't going to change the string. If you violate this trust, the results are unpredictable.
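The compile-time protection that the C++ BCSTR typedef gives can be sketched like this. The CountChar function is a hypothetical read-only consumer, and the explicit length parameter is an assumption made only to keep the sketch free of OLE headers; real code would get the length from SysStringLen.

```cpp
#include <cstddef>

// The typedef from the text: const characters behind a const pointer.
typedef const wchar_t* const BCSTR;

// Hypothetical read-only consumer of an input string.
std::size_t CountChar(BCSTR bsIn, std::size_t cch, wchar_t ch) {
    // bsIn[0] = L'x';   // rejected by the compiler: the characters are const
    // bsIn = nullptr;   // rejected by the compiler: the pointer is const
    std::size_t n = 0;
    for (std::size_t i = 0; i < cch; ++i) {
        if (bsIn[i] == ch) ++n;   // reading is always allowed
    }
    return n;
}
```

Uncommenting either of the two marked lines should produce a compile error. A cast can still defeat the protection, which is presumably why the compiler stops only most attempts rather than all of them.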

Rule 5: You own any BSTR passed to you by reference as an in/out parameter. You can modify the contents of the string, or you can replace the original pointer with a new one (using SysReAlloc functions). A BSTR passed by reference looks like this in C++:

void DLLAPI TakeThisStringAndGiveMeAnother(BSTR* pbsInOut);

Notice that the parameter doesn't use BCSTR because both the string and the pointer are modifiable. In itself the prototype doesn't turn a reference BSTR into an in/out BSTR. You do that with the following ODL statement:

void WINAPI TakeThisStringAndGiveMeAnother([in, out] BSTR * pbsInOut);

The [in, out] attribute tells MKTYPLIB to compile type information indicating that the string will have a valid value on input, but that you can modify that value and return something else if you want. For example, your function might do something like this:

// Copy input string.
BSTR bsNew = SysAllocString(*pbsInOut);

// Replace input with different output; f is FALSE if memory runs out.
INT f = SysReAllocString(pbsInOut, L"Take me home");

// Use the copied string for something else, then release it
// with SysFreeString(bsNew).

Rule 6: You must create any BSTR passed to you by reference as an out string. The string parameter you receive isn't really a string; it's a placeholder. The caller expects you to assign an allocated string to the unallocated pointer, and you'd better do it. Otherwise the caller will probably crash when it tries to perform string operations on the uninitialized pointer. The prototype for an out parameter looks the same as one for an in/out parameter, but the ODL statement is different:

void WINAPI TakeNothingAndGiveMeAString([out] BSTR * pbsOut);

The [out] attribute tells MKTYPLIB to compile type information indicating that the string has no valid input but expects valid output. A container such as Visual Basic will see this attribute and will free any string assigned to the passed variable before calling your function. After the return the container will assume the variable is valid. For example, you might do something like this:

// Allocate an output string.
*pbsOut = SysAllocString(L"As you like it");

Rule 7: You must create a BSTR in order to return it. A string returned by a function is different from any other string. You can't just take a string parameter passed to you, modify the contents, and return it. If you did, you'd have two string variables referring to the same memory location, and unpleasant things would happen when different parts of the client code tried to modify them. So if you want to return a modified string, you allocate a copy, modify the copy, and return it. You prototype a returned BSTR like this:

BSTR DLLAPI TransformThisString(BCSTR bsIn);

The ODL version looks like this:

BSTR WINAPI TransformThisString([in] BSTR bsIn);

You might code it like this:

// Make a new copy.
BSTR bsRet = SysAllocString(bsIn);

// Transform copy (uppercase it).

// Return copy.
return bsRet;
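A fuller version of the copy, transform, and return pattern might look like the sketch below. It is not the original author's implementation: UppercaseCopy is a hypothetical name, the non-Windows stand-ins are invented so the sketch builds anywhere, and the parameter is a plain BSTR treated as read-only so it can be passed to SysStringLen without a cast. The copy is made with SysAllocStringLen so that embedded nulls survive (Rule 2), whereas SysAllocString stops at the first null.

```cpp
#include <cwchar>
#include <cwctype>
#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#else
// Stand-ins so the sketch builds off Windows. Illustration only: real code
// must call the OLE versions and never assume this layout (Rule 1).
#include <cstdlib>
#include <cstring>
typedef wchar_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;
inline BSTR SysAllocStringLen(const OLECHAR* src, UINT len) {
    UINT bytes = static_cast<UINT>(len * sizeof(OLECHAR));
    char* raw = static_cast<char*>(std::calloc(1, sizeof(UINT) + bytes + sizeof(OLECHAR)));
    if (!raw) return nullptr;
    std::memcpy(raw, &bytes, sizeof(UINT));        // length prefix
    BSTR bs = reinterpret_cast<BSTR>(raw + sizeof(UINT));
    if (src) std::memcpy(bs, src, bytes);          // terminator is already zero
    return bs;
}
inline UINT SysStringLen(BSTR bs) {
    if (!bs) return 0;
    UINT bytes;
    std::memcpy(&bytes, reinterpret_cast<char*>(bs) - sizeof(UINT), sizeof(UINT));
    return static_cast<UINT>(bytes / sizeof(OLECHAR));
}
inline void SysFreeString(BSTR bs) {
    if (bs) std::free(reinterpret_cast<char*>(bs) - sizeof(UINT));
}
#endif

// Hypothetical transform: returns a newly allocated uppercase copy.
// The input is only read (Rule 4); the caller owns and frees the result.
BSTR UppercaseCopy(BSTR bsIn) {
    UINT cch = SysStringLen(bsIn);               // 0 for a null input
    BSTR bsRet = SysAllocStringLen(bsIn, cch);   // make a new copy
    if (!bsRet) return nullptr;                  // out of memory
    for (UINT i = 0; i < cch; ++i) {
        bsRet[i] = static_cast<OLECHAR>(std::towupper(bsRet[i]));
    }
    return bsRet;                                // return the copy
}
```

The caller ends up with two independent strings, the untouched input and the transformed result, and frees each one separately with SysFreeString.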

Rule 8: A null pointer is the same as an empty string to a BSTR. Experienced C++ programmers will find this concept startling because it certainly isn't true of normal C++ strings. An empty BSTR is a pointer to a zero-length string. It has a single null character to the right of the address being pointed to, and a long integer containing zero to the left. A null BSTR is a null pointer pointing to nothing. There can't be any characters to the right of nothing, and there can't be any length to the left of nothing. Nevertheless, a null pointer is considered to have a length of zero (that's what SysStringLen returns).
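In practice this means code that receives a BSTR should treat null and empty alike. A minimal sketch, assuming the real Sys functions on Windows and invented stand-ins elsewhere; IsEmptyBstr and SafeText are hypothetical helper names:

```cpp
#include <cwchar>
#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#else
// Stand-ins so the sketch builds off Windows. Illustration only: real code
// must call the OLE versions and never assume this layout (Rule 1).
#include <cstdlib>
#include <cstring>
typedef wchar_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;
inline BSTR SysAllocString(const OLECHAR* psz) {
    if (!psz) return nullptr;
    UINT bytes = static_cast<UINT>(std::wcslen(psz) * sizeof(OLECHAR));
    char* raw = static_cast<char*>(std::calloc(1, sizeof(UINT) + bytes + sizeof(OLECHAR)));
    if (!raw) return nullptr;
    std::memcpy(raw, &bytes, sizeof(UINT));        // length prefix
    BSTR bs = reinterpret_cast<BSTR>(raw + sizeof(UINT));
    std::memcpy(bs, psz, bytes);                   // terminator is already zero
    return bs;
}
inline UINT SysStringLen(BSTR bs) {
    if (!bs) return 0;                             // null counts as length zero
    UINT bytes;
    std::memcpy(&bytes, reinterpret_cast<char*>(bs) - sizeof(UINT), sizeof(UINT));
    return static_cast<UINT>(bytes / sizeof(OLECHAR));
}
inline void SysFreeString(BSTR bs) {
    if (bs) std::free(reinterpret_cast<char*>(bs) - sizeof(UINT));
}
#endif

// Hypothetical helper: true for a null BSTR and for an allocated empty one.
bool IsEmptyBstr(BSTR bs) {
    return SysStringLen(bs) == 0;
}

// Hypothetical helper: a pointer that is always safe to hand to code
// expecting a null-terminated wide string.
const OLECHAR* SafeText(BSTR bs) {
    return bs ? bs : L"";
}
```

Checking the length rather than comparing the pointer against null keeps both cases on the same path, and SafeText guards the functions that would otherwise dereference a null pointer.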