Parsing Strings into Words

Gnocl options often take lists as arguments. For the most part these are of fixed length, such as x y coordinate pairs, but sometimes the list is of arbitrary length. Such strings are broken down into their components using the C library function strtok. The following example illustrates how it works.
/* strtok example */
#include <stdio.h>
#include <string.h>

int main ()
{
    char str[] = "- This, a sample string.";
    char * pch;
    printf ("Splitting string \"%s\" into tokens:\n", str);
    pch = strtok (str, " ,.-");
    while (pch != NULL)
    {
        printf ("%s\n", pch);
        pch = strtok (NULL, " ,.-");
    }
    return 0;
}

Which outputs:

Splitting string "- This, a sample string." into tokens:
This
a
sample
string

The above sample looks satisfying because strtok is applied to the string str only once. But if the whole process is repeated on str, the same results will not be obtained: only the first element in the list will be returned. This is because strtok modifies str while it is parsing for tokens, overwriting the delimiter that ends each token with a terminating null character, so that str effectively ends after its first element.
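The effect described above can be observed directly. The following is a minimal sketch, not taken from the Gnocl sources; the helper name count_tokens is hypothetical and exists only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Gnocl): tokenizes buf in place and
   returns the number of tokens found. strtok overwrites the delimiter
   that ends each token with '\0', so buf is modified by this call. */
int count_tokens(char *buf, const char *delims)
{
    int n = 0;
    char *tok;

    for (tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims)) {
        printf("token %d: %s\n", n, tok);
        n++;
    }
    return n;
}
```

Called twice on the same array, for instance char str[] = "- This, a sample string."; with the delimiters " ,.-", the first call returns 4 and the second returns 1, because by then str reads "- This".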

To work around this, it is necessary to create a duplicate of the string (not a pointer to the same memory address) and then run the tokenization afresh on the copy.
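In plain C, without the Tcl library, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the helper name count_tokens_copy is hypothetical, and the copy is made with malloc and memcpy rather than with any Gnocl or Tcl facility.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Gnocl): tokenizes a private copy of
   src, so that src itself is never modified and can be parsed again.
   Returns the token count, or -1 if the copy cannot be allocated. */
int count_tokens_copy(const char *src, const char *delims)
{
    size_t len = strlen(src) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);
    char *tok;
    int n = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, src, len);         /* a real duplicate, not an alias */

    for (tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims)) {
        printf("token %d: %s\n", n, tok);
        n++;
    }

    free(copy);                     /* the tokens pointed into copy */
    return n;
}
```

Because every call works on a fresh copy, repeated calls with the same source string yield the same tokens each time. Note that the token pointers are only valid until the copy is freed.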

Here's a snippet from the Gnocl core module text.c showing this in action:
while ( gtk_text_iter_forward_search ( &start, Tcl_GetString ( objv[cmdNo + 1] ), 0, &begin, &end, NULL ) != NULL ) {

    /* record the line and offset of each end of the match */
    row1 = gtk_text_iter_get_line ( &begin );
    col1 = gtk_text_iter_get_line_offset ( &begin );
    row2 = gtk_text_iter_get_line ( &end );
    col2 = gtk_text_iter_get_line_offset ( &end );

    /* create duplicate string */
    Tcl_Obj *copyObj = Tcl_NewStringObj( Tcl_GetString ( objv[cmdNo + 3]) , -1);

    /* tokenize the copy, so the original tag list survives for the next match */
    pch2 = strtok ( Tcl_GetString ( copyObj ), " " );

    while ( pch2 != NULL ) {
        if ( gtk_text_tag_table_lookup (gtk_text_buffer_get_tag_table ( buffer ), pch2) == NULL ) {
            Tcl_AppendResult (interp, "ERROR! Tag \"",pch2,"\" does not exist.",NULL);
            return GNOCL_ERROR_OTHER;
        }

        gtk_text_buffer_apply_tag_by_name ( buffer, pch2, &begin, &end );
        pch2 = strtok ( NULL, " " );
    }

    /* append the match position to the result list */
    Tcl_ListObjAppendElement ( interp, resList, Tcl_NewIntObj ( row1 ) );
    Tcl_ListObjAppendElement ( interp, resList, Tcl_NewIntObj ( col1 ) );
    Tcl_ListObjAppendElement ( interp, resList, Tcl_NewIntObj ( row2 ) );
    Tcl_ListObjAppendElement ( interp, resList, Tcl_NewIntObj ( col2 ) );

    /* continue searching from the end of this match */
    start = end;
}

 