The SAS SCAN() function is mainly used to extract the nth part of a string. The string is treated as being divided into a number of parts by one or more delimiters.
A string can have a single delimiting character occurring consistently throughout, or it can contain several different delimiting characters. The SCAN() function can handle a set of delimiters in a single string.
However, specifying the delimiting characters in the SCAN() function is optional; in the absence of this argument, SCAN() uses a default set of delimiters while processing. The default delimiters (on ASCII systems) are as follows.
blank . < ( + & ! $ * ) ; ^ - / , % |
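As a minimal sketch of the difference between relying on the default delimiters and supplying an explicit list, consider the data step below. The string and the variable names (part_default, part_comma) are illustrative assumptions, not part of the original example.

```sas
data _null_;
    /* Illustrative string containing '-', ',' and a blank */
    str = "2024-01-15,alpha beta";

    /* No third argument: '-', ',' and blank are all default delimiters,
       so the words are 2024, 01, 15, alpha, beta */
    part_default = scan(str, 4);

    /* Only the comma is treated as a delimiter here,
       so the words are "2024-01-15" and "alpha beta" */
    part_comma = scan(str, 2, ",");

    put part_default= part_comma=;
run;
```

With the default set, the fourth word is alpha; with the comma as the only delimiter, the second word keeps its embedded blank.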
It is worth noting that we can specify more than one delimiter in the third argument.
The SCAN() function is demonstrated below.
SAS SCAN( ) function:
SYNTAX: SCAN(char_string, n <, dlm_chars>);   (the third argument is optional)
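The data step discussed in the following paragraphs can be sketched as shown here. This is a reconstruction from the statements and the log output quoted in this article, so the step layout and the PUT statement are assumptions; line and column numbers in any log note will depend on how the program is laid out.

```sas
data _null_;
    /* Name with title, first name, middle initial and surname */
    str = "Mr. Rob/K*Thomas";

    title      = scan(str, 1);    /* first word  */
    first_name = scan(str, 2);    /* second word */
    surname    = scan(str, -1);   /* last word, counted from the end */

    /* Invalid values for the second argument */
    invalid_second_arg_1 = scan(str, 100);   /* more words than the string has */
    invalid_second_arg_2 = scan(str, 0);     /* zero is not a valid position   */

    put title= first_name= surname=;
run;
```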
Let us look at the above code more closely.
Using the SCAN() function, we are trying to extract the title, first name, and surname from the given name.
The string is separated by a list of delimiters, namely blank, / and * (the period after the title is also one of the default delimiters).
The first statement, title = SCAN(str, 1);
extracts the first part, that is, the title.
The second statement, first_name = SCAN(str, 2);
extracts the second part of the string, that is, the first name, which follows the blank.
The third statement, surname = SCAN(str, -1);
takes into account that the middle name might be missing in many cases. To generalize the code, the second argument is given a negative value. In this case SCAN() starts counting the words (parts) from the end of the string, in the reverse direction.
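To illustrate why the negative count generalizes better than a fixed position, the sketch below runs the same two calls on a name with a middle initial and on one without. The second name and the variable third_word are assumptions added for illustration.

```sas
data _null_;
    length str $ 40;

    /* Same logic applied to a name with and without a middle initial */
    do str = "Mr. Rob/K*Thomas", "Mr. Rob Thomas";
        surname    = scan(str, -1);   /* last word in both cases */
        third_word = scan(str, 3);    /* K for the first name, Thomas for the second */
        put str= surname= third_word=;
    end;
run;
```

A fixed position of 3 returns the middle initial in one case and the surname in the other, while -1 returns the surname in both.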
In the fourth and fifth statements, we test what happens when we supply invalid values as the second argument to the SCAN() function.
It is worth noting that for the fourth statement, invalid_second_arg_1 = SCAN(str, 100);
the corresponding output is blank, and there is no ERROR or NOTE in the log.
For the fifth statement, invalid_second_arg_2 = SCAN(str, 0);
the corresponding output is also blank, but the log shows the following note and _ERROR_ is set to 1.
NOTE: Argument 2 to function SCAN at line 22 column 24 is invalid.
str=Mr. Rob/K*Thomas title=Mr first_name=Rob surname=Thomas invalid_second_arg_1= invalid_second_arg_2= _ERROR_=1 _N_=1
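One possible way to avoid the note is to validate the requested position before calling SCAN(). The sketch below assumes the COUNTW() function is available (SAS 9.2 or later); the variables n, n_words and part are hypothetical names used only for this illustration.

```sas
data _null_;
    length part $ 20;
    str = "Mr. Rob/K*Thomas";

    n       = 0;             /* hypothetical word position to request */
    n_words = countw(str);   /* number of words under the default delimiters */

    /* Call SCAN only for a nonzero position that exists in the string */
    if n ne 0 and abs(n) <= n_words then part = scan(str, n);
    else part = " ";

    put n= n_words= part=;
run;
```

Because the zero position is never passed to SCAN(), the invalid-argument note is not triggered in this sketch.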
Another variant of the SCAN() function:
SCANQ(): SCAN() and SCANQ() operate very similarly. The main difference is that SCAN() still treats delimiting characters inside double quotation marks as delimiters, while SCANQ() ignores them.
So SCANQ() is the safer choice when we are dealing with strings that contain double quotation marks.
In addition to this difference, SCANQ() also uses a different default set of delimiters: white-space characters (blank, tabs, carriage return, line feed, and form feed).
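The difference can be sketched as follows. The string and variable names are illustrative assumptions. The last call uses the Q modifier of SCAN(), which is available from SAS 9.2 onwards as an alternative to SCANQ(); whether SCANQ() itself is still documented depends on the SAS release in use.

```sas
data _null_;
    /* Illustrative string with a blank inside double quotation marks */
    str = 'say "hello world" now';

    /* SCAN splits at the blank inside the quotation marks */
    with_scan = scan(str, 2, " ");

    /* SCANQ ignores delimiters inside the quotation marks */
    with_scanq = scanq(str, 2);

    /* SCAN with the Q modifier behaves like SCANQ in this respect */
    with_q_modifier = scan(str, 2, " ", "q");

    put with_scan= with_scanq= with_q_modifier=;
run;
```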