A Quick Look at Cross Site Scripting - What is Cross Site Scripting?

Post date: Mar 18, 2011 10:59:45 AM

Let's say we take some information passed in on a querystring (the string after the ? character in a URL) in order to display the content of a variable, for example the visitor's name:

http://www.yourdomain.com/welcomedir/welcomepage.php?name=John

As this simple querystring shows, we pass the visitor's name as a parameter in the URL and then display it on our "welcomepage.php" page with the following PHP code:

<?php
echo 'Welcome to our site ' . stripslashes($_GET['name']);
?>

The result of this snippet is shown below:

Welcome to our site John

Following the same concept, we might build a new URL to achieve more dangerous and annoying effects. It's just a matter of including a little bit of JavaScript.

For instance:

    http://www.yourdomain.com/welcomedir/welcomepage.php?name=<script language=javascript>window.location="http://www.evilsite.com";</script>
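What the browser receives for such a request can be sketched in a few lines. The helper name renderWelcomeUnsafe() is hypothetical; it only wraps the echo statement shown earlier so that the output can be inspected as a string, and the array passed in stands in for $_GET.

```php
<?php
// Hypothetical helper: builds the same string the welcome page echoes,
// so the output can be inspected instead of being sent to a browser.
function renderWelcomeUnsafe(array $query): string
{
    // Same logic as the page: the value is used exactly as received.
    return 'Welcome to our site ' . stripslashes($query['name']);
}

// A normal request: welcomepage.php?name=John
echo renderWelcomeUnsafe(['name' => 'John']), "\n";

// The tampered request: the script tag arrives intact in the HTML,
// so the browser treats it as part of our own page and runs it.
$payload = '<script language=javascript>window.location="http://www.evilsite.com";</script>';
echo renderWelcomeUnsafe(['name' => $payload]), "\n";
```

The second line of output contains the script tag unchanged, which is the whole problem: the page makes no distinction between a name and markup.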

Things are getting more complex now. Simply loading the URL above in the browser's location bar triggers a JavaScript redirection to “www.evilsite.com”. At first glance, this might not seem too bad. After all, we haven't seen anything that could significantly harm our website. But is that really true? Let's look at a new example, which might quickly change your mind.

We'll demonstrate how easy it is to manipulate URLs and inject JavaScript into them for malicious purposes.

For example:

    http://www.yourdomain.com/welcomedir/welcomepage.php?name=<script language=javascript>setInterval("window.open('http://www.yourdomain.com/','innerName')",100);</script>

Now, let's explain in detail what's going on here. We have inserted JavaScript code that requests the http://www.yourdomain.com index page every 100 milliseconds.

The setInterval() method takes care of the task, but other JavaScript techniques, such as a function that reschedules itself with setTimeout(), would do the trick too. If the link is spread widely enough, the combined requests could heavily overload the Web server where our site is located, or even create a Denial of Service condition in which other visitors requesting the same page (or other pages) are turned away, with noticeable damage to server performance.

It would also harm our website's reputation, simply because other users cannot get access to it. Not very good, huh?

JavaScript embedded in the link

An example is useful to properly understand how this technique works:

    <a href="http://www.yourdomain.com/welcomedir/welcomepage.php?name=<script language=javascript>window.location='http://www.evilsite.com';</script>">healthy food</a>

If we take a closer look at the code listed above, we can see clearly what's going on. JavaScript code has been inserted into a regular link in order to redirect users to a completely different site. The link looks innocent, but it is in fact hiding something else: the JavaScript embedded in it.

We might send this link to someone else. Our unsuspecting recipient would click it to find out a little more about healthy food, and would instead be redirected to a different site, getting something he or she would never expect to see.

As we can easily imagine, our site's reputation could be seriously damaged if someone sends our URL, with the JavaScript code embedded in the link, to numerous recipients. That would result in the nasty redirecting effect described above, and recipients wouldn't be happy about it at all!

Having presented some commonly used Cross Site Scripting techniques, we need a proper solution to avoid their ugly effects and keep ourselves from becoming victims of them.


First off, we need to follow a few simple, straightforward rules that apply to the common scenarios where user input is involved.

Always, all the time, and constantly (pick your term), check what's coming in from POST and GET requests. However obvious it seems, you should never skip this step.

If a specific type of data is expected, check that the value really is of that type and that it's of the expected length. Whatever programming language you're using will give you the means to do that easily.
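As a rough illustration of type and length checks in PHP, the sketch below uses the built-in filter_var() and strlen() functions. The helper names isIntInRange() and hasLengthBetween(), and the limits passed to them, are made up for this example.

```php
<?php
// Hypothetical helpers illustrating type and length checks.

// True only if $raw is a string holding an integer between $min and $max.
function isIntInRange($raw, int $min, int $max): bool
{
    // Request values can also arrive as arrays (name[]=x), so check first.
    if (!is_string($raw)) {
        return false;
    }
    $result = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    // filter_var() returns false on failure; 0 would be a valid result,
    // so compare strictly against false.
    return $result !== false;
}

// True only if $raw is a string whose length lies within the limits.
function hasLengthBetween($raw, int $min, int $max): bool
{
    if (!is_string($raw)) {
        return false;
    }
    $length = strlen($raw); // length in bytes
    return $length >= $min && $length <= $max;
}

var_dump(isIntInRange('42', 1, 120));         // bool(true)
var_dump(isIntInRange('42<script>', 1, 120)); // bool(false)
var_dump(hasLengthBetween('John', 1, 32));    // bool(true)
var_dump(hasLengthBetween('', 1, 32));        // bool(false)
```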

Whenever possible, use client-side validation as an extra layer on top of user input checking. Please note that JavaScript validation cannot be relied on by itself to check data validity, since it can be bypassed, but it may help to discourage some evil-minded visitors from entering malicious data while providing useful assistance to well-intentioned users.

Remove conflicting characters from user input. Search for < and > characters and make sure they're removed. Single and double quotes must be escaped properly too. Many professional websites fail when dealing with character escaping. I hope you won't.
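One way to handle these characters in PHP is the built-in htmlspecialchars() function. Note that this is a slightly different technique from the removal described above: the characters are converted to HTML entities rather than stripped. The wrapper name escapeForHtml() is hypothetical, and the sketch assumes the page is served as UTF-8.

```php
<?php
// Hypothetical wrapper: encodes HTML special characters in a value.
function escapeForHtml(string $value): string
{
    // ENT_QUOTES converts both single and double quotes, in addition
    // to <, > and &.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$payload = '<script>window.location="http://www.evilsite.com";</script>';

// The markup characters are now entities, so the browser displays the
// text instead of executing it.
echo 'Welcome to our site ' . escapeForHtml($payload), "\n";
```

Encoding at the point of output keeps the visitor's text readable while making sure the browser never interprets it as markup.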

We could go on endlessly with tips about validating user data, but you can get a lot more from other tutorials and articles on the subject. For the sake of this article, we'll show an example of preventing Cross Site Scripting using PHP.

Coding for our safety

Let's define a simple function to prevent the querystring from being tampered with by external code. The function “validateQueryString()” is the following:

 

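    <?php
    // Returns true if $queryString consists only of name=value pairs
    // separated by "&", where names and values are alphanumeric and
    // between $min and $max characters long.
    function validateQueryString ( $queryString , $min = 1 , $max = 32 ) {
      $token = '[a-zA-Z0-9]{' . $min . ',' . $max . '}';
      $pair  = $token . '=' . $token;
      // The D modifier stops "$" from matching before a trailing newline.
      if ( !preg_match ( '/^' . $pair . '(&' . $pair . ')*$/D', $queryString ) ) {
        return false;
      }
      return true;
    }
    ?>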

Once we have defined this function, we call it this way:

    <?php
    $queryString = $_SERVER['QUERY_STRING'];
    if ( !validateQueryString ( $queryString ) ) {
      // Redirect and stop, so nothing else is sent for a rejected request.
      header( 'Location: errorpage.php' );
      exit;
    }
    else {
      echo 'Welcome to our site!';
    }
    ?>

Let's break down the code to see it in detail.

The function matches the querystring passed as a parameter against a pattern, checking whether it follows the standard querystring format and whether its GET variable names and values contain only the digits 0-9 and letters in lowercase or uppercase. Any other character is considered invalid. Also, we have specified as default values that each name and each value can be from 1 to 32 characters long. If the querystring doesn't match, the function returns false. Otherwise, it returns true.

Next, we validate the querystring by calling the function. If it returns false (that is, the querystring contains invalid characters), the user is taken to an error page, or whatever else you prefer to do. If the function returns true, we just display a welcome message.
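To make the behaviour concrete, here is a small sketch that runs a few made-up querystrings through the check. The function is restated so that the snippet runs on its own; this version separates pairs strictly with "&", which is one possible way to write the pattern rather than the only one.

```php
<?php
// Restated check: alphanumeric name=value pairs separated by "&".
function validateQueryString($queryString, $min = 1, $max = 32)
{
    $token = '[a-zA-Z0-9]{' . $min . ',' . $max . '}';
    $pair  = $token . '=' . $token;
    return preg_match('/^' . $pair . '(&' . $pair . ')*$/D', $queryString) === 1;
}

// Made-up sample querystrings, as they would appear in QUERY_STRING.
$samples = [
    'name=John',                      // accepted
    'name=John&age=30',               // accepted
    'name=<script>alert(1)</script>', // rejected: < > ( ) / are not allowed
    'name=%3Cscript%3E',              // rejected: % is not allowed either
    '',                               // rejected: at least one pair is required
];

foreach ($samples as $sample) {
    $verdict = validateQueryString($sample) ? 'accepted' : 'rejected';
    echo str_pad('"' . $sample . '"', 36), $verdict, "\n";
}
```

Because the check works on the raw querystring, a URL-encoded script tag is rejected just like a literal one.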

Of course, most of the time, we really know what variables to expect, so our validation function can be significantly simplified.

Given the previous URL,

    http://www.yourdomain.com/welcomedir/welcomepage.php?name=John

where the “name” variable is expected, we might write the new “validateAlphanum()” function:

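    <?php
    // Returns true if $value is a string made up only of letters and
    // digits, between $min and $max characters long.
    function validateAlphanum( $value , $min = 1 , $max = 32 )  {
      // The D modifier stops "$" from matching before a trailing newline.
      $pattern = '/^[a-zA-Z0-9]{' . $min . ',' . $max . '}$/D';
      if ( !is_string( $value ) || !preg_match( $pattern, $value ) ) {
        return false;
      }
      return true;
    }
    ?>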

and finally validate the value like this:

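    <?php
    // A missing parameter becomes an empty string, which fails validation.
    $name = isset( $_GET['name'] ) ? $_GET['name'] : '';
    if ( !validateAlphanum ( $name ) ) {
      header( 'Location: errorpage.php' );
      exit;
    }
    else {
      echo 'Welcome to our site!';
    }
    ?>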

The concept is the same as explained above. The only noticeable difference is that we’re taking in the “name” variable as the parameter for the “validateAlphanum()” function and checking if it contains only the allowed characters 0-9, a-z and A-Z. Anything else will be considered an invalid input.
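The validated value can also be put to use in the greeting itself. The sketch below is one possible way to combine the check with output escaping; buildWelcome() is a hypothetical helper, its array argument stands in for $_GET, and the validation function is restated so the snippet runs on its own.

```php
<?php
// Restated check: letters and digits only, within the length limits.
function validateAlphanum($value, $min = 1, $max = 32)
{
    if (!is_string($value)) {
        return false;
    }
    return preg_match('/^[a-zA-Z0-9]{' . $min . ',' . $max . '}$/D', $value) === 1;
}

// Hypothetical helper: returns the greeting, or null when "name" is
// missing or invalid. $query stands in for $_GET.
function buildWelcome(array $query): ?string
{
    $name = $query['name'] ?? null;
    if (!validateAlphanum($name)) {
        return null;
    }
    // Escaping is redundant for alphanumeric input, but it keeps the
    // output safe if the validation rule is ever relaxed.
    return 'Welcome to our site ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}

var_dump(buildWelcome(['name' => 'John']));                      // the greeting for John
var_dump(buildWelcome(['name' => '<script>alert(1)</script>'])); // NULL
var_dump(buildWelcome([]));                                      // NULL
```

A caller would redirect to the error page whenever null comes back, exactly as in the snippet above.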

If you're a strong advocate of object oriented programming, as I am, you might easily include this function as a new method of an object that performs user data validation. Something similar to this:

    <?php
    // get variable value
    $name = isset( $_GET['name'] ) ? $_GET['name'] : '';
    // instantiate new data validator object
    $dv = new dataValidator();
    // execute validation method
    if ( !$dv->validateAlphanum( $name ) ) {
      header( 'Location: errorpage.php' );
      exit;
    }
    else {
      echo 'Welcome to our site!';
    }
    ?>
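The dataValidator class itself is not shown in this article. A minimal sketch of what it could look like follows; the constructor parameters and the private properties are assumptions made for illustration, and only the validateAlphanum() method name comes from the snippet above.

```php
<?php
// Hypothetical sketch of a dataValidator class.
class dataValidator
{
    private $min;
    private $max;

    // Length limits shared by the validation methods (assumed defaults).
    public function __construct(int $min = 1, int $max = 32)
    {
        $this->min = $min;
        $this->max = $max;
    }

    // True if $value is a string of letters and digits within the limits.
    public function validateAlphanum($value): bool
    {
        if (!is_string($value)) {
            return false;
        }
        $pattern = '/^[a-zA-Z0-9]{' . $this->min . ',' . $this->max . '}$/D';
        return preg_match($pattern, $value) === 1;
    }
}

$dv = new dataValidator();
var_dump($dv->validateAlphanum('John'));     // bool(true)
var_dump($dv->validateAlphanum('<script>')); // bool(false)
```

Further methods (for numbers, e-mail addresses and so on) could be added to the same class, which keeps all input checks in one place.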

    

Pretty simple, isn’t it?

Several approaches can be taken to avoid Cross Site Scripting, whether your personal taste is procedural or object-oriented programming.

In both cases, we've developed specific functions that validate querystrings and reject tampered or unexpected user input, demonstrating that Cross Site Scripting can be prevented easily with some help from our favorite server-side language.

Conclusion

As usual, dealing with user input data is a very sensitive issue, and Cross Site Scripting falls under this category. It is a serious problem that can be avoided with some simple validation techniques, as we have seen throughout this article.

Building robust applications that don't make poor assumptions about visitors' input is definitely the correct way to prevent Cross Site Scripting attacks and other harmful techniques. Client environments must always be considered unsafe and unknown territory. So, for the sake of your website's sanity and yours, keep your eyes open.