Overview
Malicious injections can leak confidential information to an attacker and can also lead to system crashes, malicious database manipulation, and even database corruption.
Learning Objectives
To understand the dangers of invalid input.
To understand the fundamental concepts of input validation.
To understand the basic defensive practice skills against malicious injection in mobile software development.
The Goal of Input Validation and Sanitization
Input validation detects unauthorized input, such as code injection or other malicious injection, before the application processes it. To keep malformed data from entering the system, such input should be detected and then either rejected or sanitized. Techniques for doing this include white list validation, black list validation, and parameterized queries.
White List Validation
White list validation, or whitelisting, is an input validation technique that checks whether the user's input is a "known good input." It works by checking whether the input is authorized by definition and rejecting everything that is not. For example, an input field for a social security number (SSN) can be coded to accept only 9 digits; if an input contains any non-digit character or is not exactly 9 digits long, the application can reject it and prompt the user to enter a valid SSN. Well-defined data patterns, such as SSNs, dates, zip codes, URLs, and e-mail addresses, can be protected easily by using a regular expression (RegEx) to define a very strong validation pattern. Below is an example of a snippet of code using RegEx for white list validation to validate the parameter "zip."[1]
White List Validation Example
private static final Pattern zipPattern = Pattern.compile("^\\d{5}(-\\d{4})?$");

public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException {
    try {
        String zipCode = request.getParameter("zip");
        // A missing parameter comes back as null, so reject it before matching.
        if (zipCode == null || !zipPattern.matcher(zipCode).matches()) {
            throw new YourValidationException("Improper zipcode format.");
        }
        // .. do what you want here, after it's been validated ..
    } catch (YourValidationException e) {
        response.sendError(HttpServletResponse.SC_BAD_REQUEST, e.getMessage());
    }
}
If an input field takes its value from a fixed set of limited options, such as date of birth, gender, or race, a drop-down list or radio buttons can be used to force an exact match, removing the free text field altogether and with it the opportunity for malicious data injection. If drop-down lists or radio buttons cannot be used, the free text field must be protected from invalid input. In that case the programmer should restrict the field with a narrowly defined RegEx pattern that accepts only the expected input. It is always better to stop an attack as early as possible in the processing of the user's request.
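The two cases above, a fixed-format field and a fixed set of options, can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the class name AllowListValidator, the method names, and the option values "email", "phone", and "mail" are hypothetical and stand in for whatever the real form offers. Set.of assumes Java 9 or later.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper illustrating white list validation.
class AllowListValidator {
    // Exactly nine digits, as in the SSN example in the text (dashes are not accepted).
    private static final Pattern SSN_PATTERN = Pattern.compile("^\\d{9}$");

    // Assumed option values; a real form would list its own drop-down choices here.
    private static final Set<String> ALLOWED_OPTIONS = Set.of("email", "phone", "mail");

    // Accepts only input that matches the known good SSN pattern.
    static boolean isValidSsn(String input) {
        return input != null && SSN_PATTERN.matcher(input).matches();
    }

    // Accepts only an exact match against the fixed option set.
    static boolean isAllowedOption(String input) {
        return input != null && ALLOWED_OPTIONS.contains(input);
    }

    public static void main(String[] args) {
        System.out.println(isValidSsn("123456789"));          // true
        System.out.println(isValidSsn("123-45-6789"));        // false
        System.out.println(isAllowedOption("email"));         // true
        System.out.println(isAllowedOption("email' OR 1=1")); // false
    }
}
```

Checking the option on the server as well as in the drop-down list matters because a request can be crafted without using the form at all.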
Black List Validation
Black list validation, or blacklisting, is an input validation technique that checks whether the user's input is a "known bad input." It works by detecting common attack characters and patterns, such as the ' character, a string like "1=1", or the <script> tag, and it is easily implemented with RegEx. Black list validation works well only when the bad patterns are known in advance. The main difference between the two techniques is that white list validation accepts an input only if it is composed entirely of allowed characters, whereas black list validation rejects any input that contains a disallowed character. Below is an example of a snippet of code using RegEx for black list validation to exclude the characters <, >, %, and $.[2]
Black List Validation Example
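// Inside a character class, $ is a literal and needs no escape.
Pattern p = Pattern.compile("[<>%$]");
Matcher m = p.matcher(unsafeInputString);
// find() looks for a disallowed character anywhere in the input;
// matches() would succeed only if the whole input were one such character.
if (m.find())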
{
// Invalid input: reject it, or remove/change the offending characters.
}
else
{
// Valid input.
}
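The comment in the example above offers two responses to a bad input: reject it, or remove the offending characters. A minimal sketch of both follows, assuming the same four disallowed characters; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating black list validation.
class DenyListFilter {
    // The disallowed characters from the text: <, >, %, and $.
    private static final Pattern DISALLOWED = Pattern.compile("[<>%$]");

    // Option 1: detect a disallowed character anywhere in the input so it can be rejected.
    static boolean containsDisallowed(String input) {
        return DISALLOWED.matcher(input).find();
    }

    // Option 2: sanitize by removing every disallowed character.
    static String stripDisallowed(String input) {
        return DISALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(containsDisallowed("<script>")); // true
        System.out.println(containsDisallowed("hello"));    // false
        System.out.println(stripDisallowed("<b>50%$</b>")); // b50/b
    }
}
```

Rejecting is usually the simpler choice to reason about, since stripping characters changes the input into something the user did not type.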
Parameterized Query
A parameterized query, or prepared statement, is a query that uses placeholders for its parameters. The parameter values are supplied at execution time. This method is used primarily to prevent SQL injection. Parameterized queries force the developer to pass each parameter to the query separately, which allows the database to distinguish between code and data regardless of what user input is supplied. Below is an example of a snippet of code that is unsafe and would allow an attacker to inject code into the query that would be executed by the database. Because the "customerName" parameter is simply appended to the query, an attacker can inject any SQL code they want.
Unsafe Query Example (String Concatenation)
String query = "SELECT account_balance FROM user_data WHERE user_name = "
+ request.getParameter("customerName");
try {
Statement statement = connection.createStatement( … );
ResultSet results = statement.executeQuery( query );
} catch (SQLException e) {
// Handle the database error.
}
The following snippet is a safe version of the previous code. It is protected by a parameterized query, which uses the "?" character as a placeholder for the parameter value.
Safe Parameterized Query Example
String custname = request.getParameter("customerName");
//This should REALLY be validated too
//perform input validation to detect attacks
String query = "SELECT account_balance FROM user_data WHERE user_name = ? ";
PreparedStatement pstmt = connection.prepareStatement( query );
pstmt.setString( 1, custname);
ResultSet results = pstmt.executeQuery( );
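To see why the concatenated version is risky, the sketch below builds the query text without contacting a database. It is an illustration under assumptions: the class name, the method name, the constant, and the attack string are hypothetical, and only the SQL text is taken from the examples above.

```java
// Hypothetical demo: shows how appended input changes the SQL text itself.
class ConcatenationDemo {
    // With a placeholder, the SQL text is fixed; user input is bound later as data.
    static final String PARAMETERIZED_TEMPLATE =
            "SELECT account_balance FROM user_data WHERE user_name = ?";

    // Mirrors the unsafe example: the input becomes part of the SQL code.
    static String buildUnsafeQuery(String customerName) {
        return "SELECT account_balance FROM user_data WHERE user_name = " + customerName;
    }

    public static void main(String[] args) {
        String attack = "'' OR 1=1"; // assumed attacker-supplied value
        // Prints: SELECT account_balance FROM user_data WHERE user_name = '' OR 1=1
        System.out.println(buildUnsafeQuery(attack));
        // The template is the same no matter what the user types.
        System.out.println(PARAMETERIZED_TEMPLATE);
    }
}
```

In the concatenated query the condition OR 1=1 is true for every row, so the WHERE clause no longer restricts the result to one user. With the placeholder version, the same string would only ever be compared against user_name as a value.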
In summary, input validation checks all input data for length, type, range, allowed character set, and authorized pattern, and it should also identify known bad patterns.
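The checks listed in the summary can be combined in a single validator. The sketch below applies them to an age field, which is an assumed example not taken from the text; the class name and the limits (at most 3 characters, a value from 0 to 150) are likewise hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical validator combining length, character set, type, and range checks.
class AgeFieldValidator {
    private static final int MAX_LENGTH = 3;                                    // length
    private static final Pattern DIGITS_ONLY = Pattern.compile("^[0-9]+$");     // allowed character set
    private static final int MIN_AGE = 0;                                       // range
    private static final int MAX_AGE = 150;

    static boolean isValidAge(String input) {
        // Length check: reject missing, empty, or overlong input first.
        if (input == null || input.isEmpty() || input.length() > MAX_LENGTH) {
            return false;
        }
        // Character set check: ASCII digits only, so signs and letters are rejected.
        if (!DIGITS_ONLY.matcher(input).matches()) {
            return false;
        }
        // Type check: at most three digits always parses as an int.
        int age = Integer.parseInt(input);
        // Range check.
        return age >= MIN_AGE && age <= MAX_AGE;
    }

    public static void main(String[] args) {
        System.out.println(isValidAge("42"));   // true
        System.out.println(isValidAge("151"));  // false
        System.out.println(isValidAge("4a"));   // false
        System.out.println(isValidAge("1234")); // false
    }
}
```

Running the cheap checks (length and character set) before parsing keeps malformed input from reaching the conversion step at all.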
Figure 1. Mitigate Injection Risks with RegEx and Parameterized Query
References:
[1] https://www.owasp.org/index.php/Input_Validation_Cheat_Sheet
[2] http://stackoverflow.com/questions/756567/regular-expression-for-excluding-special-characters