How to mitigate Cross Site Scripting Vulnerabilities?

The following are the two most commonly used techniques for mitigating Cross Site Scripting vulnerabilities in web applications:

- Input Validation
- Output Encoding

In the next few sections, we will discuss these techniques in detail.

Input Validation:

Input validation is a technique to ensure that user-supplied data is sufficiently validated before it is processed by the application. It should always be performed on user-supplied data to prevent unwanted characters from being processed by the application. Regular expressions are one way to achieve this.

Vulnerable source code example:

    String studentid = request.getParameter("studentid");
    if (studentid.isEmpty()) {
        request.setAttribute("examresult", "Please enter a value");
        RequestDispatcher rd = request.getRequestDispatcher("/Home.jsp");
        rd.include(request, response);
    } else {
        try {
            *** truncated ***
        }
        if (nodeList.getLength() > 0) {
            *** truncated ***
        } else {
            // User input is concatenated into the response without validation or encoding
            request.setAttribute("examresult", "No Results found for the input" + studentid);
            RequestDispatcher rd = request.getRequestDispatcher("/Home.jsp");
            rd.include(request, response);
        }
    }

In the final else branch of the preceding excerpt, the application takes user input, stores it in the variable studentid, and echoes it back when no results are found for that input. No validation is performed on the user-supplied input. When this response is embedded into the HTML page and rendered without appropriate output encoding, it causes a reflected XSS vulnerability.

When developers need to validate user input with custom validation filters, regular expressions can help. Let us assume that studentid must be a number. The following code shows how one can validate the user input using a regular expression that allows only whitelisted characters.
Secure source code example:

    String studentid = request.getParameter("studentid");
    // The backslash is doubled because it must be escaped in a Java string literal
    String regex = "\\d+";
    if (studentid.isEmpty()) {
        request.setAttribute("examresult", "Please enter a value");
        RequestDispatcher rd = request.getRequestDispatcher("/Home.jsp");
        rd.include(request, response);
    } else {
        if (studentid.matches(regex)) {
            // Go and get the exam results
            *** truncated ***
        } else {
            // Generic error message; the rejected input is not echoed back
            request.setAttribute("examresult", "Entered input is not a valid number");
            RequestDispatcher rd = request.getRequestDispatcher("/Home.jsp");
            rd.include(request, response);
        }
    }

In the above excerpt, the studentid entered by the user is validated against a regular expression through the call studentid.matches(regex). The regular expression defined in the line String regex = "\\d+"; ensures that the whole input consists of digits only. The following steps take place in the preceding code.

1. The application receives the POST parameter studentid.
2. A regular expression that matches numbers is defined.
3. Before the user input is processed, it is checked against the regular expression to make sure that it is a number.
4. If the input is not a number, the user is shown a generic error message.
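The steps above can also be packaged into a small reusable helper. The following is a minimal sketch, not the article's code: the class name StudentIdValidator, the method isValid, and the ten-digit upper bound are assumptions made for illustration. It precompiles the pattern once and treats a missing parameter (null) as invalid instead of letting it cause a NullPointerException.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the article's code base.
final class StudentIdValidator {

    // Whitelist: one to ten ASCII digits. The upper bound is an assumption;
    // the article's pattern (\d+) accepts digit strings of any length.
    private static final Pattern STUDENT_ID = Pattern.compile("[0-9]{1,10}");

    private StudentIdValidator() {
    }

    // Returns true only when the whole input consists of 1 to 10 ASCII digits.
    // A null input (parameter missing from the request) is rejected.
    static boolean isValid(String studentid) {
        return studentid != null && STUDENT_ID.matcher(studentid).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("10234"));                     // true
        System.out.println(isValid("<script>alert(1)</script>")); // false
        System.out.println(isValid(""));                          // false
        System.out.println(isValid(null));                        // false
    }
}
```

In a servlet, a call such as StudentIdValidator.isValid(request.getParameter("studentid")) could replace the separate isEmpty() and matches() checks, since an empty string does not match the pattern either.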

Output Encoding:

There can be scenarios where the application requires user input to be placed directly into the HTML body. In such scenarios, input validation alone is not sufficient and an alternative technique must be used. This is where output encoding comes in. Output encoding is another way of sanitizing data: it ensures that special characters are properly encoded before being displayed, so that they are treated as data instead of being executed as code.

The IBM developerWorks portal describes a utility class called EscapeUtils, which provides output encoding features for Java web applications. It can be found at the following URL.

https://www.ibm.com/developerworks/library/se-prevent/

We can use EscapeUtils to add output encoding to the code snippet shown earlier. Following is the code within EscapeUtils.

    import java.io.IOException;
    import java.io.StringWriter;
    import java.io.Writer;
    import java.util.HashMap;

    public class EscapeUtils {

        // Maps the decimal value of a character to its HTML entity
        public static final HashMap<Integer, String> m = new HashMap<Integer, String>();

        static {
            m.put(34, "&quot;"); // " - quote
            m.put(60, "&lt;");   // < - less-than
            m.put(62, "&gt;");   // > - greater-than
            // User needs to map all html entities with their corresponding decimal values.
            // Refer to an HTML entity table for the mapping of entities and integer value of a char
        }

        public static String escapeHtml(String input) {
            String str = input;
            try {
                StringWriter writer = new StringWriter((int) (str.length() * 1.5));
                escape(writer, str);
                System.out.println("encoded string is " + writer.toString());
                return writer.toString();
            } catch (IOException ioe) {
                ioe.printStackTrace();
                return null;
            }
        }

        public static void escape(Writer writer, String str) throws IOException {
            int len = str.length();
            for (int i = 0; i < len; i++) {
                char c = str.charAt(i);
                int ascii = (int) c;
                String entityName = m.get(ascii);
                if (entityName == null) {
                    if (c > 0x7F) {
                        // Non-ASCII characters are written as numeric character references
                        writer.write("&#");
                        writer.write(Integer.toString(c, 10));
                        writer.write(';');
                    } else {
                        writer.write(c);
                    }
                } else {
                    writer.write(entityName);
                }
            }
        }
    }

As the preceding code shows, the following lines are added to EscapeUtils.java.

    m.put(34, "&quot;"); // " - double quote
    m.put(60, "&lt;");   // < - less-than
    m.put(62, "&gt;");   // > - greater-than

These lines map the decimal values of characters to their corresponding HTML entities, so whenever one of the mapped characters is detected, it is safely encoded. Not all characters are listed here; for example, the ampersand (38) and the single quote (39) are not covered. Developers therefore need to map all HTML entities to their corresponding decimal values.
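As a hedged illustration of what a more complete mapping might look like, the sketch below covers the five characters that are most significant in HTML, adding the ampersand and the single quote to the three shown above. The class name HtmlEntityEscaper and the choice to return an empty string for null input are assumptions for this example; it is not the EscapeUtils code from the IBM article.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; a sketch that extends the mapping idea above.
final class HtmlEntityEscaper {

    // Character-to-entity table for the five characters significant in HTML.
    private static final Map<Character, String> ENTITIES = new HashMap<>();

    static {
        ENTITIES.put('&', "&amp;");  // 38 - ampersand
        ENTITIES.put('"', "&quot;"); // 34 - double quote
        ENTITIES.put('\'', "&#39;"); // 39 - single quote
        ENTITIES.put('<', "&lt;");   // 60 - less-than
        ENTITIES.put('>', "&gt;");   // 62 - greater-than
    }

    private HtmlEntityEscaper() {
    }

    // Replaces each mapped character with its entity and copies all others.
    // Returning "" for null is an assumption made for this sketch.
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String entity = ENTITIES.get(c);
            if (entity != null) {
                out.append(entity);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: Tom &amp; &quot;Jerry&quot; &lt;b&gt;
        System.out.println(escapeHtml("Tom & \"Jerry\" <b>"));
    }
}
```

Because each character is examined exactly once, an ampersand that is part of an entity written by the escaper is never encoded a second time.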
Secure source code example:

    String studentid = request.getParameter("studentid");
    if (studentid.isEmpty()) {
        request.setAttribute("examresult", "Please enter a value");
        RequestDispatcher rd = request.getRequestDispatcher("/Home.jsp");
        rd.include(request, response);
    } else {
        try {
            *** truncated ***
        }
        if (nodeList.getLength() > 0) {
            *** truncated ***
        } else {
            // Encode the user input before it is placed into the response
            studentid = EscapeUtils.escapeHtml(studentid);
            request.setAttribute("examresult", "No Results found for the input" + studentid);
            RequestDispatcher rd = request.getRequestDispatcher("/Home.jsp");
            rd.include(request, response);
        }
    }

In the preceding excerpt, the data received by the application is passed to the EscapeUtils.escapeHtml() function before it is written into the response. Any special characters that go through EscapeUtils.escapeHtml() are encoded, and the browser displays them on the page as text. If an XSS payload goes through this code, it is encoded in the same way: for example, a payload such as <script>alert(1)</script> is returned as &lt;script&gt;alert(1)&lt;/script&gt; and is rendered as text instead of being executed.
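To make the difference visible outside a servlet container, the following sketch builds the "No Results found" message with and without encoding for a sample payload. The class ReflectedMessageDemo, its method names, the payload, and the replace-based escapeHtml stand-in are all assumptions for illustration; in the article's code the encoding is done by EscapeUtils.escapeHtml().

```java
// Hypothetical demo class; not part of the article's application.
final class ReflectedMessageDemo {

    private ReflectedMessageDemo() {
    }

    // Minimal stand-in for an HTML escaper. '&' is replaced first so that the
    // entities produced by the later replacements are not encoded twice.
    static String escapeHtml(String input) {
        return input.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Mirrors the vulnerable branch: user input is concatenated as-is.
    static String vulnerableMessage(String studentid) {
        return "No Results found for the input" + studentid;
    }

    // Mirrors the secure branch: user input is encoded before concatenation.
    static String encodedMessage(String studentid) {
        return "No Results found for the input" + escapeHtml(studentid);
    }

    public static void main(String[] args) {
        String payload = "<script>alert(1)</script>";
        // Prints: No Results found for the input<script>alert(1)</script>
        System.out.println(vulnerableMessage(payload));
        // Prints: No Results found for the input&lt;script&gt;alert(1)&lt;/script&gt;
        System.out.println(encodedMessage(payload));
    }
}
```

The first message would be interpreted by a browser as markup containing a script element, while the second contains only text that displays the original characters.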

Conclusion:

Cross Site Scripting vulnerabilities are often underestimated on the assumption that they are not dangerous. Developers must be aware of their causes and of the ways to mitigate them. This article has provided practical examples of how Cross Site Scripting vulnerabilities can be mitigated by using Input Validation and Output Encoding.


Sources:

https://owasp.org/www-community/attacks/xss/
https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html
https://owasp.org/www-project-top-ten/