Security for building modern web apps
This article is inspired by this post, a great article about things to know before building a web app nowadays. It isn't a very long list, but several security recommendations were left out, so I'm moving my ass and sharing the knowledge.
This article is aimed at developers at startups who want to build a web application from scratch, don't know much about information security, and don't want to spend much time adding security to their applications either. That said, some important activities will not be discussed here, such as threat modeling, continuous delivery security, etc. The goal here is not to replace existing code security checklists (e.g., OWASP, SANS), but to complement them with modern advice. After all, security concepts are, in general, very old (e.g., security design principles were defined in the 70s). They still apply today and will keep applying in the future, but they need to be adapted to our reality.
Note: Although lists and articles like these are helpful, security is a process that must be very close to the development process, since the beginning. Always consider an application security professional to help you out.
Output filtering: the famous Cross-Site Scripting, also known as "XSS" or "HTML Injection," strikes when dynamic values are written to the page without output filtering, letting an attacker execute arbitrary script in the victim's browser. Its defense depends on the context, such as whether the dynamic value is placed in an HTML tag attribute (onclick, onload, etc.) or within the body (e.g., $("p:first").html(dangerousVariable)). HTML-encode values such as dangerousVariable before writing them to the page, and if you store any dynamic value in an HTML tag attribute, only allow values from a whitelist. Filtering the input will help, but remember that XSS depends on context, so no single filter is sufficient everywhere. I've explained XSS in a detailed post here (PT-BR).
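As a rough illustration of output encoding for two contexts, here is a minimal Python sketch using only the standard library. The function names, the `/help` link and the `ALLOWED_TARGETS` whitelist are hypothetical and exist only for this example; a real application would normally rely on its template engine's auto-escaping.

```python
import html

# Hypothetical whitelist of values allowed inside an HTML attribute.
ALLOWED_TARGETS = {"_self", "_blank"}


def render_comment(user_input: str) -> str:
    """Encode an untrusted value before placing it in the HTML body."""
    # html.escape encodes &, < and >, plus both quote characters when quote=True.
    safe = html.escape(user_input, quote=True)
    return f"<p>{safe}</p>"


def render_link(target: str) -> str:
    """Only accept whitelisted values for an attribute; fall back otherwise."""
    if target not in ALLOWED_TARGETS:
        target = "_self"
    return f'<a href="/help" target="{target}">Help</a>'
```

The body context is handled by encoding, while the attribute context is handled by refusing anything that is not on the whitelist, which mirrors the advice above.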
Use Static Pages: the good thing about Single Page Applications (SPA), besides the reduced traffic due to AJAX requests, is to have a static frontend. It means less attack surface and less cost, so you can host all your content on Amazon S3, for example, and let Amazon secure that for you, which is great if you don't have a security team or if your security team isn't more skilled than theirs. The downside is the lack of custom HTTPS certificate support. You'll need to move to Amazon CloudFront (CDN), which is easy to set up and will boost the availability of your web app. The downside is the need to handle asset invalidations, but there's not much to do. They have some versioning techniques using file names, which sucks but is better than nothing :/
Don't leave HTML comments behind: there are security tools, e.g., OWASP WebScarab, that search for HTML comments to see if they contain anything useful to an attacker. Remove HTML comments. If you really need comments in your views, write them in your server-side template language so that they do not appear in the response when the view is generated.
Perform Client-Side Validation (and server-side as well): It must not replace server-side validation and has two advantages here: 1) better user experience because of the rapid feedback and 2) prevent useless requests to the backend, thus saving its availability.
Logout should be visible on every page: please don't forget that. Preferably in an expected place such as the top right corner after clicking on the user's avatar.
Consider using Stateful Sessions instead of JSON Web Tokens (JWT): JSON Web Tokens (JWTs) are suitable for machine-to-machine communication, but they are not ideal for Single Sign-On (SSO) scenarios due to their limitations. JWTs lack revocation capabilities, meaning that once issued, they remain valid until their expiration time, making it challenging to invalidate them if needed. Additionally, the security of JWTs heavily relies on using a secret key that is difficult to guess and secure libraries for parsing the tokens, which introduces additional complexity and potential vulnerabilities.
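Remember that LocalStorage is vulnerable to XSS, and Cookies with the HttpOnly flag are not!: anything in LocalStorage can be read by any script running on your page, while a cookie with the HttpOnly flag cannot be read from JavaScript. That makes an HttpOnly cookie a good place to store session identifiers, but because the browser sends cookies automatically, there's the risk of CSRF as well. It's a trade-off, so keep this information in mind.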
Pick a web framework, MVC at least: get away from scripts for building web apps. The most used frameworks already give you some protection (e.g., CSRF protection, Security headers) that you will need to implement if you're coding in PHP scripts, for example. However, be careful, you may fall for the next item:
Avoid Too Much Magic: I'd say that this is the most common flaw among developers. They're so delighted with the usefulness of certain features or frameworks that they blindly trust them. It gives room for a lot of security flaws and bugs. The most common example here is OAuth libraries. When SSO is needed, make sure to understand how the SSO will work in detail. Otherwise, you'll have authentication/authorization bypass. There is no free lunch in development either. Do your homework before plugging in some unknown code in your application, perform some code review, static analysis, check for known bugs (CVEs), read the RFC whenever possible, but don't be blind, even more in critical parts of your web app such as authentication, authorization, accountability, and payment processing/gift cards.
Validate CORS Origin: unless you're exposing an API to the whole world, you should allow only the origin of your Single Page Application (SPA) to avoid in-browser calls from other websites.
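A minimal sketch of that origin check in Python, assuming the backend gets the request's `Origin` header as a string. The `ALLOWED_ORIGINS` value and the `cors_headers` helper are hypothetical names for this example; most frameworks ship a CORS middleware that does the same job.

```python
from typing import Optional

# Hypothetical origin of the Single Page Application.
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(request_origin: Optional[str]) -> dict:
    """Return CORS response headers only for an exact, whitelisted origin."""
    if request_origin in ALLOWED_ORIGINS:
        # Echo the specific origin back instead of using "*".
        return {"Access-Control-Allow-Origin": request_origin, "Vary": "Origin"}
    # Unknown origin: send no CORS headers, so the browser blocks the read.
    return {}
```

Comparing against an exact list avoids the common mistake of matching origins with a loose substring or suffix check.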
Set the Cookie flag Secure by default: the Secure flag allows cookies to be transmitted only over HTTPS connections, which is great, but you need to have an HTTPS port listening. It should be a must nowadays, not only for security but also for increasing your Google rank in search queries. However, as far as I know, you can't use a custom certificate in Amazon S3. You'll need to deploy your custom cert to Amazon CloudFront (CDN), which is kind of bad to give your private key, but for small teams, there aren't many options. CloudFlare thought about that and developed Keyless SSL, but you need to set up a server that will handle all SSL handshakes, at least a part of it that uses the key, and it means more servers and more cost, of course.
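To make the cookie flags concrete, here is a small Python sketch that builds a `Set-Cookie` value with both the Secure and HttpOnly flags using the standard library. The cookie name `session_id` and the helper name are assumptions for the example; your framework most likely exposes equivalent settings.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie header value for a session identifier."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["secure"] = True    # only sent over HTTPS
    cookie["session_id"]["httponly"] = True  # not readable from JavaScript
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()
```

The returned string is what would go after `Set-Cookie:` in the response.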
Avoid Business Logic Bypass: one of the most common flaws out there is authorization bypass, even on Facebook you can see that happening. For example, when editing the user account details, are you sure that if the user embeds a user_id of another user, your application will block the update attempt? You need to verify that carefully among all your controllers! This usually requires some validation that the developer must implement on their own, so it's common to be skipped, neglected, or badly implemented. Test it yourself, ask someone with a security background to test it, and even make some unit tests to verify your controls! It's also worth noting the Mass Assignment Vulnerability, the same vulnerability that Homakov exploited to hack GitHub. Basically, you need to whitelist your model params; otherwise, an attacker can force model params by guessing their names and leverage the "framework magic" of constructing the model object from the request parameters.
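The two checks above (ownership of the record and whitelisting of model params) could look roughly like this in Python. Everything here is hypothetical: the `db` dictionary stands in for your data layer, and `ALLOWED_PROFILE_FIELDS` and `update_profile` are names invented for the sketch.

```python
# Hypothetical whitelist of fields a user may change on their own profile.
ALLOWED_PROFILE_FIELDS = {"name", "bio"}


class Forbidden(Exception):
    """Raised when a user tries to modify a resource they do not own."""


def update_profile(current_user_id: int, target_user_id: int,
                   params: dict, db: dict) -> dict:
    # Authorization: the user id comes from the session, never from the request.
    if current_user_id != target_user_id:
        raise Forbidden("cannot edit another user's profile")
    # Mass assignment: silently drop every parameter not on the whitelist.
    allowed = {k: v for k, v in params.items() if k in ALLOWED_PROFILE_FIELDS}
    db[target_user_id].update(allowed)
    return db[target_user_id]
```

A unit test that sends another user's id, or an extra `is_admin` parameter, is a cheap way to verify that both controls stay in place.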
Put a damn CSRF protection in your API: your web framework usually tries hard to make you use CSRF protection, and when you build an API and see the "CSRF token not sent in the request" message, you usually disable it to make it work. Don't do this. CSRF is extremely dangerous, so read up on it and make sure to add a CSRF token even to API calls. You can do this in basically 3 ways:
- Stateful Session: add a CSRF random token to each session and check in every request if they match;
- Stateless Double cookie submit technique: basically, as the attacker can manipulate the request body but can't manipulate cookies because they are from another domain, you send the same random value in the cookie and in the body to the server and let it check if they match; There are some techniques to bypass this if your users (or 3rd-party scripts, such as advertisements) could control any subdomain. Check this paper from Blackhat for more info.
- Stateless JSON Web Tokens: stored in LocalStorage and sent in every request. The attacker cannot access cross-domain LocalStorage.
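The first option in the list above (a random token stored in a stateful session) can be sketched in Python as follows. The function names are hypothetical, and how the token is stored in the session and read from the request depends on your framework.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Generate an unguessable token to store in the user's session."""
    return secrets.token_urlsafe(32)


def csrf_token_matches(session_token: str, submitted_token: str) -> bool:
    """Compare the session token with the one sent in the request."""
    if not session_token or not submitted_token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(session_token.encode(), submitted_token.encode())
```

The same comparison works for the double cookie submit technique, with the cookie value taking the place of the session token.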
Don't give access to ALL operations in ALL resources in your AWS account: it doesn't take much time to figure out the right permissions for your application's AWS access credentials. Don't be dumb enough to allow access to everything. If you commit your keys to a public GitHub repo (see this), get hacked, or whatever, you're ruined if those keys can do anything. With restricted permissions, the impact will be much smaller, at the cost of a few minutes to set this up right.
Don't store credentials in your source code: read them from the environment or from a file that is deployed separately from the source code. It may cause some trouble at first, but some libraries make it really easy, such as the dotenv gem for Ruby.
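The same idea in Python, as a minimal sketch: the variable name `DATABASE_URL` and the helper are assumptions for the example, not something any particular library requires.

```python
import os


def load_database_url(environ=os.environ) -> str:
    """Read a credential from the environment and fail loudly if it is missing."""
    value = environ.get("DATABASE_URL")
    if not value:
        # Failing at startup is better than running with a default secret.
        raise RuntimeError("DATABASE_URL is not set")
    return value
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.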
When making a Server-To-Server communication, VERIFY the endpoint certificate. Consider PINNING it or its public key: when you're browsing some HTTPS website, your browser verifies its certificate against its trusted CAs. But when you're doing a server-to-server communication, who verifies the certificates for you? Usually no one, so you need to set up your own logic to verify the endpoint certificate. Don't move forward until you verify it; otherwise, you're simply discarding the usefulness of SSL/TLS. Besides encrypting the data during the transmission, the other goal of HTTPS is to verify the authenticity of the endpoint, thus protecting against man-in-the-middle attacks. Consider using Certificate Pinning or, even better, Public Key pinning. There's a very good article from OWASP explaining this, so I won't go into much detail. The basics are that you talk only to who you are expecting, e.g., generate a digest from a given X509 certificate and compare it to a hard-coded digest. However, there is a problem if the certificate is revoked/changed. It will result in a denial of service. The better option is to use public key pinning because the public key is present in the X509 certificate, and unless the certificate was generated using another key pair, no matter how many certificates are revoked/changed, your endpoint will be verified because of the public key. I'd say that it is a must for mobile apps as well.
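Here is a hedged Python sketch of the simpler variant described above, certificate pinning by digest, on top of normal chain and hostname verification. The function names are hypothetical, and the pinned value would be the SHA-256 hex digest of the endpoint's DER-encoded certificate that you recorded beforehand. Public key pinning follows the same shape but hashes the certificate's public key instead, which needs an X509 parser that the standard library does not provide.

```python
import hashlib
import socket
import ssl


def fingerprint_matches(der_cert: bytes, pinned_sha256_hex: str) -> bool:
    """Compare the SHA-256 digest of a DER certificate with the pinned digest."""
    return hashlib.sha256(der_cert).hexdigest() == pinned_sha256_hex.lower()


def connect_pinned(host: str, pinned_sha256_hex: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TLS connection and refuse it unless the certificate is the pinned one."""
    # The default context already verifies the chain and the hostname.
    context = ssl.create_default_context()
    sock = socket.create_connection((host, port), timeout=10)
    try:
        tls = context.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise
    der_cert = tls.getpeercert(binary_form=True)
    if not der_cert or not fingerprint_matches(der_cert, pinned_sha256_hex):
        tls.close()
        raise ssl.SSLError("peer certificate does not match the pinned fingerprint")
    return tls
```

As noted above, a digest pin breaks whenever the certificate is reissued, so plan for updating the pinned value before the certificate rotates.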
Set up Security Headers: easily protect your web app from Clickjacking, Reflected XSS, and IE content guessing by setting headers in the response (note: Ruby on Rails does most of that for you if you configure it correctly). For more details, check this OWASP page.
- X-Frame-Options: "DENY" or "SAMEORIGIN" to prevent Clickjacking;
- X-XSS-Protection: "1; mode=block" Force the XSS Reflection protection, which is enabled by default in Chrome but not in IE.
- X-Content-Type-Options: "nosniff" Unfortunately, IE tries to guess the content of the response even if the Content-Type header declares another type, which leads to text files executing scripts if IE detects HTML code. Disable it by using this header.
- Strict-Transport-Security: "max-age=16070400; includeSubDomains" HTTP Strict-Transport-Security (HSTS) enforces secure (HTTP over SSL/TLS) connections to the server. Even if the user types http, the browser will force HTTPS, which is great.
- There are others such as Content Security Policy (CSP), but I won't discuss it here.
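If your framework does not set the headers listed above for you, a small helper can add them to every response. This Python sketch uses the values from the list; the `with_security_headers` name and the plain dictionary of headers are assumptions for illustration.

```python
# Header values taken from the list above.
SECURITY_HEADERS = {
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "X-Content-Type-Options": "nosniff",
    "Strict-Transport-Security": "max-age=16070400; includeSubDomains",
}


def with_security_headers(headers: dict) -> dict:
    """Return a copy of the response headers with the security headers added."""
    merged = dict(headers)
    for name, value in SECURITY_HEADERS.items():
        # setdefault keeps any value a specific endpoint has set on purpose.
        merged.setdefault(name, value)
    return merged
```

Applying this in one place, such as a middleware, avoids forgetting the headers on individual endpoints.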
Use CAPTCHA on your "Sign Up" and "Forgot Password" pages: CAPTCHA isn't that boring today, thanks to Google's reCAPTCHA. Today, you can verify if your user is human based on their behavior instead of only challenges, thus preventing fake accounts and excessive email deliveries.
Store API Keys as you would store a Password (or as close as possible): if both leak, the impact will be the same, so why store one more safely than the other? Actually, there are some differences, but the point is to not store API Keys in plaintext. API Keys should be random characters generated by the system, so they won't be subject to dictionary attacks like passwords are. However, in a database/filesystem/OS compromise, API keys will be available in plaintext. That said, at least some hashing is needed here. But be careful if you use something like scrypt or bcrypt, which is highly recommended for passwords because of their slow hash computation. Slow hash computation also results in a denial of service. So, in a usual flow, you input your password once and get a session ID, but when we talk about APIs, API credentials are passed all the time, so a slow generation will greatly affect the application's availability. Storing the digest of the API Key should suffice for the first version of your app, using a decent algorithm such as SHA256 or SHA512. Stay away from MD5 and SHA1.
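A minimal Python sketch of that approach: generate a random key, hand it to the client once, and store only its SHA-256 digest. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_api_key(api_key: str) -> str:
    """Fast digest of a high-entropy, system-generated API key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def generate_api_key() -> tuple:
    """Return (plaintext key to show the client once, digest to store)."""
    api_key = secrets.token_urlsafe(32)
    return api_key, hash_api_key(api_key)


def api_key_is_valid(presented_key: str, stored_digest: str) -> bool:
    """Hash the presented key and compare it with the stored digest."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_digest)
```

A fast hash is acceptable here only because the key is long and random; it would not be a reasonable choice for user-chosen passwords.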
Use UUIDs instead of sequential IDs for the Primary Key, at least for users: to prevent user account guessing/brute force and facilitate replication. There are more advantages and a few disadvantages, but it's worth it. Note: it won't make you much more secure; from a security perspective, it only adds unguessability/obscurity compared to sequential integers.
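For reference, generating such an identifier in Python is a one-liner with the standard library. The `new_user_id` helper is a hypothetical name; many databases and ORMs can also generate UUID keys themselves.

```python
import uuid


def new_user_id() -> str:
    """Return a random (version 4) UUID to use as a user's primary key."""
    return str(uuid.uuid4())
```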
Tokens for Forgot Password or Email Confirmation: When generating a token for Forgot Password or Email Confirmation, make sure to use a Cryptographically Secure Pseudo-random Number Generator (CSPRNG); otherwise, the tokens could be guessed. Use trusted libraries/language APIs. Also, set an Expiration Date/Time for this token. Imagine a situation where the user doesn't want to change their password, but one week later, someone grabs that email, accesses the URL, and changes their password. Unnecessary exposure.
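A short Python sketch of a reset token with an expiration time, using the standard library's `secrets` module. The one-hour `TOKEN_LIFETIME`, the record layout and the function names are assumptions for the example.

```python
import secrets
from datetime import datetime, timedelta, timezone

# Assumed lifetime; pick what fits your product.
TOKEN_LIFETIME = timedelta(hours=1)


def issue_reset_token(now=None) -> dict:
    """Create a random token and the moment it stops being valid."""
    now = now or datetime.now(timezone.utc)
    return {"token": secrets.token_urlsafe(32), "expires_at": now + TOKEN_LIFETIME}


def reset_token_is_valid(record: dict, presented: str, now=None) -> bool:
    """Accept the token only if it has not expired and matches exactly."""
    now = now or datetime.now(timezone.utc)
    if now >= record["expires_at"]:
        return False
    return secrets.compare_digest(record["token"].encode(), presented.encode())
```

It is also worth deleting the record after a successful use so that the same link cannot be replayed.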
Notify the old email in an email update: The most common action after account takeover is to change the account email to prevent the owner from recovering the password and signing in. So, make sure to send an email to the old email and add an option to revert the process. Facebook does that, check it out. It also applies to sensitive data updates. No matter who did it, but the account owner must be notified.
Disable port 80 instead of redirecting to 443: It only increases the attack surface. If port 80 isn't needed, disable it. Remember that your API should only listen on port 443. If you want redirection from 80 to 443, do it on your <Insert CDN name here>.
ALWAYS use generic error messages: for example, during a login attempt, don't say 'invalid username' or 'invalid password,' just say 'invalid credentials' to make brute force harder. Keep in mind that it may still be possible to enumerate emails during sign up, as your system probably will (and should) require emails to be unique per account. If your application generates an exception, just say 'Something went wrong,' without ever exposing the stack trace. I also recommend using some solution to collect all exceptions and send them to your email or present them in a dashboard, such as Raygun, Sentry, Airbrake, etc.
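The login example can be sketched in Python like this. The in-memory `users` dictionary, the record layout and the function names are hypothetical, and PBKDF2 is used only because it ships with the standard library; it stands in for whatever password hashing scheme you already use.

```python
import hashlib
import hmac

GENERIC_LOGIN_ERROR = "Invalid credentials"


def hash_password(password: str, salt: bytes) -> bytes:
    """Stand-in password hash for this sketch (PBKDF2 from the stdlib)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def login_error(users: dict, email: str, password: str):
    """Return None on success, or the same message for every kind of failure."""
    record = users.get(email)
    if record is None:
        return GENERIC_LOGIN_ERROR  # unknown email
    candidate = hash_password(password, record["salt"])
    if not hmac.compare_digest(candidate, record["hash"]):
        return GENERIC_LOGIN_ERROR  # wrong password, same message
    return None
```

Note that this sketch only makes the message identical; the unknown-email branch still returns faster, so response time can leak the same information unless you also equalize the work done.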
Confirm the user's email or phone: to verify that it belongs to that user before sending emails/notifications. It's recommended to make this non-blocking, i.e., let the user log in even without confirmation, because blocking hurts onboarding. Take a look at Facebook: you can use your unconfirmed account for 1 day. After that, you must confirm before signing in. I used to think that services like 10-minute mail render this email confirmation useless, but as mentioned above, the benefit is not sending emails to people who do not want them, saving you from being unnecessarily marked as 'spam' by users.
Others [Non-exclusive about security]
Don't pick any vendor just because they have cool features or a super low price: your data is at stake, so is your reputation. There is a principle called "Reluctance to Trust," which means that you need to be careful before trusting. Reducing the number of entities that you trust is also a good thing. The more you trust, the bigger the attack surface. That said, I usually recommend playing it safe here. Bitbucket may seem cheaper than GitHub in the beginning, but there is no 2-Factor authentication -- edit: they implemented 2fa recently -- How much is your source code worth? AWS leads the public cloud market compared to its competitors and seems to be doing a great job when it comes to security, considering the amount of sensitive information they host and the scrutiny on them. So just a cheaper price for instances isn't enough to make me use another service. Everything must be taken into account, but be aware that static pages accept anything, so it's common to see company pages talking about security as if they are protected against APTs and using "SSL," which by the way, is deprecated. Have reluctance to trust, but when you do trust, verify!
(REST) API Oriented Development: if you look closely at AWS, you'll see that API comes first, then the web UI, and finally SDKs. APIs are awesome, language-independent. In my humble opinion, that's the way to go. It's also worth looking closer at HATEOAS. It also makes it easy to visualize the segregation between parties. The client is the static pages, and the server is the brain that will receive inputs and generate outputs for the frontend. It is clearer to segregate roles and note that the web server must validate input, for example. Otherwise, confusion may arise in non-API web apps.
Delegate Credit Card Processing: it's good advice to delegate risk to trusted entities when you can. If you're by yourself and about to start storing credit card data, think again. It is a very high responsibility. Wouldn't it be better to delegate it to a trusted payment provider such as Stripe or PayPal? I think it is, unless you can do better than them. So, make sure your app doesn't touch credit card data. Redirect to their website to finish the entire process, if possible.
Where to go from here?
There's a plethora of information out there, just search for it. OWASP and SANS will help you a lot. They have many projects, articles, checklists, and tools. I also recommend keeping an eye on security advisories from your tools and vendors. Besides all of this, always follow the Reddit channel /r/netsec.
You can now follow my next post: Security for later-stage web apps.
Credits: Collin Greene for the 'generic error messages' topic; Reddit user _tpyo for the 'UUID note'; Reddit user oauth_gateau for pointing out the Blackhat paper regarding CSRF;