Security for building modern web apps
This article is inspired by this post, a great article about things to know before building a web app nowadays. It isn't a very long list, but several security recommendations were left out, so I'm moving my ass and sharing the knowledge.
This article is aimed at developers from startups who want to develop a web application from scratch, don't know much about information security and don't want to spend much time adding security to their applications either. That said, some important activities, such as threat modeling and continuous delivery security, will not be discussed here. The goal is not to replace existing code security checklists (e.g., OWASP, SANS), but to complement them with modern advice. After all, security concepts are in general very old (e.g., security design principles were defined in the 70s); they are present today and will be present in the future, but there is a need to adapt them to our reality.
Note: Although lists and articles like these are helpful, security is a process that must stay very close to the development process from the beginning. Always consider bringing in an application security professional to help you out.
Output filtering: the famous Cross-Site Scripting, also known as "XSS" or "HTML Injection", strikes when output is not filtered, letting an attacker execute arbitrary script code in the user's browser. Its defense depends on context, such as whether the dynamic value is placed in an HTML tag attribute (onclick, onload, etc.) or within the body (e.g., $("p:first").html(dangerousVariable)). HTML-encode values such as dangerousVariable, and only allow a whitelist of values if you store any dynamic variable in an HTML tag attribute. Filtering the input will help, but remember that XSS depends on context, so a given filter may not be sufficient everywhere. I've explained XSS in a detailed post here (PT-BR).
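To make the encoding step concrete, here is a minimal sketch in Python using the standard library's html.escape. The function name render_comment and the sample payload are made up for illustration. It only covers the HTML body context; attribute, URL and JavaScript contexts need their own encoding rules.

```python
import html


def render_comment(user_text: str) -> str:
    """Wrap untrusted text in a paragraph, HTML-encoding it first."""
    # html.escape replaces &, < and >; quote=True also encodes " and '.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


dangerous_variable = '<script>alert("xss")</script>'
print(render_comment(dangerous_variable))
```

In a real app, the template engine of your framework normally does this for you, as long as you don't switch auto-escaping off.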
Use Static Pages: the good thing about Single Page Applications (SPA), besides the reduced traffic due to ajax requests, is having a static frontend. It means less attack surface and less cost, so you can host all your content in Amazon S3, for example, and let Amazon secure it for you, which is great if you don't have a security team or if your security team isn't more skilled than theirs. The downside is the lack of custom HTTPS certificate support. For that you'll need to move to Amazon CloudFront (CDN), which is easy to set up and will boost the availability of your web app. The downside there is the need to handle asset invalidation, but there's not much to it. There are some versioning techniques based on file names, which suck, but are better than nothing :/
Don't leave HTML comments behind: there are security tools, e.g., OWASP WebScarab, that search for HTML comments and present them to the attacker to see if they contain anything useful. Remove HTML comments. If you need comments, write them in your server-side template language so that, when the view is generated, they do not appear in the response.
Perform Client-Side Validation (and server-side as well): client-side validation must not replace server-side validation, but it has two advantages: 1) better user experience because of the rapid feedback and 2) fewer useless requests to the backend, which preserves its availability.
Logout should be visible on every page: please don't forget that. Preferably put it in an expected place, such as the top right corner after clicking on the user's avatar.
Consider JSON Web Tokens (JWT) instead of Sessions: you can have stateless servers relying on JWT instead of sessions and a database to back them. The drawback here is confidentiality; remember the item above. This way you can improve your application's availability and prevent Cross-Site Request Forgery (CSRF) attacks if you store the tokens in LocalStorage instead of cookies. The problem behind CSRF is the dumbness of the browser, which sends your cookies to the server even in cross-site requests.
Remember that LocalStorage is vulnerable to XSS, while cookies with the HttpOnly flag cannot be read by scripts! Although this makes cookies with the HttpOnly flag good for storing session identifiers, they bring the risk of CSRF. It's a trade-off, so keep this in mind.
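To show what "stateless" means in practice, below is a rough sketch of signing and verifying an HS256 token with only the Python standard library. The function names, the claims and the rule that a token without "exp" is rejected are my assumptions for the example; in production, prefer a maintained JWT library. Note that the payload is only base64url-encoded and signed, not encrypted, so anyone holding the token can read it.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_signature(signing_input, secret))


def verify_token(token: str, secret: bytes, now=None):
    """Return the claims if signature and expiry are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Never let the token choose the algorithm: accept HS256 only.
        if header.get("alg") != "HS256":
            return None
        expected = _signature(header_b64 + "." + payload_b64, secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    expires_at = claims.get("exp")
    # Assumption of this sketch: tokens without a numeric "exp" are rejected.
    if not isinstance(expires_at, (int, float)) or current >= expires_at:
        return None
    return claims
```

The server keeps only the secret; every request carries the token, and no session lookup is needed.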
Pick a web framework, MVC at least: get away from loose scripts for building web apps. The most used frameworks already give you some protection (e.g., CSRF protection, security headers) that you would otherwise need to implement yourself if you're coding plain PHP scripts, for example. However, be careful, or you may fall for the next item:
Avoid Too Much Magic: I'd say that this is the most common flaw among developers. They're so delighted with the usefulness of certain features or frameworks that they blindly trust them. That gives room to a lot of security flaws and bugs. The most common example here is OAuth libraries. When SSO is needed, make sure you understand in detail how the SSO will work. Otherwise you'll have authentication / authorization bypasses. There is no free lunch in development either. Do your homework before plugging some unknown code into your application: perform some code review and static analysis, check for known bugs (CVEs), and read the RFC whenever possible. Don't be blind, especially in critical parts of your web app such as authentication, authorization, accountability and payment processing / gift cards.
Validate CORS Origin: unless you're exposing an API to the whole world, you should allow only the origin of your Single Page Application (SPA) to avoid in-browser calls from other websites.
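A minimal sketch of the origin check, in Python and independent of any framework. The allowed origin https://app.example.com and the function name cors_headers are placeholders I made up; the idea is to echo the Origin header back only when it is on your allowlist, and send no CORS headers otherwise.

```python
# Hypothetical allowlist: the origin where your SPA is served from.
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})


def cors_headers(request_origin):
    """Return CORS response headers for an allowed origin, or none at all."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    return {}
```

Comparing against exact strings avoids the classic mistake of substring or suffix matching, which lets origins like https://app.example.com.evil.test through.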
Set Cookie flag Secure by default: the Secure flag allows cookies to be transmitted only over HTTPS connections, which is great, but you need to have an HTTPS port listening. HTTPS should be a must nowadays, not only for security but also for increasing your Google rank in search queries. However, you can't use a custom cert in Amazon S3 as far as I know. You'll need to deploy your custom cert to Amazon CloudFront (CDN), and handing over your private key is kind of bad, but for small teams there aren't many options. CloudFlare thought about that and developed Keyless SSL, but you need to set up a server that handles the part of every SSL handshake that uses the key, and that means more servers and more cost of course.
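For reference, this is roughly what setting the flags looks like with Python's standard http.cookies module. The cookie name session_id is just an example, and most frameworks expose the same flags through their session configuration.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build the value of a Set-Cookie header with Secure and HttpOnly set."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["secure"] = True    # only sent over HTTPS
    cookie["session_id"]["httponly"] = True  # not readable from JavaScript
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


print(session_cookie_header("abc123"))
```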
Avoid Business Logic Bypass: one of the most common flaws out there is authorization bypass; you can see it happening even on Facebook. For example, when editing the user account details, are you sure that if the user embeds the user_id of another user, your application will block the update attempt? You need to verify it carefully across all your controllers! This is usually a validation that developers must implement on their own, so it's commonly skipped, neglected or badly implemented. Test it yourself, ask someone with a security background to test it and even write some unit tests to verify your controls! It's also worth noting the Mass Assignment Vulnerability, the same vulnerability that Homakov exploited to hack GitHub. Basically you need to whitelist your model params, otherwise an attacker can set model params by guessing their names and leveraging the "framework magic" of constructing the model object from the request parameters.
Put a damn CSRF protection in your API: your web framework usually works hard to make you use CSRF protection, and when you build an API and see the "CSRF token not sent in the request" message, you usually disable it to make things work. Don't do this. CSRF is extremely dangerous; inform yourself, and make sure to add a CSRF token even to API calls. You can do this in basically 3 ways:
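Both checks fit in a few lines. The sketch below uses a plain dict as a stand-in for the database, and the names update_account, EDITABLE_FIELDS and Forbidden are hypothetical; the point is that the ownership check and the param whitelist happen on the server, before anything is written.

```python
# Hypothetical whitelist of fields a user may change on their own account.
EDITABLE_FIELDS = {"name", "bio"}


class Forbidden(Exception):
    """Raised when a user tries to act on a resource they don't own."""


def update_account(current_user_id, target_user_id, params, accounts):
    """Update an account only if it belongs to the authenticated user."""
    # Authorization: the id comes from the session, never from the request body.
    if current_user_id != target_user_id:
        raise Forbidden("cannot edit another user's account")
    # Mass assignment protection: drop every param that is not whitelisted.
    allowed = {key: value for key, value in params.items() if key in EDITABLE_FIELDS}
    accounts[target_user_id].update(allowed)
    return accounts[target_user_id]
```

A unit test that sends is_admin=true, or another user's id, and expects it to be ignored or rejected is exactly the kind of test suggested above.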
- Stateful Session: add a CSRF random token to each session and check in every request if they match;
- Stateless Double cookie submit technique: the attacker can manipulate the request body but can't manipulate cookies, because they belong to another domain. So you send the same random value both in a cookie and in the body, and let the server check that they match. There are some techniques to bypass this if your users (or 3rd party scripts, such as advertisements) can control any subdomain. Check this paper from Blackhat for more info.
- Stateless Json Web Tokens: stored in the LocalStorage and sent in every request. The attacker cannot access cross-domain localstorage.
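As an illustration of the double cookie submit option above, here is a small Python sketch. The function names are mine; the token comes from the secrets module and the comparison is done in constant time.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Random value to place both in a cookie and in the page or request body."""
    return secrets.token_urlsafe(32)


def csrf_ok(cookie_token, submitted_token) -> bool:
    """Accept the request only if both copies are present and identical."""
    if not cookie_token or not submitted_token:
        return False
    return hmac.compare_digest(
        cookie_token.encode("utf-8"), submitted_token.encode("utf-8")
    )
```

The stateful variant is the same comparison, except that the expected token is read from the server-side session instead of from a cookie.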
Don't give access to ALL operations in ALL resources in your AWS account: it won't take much time to figure out the right permissions for your application's AWS access credentials. Don't be dumb enough to allow access to everything. If you commit your keys to a public GitHub repo (with full access you're ruined, see this), or get hacked or whatever, the impact will be much smaller, at the cost of a few minutes to set this up right.
Don't store credentials in your source code: read them from the environment, or from a file that is deployed separately from the source code. It may give you some trouble at first, but some libraries make it really easy, such as the dotenv gem for Ruby.
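The same idea in Python, as a small sketch. The variable name DATABASE_URL is only an example; failing loudly at startup when it is missing is usually better than falling back to a default baked into the code.

```python
import os


def load_database_url(environ=os.environ) -> str:
    """Read the connection string from the environment, never from source code."""
    url = environ.get("DATABASE_URL")  # example variable name
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    return url
```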
When making a Server To Server communication, VERIFY the endpoint certificate. Consider PINNING it or its public key: when you're browsing some HTTPS website, your browser verifies its certificate against its trusted CAs. But when you're doing server to server communication, who verifies the certificates for you? Often no one, depending on your HTTP client and how it is configured, so you may need to set up your own logic to verify the endpoint certificate. Don't move forward until you verify it, otherwise you're simply discarding the usefulness of SSL/TLS. Besides encrypting the data in transit, the other goal of HTTPS is to verify the authenticity of the endpoint, thus protecting against man-in-the-middle attacks. Consider using Certificate Pinning or, even better, Public Key Pinning. There's a very good article from OWASP explaining this, so I won't go into much detail. The basic idea is that you talk only to whom you are expecting, e.g., you generate a digest from a given X509 certificate and compare it to a hard-coded digest. However, there is a problem if the certificate is revoked / changed: there will be a denial of service. The better option is to use public key pinning, because the public key is present in the X509 certificate and, unless the certificate was generated using another key pair, no matter how many certificates are revoked / changed, your endpoint will still be verified because of the public key. I'd say that it is a must for mobile apps as well.
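Here is a sketch of the certificate digest comparison described above, using Python's ssl module. The function names and the idea of passing the pin as a hex string are my assumptions. It pins the whole certificate, not the public key; public key pinning additionally requires extracting the key from the certificate with an X.509 parser, which is not shown here.

```python
import hashlib
import socket
import ssl


def certificate_fingerprint(der_cert: bytes) -> str:
    """SHA-256 digest of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_sha256_hex: str) -> bool:
    return certificate_fingerprint(der_cert) == pinned_sha256_hex.lower()


def open_pinned_connection(host: str, pinned_sha256_hex: str, port: int = 443):
    """Open a TLS connection, then refuse it if the certificate is not the pinned one."""
    # The default context already verifies the CA chain and the hostname.
    context = ssl.create_default_context()
    raw_socket = socket.create_connection((host, port), timeout=10)
    try:
        tls_socket = context.wrap_socket(raw_socket, server_hostname=host)
    except Exception:
        raw_socket.close()
        raise
    der_cert = tls_socket.getpeercert(binary_form=True)
    if der_cert is None or not matches_pin(der_cert, pinned_sha256_hex):
        tls_socket.close()
        raise ssl.SSLError("certificate does not match the pinned fingerprint")
    return tls_socket
```

Whatever you do, don't "fix" certificate errors by disabling verification in your HTTP client; that is exactly the case the paragraph above warns about.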
Set up Security Headers: easily protect your web app from Clickjacking, Reflected XSS and IE content guessing by setting headers in the response (note: Ruby on Rails does most of that for you if you set your configuration correctly). For more details, check this OWASP page.
- X-Frame-Options: "DENY" or "SAMEORIGIN" to prevent Clickjacking;
- X-XSS-Protection: "1; mode=block" Forces the reflected XSS protection, which is enabled by default in Chrome, but not in IE.
- X-Content-Type-Options: "nosniff" Unfortunately IE tries to guess the content of the web page even if the Content-Type header says otherwise, which leads txt files to execute scripts if IE detects HTML code in them. Disable this behavior by using this header.
- Strict-Transport-Security: "max-age=16070400; includeSubDomains" HTTP Strict-Transport-Security (HSTS) enforces secure (HTTP over SSL/TLS) connections to the server. Even if the user types http, the browser will force HTTPS, which is great.
- There are others such as Content Security Policy (CSP) but I won't discuss them here.
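If your framework doesn't set these for you, the headers listed above can be added in one place. A framework-neutral Python sketch follows; the helper name with_security_headers is made up, and in practice this would live in a middleware or response hook.

```python
SECURITY_HEADERS = {
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "X-Content-Type-Options": "nosniff",
    "Strict-Transport-Security": "max-age=16070400; includeSubDomains",
}


def with_security_headers(response_headers: dict) -> dict:
    """Return a copy of the headers with the security headers filled in."""
    merged = dict(response_headers)
    for name, value in SECURITY_HEADERS.items():
        # Keep a value the application set on purpose, e.g. SAMEORIGIN.
        merged.setdefault(name, value)
    return merged
```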
Use CAPTCHA in your "Sign Up" and "Forgot Password" pages: Captcha isn't that boring today, thanks to Google's reCaptcha. Today you can verify if your user is a human based on their behavior instead of only challenges, thus preventing fake accounts and insane email deliveries.
Store API Key as you would store a Password (or as close as possible): if both leak the impact will be the same, so why store one more safely than the other? Actually there are some differences, but the point is not to store API Keys in plaintext. API Keys should be random characters generated by the system, so they won't be subject to dictionary attacks as passwords are, but still, in a database / filesystem / OS compromise, plaintext API keys will be exposed. That said, at least some hashing is needed here. But be careful if you use something like scrypt or bcrypt, which are highly recommended for passwords because of their slow hash computation. Slow hash computation can also result in denial of service. In a usual flow you input your password one time and get a session ID, but API credentials are passed on every request, so a slow computation will hurt the application's availability a lot. Storing the digest of the API Key, using a decent algorithm such as SHA256 or SHA512, should suffice for the first version of your app. Run away from MD5 and SHA1. Run away!
Use UUID instead of sequential IDs for Primary Key at least for users: it prevents user account guessing / brute force and facilitates replication. There are more advantages and a few disadvantages, but it's worth it. Note: it won't make you much more secure; from a security perspective it only adds unguessability/obscurity in comparison to sequential integers.
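A small Python sketch of this approach: generate a random key, hand it to the client once, and store only its SHA-256 digest. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_api_key(api_key: str) -> str:
    """Fast digest; fine here because the key is long and random, unlike a password."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def generate_api_key():
    """Return (key shown to the client once, digest to store in the database)."""
    api_key = secrets.token_urlsafe(32)
    return api_key, hash_api_key(api_key)


def api_key_matches(presented_key: str, stored_digest: str) -> bool:
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(hash_api_key(presented_key), stored_digest)
```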
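In Python this is a one-liner with the standard uuid module (the function name is just for illustration); most databases and ORMs also have a native UUID column type.

```python
import uuid


def new_user_id() -> str:
    """Random (version 4) UUID to use as the user's primary key."""
    return str(uuid.uuid4())
```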
Tokens for Forgot Password or Email Confirmation: When generating a token for Forgot Password or Email Confirmation, make sure to use a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG), otherwise the tokens could be guessed. Use trusted libraries / language APIs. Also set an Expiration Date/Time for this token. Imagine the situation where the user decides not to change the password after all, but one week later someone grabs that email, accesses the URL and changes the password. Unnecessary exposure.
Notify the old e-mail in email update: The most common action after account takeover is to change the account email to prevent the owner from recovering the password and signing in, so make sure to send an email to the old e-mail and add an option to revert the process. Facebook does that, check it out. It also applies to sensitive data updates. No matter who made the change, the account owner must be notified.
Disable port 80 instead of redirecting to 443: It only increases the attack surface. If 80 isn't needed, disable it. Remember that your API should listen only on 443. If you want redirection from 80 to 443, do it on your <Insert CDN name here>.
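A sketch of both points, random token plus expiration, using Python's secrets module. The one hour lifetime, the record layout and the choice to store only a digest of the token (so a database leak doesn't hand out valid reset links) are my assumptions, not requirements.

```python
import hashlib
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # assumed lifetime: one hour


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_reset_token(now=None):
    """Return (token to email to the user, record to store server-side)."""
    issued_at = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # backed by the OS CSPRNG
    record = {"token_sha256": _digest(token), "expires_at": issued_at + TOKEN_TTL_SECONDS}
    return token, record


def reset_token_valid(token: str, record: dict, now=None) -> bool:
    current = time.time() if now is None else now
    if current >= record["expires_at"]:
        return False
    return hmac.compare_digest(_digest(token), record["token_sha256"])
```

Remember to delete the record once the token has been used, so the link works only once.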
ALWAYS use Generic Error Messages: Keep in mind to ALWAYS use generic error messages, e.g., during a login attempt, don't say 'invalid username' or 'invalid password', just say 'invalid credentials' to make brute force harder, although it's still possible to enumerate emails during sign up, as your system probably will (and should) require emails to be unique per account. If your application raises an exception, just say 'Something went wrong', without ever exposing the stack trace. I also recommend using some solution that collects all exceptions and sends them to your email or presents them in a dashboard, such as Raygun, Sentry, Airbrake, etc.
Confirm the user email or phone: verify that it belongs to that user before sending emails / notifications. It's recommended to make this non-blocking, i.e., let the user log in even without confirmation, because blocking affects the onboarding. Take a look at Facebook: you can use your unconfirmed account for 1 day. After that you must confirm before signing in. I used to think that services like 10 minute mail made email confirmation useless, but as mentioned above, the benefits are not sending emails to users who do not want them and saving you from being unnecessarily marked as 'spam' by users.
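For the exception case, the pattern is: log the details on the server, return a generic message to the client. A minimal Python sketch follows; the wrapper name handle_request and the (status, body) return shape are invented for the example, and real frameworks offer an error handler hook for this.

```python
import logging

logger = logging.getLogger("app")


def handle_request(handler, *args):
    """Run a handler; on failure log the stack trace and return a generic error."""
    try:
        return 200, handler(*args)
    except Exception:
        # The full traceback goes to the server log (or your error tracker),
        # never to the response body.
        logger.exception("unhandled error in %s", getattr(handler, "__name__", "handler"))
        return 500, "Something went wrong"
```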
Others [Non-exclusive about security]
Don't pick any vendor just because they have cool features or a super low price: your data is at stake, and so is your reputation. There is a principle called "Reluctance to Trust", which means that you need to be careful before trusting. Reducing the number of entities that you trust is also a good thing. The more you trust, the bigger the attack surface. That said, I usually recommend playing it safe here. Bitbucket seems cheaper than GitHub in the beginning, but there is no 2 Factor authentication -- edit: they implemented 2fa recently -- How much is your source code worth? AWS leads the public cloud market and seems to be doing a great job when it comes to security, taking into account the amount of sensitive information they host and the eyeballs on them. So a cheaper price for instances alone isn't enough to make me use another service. Everything must be taken into account, but be aware that static pages accept anything, so it's common to see company pages talking about security like they are protected against APTs and use "SSL", which btw is deprecated. Have reluctance to trust, but when you do trust, verify!
(REST) API Oriented Development: if you look closer into AWS, you'll see that the API comes first, then the web UI and finally the SDKs. APIs are awesome and language independent. IMHO, that's the way to go. It's also worth taking a closer look at HATEOAS. This approach also makes it easy to visualize the segregation between parties. The client is static pages and the server is the brain that receives inputs and generates outputs for the frontend. It becomes clearer how to segregate roles and to note that the web server must validate input, for example. These roles are easily confused in non-API web apps.
Delegate Credit Card Processing: it's good advice to delegate risk to trusted entities when you can. If you're by yourself and start storing credit card data, think again. You have a very high responsibility. Wouldn't it be better to delegate it to a trusted payment provider such as Stripe or Paypal? I think it is, unless you can do better than that. So, make sure your app doesn't touch credit card data. Redirect to their website to finish the entire process if possible.
Where to go from here?
There's a plethora of information out there, just search for it. OWASP and SANS will help you a lot. They have many projects, articles, checklists and tools. I also recommend keeping an eye on security advisories from your tools and vendors. Besides all of this, always follow the Reddit channel /r/netsec.
You can now follow my next post: Security for later stage web apps.
Credits: Collin Greene for the 'generic error messages' topic; Reddit user _tpyo for the 'UUID note'; Reddit user oauth_gateau for pointing out the Blackhat paper regarding CSRF.