Leveraging Holistic Synergies

Triaging security issues in existing Django apps (This Old Pony #52)

Last week we looked over some strategies for prioritizing issues in an existing Django project specifically in the context of linter and static analysis reporting. That provided a good context, so now we can backtrack a little bit and cover the first priority, always, when updating an existing Django app: security!

FYI, last week's issue was mis-numbered as #50 when it was, in fact, #51, making this issue #52. 


You know what's great about third party packages? You get other people's code that already solves the same problems you had, and all for free! You know what the bad part about third party packages is? It's never actually free.

Sure, the marginal financial cost of adoption may be $0, but you still need to keep up-to-date, especially when these packages have security vulnerabilities. And the cost is yet higher if these packages go out of date without addressing said vulnerabilities, or supporting other software that does address them.

This is a roundabout way of pointing out that outdated dependencies are one of the easiest security issues to identify and _often_ among the easiest to solve (except when they're not). Now, it's generally a good thing to update your project's dependencies, but this entails its own costs, especially if you rely on many third-party packages or your project is deeply coupled with one or more of them.

The solution is to triage, using a tool like safety[0] to analyze your requirements and check them against a vulnerability database. This isn't a perfect strategy, but it's at least a good one. It won't tell you if your dependencies have 0 security vulnerabilities, but it will tell you if your dependencies have any known vulnerabilities[1]. So now you can focus on only these dependencies, even if others are out of date. Further, you can assess the particular vulnerabilities and decide whether they affect your application. They may not! All things being equal, there are significant benefits to keeping up-to-date, but given the cost of the upgrades and the opportunity costs elsewhere in the app, you can triage and focus only on those that affect your project.
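To make the triage idea concrete, here is a minimal sketch of what such a check does: compare pinned requirements against a list of known advisories and flag only the affected pins. The package names, advisory IDs, and the `ADVISORIES` structure are all hypothetical; this is not how safety itself is implemented, and a real tool handles far more version syntax than plain dotted integers.

```python
# Hypothetical advisory data: package name -> list of (first fixed version, advisory id).
# A real tool such as safety pulls this from a maintained vulnerability database.
ADVISORIES = {
    "examplepkg": [((1, 4, 2), "EXAMPLE-0001")],
    "otherpkg": [((2, 0, 0), "EXAMPLE-0002")],
}


def parse_version(text):
    """Turn '1.11.3' into (1, 11, 3); only handles plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def parse_requirements(lines):
    """Yield (name, version) for 'name==version' pins, skipping everything else."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        yield name.strip().lower(), parse_version(version.strip())


def triage(lines, advisories=ADVISORIES):
    """Return pins with a known advisory, i.e. pinned below the fixed version."""
    flagged = []
    for name, version in parse_requirements(lines):
        for fixed_in, advisory_id in advisories.get(name, []):
            if version < fixed_in:
                flagged.append((name, version, advisory_id))
    return flagged


requirements = [
    "examplepkg==1.4.0  # outdated and affected",
    "otherpkg==2.1.0    # already past the fix",
    "thirdpkg==0.9.1    # possibly outdated, but no known advisory",
]
flagged = triage(requirements)
```

Only `examplepkg` ends up in `flagged`. The other two pins may still be out of date, but they can wait.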

Can you keep a secret?

If it's hard coded in your source code then probably not. 

_Secrets_ include mainly passwords, but also API keys and your project's SECRET_KEY used for cryptographic signing (e.g. session values). If these values are present in your source code they are only as secure as your source code, and as the number of locations your source exists outside of your deployed environment increases, so does the likelihood of these secrets leaking.

"Hardly a problem," you say, "my team uses private repositories." And if all of these repos live on desktop computers on a secure intranet, and all of the developers have access to these credentials even without the source code, then maybe you're right. But in an age of laptops, sharing services, and cloud services, your source code is already all over the place. Even if you're not worried about hackers, there's a non-trivial chance of incidental exposure. Then there's the cost of not just redeploying after an accidental exposure but reconfiguring your code distribution on the fly... not to mention the opportunity for accidents involving access to the wrong environment because the credentials were available, and the developer friction this may cause.

A typical hacker in the wild, Homo hoodius

The first step is identifying the issue. I'd recommend using a tool like bandit[2] to perform security-oriented static analysis. Among other things, this will look for hard-coded passwords. Of course, you can start by looking through your settings and how these values are provided.

When it comes to solving, there are several solutions, and I'll group them into three categories (even though they overlap quite a bit)[3]:

  1. Use environment variables
  2. Use a server-side managed, deployment-specific settings file (e.g. "local_settings.py") that is manually updated but kept out of source control
  3. Do store values in the repo but keep them encrypted (e.g. using Ansible Vault[4]) and decrypt as part of deployment

Like I said, these overlap. Environment variables could be managed via a PaaS like Heroku or provided by an intermediating service (etcd); they could even be added to the process from a file, manually updated or stored in source control encrypted.
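As a rough illustration of the first option, a settings module can read its secrets from the environment and fail loudly when a required one is missing. The helper name `env_setting` and the variable names `DJANGO_SECRET_KEY` and `EXAMPLE_API_KEY` are assumptions made for this sketch; only `SECRET_KEY` is an actual Django setting.

```python
import os


def env_setting(name, default=None):
    """Read a setting from the environment; raise if it is required and missing."""
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError("Missing required environment variable: %s" % name)
    return value


# Demo only, so this sketch runs on its own. In a real deployment the value
# comes from the platform or process manager, never from the source code.
os.environ.setdefault("DJANGO_SECRET_KEY", "dev-only-not-a-real-secret")

# In settings.py (environment variable names here are illustrative):
SECRET_KEY = env_setting("DJANGO_SECRET_KEY")
EXAMPLE_API_KEY = env_setting("EXAMPLE_API_KEY", default="")
```

Raising at import time means a misconfigured deployment fails at startup rather than quietly running with a missing or placeholder secret.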

Johnny Drop Tables[5]

SQL injection is by now a quite mature attack in which malicious strings are submitted through form inputs so that attacker-supplied SQL ends up inside the queries your app expects to run.

It's mature and it's well guarded against, but it's still a threat. If you're using the ORM and nothing else, you have little to worry about. If you're using custom SQL then you must be absolutely sure that you a) use the driver's provided API for inserting inputs into queries and b) always do this with user-provided inputs.

Tools like bandit will also look for apparent SQL injection vulnerabilities, and if you find any they should be addressed immediately.
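Here is a small sketch of what "use the driver's provided API" means, using the standard library's sqlite3 driver so it runs anywhere. The table and input are made up for illustration. Note that sqlite3 uses `?` placeholders, whereas Django's `cursor.execute(sql, params)` and `Model.objects.raw(sql, params)` take `%s` placeholders; the principle is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students (name) VALUES (?)", ("Alice",))

user_input = "Robert'); DROP TABLE students;--"

# Unsafe: building the SQL by string formatting lets the input rewrite the query.
# query = "SELECT id FROM students WHERE name = '%s'" % user_input

# Safe: pass the value separately and let the driver handle quoting.
rows = conn.execute(
    "SELECT id FROM students WHERE name = ?", (user_input,)
).fetchall()

# The malicious string is treated as a plain value, so nothing matches
# and the table is still intact.
remaining = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```

The distinction to look for when auditing custom SQL is whether the user's value is pasted into the SQL string or handed to `execute` as a separate argument.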

Your system configuration

This is a little less Django specific but critical nonetheless, especially if you're managing your own [virtual] servers. None of this matters a lick if your servers are all exposed to the internet and unprotected.

If you're using a PaaS you can _mostly_ take for granted this is taken care of, but otherwise make sure you understand how your private network is configured (services like AWS and Digital Ocean will let you control this), use firewalls on your machines, and try to minimize what needs to be exposed. You can use standard Linux networking tools to perform basic port scanning.
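If you'd rather script a quick check than reach for a dedicated tool, a basic TCP port check can be sketched with the standard library alone. The function name `open_ports` and the port list are illustrative assumptions, and this is no substitute for a real scanner. Only point it at hosts you are responsible for.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the subset of ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


if __name__ == "__main__":
    # Illustrative ports: SSH, HTTP, HTTPS, PostgreSQL, Redis.
    print(open_ports("127.0.0.1", [22, 80, 443, 5432, 6379]))
```

Run it from outside your private network against a server's public address: anything beyond the ports you meant to expose (a database port, say) deserves a closer look.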

Secure HTTP

Contrary to what Google claims, there's no strict need for every site to be HTTPS, certainly not HTTPS only. For most web apps though, HTTPS should be the default, and it should be strictly enforced for admin access and most user interaction.

This is mostly not a Django concern. The HTTPS connection is brokered by the web server (e.g. Nginx, Apache, or even uWSGI). However if you have responsibility and power to keep this updated, take advantage of it, and if you do not, then make sure to prod those who do. 

From within Django you can enforce HTTPS connections only (dropping plain HTTP connections) and you can add in app-level redirects to ensure that, for example, the Django admin is always accessed over HTTPS, no matter what the default web server configuration allows.
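As a starting point, the settings fragment below shows the standard Django settings involved. Treat it as a sketch: the HSTS duration and the proxy header are assumptions about your deployment, `django.middleware.security.SecurityMiddleware` needs to be in your MIDDLEWARE for the redirect and HSTS settings to take effect, and strictly speaking SECURE_SSL_REDIRECT redirects plain HTTP requests to HTTPS rather than dropping them.

```python
# settings.py fragment: these are standard Django security settings.

# Redirect plain HTTP requests to HTTPS (handled by SecurityMiddleware).
SECURE_SSL_REDIRECT = True

# Only send session and CSRF cookies over HTTPS.
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True

# Ask browsers to use HTTPS for future visits. Start with a small value;
# one hour here is an arbitrary choice for illustration.
SECURE_HSTS_SECONDS = 3600

# Only if a proxy terminates TLS and reliably sets this header
# (an assumption about your deployment); otherwise leave it unset.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")
```

Running `python manage.py check --deploy` will report which of these and related settings your project is missing.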

Is this it? No. You might look at whether CSRF protection is enabled and whether your application code makes proper use of user permissions and access controls within the app itself. This is not your exhaustive reference, but a key for where to start when starting seems too daunting.

Inescapably yours,

[0] Safety uses the pyup database https://github.com/pyupio/safety
[1] This reminds me of my favorite medical question of all time, "Are you allergic to any medications?" The only sensible answers are "yes" and "I don't know". You either know you have at least one drug allergy or you have yet to encounter a drug or class of drugs to which you have an allergic reaction. The certainty of "no" is appealing but total BS!
[2] Bandit: https://github.com/PyCQA/bandit 
[3] An oldie but goodie from the Wellfire archives: https://wellfire.co/learn/easier-12-factor-django/
[4] Ansible Vault is but one way to handle encrypted hard coded data https://docs.ansible.com/ansible/2.4/vault.html
[5] In case you missed it: https://www.xkcd.com/327/

Originally published 2018-06-26