Sysadmins are a stereotypically grumpy bunch. Oh wait, no, that was infosec people. Or was it infosec sysadmins? The two jobs are intersecting at the corner of cynicism and experience, and while any senior system administrator worth their salt has all the information security basics down, we still find the two camps at logger heads all too frequently.
Information Security frequently covers not only the general aspects of applying sound principles, but also the often ridiculed area of “compliance”, where rules too frequently seem blindly imposed without a full understanding of the practical implications or even their effectiveness. To overcome this divide, it is necessary for both camps to better understand one another’s daily routine, practices, and the reasons behind them.
Information Security professionals would do well to reach out and sit with the operational staff for extended periods of time, to work with them and get an understanding of how the performance, stability, and security requirements are imposed and met in the so-called real-world.
Similarly, System Administrators need to understand the reasons behind any requirements imposed or suggested by an organization’s Security team(s). In an attempt to bring the two camps a little bit closer, this post will present some of the general information security principles to show that there’s reason behind what may at times seem madness.
The astute reader will be amused to find occasionally conflicting requirements, statements, or recommendations. It is worthwhile to remember Sturgeon’s Law. (No, not his revelation, although that certainly holds true in information security just as well as in software engineering or internet infrastructure.)
Nothing is always absolutely so.
Understanding this law and knowing when to apply it, to be able to decide when an exception to the rules is warranted is what makes a senior engineer. But before we go making exceptions, let’s first begin by understanding the concepts.
Defense in Depth
Security is like an onion: the more layers you peel away, the more it stinks. Within this analogy lies one of the most fundamental concepts applied over and over to protect your systems, your users and their data: the principle of defense in depth. In simple terms, this means that you must secure your assets against any and all threats – both from the inside (of your organization or network) as well as from the outside. One layer is not enough.
Having a firewall that blocks all traffic from the Big Bad Internet except port 443 does not mean that once you’re on the web server, you should be able to connect to any other system in the network. But this goes further: your organization’s employees connect to your network over a password protected wireless network or perhaps a VPN, but being able to get on the internal network should not grant you access to all other systems, nor to view data flying by across the network. Instead, we want to secure our endpoints and data even against adversaries who already are on a trusted network.
As you will see, defense in depth relates to many of the other concepts we discuss here. For now, keep in mind that you should never rely separate protection outside of your control.
Your biggest threat comes from the inside
Internal services are often used by large numbers of internal users; sometimes they need to be available to all internal users. Even experienced system administrators may question why it is necessary to secure and authenticate a resources that is supposed to be available to “everybody”. But defense in depth requires us to, as it hints at an uncomfortable belief held by your infosec colleagues: your organization either already has been compromised and you just don’t know it, or it will be compromised in the very near future. Always assume that the attacker is already on the inside.
While this may seem paranoid, experience has shown time and again that the majority of attacks occur or are aided from within the trusted network. This is necessarily so: attackers can seldom gather all the information or gain all the access required to achieve their goals purely from the outside (DDoS attacks may count as the obligatory exception to this rule – see above re Sturgeon’s Law). Instead, they usually follow a general process in which they first gain access to a system within the network and then elevate their privileges from there.
This is one of the reasons why it is important to secure internal resources to the same degree as services accessible from the outside. Traffic on the internal network should be encrypted in transit to prevent an adversary on your network being able to pull it off the wire (or the airwaves, as the case may be); it should require authentication to confirm (and log) the party accessing the data and deny anonymous use.
This can be inconvenient, especially when you have to secure a service that has been available without authentication and around which other tools have been built. Which brings us to the next point…
You can’t just rub some crypto on it
Once the Genie’s out of the bottle, it’s very, very difficult to get it back in. Granting people access or privileges is easy, taking them away is near impossible. That means that securing an existing service after it has been in use is an uphill battle, and one of the reasons why System Administrators and Information Security engineers need to work closely in the design, development and deployment of any new service.
To many junior operations people, “security” and “encryption” are near equivalent, and using “crypto” (perhaps even: ‘military grade cryptography’!) is seen as robitussin for your systems: rub some on it and walk it off. You’re gonna be fine.
But encryption is only one aspect of (information) security, and it can only help mitigate some threats. Given our desire for defense in depth, we are looking to implement end-to-end encryption of data in transit, but that alone is not sufficient. In order to improve our security posture, we also require authentication and authorization of our services’ consumers (both human and software alike).
Authentication != authorization
Authentication and authorization are two core concepts in information security which are confused or equated all too often. The reason for this is that in many areas the two are practically conflated. Consider, for example, the Unix system: by logging into the system, you are authenticating yourself, proving that you are who you claim to be, for example by offering proof of access to a given private ssh key. Once you are logged in, your actions are authorized, most commonly, by standard Unix access controls: the kernel decides whether or not you are allowed to read a file by looking at the bits in an inode’s st_mode, your uid and your group membership.
Many internal web services, however, perform authentication and authorization (often referred to as “authN” and “authZ” respectively) simultaneously: if you are allowed to log in, you are allowed to use the service. In many cases, this makes sense – however, we should be careful to accept this as a default. Authentication to a service should, generally, not imply access of all resources therein, yet all too often we transpose this model even to our trusty old Unix systems, where being able to log in implies having access to all world-readable files.
Principle of least privilege
Applying the concept of defense in depth to authorization brings us to the principal of least privilege. As noted above, we want to avoid having authentication imply authorization, and so we need to establish more fine grained access controls. In particular, we want to make sure that every user has exactly the privileges and permissions they require, but no more. This concept spans all systems and all access – it applies equally to human users requiring access to, say, your HR database as well as to system accounts running services, trying to access your user data… and everything in between.
Perhaps most importantly (and most directly applicable to system administrators), this precaution to only grant the minimal required access also needs to be considered in the context of super-user privileges, where it demands fine-grained access control lists and/or detailed sudoers(5) rules. Especially in environments where more and more developers, site reliability engineers, or operational staff require the ability to deploy, restart, or troubleshoot complex systems is it important to clearly define who can do what.
Extended filesystem Access Control Lists are a surprisingly underutilized tool: coarse division of privileges by generic groups (“admins”, “all-sudo”, or “wheel”, perhaps) are all too frequently the norm, and sudo(8) privileges are granted almost always in an all-or-nothing approach.
On the flip side, it is important for information security engineers to understand that trying to restrict users in their effort to get their job done is a futile endeavor: users will always find a way around restrictions that get in their way, often times in ways that further compromise overall security (“ssh tunnels” are an immediate red flag here, as they frequently are used to circumvent firewall restrictions and in the process may unintentionally create a backdoor into production systems). Borrowing a bit from the Zen of Python, it is almost always better to explicitly grant permissions than to implicitly assume they are denied (and then find that they are worked around).
Perfect as the enemy of the Good
Information security professionals and System Administrators alike have a tendency to strive for perfect solutions. System Administrators, however, often times have enough practical experience to know that those rarely exist, and that deploying a reasonable, but not perfect, solution to a problem upon which can be iterated in the future is almost always preferable.
Herein lies a frequent fallacy however, which many an engineer has derived: if a given restriction can be circumvented, then it is useless. If we cannot secure a resource 100%, then trying to do so is pointless, and may in fact be harmful.
A common scenario might be sudo(8) privileges: many of the commands we may grant developers to run using elevated privileges can be abused or exploited to gain a full root shell (prime example: anything that invokes an editor that allows you to run commands, such as via vi(1)’s “!command” mechanism). Would it not be better to simply grant the user full sudo(8) access to begin with?
Generally: no. The principle of least privilege requires us to be explicit and restrict access where we can. Knowing that the rules in place may be circumvented by a hostile user lets us circle back to the important concept of defense in depth, but we don’t have it easier for the attackers. (The audit log provided by requiring specific sudo(8) invocations is another beneficial side-effect.)
We mustn’t let “perfect” be the enemy of the “good” and give up when we cannot solve 100% of the problems. At the same time, though, it is also worth noting that we equally mustn’t let “good enough” become the enemy of the “good”: a half-assed solution that “stops the bleeding” will all too quickly become the new permanent basis for a larger system. As all sysadmins know too well, there is no such thing as a temporary solution.
If these demands seem conflicting to you… you’re right. Striking the right balance here is what is most difficult, and senior engineers of both camps will distinguish themselves by understanding the benefits and drawbacks of either approach.
Understanding your threat model
As we’ve seen above, and as you no doubt will experience yourself, we constantly have to make trade-offs. We want defense in depth, but we do not want to make our systems unusable; we require encryption for data in transit even on trusted systems, because, well, we don’t actually trust these systems; we require authentication and authorization, and desire to have sufficient fine-grained control to abide by the principle of least privilege, yet we can’t let “perfect” be the enemy of the “good”.
Deciding which trade-offs to make, which security mechanisms to employ, and when “good enough” is actually that, and not an excuse to avoid difficult work… all of this, infosec engineers will sing in unison, depends on your threat model.
But defining a “threat model” requires a deep understanding of the systems at hand, which is why System Administrators and their expertise are so valued. We need to be aware of what is being protected from what threat. We need to know what our adversaries and their motivations and capabilities are before we can determine the methods with which we might mitigate the risks.
Do as DevOps Does
As system administrators, it is important to understand the thought process and concepts behind security requirements. As a by-and-large self-taught profession, we rely on collaboration to learn from others.
As you encounter rules, regulations, demands, or suggestions made by your security team, keep the principles outlined in this post in mind, and then engage them and try to understand not only what exactly they’re asking of you, but also why they’re asking. Make sure to bring your junior staff along, to allow them to pick up these concepts and apply them in the so-called real world, in the process developing solid security habits.
Just like you, your information security colleagues, too, get up every morning and come to work with the desire to do the best job possible, not to ruin your day. Invite them to your team’s meetings; ask them to sit with you and learn about your processes, your users, your requirements.
Do as DevOps does, and ignite the SecOps spark in your organization.
There are far too many details that this already lengthy post could not possible cover in adequate depth. Consider the following a list of recommended reading for those who want to learn more:
Security through obscurity is terrible; that does not mean that obscurity cannot still provide some (additional) security. http://danielmiessler.com/study/security_and_obscurity/
Be aware of the differences between active and passive attacks. Active attacks may be easier to detect, as they are actively changing things in your environment; passive attacks like wire tapping or traffic analysis, are much harder to detect. These types of attacks have a different threat model. https://en.wikipedia.org/wiki/Attack_%28computing%29#Types_of_attacks
Don’t assume your tools are not going to be in the critical path. https://en.wikipedia.org/wiki/Shellshock_%28software_bug%29
Another example of why defense in depth is needed is the fact that often times seemingly minor or unimportant issues can be combined to become a critical issue. https://en.wikipedia.org/wiki/Privilege_escalation
The “Attacker Life Cycle”, frequently used within the context of so-called “Advanced Persistent Threats”, may help you understand more completely an adversaries process, and thus develop your threat model: https://en.wikipedia.org/wiki/Advanced_persistent_threat#APT_life_cycle
This old essay by Bruce Schneier is well worth a read and covers similar ground as this posting. It includes this valuable lesson: When in doubt, fail closed. “When an ATM fails, it shuts down; it doesn’t spew money out its slot.” https://www.schneier.com/essays/archives/2000/04/the_process_of_secur.html