On our team, we have all managed cloud infrastructure at some level in previous jobs, whether as developers with serverless function needs or as DevOps engineers managing large production systems. And we all knew that sometimes an emergency required a manual change, a customer tweaked a setting in the console, the boss activated something that had an impact later… or, put more simply, that the whole infrastructure was just not totally under the control of Terraform (or a similar tool), because ops life in production is rarely perfect and ideal.
And all those tiny changes sometimes got forgotten, or were only partially implemented in infrastructure-as-code later on, unfortunately leading to security issues, unstable environments, failed deployments, and unexpectedly large bills. …
When talking about infrastructure drift, you often get knowing glances and heated answers. The gap between what you expect your infrastructure to be and what it really is, is a well-known and widespread issue bothering hundreds of DevOps teams around the globe. Interestingly, though, the exact definition of drift they give varies depending on their context.
Facing impacts and consequences ranging from intensive toil to dangerous security threats, many DevOps teams are keenly aware of the issue and actively looking for solutions.
We decided to look more closely into how they deal with it and conducted a study that will be released in the coming weeks. Here is a foretaste of this study, outlining some of the key facts we recorded. …
The need to manage multiple Terraform environments is very common. Getting started is one thing, but then you end up with various environments to manage, several teams, and so on. So how do you manage Terraform when you start having several environments like dev, staging, and prod, and how do you manage the complexity?
If you need to manage several Terraform environments, there are a lot of ways to ramp up your approach and get started.
Getting started step by step with single tf files
It can be daunting to learn all the good practices at first, and you might get distracted at best and discouraged at worst. So you can get started very naively with one single tf state. Create a simple Terraform file, call it production.tf, and write your VPCs, your VMs, or whatever in it. Very soon you will be able to create another environment. Let’s call it staging.tf. It is still going to be a single tf state file, which is not bad, but it won’t scale. …
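As a rough illustration of that naive starting point, a production.tf could look like the sketch below. It assumes the AWS provider; the region, CIDR ranges, resource names, and AMI ID are placeholders, not values from the article.

```hcl
# production.tf
# Sketch only: region, CIDR ranges, names and the AMI ID are placeholders.
provider "aws" {
  region = "eu-west-1"
}

# The network for the production environment.
resource "aws_vpc" "production" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "production" {
  vpc_id     = aws_vpc.production.id
  cidr_block = "10.0.1.0/24"
}

# One VM living in that subnet.
resource "aws_instance" "production_web" {
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.production.id
}
```

A staging.tf placed in the same directory would repeat these resources under staging names, and both files would be tracked in the same state, which is why this approach stops scaling.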
People often ask: can you, and should you, declare variables in Terraform?
One of the biggest issues I had in my “Chef” days was that I could multiply strings by booleans, which used to create very nice issues in production.
So yes, you can type variables in Terraform. Let me show you an example:
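A minimal sketch of typed variable declarations follows. The variable names, descriptions, and defaults are hypothetical and chosen only for illustration.

```hcl
# Each variable declares the type of value it accepts.
variable "instance_count" {
  type        = number
  description = "How many instances to launch"
  default     = 2
}

variable "enable_monitoring" {
  type        = bool
  description = "Whether detailed monitoring is switched on"
  default     = false
}

variable "environment" {
  type        = string
  description = "Name of the environment, for example staging"
}
```

With these constraints, a value that cannot be converted to the declared type (say, "many" for instance_count) is rejected by Terraform instead of surfacing later as a surprise in production.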
Terraform code quality is important and there are a lot of tools to improve it. A lot of them are quite difficult to use. Here are a few tools that we find really useful and can be set up in minutes for you.
Terraform works with providers for each cloud and has resources. Basically, you can see it as an instance to launch in which you describe what you want. Let’s see how internal tools can help you improve your Terraform code quality.
terraform validate is a subcommand in Terraform that only checks structure and coherence, which means that obviously bad code like this will be perfectly valid in the eyes of Terraform…
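A hedged illustration of the kind of code meant here: the instance type below does not exist, yet the block is structurally well formed. To my knowledge the AWS provider does not check instance_type against the list of real instance types during validation, though this may vary with the provider version. The resource name and AMI ID are placeholders.

```hcl
# Structurally correct, so it is expected to pass `terraform validate`,
# but the instance type is made up and would fail at apply time.
resource "aws_instance" "web" {
  ami           = "ami-00000000000000000" # hypothetical AMI ID
  instance_type = "t3.gigantic"           # not a real instance type
}
```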
There are different ways to manage manual changes on your infrastructure in Terraform, depending on the case. Here are 3 options:
Let’s say you have a security group that was changed manually by one of your team members, like opening an HTTP port for a specific subnet, and you discover this at the next terraform apply. This is an easy case.
You will have the diff in the terraform output (be it on CI or on your laptop). That means that you can add this difference as a snippet directly in your Terraform code and apply it. So you will manually add the code after the manual deployment, which means that at the next terraform apply, Terraform will notice that the state doesn’t have this new item (in our example, allowing HTTP for a subnet) and will ask your cloud provider (AWS, for example) to do it, and AWS will return: “I already have it”. So, in this easy case, the difference is just going to be written to the TFState file, and you have solved your case manually. …
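For the security group example, the snippet added after the fact might look like the sketch below. The group name, the vpc_id variable, and the subnet CIDR are assumptions for illustration; the ingress block stands for the rule that was opened by hand.

```hcl
# Hypothetical input: the VPC the security group belongs to.
variable "vpc_id" {
  type = string
}

resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = var.vpc_id

  # Rule that was added manually in the console and is now
  # written back into the code so that it matches reality.
  ingress {
    description = "HTTP from the internal subnet"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["10.0.1.0/24"] # placeholder subnet range
  }
}
```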
The TFState file in Terraform is what makes it very different from other systems. You can spin up and launch infrastructure with other configuration management tools like Chef, SaltStack, and Ansible, but the biggest difference with Terraform lies in this state.
You can see your TFState file as a big JSON structure describing the reality of your infrastructure, working together with the Terraform code in which you declare the so-called “desired state” you want to achieve.
This desired state is declarative, which means that when you declare within your code that you want a specific resource with a specific configuration, and when you apply this code, Terraform will “talk” to your cloud provider’s API and then spawn all those resources. …
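As a small sketch of what a declared desired state looks like, assuming the AWS provider and a made-up bucket name:

```hcl
# Desired state: "a bucket with this name and these tags should exist".
# Nothing here says how to create it; Terraform works that out on apply.
resource "aws_s3_bucket" "logs" {
  bucket = "example-team-logs" # hypothetical bucket name

  tags = {
    Environment = "staging"
  }
}
```

Once applied, the attributes of the real bucket are recorded in the state file, and later runs compare the code, the state, and what the provider reports.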
Yes! You should be testing your Terraform code. There are a lot of ways you can do it. It is very similar to standard software engineering processes. It comes from the same culture, so there is no surprise here.
You can run a linter directly on your laptop if that is what you want, or run it on your CI/CD system because you want to ensure others respect some coding conventions or standards.
One Terraform linter that I really like is TFLint. It is open source and available on GitHub. This linter even has a deep linting feature that goes beyond simple linting. Say you request a valid VM type, like a t3.large instance, but your account is not allowed any more capacity (for example, you already launched all the VMs you were allowed to request): the linter will send a request to the AWS API to check both the correctness of what you request and whether your request can be executed within your account limits. So it does all the things a proper linter does in development, like checking single quotes, quality, tabs, and so on, and it goes beyond the validator and the formatter that ship with Terraform. …
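In recent TFLint releases, deep checking is, as far as I know, enabled through the AWS ruleset plugin in a .tflint.hcl file; older releases exposed it as a command-line flag instead. The sketch below assumes the plugin-based setup, and the version shown is only a placeholder to pin against a release you have verified.

```hcl
# .tflint.hcl
# Sketch only: the version is a placeholder, pin it to a verified release.
plugin "aws" {
  enabled    = true
  deep_check = true # lets the ruleset query the AWS API with your credentials
  version    = "0.30.0"
  source     = "github.com/terraform-linters/tflint-ruleset-aws"
}
```

Because deep checking calls the AWS API, the linter needs valid credentials wherever it runs, including on CI.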
Using a version manager makes it way less painful to deal with multiple Terraform versions locally, and will make sure that:
tfenv is a good one, inspired by rbenv. Plus, it has a few convenient commands, such as tfenv install min-required, which will recursively go through your Terraform files to determine the minimum required version.
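The version constraint that this command looks for lives in the terraform block of your configuration. A minimal sketch, with an arbitrary constraint chosen for illustration:

```hcl
# The constraint below is an example value, not a recommendation.
terraform {
  required_version = ">= 0.13.0"
}
```

With a constraint like this one, the minimum required version would be expected to resolve to 0.13.0.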
When they happen, Terraform upgrades can hurt. They hurt even more when you get some random exception that you don’t understand, and then discover that it is happening because you are running the wrong Terraform version on that specific machine. …
This article is a transcript from a video interview series: Ask Me Anything on Infrastructure as Code with the Author of “Infrastructure as Code — cookbook”
Testing is a very common practice today in engineering.
We’ve all heard about test-driven development and that kind of thing. It is quite a widespread practice in some areas of the DevOps world, like, let’s say, in Chef or Puppet environments. There are a lot of testing frameworks and a big Ruby culture around tests.
In this world we had a lot of tools like InSpec, ChefSpec, Serverspec. They are all written in Ruby, and quite robust as well. They were released probably 4–5 years ago. …