• MPSimmons 2 hours ago

    Cloud providers in general haven't gone very far toward providing hooks for validation.

    It seems like it would be easy for the cloud provider to implement the equivalent of a dry-run flag on API calls, validating that the call would succeed (even if it's a best-effort determination). Tools like Terraform could use that during planning and dependency-tree generation.

    Instead, you have platform providers like AzureRM that squint at the supplied objects and make a guess as to whether they look valid, which causes a ton of failures upon actual application. For instance, if you try to create storage with a redundancy level not supported by the region you're adding it to, Terraform will pass the plan stage, but actually applying the resource will fail because the region doesn't support that level of redundancy.

    There are countless other examples in a similar vein, all of which could be resolved if API providers had a dry-run flag.
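    AWS already exposes something close to this for EC2: most EC2 APIs accept a DryRun flag, and boto3 reports the verdict as a ClientError whose error code says whether the real call would have succeeded. A minimal sketch of interpreting it (the helper name is mine):

```python
# EC2's DryRun=True validates a request without executing it; boto3
# surfaces the verdict as a ClientError, so "would succeed" shows up
# as the error code "DryRunOperation".

def dry_run_would_succeed(error_code: str) -> bool:
    # "DryRunOperation": the request is valid and would have succeeded.
    # Anything else (e.g. "UnauthorizedOperation"): it would fail.
    return error_code == "DryRunOperation"

# Usage against a real client (sketch; needs credentials):
#   try:
#       ec2.run_instances(DryRun=True, ImageId=ami, MinCount=1, MaxCount=1)
#   except botocore.exceptions.ClientError as e:
#       ok = dry_run_would_succeed(e.response["Error"]["Code"])
```

    The catch is that the flag exists per-service rather than uniformly across the platform, which is exactly why tools like Terraform can't lean on it.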

    • willi59549879 2 hours ago

      I am not a fan of abbreviations; this article didn't even have "Terraform" written out once.

      • parpfish 2 hours ago

        I assumed it was going to be about tensorflow

      • akersten 3 hours ago

        The most confusing part of terraform for me is that terraform's view of the infrastructure is a singleton config file that is often stored in that very infrastructure. And then you have to share that somehow with your team and be very careful that no one gets it out of sync.

        Why don't cloud providers have a nice way for tools like TF to query the current state of the infra? Maybe they do and I'm doing IaC wrong?

        • cobolexpert 2 hours ago

          At $WORK we have a Git repo set up by the devops team, where we can manage our junk by creating Terraform resources in our main AWS account.

          The state however is always stored in a _separate AWS account_ that only the devops team can manage. I find this to be a reasonable way of working with TF. I agree that it is confusing though, because one is using $PROVIDER to both create things and manage those things at the same time, but conceptually from TF’s perspective they are very different things.

          • raffraffraff 2 hours ago

            There are three things: the code, the recorded state of the infra when you applied the code, and the actual state at some point in the future (which may have drifted). You store the code in git and the recorded state (which contains unique IDs, ARNs, etc.) in a bucket; you read the "actual state" the next time you run a plan, which is how you detect drift.
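            That three-way split also explains drift detection: drift is just the diff between the recorded state and what the provider reads back at plan time. A toy illustration (attribute names are made up):

```python
def detect_drift(recorded: dict, actual: dict) -> dict:
    """Diff the recorded state against what the cloud API reports now.

    `recorded` is what the state file says you applied; `actual` is
    what the provider just read back from the cloud.
    """
    drift = {}
    for key, want in recorded.items():
        have = actual.get(key)
        if have != want:
            drift[key] = {"recorded": want, "actual": have}
    return drift

# Someone resized the instance in the console since the last apply:
recorded = {"instance_type": "t3.micro", "ami": "ami-123"}
actual = {"instance_type": "t3.large", "ami": "ami-123"}
# detect_drift(recorded, actual)
# -> {"instance_type": {"recorded": "t3.micro", "actual": "t3.large"}}
```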

            These days people store the state in Terraform Cloud or Spacelift or env0 or whatever. It doesn't have to be in the same infra you deployed.

            If you were a lunatic, you could skip a state backend entirely and just let Terraform create state files in the code directory, checking them into git with all those secrets and unique IDs.

            • don-code 2 hours ago

              > Why don't cloud providers have a nice way for tools like TF to query the current state of the infra? Maybe they do and I'm doing IaC wrong?

              This is technically how Ansible works. Here's an extensive list of modules that deploy resources in various public clouds: https://docs.ansible.com/projects/ansible/2.9/modules/list_o...

              That said, it looks like Ansible has deprecated those modules, and that seems fair - I haven't actually heard of anyone deploying infrastructure in a public cloud with Ansible in years. It found its niche in image generation and systems management. Almost all modern tools like Terraform, Pulumi, and even CloudFormation (albeit under the hood) keep a state file.

            • mooreds 2 hours ago

              > The most confusing part of terraform for me is that terraform's view of the infrastructure is a singleton config file that is often stored in that very infrastructure.

              These folks also have an article about that: https://newsletter.masterpoint.io/p/how-to-bootstrap-your-st...

              • bigstrat2003 2 hours ago

                That article is way overkill. One should just manually create the backend storage (S3 bucket or whatever you use). No reason to faff about with the steps in the article.

                • catlifeonmars 2 hours ago

                  This is excellent advice.

                  When you have a hammer… as the expression goes. It's crazy how many times, even knowing this, I have to catch myself and step back. IaC is a contextually different way of thinking, and it's easy to get lost.

              • colechristensen 2 hours ago

                There are three things:

                * Your terraform code

                * The state terraform holds which is what it thinks your infrastructure state is

                * The actual state of your infrastructure

                >Why don't cloud providers have a nice way for tools like TF to query the current state of the infra?

                A Terraform provider is code that queries the targeted resources through whatever APIs they provide. I guess you could argue these APIs could be better, faster, or more tuned toward infrastructure management... but gathering state from whatever resources it manages is one of the core things Terraform does. I'm not sure what you're asking for.

                • fragmede 2 hours ago

                  For the plan file to be updated to the state of the world in a non-confusing way, so that apply does the right thing without a chance it's gonna blow things up.

                  • colechristensen 2 hours ago

                    This is really up to the writer of the provider (very often the service vendor itself) to have the provider code correctly model how the service works. It very often doesn't, and lets you produce an error-free plan that will then fail during apply.

                    It's not an API issue but a Terraform provider issue of missing or incomplete code (e.g. https://github.com/hashicorp/terraform-provider-aws )
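                    The storage-redundancy example upthread is exactly the kind of check a provider could run at plan time if it shipped the service's support matrix. A toy sketch - the matrix below is illustrative, not real Azure data:

```python
# Plan-time validation a provider could do: check a (region,
# redundancy) pair before apply. Illustrative data, not Azure's.
SUPPORTED_REDUNDANCY = {
    "westeurope": {"LRS", "ZRS", "GRS"},
    "norwaywest": {"LRS", "GRS"},  # pretend no zone-redundant storage
}

def validate_storage(region: str, redundancy: str) -> list[str]:
    """Return plan-time errors instead of letting the apply fail."""
    allowed = SUPPORTED_REDUNDANCY.get(region)
    if allowed is None:
        return [f"unknown region {region!r}"]
    if redundancy not in allowed:
        return [f"{redundancy} is not offered in {region}"]
    return []
```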

                • cyberax 2 hours ago

                  > Why don't cloud providers have a nice way for tools like TF to query the current state of the infra?

                  They do! In fact, this is my greatest pet peeve with TF, it adds state when it's not needed.

                  I was doing infra-as-code without TF with AWS long time ago. It went like this:

                    env_tag = f"{project_name}-{env_name}"
                    resp = conn.describe_instances(
                        Filters=[{"Name": "tag:env_tag", "Values": [env_tag]}])
                    if not any(r["Instances"] for r in resp["Reservations"]):
                        conn.run_instances(
                            ImageId=ami_id, MinCount=1, MaxCount=1,
                            TagSpecifications=[{"ResourceType": "instance",
                                "Tags": [{"Key": "env_tag", "Value": env_tag}]}])
                  
                  AWS has tag-on-create now, making this sort of code reliable. Before that, you could do the same with instance idempotency tokens. GCP also has tags.
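                  The idempotency-token variant works the same way with EC2's real ClientToken parameter: retrying with the same token (within the idempotency window) returns the original launch instead of a duplicate. A sketch (the helper name is mine):

```python
# EC2 RunInstances accepts a ClientToken; repeating the call with the
# same token returns the original launch rather than creating a
# second instance.
def launch_once(ec2, env_tag, ami_id):
    # Sketch: env_tag doubles as the idempotency token, so re-running
    # the provisioning script can't double-provision.
    return ec2.run_instances(
        ImageId=ami_id, MinCount=1, MaxCount=1,
        ClientToken=env_tag,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "env_tag", "Value": env_tag}],
        }],
    )
```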
                • jdalsgaard 2 hours ago

                  Most tools, frameworks and articles in IT, SaaS in particular, are about spinning up things. It is what people find exciting.

                  Work a few years in Ops and you learn that spinning up things is not a big part of your work. It's maintenance, such as deleting stuff.

                  Unfortunately this process is the hardest, and there's very little to help you do it right. Many tools, frameworks, and vendors don't even have proper support for it.

                  Some even recommend 'rinse and repeat' instead of adjusting what you have - and that method is not great if you value uptime, or if you have state you want to preserve, such as customer data :-)

                  Deleting stuff, shutting services down, turning off servers - those are hard tasks in IT.

                  • sshine an hour ago

                    I love how terraform can describe what I’ve got. Sort of. Assuming I or my colleagues or my noob customers don’t modify resources on the same account.

                    I don’t love how unreliable providers are, even for creating resources. Clouds like DigitalOcean will 429 throttle me for making too many plans in a row with only 100+ resources. Sometimes the plan goes through, but the apply fails. Sometimes halfway through.

                    I’d rather use a cloud-specific API, unless I’m certain of the quality of the specific terraform provider.

                    • otterley 29 minutes ago

                      "Because referential integrity is a thing, and if you don't have all dependencies either explicitly declared or implicitly determinable in your plan, your cloud provider is going to enforce it for you."
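                      The ordering point can be made concrete: a plan's apply order comes from its dependency graph, and a missing edge means the provider's referential-integrity checks reject the out-of-order call. A toy sketch with Python's stdlib topological sorter (resource names are made up):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Edges: resource -> the resources it depends on.
deps = {
    "aws_instance.web": {"aws_subnet.main"},
    "aws_subnet.main": {"aws_vpc.main"},
    "aws_vpc.main": set(),
}

# A valid apply order always creates dependencies first; drop the
# subnet->vpc edge and nothing stops the subnet being created before
# its VPC exists, which the cloud API would reject.
order = list(TopologicalSorter(deps).static_order())
```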

                      • based2 2 hours ago

                        Because TF lacks sequential state descriptions in rare cases - e.g. termination protection in AWS.

                        • dpkirchner 2 hours ago

                          Hell, let's talk about why ^c'ing the plan phase sucks.