Automating SaltStack Tasks with Webhooks

SaltStack is a fantastic tool for provisioning and configuring machines across all sorts of infrastructure setups. It's been a staple for me for a long time when it comes to configuration management. One struggle I've seen people have, though, is how to use it in a more event-driven system. This is one place Salt really shines, but it needs a little help from an outside tool.

Before I begin, I just want to take a minute to note that SaltStack itself has built-in support for ingesting webhooks, but I will say it's one area that is somewhat lacking in my opinion. The only real options you have are enabling the webhooks engine, which is unauthenticated, or using the Salt NetAPI to ingest them. Both of these then route the event through the Salt event reactor, which can be, well, quirky.

It can also make it awkward to ingest the weird data schemas some services send and massage them into the shape that you need.

Thankfully there is a solution – Webhook. This is a very lightweight Go webhook server for ingesting, parsing, and executing commands. It comes packaged in Debian, Ubuntu, and Arch, so installing it is as simple as:

apt-get install -y webhook

Once installed, you will need to configure a webhook. For this example I'll configure a very simple hook which will update the local fileserver.

To start we will need to write the configuration. On Debian-based systems the service is set up to load the config file located at /etc/webhook.conf. It accepts JSON or YAML configuration; I usually use YAML as I find it easier to read.

- id: update-fileserver
  execute-command: "/etc/webhook/scripts/"
  command-working-directory: "/etc/webhook/scripts"
  trigger-rule:
    and:
      - match:
          type: value
          value: some-secret-value-here
          parameter:
            source: url
            name: token
      - match:
          type: value
          value: refs/heads/master
          parameter:
            source: payload
            name: ref

To break down the important parts here:

  • id: This is the name of the webhook and determines the URL required to activate it
  • execute-command: This is telling the webhook what command to execute
  • trigger-rule: This tells webhook what it needs to match to activate this hook. This is where you set filters and, most importantly, a secret.
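The payload rule above boils down to an equality check on the push event's ref field. If you ever need to do the matching yourself, a rough sketch of that logic in plain Python might look like this (the payload shape assumes a GitHub-style push event):

```python
import json


def should_trigger(payload: str, branch_ref: str = "refs/heads/master") -> bool:
    """Mirror the trigger-rule: fire only when the pushed ref matches."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        # Malformed payloads should never trigger the hook.
        return False
    return data.get("ref") == branch_ref


# A push to master triggers; a feature branch does not.
print(should_trigger('{"ref": "refs/heads/master"}'))   # True
print(should_trigger('{"ref": "refs/heads/feature"}'))  # False
```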

Once this is configured you can start up the webhook service with

systemctl enable --now webhook

Now we need to configure where the magic happens. Most examples I see tend to use the Salt reactor or bash scripts to execute Salt commands, but why limit yourself? Salt provides a very convenient Python client API, so you can combine the power of Python and Salt to do your dark bidding.

First things first, we'll create the location to store all our excellent scripts:

mkdir -p /etc/webhook/scripts/

And now we'll create our first script. This one is fairly simple; it just uses the RunnerClient to update the Salt fileserver:

import salt.config
import salt.runner

# Load the master config and run the fileserver.update runner,
# the same as running `salt-run fileserver.update` on the CLI.
opts = salt.config.master_config("/etc/salt/master")
runner = salt.runner.RunnerClient(opts)
runner.cmd("fileserver.update", [])

print("Fileserver updated")

Mark the script file as executable:

chmod +x /etc/webhook/scripts/

Then all you need to do is configure your Git host to submit a webhook on push events to your webhook URL.

Now simply make a push to your Git host's Salt repository and see it in action.
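For reference, webhook exposes each hook at /hooks/&lt;id&gt;, with the secret passed as a query parameter to satisfy the url match in the trigger-rule. A small sketch of assembling that URL (the hostname salt.example.com is a made-up placeholder; 9000 is webhook's default port):

```python
from urllib.parse import urlencode


def hook_url(host: str, hook_id: str, token: str, port: int = 9000) -> str:
    """Build the trigger URL in webhook's http://host:port/hooks/<id> shape."""
    query = urlencode({"token": token})
    return f"http://{host}:{port}/hooks/{hook_id}?{query}"


print(hook_url("salt.example.com", "update-fileserver", "some-secret-value-here"))
# http://salt.example.com:9000/hooks/update-fileserver?token=some-secret-value-here
```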

In my next post I’ll cover some basics of using Datadog to do event driven infrastructure tasks.

Self Hosting and Natural Disasters

I'm a huge proponent of self hosting and data ownership. I host my website, my cloud file syncing, my email – everything. All of it runs on some servers in a server rack in my house. This lets me control and own my data without relying on third parties. Sure, it's more work for me and I have to spend time here and there maintaining it, making sure backups are working and everything is up to date, but I kind of like doing it. Plus, given I use Salt, it's mostly an entirely automated process anyway.

There is one big problem I do have with self hosting though: I live in an area that is prone to very destructive hurricanes. This makes self hosting things at home very problematic when it's entirely possible that I could lose my email and important documents during a time when I probably need both the most. It also means that if my internet goes down, no email – and while sure, I won't really lose any email, not receiving any is also a problem.

So ultimately I had put together a "Hurricane Preparedness" plan for my data, which would involve shifting my Nextcloud and email services to a VPS somewhere while I went out and battened down the hatches in preparation for the oncoming weather onslaught. Luckily I had never actually had to enact the plan; we've had a few fortunate years avoiding hurricanes in general. But I knew at some point I should run through the process to test it (really, you should also test your backups).

Here is what I learned from the process:

  1. I had no automation to deploy Nextcloud and it took a long time to re-create exactly what I needed to deploy it. I'm glad I didn't have to do it in a rush.
  2. I didn’t even consider having to migrate Bitwarden up to the cloud but losing that would be a tremendous issue.
  3. Email is insanely critical and losing access to it can be absolutely crippling in an emergency. That said, the mailcow backup/restore process worked great and it was very easy to move.
  4. Raw block storage is really fuckin expensive

Ultimately what I realized during this was that my plan was just about a failure, and really the last thing I want to do while getting ready for a hurricane is worry about making sure my email is working (fucking DNS) and that I'm not losing my password manager.

So I've decided that I'm going to shift my email and password manager to permanently live on a VPS – this saves me from even having to worry about moving them and also stops me from losing these services if I lose power or internet at home. I consider these two things mission critical, and losing access to them for even an hour is an extreme problem.

I'm still hosting my Nextcloud locally because it is very expensive to run in the cloud with raw block storage. Also, given my local setup with a RAID disk array and replicated ZFS snapshots, it does feel very safe. And I have spent time on my local-to-cloud migration to the point where it's entirely automated now: one single Salt formula runs the entire process, and in a few hours it will be up and running in the cloud. It's not perfect and not as nice as not having to worry about it at all, but given the cost, I think this one component is a fair tradeoff.

I'm not sure people always consider that a natural disaster could take out their home-hosted data. And even when I did think I had a plan, ultimately it wasn't a very good one once I put it into practice.

The Crossroads of Life

Note: This is a rambling article and probably not worth a read. I was going through a point in my life and just needed to write how I was feeling.

There's no denying that 2020 was a rough year for most people, including me. There were a lot of things in the year that I neglected or ignored just because it was easy, while also dealing with the effects of the pandemic. Both my physical and mental health were largely ignored because it was easier to mask it with substances or distractions. It's easier to mask it than to deal with it, and sometimes that's okay. Sometimes that is what you just need to do to temporarily get through to the next day.

The problem with temporary solutions, though, is that they have a habit of becoming permanent ones. The longer it goes on, the easier it is to just self-medicate or distract than it is to address what is really the problem. The actual solutions drift further and further toward being too difficult to address, perpetuating the continuous negative feedback loop of avoidance.

The interesting thing though is, as you traverse down this path, life has a tendency to bring you to crossroads. They can be hard to identify sometimes, subtle little signs here and there that are so easy to miss. Sometimes they are obvious, but one path is rocky and overgrown and the other is smooth and inviting. Then sometimes you are just trucking along so hard that you blow right through it.

It's easy to choose the easy path – it wouldn't be called the easy path if it wasn't. Sometimes though, you really need to stop and look at both paths that life is presenting to you. You really need to sit and mull over the choices in front of you. Sure, the easy path looks good now, but maybe that's just because you can't see far enough down the road to understand where it's going to lead – and it's very likely not a place you want to be.

I feel that’s where I am at after last year, sitting at a crossroads looking down the easy path – the comfortable one I’ve continuously followed. The one that I feel I know so well. But it’s different this time. In the distance of it I see the sky darkening. The comfort of it weakening and a subtle tinge of apprehension in my heart.

It's at this point, at this crossroads, I've decided it's time to go down the other path. The one filled with rocks and trees, the one I have been avoiding out of fear. I know it will be difficult and treacherous at times, but most things worthwhile in life are.

So don't be afraid, when life gives you a crossroads, to sit for a second and look at both directions. Maybe right now the easy path is just the one you need to take – that's okay. But be on guard for when the allure of the easy route is just hiding the difficulties further down the road.

New Year, New Goals

I've never been a big "New Year's Resolution" person, but I do like to set new goals every year. Mostly these goals tend to be about learning new things and expanding my experience, rather than the usual "Get Healthy" (shouldn't that always be a goal?). I like to end the year by reviewing what my goals were and reflecting on how I did – so without further ado, let's do that.

2020 Goals and Progress

In 2020 I really had two main goals:

  • Learn to sew
  • Learn networking fundamentals

So how did I do? Well, to be honest, I picked up some basic sewing projects but never even started them. Honestly I didn't work towards that goal whatsoever – which seems odd, because pandemic time should have been a good time to work on it, but unfortunately I got busy and it just got pushed down the priority list. I really still want to try this one though.

Learning networking fundamentals is something I have wanted to do for years, and I did make some solid progress on it. Before I started I could maybe vaguely explain to you what a VLAN or subnet was, and now I know how to use them and what CIDR notation means! I would generally consider this goal a success because I feel so much more confident in this area.

So overall I was 50% successful in achieving my goals in 2020, and honestly I'm happy with that.

2021 Goals

And now with 2021 right around the corner (and soon my application to join the Sealab 2021 crew going out) I am setting some new goals.

  • Learn 3D modeling
  • Finish at least one of the house remodeling tasks
  • Bonus: Learn to sew

I've been looking at getting a 3D printer and it's kind of renewed my interest in 3D modeling. I've never really been a very artistic or creative person, but I've always enjoyed the art of 3D modeling and never had the time to really get into it. I'm hoping that this year I can find the time to start picking up the basics so I can build things with a 3D printer as well as learn some character modeling for game modding.

I also have three big house renovation tasks that need to be done and one that has been started but kind of stalled out. I want to try to set a goal for myself to finish at least one of these because it would make our living experience much nicer and it’s become even more important with the pandemic quarantines happening.

The last one is a bonus but I still really want to learn to sew and design clothes, and so I’m going to tack that one back on to this year as a bonus if I can get to it. If I can’t I won’t be upset but maybe I can push myself a bit this year!

Slow Down During Failures

A few weeks ago one of my primary VM host nodes experienced a disk failure on the disk the Hypervisor was installed on. This normally wouldn’t really be that big of a deal, as I keep a second server as a cold spare but I ran into a few problems.

I keep my main two compute nodes in a Proxmox cluster. It essentially works by using a tool called corosync to sync the configuration data of VMs and containers to all of the nodes. Each node then has a copy of all this information so that, in the event of a node failure, you can just move that configuration onto the new node. This process is somewhat simple but has worked great.

The only real caveat of doing this is that the cold spare is obviously off most of the time, so it does require booting up to sync every now and then. I don't really consider this a big problem, as it should be booted up and patched every so often anyway.

Unfortunately, things have been busy and I had generally neglected taking care of that cold spare, so when the disk failed on my hot node, I knew I was in trouble. I was not concerned exactly about losing any data – all of that is stored on a mirrored ZFS array. But I knew I had lost all the metadata about the VMs (think the config file specifying the details about the VM's memory, disk, CPU, networking, etc.).

Mistake Number One: Keep your backups and maintain your recovery strategies.

At this point I shut down the dead hot server, moved all the data disks to the spare, and booted it up to see where I was. Looking at the corosync data, it was very obvious that it had easily been two months since I last booted up this machine, as the VM config files were grossly out of date and many of the changes were missing.

I did luck out in that some of the more consistent stuff was still there (like my mail server which is super important and really hasn’t changed in a long time) – so I was able to move those over and get them back online fairly quickly.

The two biggest issues were that I had migrated from a Kubernetes setup to a basic VM/Compose setup and had shut down my GitLab and some other associated instances. So I knew I had to re-create the Docker VM config file. I did have a template I could use (really just copying another VM's config and adjusting it), so I set about that.

While doing this, I realized I had a bunch of VM disk datasets that were no longer applicable for anything and thought it was a good time to clean those up. Unfortunately for me, my Docker VM was brand new and I wasn't used to seeing it, and since I was already stressed and in a hurry, I didn't realize that I was cleaning that VM up as well.

Mistake Number Two: Don’t do cleanup during a failure recovery phase.

Thankfully I had at least spent some time the previous week setting up a deployment pipeline to push out changes to this VM, so once I realized my mistake, I was able to spin up a new VM and at least get the configuration replaced (mainly Traefik, which routes EVERYTHING). But I had lost my website and accidentally wiped out my Bitwarden database.

Mistake Number Three: Not having backups.

All in all it was messy, but my overall recovery strategy proved that it works as long as you maintain it, and I discovered a few problems I needed to address before they became a real issue (like not having backups :facepalm:). Losing my website wasn't the end of the world; it could have been a lot worse.