Stelligent has a variety of projects running on Google Cloud Platform, and we want to be diligent about controlling our costs. As a long-time remote-first enterprise, our communication centers around Slack, and we want all of our alerts to be delivered there. We’ve developed a simple way to connect Google Cloud Billing to Slack by running a Google Cloud Function, and we’d like to share how we did it.

Our solution follows the pattern outlined in Google’s guide to managing programmatic budget alert notifications. When your account is over a budget limit, alert events are regularly sent to Cloud Pub/Sub, which sends them to a Cloud Function we’ll create, and that function sends an alert to Slack. You can follow Google’s diagram for that pattern by starting in the top left corner at “Budget alert” then continuing down:

Diagram of budget alert notifications

To keep this guide focused on the parts that are unique to our solution, we rely on Google’s and Slack’s documentation for some of the prep work. Many of those steps require elevated privileges that you or someone else in your organization may have, but that we wouldn’t want to grant to our deployment automation:

  1. Follow Google’s guide to create a budget and then to create a Pub/Sub topic for that budget. In the examples below, we created a topic called “billing-alerts”.
  2. Follow Slack’s guide to create a bot token for your own Slack workspace.

With that done, you can deploy a Google Cloud Function that subscribes to the new topic. The function receives budget alerts as they’re generated, builds messages with useful context, and posts them to the Slack destination of your choice.

If you’d like to follow along with the code samples below, see the public copy of our Google Cloud Function at https://github.com/oshaughnessy/gcp-slack-billing-alerts.

Code Overview

Our Cloud Function’s entry point – the function GCP will call whenever the program is invoked – is called notify_slack. It starts out like this:

def notify_slack(payload, context):
    """Entry point for Cloud Function that receives event data from a Cloud Billing alert.

    Args:
      payload (dict): `attributes` and `data` keys. See
        https://cloud.google.com/billing/docs/how-to/budgets-programmatic-notifications#notification_format
      context (google.cloud.functions.Context): event metadata. See
        https://cloud.google.com/functions/docs/writing/background#function_parameters
    """

As you can see in the code, it receives two arguments, “payload” and “context”. The payload is a dictionary, and the context is an object that carries event metadata. We use the event payload to include useful information in our Slack message and the event context to keep some state. Google’s programmatic API for budget alerts will send notifications repeatedly, so the code keeps track of the latest threshold it has reported on to avoid sending repeat alerts. This just about doubles the size of the code, but we quickly discovered that it was a necessary feature!

Message Details

A fair bit of the “notify_slack” function spends time sorting through the event info to build a Slack message. At the beginning of the function, the event metadata and message details are consumed:

    # payload metadata comes in `attributes`, actual event message comes in `data`
    alert_attrs = payload.get("attributes")
    alert_data = json.loads(base64.b64decode(payload.get("data")).decode("utf-8"))
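
To try the decoding step on its own, here is a minimal, self-contained sketch that builds a sample event and decodes it the same way. The field names follow Google's documented notification format, but the values are invented, and the `sample_payload` and `decode_alert` helpers are hypothetical names used only for this illustration; they are not part of the project.

```python
import base64
import json


def sample_payload():
    """Build a fake Pub/Sub event shaped like a Cloud Billing budget alert."""
    alert = {
        "budgetDisplayName": "demo-budget",  # illustrative values only
        "alertThresholdExceeded": 0.5,
        "costAmount": 52.1,
        "costIntervalStart": "2021-02-01T08:00:00Z",
        "budgetAmount": 100.0,
        "budgetAmountType": "SPECIFIED_AMOUNT",
        "currencyCode": "USD",
    }
    return {
        "attributes": {
            "billingAccountId": "000000-AAAAAA-111111",
            "budgetId": "00000000-0000-0000-0000-000000000000",
            "schemaVersion": "1.0",
        },
        # Pub/Sub delivers the message body as base64-encoded JSON
        "data": base64.b64encode(json.dumps(alert).encode("utf-8")).decode("utf-8"),
    }


def decode_alert(payload):
    """Split an event into (attributes, decoded alert data)."""
    alert_attrs = payload.get("attributes") or {}
    alert_data = json.loads(base64.b64decode(payload["data"]).decode("utf-8"))
    return alert_attrs, alert_data


attrs, data = decode_alert(sample_payload())
print(attrs["budgetId"], data["budgetDisplayName"], data["costAmount"])
```

A payload like this is also handy for exercising the function locally before any real budget threshold has been crossed.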

The metadata is largely used for state-tracking. The message details (“alert_data”) supply most of the information for the Slack post:

    # extract relevant info from the alert data for our Slack message
    budget_name = alert_data.get("budgetDisplayName")
    cost = "${:,.2f}".format(float(alert_data.get("costAmount")))
    budget = "${:,.2f}".format(float(alert_data.get("budgetAmount")))
    currency = alert_data.get("currencyCode")
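
The Slack message shown further down also uses `threshold` and `interval_str`, which this excerpt doesn't define. As a rough sketch, they could be derived from the `alertThresholdExceeded` and `costIntervalStart` fields of the notification. This assumes the threshold arrives as a fraction (0.5 for 50%) and the interval start as an RFC 3339 timestamp; the helper names and the date format are hypothetical, not the project's actual code.

```python
from datetime import datetime


def threshold_percent(alert_data):
    """Convert the fractional threshold (e.g. 0.5) to a whole percentage (50)."""
    return int(round(float(alert_data.get("alertThresholdExceeded", 0)) * 100))


def interval_start(alert_data):
    """Render the start of the budget period as YYYY-MM-DD."""
    raw = alert_data["costIntervalStart"]
    # fromisoformat() on older Python versions does not accept a trailing "Z"
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return parsed.strftime("%Y-%m-%d")


sample = {"alertThresholdExceeded": 0.9, "costIntervalStart": "2021-02-01T08:00:00Z"}
threshold = threshold_percent(sample)
interval_str = interval_start(sample)
print(threshold, interval_str)
```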

All of those pieces are combined into a message for Slack. This would be a great place to customize the message for your own team’s communication style. Make sure the emoji you use (the names surrounded by colons, “:gcp:” and “:money_with_wings:”) are installed in your Slack workspace too.

    # Compose our Slack alert
    # https://api.slack.com/reference/surfaces/formatting#basics
    slack_msg = (
        f":gcp: _{budget_name}_ billing alert :money_with_wings:\n"
        f"*{cost}* is over {threshold}% of budgeted {budget} {currency} "
        f"for period starting {interval_str}"
    )
    if threshold > 100:
        # start on a new line so the emoji isn't glued to the interval text
        slack_msg += "\n:sad: https://media.giphy.com/media/l0HFkA6omUyjVYqw8/giphy.gif"
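
The excerpts here stop short of the step that actually posts the message. One way to do it with only the standard library is to call Slack's `chat.postMessage` Web API method with the bot token. The sketch below is not taken from the project, which may well use a Slack client library instead; the helper names and the fallback channel are assumptions. `SLACK_CHANNEL` is the environment variable set by the deploy step described later.

```python
import json
import os
import urllib.request

SLACK_POST_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(token, channel, text):
    """Prepare (but do not send) a chat.postMessage request."""
    body = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SLACK_POST_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def post_to_slack(token, text, channel=None):
    """Send the message and return Slack's decoded JSON reply."""
    # "#billing-alerts" is a hypothetical fallback, not a project default
    channel = channel or os.environ.get("SLACK_CHANNEL", "#billing-alerts")
    request = build_slack_request(token, channel, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        reply = json.load(response)
    if not reply.get("ok"):
        # Slack reports most failures in the response body, even with HTTP 200
        raise RuntimeError(f"Slack API error: {reply.get('error')}")
    return reply
```

Keeping request construction separate from sending makes it possible to inspect the outgoing message in a test without touching the network.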

Avoiding Repeat Alerts

The function keeps track of the most recent alert threshold by storing a Python object in Google Secret Manager. Near the top of “notify_slack” you can see where a secret is created:

    # we (re)store state info in a Google Cloud Secret because it's already
    # being used for our Slack token
    billing_id = alert_attrs.get("billingAccountId")
    budget_id = alert_attrs.get("budgetId")
    secret = MySecret(
        project_id,
        context={
            "billing_id": billing_id,
            "budget_id": budget_id,
            "topic_id": topic_id,
        },
        secret_client=SECRET_CLIENT,
    )
    alert_state = restore_state(secret)

The file mysecret.py defines a class that encapsulates all of the hoops one must jump through to use Secret Manager for this purpose. The class definition starts like this:

# See: https://googleapis.dev/python/secretmanager/latest/gapic/v1/api.html
class MySecret:
    """Manage a secret string in Google Cloud Secret Manager

    Attributes:
        client: google.cloud.secretmanager client
        secret: Secret Manager object
        name: full key name of Secret Manager object, including the project path
        project_id: Google Cloud Platform project in which the secret is stored
        parent: first part of path to Secret Manager object, before the relative name
        relative_name: short name of Secret Manager object, after "projects/_ID_/secrets/"
    """

An example of using that class is illustrated above, where an object is created to track state at “secret = MySecret(…)” in “notify_slack”. Because all of the Secret Manager logic is separated out into its own class, and because the Secret Manager API accepts binary-encoded data, saving and restoring state info is as simple as assigning or fetching a Python dictionary object to or from the MySecret object:

def restore_state(secret):
    """Restore our alert state from a Secret.

    Args:
        secret (MySecret object): has data with secret info and manages access
                                  to Google Cloud Secret Manager

    Returns:
        dict with state info
    """

    logging.debug("restoring state from secret")
    if secret.data:
        state = secret.data
    else:
        state = dict()
    return state


def save_state(secret, state):
    """Save our alert state in a Secret so we can pull it again the next time we run."""

    secret.data = state
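
The assign-and-fetch pattern can be tried without Secret Manager at all. Below is a hypothetical in-memory stand-in, `FakeSecret`, whose `data` property serializes a dictionary to bytes on write and decodes it on read. It assumes JSON as the encoding; the real MySecret class may encode its data differently, and it talks to the Secret Manager API rather than a local attribute.

```python
import json


class FakeSecret:
    """In-memory stand-in for MySecret, for local experiments only."""

    def __init__(self):
        self._blob = b""  # what a real secret version would hold

    @property
    def data(self):
        """Decode the stored bytes back into a dict (None if nothing is stored)."""
        return json.loads(self._blob.decode("utf-8")) if self._blob else None

    @data.setter
    def data(self, value):
        """Encode a dict as bytes, the way a binary secret payload would be."""
        self._blob = json.dumps(value).encode("utf-8")


secret = FakeSecret()
state = secret.data or {}   # same outcome as restore_state(secret)
state["threshold"] = 50
secret.data = state         # same outcome as save_state(secret, state)
print(secret.data)
```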

To recap, we want to avoid sending the same budget alert to Slack more than once. Every time the function is invoked, it pulls information about the last invocation from Secret Manager and then compares that with the current event to see if the same alert threshold is being reported again. If the alert is new, a Slack message is constructed and sent out.
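
The comparison described in that recap could look something like the sketch below. The state layout (`threshold` and `period` keys) and the function names are assumptions made for illustration; the project's own logic may track different details. The period check is included so that a new budget period starts with a clean slate.

```python
def is_new_alert(state, threshold, period):
    """Return True when this threshold has not yet been reported for this period."""
    if state.get("period") != period:
        return True  # new budget period: start over
    return threshold > state.get("threshold", 0)


def record_alert(state, threshold, period):
    """Return an updated copy of the state after an alert has been sent."""
    return {**state, "threshold": threshold, "period": period}


state = {}
# Google repeats notifications, so the same threshold shows up more than once
for pct in (50, 50, 90, 90, 100):
    if is_new_alert(state, pct, "2021-02-01"):
        state = record_alert(state, pct, "2021-02-01")
        print(f"alert at {pct}%")
```

In the real function, the updated state would be written back through the secret object so the next invocation can see it.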

Deploying a Cloud Function

 

Cloud Functions in the Google console

Once you have all of your prerequisites ready, you can deploy your function to GCP as a Cloud Function. At this point, you should have a handful of tasks complete:

  1. You’ve created a budget in your Google Cloud project.
  2. You’ve created a Pub/Sub topic that will receive budget alerts.
  3. You’ve reviewed or customized the message that the Cloud Function will send out.
  4. You’ve configured your GitHub repo with some relevant secrets or set them in a local “Makefile.dev.env” file if you’re not going to use GitHub Actions, as described in our project’s installation steps. That Makefile would also be a great place to define SLACK_CHANNEL while you’re working out the deployment. Consider sending messages to your own @-user instead of a channel with other users.

Using GitHub Actions for a Deployment Pipeline

At Stelligent, we’ve grown very fond of GitHub Actions for deployment pipelines, so we’re using one to deploy our function to GCP as well. If you look at the top of the workflow config, you can see that the pipeline runs every time a change to one of the code or build files is pushed to GitHub:

on:
  push:
    paths:
      - 'Makefile*'
      - 'Pip*'
      - 'requirements.txt'
      - '*.py'

Every run creates a new event in the GitHub Actions console:

Example run from GitHub Actions

When it runs, it reads some important configuration and credential information from GitHub Secrets set up in the repository:

env:
  CLOUDSDK_CORE_DISABLE_PROMPTS: 1
  CLOUDSDK_CORE_PROJECT: ${{ secrets.CLOUDSDK_CORE_PROJECT }}
  CLOUDSDK_COMPUTE_REGION: ${{ secrets.CLOUDSDK_COMPUTE_REGION }}
  CLOUDSDK_COMPUTE_ZONE: ${{ secrets.CLOUDSDK_COMPUTE_ZONE }}
  CLOUDSDK_RUN_REGION: ${{ secrets.CLOUDSDK_COMPUTE_REGION }}
  CLOUDSDK_SERVICE_ACCOUNT: ${{ secrets.CLOUDSDK_SERVICE_ACCOUNT }}
  GOOGLE_APPLICATION_CREDENTIALS: gcp-slack-notifier.key

Details about creating those secrets are included in the project’s documentation.

Deployment Steps

The workflow then checks the code, prepares the container for deployment, authenticates to the Google Cloud API, and deploys the function:

      - id: gcloud-prep
        name: Grab Google Cloud service account key
        run: echo '${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_JSON }}' >${{ env.GOOGLE_APPLICATION_CREDENTIALS }}
      ...
      - id: gcloud-auth
        name: Authenticate to Google Cloud Platform
        run: make gcloud-auth

      - id: deploy-function
        name: Deploy the budget alerts code as a Google Cloud Function
        run: make deploy

The actual actions taken in those steps, “make gcloud-auth” and “make deploy”, can be found in the repository’s Makefile:

gcloud-auth:
	gcloud auth activate-service-account "$(CLOUDSDK_SERVICE_ACCOUNT)" --key-file "$(GOOGLE_APPLICATION_CREDENTIALS)"
...

deploy:
	gcloud functions deploy $(CLOUD_FUNCTION) --trigger-topic=$(PUBSUB_TOPIC) --set-env-vars=SLACK_CHANNEL=$(SLACK_CHANNEL) --runtime=$(PYTHON_RUNTIME_VER) --entry-point=notify_slack
	@echo "Cloud Function $(CLOUD_FUNCTION) deployed"

The auth step ensures that the code running at GitHub can talk to the relevant GCP account. It makes use of the service account key that was created prior to deploying the function. The contents of that key are stored in a GitHub secret that is extracted at runtime and written to a file within the workflow’s running container.

The deploy step uploads the Python function to GCP, associates the function with the Pub/Sub topic that was created manually as a prerequisite to this solution, tells the function to use Python 3.7 and run the “notify_slack” function every time it’s triggered, and configures it with the Slack channel where alerts should go.
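
To make the role of each flag explicit, the same deploy command can be assembled in Python. This is only a sketch for readers who would rather script the deployment than use make: the flags mirror the Makefile target above, while `deploy_command`, the example function name, and the example channel are hypothetical. The runtime value `python37` is an assumption standing in for whatever `PYTHON_RUNTIME_VER` is set to.

```python
import shlex


def deploy_command(function, topic, slack_channel, runtime="python37"):
    """Build the argument list that the `make deploy` target passes to gcloud."""
    return [
        "gcloud", "functions", "deploy", function,
        f"--trigger-topic={topic}",                       # Pub/Sub topic that invokes the function
        f"--set-env-vars=SLACK_CHANNEL={slack_channel}",  # where alerts are posted
        f"--runtime={runtime}",                           # Python version to run under
        "--entry-point=notify_slack",                     # function called on each event
    ]


cmd = deploy_command("billing-alerts-to-slack", "billing-alerts", "#billing-alerts")
# Print the command for review; it could then be run with subprocess.run(cmd, check=True)
print(" ".join(shlex.quote(arg) for arg in cmd))
```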

Example of "make deploy" executing in GitHub Actions

When all of these steps are complete, the Cloud Function is ready to run, and Slack will get one (and only one!) message every time a new budget threshold is crossed.

Wrapping Up

Surfacing budget alerts in Slack has been a great way to keep our Google teams cost-sensitive. It’s given us just the right amount of communication about what we’re spending in GCP and reminds our developers to be thoughtful about the resources they leave running in our demo and dev environments. This project has solved a real business need for us, and we’d love to hear from you if you find it useful too. Please reach out and let us know how it works out, or use the project’s GitHub repository if you have questions, feature requests, or problems to report.

Thank you for reading!
