
Serverless Web Applications

This page documents how we use the Cloud Run service to deploy web applications in a serverless manner.

How can there be no server?

Serverless is a term of art which refers to a method of executing code in the Cloud without having to directly manage the servers which the code runs on. Clearly, there are servers but their management is delegated to the Cloud provider who specifies a common interface which code running on the platform should support.

Even before the term was coined, "serverless" computing had been around since at least the 1980s. Nowadays, rather than having a broad, complex set of supported libraries and runtimes, we define a thin contract between the hosting platform and a container which packages the application code.

Where possible we architect our applications to follow the serverless computing contract. This ensures that we can easily port our applications between hosting platforms which support the Knative specification. The serverless computing contract is also a good target to aim for even when deploying applications in Kubernetes clusters as it provides a clean, orthogonal and well-specified interface between the application and the hosting environment.

Cloud Run

Cloud Run is a managed Knative service with some extensions which make things convenient for our needs. Aside from simply hosting a containerised application on the web with auto-scaling, Cloud Run also supports:

  • Automatic exposure of a Cloud SQL instance to the container.
  • Associating the workload with a Cloud IAM identity allowing the use of default application credentials within the cluster.
  • Wrapping the service in a Cloud Load Balancer allowing for custom TLS certificates, HTTP to HTTPS redirect and content caching.

Our boilerplate

We have some example code for deploying a web application within our boilerplate (Developer Hub users only). This makes use of our standard Cloud Run application terraform module.

Upcoming changes

Our standard module configures everything about the Cloud Run application apart from the Docker image URL specifying the version of the application to deploy. Historically we have separated deployment of an application (creating the Knative revision) from deployment of infrastructure (creating the Knative service). As such the actual application deployment often happens in GitLab CI jobs rather than in terraform configuration.

As we move to a more GitOps model, this will most likely change. In a GitOps model we specify the exact version of the application image to deploy, the terraform is run by GitLab CI and "releasing" involves merging a change to master which changes the container image URL.

Our boilerplate splits web-application configuration into two parts: the service configuration and the application configuration. The load balancer configuration was removed in a recent revision of the boilerplate; those resources are now managed by the Google Cloud Run terraform module.

Our boilerplate configuration creates a dedicated service account identity for the application. This service account will be used by Google API libraries which use application-default credentials.

This service account is granted the following permissions:

  • connecting to the SQL instance,
  • reading the sensitive settings secret, and
  • reading the non-sensitive settings storage object.

DNS records are created for the application within the project's DNS zone. TLS certificates are provisioned automatically but the domain must first have been verified.

TLS certificates will also be provisioned for any domain specified in local.webapp_custom_dns_name but records will not be created. Usually this local is used to host the "friendly" .cam.ac.uk domain for the service and records under .cam.ac.uk must be created by other means. Similarly the project admin terraform service account must be verified as an owner of the domain with Google.

Service configuration

The Cloud Run service itself is configured in webapp.tf. This file configures:

  • a Google Secret Manager secret to hold sensitive configuration,
  • a Google Cloud Storage object to hold non-sensitive configuration,
  • a database user and password within the SQL database instance,
  • the Cloud Run service itself, and
  • a DNS record for the application if it is not behind a load balancer.

There is some terraform magic in the file to only set a custom DNS name if the application is not behind a load balancer. If the application is behind a load balancer, the Cloud Run service is configured to be "internal and load balancer only" and not accessible from the public Internet.

We start with default values for max_scale and min_scale suitable for a lightly-used web application. In particular, min_scale is initially zero, which allows the web application to use no web hosting resources when it is not being used. Generally we would increase max_scale as the application gets more use and would increase min_scale if we are seeing latency spikes due to application startup delays.

Important

Even if the web application uses database connection pooling there is a minimum of one connection per server process. As such one needs to make sure that max_scale multiplied by the number of server processes in the container is less than the maximum connection count of the SQL instance.

For our webapp boilerplate, there are usually four server processes per container instance.
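The connection budget above is simple arithmetic, and can be sketched as a small helper. This is only an illustration: the function name, the `reserved` parameter (connections held back for migrations, admin sessions and so on) and the example connection limit of 100 are assumptions, not values taken from our boilerplate.

```python
def max_safe_scale(db_max_connections: int,
                   processes_per_instance: int = 4,
                   reserved: int = 0) -> int:
    """Return the largest max_scale for which
    max_scale * processes_per_instance is strictly less than the
    number of connections available to the application.

    Assumes the worst case of one connection per server process.
    """
    if processes_per_instance <= 0:
        raise ValueError("processes_per_instance must be positive")
    available = db_max_connections - reserved
    if available <= 0:
        raise ValueError("no connections left for the application")
    # Strict inequality: scale * processes < available
    return (available - 1) // processes_per_instance


# Hypothetical instance allowing 100 connections, four processes per container.
print(max_safe_scale(100))               # 24, since 24 * 4 = 96 < 100
print(max_safe_scale(100, reserved=10))  # 22, since 22 * 4 = 88 < 90
```

If the value this returns is lower than the max_scale the application needs, the options are to raise the connection limit of the SQL instance or to reduce the number of server processes per container.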

Application configuration

To avoid secrets appearing in the environment, we have re-architected our applications to load some configuration at runtime. For our Django projects, we make use of the externalsettings Python module and code in our settings modules which load settings from YAML-formatted documents. These documents are located at a set of comma-separated URLs passed in the EXTRA_SETTINGS_URLS environment variable. The URLs can use any scheme supported by our geddit library.

Our boilerplate passes two URLs: a gs://... URL pointing to non-sensitive settings stored in a Cloud Storage object and a sm://... URL pointing to sensitive settings stored in a Secret Manager secret.
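The general shape of this loading step can be sketched as follows. This is not the code in externalsettings: the function names are hypothetical, the `fetch_and_parse` callable stands in for fetching a URL with geddit and parsing the YAML, the example URLs are placeholders, and the rule that later documents override earlier ones is an assumption made for the illustration.

```python
import os


def settings_urls(environ=None):
    """Split EXTRA_SETTINGS_URLS into a list of URLs.

    Surrounding whitespace and empty entries are ignored.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("EXTRA_SETTINGS_URLS", "")
    return [url.strip() for url in raw.split(",") if url.strip()]


def load_settings(urls, fetch_and_parse):
    """Merge the settings documents found at each URL, in order.

    fetch_and_parse(url) must return a dict. Later documents override
    earlier ones (an assumption of this sketch).
    """
    merged = {}
    for url in urls:
        merged.update(fetch_and_parse(url))
    return merged


# Usage with in-memory stand-ins for the two documents.
fake_documents = {
    "gs://example-bucket/settings.yaml": {"DEBUG": False, "PAGE_SIZE": 20},
    "sm://example-project/example-secret": {"SECRET_KEY": "not-a-real-key"},
}
env = {"EXTRA_SETTINGS_URLS": ",".join(fake_documents)}
settings = load_settings(settings_urls(env), fake_documents.__getitem__)
print(sorted(settings))  # ['DEBUG', 'PAGE_SIZE', 'SECRET_KEY']
```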

Note that the Cloud Storage object is non-public; it can only be read by the web application's service account. Despite this, it is not suitable for storing sensitive values since they will be visible to anyone browsing the bucket in the Google Cloud console.

When would we ever use the Cloud Storage object?

Secret Manager secrets can support a maximum of 64KiB of content. For most of our applications the sensitive and non-sensitive configuration fits well within this limit and we put all settings within the secret for convenience. The Cloud Storage object is there to provide an "overflow" for non-sensitive values if we breach the 64KiB limit.
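A quick way to see how close a settings document is to the limit is to measure its encoded size. The helper below is a minimal sketch: the function name is hypothetical, and treating the 64KiB limit as inclusive and the document as UTF-8 encoded are assumptions.

```python
SECRET_MAX_BYTES = 64 * 1024  # 64KiB


def fits_in_secret(document: str) -> bool:
    """Return True if the UTF-8 encoded document is within 64KiB.

    The size is counted in bytes, not characters, so non-ASCII text
    uses up the budget faster than its length suggests.
    """
    return len(document.encode("utf-8")) <= SECRET_MAX_BYTES


print(fits_in_secret("DEBUG: false\n"))  # True
print(fits_in_secret("x" * 70_000))      # False
```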

The settings themselves are encoded in a YAML document specified in webapp_settings.tf. We use terraform's yamlencode function to let us interpolate values without worrying about character escaping problems. Common secrets such as database credentials and Django secret keys are managed entirely by terraform using the random_password resource.

Future changes

We have an open issue to make use of functionality which has been added to Cloud Run which supports loading secrets automatically. This may cause our configuration method to change in the future.

Third-party applications

When deploying third-party applications it is usually non-trivial to modify them to load configuration from Secret Manager secrets. In this case we make use of a tool called berglas. This tool wraps the third-party applications and detects environment variables which contain sm://... formatted URLs. These URLs are fetched and then, depending on their format, the content is used to either replace the environment variable or is written to a file on disk.
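The wrapping behaviour described above can be illustrated with a simplified sketch. This is not berglas itself and should not be read as a description of its implementation: the function name, the injected `fetch_secret` and `write_file` callables, the example secret names and the use of a `destination` query parameter to select "write to a file" are all assumptions made for the illustration. Consult the berglas documentation for the reference syntax it actually accepts.

```python
from urllib.parse import parse_qs


def resolve_environment(environ, fetch_secret, write_file):
    """Return a copy of environ with sm:// references resolved.

    - Values not starting with "sm://" are passed through unchanged.
    - "sm://..." values are replaced by the secret's content.
    - "sm://...?destination=PATH" values cause the content to be
      written to PATH, and the variable is set to PATH instead.
    """
    resolved = {}
    for name, value in environ.items():
        if not value.startswith("sm://"):
            resolved[name] = value
            continue
        reference, _, query = value.partition("?")
        content = fetch_secret(reference)
        destination = parse_qs(query).get("destination", [None])[0]
        if destination is None:
            resolved[name] = content
        else:
            write_file(destination, content)
            resolved[name] = destination
    return resolved


# Usage with in-memory stand-ins for Secret Manager and the filesystem.
secrets = {
    "sm://example-project/db-password": "not-a-real-password",
    "sm://example-project/tls-key": "not-a-real-key",
}
files = {}
env = {
    "PATH": "/usr/bin",
    "DB_PASSWORD": "sm://example-project/db-password",
    "TLS_KEY_FILE": "sm://example-project/tls-key?destination=/tmp/tls.key",
}
print(resolve_environment(env, secrets.__getitem__, files.__setitem__))
```

The wrapped application then starts with the resolved environment and never needs to know that Secret Manager was involved.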

The UIS Technical Design Authority have an example on their site of how to use berglas with a third-party application.

Cloud Load Balancer

In our boilerplate, if local.webapp_use_cloud_load_balancer is true, the application will be hosted behind a Cloud Load Balancer.

Using a Cloud Load Balancer has the following advantages:

  • We can have a static ingress IP which is occasionally useful if we need to have long-lived DNS records or if it is non-trivial to have dynamic records. (For example, the IP register database only refreshes the live configuration once per hour.)
  • We can make use of Cloud Armor [sic] rules to provide dynamic protection for the application.
  • Using Cloud CDN allows us to cache application static assets in Google's content delivery network.
  • We can bring our own TLS certificates if we cannot make use of Google's auto-provisioning or there is a requirement to support EV/OV certificates.

Future work

While we don't make use of the feature yet, Cloud Load Balancer allows us to weight incoming traffic and direct it to multiple backends. This helps when moving load smoothly between services under a blue-green deployment strategy.

Cloud Load Balancer is configured in webapp_load_balancer.tf and makes use of Google's terraform module. The configuration is pretty much a carbon-copy of the example in the upstream module. We create a DNS record for the application if Load Balancing is enabled.

Multiple web-applications

Our boilerplate assumes there is a single web application named "webapp". For some products this will be fine. For others we will need multiple applications. Currently we support multiple applications by copying and renaming the various webapp*.tf files and duplicating the local.webapp_... settings.

Example

An example of this can be seen in the identity platform infrastructure (DevOps only) where two applications are configured: card and photo.

For the moment products with multiple web-applications are rare and the overhead associated with manual copy-and-paste is manageable. In future we'd like to provide a cleaner solution for this, possibly by means of a custom terraform module.

Summary

In summary,

  • We use Cloud Run to host our web applications where possible.
  • Our standard boilerplate contains example terraform configuration to:
    • create the Cloud Run service,
    • place application configuration in a Secret Manager secret,
    • connect the application to a SQL database,
    • place it behind a Cloud Load Balancer, and
    • provision TLS certificates.
  • We use a serverless platform to allow for "scale to zero" workloads where we can tune the number of active instances, and thus the cost, automatically with demand.
  • Third-party applications which cannot load their configuration directly from a Secret Manager secret are wrapped with the berglas tool.
  • Creating multiple web-applications within a single product is currently a process of copying and pasting configuration.