Cache Using Cloudflare Workers’ Cache API

Caching is used almost everywhere, in different topologies: caching at the application node, geographical caching, and in some organizations a dedicated cluster of nodes used only for caching.

In this blog, we will discuss setting up geographical caching using Cloudflare’s content delivery network and Cloudflare Workers.

Why caching?

A cache is a component that stores portions of a data set that would otherwise either take a long time to calculate or process, or that originate from another underlying backend system, in which case caching avoids repeated round trips for frequently used data. In both cases, caching can improve performance and reduce application latency.

Geographical Caching

Geographical caches are placed in strategically chosen locations to optimize request latency, so this kind of cache is mostly used for website content. It is also known as a CDN (Content Delivery Network).
Typically, a geographical cache transparently falls back to a central cache that acts as the main source for content, and caches the retrieved data locally. This works well for static content or content that changes infrequently.
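
The read-through behaviour described above can be sketched in a few lines of plain JavaScript. This is only a toy model of a single edge location, not how Cloudflare implements its CDN: EdgeCache, fetchFromOrigin and ttlMs are hypothetical names introduced for illustration.

```javascript
// Toy model of one edge location: serve locally when possible,
// otherwise fall through to the central source and keep a copy.
class EdgeCache {
  constructor(fetchFromOrigin, ttlMs = 60000) {
    this.fetchFromOrigin = fetchFromOrigin // hypothetical lookup against the central source
    this.ttlMs = ttlMs
    this.store = new Map() // key -> { value, expiresAt }
  }

  async get(key, now = Date.now()) {
    const entry = this.store.get(key)
    if (entry && entry.expiresAt > now) {
      return { value: entry.value, servedFrom: 'edge' }
    }
    // Missing or expired: go back to the central source and cache the result
    const value = await this.fetchFromOrigin(key)
    this.store.set(key, { value, expiresAt: now + this.ttlMs })
    return { value, servedFrom: 'origin' }
  }
}
```

The now parameter exists only so that expiry can be exercised without waiting; a real cache would also bound its size.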

What is serverless JavaScript / a serverless function?

Serverless JavaScript is JavaScript code that comprises all or part of an application, runs only when requested, and is not hosted on proprietary servers. It is hosted on an edge network (a content delivery network) or on an HTTP caching service, which stores content so that it can respond quickly to HTTP requests. Developers can write and deploy JavaScript functions that process HTTP requests before they travel all the way to the origin server.

What Are Cloudflare Workers?

Cloudflare Workers are serverless JavaScript functions that run as close as possible to the end user, which means the serverless code itself is also ‘cached’ across Cloudflare’s content delivery network. Cloudflare Workers are written in JavaScript against the Service Workers API, so they can use the functionality that service workers offer.
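
To make this concrete, here is a minimal sketch of a Worker written in the service-worker style. The greeting text is made up for illustration, and the typeof guard is our own addition so that the same file can be loaded outside the Workers runtime.

```javascript
// Build the reply at the edge; no origin server is contacted.
async function handleRequest(request) {
  const url = new URL(request.url)
  return new Response(`Hello from the edge, you asked for ${url.pathname}`, {
    headers: { 'Content-Type': 'text/plain' },
  })
}

// In the Workers runtime the handler is registered on the global 'fetch'
// event. The guard keeps the file loadable where that global is missing.
if (typeof addEventListener === 'function') {
  addEventListener('fetch', event => {
    event.respondWith(handleRequest(event.request))
  })
}
```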

Content Delivery Network

Cloudflare’s Content Delivery Network (CDN) is a geographically distributed group of servers that ensure fast delivery of Internet content.

Cloudflare Workers (serverless JavaScript) are distributed across this content delivery network, where they can cache HTML pages, JavaScript files, stylesheets, and images. Users receive these resources from the CDN cache after making an HTTP request that is handled by a Worker. Caching static resources on the delivery network reduces your server load and bandwidth, with no extra charges for bandwidth spikes.

Why Workers Cache API

Figuring out what to cache and how can get complicated. Consider an e-commerce site with a shopping cart, a Content Management System (CMS) with many templates and hundreds of articles, or a GraphQL API. Each contains a mix of elements that are dynamic for some users but might stay unchanged for the vast majority of requests.

Over the last 8 years, Cloudflare has added more features that give users flexibility and control over what goes in the cache. However, Cloudflare needed to offer more than additional settings in Page Rules.

With the Cache API, users can express their caching ideas in code.

How the Cache API works

The Cache API unleashes a huge amount of power. Because Workers can modify Request and Response objects, we can control caching behavior such as TTL or cache tags. We can also implement custom Vary logic or cache normally uncacheable objects such as POST requests.
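
One way to picture “custom Vary logic” is to derive the cache key yourself instead of using the incoming request as is. The sketch below is an assumption-laden example, not a Cloudflare recipe: buildCacheKey, the list of ignored parameters, the __device parameter and the User-Agent test are all hypothetical choices.

```javascript
// Query parameters that should not create separate cache entries (assumed list)
const IGNORED_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign']

// Returns a GET Request to use as the cache key for the incoming request.
function buildCacheKey(request) {
  const url = new URL(request.url)
  for (const name of IGNORED_PARAMS) url.searchParams.delete(name)
  url.searchParams.sort() // ?a=1&b=2 and ?b=2&a=1 now share one entry

  // Hand-rolled "Vary": one entry per device class instead of per User-Agent
  const ua = request.headers.get('User-Agent') || ''
  url.searchParams.set('__device', /Mobi/i.test(ua) ? 'mobile' : 'desktop')

  return new Request(url.toString(), { method: 'GET' })
}
```

The same key would then be passed to both cache.match() and cache.put(), so that lookups and stores agree on what counts as “the same” request.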

The Cache API expects requests and responses, but they don’t have to come from external servers. Your Worker can generate arbitrary data that is stored in Cloudflare’s cache. That means you can use the Cache API as a general-purpose, ephemeral key-value store!
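
A rough sketch of the key-value idea follows. The helper names kvPut and kvGet and the KV_BASE URL are hypothetical; we assume the key URL should live under a hostname you control, so replace example.com accordingly. The cache object is passed in as a parameter, which in a Worker would be caches.default.

```javascript
// Synthetic URL namespace for keys (assumed; replace with a host in your zone)
const KV_BASE = 'https://example.com/__kv/'

// Store any JSON-serialisable value for roughly ttlSeconds.
async function kvPut(cache, key, value, ttlSeconds = 60) {
  const response = new Response(JSON.stringify(value), {
    headers: {
      'Content-Type': 'application/json',
      'Cache-Control': `max-age=${ttlSeconds}`,
    },
  })
  await cache.put(new Request(KV_BASE + encodeURIComponent(key)), response)
}

// Read a value back; resolves to undefined when it is absent or was evicted.
async function kvGet(cache, key) {
  const hit = await cache.match(new Request(KV_BASE + encodeURIComponent(key)))
  return hit ? hit.json() : undefined
}
```

Because the store is ephemeral, callers should always be prepared for kvGet to return undefined and recompute the value.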

The Cache API is strongly influenced by the web browsers’ Cache API, but there are some important differences. For instance, the Cloudflare Workers runtime exposes a single global cache object.
Global cache object: let cache = caches.default

At Its Core, The Cache API Offers Three Methods:

put(request, response): places a Response in the cache, keyed by a Request.
Syntax: cache.put(request, response)

match(request): returns the Response that was previously put() for the given Request
Syntax: cache.match(request, options)

delete(request): deletes a Response that was previously put()
Syntax: cache.delete(request, options)
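
The three methods can be seen together in one round trip. This is a minimal sketch: cacheRoundTrip and the example.com key are hypothetical, and the cache argument is whatever cache object you have at hand (caches.default inside a Worker).

```javascript
// Walks one entry through put -> match -> delete -> match.
async function cacheRoundTrip(cache) {
  const key = new Request('https://example.com/greeting') // hypothetical key
  const fresh = new Response('hello', {
    headers: { 'Cache-Control': 'max-age=60' },
  })

  await cache.put(key, fresh)             // store the response under the key
  const hit = await cache.match(key)      // a Response, or undefined on a miss
  const body = hit ? await hit.text() : null
  const deleted = await cache.delete(key) // true if an entry was removed
  const after = await cache.match(key)    // undefined again after the delete

  return { body, deleted, found: after !== undefined }
}
```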

Cloudflare Proxied DNS And ‘DNS ONLY’ DNS

If ‘Proxied’ is turned on, does that mean Cloudflare will proxy my HTTP requests? Yes: proxied means that a DNS lookup of the domain returns a Cloudflare IP address rather than your own. All attacks on that domain will therefore hit Cloudflare and not your host directly.
DNS Only means all traffic goes directly to your own IP, without Cloudflare acting as a safety net in front.
The upside of Proxied is that you get the Cloudflare benefits, but clients cannot connect directly to your IP, which means custom ports will not work. DNS Only has the advantage that custom ports work, because clients connect to your IP directly.
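
A quick way to tell the two modes apart from the client side is to inspect the response headers. The sketch below is a heuristic under the assumption that proxied responses carry Cloudflare’s CF-RAY header while DNS-only responses, coming straight from the origin, do not; wentThroughCloudflare is a hypothetical helper name.

```javascript
// Heuristic only: treat the presence of a CF-RAY header as a sign that
// the response passed through Cloudflare's proxy.
function wentThroughCloudflare(response) {
  return response.headers.has('cf-ray')
}
```

For example, after const res = await fetch(url), calling wentThroughCloudflare(res) gives a first hint about whether the record behind that URL is proxied.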


Cloudflare Route

Cloudflare Site routes are comprised of:

  1. Route URL (see Matching Behavior)
  2. Worker script (response-cache) to execute on matching requests.
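
To illustrate the idea of matching a request against a route table, here is a simplified sketch. It is not Cloudflare’s actual matching code: routeMatches treats * as “any run of characters” and everything else literally, and the route api.opstree.com/product/* is an assumed pattern for the response-cache script.

```javascript
// Simplified route matcher: '*' matches any run of characters,
// every other character must match literally.
function routeMatches(pattern, url) {
  const { host, pathname, search } = new URL(url)
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*')
  return new RegExp(`^${escaped}$`).test(host + pathname + search)
}

// Hypothetical route table: pattern -> Worker script name
const routes = [{ pattern: 'api.opstree.com/product/*', script: 'response-cache' }]

// Returns the script for the first matching route, or null when none matches.
function findWorker(url) {
  const route = routes.find(r => routeMatches(r.pattern, url))
  return route ? route.script : null
}
```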


Page Rule

Page Rules trigger certain actions whenever a request matches one of the URL patterns you define.

Page Rules Settings

Page rule settings control the action Cloudflare takes once a request matches the URL pattern defined in a page rule. You can use settings to enable and disable multiple Cloudflare features across several of the dashboard apps.

Below is the list of settings available in the Cloudflare Page Rules UI.

Always Use HTTPS – Turn on or off the Always Use HTTPS feature of the Edge Certificates tab in the Cloudflare SSL/TLS app. If enabled, any http:// URL is converted to https:// through a 301 redirect.
Browser Cache TTL – Control how long resources cached by client browsers remain valid.
Cache Level – Apply custom caching based on the option selected:

Bypass – Cloudflare does not cache.
No Query String – Delivers resources from the cache when there is no query string.
Ignore Query String – Delivers the same resource to everyone independent of the query string.
Standard – Caches all static content that has a query string.
Cache Everything –  Treats all content as static and caches all file types beyond the Cloudflare default cached content.  Respects cache headers from the origin web server unless Edge Cache TTL is also set in the Page Rule. When combined with an Edge Cache TTL > 0, Cache Everything removes cookies from the origin web server response. 
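
The way the first four cache levels treat the query string can be modelled with a small function. This is a simplified model based only on the descriptions above, not Cloudflare’s implementation; cacheLookupUrl and the lowercase level names are hypothetical. Cache Everything is left out because it changes which content types are cached, not how the query string is handled.

```javascript
// Returns the URL used to look up the cache, or null when the cache is skipped.
function cacheLookupUrl(rawUrl, cacheLevel) {
  const url = new URL(rawUrl)
  switch (cacheLevel) {
    case 'bypass':
      return null // never cached
    case 'no-query-string':
      return url.search ? null : url.toString() // only query-less URLs are served from cache
    case 'ignore-query-string':
      url.search = '' // every query string maps to the same entry
      return url.toString()
    case 'standard':
      return url.toString() // each query string gets its own entry
    default:
      throw new Error(`unknown cache level: ${cacheLevel}`)
  }
}
```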


Understand Cloudflare Cache Responses

Cloudflare uses the CF-Cache-Status header to show whether a resource is cached or not.

HIT – The resource was found in Cloudflare’s cache.
MISS – The resource was not found in Cloudflare’s cache and was served from the origin web server.
EXPIRED – The resource was found in cache but has since expired and was served from the origin web server.
DYNAMIC – The resource was not cached by default and your current Cloudflare caching configuration doesn’t instruct Cloudflare to cache the resource. Instead, the resource was requested from the origin web server. Use Page Rules to implement custom caching options.
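
When debugging, it can help to turn this header into a simple answer to “where did this body come from?”. The helper below is a small sketch covering only the four values listed above; servedFrom and its return strings are hypothetical names.

```javascript
// Maps the CF-Cache-Status values listed above to the source of the body.
function servedFrom(response) {
  const status = (response.headers.get('CF-Cache-Status') || '').toUpperCase()
  switch (status) {
    case 'HIT':
      return 'cloudflare-cache'
    case 'MISS':
    case 'EXPIRED':
    case 'DYNAMIC':
      return 'origin'
    default:
      return 'unknown' // header absent, or a status not covered in this post
  }
}
```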

With this basic knowledge of content delivery networks, Page Rules, Workers, Worker routes and response headers, we can now move on to implementing caching with Workers.

Sample Worker JavaScript

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event))
})

async function handleRequest(event) {
  const cache = caches.default
  // Look for this request in this zone's cache
  let response = await cache.match(event.request)
  if (!response) {
    // Not in cache: fetch it from the origin
    response = await fetch(event.request)

    // Re-create the Response so that it keeps all fields but has mutable headers
    response = new Response(response.body, response)

    // Override the Cache-Control header of the origin response
    response.headers.set('Cache-Control', 'max-age=600')

    // Store a copy in the cache without delaying the reply to the client
    event.waitUntil(cache.put(event.request, response.clone()))
  }
  // Return the response on both the hit and the miss path
  return response
}
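
To try the same cache-aside flow outside Cloudflare, the dependencies can be passed in explicitly. This is a sketch under assumptions: cachedFetch and the deps fields cache, fetchOrigin and waitUntil are hypothetical names, and the GET check reflects our understanding that cache.put() rejects non-GET requests.

```javascript
// Same flow as the sample Worker above, with its dependencies injected.
async function cachedFetch(request, deps) {
  const { cache, fetchOrigin, waitUntil } = deps

  // Assumption: only GET requests can be stored, so pass the rest through
  if (request.method !== 'GET') return fetchOrigin(request)

  const cached = await cache.match(request)
  if (cached) return cached // cache hit: the origin is never contacted

  const originResponse = await fetchOrigin(request)
  // Copy so that the headers are mutable, then apply our own TTL
  const response = new Response(originResponse.body, originResponse)
  response.headers.set('Cache-Control', 'max-age=600')

  // Store a copy in the background and return the original right away
  waitUntil(cache.put(request, response.clone()))
  return response
}
```

Inside a Worker this could be wired up as cachedFetch(event.request, { cache: caches.default, fetchOrigin: fetch, waitUntil: p => event.waitUntil(p) }).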

A picture is worth a thousand words; below is a flow diagram of how a Cloudflare Worker responds to an HTTP request.

Describing each step:

Step 1: DNS-only (non-proxied) requests go directly to the origin server.
DNS-only requests do not qualify for Cloudflare caching.

Step 2: DNS-only (non-proxied) responses come directly from the origin server.

Step 3: Proxied requests: the Cloudflare content delivery network comes into play for DNS records that have Proxied enabled.

Step 4: Each proxied request URI is matched against the Worker route table.
Each proxied request passes through the Worker route table to check whether the request qualifies for the Worker JavaScript.

Step 5: Suppose the request is api.opstree.com/product/getinfo
The request URI matches an entry in the route table, and the JavaScript listener addEventListener('fetch', event => { event.respondWith(handleRequest(event)) }) is triggered.

Step 6: Declare the global cache object: let cache = caches.default

Step 7: response = await cache.match(event.request)
Check whether the response is in the content delivery network cache.

Step 8: The response is found in the content delivery network and is returned without going to the origin. Cloudflare also sets the CF-Cache-Status header value to “HIT”.

Step 9: if (!response): the response was not found in the cache.

Step 10: The request is sent to the origin server using response = await fetch(event.request)

Step 11: The response is checked against the Page Rules.
The response matches the Page Rule *.opstree.com/product/*, which sets the cache level to Cache Everything.

Step 12: Make the response mutable by creating a copy of it with response = new Response(response.body, response).

Cloudflare Workers are used to handle request and response data programmatically.

Step 13: Change the response’s Cache-Control header value to max-age=600.

Step 14: event.waitUntil(cache.put(event.request, response.clone())) places the Response in the cache for the next hit.

Step 15: Cloudflare sets the CF-Cache-Status header value to “MISS” and sends the response from the origin to the user.

Step 16: Return to step 5. Suppose the request is *.opstree.com/feature/. This request does not match any entry in the Worker route table, so the content delivery network is called using the traditional method, fetch().

Step 17: If the response is found in the cache, Cloudflare sets the CF-Cache-Status header value to “HIT” and sends the response from the cache to the user.

Step 18: If the response is not found in the cache, the request is directed to the origin server.

Step 19: The response from the server is checked against the Page Rules.
The response matches the Page Rule *.opstree.com/feature/*, which sets the cache level to Cache Everything.

Step 20: The response is stored in the content delivery network, and Cloudflare sets the CF-Cache-Status header value to “MISS” and sends the response from the origin to the user.

Unlike with Workers, we did not change responses or requests in the traditional caching method.

Now suppose the request is *.opstree.com/testing/another. This request does not match any entry in the Worker route table, so the content delivery network is called using the traditional method, fetch().

Step 21: Again, the response is not found in the cache, so the request is directed to the origin server.

Step 22: The response is checked against the Page Rules. It matches the Page Rule *.opstree.com/*, which sets the cache level to Bypass.

Step 23: The response is not cached in the content delivery network. Cloudflare sets the CF-Cache-Status header value to “DYNAMIC” and sends the response from the origin to the user.


Opstree is an End to End DevOps solution provider

 
