How we reduced our Prismic bill by 75% by caching API calls

When I joined the team at IndiaHikes, I was asked to help reduce our overall tech expenditure. We were spending a lot of money on Prismic.

Prismic was the central CMS for the whole platform. The entire content team (about 40 people) frequently updates and relies on Prismic for content. Content is truly king at IndiaHikes, and we can't afford to be without a central CMS.

The problem we faced

  1. A high number of API calls to Prismic to pull content at build time. Although only a few pages were built during a build, the call volume was still huge. We kept hitting the limits of Prismic's plans and ended up on the highest one.
  2. Because of this, we were not comfortable deploying frequently, as it would increase the bill.
  3. Prismic doesn't offer separate production / dev environments or any way to mock it locally, so even local development hit the main Prismic API. Several developers consuming API calls almost daily was also an issue. We wanted more room here.
  4. Every preview build on Vercel, triggered on every code push by devs, would hit Prismic again.

Solution

Cache it!

The team at Prismic suggests not caching the API calls (https://community.prismic.io/t/for-how-long-is-it-safe-to-cache-a-prismic-api-ref/5962). That makes sense: there are a lot of moving parts, and cache invalidation is hard to get right.

Every API call is tagged with a version number of sorts called the masterRef in Prismic. The masterRef changes every time content is updated, so if we cache an API call for a long time, we might end up serving stale content.

Cache invalidation strategy

Prismic supports webhooks for triggering endpoints on various events, and we used them to invalidate the cache. Still, the issue remained that once the masterRef changed, we would have to update the whole cache to reflect the new masterRef.

So we decided to use the request URLs, with the masterRef stripped out, as cache keys, and the content (encoded and gzipped using Zlib, https://nodejs.org/api/zlib.html) as the values. That's it.

This cache server became our source of truth. When a request hits the cache server, we strip the masterRef from the request URL and then check the cache. That way we always serve the data, regardless of what the masterRef is.
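The read path can be sketched like this (a plain in-memory Map stands in for Redis and gzip here, and `fetchFromPrismic` is a hypothetical stand-in for the real upstream call):

```javascript
// Read path of the cache server: normalize the URL, serve from cache if
// present, otherwise fall through to Prismic and cache the result.
const cache = new Map(); // key: URL without ref, value: cached content

function stripRef(rawUrl) {
  const url = new URL(rawUrl);
  url.searchParams.delete('ref'); // masterRef travels as the `ref` param
  return url.toString();
}

async function handleRequest(rawUrl, fetchFromPrismic) {
  const key = stripRef(rawUrl);
  if (cache.has(key)) {
    return { source: 'cache', body: cache.get(key) }; // no Prismic call made
  }
  const body = await fetchFromPrismic(rawUrl); // miss: hit Prismic once
  cache.set(key, body);
  return { source: 'prismic', body };
}
```

A second request for the same query with a newer masterRef is a cache hit, so repeated builds and preview deploys stop counting against the Prismic quota.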

Now, how do we update the content without relying on the masterRef?

We have another endpoint that revalidates the pages. Whenever content is updated and Prismic fires the webhook, we trigger a revalidation request with the details from the incoming payload. This revalidation request pulls fresh data from Prismic and updates the cache.
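The revalidation flow can be sketched as follows (the function names and the list of URLs to refresh are illustrative; in practice the webhook payload tells us what changed):

```javascript
// Webhook-driven revalidation: re-fetch the affected queries from Prismic
// and overwrite the cached values under the same ref-stripped keys.
function stripRef(rawUrl) {
  const url = new URL(rawUrl);
  url.searchParams.delete('ref'); // same normalization as the read path
  return url.toString();
}

async function revalidate(cache, urlsToRefresh, fetchFromPrismic) {
  for (const rawUrl of urlsToRefresh) {
    const key = stripRef(rawUrl);
    const fresh = await fetchFromPrismic(rawUrl); // pull latest content
    cache.set(key, fresh); // overwrite: the next read serves fresh content
  }
}
```

Because writes go through the same key normalization as reads, a revalidation replaces exactly the entry that readers are being served from.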

Since the masterRef is stripped from all incoming URLs and from the stored keys, new URLs always return updated content as well, as long as the cache is kept up to date through the webhook flow.

General flow of the legacy system

Legacy system flow

General flow of the new cache system

New cache system flow

Tracking some numbers on Axiom for observability

We used Axiom to track cache efficiency. We could see the number of hits, misses, and the overall hit rate of the cache.
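The tracking itself is simple; a rough sketch of the counters behind those dashboards (field names are illustrative, and in production the events go out through Pino to Axiom rather than being kept in memory):

```javascript
// Per-lookup cache metrics: count hits and misses, and derive the
// efficiency (hit rate) shown on the dashboard.
const stats = { hits: 0, misses: 0 };

function recordLookup(wasHit) {
  if (wasHit) stats.hits += 1;
  else stats.misses += 1;
}

// Efficiency = hits / total lookups; 0 when nothing has been recorded yet.
function efficiency() {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : stats.hits / total;
}
```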

The errors are usually a few stale pages. It's been a couple of months already, and we haven't seen any issues.

This has given the team more confidence to deploy frequently without worrying about the bill in any way. We've reduced the bill by 75% and are shipping every few days, sometimes multiple times a day.

Axiom dashboard showing statistics of our cache service efficiency

Tech, libs and frameworks

  • Express.js
  • Node.js
  • Redis
  • Zlib
  • Pino.js and Axiom for logging

Hosted on DigitalOcean


This was ideated by me and implemented by my colleague, Ebrahim. The whole process was really interesting to watch, from the discussion of various approaches to the implementation and finally seeing the results.