Understanding Caching and Fastly with Magento 2

Author: Eric Marsh

09/07/2020 Categories: ARTICLES

What is Fastly?

Quick responses are a key factor in any webstore’s success. Amazon discovered a decade ago that every 100ms of delay in response time cost them 1% in sales. This is even truer today; for most clients, over 50% of traffic is on mobile devices, which generally use slower connections (like cellular data) than personal computers (Wi-Fi or Ethernet). One tool that sophisticated merchants leverage is Fastly, a CDN (Content Delivery Network) that operates between your Magento store and your customers. Its primary role is to provide a caching layer in front of your Magento site, which both speeds up responses from the site and prevents many requests from ever reaching the webserver.

First, A Word About Caching

Put simply, caching is saving the result of a time-consuming or resource-intensive operation so it can be reused later. That way, if the same request is made again, the site or application can respond quickly because the answer is already cached.
Here is a simple example: suppose someone asked you to calculate 127 x 53 by hand. Unless you are Matt Damon’s character in Good Will Hunting, it is going to take a minute or two to figure out the solution and reply to the person. Caching the solution is simply remembering the inputs (127, 53, multiplication function) and the output (6,731). We can put this into a table like this:

Input 1   Input 2   Function   Answer
127       53        Multiply   6,731

Now suppose other people asked you some more math problems, and you wrote down their answers in the table as well. Now you have built a cache, which is simply an answer bank you can pull from quickly and easily when asked the same question again.

Input 1   Input 2   Function   Answer
127       53        Multiply   6,731
248       33        Multiply   8,184
574       154       Divide     3.7272
823       12        Multiply   9,876

Since it is difficult to calculate all these math problems by hand, when you get a new problem, it is a good idea to see if you have already solved the problem in your table. If you find that you have solved the problem before, you can respond to the questioner in a few seconds instead of the few minutes it will take to do the math again. What you have done is create a “cache” of your math problems and associated solutions. Given the same set of inputs, you can quickly find the output if you have calculated it before.
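The answer bank above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything Fastly runs; the names `cache`, `OPERATIONS`, and `solve` are made up for this example.

```python
import operator

# The "answer bank": maps (input 1, input 2, function) to the answer.
cache = {}

# The two functions used in the table above.
OPERATIONS = {"Multiply": operator.mul, "Divide": operator.truediv}


def solve(a, b, function):
    """Return (answer, source), where source says how the answer was obtained."""
    key = (a, b, function)
    if key in cache:
        # We have solved this exact problem before: just look it up.
        return cache[key], "from table"
    answer = OPERATIONS[function](a, b)  # the slow "by hand" step
    cache[key] = answer                  # write it down for next time
    return answer, "computed"


print(solve(127, 53, "Multiply"))  # first time: has to be computed
print(solve(127, 53, "Multiply"))  # second time: read from the table
```

The key point is that the lookup key is built from every input that affects the answer. Leave one out and you would hand back the wrong answer; include something irrelevant and you would redo work you have already done.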

With a CDN such as Fastly, substitute “webserver requests” for “math problems” in our example to understand what it does. The request from the web browser goes through Fastly first, at which point Fastly checks whether it has responded to the same request before. If it has (this is called a cache HIT), Fastly responds with the same answer that it gave the previous requester. This could be the HTML for your homepage, a product image, a CSS file, etc. However, if the request is a question that has not been asked before, this is called a cache MISS. Fastly will then perform a FETCH, in which it asks the backend webserver or service to calculate the answer. In webserver terms, this means returning a brand-new file or unique page.

Once Fastly has received this response from the backend, it checks whether the webserver wants the answer saved (“cached”) or not, and then sends the answer on to the original requester (the web browser).

Not all information should be cached, however. In eCommerce, there is some data that you don’t want to share with other users. For example, it would be terrible if everyone shopping at the same eCommerce store could access each other’s shopping carts. All “private” or unique user-centric data like this is not cached by Fastly and is only tracked by Magento. 
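To make the HIT / MISS / FETCH flow concrete, here is a toy caching layer in Python. It is a sketch under simplifying assumptions: the `EdgeCache` class, the `demo_backend` function, and the rule that anything under “/checkout/” is private are all invented for illustration and are not Fastly’s or Magento’s actual logic.

```python
class EdgeCache:
    """Toy stand-in for a caching layer that sits in front of a backend."""

    def __init__(self, backend):
        self.backend = backend  # callable: url -> (body, cacheable)
        self.store = {}         # url -> saved response body

    def handle(self, url):
        """Return (body, status), where status is "HIT" or "MISS"."""
        if url in self.store:
            return self.store[url], "HIT"
        body, cacheable = self.backend(url)  # the FETCH
        if cacheable:
            self.store[url] = body           # save the answer for later
        return body, "MISS"


def demo_backend(url):
    # Hypothetical backend: treats anything under /checkout/ as private.
    cacheable = not url.startswith("/checkout/")
    return f"response for {url}", cacheable


edge = EdgeCache(demo_backend)
print(edge.handle("/"))                # MISS: first request goes to the backend
print(edge.handle("/"))                # HIT: served from the cache
print(edge.handle("/checkout/cart/"))  # MISS, and it is never stored
```

Note that the private page is a MISS every time, which is exactly what we want: the backend answers each shopper individually.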

Features of Fastly

Fastly is what I refer to as “Distributed Varnish as a Service”. Varnish is open source software that Fastly uses as its caching engine. Varnish can be downloaded and installed locally, which has become a common practice for many Magento developers. Further information about Varnish can be found at https://varnish-cache.org

Another helpful feature of Fastly is its worldwide datacenter presence. You can even see a list of them at https://status.fastly.com/. Every website is assigned a “home” datacenter, which is usually the one closest to the webserver. In addition, every request for a site working with Fastly gets routed through the node closest to the visitor, thus setting up a two-tier caching system.

How Magento Uses Fastly

Fastly integration is built into the core of the Magento 2 platform, along with Varnish. Setup is straightforward. There are only two pieces of information needed for basic configuration:

  • Fastly Service ID
  • Fastly API Key

All configuration details can be found in the Magento DevDocs.
You will also need to configure your DNS so it points to Fastly.

The (Simplified) Anatomy of a Request in Fastly

[Diagram: simplified Varnish state machine]
At the core of Fastly and Varnish is a Finite State Machine. It may sound like a doomsday device, but what it actually does is (1) take a request from a web browser, (2) perform an action based on the information provided to it, and (3) return the result. I have provided a simplified version of the Varnish state machine diagram that hits most of the critical pieces. Each box in the diagram is a function in Varnish VCL that has its own specific role in the lifecycle, so I’ve detailed each below in its own section.
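As a very rough sketch, the simplified lifecycle covered in this article can be written as a function that lists the states a request passes through. This is Python for illustration only; the function name `request_path` is made up, and the real state machine has more states and transitions than shown here.

```python
def request_path(in_cache):
    """List the VCL functions a request visits in the simplified model."""
    steps = ["vcl_recv", "vcl_hash"]         # prepare the request, then build the lookup key
    if in_cache:
        steps.append("vcl_hit")              # answer already stored
    else:
        steps += ["vcl_miss", "vcl_fetch"]   # not stored: ask the backend
    steps.append("vcl_deliver")              # always the last stop
    return steps


print(" -> ".join(request_path(in_cache=True)))
print(" -> ".join(request_path(in_cache=False)))
```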

vcl_recv

This is the first function that gets called. Its role is to prepare the request for further processing later. Some URL parameters are removed, including common ones that Google usually adds. You may have seen them before in URLs while browsing the internet: 

  • utm_
  • gclid
  • _ga

Removing these URL parameters is very important. If they were left in, every distinct value would make the URL look like a brand-new request, so almost nothing would ever be served from the cache.

vcl_recv also marks some URLs as not cacheable. Remember our example of accidentally sharing carts between users? This function prevents that outcome. Varnish looks for URLs containing the terms “catalogsearch”, “checkout”, or “customer”, then instructs the later states not to cache the result, to avoid sharing private data.

Varnish also looks at the URL string to prevent caching of anything from the admin.
The last piece of code sorts the URL parameters, which will help get more cache hits.
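The real work here is done in VCL, but the idea of stripping tracking parameters and sorting what is left can be sketched in Python. The function name `normalize_url` and the two constants are my own for this example; only the parameter names (utm_, gclid, _ga) come from the list above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)      # any parameter starting with utm_
TRACKING_NAMES = {"gclid", "_ga"}  # exact parameter names to drop


def normalize_url(url):
    """Drop tracking parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    params.sort()  # ?b=2&a=1 and ?a=1&b=2 become the same URL
    return urlunsplit(parts._replace(query=urlencode(params)))


print(normalize_url("https://acme.com/shoes.html?size=9&utm_source=mail&color=red&gclid=abc"))
```

After normalization, two visitors who arrive with different tracking values or a different parameter order end up asking for exactly the same URL, which is what makes a cache hit possible.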

[Image: cache code]

One last important point: Varnish will only cache responses to GET requests, not to POST or PUT requests.
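Putting the rules from this section together, the “should this be cached at all?” decision might look like the following Python sketch. The three URL terms come from the text above; the `/admin` prefix and the function name `should_cache` are assumptions for illustration, since the admin path differs from store to store.

```python
PRIVATE_TERMS = ("catalogsearch", "checkout", "customer")  # from the text above
ADMIN_PREFIX = "/admin"  # assumption: your admin path is probably different


def should_cache(method, path):
    """Decide whether a request is even a candidate for caching."""
    if method != "GET":
        return False  # POST, PUT, etc. are never cached
    if path.startswith(ADMIN_PREFIX):
        return False  # nothing from the admin is cached
    # Private, per-user pages must always go to Magento.
    return not any(term in path for term in PRIVATE_TERMS)


print(should_cache("GET", "/shoes.html"))
print(should_cache("GET", "/checkout/cart/"))
print(should_cache("POST", "/shoes.html"))
```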

vcl_hash

The hash function takes all the relevant information for a request and creates a hash (the output of a mathematical function) that is unique and repeatable given the same set of inputs.
Fastly then uses this hash to check whether we have already figured out the output. If we have, we simply return the result. If we haven’t, we fetch the result from the origin (i.e., the Magento server).

The purpose of the vcl_hash function is to decide which pieces of the request are relevant for looking up the result.
Some key components are:

  • URL
  • Cookie values 
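  • How long a saved result stays valid, called the time-to-live or TTL for short (strictly speaking, the TTL is not hashed; it decides whether a stored result is still fresh enough to use)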

Remember earlier when we discussed removing some URL parameters because they would defeat the cache? Imagine that we had a large email campaign and we blasted out emails to 100K subscribers. Part of the email campaign creates unique URLs for every single user so you can track open rates. Usually, this is done by adding a parameter to the end of the URL, such as email_id.
For example, it could look like this: Acme.com/sale-landing-page.html?email_id=12345

If Fastly just looked at the URL as it is, every person who clicked the email link would bypass the cache, register a cache MISS, and fetch the page directly from Magento. If all 100K people opened it at once, this would quickly overload your Magento server. Congratulations! You just DDoS’d yourself.

We can get around this by having Fastly ignore the “email_id” parameter in the URL and return the cached page for every user, thus saving your webserver from crashing.
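Here is a Python sketch of that idea: build the lookup key from the parts of the request that matter and leave “email_id” out. This is not how Fastly computes its hash internally; the function name `cache_key`, the `vary_cookie` argument, and the choice of SHA-256 are assumptions made purely for illustration.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit

IGNORED_PARAMS = {"email_id"}  # parameters that must not affect the lookup


def cache_key(url, vary_cookie=""):
    """Build a repeatable key from the host, path, relevant parameters, and a cookie value."""
    parts = urlsplit(url)
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_PARAMS
    )
    material = "|".join(
        [parts.netloc.lower(), parts.path, urlencode(params), vary_cookie]
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


a = cache_key("https://acme.com/sale-landing-page.html?email_id=12345")
b = cache_key("https://acme.com/sale-landing-page.html?email_id=67890")
print(a == b)  # both subscribers map to the same cached page
```

With this key, the first subscriber to click triggers a single fetch from Magento, and the other 99,999 are served from the cache.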

vcl_hit, vcl_miss

These functions get called depending on whether the hash was already found in Fastly: vcl_hit if it is found, vcl_miss if it is not. Not a lot of customization is done here; it is just important that they exist.

vcl_fetch

If there is a cache miss, the fetch function is responsible for making the request out to the backend, where the requested page, image, CSS file, and so on is located. It also does some cleanup of HTTP headers to make sure that only the necessary data is cached. What we can do here is fetch different types of data from different backends. Here are a few potential examples:

  • Fetch static assets (js, css, jpg, gif, etc.) from a different server
  • Fetch anything under the “/blog/” URL path from a separate WordPress server
  • Direct all traffic going to the “admin” subdomain to a separate admin server

The fetch function also tells Fastly what to do with the response once we get it back from the backend. Fastly will cache it, or if it is private information (like your name or the contents of your shopping cart) just pass it along to the browser.
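The three routing examples above could be sketched like this in Python. The backend names (“admin”, “wordpress”, “static”, “magento”) and the function `choose_backend` are hypothetical labels for this example; in practice the routing is written in VCL against backends you define in Fastly.

```python
from urllib.parse import urlsplit

STATIC_EXTENSIONS = (".js", ".css", ".jpg", ".gif")


def choose_backend(url):
    """Pick which backend should answer a request that missed the cache."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("admin."):
        return "admin"      # separate admin server
    if parts.path.startswith("/blog/"):
        return "wordpress"  # separate WordPress server
    if parts.path.lower().endswith(STATIC_EXTENSIONS):
        return "static"     # server dedicated to static assets
    return "magento"        # everything else goes to the store itself


print(choose_backend("https://acme.com/blog/spring-sale"))
print(choose_backend("https://acme.com/media/logo.jpg"))
print(choose_backend("https://admin.acme.com/dashboard"))
print(choose_backend("https://acme.com/shoes.html"))
```

The order of the checks is a design choice: here the admin subdomain wins over everything else, so a static file requested through the admin host still goes to the admin server.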

vcl_deliver

The final function called in our state machine is deliver, or vcl_deliver. This function is called whether the request generated a cache hit or a miss. It is also the last opportunity Fastly has to clean up or modify the response before sending it back to the browser. In Magento, this is primarily used to remove some additional HTTP headers that get added to the response, which we may not want sent to the end user.

One change that is frequently done to aid in debugging is making sure that the X-Magento-Cache-Debug or X-Cache headers are not unset anywhere. Otherwise we won’t be able to tell by looking at the response if we had a cache hit or not.
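As a final sketch, here is the deliver step in Python: strip headers the visitor does not need, but keep a hit/miss marker for debugging. The header names in `INTERNAL_HEADERS` are examples I picked for illustration, and the `deliver` function is hypothetical; check your own VCL for the headers it actually removes.

```python
# Assumption: example headers we would rather not expose to visitors.
INTERNAL_HEADERS = {"X-Magento-Tags", "X-Powered-By"}


def deliver(headers, was_hit):
    """Return the response headers as they should be sent to the browser."""
    cleaned = {
        name: value
        for name, value in headers.items()
        if name not in INTERNAL_HEADERS
    }
    # Keep a marker so we can tell from the response whether the cache was used.
    cleaned["X-Cache"] = "HIT" if was_hit else "MISS"
    return cleaned


response_headers = {
    "Content-Type": "text/html",
    "X-Magento-Tags": "cat_p_1",
    "X-Powered-By": "PHP",
}
print(deliver(response_headers, was_hit=True))
```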

Summary

Hopefully, this has given you a good basic understanding of how caching, and more specifically Fastly, works in relation to Magento 2. In further blog posts, we will get into more detail about how to configure and customize Fastly, and maybe even how to set up a separate backend. Let us know if you have any questions or comments! And if you want more information on implementing Magento 2 for your business, check out our Magento 2 page.
