
My website, built with the Astro framework (Node.js SSR adapter), is deployed on a single shared-cpu-1x@256MB Fly.io instance in the Amsterdam region, which automatically handles gzip compression and TLS termination.

The initial setup was Varnish on port 80 -> Nginx on 8080 -> Node.js on 3000.

Varnish handles all caching for both static assets and dynamic requests; Nginx mostly rewrites/redirects URLs and serves error pages on top of the main application.

After some research, I found that Nginx is better suited for serving static content, so Varnish will now receive the already-rewritten URL (when a rewrite is needed) and serve only dynamic content. Also, in the previous configuration I had trouble with the Vary header being duplicated on static assets marked by Varnish. Is this the right way to set things up instead of the previous one?

The new setup is Nginx on port 80 -> Varnish on 8080 -> Node.js on 3000.

How do I properly configure caching of the static assets in /var/www/html/client for one year? Will this interfere with the dynamic routes served by Varnish? Thank you very much.

nginx/nginx.conf

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    # "access_log stdout;" would write to a file literally named "stdout";
    # use the /dev/ paths and reference the "main" format defined above
    access_log /dev/stdout main;
    error_log /dev/stderr info;

    upstream varnish {
        server localhost:8080;
    }

    server {
        listen 80 default_server;
        listen [::]:80 default_server;

        root /var/www/html/client;
        index index.html;

        server_tokens off;

        error_page 404 /404.html;

        location = /404.html {
            internal;
        }

        location = /robots.txt {
            log_not_found off; access_log off; allow all;
        }

        # nginx matches locations against the URI without the query string,
        # so no "(\?.*)?" suffix is needed here
        location ~* \.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)$ {
            log_not_found off;
            # "expires max" removed: combined with add_header it would emit
            # two conflicting Cache-Control headers
            add_header Cache-Control "public, max-age=31536000, immutable";
            add_header X-Static-File "true";
        }

        # Redirect URLs with a trailing slash to the URL without the slash
        location ~ ^(.+)/$ {
            return 301 $1$is_args$args;
        }

        # Redirect static pages to URLs without `.html` extension
        location ~ ^/(.*)(\.html|index)(\?|$) {
            return 301 /$1$is_args$args;
        }

        location / {
            try_files $uri $uri/index.html $uri.html @proxy;
        }

        location @proxy {
            proxy_http_version 1.1;
            proxy_cache_bypass $http_upgrade;

            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection 'upgrade';
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            proxy_redirect off;
            proxy_pass http://varnish;

            proxy_intercept_errors on;
        }
    }
}

varnish/default.vcl

vcl 4.1;

import std;

backend default {
    .host = "127.0.0.1";
    .port = "3000";
}

acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
}

sub vcl_recv {
    // Remove empty query string parameters
    // e.g.: www.example.com/index.html?
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }

    // Remove port number from host header
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");

    // Sorts query string parameters alphabetically for cache normalization purposes
    set req.url = std.querysort(req.url);

    // Remove the proxy header to mitigate the httpoxy vulnerability
    // See https://httpoxy.org/
    unset req.http.proxy;

    // Purge logic to remove objects from the cache
    // (handled before the method whitelist below; otherwise PURGE
    // requests would hit return (pipe) and never reach this block)
    if (req.method == "PURGE") {
        if (client.ip !~ purge) {
            return (synth(405, "Method Not Allowed"));
        }
        return (purge);
    }

    // Only handle relevant HTTP request methods
    if (
        req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "PATCH" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE"
    ) {
        return (pipe);
    }

    // Only cache GET and HEAD requests
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }

    // Mark static files with the X-Static-File header, and remove any cookies
    // X-Static-File is also used in vcl_backend_response to identify static files
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        set req.http.X-Static-File = "true";
        unset req.http.Cookie;
        return (hash);
    }

    // No caching of special URLs, logged in users and some plugins
    if (
        req.http.Authorization ||
        req.url ~ "^/preview=" ||
        req.url ~ "^/\.well-known/acme-challenge/"
    ) {
        return (pass);
    }

    // Remove any cookies left
    unset req.http.Cookie;

    return (hash);
}

sub vcl_pipe {
    // If the client request includes an "Upgrade" header (e.g., for WebSocket or HTTP/2),
    // set the same "Upgrade" header in the backend request to preserve the upgrade request
    if (req.http.upgrade) {
        set bereq.http.upgrade = req.http.upgrade;
    }
    return (pipe);
}

sub vcl_backend_response {
    // Inject URL & Host header into the object for asynchronous banning purposes
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;

    // Set the default grace period if backend is down
    set beresp.grace = 1d;

    // Stop cache insertion when a backend fetch returns an 5xx error
    if (beresp.status >= 500 && bereq.is_bgfetch) {
        return (abandon);
    }

    // Cache 404 response for short period
    if (beresp.status == 404) {
        set beresp.ttl = 60s;
    }

    // Create cache variations depending on the request protocol and encoding type
    if (beresp.http.Vary) {
        set beresp.http.Vary = beresp.http.Vary + ", X-Forwarded-Proto, Accept-Encoding";
    } else {
        set beresp.http.Vary = "X-Forwarded-Proto, Accept-Encoding";
    }

    // If the file is marked as static cache it for 1 year
    if (bereq.http.X-Static-File == "true" && beresp.http.Cache-Control == "public, max-age=0") {
        unset beresp.http.Set-Cookie;
        set beresp.http.X-Static-File = "true";
        set beresp.ttl = 1y;
    }
}

sub vcl_deliver {
    // Check if the object has been served from cache (HIT) or fetched from the backend (MISS)
    if (obj.hits > 0) {
        // For cached objects with a TTL of 0 seconds but still in grace mode, mark as STALE
        if (obj.ttl <= 0s && obj.grace > 0s) {
            set resp.http.X-Cache = "STALE";
        } else {
            // For regular cached objects, mark as HIT
            set resp.http.X-Cache = "HIT";
        }
    } else {
        // For uncached objects, mark as MISS
        set resp.http.X-Cache = "MISS";
    }

    // Set the X-Cache-Hits header to show the number of times the object has been served from cache
    set resp.http.X-Cache-Hits = obj.hits;

    // Unset certain response headers to hide internal information from the client
    unset resp.http.x-url;
    unset resp.http.x-host;
    unset resp.http.x-varnish;
    unset resp.http.via;
}

1 Answer


Nginx is a great web server, Varnish is a great cache, and both are great reverse proxies.

If you're only using Nginx for URL rewriting, redirection, and error handling, you don't really need Nginx. Varnish can do all of this just as well.

VCL template

The basic VCL configuration I would recommend is the following: https://www.varnish-software.com/developers/tutorials/example-vcl-template/

It's Varnish Software's recommended non-framework-specific VCL. It covers the following items:

  • Stripping of campaign parameters from the URL
  • Sorting query strings
  • Header cleanup
  • Static file caching
  • Backend health checking
  • Edge Side Include parsing
  • Setting the X-Forwarded-Proto header
  • Stripping off tracking cookies
  • Creating protocol-aware cache variations
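
As an illustration, the campaign-parameter stripping from that template looks roughly like this (the parameter list here is an assumption for the sketch; the linked template is authoritative):

```vcl
sub vcl_recv {
    // Strip common tracking/campaign parameters so they don't
    // needlessly fragment the cache (parameter list is illustrative)
    if (req.url ~ "(\?|&)(utm_[a-z]+|gclid|fbclid)=") {
        set req.url = regsuball(req.url, "(utm_[a-z]+|gclid|fbclid)=[^&]*&?", "");
        // Remove a dangling "?" or "&" left over after stripping
        set req.url = regsub(req.url, "[?&]$", "");
    }
}
```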

URL rewriting

If you want to perform URL rewriting, you can write if-statements in VCL and reset the URL via set req.url = "...". You can also perform find/replace with regular expressions using the regsuball() function.

See https://www.varnish-software.com/developers/tutorials/varnish-configuration-language-vcl for a basic VCL tutorial.

Here's a VCL interpretation of the two rewrite rules in your Nginx config:

sub vcl_recv {
    if(req.url ~ "^(.+)/$") {
        return(synth(301,regsuball(req.url,"^(.+)/$","\1")));
    }

    if(req.url ~ "^/(.*)(\.html|index)(\?|$)") {
        return(synth(301,regsuball(req.url,"^/(.*)(\.html|index)(\?|$)","/\1")));
    }
}

sub vcl_synth {
    if(resp.status == 301) {
        set resp.http.Location = resp.reason;
        set resp.reason = "Moved Permanently";
        set resp.body = "Redirecting.";
        return(deliver);
    }
}

This example code will redirect /test/ to /test and /test.html to /test, just like in your Nginx config.

Error handling

Errors coming from the backend are handled in VCL's vcl_backend_error subroutine and can be customized.

You also have the ability to generate your own errors in Varnish based on an incoming request. You do this by calling return (synth(INT status, STRING reason)); in your VCL code. We already did this in the URL redirection example.

Customizing the output of synthetic responses is similar to backend errors and happens in the vcl_synth subroutine.
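
For backend errors specifically, a minimal sketch could look like this (the HTML body is just a placeholder to show the mechanism):

```vcl
sub vcl_backend_error {
    // Serve a small synthetic error page instead of Varnish's default
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    synthetic({"<html><body><h1>Something went wrong</h1>
<p>Please try again in a moment.</p></body></html>"});
    return (deliver);
}
```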

Here's an example of how you can modify the output template of backend & synthetic errors. The example uses an HTML template: https://www.varnish-software.com/developers/tutorials/vcl-synthetic-output-template-file/

This should give you a clear indication of how to handle errors coming from your Node.js app.

Keep Nginx in the setup or not?

Based on how you're describing the situation, you don't really need Nginx. All the caching and reverse proxy logic can easily be done in Varnish.

However, there are two reasons that would justify keeping Nginx in this project:

  • TLS handling
  • Caching large volumes of static data

Let's talk about TLS first: the open source version of Varnish currently doesn't support native TLS. The commercial version does, but with the open source version you need to terminate TLS in front of Varnish.

We developed our own TLS proxy. It's called Hitch and works really well with Varnish. See https://www.varnish-software.com/developers/tutorials/terminate-tls-varnish-hitch/ for a tutorial.
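
A minimal hitch.conf could look like this (paths and the backend port are assumptions; Varnish would also need a PROXY-protocol listener, e.g. varnishd -a :8443,PROXY):

```ini
# hitch.conf: terminate TLS on 443 and forward to Varnish
frontend = "[*]:443"
# Varnish listener that speaks the PROXY protocol
backend  = "[127.0.0.1]:8443"
# Combined certificate + private key bundle (path is an assumption)
pem-file = "/etc/hitch/example.com.pem"
# Pass the original client IP to Varnish via PROXY protocol v2
write-proxy-v2 = on
```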

But one could argue that if you're already committed to using Nginx, you might as well use it to terminate the TLS session before connecting to Varnish.

The other reason could be static data. Don't get me wrong: Varnish is great at caching static data and might even be faster than Nginx at it. However, caching large volumes of static data in Varnish might eat into your cache space.

In Varnish you need to configure how much memory is used for caching. If you only have 1GB of memory assigned but 2GB of static files to cache, your cache may end up completely full. That's not a big issue, because the Least Recently Used (LRU) algorithm will automatically free up space by evicting long-tail content. But if that is not acceptable, Nginx can still be used to serve the static files.
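
Sizing that cache happens when starting varnishd, for example (the size value is just an illustration; on a 256MB Fly.io instance it would have to be small):

```shell
# Start Varnish with an in-memory object cache capped at 100 MB
# (adjust -s malloc to your instance; leave headroom for the OS and Node.js)
varnishd -a :8080 -f /etc/varnish/default.vcl -s malloc,100m
```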

If your static file collection is 1GB, but your cache is bigger, you don't really need to add Nginx.

  • Thank you very much. Great explanation.
    – Predaytor
    Aug 10 at 10:32
  • Does Nginx use memory like Varnish, or does it serve static assets directly from disk (is that fast)? Should I just use something like location ~ {regex} { add_header Cache-Control "public, max-age=31536000"; }? Will this rule only target the client browser, or does it work like in Varnish (max-age applying to both the browser and shared caches/CDNs)? Thank you.
    – Predaytor
    Aug 10 at 14:59
  • Nginx will only send the Cache-Control headers to the browser. However, it's fast enough as a web server to serve static content directly to the client without proxying it. Dynamic responses coming from your Node.js app, on the other hand, need proxy caching. See nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache for more info. Aug 11 at 8:26
  • Thanks for the answer. I've seen your videos on Varnish, they're fantastic! Also, the website documentation is top-notch!
    – Predaytor
    Aug 11 at 9:22
