Nginx fundamentals

Introduction to Nginx

What is Nginx

Nginx is a free and open-source web server that accelerates content and application delivery, improves security, and facilitates availability and scalability for the busiest websites on the Internet. Nginx can also be used as a reverse proxy, load balancer, mail proxy and HTTP cache (covered later in this article). According to W3Techs, it was used by 37.7% of the top 1 million websites, 49.7% of the top 100,000 websites, and 57.0% of the top 10,000 websites[...].

Why Nginx came to life

Problem

Let's go back to Apache and the way it handles requests. In the past, Apache handled each client request by creating a new child process to serve it, which wasted memory and CPU.

Solution

Nginx was created to solve this old Apache problem and to serve ten thousand concurrent connections at once (the C10k problem). It does so with a fixed set of worker processes that are started and controlled by a master process (the nginx process). As a result, Nginx uses dramatically less memory than Apache and can handle roughly four times more requests per second.

The number of worker processes is normally set to the number of CPU cores of your system, so that each worker process can run on its own core.

Nginx configuration file

Basic configuration

user nobody; # a directive in the 'main' context

events {
    # configuration of connection processing
}

http {
    # Configuration specific to HTTP and affecting all virtual servers  

    server {
        # configuration of HTTP virtual server 1       
        location /one {
            # configuration for processing URIs starting with '/one'
        }
        location /two {
            # configuration for processing URIs starting with '/two'
        }
        # define the path from which files are served
        root /path/to/files;
    } 
    
    server {
        # configuration of HTTP virtual server 2
    }
}

stream {
    # Configuration specific to TCP/UDP and affecting all virtual servers
    server {
        # configuration of TCP virtual server 1 
    }
}

Keywords

  • Directive: a single configuration option to be set in a context. Example: listen, root, server_name, etc.
  • Context: A few top‑level directives, referred to as contexts, group together the directives that apply to different traffic types. Example: events, http, server, mail, stream.
  • Virtual Servers: In each of the traffic‑handling contexts, you include one or more server blocks to define virtual servers that control the processing of requests. The directives you can include within a server context vary depending on the traffic type.

Basic Functionality

Setting Up Virtual Servers

A virtual server is defined by a server directive in the http context, for example:

#change user
user www-data;
http {
    server {
        listen 8080;
        server_name www.yamicode.com *.example.com;
        root /path/to/static/files;
    }
}

  • The listen directive specifies the address and port on which the server listens for requests: listen <ip>:<port>;. To mark a server as the default one for that address and port, add the default_server parameter: listen <ip>:<port> default_server;
  • The server_name directive specifies the host names the server answers to; it is compared with the Host header of the request.
  • The root directive specifies the root folder of our content.
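
As a minimal sketch of the default_server parameter described above, the configuration below declares a catch-all server next to a named one. The host name www.example.com and the choice of status 444 (an Nginx-specific code that closes the connection without sending a response) are assumptions made for illustration, not part of the original setup.

```nginx
http {
    # Catch-all: handles requests on port 80 whose Host header
    # matches no server_name of any other server block
    server {
        listen 80 default_server;
        server_name _;
        return 444;
    }

    # Named virtual server (hypothetical host name)
    server {
        listen 80;
        server_name www.example.com;
        root /path/to/static/files;
    }
}
```

To bind to a specific address instead of all interfaces, the same directive takes the form listen 127.0.0.1:8080;.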

Location Directive

A location block lives within a server block and is used to define how Nginx should handle requests for different resources and URIs for the parent server.

http {
    server {
        listen 80;
        server_name www.yamicode.com;

        #Prefix match
        #Usage: "www.yamicode.com/url/", "www.yamicode.com/url/something"
        location /url/ {
            return 200 'returned from /url/ ';
        }

        #Exact match
        #Usage: "www.yamicode.com/url/"
        location = /url/ {
            return 200 'returned from = url';
        }

        #Regex match - case insensitive
        #Usage: "www.yamicode.com/url/1", "www.yamicode.com/UrL/20"
        location ~* url/\d+ {
            return 200 'returned from ~* url/\d+';
        }

        #Regex match - case sensitive
        #Usage: "www.yamicode.com/url/yes" matches, "www.yamicode.com/Url/no" does not
        location ~ url/\w+ {
            return 200 'returned from url ';
        }

        root /path/to/static/files;
    }
}
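
  • The priority of locations is defined as: Exact match (=) -> Regex match (~, ~*) -> Prefix match. A prefix location marked with ^~ takes precedence over regex locations.
  • If several regex locations match, the first one in the configuration file is used. If several prefix locations match, the longest one is used.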
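
The priority rules above can be sketched with three locations that compete for the same requests. The host name, paths and response texts are hypothetical and chosen only to make the selected location visible.

```nginx
server {
    listen 80;
    server_name www.example.com;   # hypothetical host

    # Preferential prefix: when this is the longest matching prefix,
    # regex locations are not checked at all
    location ^~ /images/ {
        return 200 'served by ^~ /images/';
    }

    # Regex location: wins over plain prefix locations
    location ~* \.(png|jpg)$ {
        return 200 'served by the regex location';
    }

    # Plain prefix: used only when no regex location matches
    location / {
        return 200 'served by the / prefix';
    }
}
```

With this configuration, /images/logo.png is answered by the ^~ location, /photos/cat.png by the regex location, and /about by the / prefix.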

Mime types

The types directive maps file extensions to MIME types. Without it, files are sent with the default type (text/plain unless default_type says otherwise) and the browser treats them as plain text.

http{
	#define your mime types
	types{
		#mimetype extension
		text/css css;
	}
	#or just include the definitions shipped with nginx
	include mime.types;
}

Nginx Variables

In Nginx there are two types of variables:

User variables

http{
	server{
		set $var 'value';
		#...
		location /var{
			# responds with "success from value"
			return 200 "success from $var";
		}
	}
}

Built-in variables

http {
    server {
        #...
        location /uri {
            #returns the normalized request path: https://www.yamicode.com/uri/test -> /uri/test
            return 200 $uri;
        }
        location /args {
            #returns the query string: https://www.yamicode.com/args?name=test -> name=test
            return 200 $args;
        }
        location /custom-args {
            #returns the value of the "argname" query argument: https://www.yamicode.com/custom-args?argname=test -> test
            return 200 $arg_argname;
        }
        location /cookie {
            #returns the value of the cookie named "cookiename": https://www.yamicode.com/cookie
            return 200 $cookie_cookiename;
        }
        location /date {
            #returns the local date and time: https://www.yamicode.com/date
            return 200 $date_local;
        }
        location /remote-addr {
            #returns the client IP address
            return 200 $remote_addr;
        }
        location /request {
            #returns the full request line: https://www.yamicode.com/request/ddf?name=value -> GET /request/ddf?name=value HTTP/1.1
            return 200 $request;
        }
    }
}

For more variables, see the variable index on the nginx website.

Rewrite and Redirect

Redirect

A redirect is performed by returning a redirect status code (301, 302, 303, 307 or 308) together with the target URL. The browser then follows the Location header and requests the new URL itself (header redirection).

http{
    server{
        #...
        location /request{
            return 301 http://www.yamicode.com/moved/here;
        }
    }
}

Rewrite

A request URI can be modified multiple times during request processing through the use of the rewrite directive, which has one optional and two required parameters. The first (required) parameter is the regular expression that the request URI must match. The second parameter is the URI to substitute for the matching URI. The optional third parameter is a flag that can halt processing of further rewrite directives or send a redirect (code 301 or 302).

http{
    server{
        #...
        rewrite ^(/download/.*)/media/(.*)\..*$ $1/mp3/$2.mp3 last;
        rewrite ^(/download/.*)/audio/(.*)\..*$ $1/mp3/$2.ra  last;
        location /users/{
            rewrite ^/users/(.*)$ /show.php?user=$1 break;
        }
        location ~* /\w+/posts/\w+\.html {
            # the URI starts with "/", so the regex must too; the dot is escaped to match it literally
            rewrite ^/(\w+)/posts/(\w+)\.html$ /show.php?postSlug=$2&categorieSlug=$1 break;
        }
        }
    }
}

With the last flag, Nginx stops processing the current set of rewrite directives and starts a new search for a location that matches the rewritten URI. With the break flag, it also stops processing rewrite directives, but the request continues to be handled in the current location.

To reuse part of the matched URI, enclose it in parentheses (a capture group) and refer to it as $n, where n is the position of the group in the regex, counting from 1.

The difference between a redirect and a rewrite is that a redirect is sent back to the browser, which issues a new request and shows the new URL, whereas a rewrite is handled internally by Nginx and the URL seen by the client does not change.
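
The flags of the rewrite directive can be compared side by side in a short sketch. All paths below (/old/, /new/, /files/, /blog/, /articles/) are hypothetical and only serve to show where the request ends up.

```nginx
server {
    # last: stop rewriting here, then search again for a location
    # that matches the rewritten URI (/new/...)
    rewrite ^/old/(.*)$ /new/$1 last;

    location /new/ {
        # $uri holds the rewritten URI, $1 from the rewrite above is already substituted
        return 200 "handled by /new/ for $uri";
    }

    location /files/ {
        # break: stop rewriting and keep processing inside this location
        rewrite ^/files/(.*)\.jpeg$ /files/$1.jpg break;
        root /path/to/static/files;
    }

    location /blog/ {
        # permanent: answer with a 301 redirect instead of rewriting internally
        rewrite ^/blog/(.*)$ /articles/$1 permanent;
    }
}
```

A request for /old/page is rewritten to /new/page and answered by the /new/ location, while /blog/hello makes the browser issue a new request for /articles/hello.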

Logging

There are two types of logs in Nginx: the access log, which records every request that Nginx processes, and the error log, which records errors and diagnostic messages.

http{
    server{
        access_log /var/log/nginx/yamicode/access.log;
        error_log /var/log/nginx/yamicode/error.log;
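        location /assets/{
            #disable access logging for specific urls
            access_log off;
            #"error_log off;" would write to a file named "off"; discard the errors this way instead
            error_log /dev/null crit;
        }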
    }
}

Master and Worker processes

The master process reads and validates the configuration, opens the listening sockets, and starts and supervises the worker processes. The worker processes accept the connections and handle the requests. The number of worker processes depends on the configuration, but it is recommended to make it equal to the number of CPU cores.

#main context (top level of nginx.conf); use only one of the two forms
#manually defined
worker_processes 3;
#or take the number of cores
#worker_processes auto;

Cache and compression

Headers and expires

location ~* \.(css|js)$ {
    add_header Cache-Control public;
    add_header Pragma public;
    add_header Vary Accept-Encoding;
    #1 month
    expires 1M;
}

Gzip compression

gzip is a file format and a software application used for file compression and decompression.

server{
    #...
    gzip on;
    gzip_comp_level 3;
    gzip_min_length 1000;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css application/json application/javascript application/x-javascript text/xml application/xml application/xml+rss text/javascript;
}

Nginx Security

HTTPS

server {
        listen 80;
        listen 443 ssl;
        server_name ...;
        ssl_certificate /path/to/certif;
        ssl_certificate_key /path/to/privatekey;

        #redirection
        if ($scheme != "https") {
            return 301 https://$host$request_uri;
        }

        # whatever you do to return response 
        root /home/admin/web;
}
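
A commonly used alternative to the if block above is to give plain HTTP its own server block whose only job is to redirect. This is a sketch under assumptions: www.example.com is a placeholder host name, and the certificate paths are the same placeholders as in the configuration above.

```nginx
# Plain-HTTP server: does nothing but redirect to HTTPS
server {
    listen 80;
    server_name www.example.com;   # hypothetical host
    return 301 https://$host$request_uri;
}

# HTTPS server: serves the actual content
server {
    listen 443 ssl;
    server_name www.example.com;
    ssl_certificate     /path/to/certif;
    ssl_certificate_key /path/to/privatekey;

    root /home/admin/web;
}
```

Splitting the two keeps the redirect out of the request path of the HTTPS server, so no condition has to be evaluated for requests that already arrive over TLS.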

Basic Auth

server{
        location /admin{
            auth_basic "Secure area";
            auth_basic_user_file /etc/nginx/.htpasswd;
        }
}

Some additional Nginx directives

events {
    #maximum number of simultaneous connections each worker process can open
    worker_connections 1024;
}
http {
    #buffer size for post submissions
    client_body_buffer_size 10k;
    #maximum size of a request body. if exceeded = status 413 "request entity too large"
    client_max_body_size 8m;
    #limit the request rate per remote ip address to slow down brute force attacks
    #rate=1r/s = 60 requests/minute, tracked in a 60 MB shared memory zone
    limit_req_zone $binary_remote_addr zone=MYZONE:60m rate=1r/s;
    server {
        #burst = don't reject the next n requests above the rate, queue them until they can be served
        limit_req zone=MYZONE burst=5;
    }
}

Reverse proxy

Reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client, appearing as if they originated from the proxy server itself.

server {
      listen      80;
      server_name your.server.name;
      location / {
          proxy_pass  http://<host>:<port>/;
      }
}
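
By default the proxied server sees Nginx as the client. A minimal sketch of passing the original request details along uses proxy_set_header; the backend address 127.0.0.1:3000 is an assumption for illustration, and the X-Forwarded-* header names are a widespread convention that the backend application has to be configured to read.

```nginx
server {
    listen      80;
    server_name your.server.name;

    location / {
        proxy_pass http://127.0.0.1:3000/;   # hypothetical backend

        # forward the host name requested by the client
        proxy_set_header Host $host;
        # forward the client address and the chain of proxies
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # tell the backend whether the client used http or https
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```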

Load balancer

A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancing across multiple application instances is a commonly used technique for optimizing resource utilization, maximizing throughput, reducing latency, and ensuring fault‑tolerant configurations.

http {
    upstream backend {
        server backend1.example.com weight=5;
        server backend2.example.com;
        server 192.0.0.1 backup;
    }
    server {
        location / {
            proxy_pass http://backend;
        }
    }
}
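
The upstream block above uses the default method, weighted round robin. As a sketch of switching to another method, the variant below sends each request to the server with the fewest active connections; the server names are the same placeholders as in the example above.

```nginx
http {
    upstream backend {
        # pick the server with the fewest active connections
        least_conn;
        server backend1.example.com;
        server backend2.example.com;
        # used only when the other servers are unavailable
        server 192.0.0.1 backup;
    }

    server {
        location / {
            proxy_pass http://backend;
        }
    }
}
```

Round robin is usually sufficient when requests cost about the same; least_conn tends to help when some requests take much longer than others.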