Night Hour

Reading under a cool night sky ... a quiet, contemplative night ...

Developing an Nginx URL Whitelisting Module

Premature optimization is the root of all evil.
Donald Knuth.


29 Oct 2019


Introduction

One of the challenges of securing web applications and websites is preventing the accidental exposure of sensitive parts of an application or website, such as administrative interfaces. A common technique is to blacklist an application path and prevent access to resources starting with that path; other techniques include disabling unneeded administrative interfaces, or removing unwanted features. This article shows how to develop an Nginx module that allows access only to whitelisted URLs or web resources. Any URLs that are not in the whitelist will be blocked.

This approach offers better protection because access to every resource is denied by default. A security administrator or web developer has to explicitly whitelist each web resource to enable access. Whitelisting can also mitigate the risk from application frameworks that expose sensitive interfaces by mistake, as in the case of vulnerable Spring Boot actuators, and from sensitive files that are accidentally uploaded to a website.

Whitelisting may sound tedious, but for a web application or web-based API, the developers know exactly which web resources should be accessible to normal end users, so it is relatively easy to create a list of URLs to be whitelisted. A whitelisting approach may not suit every use case, but it is useful where security is essential, because it can greatly reduce the attack surface of a website or web application.

Design and Approach

One of the first things to consider is which data structures such a whitelist requires. Nginx exposes an HTTP request structure (ngx_http_request_t) to modules. It contains a uri field (ngx_str_t) holding the URL string starting from the web root. The whitelisted URLs could be stored as an array of Nginx strings (ngx_str_t) and compared against the uri field in a loop.

The problem with an array is that if the number of whitelisted URLs is large, many comparisons are required. A hash table would give a faster lookup, but its memory requirement and hashing function have to be considered. Here, we will use a tree-like data structure, similar to a file directory tree. It should offer a faster lookup than an array of URL strings, and we don't have to worry much about memory allocation or a hashing function.
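
For comparison, the array approach can be sketched in plain C outside Nginx. This is only an illustration of the baseline being rejected: wl_str_t is a hypothetical stand-in loosely modelled on ngx_str_t, and wl_array_contains() is a made-up name, not part of the module.

```c
#include <stddef.h>
#include <string.h>

/* Length-prefixed string, loosely modelled on ngx_str_t (hypothetical). */
typedef struct {
    size_t      len;
    const char *data;
} wl_str_t;

/* Linear scan: returns 1 if uri matches an entry exactly, 0 otherwise.
 * In the worst case every entry is compared, so the cost grows with
 * the size of the whitelist. */
static int
wl_array_contains(const wl_str_t *list, size_t n, const char *uri, size_t len)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (list[i].len == len && memcmp(list[i].data, uri, len) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A request for a URL that is not whitelisted is the worst case here, since it is compared against all n entries before being rejected.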

The following diagram illustrates this.

Fig 1. Directory like Structure

A URI or URL is broken into parts, starting from the root "/". The root has children, which are either subdirectories or files. Each subdirectory in turn has its own children, either files or subdirectories. A subdirectory part ends with a "/"; for example, "scripts/".

To whitelist http://www.nighthour.sg/articles/index.html, the hostname portion is not included. The syntax starts from the web root "/", followed by "articles/" and then "index.html". The module requires a URL syntax like this

/articles/index.html

This will be further broken down into the following parts in the tree structure.

/ articles/ index.html

If a URL ends with a subdirectory, a trailing forward slash is required when specifying to the module that the subdirectory should be whitelisted and accessible. For example, https://www.nighthour.sg/articles/. The URL syntax required for the module to whitelist this will be

/articles/

This will give the following parts in the directory tree

/ articles/

If a web resource ends with a file, the trailing slash is not required. For example, https://www.nighthour.sg/myapi/myapplication. The URL syntax for the module will be

/myapi/myapplication

This will give the following parts in the tree.

/ myapi/ myapplication
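
The splitting rule shown in these examples can be sketched in standalone C. The function names are hypothetical and the code is a simplification: unlike the module, it does not skip repeated "/" characters.

```c
#include <stddef.h>

/* Returns the length of the path segment starting at p: everything up to
 * and including the next '/', or up to the end of the string if no '/'
 * remains. Returns 0 for an empty string. */
static size_t
wl_segment_len(const char *p)
{
    size_t n = 0;

    while (p[n] != '\0') {
        if (p[n++] == '/') {
            break;
        }
    }
    return n;
}

/* Counts the segments in a path. "/articles/index.html" has three:
 * "/", "articles/" and "index.html". */
static size_t
wl_count_segments(const char *path)
{
    size_t count = 0, n;

    while ((n = wl_segment_len(path)) > 0) {
        count++;
        path += n;
    }
    return count;
}
```

Because a directory segment keeps its trailing slash, "articles/" and a file named "articles" are different segments and can never be confused during matching.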

We will use a node structure that represents one part of the URL. A node may have child nodes, and it contains a string holding its path segment, for example "/" or "scripts/". Using these nodes we can build a tree-like structure that represents all the whitelisted URLs of a website or web application.

For each HTTP request, the module compares the uri string against the tree structure, part by part. As soon as a part doesn't match, we know the URL is not in the whitelist, so a directory-tree-like structure minimizes the comparisons required. Conversely, if all parts match, the URL is in the whitelist and access should be granted. The module returns an HTTP 404 (Not Found) error for URLs that are not whitelisted.

Note that the module doesn't compare against the URL query string or query parameters. For example,

https://www.nighthour.sg/myapi/myapplication?queryid=338899&type=abc

The portion starting from the question mark is the query string. The module does not use it when checking the whitelist. When specifying a URL for the module to whitelist, do not include the query string.
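
Inside Nginx the module never has to strip the query string itself, because the request structure already keeps the path in the uri field and the query string in a separate args field. Purely as an illustration of which part is compared, the split can be sketched in standalone C; wl_path_len() is a hypothetical helper, not part of the module.

```c
#include <stddef.h>
#include <string.h>

/* Returns the length of the path portion of a request target: the number
 * of bytes before the first '?', or the full length if there is none. */
static size_t
wl_path_len(const char *target)
{
    const char *q = strchr(target, '?');

    return q ? (size_t) (q - target) : strlen(target);
}
```

For the example URL above, only the first 20 bytes, "/myapi/myapplication", would take part in the whitelist check.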

The URI whitelisting module can be used on a site hosted directly by Nginx or with Nginx configured as a reverse proxy. The reverse proxy option is particularly useful as an additional layer of protection for web applications or API endpoints.

Implementation

This section will run through the source code of the Nginx URL whitelisting Module. It will not explain the basics of writing Nginx modules. Refer to the Nginx Development Guide for details on Nginx development. Another good beginner resource is Emiller's Guide to Nginx Module Development.

The full source code of the module is available at the GitHub link at the end of the article.

The code snippet below shows a few macro constants and the node data structure for building the URI tree. NGX_WHL_INIT_CHIDREN_SZ is the initial capacity of each node's children array. The array is expanded when necessary, up to the maximum defined in NGX_WHL_MAX_CHILDREN.

NGX_WHL_MAXPATHSZ sets the maximum length for a URL. It is currently defined as 2048. A web administrator may want to reduce this number if the web application or website is known not to have URLs that long. For example, with a value of 100, any URL that is 100 or more characters long will be blocked by the module with an HTTP 404 error.

#define  NGX_WHL_INIT_CHIDREN_SZ  64
#define  NGX_WHL_MAXPATHSZ  2048
#define  NGX_WHL_MAX_CHILDREN  65536

typedef struct ngx_whl_pnode_s  ngx_whl_pnode_t;

struct ngx_whl_pnode_s
{
    ngx_str_t *segment;
    size_t num_child;
    ngx_whl_pnode_t **children;
    size_t maxchild;
    size_t end_slash_allowed;
};

The following shows the configuration structure of the module. This structure is used by Nginx for storing the configuration options. The uri_tree variable holds the URL tree. This tree is built as Nginx reads in the configuration options.

bp_extens is an array of file extensions that the module bypasses. A list of extensions such as jpg and gif can be provided in a bypass configuration option. The module skips URLs with these extensions and allows access to them. The enabled flag sets whether the module is turned on or off.

/* Configuration struct */
typedef struct {
    ngx_flag_t enabled;
    ngx_array_t *bp_extens;
    ngx_whl_pnode_t *uri_tree; 
} ngx_http_uri_whitelist_loc_conf_t; 

The following shows the code snippet for the module configuration directives. The wh_list directive can be set to on or off to enable or disable the module. The wh_list_uri directive takes a URL string starting with "/"; this is a URL to be whitelisted. The wh_list_bypass directive specifies the extensions that the module bypasses.

The functions that handle each directive in the configuration file are specified in this ngx_command_t array as well. ngx_http_wh_list_cfg() processes each wh_list_uri directive and builds up the URI tree. ngx_http_wh_list_bypass_cfg() populates the bypass array with the file extensions that the module skips.

/* Module Directives */
static ngx_command_t  ngx_http_uri_whitelist_commands[] = {

    { ngx_string("wh_list"),
      NGX_HTTP_LOC_CONF | NGX_CONF_FLAG,
      ngx_conf_set_flag_slot,
      NGX_HTTP_LOC_CONF_OFFSET,
      offsetof(ngx_http_uri_whitelist_loc_conf_t, enabled),
      NULL },
      
    { ngx_string("wh_list_uri"),
      NGX_HTTP_LOC_CONF | NGX_CONF_TAKE1,
      ngx_http_wh_list_cfg,
      NGX_HTTP_LOC_CONF_OFFSET,
      0,
      NULL },
      
    { ngx_string("wh_list_bypass"),
      NGX_HTTP_LOC_CONF | NGX_CONF_1MORE,
      ngx_http_wh_list_bypass_cfg,
      NGX_HTTP_LOC_CONF_OFFSET,
      0,
      NULL },
      
    ngx_null_command
};

The following are the code snippets for the Module context and Module definition. This article will not go into details on what these are. Refer to the earlier links on Nginx development for more information.

The ngx_http_uri_whitelist_init() function initializes the module after the configuration has been read. ngx_http_uri_whitelist_create_loc_conf() and ngx_http_uri_whitelist_merge_loc_conf() are for creating and merging the configuration structure.

/* Module Context */
static ngx_http_module_t  ngx_http_uri_whitelist_module_ctx = {
    NULL,                                  /* preconfiguration */
    ngx_http_uri_whitelist_init,              /* postconfiguration */

    NULL,                                  /* create main configuration */
    NULL,                                  /* init main configuration */

    NULL,                                  /* create server configuration */
    NULL,                                  /* merge server configuration */

    ngx_http_uri_whitelist_create_loc_conf,/* create location configuration */
    ngx_http_uri_whitelist_merge_loc_conf  /* merge location configuration */
};


/* Module Definition */
ngx_module_t  ngx_http_uri_whitelist_module = {
    NGX_MODULE_V1,
    &ngx_http_uri_whitelist_module_ctx,       /* module context */
    ngx_http_uri_whitelist_commands,          /* module directives */
    NGX_HTTP_MODULE,                       /* module type */
    NULL,                                  /* init master */
    NULL,                                  /* init module */
    NULL,                                  /* init process */
    NULL,                                  /* init thread */
    NULL,                                  /* exit thread */
    NULL,                                  /* exit process */
    NULL,                                  /* exit master */
    NGX_MODULE_V1_PADDING
};    

The following is the code snippet for the ngx_http_uri_whitelist_init() function. It registers the module handler, ngx_http_uri_whitelist_handler(), with the Nginx HTTP access phase. In this phase, the handler can choose whether to accept or reject an HTTP request.

/* Module initialization */
static ngx_int_t
ngx_http_uri_whitelist_init(ngx_conf_t *cf)
{
    ngx_http_handler_pt        *h;
    ngx_http_core_main_conf_t  *cmcf;

    cmcf = ngx_http_conf_get_module_main_conf(cf, ngx_http_core_module);

    /* Add our module handler to the HTTP ACCESS phase */
    h = ngx_array_push(&cmcf->phases[NGX_HTTP_ACCESS_PHASE].handlers);
    if (h == NULL) {
        return NGX_ERROR;
    }

    *h = ngx_http_uri_whitelist_handler;
    
    return NGX_OK;
}

The following is the code snippet for the module handler, ngx_http_uri_whitelist_handler(). The handler function checks whether the whitelist module is enabled. If it is disabled, the handler passes control back to Nginx; otherwise it checks the URL for a bypass file extension. If an extension matches, it passes control back to Nginx.

The handler then calls the ngx_http_wh_check_path_exists() function to see if the URL is in the whitelist URI tree. It returns an HTTP 404 error if the URL is not whitelisted. If the URL is whitelisted, control is passed back to Nginx.

/* Module Handler */
static ngx_int_t
ngx_http_uri_whitelist_handler(ngx_http_request_t *r)
{
    size_t                             i; 
    ngx_str_t                          *ext; 
    ngx_http_uri_whitelist_loc_conf_t  *slcf;
    
#if WHL_DEBUG    
    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                "[URI_WHITELIST]: %V",&r->uri);
    ngx_log_debug1(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                "[URI_WHITELIST] extension: %V",&r->exten);
#endif

    if (r->uri.len == 0) {
        return NGX_HTTP_BAD_REQUEST;
    }

    slcf = ngx_http_get_module_loc_conf(r, ngx_http_uri_whitelist_module);
    
    if (slcf == NULL) {
        return NGX_HTTP_INTERNAL_SERVER_ERROR;
    }
    
    if (slcf->enabled != 1) {
        ngx_log_error(NGX_LOG_WARN, r->connection->log, 0,
            "[URI_WHITELIST] : White list module disabled !"); 
        return NGX_DECLINED;
    }
    
 
    /* Check for extensions bypass */
    ext = slcf->bp_extens->elts;
    for (i=0; i < slcf->bp_extens->nelts; i++) {
        
        if (r->exten.len == ext[i].len 
            && ngx_strncmp(r->exten.data, ext[i].data, r->exten.len) == 0) 
        {
            return NGX_DECLINED; 
        }
        
    }
    
    
    if (!ngx_http_wh_check_path_exists(r->uri.data, 
        r->uri.len, slcf->uri_tree)) 
    {
        /* If uri is not present in whitelist */
        ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
            "[URI_WHITELIST] : Access Denied for [ %V ] ", &r->uri);
        return NGX_HTTP_NOT_FOUND;
    }
    
                   
    return NGX_DECLINED;
}
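
The bypass check in the handler relies on Nginx having already extracted the extension into r->exten. A rough standalone sketch of the same idea is shown below. It is an assumption-laden approximation rather than Nginx's own extension logic, and wl_extension() and wl_is_bypassed() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the extension of the last path segment (the text
 * after its final '.'), or NULL if that segment contains no '.'. */
static const char *
wl_extension(const char *uri)
{
    const char *slash = strrchr(uri, '/');
    const char *dot = strrchr(slash ? slash : uri, '.');

    return dot ? dot + 1 : NULL;
}

/* Returns 1 if the extension of uri exactly matches one of the n bypass
 * extensions (case-sensitive, written without a leading dot), else 0. */
static int
wl_is_bypassed(const char *uri, const char *const *bypass, size_t n)
{
    const char *ext = wl_extension(uri);
    size_t      i;

    if (ext == NULL) {
        return 0;
    }
    for (i = 0; i < n; i++) {
        if (strcmp(ext, bypass[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

As in the handler, the comparison is exact, so "JPG" does not match a bypass entry of "jpg". A bypassed extension skips the whitelist entirely, so the bypass list is best kept to static content types.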

The following code snippet shows the functions for building up the URI tree. The ngx_http_wh_create_node() function creates a new node. The ngx_http_wh_add_child() function adds a child node to a parent. If the path segment passed in is a single "/", ngx_http_wh_add_child() returns the parent; this skips repeated "/" characters in the URL. Otherwise it returns the child node, either because the child already exists or because it was added successfully to the parent node.

If the parent node runs out of space for storing child nodes, ngx_http_wh_add_child() calls the ngx_http_wh_resize_children() function, which resizes the children array of the parent node, doubling the capacity each time. The maximum number of child nodes is limited to NGX_WHL_MAX_CHILDREN (65536), defined earlier in the source.

The ngx_http_wh_add_path() function adds a URL or URI to the URI tree. It loops through the URL string, breaking it into its constituent parts and adding each to the URI tree.

/* Creates a path node based on a part of the uri */
static ngx_whl_pnode_t * 
ngx_http_wh_create_node(const u_char* path,  size_t plen, ngx_conf_t *cf)
{
    size_t           sz;
    ngx_str_t        *sgmt;
    ngx_whl_pnode_t  *node; 
    
    if (path == NULL)
        return NULL;
    
    if (plen == 0 || plen >= NGX_WHL_MAXPATHSZ) 
        return NULL;
    
    sgmt = ngx_pcalloc(cf->pool, sizeof(ngx_str_t));
    if (sgmt == NULL) { 
        return NULL;
    }
    
    sz = plen + 1;
    sgmt->data = ngx_pcalloc(cf->pool, sz * sizeof(u_char));
    if (sgmt->data == NULL) {
        return NULL;
    }
    
    /* Copy only plen bytes; the zeroed buffer keeps the string NUL-terminated */
    ngx_memcpy(sgmt->data, path, plen);
    sgmt->len = plen;
        
    node = ngx_pcalloc(cf->pool, sizeof(ngx_whl_pnode_t));
    if (node == NULL) {
        return NULL;
    }
    
    node->children = ngx_pcalloc(cf->pool, 
        NGX_WHL_INIT_CHIDREN_SZ * sizeof(ngx_whl_pnode_t *));
        
    if (node->children == NULL) {
        return NULL;
    }
    
    node->segment = sgmt; 
    node->num_child = 0;
    node->maxchild = NGX_WHL_INIT_CHIDREN_SZ;
    node->end_slash_allowed = 0;
    
    return node; 
}

/* Adds a uri path to the uri tree */
static ngx_whl_pnode_t *
ngx_http_wh_add_child(const u_char *path, ngx_whl_pnode_t *parent, 
    ngx_conf_t *cf)
{
    size_t           plen, i;
    ngx_whl_pnode_t  *node;
    
    if (path == NULL || parent == NULL) {
        return NULL;
    }
  
    plen = ngx_strlen(path);
    if (plen == 0 || plen >= NGX_WHL_MAXPATHSZ) {
        return NULL; 
    }
    
    /* Ignore additional '/' */   
    if (plen == 1 && ngx_strncmp(path, "/", plen) == 0) {
        return parent;
    }
      
    for (i = 0; i < parent->num_child; i++) {
    /* check if segment path already exists */   
        node = parent->children[i];
        if(node->segment->len == plen && 
            ngx_strncmp(path, node->segment->data, plen) == 0) 
        {        
            return node;
        }
    }
    
    /* uri segment path does not exists allocate new child */
    node = ngx_http_wh_create_node(path, plen, cf);
    
    if (node == NULL) {
        return NULL;
    }
    
    if (i >= parent->maxchild) {
        if (!ngx_http_wh_resize_children(parent, cf)) {
            return NULL;
        }
    }
    
    parent->children[i] = node;
    parent->num_child ++;
    
    return node;    
}


/* Resizes a node children array if original space is insufficient */
static size_t
ngx_http_wh_resize_children(ngx_whl_pnode_t *parent, ngx_conf_t *cf)
{
    size_t           new_sz, i;
    ngx_whl_pnode_t  **old, **new; 
    
    if (parent == NULL) {
        return 0;
    }
    
    new_sz = parent->maxchild * 2;
    
    if (new_sz > NGX_WHL_MAX_CHILDREN) {
        return 0;
    }
    
    new = ngx_pcalloc(cf->pool, new_sz * sizeof(ngx_whl_pnode_t*));
    
    if (new == NULL) {
        return 0;
    }
    
    old = parent->children; 
    
    for (i=0; i<parent->num_child; i++) {
        new[i] = old[i];
    }
    
    parent->children = new;
    parent->maxchild = new_sz; 
    old = NULL; 
    
    return 1;
}


/* Adds a uri to the uri tree */
static size_t
ngx_http_wh_add_path(u_char *path, ngx_whl_pnode_t *root, ngx_conf_t *cf)
{
    size_t           plen, last, index;
    u_char           *p, c, tmp[NGX_WHL_MAXPATHSZ];
    ngx_whl_pnode_t  *node; 
    
    if (path == NULL || root == NULL) {
        return 0;
    }
    
    plen = ngx_strlen(path);
    if (plen == 0 || plen >= NGX_WHL_MAXPATHSZ) {
        return 0;
    }
    
    p = path; 
    index = last = 0;
    node = root; 
  
    while ((c=*p++) != '\0') {
    
        switch(c) {            
        case '/':
            if (index + 1 >= NGX_WHL_MAXPATHSZ) {
                return 0;
            }
            
            tmp[index] = c;
            index++;
            
            tmp[index] = '\0';
            node = ngx_http_wh_add_child(tmp, node, cf);
            
            if (node == NULL) {
                return 0; 
            }
            
            index = last = 0;     
            break;
            
        default:
            if (index >= NGX_WHL_MAXPATHSZ) {
                return 0; 
            }
            
            tmp[index] = c; 
            index++; 
            last = 1; 
        
        }
       
    }
    
    if (last) {
        if (index >= NGX_WHL_MAXPATHSZ) {
            return 0; 
        }
        
        tmp[index] = '\0';
        node = ngx_http_wh_add_child(tmp, node, cf);
        if (node == NULL) {
            return 0;
        }
        
    } else {
        /* node ends with '/' */
        node->end_slash_allowed = 1;
    }
    
    return 1;
    
}
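
The growth policy used by ngx_http_wh_resize_children() (double the capacity, refuse to grow past a fixed maximum) can be tried out in isolation with the sketch below. It is not the module's code: it uses realloc() instead of an Nginx pool, and the wl_* names and the small WL_INIT_CAP and WL_MAX_CAP values are hypothetical, chosen so the limit is easy to reach.

```c
#include <stddef.h>
#include <stdlib.h>

#define WL_INIT_CAP  4      /* hypothetical; the module starts at 64 */
#define WL_MAX_CAP   16     /* hypothetical; the module caps at 65536 */

typedef struct {
    void   **items;
    size_t   count;
    size_t   cap;
} wl_vec_t;

/* Appends item, doubling the capacity when the array is full. Returns 1 on
 * success, or 0 if allocation fails or the doubled capacity would exceed
 * WL_MAX_CAP. */
static int
wl_vec_push(wl_vec_t *v, void *item)
{
    if (v->count == v->cap) {
        size_t   new_cap = v->cap ? v->cap * 2 : WL_INIT_CAP;
        void   **grown;

        if (new_cap > WL_MAX_CAP) {
            return 0;
        }
        grown = realloc(v->items, new_cap * sizeof(void *));
        if (grown == NULL) {
            return 0;
        }
        v->items = grown;
        v->cap = new_cap;
    }
    v->items[v->count++] = item;
    return 1;
}
```

Doubling keeps the number of reallocations logarithmic in the number of children. One difference worth noting is that the module allocates the larger array from the configuration pool and copies the pointers across, so the old array is simply left in the pool rather than freed.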

The following is the code snippet for the ngx_http_wh_list_cfg() function. This function is called to process each wh_list_uri directive containing the URL to be whitelisted. It calls the ngx_http_wh_add_path() function to add each URL to the whitelist URI tree. It also creates the root node using ngx_http_wh_create_node() function, if it doesn't exist.

/* Process the white list uri configuration */
static char *
ngx_http_wh_list_cfg(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    size_t                             len;
    u_char                             *uri;
    ngx_str_t                          *value; 
    ngx_whl_pnode_t                    *root;
    ngx_http_uri_whitelist_loc_conf_t  *slcf;
    
    if (cf->args->nelts < 2) {
        return NGX_CONF_ERROR;
    }
    
    value = cf->args->elts;
    uri = value[1].data; 
    len = value[1].len; 
    
    if (uri[0] != '/') {
        ngx_log_error(NGX_LOG_EMERG, cf->log, 0, "[URI_WHITELIST]: "
        "Error uri must start with '/'");
        return NGX_CONF_ERROR;
    }
    
    if (uri[len] != '\0') {
        /* Escape the backslash so the message does not embed a real NUL */
        ngx_log_error(NGX_LOG_EMERG, cf->log, 0, "[URI_WHITELIST]: "
        "Error uri does not end with '\\0'");
        return NGX_CONF_ERROR;
    }
    
    slcf = conf; 
    if (slcf->uri_tree == NULL) {
        slcf->uri_tree = ngx_http_wh_create_node( (u_char *)"/", 1, cf);
        if (slcf->uri_tree == NULL) {
            return NGX_CONF_ERROR;
        }
    } 
  
    root = slcf->uri_tree;
    
    if (!ngx_http_wh_add_path(uri, root, cf)) {
        ngx_log_error(NGX_LOG_EMERG, cf->log, 0, "[URI_WHITELIST]: "
            "Error cannot add uri to whitelist");
        return NGX_CONF_ERROR;
    }
    
    return NGX_CONF_OK;
}

The ngx_http_wh_check_path_exists() function, shown in the code snippet below, checks whether a URL string is present in the URI tree. It breaks a URL string down into its parts. It checks that the URL string begins with a "/" (there must always be a root node). Then, for each subsequent part, it checks whether the current parent node contains a matching child.

If the URL ends at the root node "/" or at a node that ends with a slash, such as "scripts/", the end_slash_allowed flag of that node is checked. When end_slash_allowed is set to 1, the URL is whitelisted; otherwise it is not. The flag is set only when there is an explicit whitelist directive (wh_list_uri) for a URL that ends with "/".

This is required because when a URL like "/mydirectory/myfile.php" is whitelisted, the nodes "/", "mydirectory/" and "myfile.php" are created in the URI tree. However, this doesn't mean that the URLs "/" or "/mydirectory/" should be accessible, since these two URLs are not whitelisted explicitly. To make "/" or "/mydirectory/" accessible, they must be specified explicitly using the whitelist directive.

Notice that in the parsing code, there is no handling of "./" or "../". This is not necessary in our case as Nginx normalizes the request URL before passing it to the module.

The ngx_http_wh_check_path_exists() function calls ngx_http_wh_check_path_seg() to check that a child node exists under a parent node. We will not go through the ngx_http_wh_check_path_seg() function. Refer to the GitHub link at the end of the article for the full module source code.

/* Checks if a uri path is present in the uri tree */
static size_t 
ngx_http_wh_check_path_exists(u_char* path, size_t len, ngx_whl_pnode_t *root)
{
    size_t           plen, index, last;
    u_char           c, *p, tmp[NGX_WHL_MAXPATHSZ]; 
    ngx_whl_pnode_t  *node;
    
    if (path == NULL || root == NULL) {
        return 0;
    }
    
    
    if (len == 0 || len >= NGX_WHL_MAXPATHSZ) {
        return 0;
    }
    
    p = path; 
   
    c = *p++;
    if( c != '/') {
        return 0;            
    }
    
    plen = len - 1; 
        
    node = root; 
    index = last = 0; 
    
    while (plen-- > 0) {
        
        c = *p++;

        switch(c) {            
        case '/':
            if (index + 1 >= NGX_WHL_MAXPATHSZ) {
                return 0;
            }
            
            tmp[index] = c;
            index++;
            tmp[index] = '\0';
            
            node = ngx_http_wh_check_path_seg(tmp, index, node); 
            if (node == NULL) {
                return 0;
            }
            
            index = last = 0;
            break;
        
        default:
            last = 1;
            
            if (index >= NGX_WHL_MAXPATHSZ) {
                return 0; 
            }
            tmp[index] = c;
            index++;
            
        }
        
    }
    
    
    if (last) {
        if (index >= NGX_WHL_MAXPATHSZ) {   
            return 0; 
        }
        
        tmp[index]='\0';
        node = ngx_http_wh_check_path_seg(tmp, index, node); 
        
        if (node == NULL) {
            return 0; 
        }
        
    } else {
        /* node ends with '/' */
        if (node->end_slash_allowed == 0) {
            return 0; 
        }
        
    }
    
    return 1; 
    
}
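
The rule that a directory URL is accessible only when it has been whitelisted explicitly can be tried outside Nginx with the simplified sketch below. It is a stand-in, not the module's code: it allocates with calloc() instead of Nginx pools, uses a small fixed fan-out (the hypothetical WL_MAX_CHILDREN), never frees nodes, and does not skip repeated slashes. All wl_* names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define WL_MAX_CHILDREN 8   /* hypothetical fixed fan-out, for brevity */

typedef struct wl_node_s wl_node_t;

struct wl_node_s {
    char      *segment;             /* e.g. "mydirectory/" or "myfile.php" */
    wl_node_t *children[WL_MAX_CHILDREN];
    size_t     num_child;
    int        end_slash_allowed;   /* set only by an explicit ".../" entry */
};

/* Allocates a zeroed node holding a copy of seg[0..len). */
static wl_node_t *
wl_node_new(const char *seg, size_t len)
{
    wl_node_t *n = calloc(1, sizeof(wl_node_t));

    if (n == NULL) {
        return NULL;
    }
    n->segment = calloc(len + 1, 1);
    if (n->segment == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->segment, seg, len);
    return n;
}

/* Finds the child of parent whose segment equals seg[0..len). If it is
 * missing and create is non-zero, a new child is added. */
static wl_node_t *
wl_child(wl_node_t *parent, const char *seg, size_t len, int create)
{
    size_t     i;
    wl_node_t *n;

    for (i = 0; i < parent->num_child; i++) {
        n = parent->children[i];
        if (strlen(n->segment) == len && memcmp(n->segment, seg, len) == 0) {
            return n;
        }
    }
    if (!create || parent->num_child == WL_MAX_CHILDREN) {
        return NULL;
    }
    n = wl_node_new(seg, len);
    if (n != NULL) {
        parent->children[parent->num_child++] = n;
    }
    return n;
}

/* Walks path (which must start with '/') from root, segment by segment.
 * With create != 0 it adds missing nodes and marks a trailing-slash entry.
 * With create == 0 it returns 1 only if the path is whitelisted. */
static int
wl_walk(wl_node_t *root, const char *path, int create)
{
    wl_node_t  *node = root;
    const char *p, *start;

    if (path[0] != '/') {
        return 0;
    }
    p = start = path + 1;           /* the leading '/' is the root itself */
    while (*p != '\0') {
        if (*p++ == '/') {
            node = wl_child(node, start, (size_t) (p - start), create);
            if (node == NULL) {
                return 0;
            }
            start = p;
        }
    }
    if (p != start) {               /* the final segment is a file */
        return wl_child(node, start, (size_t) (p - start), create) != NULL;
    }
    if (create) {                   /* the path ended with '/' */
        node->end_slash_allowed = 1;
    }
    return node->end_slash_allowed;
}
```

With a root created by wl_node_new("/", 1), calling wl_walk(root, "/mydirectory/myfile.php", 1) whitelists the file. A later wl_walk(root, "/mydirectory/", 0) still returns 0 until "/mydirectory/" itself is added with create set to 1.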

Installation and Testing

To install the module, obtain a copy of the module source code from GitHub.

git clone https://github.com/ngchianglin/ngx_http_uri_whitelist_module.git

To verify the integrity and signature of the module source code, refer to this link. Obtain a copy of my public key; follow the page instructions on how to import it and verify the git commit.

Download the latest stable nginx source code from https://nginx.org. Verify the integrity of the source code using the PGP signature.

wget https://nginx.org/download/nginx-1.16.1.tar.gz

The downloaded gzipped file should have the following SHA256 checksum.

f11c2a6dd1d3515736f0324857957db2de98be862461b5a542a3ac6188dbe32b nginx-1.16.1.tar.gz

Extract the nginx source and compile nginx with the URI whitelisting module. Install it into /usr/local/nginx.

tar -zxvf nginx-1.16.1.tar.gz
cd nginx-1.16.1/
./configure --with-cc-opt="-Wextra -Wformat -Wformat-security -Wformat-y2k -Werror=format-security -fPIE -O2 -D_FORTIFY_SOURCE=2 -fstack-protector-all" --with-ld-opt="-pie -Wl,-z,relro -Wl,-z,now -Wl,--strip-all" --add-module=../ngx_http_uri_whitelist_module
make
sudo make install

We can now test the URI whitelist module. It is assumed that there is already an apache website set up on the system and apache httpd is configured to listen on port 8080. We can configure nginx as a reverse proxy for the apache website. The module can also be used on a website hosted directly by nginx.

Edit the /usr/local/nginx/conf/nginx.conf with the following.

user  nginx nginx;
worker_processes  1;
error_log  /var/log/nginx/error.log warn;
pid        /var/log/nginx/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for" "$gzip_ratio"';

    sendfile        on;
    keepalive_timeout  65;
    server_tokens off;
    
    proxy_cache_path /usr/local/nginx/cache levels=1:2 keys_zone=webcache:2m max_size=150m;
    proxy_cache_key "$scheme$request_method$host$request_uri$is_args$args";
    proxy_cache_valid 200 302 30m;
    proxy_cache_valid 404 1m;

    gzip  on;
    
    map $sent_http_content_type $cachemap {
        default    no-store;
        ~text/html  "private, max-age=900";
        text/plain  "private, max-age=900";
        text/css    "private, max-age=7776000";
        application/javascript "private, max-age=7776000";
        ~image/    "private, max-age=7776000";
    }

    server {
        listen 80;
        server_name localhost;
        root   /opt/nginx/www;
        
        charset utf-8;
        access_log  /var/log/nginx/access.log  main;
        
        location / {
            
            index  index.html index.htm;
            
            proxy_cache webcache;
            proxy_cache_bypass $http_cache_control;
            
            proxy_set_header HOST $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_pass http://127.0.0.1:8080;
            
            add_header Cache-Control $cachemap;
            wh_list off;
        }

        

        # redirect server error pages to the static page /50x.html
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

    }

}

Create a web root directory for nginx.

sudo mkdir -p /opt/nginx/www

Make sure that the nginx user and group are present, otherwise create them.

sudo mkdir -p /opt/nginx/home
sudo chmod 750 /opt/nginx/home
sudo groupadd -g 9870 nginx
sudo useradd -d /opt/nginx/home -u 9870 -g 9870 -s /bin/false nginx

On the Apache website, make sure that you have an index.html with some test content inside. Create another test file, test.txt, and put in some test content. The Nginx URI whitelist module is currently turned off in nginx.conf, so these URLs should be accessible through the Nginx proxy. Make sure Apache httpd is running and listening on port 8080. Start up Nginx.

sudo /usr/local/nginx/sbin/nginx

Access http://localhost, http://localhost/index.html and http://localhost/test.txt. All three URLs should be accessible and return the right content.

devuser1@devmachine:~$ curl -i http://localhost
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 29 Oct 2019 04:38:06 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 170
Connection: keep-alive
Last-Modified: Tue, 29 Oct 2019 04:36:42 GMT
Vary: Accept-Encoding
Cache-Control: private, max-age=900
Accept-Ranges: bytes

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Testing html page</title>
</head>
<body>
<p>
This is a test for Nginx URI whitelisting !
</p>
</body>
</html>


devuser1@devmachine:~$ curl -i http://localhost/index.html
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 29 Oct 2019 04:43:04 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 170
Connection: keep-alive
Last-Modified: Tue, 29 Oct 2019 04:36:42 GMT
Vary: Accept-Encoding
Cache-Control: private, max-age=900
Accept-Ranges: bytes

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Testing html page</title>
</head>
<body>
<p>
This is a test for Nginx URI whitelisting !
</p>
</body>
</html>


devuser1@devmachine:~$ curl -i http://localhost/test.txt
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 29 Oct 2019 04:43:50 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 56
Connection: keep-alive
Last-Modified: Tue, 29 Oct 2019 04:37:23 GMT
Cache-Control: no-store
Accept-Ranges: bytes

This is a test text file
Testing Nginx URI whitelisting

Edit the /usr/local/nginx/conf/nginx.conf and turn on the Nginx URI whitelisting module.

wh_list on;

Reload nginx with the new configuration.

sudo /usr/local/nginx/sbin/nginx -s reload

Access the 3 URLs again using curl. This time, access should be denied with an HTTP 404 error, because the module is on and nothing has been whitelisted yet.

devuser1@devmachine:~$ curl -i http://localhost
HTTP/1.1 404 Not Found
Server: nginx
Date: Tue, 29 Oct 2019 04:49:49 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 146
Connection: keep-alive

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

devuser1@devmachine:~$ curl -i http://localhost/index.html
HTTP/1.1 404 Not Found
Server: nginx
Date: Tue, 29 Oct 2019 05:07:19 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 146
Connection: keep-alive

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

devuser1@devmachine:~$ curl -i http://localhost/test.txt
HTTP/1.1 404 Not Found
Server: nginx
Date: Tue, 29 Oct 2019 04:49:40 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 146
Connection: keep-alive

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

Let's whitelist some of the URLs. Edit nginx.conf and add the following.

wh_list_uri /index.html;
wh_list_uri /test.txt;

Reload nginx.

sudo /usr/local/nginx/sbin/nginx -s reload

These 2 URLs should now be accessible again due to the whitelist.

devuser1@devmachine:~$ curl -i http://localhost/index.html
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 29 Oct 2019 05:13:50 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 170
Connection: keep-alive
Last-Modified: Tue, 29 Oct 2019 04:36:42 GMT
Vary: Accept-Encoding
Cache-Control: private, max-age=900
Accept-Ranges: bytes

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Testing html page</title>
</head>
<body>
<p>
This is a test for Nginx URI whitelisting !
</p>
</body>
</html>

devuser1@devmachine:~$ curl -i http://localhost/test.txt
HTTP/1.1 200 OK
Server: nginx
Date: Tue, 29 Oct 2019 05:16:35 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 56
Connection: keep-alive
Last-Modified: Tue, 29 Oct 2019 04:37:23 GMT
Cache-Control: no-store
Accept-Ranges: bytes

This is a test text file
Testing Nginx URI whitelisting

However, when we try to access http://localhost or http://localhost/, both return an HTTP 404 error.

devuser1@devmachine:~$ curl -i http://localhost
HTTP/1.1 404 Not Found
Server: nginx
Date: Tue, 29 Oct 2019 05:17:43 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 146
Connection: keep-alive

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>


devuser1@devmachine:~$ curl -i http://localhost/
HTTP/1.1 404 Not Found
Server: nginx
Date: Tue, 29 Oct 2019 05:17:49 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 146
Connection: keep-alive

<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>

This is because the root directory "/" has not been whitelisted. To allow access, we need to add the following to nginx.conf.

wh_list_uri /;

Reload nginx and the root directory URL should be accessible again. The same applies to subdirectories. If the root of a subdirectory is to be accessible, it has to be whitelisted explicitly. For example,

wh_list_uri /mydirectory/subdirectory2/;

Play around with the Nginx whitelist module. There is also an extensions bypass directive that allows files with certain extensions, such as jpg or gif, to skip the whitelist check. This directive should be used carefully. For the best protection, web resources that are supposed to be accessible, including static image files, should be whitelisted explicitly. The README.md in the module's GitHub repository has details on the syntax of its directives.

To allow a file extension, for instance ".txt", to bypass the whitelist, add the following to nginx.conf.

wh_list_bypass txt;

Create a new text file, mytest.txt, and fill in some content. Reload nginx. This new text file will be accessible without being explicitly whitelisted. In fact, every file ending with the ".txt" extension will be accessible, because the whitelist module skips the access check for that extension.

devuser1@devmachine:~$ sudo /usr/local/nginx/sbin/nginx -s reload
devuser1@devmachine:~$ curl -i http://localhost/mytest.txt
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 31 Oct 2019 02:24:54 GMT
Content-Type: text/plain; charset=UTF-8
Content-Length: 70
Connection: keep-alive
Last-Modified: Thu, 31 Oct 2019 02:23:38 GMT
Vary: Accept-Encoding
Cache-Control: no-store
Accept-Ranges: bytes

This is another test file for trying on extensions
bypass directive.

The Nginx whitelist module prints warnings and alerts to the nginx error log. If access to a URL is denied, an alert is written to the error log. If the module itself is turned off, a warning is logged. This is useful for security monitoring, where a security engineer or administrator may want to know about unauthorized access attempts or whether the module itself has been disabled.

Some examples from the error log.

2019/10/29 12:43:50 [warn] 6523#0: *6 [URI_WHITELIST] : White list module disabled !, client: 127.0.0.1, server: localhost, request: "GET /test.txt HTTP/1.1", host: "localhost"
2019/10/29 12:49:40 [alert] 6653#0: *8 [URI_WHITELIST] : Access Denied for [ /test.txt ] , client: 127.0.0.1, server: localhost, request: "GET /test.txt HTTP/1.1", host: "localhost"

Conclusion and Afterthought

Whitelisting is a useful technique in information security. It can be used in web applications to guard against invalid user input, and in enterprises to prevent unauthorized applications from running. We can also use it to whitelist URLs and allow access only to specific web resources. This provides an additional layer of defense against web attacks, vulnerabilities in web application frameworks, misconfigurations and accidental uploads of sensitive files. Whitelisting can reduce the attack surface of a website or web application.

Useful References

The full source code for the Nginx URI Whitelist Module is available at the following GitHub link.
https://github.com/ngchianglin/ngx_http_uri_whitelist_module

If you have any feedback, comments, corrections or suggestions to improve this article, you can reach me via the contact/feedback link at the bottom of the page.