Night Hour

Reading under a cool night sky ... a quiet, contemplative night ...

Creating a Finite State PHP Rate Limiter


"My name is Sherlock Holmes. It is my business to know what other people don't know." - Arthur Conan Doyle


3rd March 2017


Introduction

Security at the application layer is increasingly important in a digital world where web and mobile applications are pervasive. Resource and rate limiting techniques are often used in networking and operating systems to prioritize and control the usage of shared resources. The same techniques can be applied in application security: they make brute force attacks harder, slow down attackers probing an application for vulnerabilities, and reduce spam from automated bots. This article shows how to create a simple finite state PHP rate limiter that uses a MariaDB database as its storage.

Basic Concepts and Approach

A Google search for rate limiting algorithms will turn up concepts like the token bucket or the leaky bucket. Here we will focus on something slightly more intuitive. To be able to limit something, we need the ability to track hits, and the simplest form of this is a page counter. By adding a time interval to a counter, we can track the number of hits over a period of time. A rate limiter or throttle can then be built by simply controlling how many hits are allowed within this time interval.

The rate limiting counter will be built using PHP and can be easily integrated into web applications to protect web form submissions and page, resource or login access. It can augment a web captcha by offering an additional layer of defense. As machine learning and AI get better, it may become easier to defeat web captchas, and rate limiting can still mitigate spamming or brute force attempts. The counter is stored in a MariaDB database and throttles by IP address, allowing a limited number of accesses from each IP address within a period of time.

Using a database as storage will incur some performance hit. A denial of service (DoS) attack can also try to overwhelm the database, so it is necessary to set proper limits on MariaDB. For instance, the maximum number of MariaDB connections and the maximum number of PHP threads can be set to a level that the physical server can comfortably handle.

That said, a denial of service attack can still try to use up all the PHP threads and database connections. This can "starve" an application that relies on the same PHP engine and MariaDB database.

Algorithm Description

Here is a brief description of how this rate limiting counter works. When an IP address first accesses a page or submits a form, a counter record is created for the IP address and a start time is recorded. The counter value is initially zero and is incremented by 1 for each access. When the same IP address makes a second access or submission to the same resource, the counter value is incremented again and a check is done to see whether this second access is within the throttle time interval. The throttle time interval is the period of time over which the number of accesses is tracked. For example, in "allow 10 accesses within 5 minutes", the 5 minutes is the throttle interval.

If the second access by the IP address is within this interval, the counter value is checked. If the value does not exceed the limit of 10, access is granted; otherwise access is denied. If the second access falls outside the throttle time interval, a new interval is created: the counter value is reset to 0 and a new start time is recorded. The steps of incrementing the counter value, checking the throttle interval and checking the counter value are then repeated for this access in the newly created throttle time interval.

This process continues as long as the same IP address keeps accessing or submitting the same page. The listing below shows the steps of this algorithm.

  1. An IP address accesses a page or submits a form.
  2. A counter record is created if it doesn't exist.
  3. A start time is recorded.
  4. The counter value is incremented by 1.
  5. Check whether the access is within the throttle time interval.
  6. If it is within the throttle time interval, check the counter value and allow access if it does not exceed the limit.
  7. If it is not within the throttle time interval, create a new time interval, set the counter to 0 and record a new start time. Repeat from step 4.

Note that to check whether an access is within the throttle time interval, we just take the current access time minus the start time. If this difference is greater than the throttle time interval, the access is not within the interval; otherwise it is within the interval. The following diagram illustrates this concept.

Fig 1. Throttle Time Interval
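
The steps listed above can be tried out without a database. The following is only a minimal in-memory sketch under stated assumptions: the function name hitFixedWindow(), the PHP array used as storage and the example IP address are hypothetical and are not part of the throttle script, which keeps its records in MariaDB.

```php
<?php
// Hypothetical in-memory version of the listed steps. The article's real
// implementation stores the record in MariaDB instead of a PHP array.
function hitFixedWindow(array &$records, string $ip, int $now, int $interval, int $limit): bool
{
    // Steps 1 to 3: create the counter record with a start time if it is missing.
    if (!isset($records[$ip])) {
        $records[$ip] = ['stime' => $now, 'count' => 0];
    }

    // Step 4: count this access.
    $records[$ip]['count']++;

    // Steps 5 and 7: if the interval has lapsed, start a new interval and
    // count the access again inside it.
    if ($now - $records[$ip]['stime'] > $interval) {
        $records[$ip] = ['stime' => $now, 'count' => 0];
        $records[$ip]['count']++;
    }

    // Step 6: allow the access while the counter does not exceed the limit.
    return $records[$ip]['count'] <= $limit;
}

$records = [];
$results = [];

// Seven accesses in the same second, with a limit of 5 per 60 seconds.
for ($i = 0; $i < 7; $i++) {
    $results[] = hitFixedWindow($records, '203.0.113.7', 1000, 60, 5);
}

// One more access after the 60 second interval has lapsed.
$afterReset = hitFixedWindow($records, '203.0.113.7', 1061, 60, 5);

var_dump($results, $afterReset);
```

Passing the current time in as a parameter, rather than calling time() inside the function, keeps the sketch deterministic and easy to test.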

Preparing the Database

Based on the algorithm described earlier, we create a database and a table to store the IP counter records. A single table holds the counter records. It consists of 4 columns (fields): a big integer primary key, a varchar for the IP address, a timestamp to store the throttle interval start time and an integer counter. Log in to MariaDB using the mysql client and issue the following commands.

create database throttledb1;

use throttledb1;

create table t1
(
id BIGINT NOT NULL PRIMARY KEY,
ip varchar(20),
stime TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
count INT
);

Next we create a database user account. The following SQL command creates a database user that has only insert, update and select privileges on this table. Replace <Strong Secret Password> with the password that you want to set. Choose a strong and complex password of sufficient length. Notice the "REQUIRE SSL" option in the command. The MariaDB database is configured for TLS connections, and this option ensures that the user connects using an encrypted TLS connection.

GRANT INSERT,UPDATE,SELECT on throttledb1.t1 to 'throttleuser'@'localhost' IDENTIFIED BY '<Strong Secret Password>' REQUIRE SSL;
Flush privileges;

PHP throttle script and PDO database access

The PHP throttle script uses PDO with prepared statements (to prevent SQL injection) to access the database. A good tutorial on PHP PDO is "(The only proper) PDO tutorial" at https://phpdelusions.net/pdo. We will need functions for connecting to the database, creating the IP counter record, selecting the record, updating the record counter, and resetting the record timestamp and counter. Functions are also needed for checking that the current access is within the throttle time interval and that the counter is within the limit.

The following lists the functions that implement the basic algorithm described earlier. The function names are self-explanatory. We will run through the code for some of the functions as well as the finite state machine. Full source code for the entire throttling script is available at the GitHub link at the bottom of the article.

  • function getPDO()
  • function createIPCounter($pdo, $ip )
  • function getIPAddressRecord($pdo, $id)
  • function updateCount($pdo, $id)
  • function resetThrottleInterval($pdo, $id )
  • function checkWithinThrottleInterval($stime)
  • function checkCountWithinRate($count)

The getPDO() function
This is the code snippet for connecting to the MariaDB database and obtaining a PDO object. MariaDB is configured for TLS connections and client certificate verification is required. The locations of the client key, client cert and CA cert (certificate authority certificate) have to be specified when creating the PDO object. To prevent concurrency issues, the isolation level for the session is set to SERIALIZABLE.

/* 
Connects to the database and return the pdo database object. 
The connection is to a mariadb instance over TLS connection.
A good tutorial on php PDO,((The only proper) PDO tutorial)  
https://phpdelusions.net/pdo

Any exception will be bubbled up to the container. In production, remember to 
disable the display of errors. Errors can be sent to the error log. 
*/

function getPDO()
{
    $pdo=null;    
    $driver="mysql";
    $host = 'myhostname.localdomain';
    $db   = 'throttledb1';
    $user = 'throttleuser';
    $pass = 'my secret strong complex password';
    $charset = 'utf8';
    $dsn = "$driver:host=$host;dbname=$db;charset=$charset";
    $opt = [
    PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
    PDO::ATTR_EMULATE_PREPARES   => false,
    PDO::MYSQL_ATTR_SSL_KEY    =>'<client key path location>/client-key.pem',
    PDO::MYSQL_ATTR_SSL_CERT=>'<client cert path location>/client-cert.pem',
    PDO::MYSQL_ATTR_SSL_CA    =>'<client ca cert path location>/ca-cert.pem'
    ];

    $pdo = new PDO($dsn, $user, $pass, $opt);
    //Set the isolation level for the session to serializable
    $pdo->query('SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE');
    return $pdo;
}

The createIPCounter() function
The createIPCounter() function creates the ip counter record. The following shows the code snippet.

/* 
   Creates a new ip address counter in the database. 
   The primary key is based on the integer value of the ip address by
   applying ip2long() function to the ip address. 
   Trying to add the same entry again(when another process has already added)
   will throw an exception that has to be handled.  
   Takes the pdo database object and ip address as parameters.
   Returns the rows updated, where success will be 1 and failure 0. 
   
*/
function createIPCounter($pdo, $ip )
{
    $updatedrows=0; 
    try
    {   //stime is omitted so that the column default, CURRENT_TIMESTAMP, is applied.
        //An explicit NULL may only become the current time when the server
        //uses its legacy TIMESTAMP handling.
        $stmt = $pdo->prepare('INSERT into t1 (id,ip,count) values (?, ?, 0)');
        $ret1 = $stmt->execute([ip2long($ip) , $ip]);
        $updatedrows = $stmt->rowCount();
        $stmt = null;
    }
    catch(PDOException $e)
    { 
       error_log("Add counter exception " . $e . " \n", 0);
    }
    
    return $updatedrows; 
}

Notice that the primary key of the record is set using the PHP ip2long() function, which converts the IPv4 address string into a unique integer value. createIPCounter() doesn't use a PDO transaction. If a database entry already exists for the IP address, an exception will be thrown and logged. This can happen when multiple processes/threads attempt to create the same IP counter record at the same time. The exception can be ignored, as the rest of the throttling script can continue to work so long as the record is created. It doesn't matter which process/thread creates it.
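
Because the primary key comes from ip2long(), the scheme covers IPv4 addresses only. The short illustration below shows the round trip for an IPv4 address and what ip2long() returns for input it cannot convert. It is a sketch for experimentation, not part of the throttle script, and the malformed sample string is just an example.

```php
<?php
// Illustration only: shows how ip2long() behaves for different inputs.
$id    = ip2long('192.168.230.76'); // integer key for an IPv4 address
$back  = long2ip($id);              // converts the integer back to dotted form

$ipv6  = ip2long('::1');            // IPv6 is not supported, returns false
$bogus = ip2long('not-an-ip');      // malformed input, returns false

var_dump($id, $back, $ipv6, $bogus);
```

Since a false return value would otherwise be passed on as the record id, a caller may want to check for it before touching the database.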

The getIPAddressRecord() function
The getIPAddressRecord() function queries the database and retrieves an IP counter record if it exists. It returns the record as an associative array, or false if there is no record.

/* 
   Retrieves the current counter for an ip address in the 
   database. 
   Takes the pdo database object and a int representing the 
   ip address. The int is obtained by applying ip2long()
   to the ip address. Returns false if ip address counter
   is not present else an associative array of the result is 
   returned. 
*/
function getIPAddressRecord($pdo, $id)
{
    $result = null; 
    $stmt = null;
    //Prepared statement to prevent SQL injection
    $stmt = $pdo->prepare('Select id, ip, stime, count from t1 where id = ? '); 
    $stmt->execute([$id]);
    $result = $stmt->fetch();
    $stmt = null;
    return $result;
}

The updateCount() function
This function increments the counter for an ip address record. It returns the new updated counter value if successful.

/* 
Increments the counter for an ip address 
Takes the pdo database object, the id representation of the 
ip address(ip2long() apply to ip)
Wraps in transaction to prevent concurrency issue
Returns false if the transaction fail or 
the updated count if successful  
*/
function updateCount($pdo, $id)
{
    $stmt = null;
    $result=null;
    try
    {
        $pdo->beginTransaction();
        $stmt= $pdo->prepare('Update t1 set count = count + 1 where id = ?');
        $stmt->execute([$id]);
        
        $stmt = $pdo -> prepare('Select count from t1 where id = ?');
        $stmt->execute([$id]);
        $result = $stmt->fetch();
        
        $pdo->commit();      
        $stmt = null;
    }
    catch(PDOException $e)
    {
         error_log("Update Count transaction error " . $e ." \n", 0);
         $pdo->rollBack();
         $stmt =null; 
    }
    
    if($result)
    {
       return $result['count'];
    }
    else
    {        
       return false;
    }
}

The resetThrottleInterval() function
The resetThrottleInterval() function creates a new throttle time interval. It records a new start time and resets the counter to 0. The two SQL update statements are wrapped in a transaction.

/* 
Reset to a new throttle Interval 
Wrap in transaction as there are two updates together
Takes a pdo database object and the id representing the ip
address (ip2long() apply to ip address).
Returns true on success, false otherwise
*/

function resetThrottleInterval($pdo, $id )
{    
    $stmt=null; 

    try
    {
        $pdo->beginTransaction();
         //set a new throttle start interval
        $stmt = $pdo->prepare('Update t1 set stime = CURRENT_TIMESTAMP() where id = ?');
        $stmt->execute([$id]);
            
        //set the counter to 0
        $stmt = $pdo->prepare('Update t1 set count=0 where id = ?');
        $stmt->execute([$id]);
          
        $pdo->commit();
        //Commited successfully         
        $stmt=null;
    }
    catch(PDOException $e)
    {
        error_log("resetThrottleInterval transaction error " . $e ." \n", 0);
        $pdo->rollBack();
        $stmt=null;
        return false;
     }
     
    return true;
}

The checkWithinThrottleInterval() function
The checkWithinThrottleInterval() function checks whether an access is within the throttle time interval. It takes the current time and subtracts the start time of the record. If this difference is greater than the throttle time interval, it returns false, and the resetThrottleInterval() function described earlier is subsequently called to create a new interval.

Notice that a test is done for a negative time difference (the $diff < 0 check). This check was added after I mistakenly cast the $stime variable to an int during development, which produced a negative time difference. It broke the rate limiter and all valid accesses were disallowed. If a negative time difference occurs for whatever reason, the function returns false. The hope is that whatever caused the time issue (leap second, human error, etc.) will be cleared when a new throttle interval is created, allowing the rest of the rate limiter script to continue working. A possible alternative is to terminate the script, which may prevent further access.

/* 
 Checks whether the ip counter is still within throttle interval 
 Takes an String representing the start time of the throttle
 interval. The String is converted to unix time using
 strtotime().
 
 Returns true if still within interval, false otherwise

*/
function checkWithinThrottleInterval($stime)
{
    
    $throttle_interval = THROTTLE_INTERVAL; 
    $starttime = strtotime($stime); //Time that the throttling interval start
    $currenttime = time();
    $diff = ($currenttime - $starttime);
    
    if($diff < 0 )
    {//negative time, log an error
     //Attempt to allow service to continue in this case
       error_log("Something horrible has happened, negative time occurs \n", 0);
      //Returning false will allow a new interval to be created hopefully
      //solving the time issue. 
       return false; 
    }
    
    if( $diff > $throttle_interval )
    {//Interval already lapses 
        return false;
    }
    else
    {// Still within interval
        return true; 
    }
    
}
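
To see the boundary behaviour of this check without waiting for real time to pass, the same logic can be written with the current time and the interval passed in as arguments. This is only a sketch: withinInterval() and its parameters are hypothetical names, and the extra test for an unparsable start time is an addition that the original function does not have. It also assumes that PHP and the database session use the same time zone, since strtotime() interprets the string in PHP's default zone.

```php
<?php
// Hypothetical variant of checkWithinThrottleInterval() that takes the
// current time and the interval as arguments so that it can be tested.
function withinInterval(string $stime, int $now, int $interval): bool
{
    $start = strtotime($stime);
    if ($start === false) {
        // Assumption: an unparsable start time is treated as a lapsed interval.
        return false;
    }

    $diff = $now - $start;
    if ($diff < 0) {
        // Negative time difference: report "not within" so a new interval is created.
        return false;
    }

    // Within the interval unless the difference is greater than the interval.
    return $diff <= $interval;
}

$stime = '2017-03-03 10:00:00';
$start = strtotime($stime);

$checks = [
    'inside'   => withinInterval($stime, $start + 30, 60),
    'boundary' => withinInterval($stime, $start + 60, 60),
    'lapsed'   => withinInterval($stime, $start + 61, 60),
    'negative' => withinInterval($stime, $start - 5, 60),
];

var_dump($checks);
```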

The checkCountWithinRate() function
This function checks that the counter value is within the rate limit that is set. It returns true if the value is within the limit, false otherwise. There is also a check that the counter value is positive, since a negative value can be caused by an overflow. This is an additional precaution. It is quite unlikely that the counter can overflow from network access over the internet, as it would take at least 2 billion accesses within the throttle interval to cause this. Nonetheless, it is a good security precaution to take, and it makes the function more robust against unexpected input.

/* Checks if counter value is within the permitted rate set
   for the throttle interval. 
   Takes an int representing the count value of the
   ip counter. 
   Returns true if the count is greater than 0 and does not
   exceed the rate, false otherwise. 
   
*/
function checkCountWithinRate($count)
{
    $interval_rate = INTERVAL_RATE ;  
    
    if($count <= $interval_rate && $count > 0)
    { //check for greater than 0 to prevent
      //possible overflow
        return true;
    }
    else
    {
        return false;
    }
}
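
The boundaries of this check are easy to get wrong, so a small stand-alone variant may help to confirm them. countWithinRate() is a hypothetical name, and the limit is passed in as an argument here instead of being read from the INTERVAL_RATE constant.

```php
<?php
// Hypothetical stand-alone variant of checkCountWithinRate().
function countWithinRate(int $count, int $limit): bool
{
    // A count of 0 or less is rejected, as is anything above the limit.
    return $count > 0 && $count <= $limit;
}

$limit = 5;
foreach ([1, 5, 6, 0, -1] as $count) {
    printf("%d => %s\n", $count, countWithinRate($count, $limit) ? 'allow' : 'disallow');
}
```

With a limit of 5, the fifth access in an interval is still allowed and the sixth is the first one to be refused.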

Finite State Machine

We could have just used normal sequential and logical constructs to build the rate limiter, but this would have involved several conditional branches. A state machine offers an alternative without having to deal with many conditional branches. A state diagram also provides better clarity in understanding the logic of an application and can make implementation easier.

We define the following 7 states for the state machine:

INIT The initial state of the system when a request from a remote IP address first arrives.
IPEXIST There is already an existing counter record for the remote IP address.
IPNOTEXIST There is no existing IP counter record.
WITHININT The IP record is within the throttle time interval.
EXCEEDINT The IP record has exceeded the throttle time interval.
ALLOW Access is allowed for the IP address.
DISALLOW Access is disallowed for the IP address.

When a request comes in from a remote IP address, the machine is in the INIT state. If there is already an existing record for the remote IP address, the state transitions to IPEXIST; otherwise it transitions to IPNOTEXIST. In the IPEXIST state, the counter is incremented and a check is done to see whether the request is within the throttle time interval. If it is, the state transitions to WITHININT; otherwise it goes to EXCEEDINT. The WITHININT state checks the counter value. If the counter value is within the limit, the state transitions to ALLOW; otherwise it transitions to DISALLOW. ALLOW and DISALLOW are termination states, where user actions can be carried out and the state machine exits.

In the IPNOTEXIST state, a new IP counter record is created and the state transitions back to IPEXIST, where processing continues. In the EXCEEDINT state, where the request has exceeded the throttle time interval, a new throttle interval is created: the counter is reset to 0 and a new start time is recorded. EXCEEDINT then transitions back to IPEXIST and processing continues from there.

The following state diagram shows the various state transitions.

Fig 2. The finite state diagram
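
The transitions described above can also be traced without a database. The sketch below is a simplified model and not the article's implementation: nextState(), tracePath() and the $facts array are hypothetical, and the database calls are replaced by plain array values so that the path taken through the states can be printed.

```php
<?php
// Simplified, database-free model of the state transitions.
// $facts stands in for what the real script reads from MariaDB.
function nextState(string $state, array &$facts): string
{
    switch ($state) {
        case 'INIT':
            return $facts['recordExists'] ? 'IPEXIST' : 'IPNOTEXIST';
        case 'IPNOTEXIST':
            $facts['recordExists'] = true;   // a new counter record is created
            return 'IPEXIST';
        case 'IPEXIST':
            $facts['count']++;               // every pass through IPEXIST counts a hit
            return $facts['withinInterval'] ? 'WITHININT' : 'EXCEEDINT';
        case 'EXCEEDINT':
            $facts['withinInterval'] = true; // a new interval starts now
            $facts['count'] = 0;
            return 'IPEXIST';
        case 'WITHININT':
            $ok = $facts['count'] > 0 && $facts['count'] <= $facts['limit'];
            return $ok ? 'ALLOW' : 'DISALLOW';
        default:
            throw new InvalidArgumentException("No transition from state $state");
    }
}

// Runs the machine from INIT to a termination state and returns the path.
function tracePath(array $facts): array
{
    $state = 'INIT';
    $path = [$state];
    while ($state !== 'ALLOW' && $state !== 'DISALLOW') {
        $state = nextState($state, $facts);
        $path[] = $state;
    }
    return $path;
}

$newIp   = tracePath(['recordExists' => false, 'withinInterval' => true,  'count' => 0, 'limit' => 5]);
$lapsed  = tracePath(['recordExists' => true,  'withinInterval' => false, 'count' => 9, 'limit' => 5]);
$overMax = tracePath(['recordExists' => true,  'withinInterval' => true,  'count' => 5, 'limit' => 5]);

echo implode(' -> ', $newIp), "\n";
echo implode(' -> ', $lapsed), "\n";
echo implode(' -> ', $overMax), "\n";
```

In this model no path visits more than a handful of states, because IPNOTEXIST and EXCEEDINT each hand control back to IPEXIST only once per request.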

The following is the code snippet for the state machine.

/*
Finite state machine to handle the rate limiting conditions and throttling. 
Takes a ip address string as argument. 
If it is within the rate limit, the allow() function will be called 
to do the actual work. 
If rate limit is exceeded, disallow() function is called. 
These 2 functions are defined by the actual scripts that uses
this throttling finite state machine. 

*/
function startThrottleStateMachine($remoteip)
{
    $STATES = [
    'INIT' => 0,
    'IPEXIST' => 1,
    'IPNOTEXIST' => 2,
    'WITHININT' => 3, 
    'EXCEEDINT'=>4,
    'ALLOW' => 5,
    'DISALLOW'=>6
    ];
    
    $ip = $remoteip; 
    $currentstate = $STATES['INIT'];        
    $pdo = getPDO();  //Get the database object
    $end = false; 
    $result = null; 
    $terminate_counter =0;
    $maxcycle=10; 
    
    while(!$end)
    {
        switch ($currentstate)
        {
            case $STATES['INIT']:
                //initial state
                $result = getIPAddressRecord($pdo ,ip2long($ip)); 
                if($result)
                {   
                    $currentstate = $STATES['IPEXIST'];
                }
                else
                {//new ip 
                    $currentstate = $STATES['IPNOTEXIST'];
                }
            
                break;
            case $STATES['IPEXIST']:   
                 //ip counter exists
                 $newcount = null; 
                 $newcount = updateCount($pdo, ip2long($ip));
                 if(!$newcount)
                 {//Cannot update counter value successfully
                     error_log("Unable to update IP record count value", 0);
                     exit(1);
                 }
                 else
                 {//Set the result to new count
                  //this will be used by WITHININT state
                    $result['count'] = $newcount;    
                 }
                 
                 
                 if(checkWithinThrottleInterval( $result['stime'] ) )
                 { //within throttle interval
                     $currentstate = $STATES['WITHININT'];
                 }
                 else
                 { //exceeds interval
                     $currentstate = $STATES['EXCEEDINT'];
                 }

                break;
            case $STATES['IPNOTEXIST']:   
                 //Ip counter does not exists
                 createIPCounter($pdo, $ip );
                 $result = getIPAddressRecord($pdo ,ip2long($ip)); 
                 if(!$result)
                 {
                     error_log("Unable to create new IP record", 0);
                     exit(1); 
                 }
                 $currentstate = $STATES['IPEXIST'];
                break;

            case $STATES['WITHININT']:   
                 //within the throttle interval
                 if(checkCountWithinRate( (int)$result['count'] ))
                 {
                     $currentstate = $STATES['ALLOW'];
                 }
                 else
                 {
                     $currentstate = $STATES['DISALLOW'];
                     
                 }
                break;

            case $STATES['EXCEEDINT']:   
                  //exceeds the throttle interval, set a new one
                   if(!resetThrottleInterval($pdo, ip2long($ip) ))
                   {
                       error_log("Unable to create a new throttle interval!",0);
                       exit(1);
                   }
                   $result = getIPAddressRecord($pdo ,ip2long($ip)); 
                   if(!$result)
                   {
                       error_log("Unable to obtain new ip interval record after resetThrottleInterval!", 0);
                       exit(1); 
                   }
                   $currentstate = $STATES['IPEXIST'];
                break;
            case $STATES['ALLOW']:   
                 //Within the allowed rate limit
                 //The actual work and function that you want to do comes here
                 allow($ip, $result); 
                 $end=true;
                 $pdo=null;
                 break;       
            case $STATES['DISALLOW']:   
                 //Exceeds the rate limit silently ignore, log or send error message
                 disallow($ip, $result); 
                 $end=true;
                 $pdo=null;
                 break;                       
            default:
                //something is horribly wrong, shouldn't come here
                error_log("Unknown state terminating\n", 0);
                exit(1); 
            
        }
        
        //Additional safeguard to prevent infinite loop
        if($terminate_counter > $maxcycle)
        {
           error_log("Something horrible has happened, state loop exceeded max cycle, terminating \n",0);
           exit(1);           
        }       
        $terminate_counter++;
    }
    
}

The 7 states are defined using an associative array and a switch statement handles the different states. A while loop iterates through the state transitions. The longest path through the state machine to a termination state (ALLOW or DISALLOW) should not exceed 10 cycles, and the $terminate_counter check at the end of the loop enforces this. In the termination states, the allow() or disallow() function is called, where the actual user application code can be run.

Integration with Other Applications

As described earlier in the finite state machine section, there are two integration points for an actual application that wants to make use of this rate limiting php script. The rate limiter calls the allow() function when access is within the defined rate and disallow() when access exceeds the defined rate. The allow() and disallow() functions can be placed in the application script and specific application code placed inside these two functions.

There are two constants defined in the rate limiter script, throttle.php. These constants define the rate limit.

define("THROTTLE_INTERVAL", 60); //in seconds
define("INTERVAL_RATE", 5); //Number allowed in throttle interval. Eg. 5 emails per 60 seconds

The THROTTLE_INTERVAL defines the time period and INTERVAL_RATE, the number of accesses allowed in the THROTTLE_INTERVAL time period.

The following is a simple demo script, throttle-demo.php, showing how the rate limiter can be integrated. throttle.php contains the rate limiter code and is included in the script with require_once. For production you may want to place throttle.php in a location that is separate from the web document root, perhaps in a library location for PHP scripts. The two functions, allow() and disallow(), are placed in the demo script and the application code is placed inside their function bodies.

In this case, the application code just echoes a simple message, "Allowed : ip address count: N" or "Disallowed : ip address count: N" respectively. In the demo script the throttling is implemented for an HTTP GET request. This can easily be changed to HTTP POST according to what the application uses.

The demo script takes an HTTP GET parameter, ip, which allows different IP addresses to be used for simulation and testing purposes. There is no proper validation of the ip parameter. In production, this ip parameter shouldn't be used. The real remote client IP address, $_SERVER['REMOTE_ADDR'], should be used instead. If a reverse proxy is in front of the web application, you may have to use another header such as X-Forwarded-For.
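
Since X-Forwarded-For is supplied by the client and can be forged, it is usually only trusted when the request arrives from a proxy that you operate. The following is a rough sketch of that idea and not part of throttle.php: resolveClientIp() and the $trustedProxies list are hypothetical, a single proxy layer is assumed, and only IPv4 addresses are accepted because the record id is derived with ip2long().

```php
<?php
// Hypothetical helper: decides which address to throttle on.
// $trustedProxies is an assumed allow-list of reverse proxy addresses.
function resolveClientIp(array $server, array $trustedProxies): ?string
{
    $remote = $server['REMOTE_ADDR'] ?? '';
    $candidate = $remote;

    // Only honour X-Forwarded-For when the direct peer is one of our proxies.
    if (in_array($remote, $trustedProxies, true) && !empty($server['HTTP_X_FORWARDED_FOR'])) {
        $hops = array_map('trim', explode(',', $server['HTTP_X_FORWARDED_FOR']));
        // With a single proxy layer, the last entry is the one our proxy appended.
        $candidate = end($hops);
    }

    // Accept IPv4 only; returns null for anything else.
    $valid = filter_var($candidate, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);
    return $valid === false ? null : $valid;
}

// Direct connection, no proxy involved.
$direct = resolveClientIp(['REMOTE_ADDR' => '198.51.100.9'], []);

// Request relayed by a trusted proxy at 10.0.0.1.
$proxied = resolveClientIp(
    ['REMOTE_ADDR' => '10.0.0.1', 'HTTP_X_FORWARDED_FOR' => '1.2.3.4, 203.0.113.5'],
    ['10.0.0.1']
);

// Untrusted peer sending a forged header: the header is ignored.
$spoofed = resolveClientIp(
    ['REMOTE_ADDR' => '198.51.100.9', 'HTTP_X_FORWARDED_FOR' => '1.2.3.4'],
    ['10.0.0.1']
);

var_dump($direct, $proxied, $spoofed);
```

A null result means no usable IPv4 address was found, and the calling script can then decide whether to refuse the request.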

The startThrottleStateMachine() call at the end of the script starts the state machine throttling. It will call either allow() or disallow() depending on whether the rate limit has been exceeded.

<?php

/*
* MIT License
*
* Copyright (c) 2017 Ng Chiang Lin
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/

/*
throttle-demo.php 
Simple script to demo how to use 
the php rate limiter, throttle.php

Ng Chiang Lin
Feb 2017

*/

/* Includes the main throttling script with the finite state machine
Note in production, you may want to place throttle.php into a location outside 
of the documentroot, perhaps in a php library location
*/
require_once 'throttle.php';


/* 
This function is called by the state machine in throttle.php
when the throttling rate is within limit. 
The actual work that we want to do if it is within the 
rate limit can be placed inside this function.  
*/

function allow($ip, $result)
{
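    //Escape the ip before echoing it, since the demo accepts it from a GET parameter
    echo "Allowed : " . htmlspecialchars($ip, ENT_QUOTES, 'UTF-8') . " count: " . $result['count'] . "<br>\n";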

}


/*
This function is called by the state machine in throttle.php
when the throttling rate is exceeded.
Any work to be done if the rate limit is exceeded can be placed 
here. E.g. You can leave this empty and simply exit when the rate
limit is exceeded. 

*/
function disallow($ip, $result)
{
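    //Escape the ip before echoing it, since the demo accepts it from a GET parameter
    echo "Disallowed : " . htmlspecialchars($ip, ENT_QUOTES, 'UTF-8') . " count: " . $result['count'] .  "<br>\n";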
    exit(1);
}



if( isset($_SERVER['REQUEST_METHOD'])  &&  strcasecmp("get", $_SERVER['REQUEST_METHOD'] ) == 0   )
{ //Check that it is a HTTP GET

    $ip = $_SERVER['REMOTE_ADDR']; //Connecting remote client ip address


    if(isset($_GET['ip']) && !empty($_GET['ip']))
    {//Warning !
     //This is only for testing, to simulate different ip
     //It will not throttle real ip addresses, can be bypassed
     //and lead to vulnerabilities with the throttling script
     //There are also no checks for malicious input
     //In Production to throttle real ip addresses, 
     //remove this and use $_SERVER['REMOTE_ADDR'] 

        $ip= $_GET['ip'];
    }

    header('Content-Type: text/html; charset=UTF-8');
    header('Cache-control: no-store');

    //Starts the finite state throttling
    startThrottleStateMachine($ip);

}


?>

Testing the Rate Limiter

Now that we have the throttle and the demo script set up, we can start testing to make sure that the rate limiter functions correctly. To do this, I have choosen to write a simple multithreaded java client. It will start multiple threads running continuously for a defined time interval. Each thread will access the demo application continuously, noting down whether access is allowed or disallowed based on the response from the demo application. The demo application as described earlier will echo a line containing "Allowed" if the rate limit is not exceeded, otherwise it will echo a line containing "Disallowed".

Each thread will print out the total number of access requests, the requests that are ok (http 200 status), the requests that encountered http errors, the allowed requests, the disallowed requests, start time of the thread, end time of the thread and the running time of the thread. The WebClient app will also print out the total number of requests that are allowed for all threads.

Comparing these results with the rate limit that has been configured for the throttling script, we can tell whether the script is working properly. The following is the source code listing of the Java WebClient application. Each thread takes a name for the thread, the interval to run, and a String containing the url to connect to. The threadmax variable in the main() method specifies the number of threads. The urls_array array in the main() method contains two urls. One of the urls has an ip parameter, 192.168.230.76. This is to simulate two different ip addresses accessing the resource. The url without the parameter will use the actual remote ip address of the client. These 2 urls are assigned alternately among the 10 threads. The threads are configured to run continuously for an interval of at least 10 minutes. Each thread sleeps for 1 second between requests to the url.

/*
 * Simple multithreaded java client app to 
 * test rate limiting for a web application
 * 
 * Ng Chiang Lin
 * Feb 2017
 * 
 */

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

import javax.net.ssl.HttpsURLConnection;

public class WebClient extends Thread
{
    private String name;
    private String url;
    private long interval;
    private long starttime;
    private int allowed;
    private int disallowed;
    private int total;
    private int http200ok;
    private int httperror;

    public WebClient(String name, long interval, String url)
    {
        this.name = name;
        this.interval = interval;
        this.url = url;
        this.starttime = 0;
        this.disallowed = 0;
        this.allowed = 0;
        this.total = 0;
        this.http200ok = 0;
        this.httperror = 0;

    }

    private void connecturl()
    {
        BufferedReader in = null;
        try
        {
            URL webquery = new URL(url);
            total++;

            HttpsURLConnection con = (HttpsURLConnection) webquery.openConnection();
            if (con.getResponseCode() == 200)
            {
                http200ok++;
            }
            else
            {
                httperror++;
            }

            in = new BufferedReader(new InputStreamReader(con.getInputStream()));

            String line = null;
            while ((line = in.readLine()) != null)
            {
                if (line.contains("Allowed"))
                {
                    allowed++;
                }
                else if (line.contains("Disallowed"))
                {
                    disallowed++;
                }
            }

        }
        catch (IOException e)
        {
            System.err.println(e);

        }
        finally
        {
            if (in != null)
            {
                try
                {
                    in.close();
                }
                catch (IOException e)
                {// Ignore the exception
                }

            }
        }

    }

    @Override
    public void run()
    {
        boolean end = false;

        if (starttime == 0)
            starttime = System.currentTimeMillis();

        long currenttime = starttime;

        while (!end)
        {
            connecturl();

            currenttime = System.currentTimeMillis();

            if (currenttime - starttime > interval)
            {
                end = true;
            }

            try
            {
                Thread.sleep(1000); // sleep for 1s
            }
            catch (InterruptedException e)
            {
                System.err.println(e);
            }

        }

        System.out.println("Thread: " + name + " , total: " + total + " , httpok: " + http200ok + " , httperror: "
                + httperror + " , allowed: " + allowed + " , disallowed: " + disallowed + " , starttime: " + starttime
                + " , endtime: " + currenttime + " , elapsed time: " + (currenttime - starttime));

    }

    public int getAllowed()
    {
        return allowed;
    }

    public static void main(String[] args)
    {
        long interval = 10 * 60 * 1000;// 10 minutes
        int threadmax = 10;
        int totalallowed = 0;
        String urls_array[] = { "https://www.nighthour.sg/csp-violation-report-endpoint/throttle-demo.php",
                "https://www.nighthour.sg/csp-violation-report-endpoint/throttle-demo.php?ip=192.168.230.76" };

        WebClient threads[] = new WebClient[threadmax];

        for (int i = 0; i < threadmax; i++)
        {
            String tname = "t" + i;
            int urlindex = i % urls_array.length;
            String url = urls_array[urlindex];
            WebClient cthread = new WebClient(tname, interval, url);
            cthread.start();
            threads[i] = cthread;
        }

        for (int i = 0; i < threadmax; i++)
        {
            try
            {
                threads[i].join();
            }
            catch (InterruptedException e)
            {
                System.err.println(e);
            }

            totalallowed += threads[i].getAllowed();

        }

        System.out.println("Total allowed for all threads is " + totalallowed);

    }

}

As shown earlier in the throttle script, we have defined the rate limit in two constants, THROTTLE_INTERVAL and INTERVAL_RATE. In this case, the throttle script is set to a rate of 5 requests per 60 seconds (5 requests/minute). In 10 minutes, it will only allow 50 requests per ip address. The Java WebClient application uses 2 ip addresses, one of them the real client ip, the other a simulated ip passed through the ip parameter in the HTTP GET. The total allowed requests from the test should therefore be around 100.
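The arithmetic above can be written down as a small sketch. The class and method names here are hypothetical and are not part of the WebClient application; the two constants simply reuse the names and values quoted from the throttle script.

```java
// Hypothetical helper for illustration only; it is not part of WebClient.
final class ExpectedAllowance {
    // Values quoted in the article for the throttle script constants.
    static final long THROTTLE_INTERVAL_SECONDS = 60;
    static final long INTERVAL_RATE = 5;

    // Upper bound on allowed requests when the test runs for exactly
    // testSeconds and every ip address is throttled independently.
    static long expectedAllowed(long testSeconds, int distinctIps) {
        long fullIntervals = testSeconds / THROTTLE_INTERVAL_SECONDS;
        return fullIntervals * INTERVAL_RATE * distinctIps;
    }

    public static void main(String[] args) {
        // 10 minutes, 2 ip addresses: 10 intervals * 5 requests * 2 = 100
        System.out.println(expectedAllowed(10 * 60, 2));
    }
}
```

This assumes the test runs for exactly 10 minutes; a real run that lasts a little longer can be allowed a few more requests.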

The test website (www.nighthour.sg) uses HTTPS and a certificate issued by Let's Encrypt. We need to import the Let's Encrypt certificate into a Java keystore that the WebClient uses, so that it can validate the server certificate properly. Obtain a copy of the fullchain.pem certificate from the Let's Encrypt directory where the TLS certs are stored, then issue the following keytool command.

keytool -importcert -alias nighthour.sg -file fullchain.pem -keystore mystore.jks

keytool is a utility that is part of the Java JDK. Note that I am using the Java 8 JDK. When running the import command, a new keystore, mystore.jks, will be created and you will be prompted to enter a password for the keystore. When prompted to trust the certificate, enter "yes". To check whether the certificate has been imported into the keystore, you can use the following command.

keytool -list -alias nighthour.sg -keystore mystore.jks

An additional property, "-Djavax.net.ssl.trustStore=mystore.jks", needs to be passed to the Java JVM when running the WebClient application. The following shows the full command.

java -Djavax.net.ssl.trustStore=mystore.jks WebClient
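As an alternative to the command line flag, the same system property can be set from code. The following is a minimal sketch, assuming it runs before the first HTTPS connection is opened, since the JVM reads the property when its default TLS context is first created. The TrustStoreSetup class is hypothetical and is not part of the WebClient listing.

```java
// Hypothetical helper for illustration only; it is not part of WebClient.
final class TrustStoreSetup {
    // Points the JVM default trust store at the given keystore file.
    // Call this before any HttpsURLConnection is opened.
    static void useTrustStore(String path) {
        System.setProperty("javax.net.ssl.trustStore", path);
    }

    public static void main(String[] args) {
        useTrustStore("mystore.jks");
        System.out.println(System.getProperty("javax.net.ssl.trustStore"));
    }
}
```

Passing the property on the command line keeps the keystore location out of the source code, which is why the article uses that form.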

Test Results

Finally, we are ready to run the tests and ensure that the rate limiter that we have built is working correctly. The screenshot below shows the results of running the WebClient with 10 threads.

Fig 3. Test results for php throttle script

The total number of allowed requests is 102. While this differs slightly from the configured limit of about 100, the rate limiter is working properly. In a real environment, each thread does not end exactly after 10 minutes because of its own computation, network delays and so on. If you look at the running time for each thread (the elapsed time column), some of the values are greater than 10 minutes, so a few additional requests will have been accepted by the rate limiting php script.

Another possible cause is the algorithm that we have used. Throttle intervals do not start back to back, one after another; instead, a request that comes in much later can start a new throttle interval. This creates a skipping effect that likely allows slightly more connections.
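The first effect described above, a client that runs slightly past 10 minutes, can be reproduced with a small simulation. This is a simplified model and not the php throttle script itself: it assumes a counter whose interval starts at the first request arriving after the previous interval has expired, and a single client sending one request per second. The class name and methods are hypothetical.

```java
// Simplified, hypothetical model of an interval counter, for illustration only.
final class WindowCounterSim {
    private final long intervalMs;
    private final int rate;
    private long windowStart = -1; // -1 means no interval has started yet
    private int count = 0;

    WindowCounterSim(long intervalMs, int rate) {
        this.intervalMs = intervalMs;
        this.rate = rate;
    }

    // Returns true if a request arriving at nowMs is allowed.
    boolean allow(long nowMs) {
        if (windowStart < 0 || nowMs - windowStart >= intervalMs) {
            windowStart = nowMs; // this request starts a new interval
            count = 1;
            return true;
        }
        if (count < rate) {
            count++;
            return true;
        }
        return false;
    }

    // Counts allowed requests for one client sending a request every stepMs,
    // from time 0 up to and including durationMs.
    static int countAllowed(long durationMs, long stepMs) {
        WindowCounterSim limiter = new WindowCounterSim(60_000, 5);
        int allowed = 0;
        for (long t = 0; t <= durationMs; t += stepMs) {
            if (limiter.allow(t)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        System.out.println(countAllowed(599_000, 1_000)); // prints 50
        System.out.println(countAllowed(603_000, 1_000)); // prints 54
    }
}
```

In this model a client that stops just before the 10 minute mark is allowed 50 requests, while one that keeps going for a few more seconds opens an eleventh interval and is allowed a few extra.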

The tests can be repeated with different numbers of threads and different durations. The maximum I have tried is 50 threads with a 10 minute interval, and so far the rate limiter is functioning well. When testing with a larger number of threads, it is necessary to check that the Php and Mariadb database setup on the web application server is able to keep up.

I am using Php FastCGI Process Manager (Php FPM) as the php interpreter. The right number of Php FPM processes has to be configured when testing with a higher number of threads, and the same goes for the connection limits of the Mariadb database. To prevent denial of service attacks from using up all the physical server resources, these numbers should be set carefully so that the server can comfortably manage the maximum load.

Conclusion and Afterthought

In this article, we started with the various uses of rate limiting and how it can be applied to application security, then went on to build a rate limiter using Php with Mariadb as storage. The Php throttling script implements a finite state machine for doing the throttling. Finally, the whole setup was tested for correct function using a Java web client written for this purpose. It is quite a fair bit of work for a simple piece of software, but there is much satisfaction in completing it and getting it to work properly.

Earlier in the article, I mentioned the token bucket algorithm but focused on a counter approach, which can be slightly more intuitive. If you have read the Wikipedia article on token bucket and think about the counter example here, you will realize that the counter is actually very similar to a token bucket. It has a maximum limit (the count in the counter) like the capacity of the token bucket, and it has a rate (set by the throttle time interval) just like the token bucket. The difference is that the token bucket refills at a regular rate, while the counter simply fills up fully at the start of a new interval. The counter method is nevertheless sufficient for a simple rate limiter where regularity is not required or important.
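To make the comparison concrete, here is a minimal in-memory token bucket sketch. It is not the method used in this article and has no database storage; the class name, constructor parameters and the choice to pass the clock in as an argument are all assumptions made for illustration. Tokens are tracked in integer units scaled by the interval, so that the refill arithmetic stays exact.

```java
// Hypothetical in-memory token bucket, for illustration only.
final class TokenBucket {
    private final long capacity;       // maximum number of whole tokens
    private final long intervalMs;     // time needed to refill the bucket fully
    private long scaledTokens;         // tokens multiplied by intervalMs
    private long lastRefillMs;

    TokenBucket(long capacity, long intervalMs, long nowMs) {
        this.capacity = capacity;
        this.intervalMs = intervalMs;
        this.scaledTokens = capacity * intervalMs; // start with a full bucket
        this.lastRefillMs = nowMs;
    }

    // Returns true if a request arriving at nowMs may proceed.
    boolean allow(long nowMs) {
        long elapsed = Math.max(0, nowMs - lastRefillMs);
        // Regular refill: "capacity" scaled units are added per millisecond,
        // which is one whole token every intervalMs / capacity milliseconds.
        scaledTokens = Math.min(capacity * intervalMs, scaledTokens + elapsed * capacity);
        lastRefillMs = nowMs;
        if (scaledTokens >= intervalMs) {  // at least one whole token left
            scaledTokens -= intervalMs;    // spend one token
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(5, 60_000, 0);
        int burst = 0;
        for (int i = 0; i < 8; i++) {
            if (bucket.allow(0)) {
                burst++;
            }
        }
        System.out.println("allowed in initial burst: " + burst);           // 5
        System.out.println("allowed after 12 s: " + bucket.allow(12_000));  // true
    }
}
```

With 5 tokens per 60 seconds, one token comes back every 12 seconds, whereas the counter approach restores all 5 at once when a new interval begins.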

There is scope for improvement. For instance, we can modify this counter method into a token bucket, or use memory based storage instead of the more heavyweight Mariadb database. Memory based storage can potentially perform far better. Mariadb, though, is sufficient for a basic design that can be easily supported by the many web applications that already use a SQL database. The token bucket and memory based storage improvements are left as a future exercise.

Useful References

A copy of the scripts used in this article has been put up at this repository
https://github.com/ngchianglin/PhpRateLimiter

If you have any feedback, comments, corrections or suggestions to improve this article, you can reach me via the contact/feedback link at the bottom of the page.

Article last updated on Oct 2020.