8.8.4.0 | Filters | Rate Limiting Filter

The Rate Limiting filter can be used to limit the number of requests that make it through Repose to specified resources belonging to the origin service. By imposing a rate limit, an operator can prevent the origin service from being flooded by requests, thereby bounding the potential for congestion or contention.

Additionally, this filter can expose rate limiting data by allowing users to query rate limits. With accessible rate limiting data, a user can determine the number of requests they are allowed to make per period of time to various resources.

See Additional Information for more details on how rate limiting is performed.

General Filter Information

Name: rate-limiting
Default Configuration: rate-limiting.cfg.xml
Released: v1.0
Bundle: repose-filter-bundle
Schema

Prerequisites & Postconditions

Required Request Headers

X-PP-User - Supplies a unique identifier for the user making the request. If the request matches a limit, the request count associated with this identifier will be incremented.
(Optional) X-PP-Groups - Supplies one or more identifiers for the limit group(s) that the user belongs to.

Quality Factor

This filter uses the quality parameter of header values to determine the user and group(s) provided in the request. The value(s) with the highest quality will be used, while all other values will be ignored. In the case of the X-PP-User header, only the first of the highest quality values will be used.

This is useful when an operator desires to employ multiple authentication/identity mechanisms.
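The selection rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the filter's actual implementation; the function names are hypothetical, and the sketch assumes well-formed header values of the form `value;q=0.5` separated by commas, with a missing quality treated as 1.0.

```python
def parse_header_values(header):
    """Split a header into (value, quality) pairs; quality defaults to 1.0."""
    pairs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        value, _, params = part.partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, raw = param.strip().partition("=")
            if name.strip().lower() == "q":
                quality = float(raw)  # assumes a well-formed quality
        pairs.append((value.strip(), quality))
    return pairs


def highest_quality_values(header):
    """Return every value that shares the highest quality (used for groups)."""
    pairs = parse_header_values(header)
    if not pairs:
        return []
    best = max(quality for _, quality in pairs)
    return [value for value, quality in pairs if quality == best]


def select_user(header):
    """Return only the first of the highest quality values (used for the user)."""
    values = highest_quality_values(header)
    return values[0] if values else None
```

For instance, `select_user("alice;q=0.5, bob;q=0.9, carol;q=0.9")` yields `bob`, while `highest_quality_values("admin;q=0.2, beta;q=1.0, standard")` keeps both `beta` and `standard`.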

Required Preceding Filters

Strictly speaking, there is not a hard requirement for any filter to precede this filter. However, it is strongly recommended to utilize the following filters ahead of this filter.

URI Normalization Filter - Normalizes the URI so that limits' uri-regex match reliably, and capture groups work consistently. Normalization helps prevent cache busting that may occur due to the various possible encodings of the URI. In other words, normalization will lead to /servers/abc/instances/123 and /servers/abc/instances/%31%32%33 being handled identically.
User Identity Filter - Populates the X-PP-User and X-PP-Groups headers described in Required Request Headers. Any of the following filters can be used to fulfill this role:

Request Headers Created

Accept - Only modified if this filter is configured with a request-endpoint, include-absolute-limits is enabled, and the request was made to the request-endpoint. The value of this header will be application/xml.

Request Body Changes

This filter does not modify the request body.

Recommended Follow-On (Succeeding) Filters

This filter is not strictly required by any other filters.

Response Body Changes

This filter will only modify the response body if this filter is configured with a request-endpoint. In that case, in response to requests made to the request-endpoint, this filter will populate the response body with details of the current state of limits which apply to the provided user with the provided group(s).

See Querying Limits for more details.

Response Headers Created

Retry-After - Only created if the user is rate limited. The value of this header will be a date after which the user can send another request without being rate limited.
Content-Type - Only created if this filter is configured with a request-endpoint, and the request was made to the request-endpoint. The value of this header will be the preferred media type provided in the Accept header of the request.
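A client that has been rate limited can use the Retry-After header to decide how long to wait. The sketch below assumes the date is sent in the standard HTTP-date format (for example, `Fri, 22 Jun 2012 14:39:33 GMT`); that format, and the helper name `seconds_until_retry`, are assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until_retry(retry_after, now=None):
    """Return how many seconds to wait before retrying, never less than zero."""
    retry_at = parsedate_to_datetime(retry_after)
    if retry_at.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())
```

A client might sleep for the returned number of seconds before resending the request.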

Response Status Codes

Status Code	Reasons
`200`	If the response to a request to the `request-endpoint` is sent successfully.
`401`	If the `X-PP-User` header is not present on the request.
`406`	If the `Accept` header on a request to the `request-endpoint` does not indicate that the user supports `application/xml` or `application/json`.
`413`	If the user is rate limited and `overLimit-429-responseCode` is set to false.
`429`	If the user is rate limited and `overLimit-429-responseCode` is set to true.
`500`	If an unexpected issue arises.
`502`	If there is an issue obtaining absolute limits from the origin service. If there is an issue committing an updated limit across nodes when using the Distributed Datastore.
`503`	If the filter has not yet initialized. If a global rate limit is reached. If a Distributed Datastore action fails.

Examples

Querying Limits Example

In this example, we have configured this filter to allow users to query their limits. Provided below are sample requests to give you a better idea of what that looks like.

Configuration

<rate-limiting datastore="distributed/hash-ring" use-capture-groups="false" xmlns="http://docs.openrepose.org/repose/rate-limiting/v1.0">
    <request-endpoint (1)
        uri-regex="/limits" (2)
        include-absolute-limits="false"/> (3)

    <limit-group id="limited" groups="BETA_Group IP_Standard" default="false">
        <limit uri="*" uri-regex="/something/(.*)" http-methods="PUT" unit="MINUTE" value="10" />
        <limit uri="*" uri-regex="/something/(.*)" http-methods="GET" unit="MINUTE" value="10" />
    </limit-group>

    <limit-group id="limited-all" groups="My_Group" default="true">
        <limit uri="*" uri-regex="/something/(.*)" http-methods="ALL" unit="HOUR" value="10" />
    </limit-group>
</rate-limiting>

1	By including this element, the limit query feature is enabled.
2	The `uri-regex` attribute specifies the URI at which queries will be handled. All requests that match this URI regular expression will be handled as limit query requests.
3	The `include-absolute-limits` attribute specifies whether or not Absolute Limits should be queried.

Limit Query Request (without group)

curl http://localhost:8020/limits -H "x-pp-user: 123456" -H "accept: application/xml"

Limit Query Response (without group)

<limits xmlns="http://docs.openstack.org/common/api/v1.0">
    <rates>
        <rate regex="/something/(.*)" uri="*">
            <limit next-available="2012-06-22T14:39:33.832Z" unit="HOUR" remaining="10" value="10" verb="ALL"/>
        </rate>
    </rates>
</limits>

Limit Query Request (with group)

curl http://localhost:8020/limits -H "x-pp-user: 123456" -H "x-pp-groups: IP_Standard" -H "accept: application/xml"

Limit Query Response (with group)

<limits xmlns="http://docs.openstack.org/common/api/v1.0">
    <rates>
        <rate regex="/something/(.*)" uri="*">
            <limit next-available="2012-06-22T15:38:17.956Z" unit="MINUTE" remaining="10" value="10" verb="PUT"/>
            <limit next-available="2012-06-22T15:38:17.956Z" unit="MINUTE" remaining="10" value="10" verb="GET"/>
        </rate>
    </rates>
</limits>
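A client could read such an XML response with Python's standard library, as sketched below. The helper name `summarize_limits` and the shape of the returned dictionaries are illustrative choices, not part of Repose; the sample document is the response shown above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the limit query response shown above.
NS = {"l": "http://docs.openstack.org/common/api/v1.0"}

SAMPLE = """<limits xmlns="http://docs.openstack.org/common/api/v1.0">
    <rates>
        <rate regex="/something/(.*)" uri="*">
            <limit next-available="2012-06-22T15:38:17.956Z" unit="MINUTE" remaining="10" value="10" verb="PUT"/>
            <limit next-available="2012-06-22T15:38:17.956Z" unit="MINUTE" remaining="10" value="10" verb="GET"/>
        </rate>
    </rates>
</limits>"""


def summarize_limits(xml_text):
    """Flatten each <limit> element into a dictionary of its key attributes."""
    root = ET.fromstring(xml_text)
    rows = []
    for rate in root.findall("l:rates/l:rate", NS):
        for limit in rate.findall("l:limit", NS):
            rows.append({
                "regex": rate.get("regex"),
                "verb": limit.get("verb"),
                "unit": limit.get("unit"),
                "value": int(limit.get("value")),
                "remaining": int(limit.get("remaining")),
            })
    return rows


for row in summarize_limits(SAMPLE):
    print(row["verb"], row["remaining"], "of", row["value"], "per", row["unit"])
```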

Full Feature Utilization Example

This configuration provides per-user limits, global limits, and an endpoint to query limit data.

rate-limiting.cfg.xml

<rate-limiting xmlns="http://docs.openrepose.org/repose/rate-limiting/v1.0"
    overLimit-429-responseCode="true" (1)
    datastore="distributed/hash-ring" (2)
    datastore-warn-limit="1200" (3)
    use-capture-groups="true"> (4)

    <request-endpoint (5)
        include-absolute-limits="true" (6)
        uri-regex="/limits/?"/> (7)

    <global-limit-group> (8)
        <limit
            id="global-resource-per-minute" (9)
            uri="a global resource" (10)
            uri-regex="/global/resource/?" (11)
            http-methods="GET" (12)
            value="100" (13)
            unit="SECOND" (14)
            query-param-names="filter"/> (15)
    </global-limit-group>

    <limit-group (16)
        id="customer-limits" (17)
        groups="group-one group-two" (18)
        default="true"> (19)

        <limit
            id="user-resource-one"
            uri="users resource one"
            uri-regex="/users/one/([^/]*)/?"
            http-methods="GET"
            unit="MINUTE"
            value="10"/> (20)

        <limit
            id="user-resource-two"
            uri="users resource two"
            uri-regex="/users/two/[^/]*/?"
            http-methods="POST"
            unit="DAY"
            value="2"/> (21)
    </limit-group>
</rate-limiting>

1	Specifies whether this filter should return a 413 or 429 response status code when a request is limited.
2	Specifies which datastore should be used.
3	Specifies the upper bound for the number of user-specific limits to store in the datastore before a warning is logged.
4	Specifies whether or not the content captured by regular expression capture groups should distinguish multiple, separate limits.
5	Specifies that this filter should provide an endpoint where limit data can be queried.
6	Specifies that the origin service will supply absolute limits which this filter will use to enrich the response to limit data queries.
7	Specifies the regular expression used (i.e., matched against the URI) to determine whether or not a request is a limit data query request.
8	Specifies Global Limits, which are applied irrespective of user.
9	Specifies a unique ID which identifies this limit. No two limits may have the same ID, even if those limits are in different limit groups (including the global limit group).
10	Describes the request URIs for which this limit will be applied.
11	Specifies the regular expression used (i.e., matched against the URI) to determine whether or not the limit should be applied to a request.
12	Specifies the HTTP methods for which this limit will be applied.
13	Specifies the number of requests to allow through this filter before a limit is hit. After the limit is hit, all requests will receive a 413 or 429 response until the limit resets.
14	Specifies the unit of time after which the limit resets.
15	Specifies query parameter names used to determine whether or not the limit should be applied to a request.
16	Specifies group-based limits.
17	Specifies a unique ID which identifies this limit group. No two limit groups may have the same ID.
18	Specifies which user groups the limits in this limit group will be applied to. User groups are extracted from the `X-PP-Groups` header described in Required Request Headers.
19	Specifies whether or not this is the default limit group. The default limit group will be used when no other limit group matches the request.
20	Defines a limit of ten HTTP GET requests per minute on resources located at `/users/one/*`. Because `use-capture-groups` is enabled and this limit's `uri-regex` contains a capture group, a single user could make ten requests to each of `/users/one/foo` and `/users/one/bar` within the span of one minute.
21	Defines a limit of two HTTP POST requests per day on resources located at `/users/two/*`. Because this limit's `uri-regex` has no capture group, a single user could make two requests in total across `/users/two/foo` and `/users/two/bar` within the span of one day.
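One way to picture the effect of `use-capture-groups` is to think of each tracked count as being stored under a key. The sketch below is a hypothetical model, not the key format Repose actually uses: it assumes the regular expression must match the whole URI and builds the key from the user, the limit ID, and (optionally) the captured text.

```python
import re


def limit_key(user, limit_id, uri_regex, uri, use_capture_groups):
    """Return a hypothetical counter key, or None if the limit does not match."""
    match = re.fullmatch(uri_regex, uri)  # whole-URI matching is an assumption
    if match is None:
        return None
    captured = match.groups() if use_capture_groups else ()
    return (user, limit_id, captured)


regex = r"/users/one/([^/]*)/?"
foo = limit_key("123456", "user-resource-one", regex, "/users/one/foo", True)
bar = limit_key("123456", "user-resource-one", regex, "/users/one/bar", True)
print(foo != bar)  # separate counters when capture groups are used
```

With `use_capture_groups` set to `False`, both URIs produce the same key and therefore share a single counter.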

Additional Information

Limit Groups

A limit group is just a group containing one or more limit(s). The motivation behind limit groups is to provide a way of applying different limits to different types of users. The limits that will be applied to the request are the limits in the limit group configured with a group that matches a value in the X-PP-Groups header. If no limit group matches the request, the default limit group will be used.

When a limit in a group is reached, succeeding limits within the same limit group will never be applied. In other words, the most restrictive limit will prevent updates to succeeding limits. Preceding limits within the group will continue to apply (i.e., continue to increment).

Only the limits in the first limit group to match the request are applied.
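The selection rule can be sketched as follows. The data layout and the function name `select_limit_group` are hypothetical, and the sketch assumes that "first" means first in configuration order; the group names are taken from the Querying Limits Example.

```python
def select_limit_group(limit_groups, user_groups):
    """Return the first limit group matching any user group, else the default."""
    wanted = set(user_groups)
    for group in limit_groups:
        if wanted & set(group["groups"]):
            return group
    for group in limit_groups:
        if group.get("default"):
            return group
    return None


LIMIT_GROUPS = [
    {"id": "limited", "groups": ["BETA_Group", "IP_Standard"], "default": False},
    {"id": "limited-all", "groups": ["My_Group"], "default": True},
]

print(select_limit_group(LIMIT_GROUPS, ["IP_Standard"])["id"])  # limited
print(select_limit_group(LIMIT_GROUPS, [])["id"])               # limited-all
```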

Limits

A limit is the number of requests per unit of time that are allowed to pass through Repose. Limits are applied selectively to requests matching certain criteria such as a request’s HTTP method and query parameters. Multiple rate limits that match the same request will all apply. A limit is reached when a user has sent a number of requests equal to the value of the limit within the span of one unit (configurable on the limit) of time. If a limit has been reached, this filter will prevent requests from passing through Repose. If a limit has not been reached, the request will pass through this filter. Limits will reset after the configured unit of time has passed.

Global Limits

Global limits are very similar to normal Limits, except that they are applied to all requests, irrespective of the user making the request. That is, global limits can protect the end service as a whole rather than just from individual users. For example, if a service is known to only handle up to 500 requests per second in total, a global rate limit can be set to 500 requests per second to prevent further requests from reaching and compromising the origin service. Any request which breaks that limit will receive a response with status code 503.

Global limits are not currently queryable via the request-endpoint.

Time Block Algorithm

This is the algorithm used to determine whether or not a limit has been reached. In this algorithm, a time unit is considered discrete and independent from other units. Therefore, only requests that occur during such a time unit are counted towards the limit. Once the end of a time unit is reached, the count toward a limit will be reset. This is in contrast to a leaky bucket rate limit, wherein time units are continuous.

It is important to note that, with this approach, it is possible that 2x - 1 (where x is the configured limit value) requests will pass through this filter over the period of one time unit. If that occurs, however, fewer requests will be allowed to pass through in the surrounding time units. That is, on average, the configured rate limit of x will be enforced.

Consider the following example. Five requests are allowed per minute for each time block, but the one-minute window that straddles the two time blocks contains seven requests. This shows that a limit may be exceeded for a given window of time; however, across multiple units of time, the allowed requests average no more than five per minute.
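The behavior can be reproduced with a minimal fixed-window counter. This is a sketch of the general technique rather than the filter's code: the class name is hypothetical, and it assumes that time blocks are aligned to multiples of the unit length, which may differ from how the filter starts its blocks.

```python
class TimeBlockLimiter:
    """Counts requests per discrete time block and resets at each new block."""

    def __init__(self, value, unit_seconds):
        self.value = value
        self.unit_seconds = unit_seconds
        self.block = None
        self.count = 0

    def allow(self, timestamp):
        block = int(timestamp // self.unit_seconds)
        if block != self.block:
            # A new time block has started, so the count starts over.
            self.block = block
            self.count = 0
        if self.count >= self.value:
            return False
        self.count += 1
        return True


# Five requests per minute; arrival times are in seconds.
limiter = TimeBlockLimiter(value=5, unit_seconds=60)
arrivals = [30, 40, 45, 50, 55, 58, 61, 65, 70]
allowed = [t for t in arrivals if limiter.allow(t)]
in_window = [t for t in allowed if 40 <= t < 100]
print(len(in_window))  # 7
```

The request at 58 seconds is rejected because the first block is full, yet the one-minute window from 40 to 100 seconds still contains seven allowed requests.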

Querying Limits

This filter tracks limits by user and consequently requires that a user be provided when querying limits. Global limits are the exception, but since they are not currently queryable, they are omitted from this explanation. To query a limit, a GET request should be made to the request-endpoint. If no request-endpoint is defined, then querying limits is not possible. As always, requests must include the X-PP-User header, and may include the X-PP-Groups header. The limit details returned from the request-endpoint will be for the limits in the limit group that matches a provided group as they apply to the provided user.

The limit details themselves will be presented in either XML or JSON format. If no Accept header is present on the request, or if the Accept header has no value, the JSON format will be returned as the default. Otherwise, the returned format will be determined based on the Accept header specification defined in Section 14 of RFC 2616. If the limit query request does not accept at least one of the supported media types, a 406 response will be returned.
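The format selection described above might look roughly like the following. This is a simplified sketch, not the filter's negotiation code: the function names are hypothetical, media type parameters other than `q` are ignored, and ties are broken in favor of JSON as an assumption.

```python
SUPPORTED = ("application/json", "application/xml")  # JSON first: the documented default


def quality_for(media_type, ranges):
    """Quality of the most specific range matching media_type, or 0.0 if none."""
    main = media_type.split("/")[0]
    for candidate in (media_type, main + "/*", "*/*"):
        if candidate in ranges:
            return ranges[candidate]
    return 0.0


def choose_media_type(accept):
    """Return the response media type, or None when a 406 should be returned."""
    if accept is None or not accept.strip():
        return SUPPORTED[0]
    ranges = {}
    for part in accept.split(","):
        media_range, _, params = part.strip().partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, raw = param.strip().partition("=")
            if name.strip().lower() == "q":
                quality = float(raw)  # assumes a well-formed quality
        ranges[media_range.strip().lower()] = quality
    best = max(SUPPORTED, key=lambda t: quality_for(t, ranges))
    return best if quality_for(best, ranges) > 0 else None
```

For example, `choose_media_type("application/xml")` selects XML, a missing header selects JSON, and `choose_media_type("text/html")` returns `None`, which corresponds to the 406 case.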

Take a look at the Querying Limits Example to see this feature in action!

Absolute Limits

The set of absolute limits consists of the limits defined in this filter’s configuration and the limits imposed by the origin service. These limits will only be returned if include-absolute-limits on the request-endpoint is set to true. Otherwise, only the limits defined in this filter’s configuration will be returned.

To retrieve limits from an origin service, a request will be made through the Repose filter chain to the origin service.

Distributed Datastore

If this filter is configured to use the Distributed Datastore, and the Distributed Datastore service is unable to communicate with other nodes (e.g., during a rolling upgrade to Repose), then the Distributed Datastore service will fall back on the local datastore. Consequently, rate limits may not be tracked correctly during the transition from the Distributed Datastore to the local datastore or vice versa. To be more precise, when Distributed Datastore nodes are unreachable, limit data cannot be retrieved, and so new limits are created in the local datastore.