Rate-limiting (rate-limit)
The rate-limit component enables developers to limit access to resources such as API endpoints and background workers.
The component implements the token bucket algorithm, which provides high performance and high flexibility.
Configuring the rate limiter
When using the rate-limit component, you pass the desired configuration on every call, so you can change dynamically how limits are enforced.
The configuration lets you set:
- Capacity (capacity): the maximum number of tokens the bucket can hold.
- Refill rate (refillAmount): how many tokens to add on every interval.
- Refill interval (refillInterval): how often the refill amount is added (default is one second).
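To make the interaction between these three settings concrete, here is a minimal in-memory sketch of a token bucket. It is not Diom's implementation: the `TokenBucket` class, the `refillIntervalMs` field (plain milliseconds instead of a `Temporal.Duration`), the explicit `nowMs` clock argument, and the choice to start with a full bucket and refill in whole-interval steps are all assumptions made for illustration.

```typescript
// Hypothetical in-memory token bucket, for illustration only (not Diom's code).
interface BucketConfig {
  capacity: number; // maximum number of tokens the bucket can hold
  refillAmount: number; // tokens added per elapsed interval
  refillIntervalMs: number; // interval length in milliseconds (assumed unit)
}

class TokenBucket {
  private config: BucketConfig;
  private tokens: number;
  private lastRefillMs: number;

  constructor(config: BucketConfig, nowMs: number) {
    this.config = config;
    this.tokens = config.capacity; // assumption: a new bucket starts full
    this.lastRefillMs = nowMs;
  }

  // Add refillAmount for every whole interval elapsed, capped at capacity.
  private refill(nowMs: number): void {
    const intervals = Math.floor((nowMs - this.lastRefillMs) / this.config.refillIntervalMs);
    if (intervals <= 0) return;
    this.tokens = Math.min(
      this.config.capacity,
      this.tokens + intervals * this.config.refillAmount,
    );
    this.lastRefillMs += intervals * this.config.refillIntervalMs;
  }

  // Try to take `count` tokens; returns true when the request is allowed.
  tryConsume(nowMs: number, count: number = 1): boolean {
    this.refill(nowMs);
    if (count > this.tokens) return false;
    this.tokens -= count;
    return true;
  }

  // Report the tokens left without consuming any.
  remaining(nowMs: number): number {
    this.refill(nowMs);
    return this.tokens;
  }
}

// Capacity 10, one token refilled every second.
const bucket = new TokenBucket({ capacity: 10, refillAmount: 1, refillIntervalMs: 1000 }, 0);
for (let i = 0; i < 10; i++) bucket.tryConsume(0); // drain the bucket at t = 0
const deniedWhenEmpty = bucket.tryConsume(0); // bucket is empty, so this is denied
const allowedAfterRefill = bucket.tryConsume(1000); // one token was refilled after 1 second
```

Passing the clock in explicitly keeps the sketch deterministic; a real service would read the current time instead.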
Here is an example configuration:
import { Temporal } from "temporal-polyfill-lite";
// Capacity: 10, refill rate: 1, refill interval: 1 second.
const config = {
  capacity: 10,
  refillAmount: 1,
  refillInterval: Temporal.Duration.from({ seconds: 1 }),
};
Using the rate limiter (rate-limit.limit)
To use the rate limiter, call the rate-limit.limit operation with a key and a configuration:
import { Diom } from "diom";
import { Temporal } from "temporal-polyfill-lite";
const client = new Diom("AUTH_TOKEN");
const key = "user:42:api";
const config = {
  capacity: 10,
  refillAmount: 1,
  refillInterval: Temporal.Duration.from({ seconds: 1 }),
};
const out = await client.rateLimit.limit({ key, config });
if (out.allowed) {
  console.log(`Request allowed, ${out.remaining} tokens remaining`);
} else {
  const retryMs = out.retryAfter?.total("millisecond") ?? 0;
  console.log(`Request denied, retry after ${retryMs}ms`);
}
You can optionally consume more than 1 token per request:
const out = await client.rateLimit.limit({ key, config, tokens: 10 });
This is useful when different requests carry different “weights” (some are more expensive than others).
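One way to keep such weights in a single place is a small lookup table. The sketch below is only an illustration: the request kinds, the costs, and the `tokensFor` helper are hypothetical and not part of Diom.

```typescript
// Hypothetical request kinds and their token costs, for illustration only.
type RequestKind = "read" | "write" | "export";

const TOKEN_COST: Record<RequestKind, number> = {
  read: 1, // cheap lookups
  write: 2, // mutations cost a bit more
  export: 10, // bulk exports are the most expensive
};

// Returns the value to pass as `tokens` for a given kind of request.
function tokensFor(kind: RequestKind): number {
  return TOKEN_COST[kind];
}
```

The result would then be passed along with the key and configuration, for example `client.rateLimit.limit({ key, config, tokens: tokensFor("export") })`.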
Get remaining tokens (rate-limit.get-remaining)
You can use the rate-limit.get-remaining operation to get the number of tokens remaining without consuming any.
// Peek at the remaining tokens without consuming any.
const remaining = await client.rateLimit.getRemaining({ key, config });
const retryMs = remaining.retryAfter?.total("millisecond") ?? 0;
console.log(`Tokens remaining ${remaining.remaining}, retry after ${retryMs}ms`);
Reset the bucket to capacity (rate-limit.reset)
To reset the bucket back to full capacity, call the rate-limit.reset operation:
// Reset the bucket back to full capacity.
await client.rateLimit.reset({ key, config });
Examples
Here are a few examples of common use-cases when implementing rate-limiting.
Allow temporary bursts over the limit
A common use-case with rate-limiting is to allow temporary bursts that go over the allowed rate-limit, while still enforcing the rate-limit on an ongoing basis.
To achieve this, set the capacity to the maximum burst size and the refill rate to your desired sustained limit. For example, to allow 100 tokens per second with bursts of up to 200 tokens, set capacity to 200 and the refill amount to 100.
This works because users can consume up to 200 tokens at once, but the bucket refills by only 100 tokens per interval. Under sustained load they are limited to 100 tokens per interval; once the bucket has had enough time to replenish to capacity, they can go over the limit again.
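The arithmetic behind this can be checked with a short sketch. It assumes the bucket starts full and refills in discrete steps at each interval boundary; the `BurstConfig` shape (with a plain `refillIntervalMs` in milliseconds) and the `maxConsumable` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical config shape using plain milliseconds instead of Temporal.Duration.
interface BurstConfig {
  capacity: number;
  refillAmount: number;
  refillIntervalMs: number;
}

// Upper bound on the tokens one key can consume within `durationMs`,
// assuming the bucket starts full and refills in discrete steps.
function maxConsumable(config: BurstConfig, durationMs: number): number {
  const refills = Math.floor(durationMs / config.refillIntervalMs);
  // A single refill can never add more than the bucket can hold.
  const perRefill = Math.min(config.refillAmount, config.capacity);
  return config.capacity + refills * perRefill;
}

const burst: BurstConfig = { capacity: 200, refillAmount: 100, refillIntervalMs: 1000 };
const initialBurst = maxConsumable(burst, 0); // the initial burst only
const firstTenSeconds = maxConsumable(burst, 10_000); // the burst plus ten refills
```

With these numbers the bound is 200 tokens for the initial burst and 1200 tokens over the first ten seconds, so the one-off burst allowance matters less and less as the period grows and the sustained rate approaches 100 tokens per second.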
Example configuration:
const config = {
  capacity: 200,
  refillAmount: 100,
  refillInterval: Temporal.Duration.from({ seconds: 1 }),
};
Tiered rate-limits
A common pattern is to have different rate-limits depending on the pricing tier. Because the rate-limit configuration is passed on each request, you can achieve this with Diom by passing a different configuration based on the tier.
For example:
const config = tier === "free"
  ? { capacity: 50, refillAmount: 50, refillInterval: Temporal.Duration.from({ seconds: 1 }) }
  : { capacity: 200, refillAmount: 100, refillInterval: Temporal.Duration.from({ seconds: 1 }) };
// Now use it with rateLimit.limit...
Multiple rate-limits
Sometimes you may want to have multiple rate-limits checked for a specific request. For example, you may have user-wide and organization-wide rate-limits in your service. Diom will soon add a built-in operation to support that, but until then you can use multiple rate-limit calls to emulate this behavior.
// Check user rate limit first
const userOut = await client.rateLimit.limit({ key: userKey, config: userConfig });
if (!userOut.allowed) return;
// Then check org rate limit
const orgOut = await client.rateLimit.limit({ key: orgKey, config: orgConfig });
if (!orgOut.allowed) return;
// Request allowed!
Smoothing rate-limiting (avoiding spikes)
A common issue with rate-limiting implementations is that traffic tends to be very spiky. The tokens are refilled every time an interval passes, which means that under load, most of the requests will happen at the window edges.
This can be solved using the token bucket algorithm by setting the configuration a bit differently. For example, if you want a rate-limit with 100 tokens that refills 100 tokens every 1 second, you can instead configure it to refill 1 token every 10 milliseconds. This still gives you a rate of 100 per second, but the refill happens more smoothly throughout the desired one-second interval.
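A quick way to check that two configurations enforce the same long-run rate is to normalize both to tokens per second. The sketch below does that, under the assumption that refills happen in discrete steps; the `RefillConfig` shape (plain milliseconds) and both helper functions are hypothetical.

```typescript
// Hypothetical config shape using plain milliseconds instead of Temporal.Duration.
interface RefillConfig {
  capacity: number;
  refillAmount: number;
  refillIntervalMs: number;
}

// Long-run refill rate, normalized to tokens per second.
function tokensPerSecond(config: RefillConfig): number {
  return (config.refillAmount * 1000) / config.refillIntervalMs;
}

// Longest time a caller facing an empty bucket waits for the next refill
// (assumes discrete refills at interval boundaries).
function maxWaitForNextTokenMs(config: RefillConfig): number {
  return config.refillIntervalMs;
}

const smoothConfig: RefillConfig = { capacity: 100, refillAmount: 1, refillIntervalMs: 10 };
const spikyConfig: RefillConfig = { capacity: 100, refillAmount: 100, refillIntervalMs: 1000 };

const sameRate = tokensPerSecond(smoothConfig) === tokensPerSecond(spikyConfig);
const smoothWaitMs = maxWaitForNextTokenMs(smoothConfig);
const spikyWaitMs = maxWaitForNextTokenMs(spikyConfig);
```

Both configurations come out at 100 tokens per second, but a caller who hits an empty bucket waits at most 10 milliseconds for the next token in the smooth variant, compared with up to a full second in the spiky one.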
// Do this:
const smooth = { capacity: 100, refillAmount: 1, refillInterval: Temporal.Duration.from({ milliseconds: 10 }) };
// Instead of this:
const spiky = { capacity: 100, refillAmount: 100, refillInterval: Temporal.Duration.from({ seconds: 1 }) };
Approximating the fixed window algorithm
The token bucket algorithm allows you to implement an equivalent of the fixed window algorithm. All you need to do is set the capacity to the same value as the refill amount, and set the refill interval to the window size.
For example:
// Capacity of 10, with 10 tokens refilled every second.
const config = { capacity: 10, refillAmount: 10, refillInterval: Temporal.Duration.from({ seconds: 1 }) };
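The reason this behaves like a fixed window can be seen in a small simulation: when the capacity equals the refill amount, every refill simply tops the bucket back up to full, which is the same as starting a new window. The sketch below is illustrative only; `simulateFixedWindow` is a hypothetical helper, and it assumes discrete refills, one token per request, and windows anchored at the first request.

```typescript
// Simulates a bucket whose capacity equals its refill amount.
// `timestampsMs` must be in ascending order; returns allow/deny per request.
function simulateFixedWindow(timestampsMs: number[], limit: number, windowMs: number): boolean[] {
  let tokens = limit; // assumption: the bucket starts full
  let windowStart = timestampsMs[0] ?? 0; // assumption: windows start at the first request
  return timestampsMs.map((t) => {
    const elapsedWindows = Math.floor((t - windowStart) / windowMs);
    if (elapsedWindows >= 1) {
      // capacity === refillAmount, so any refill fills the bucket completely.
      tokens = limit;
      windowStart += elapsedWindows * windowMs;
    }
    if (tokens === 0) return false; // limit reached for this window
    tokens -= 1;
    return true;
  });
}

// Limit of 2 requests per 1-second window, three requests in each of two windows.
const decisions = simulateFixedWindow([0, 10, 20, 1000, 1010, 1020], 2, 1000);
```

Here the first two requests in each window are allowed and the third is denied, so `decisions` is `[true, true, false, true, true, false]`.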