Spring Boot API Throttling/Rate Limiting

Tarun Kumar
4 min read · Aug 5, 2021

Throttling, or rate limiting, protects your API's resources and guards against attacks by limiting the number of requests an endpoint or API can serve.

There are multiple considerations when implementing throttling:

Distributed application: Is your application distributed, and do you need a single throttling value for an endpoint across all the distributed nodes?

For example, say you have a login endpoint, /login, and your requirement is to allow only N requests at a time across all nodes. In this case you will need a central system to manage the state and the counter. You can use a solution such as Redis for this, or push this API-level concern to an API gateway.
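The idea of a central counter can be sketched as follows. This is a minimal illustration, not a production design: `SharedCounter`, `InMemorySharedCounter` and `GlobalConcurrencyLimiter` are hypothetical names, and the in-memory counter merely stands in for a central store such as Redis so that the sketch runs on a single JVM.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical abstraction over the central store that every node talks to.
// In a real deployment this would be backed by a shared system, not local memory.
interface SharedCounter {
    int incrementAndGet();
    int decrementAndGet();
}

// In-memory stand-in so the sketch is runnable without external services.
class InMemorySharedCounter implements SharedCounter {
    private final AtomicInteger value = new AtomicInteger();

    @Override
    public int incrementAndGet() {
        return value.incrementAndGet();
    }

    @Override
    public int decrementAndGet() {
        return value.decrementAndGet();
    }
}

// Allows at most maxInFlight requests at a time across all nodes sharing the counter.
class GlobalConcurrencyLimiter {
    private final SharedCounter counter;
    private final int maxInFlight;

    GlobalConcurrencyLimiter(SharedCounter counter, int maxInFlight) {
        this.counter = counter;
        this.maxInFlight = maxInFlight;
    }

    // Returns true if the request may proceed; the caller must then call exit().
    boolean tryEnter() {
        if (counter.incrementAndGet() > maxInFlight) {
            counter.decrementAndGet(); // over the limit: roll back our increment
            return false;
        }
        return true;
    }

    // Called once the request has finished, to free its slot.
    void exit() {
        counter.decrementAndGet();
    }
}
```

Each node would create its own `GlobalConcurrencyLimiter` around the same shared counter. One caveat of this simple scheme is that a node that crashes before calling `exit()` leaves its slot occupied, so a real setup would likely need some form of expiry on the shared state.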

Client-specific throttling: Is your requirement that a specific client (say, an IP address) is allowed only N requests? In this case you also need a central system to manage the state. That central system could be Redis, as above, or a local cache store that tracks the number of requests for a specific client.
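A local cache store for client-specific throttling might look like the sketch below. It assumes the client is identified by a string such as its IP address and that the limit applies to requests in flight at the same time; `PerClientLimiter` is a hypothetical name used only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Tracks in-flight requests per client (for example, per IP address) in local memory.
class PerClientLimiter {
    private final ConcurrentHashMap<String, Semaphore> permitsByClient = new ConcurrentHashMap<>();
    private final int limitPerClient;

    PerClientLimiter(int limitPerClient) {
        this.limitPerClient = limitPerClient;
    }

    // Returns true if this client still has a free permit; the caller must release() it later.
    boolean tryAcquire(String clientId) {
        return permitsByClient
                .computeIfAbsent(clientId, id -> new Semaphore(limitPerClient))
                .tryAcquire();
    }

    // Returns a permit previously obtained through tryAcquire() for the same client.
    void release(String clientId) {
        Semaphore permits = permitsByClient.get(clientId);
        if (permits != null) {
            permits.release();
        }
    }
}
```

Each client gets its own budget, so one noisy client does not consume the permits of another. Note that the map in this sketch grows with the number of distinct clients, so a real implementation would probably want some eviction policy.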

A good solution is to implement throttling in the load balancer or in the API gateway. In some cases, however, you want the solution closer to the API, because you need to set the throttling value based on different business use cases, and throttling that sits closer to the code gives more fine-grained control.

Below is a simple example of a rate limiter in Spring Boot that uses interceptors and semaphores. In most cases, a basic API rate limiter like this should be sufficient to protect application-level resources and mitigate potential attacks on public-facing APIs.

First we define the endpoint that needs to be protected. Consider a retail banking application that has an endpoint listing all types of credit card products available:

@RestController
public class BankingController {
    private static final String ALL_CREDIT_PRODUCTS = "/credit-card-products";
    private static final Logger logger = LoggerFactory.getLogger(BankingController.class);

    @Autowired
    private CreditCardProductService cardProductService;

    @PublicAccess
    @ApiOperation("Get all Credit Card products")
    @RequestMapping(value = ALL_CREDIT_PRODUCTS, method = RequestMethod.GET, produces = "application/json")
    public Response<List<Product>> creditCardProducts() {
        List<Product> creditCardProducts = cardProductService.retrieveAllProducts();
        return new Response<>(creditCardProducts);
    }
}

A bank can offer many credit card products, and each product carries a lot of major and minor details to show to the customer. An attack that sends a million requests at the same time to this public endpoint would consume a lot of resources, so introducing a basic rate limiter on this API is a good business case.

Let's create a custom annotation to define the rate limit for an endpoint, along with the key.

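import static java.lang.annotation.ElementType.METHOD;
import static java.lang.annotation.RetentionPolicy.RUNTIME;

import java.lang.annotation.Retention;
import java.lang.annotation.Target;

/**
 * Annotation for controller methods to indicate the rate limit for an endpoint,
 * as a mitigation for brute force attacks.
 */
@Retention(RUNTIME)
@Target(METHOD)
public @interface RateLimit {

    // Maximum number of requests allowed at the same time for the key
    int limit() default 100;

    // Lookup key under which the limit is tracked
    String key() default "";
}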

Why is a key defined here? The key works like a key in a cache: it is used for lookup and storage. It can be as simple as the requesting client's IP address, or just the URL. Below we set the key to the URL and limit the number of requests served at the same time for that URL to the configured value.

With RateLimit applied, the controller code looks like this. We limit the number of requests to 10 at the same time for a key that is the endpoint URL:

@RestController
public class BankingController {
    private static final String ALL_CREDIT_PRODUCTS = "/credit-card-products";
    private static final Logger logger = LoggerFactory.getLogger(BankingController.class);

    @Autowired
    private CreditCardProductService cardProductService;

    @PublicAccess
    @ApiOperation("Get All Credit Card products")
    @RateLimit(limit = 10, key = ALL_CREDIT_PRODUCTS)
    @RequestMapping(value = ALL_CREDIT_PRODUCTS, method = RequestMethod.GET, produces = "application/json")
    public Response<List<Product>> creditCardProducts() {
        List<Product> creditCardProducts = cardProductService.retrieveAllProducts();
        return new Response<>(creditCardProducts);
    }
}

The rate limit configuration can be defined as follows. Here we register an interceptor that implements the throttling:

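// WebMvcConfigurerAdapter is deprecated since Spring 5; implementing
// WebMvcConfigurer directly is the equivalent on Spring 5 and later.
@Configuration
public class RateLimitConfig implements WebMvcConfigurer {
    private final Logger logger = LoggerFactory.getLogger(RateLimitConfig.class);

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Applies the limiter to every handler; it only acts on methods annotated with @RateLimit
        registry.addInterceptor(new RequestRateLimiter());
    }
}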

The RequestRateLimiter itself can be as simple as the following, using a cache of semaphores maintained in the JVM.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// javax.servlet applies to Spring Boot 2.x; Spring Boot 3.x uses jakarta.servlet instead.
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.method.HandlerMethod;
import org.springframework.web.servlet.HandlerInterceptor;

/**
 * Provides protection from thread pool starvation or a potential attack by rejecting calls
 * to endpoints annotated with RateLimit if too many requests to such endpoints are
 * currently being processed.
 * A Semaphore is created with the configured number of permits for each throttled endpoint.
 */
// HandlerInterceptorAdapter is deprecated since Spring 5.3; implementing
// HandlerInterceptor directly is the equivalent on Spring 5 and later.
public class RequestRateLimiter implements HandlerInterceptor {
    private static final Logger logger = LoggerFactory.getLogger(RequestRateLimiter.class);

    // One semaphore per rate limit key, created lazily on first use
    private final ConcurrentHashMap<String, Semaphore> localSemaphoreCache;

    RequestRateLimiter() {
        this.localSemaphoreCache = new ConcurrentHashMap<>();
    }

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        if (!(handler instanceof HandlerMethod)) {
            return true;
        }

        HandlerMethod handlerMethod = (HandlerMethod) handler;

        RateLimit rateLimit = handlerMethod.getMethod().getAnnotation(RateLimit.class);
        if (rateLimit != null) { // target method annotated
            Semaphore localSemaphore = computeIfAbsent(rateLimit.key(), rateLimit.limit());
            if (!localSemaphore.tryAcquire()) {
                logger.warn("Too many calls to {}: {}", handlerMethod.getMethod().getDeclaringClass().getName(),
                        handlerMethod.getMethod().getName());
                // Reject with 429 Too Many Requests instead of a generic RuntimeException,
                // which would otherwise surface to the client as a 500 error.
                response.setStatus(429);
                return false;
            }
        }

        return true;
    }

    private Semaphore computeIfAbsent(String key, int rateLimit) {
        return localSemaphoreCache.computeIfAbsent(key, k -> new Semaphore(rateLimit));
    }

    // Spring calls afterCompletion only when preHandle returned true,
    // so a permit is released only by a request that actually acquired one.
    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        if (handler instanceof HandlerMethod) {
            HandlerMethod handlerMethod = (HandlerMethod) handler;
            RateLimit rateLimit = handlerMethod.getMethod().getAnnotation(RateLimit.class);
            if (rateLimit != null) { // target method annotated
                Semaphore localSemaphore = localSemaphoreCache.get(rateLimit.key());
                if (localSemaphore != null) {
                    localSemaphore.release(1);
                }
            }
        }
    }
}
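To see the semaphore behaviour in isolation, the following sketch simulates several simultaneous requests against a single key without involving Spring at all. `SemaphoreThrottleDemo` and `simulate` are hypothetical names used only for this illustration; the thread pool and latches exist purely to keep all simulated requests in flight at the same moment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreThrottleDemo {

    // Fires `requests` simultaneous calls at an endpoint that allows `limit` in flight.
    // Returns a two-element array: {accepted, rejected}.
    static int[] simulate(int limit, int requests) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger accepted = new AtomicInteger();
        AtomicInteger rejected = new AtomicInteger();
        CountDownLatch allAttempted = new CountDownLatch(requests);
        CountDownLatch finishWork = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(requests);

        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                boolean acquired = permits.tryAcquire(); // the check done before handling
                (acquired ? accepted : rejected).incrementAndGet();
                allAttempted.countDown();
                if (!acquired) {
                    return; // rejected requests never hold a permit
                }
                try {
                    finishWork.await(); // the request stays "in flight" here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    permits.release(); // the release done after completion
                }
            });
        }

        try {
            allAttempted.await();   // wait until every request has tried to acquire
            finishWork.countDown(); // then let the accepted requests complete
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return new int[] {accepted.get(), rejected.get()};
    }
}
```

Because no permit is released until every simulated request has attempted to acquire one, `simulate(2, 5)` accepts exactly two requests and rejects the other three. This mirrors what the interceptor does when more than `limit` requests for the same key arrive at the same time.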

As described, there are multiple ways of implementing throttling, and the above is a simple one. In a distributed application, this solution can be extended by using a distributed cache instead of a JVM-level cache. In most cases, the above code will meet application-level throttling requirements. For a more controlled solution across a microservice architecture, an API gateway is the better fit.
