Controlled chaos for testing your microservice

If you run a distributed application, you know that even if you have 100% code coverage plus integration and acceptance tests, there are components you cannot control: hardware failures, network latency, DNS failures and so on.

Today’s post is about a library that can help you test your Spring Boot application against production-like scenarios: Chaos Monkey. The entire documentation starts here.

Overall, it’s very detailed, with many examples, but we found one important aspect missing: the Chaos Monkey endpoints require the Spring Boot Actuator dependency. If you want to see how we integrated it into a Spring Boot app, the implementation is available here. The README file describes all the steps for compiling and running the demo.

Let’s say you cloned the repo and started the app. The next step is to enable Chaos Monkey and then make several calls.

curl -X POST 'http://localhost:8080/actuator/chaosmonkey/enable'
Chaos Monkey is enabled

You’ll see that some requests complete immediately while others take longer. Here is an example where Chaos Monkey added latency:

time curl -X GET 'http://localhost:8080/sum'
Hello! Chaos monkey stage. Sum of first 2863572 numbers is:  4100020867806. Took: 7
real    0m4.167s
user    0m0.008s
sys     0m0.008s
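
To see how often an attack kicks in, you could time a batch of calls instead of a single one. The sketch below is only an illustration: the class name LatencyProbe, the 20-call loop and the 1000 ms threshold are our own assumptions and are not part of the demo repo. It relies on the HTTP client that ships with Java 11 and later, and it expects the app to be running on localhost:8080 as in the curl calls above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

public class LatencyProbe {

    // Assumed threshold: calls slower than this are treated as "attacked".
    static final long SLOW_THRESHOLD_MS = 1000;

    // Counts how many measured durations exceed the given threshold.
    static long countSlow(List<Long> durationsMs, long thresholdMs) {
        return durationsMs.stream().filter(d -> d > thresholdMs).count();
    }

    // Calls the URL once and returns the elapsed wall-clock time in milliseconds.
    static long timeCall(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        long start = System.nanoTime();
        client.send(request, HttpResponse.BodyHandlers.ofString());
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        List<Long> durations = new ArrayList<>();
        try {
            for (int i = 0; i < 20; i++) {
                durations.add(timeCall(client, "http://localhost:8080/sum"));
            }
        } catch (IOException e) {
            // The demo app is probably not running; report and stop measuring.
            System.out.println("Could not reach the service: " + e);
        }
        System.out.println("Durations (ms): " + durations);
        System.out.println("Slow calls: " + countSlow(durations, SLOW_THRESHOLD_MS));
    }
}
```

Running it a few times with Chaos Monkey enabled and then disabled should make the difference between normal and attacked requests easy to spot.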

You can add other testing scenarios by calling the endpoints provided by this framework (doc here).
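
As a rough idea of what such a call could look like from Java, here is a minimal sketch that configures a latency attack. Treat the details as assumptions to check against the documentation for your version: we assume the framework exposes a POST endpoint at /actuator/chaosmonkey/assaults that accepts a JSON body with the fields level, latencyActive, latencyRangeStart and latencyRangeEnd. The class name AssaultConfig, the helper names and the values used in main are purely illustrative.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AssaultConfig {

    // Builds the JSON body for a latency attack (field names are assumed, see above).
    static String latencyAssaultJson(int level, int startMs, int endMs) {
        return "{\"level\":" + level
                + ",\"latencyActive\":true"
                + ",\"latencyRangeStart\":" + startMs
                + ",\"latencyRangeEnd\":" + endMs + "}";
    }

    // Builds the POST request without sending it (endpoint path is assumed).
    static HttpRequest buildRequest(String baseUrl, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/chaosmonkey/assaults"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Illustrative values: attack roughly every 3rd request with 2 to 5 s of latency.
        String json = latencyAssaultJson(3, 2000, 5000);
        HttpRequest request = buildRequest("http://localhost:8080", json);
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            // The demo app is probably not running.
            System.out.println("Could not reach the service: " + e);
        }
    }
}
```

Keeping the request building separate from the sending makes it easy to inspect what would be posted before pointing it at a shared environment.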

In the end, two pieces of advice: don’t use this framework in production, and don’t use it without discussing it with your team first. Otherwise, your experiments could affect others!

As always, we want to hear your comments!