In the link above I have put together a GitHub repo with various things I have been reading, working on, or taking an interest in during these days of COVID-19.

It is a lot of homework, I know. Lately I have been reading about how to evolve the microservices architecture, to see whether we can leave REST web services behind as the entry point for those microservices.

The main reason is that I maintain that a web server is, by definition, synchronous: you invoke it, you wait, and it returns a result and a response code (200, 404, 500, etc.).

The market sells you microservices as an asynchronous architecture pattern that allows high scalability. That is not entirely true: to achieve the illusion of being asynchronous, we use event or messaging engines (RabbitMQ, Kafka, ZeroMQ) to convey the state of the current invocation of a microservice, such as an accepted state or a rejected state.

Since we use asynchronous event engines, they sell us that microservices are asynchronous, even though we have to resort to multithreading (threads, CompletableFuture) and wrap the REST request in a thread to get there. I admit it is ingenious, but my argument is that a REST invocation is unnecessary when we can do the same thing through the messaging engine itself, which does not have the same limit on possible invocations or, if it does, has one much, much higher than a web server.

To check this, I encourage you to run your own tests and see how many requests per second your microservice can support behind a web server. Run the test without a load balancer, k8s or OpenShift in front of it. Keep that number. Please do not test the typical controller that returns a Hello World; do it with one of your real microservices, one that has to do something useful. Then modify your microservice a little so that it can be invoked by sending a message to a queue with the appropriate payload to start the same execution. Choose the messaging technology you like the most: Kafka, RabbitMQ, ZeroMQ, etc.

When you have it, write a script that sends at least as many messages as the number you got from the REST stress test. Then go up until you find the limit. Is the number much higher than what the web server could manage? Don't be too surprised, it's normal.
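As a rough idea of what such a script could look like, here is a minimal sketch. It is not a real benchmark: it uses Python's in-process `queue.Queue` as a stand-in for the broker, and `handle` and `run_queue_test` are hypothetical names I made up for the illustration. With Kafka or RabbitMQ you would replace the `put`/`get` calls with the client library of your choice and call your real business logic instead of `handle`.

```python
import queue
import threading
import time


def handle(payload: dict) -> dict:
    """Hypothetical stand-in for the real business logic of a microservice."""
    return {"id": payload["id"], "state": "ACCEPTED"}


def run_queue_test(n_messages: int, n_workers: int = 4) -> tuple[int, float]:
    """Push n_messages through an in-process queue and time the whole run."""
    inbox: queue.Queue = queue.Queue()
    results: list[dict] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            msg = inbox.get()
            if msg is None:  # sentinel: no more work for this worker
                return
            out = handle(msg)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for i in range(n_messages):
        inbox.put({"id": i})
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return len(results), elapsed


if __name__ == "__main__":
    processed, elapsed = run_queue_test(10_000)
    print(f"{processed} messages in {elapsed:.3f}s ({processed / elapsed:.0f} msg/s)")
```

Start with the number you measured over REST as `n_messages` and keep raising it; the interesting figure is the messages per second at which the consumers stop keeping up.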

Now consider that these messaging systems are designed to be asynchronous by nature and that they scale by adding nodes, because the effective load is divided roughly equally among them; we can even create partitions within each mailbox (queue or topic).
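To make the partitioning idea concrete, this is a small sketch of how a message key can be mapped to a partition. It assumes a simple CRC32 hash, which is my own choice for the illustration and not what any particular broker does internally; `partition_for` is a hypothetical helper.

```python
import zlib
from collections import Counter


def partition_for(key: str, n_partitions: int) -> int:
    """Map a message key to a partition using a stable hash (CRC32 here)."""
    if n_partitions <= 0:
        raise ValueError("n_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % n_partitions


if __name__ == "__main__":
    # Count how many of 1000 hypothetical order keys land on each of 4 partitions.
    counts = Counter(partition_for(f"order-{i}", 4) for i in range(1000))
    for partition in sorted(counts):
        print(partition, counts[partition])
```

The same key always lands on the same partition, so messages for one entity keep their order, while different keys spread across the nodes.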

At each invocation, the microservice does what it has to do and writes a message with its current state, so that another microservice (choreography pattern) or a state manager (orchestration pattern) can read that state and decide how to act. The action may be to continue executing the business logic, or to roll back the previous states so as not to leave any data hanging.

The thing is that today all of this is done by invoking the different microservices in REST style, and that style brings its own needs. We have to be able to retry (Eureka-Ribbon) when an invocation fails for some reason. We need a quick exit from the circuit, a circuit breaker (Hystrix), when an invocation of said microservice goes wrong. We need the microservice to be discoverable by the rest of the microservices, and we need to be able to check its health status, something we do day to day with Eureka.

We can also use the state of the art, in the form of Istio, to get all the characteristics needed to guarantee correct operation in a microservices architecture: discovery, retry, quick exit in case of error and health checks. But that state of the art is totally focused on the REST style of invoking microservices.
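The orchestration variant described above can be sketched in a few lines. This is only an illustration under my own assumptions: `Step` and `run_saga` are hypothetical names, the states are appended to a plain list instead of being published to a broker, and each step runs in-process instead of being a separate microservice.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[dict], bool]      # returns True when the step is accepted
    compensate: Callable[[dict], None]  # undoes the step during a rollback


def run_saga(steps: list[Step], ctx: dict) -> list[str]:
    """Run steps in order; on a rejection, compensate completed steps in reverse."""
    log: list[str] = []     # stands in for the state messages on the broker
    done: list[Step] = []
    for step in steps:
        ok = step.action(ctx)
        log.append(f"{step.name}:{'ACCEPTED' if ok else 'REJECTED'}")
        if not ok:
            for prev in reversed(done):
                prev.compensate(ctx)
                log.append(f"{prev.name}:ROLLED_BACK")
            break
        done.append(step)
    return log


if __name__ == "__main__":
    steps = [
        Step("reserve", lambda c: True, lambda c: None),
        Step("charge", lambda c: True, lambda c: None),
        Step("ship", lambda c: False, lambda c: None),  # simulated failure
    ]
    for entry in run_saga(steps, {}):
        print(entry)
```

In the choreography variant there would be no central `run_saga`; each microservice would react to the state message of the previous one and publish its own.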

The point I am getting to is that we need the same thing, but for a style that allows totally asynchronous invocation and leaves the REST style behind, a style whose seams start to show as soon as we reach the limit of requests per second that each web server can manage. In addition, we need something capable of automatically starting new containers when the number of invocations per second reaches a critical number, so that we are prepared to meet the coming demand, and, vice versa, of stopping containers when rules tell us that so many are no longer necessary. Attending to requests while the United States is sleeping is not the same as attending to them when it is awake.
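A scaling rule of that kind could be as simple as the following sketch. The function name, the per-container capacity and the minimum and maximum bounds are all assumptions of mine for the example; a real rule would read these numbers from the metrics of the messaging system.

```python
import math


def desired_replicas(
    msgs_per_second: float,
    per_replica_capacity: float,
    min_replicas: int = 1,
    max_replicas: int = 20,
) -> int:
    """Containers needed for the current message rate, clamped to the bounds."""
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    needed = math.ceil(msgs_per_second / per_replica_capacity)
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # Hypothetical rates: quiet hours, normal load, a peak above the maximum.
    for rate in (0, 950, 10_000):
        print(rate, desired_replicas(rate, per_replica_capacity=200))
```

The lower bound keeps at least one consumer alive during quiet hours, and the upper bound protects the cluster from a runaway scale-up.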

The question is, how do we do it? As I see it, this should live at the k8s or OpenShift level, since they are in charge of managing the containers (pods) that make up each node of the messaging system. At the end of the day, this is about moving all HTTP traffic to pure TCP, and we need to be able to encrypt it from point to point, as well as scale it, partition it automatically, and do discovery, health checks, retry and fail fast.

That said, I suspect that at least retry would not be necessary. If a message has been queued but not processed, the messaging system will deliver it as soon as it returns to action, so there is no need to retry the delivery. The message is already delivered and, by checking the health of the service, we can decide how to act. In the best case, such a system could decide that, since a node is available to receive the message, it delivers it there, and the logic of the microservice, being idempotent, will notice that the message has already been handled and discard it quickly.

Should it be a tool that sits next to k8s or OpenShift, in the style of how Istio works? Probably.
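The "discard it quickly" part is the idempotent consumer idea, and a minimal sketch of it could look like this. `IdempotentConsumer` is a hypothetical class, and keeping the seen message ids in an in-memory set is an assumption for the example; a real service would need a store that survives restarts.

```python
from typing import Callable


class IdempotentConsumer:
    """Wrap a handler so that a redelivered message id is processed only once."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._seen: set[str] = set()  # assumption: in-memory, lost on restart

    def on_message(self, message_id: str, payload: dict) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False  # already handled: discard quickly
        self._handler(payload)
        self._seen.add(message_id)  # mark as seen only after success
        return True


if __name__ == "__main__":
    handled: list[dict] = []
    consumer = IdempotentConsumer(handled.append)
    print(consumer.on_message("msg-1", {"order": 7}))
    print(consumer.on_message("msg-1", {"order": 7}))  # redelivery of the same id
```

Because the id is recorded only after the handler succeeds, a message whose processing failed is still processed when the broker delivers it again.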

Am I forgetting something? Have I made a mistake in my logic?
