Wiring Communication Between Microservices

Choosing a means of connecting microservices is never easy; many factors must be weighed before settling on an option. If you are building a production-ready system, I guess the principle of weighing all factors holds true. Yes, I know this doesn’t apply to visionaries :)

In this article, I will run through some common communication means, briefly describe the background of our project, and lay out my arguments for choosing RPC over the remaining options.

Before deciding on how we should wire our microservices, we have to understand two concepts:

  1. Architectural Style
  2. Transport Protocol

Architectural Style

Think about: How is the payload formed when consuming a service? Is it stateless or stateful? Should we use REST, SOAP, JSON, XML, or some other message format?

Transport Protocol

Think about: Which transport protocol should we use? Should we call a remote service over HTTP, HTTP/2, a message bus, a TCP socket, or even UDP?

Popular Communication Means

Let us look at some relatively popular options available:

  1. REST over HTTP(S)
  2. Messaging over Message Broker
  3. RPC (cross-language or single-language)

REST over HTTP(S)

Ever since the RESTful architectural style was proposed by Roy Fielding, we have seen a huge wave of adoption, especially in web application development. The constraints proposed by Fielding, despite not being a standard, should always be adhered to before declaring an API RESTful.

There are many variants of REST over HTTP(S), since there is no standard to be enforced. Developers are free to form a request payload in JSON, XML, or some self-defined format.

REST over HTTP(S) simply means using the REST architectural style and sending requests over HTTP(S).

Eg: a JSON API served over HTTPS (JSON-RPC, despite the name, is an RPC style rather than REST)
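As a sketch of the "form a JSON payload, send it over HTTP(S)" idea, here is how such a request could be built with Python's standard library. The endpoint URL and user fields are purely hypothetical:

```python
import json
from urllib import request

# Hypothetical payload and resource URL: a JSON body POSTed to a
# resource-oriented endpoint is the most common REST-over-HTTP(S) style.
payload = json.dumps({"name": "alice", "email": "alice@example.com"}).encode()
req = request.Request(
    "https://api.example.com/v1/users",   # hypothetical URL
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)     # POST https://api.example.com/v1/users
```

Swapping JSON for XML or a self-defined format only changes how `payload` is serialized; the transport stays plain HTTP(S).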

Messaging over Message Broker

This basically works by connecting microservices to a centralized message bus; all communication between services is done by sending messages through this backbone.

Eg: Nameko in Python
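A toy sketch of the publish/consume pattern, using an in-process `queue.Queue` as a stand-in for the broker (in production this would be RabbitMQ, Kafka, or similar, reached over the network):

```python
import queue

# Stand-in for a centralized message bus shared by all services.
broker = queue.Queue()

def publish(topic, payload):
    # A producing service drops a message onto the bus and moves on.
    broker.put({"topic": topic, "payload": payload})

def consume():
    # A consuming service picks the next message off the bus.
    return broker.get()

publish("user.created", {"id": 42})
msg = consume()
print(msg["topic"], msg["payload"])
```

The key property is the indirection: the producer never addresses the consumer directly, which is exactly why the broker becomes critical infrastructure.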

RPC (cross-language or single-language)

Remote Procedure Call is nothing new in distributed systems: it works by executing functions/methods/procedures on another machine over the network.

According to the RPC standard, RFC 5531:

  1. RPC should be transport-protocol agnostic: TCP, UDP, it doesn't matter! Thus, reliability is not guaranteed by RPC itself.
  2. A transaction ID is used to ensure execute-at-most-once semantics and to allow the client application to match replies to calls.
  3. Time-outs and reconnection are required to handle server crashes, even if a connection-oriented protocol (TCP) is used.
  4. It does not specify the binding of services and clients; that is up to the implementer.
  5. Mandatory requirements for an RPC implementation: (1) unique specification of the procedure to be called; (2) provisions for matching response messages to request messages; (3) provisions for authenticating the caller to the service and vice versa.

Eg: gRPC, RPyC
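Point 2 above is worth a tiny sketch: a transaction ID (xid) lets the client match incoming replies to its outstanding calls and drop duplicates. The message shapes here are made up for illustration:

```python
import itertools

xid_counter = itertools.count(1)
pending = {}   # xid -> procedure name awaiting a reply

def send_call(procedure):
    # Assign a fresh transaction ID and remember the outstanding call.
    xid = next(xid_counter)
    pending[xid] = procedure
    return {"xid": xid, "proc": procedure}

def receive_reply(reply):
    # Match the reply to its call by xid; unknown/duplicate xids are ignored,
    # which is what gives execute-at-most-once semantics on the client side.
    proc = pending.pop(reply["xid"], None)
    if proc is None:
        return None
    return proc, reply["result"]

call = send_call("validate_user")
reply = {"xid": call["xid"], "result": True}
print(receive_reply(reply))   # ('validate_user', True)
print(receive_reply(reply))   # None: duplicate reply is dropped
```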

Background of The Project

In the organization I work for, we have a monolithic web application (written in Django) with acceptable performance. Some of its parts can be decoupled into separate services. I took the initiative of transforming our system into a microservice architecture in a gradual manner. One of the important decisions is choosing the communication mean(s).

Why did I choose RPC over the other two popular options?

Take a good look at RPC before ruling it out. I have read articles and comments advocating replacing RPC with REST. Some argued RPC is stone-age technology; some said RPC is simply not easy to use. My stance is neutral, as the choice depends on the individual use case.

These are our main requirements:

  1. No single point of failure -> this ruled message queues out
  2. Errors are propagated back to the caller/client/consumer
  3. A service interface that provides a native experience

Since error propagation to callers is important to us, RPC is a good candidate, as many RPC frameworks return any exception raised in a server function back to the RPC caller.

Most RPC frameworks eliminate the need for a message broker; thus, a single point of failure is avoided.

Most RPC frameworks allow a remote procedure call like:

import logging
import my_remote_function  # hypothetical client stub

try:
    my_remote_function.validate_user(my_user)
except ValueError as e:
    logging.error(e)
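Python's standard library happens to ship a minimal RPC implementation (`xmlrpc`) that demonstrates this exception-propagation behaviour end to end. A runnable sketch, where `validate_user` and the in-process server are illustrative stand-ins for a real remote service:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy, Fault

def validate_user(name):
    # Server-side function: raises on bad input, just like a local call would.
    if not name:
        raise ValueError("user name must not be empty")
    return True

# Throwaway server on an OS-assigned local port, running in a daemon thread.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(validate_user)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy(f"http://127.0.0.1:{port}")
print(client.validate_user("alice"))      # True
try:
    client.validate_user("")              # the server-side ValueError...
except Fault as err:
    print(err.faultString)                # ...comes back to the caller as a Fault
```

Note there is no broker anywhere in this picture: the client talks to the server directly, which is the property that removes the single point of failure.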

Given my scenario, what better option is there than RPC?

Categories of RPC Frameworks

In the course of exploring different RPC frameworks, I roughly categorized them into:

  1. Monolingual RPC frameworks
  2. Cross-language RPC frameworks

A monolingual framework, well, supports only a single programming language. A good candidate in this category for Python is RPyC. RPyC comes with easy-to-use, fairly standard RPC features and uses TCP as its transport protocol. The upside of using RPyC (a monolingual framework) is that there is no need to write a separate service interface. The downsides are insufficient support across different Python versions and, of course, the missing cross-language support, as the name suggests.

On the other hand, cross-language RPC frameworks support multiple programming languages, at a cost. gRPC is one of the frameworks I use. gRPC, backed by Google, comes with wide coverage of programming languages, from C++ and Ruby to Python and Dart. To support multiple programming languages, a common service contract has to be defined, usually a protocol buffer (.proto) file. A service contract defines the functions (with their arguments) provided by the servers and consumed by clients, as well as the message format to be transported. In the case of gRPC, the protocol buffer file is then compiled into language-specific files (e.g. .py files in Python); this becomes a problem once you have multiple versions of a service contract, since it is difficult to keep track of the different versions of client stubs and service functions.
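For illustration, a minimal service contract for the user-validation example above might look like this (the service and message names are made up):

```proto
syntax = "proto3";

package users;

// Hypothetical contract: the server implements ValidateUser,
// clients call it through stubs generated from this file.
service UserService {
  rpc ValidateUser (UserRequest) returns (UserReply);
}

message UserRequest {
  string name = 1;
}

message UserReply {
  bool valid = 1;
}
```

Running protoc over this file produces the language-specific stubs (e.g. `*_pb2.py` and `*_pb2_grpc.py` for Python), and every revision of the contract produces a new generation of stubs to track, which is exactly the versioning pain described above.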

Conclusion

To wrap up: communication mediums are chosen on a case-by-case basis, and these are my humble considerations when selecting a communication medium between microservices. Feel free to suggest improvements to my choice.

First published on 2018-09-09

Republished on Hackernoon

Comments (5)

Hipkiss:

#deepstream.io

Siddarthan Sarumathi Pandian:

Did you have a chance to explore the AMQP protocol? Melvin Koh

"No single-point-of-failure -> This ruled messaging queue out": why are queues a single point of failure? You can set up automatic retries with a broker like RabbitMQ. "Errors are propagated back to caller/client/consumer": you can use acknowledgments for this using the brokers as well.

Sébastien Portebois:


"Message queue is a single point of failure": I would really say the opposite. When you plan for production, your message queue will need to support some sort of clustering to deal with failure. For instance (because AMQP was mentioned), RabbitMQ offers tons of knobs to control durability/replication vs. performance, and it's one of the most stable pieces of infrastructure I have ever had to maintain in production. But other solutions (MQTT, Kafka, ...) are also there to give you a durable and resilient central bus.

In my opinion, though, I would consider message buses (or distributed logs) a separate category from RPC or transaction-based things like REST. Message queues are really interesting when you want to decouple the load or parallelize tasks (you can build really awesome topologies with Rabbit or Kafka), to make the public-facing (with many definitions of public) endpoints as lightweight as possible, and to smooth out and parallelize the work at the other end with workers. What I mean is that both have strengths and should be used depending on the context. And as your platform grows, it might make sense at some point to use more than one paradigm.