.tech Podcast - What's new in NATS?

Blogs · 4 min · June 14, 2023

Byron is the Director of Developer Relations at Synadia. He explains what event-driven architectures are and how they can help build more resilient systems. Then, he covers the fundamentals of NATS and gives us a peek into upcoming features, which include the new Synadia Control Plane project.

Byron Ruth is Director of Developer Relations at Synadia, the maintainers of NATS.io. Byron is a long-time NATS user with a background in health tech. He has extensive experience in developing data pipelines, data integration, ETL and application development. As he became more involved with the NATS community, the opportunity to join the team and advocate for a technology he really believed in was a no-brainer.

Introduction to event-driven architectures

Byron explains that event-driven architectures are all about inverting how information flows through a system. When you think of point-to-point interactions, you will typically think of a call stack. The second you introduce a network and multiple services, you need to concern yourself with decoupling in space and time through asynchrony.

One example is a simple CRUD application, which has an API and creates or updates a record in the database. Even if this system does not record an event, something happened that mutated the state of your data. At this point, you have no historical record of the previous values of that data. In an event-driven approach, you can still update state in place, but you also record an event out of band. The processing service responsible for actioning the work can then pick up the event and perform the work when it has the capacity to do so.
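The CRUD example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RecordService`, `Worker`, and the `record.updated` event name are hypothetical, and a plain in-memory list stands in for whatever broker or stream would carry the events in a real system.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str
    record_id: str
    changes: dict
    at: float = field(default_factory=time.time)


class RecordService:
    """Hypothetical CRUD service that also records what happened."""

    def __init__(self):
        self.records = {}  # current state, updated in place
        self.events = []   # append-only history, written out of band

    def upsert(self, record_id, changes):
        # Mutate the current state exactly as a plain CRUD app would.
        self.records.setdefault(record_id, {}).update(changes)
        # Record the fact that something happened; nobody is called synchronously.
        self.events.append(Event("record.updated", record_id, dict(changes)))


class Worker:
    """Hypothetical processing service that catches up when it has capacity."""

    def __init__(self, events):
        self.events = events
        self.position = 0  # index of the next unprocessed event

    def drain(self, handler):
        handled = 0
        while self.position < len(self.events):
            handler(self.events[self.position])
            self.position += 1
            handled += 1
        return handled
```

The writer never waits for the worker, and the event list keeps the earlier values that an in-place update would otherwise overwrite.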

The happy path happens immediately, within milliseconds, but the introduction of event-driven architectures allows us to decouple complex systems. Fundamentally, inverting how information flows through the system means that we don't push information out synchronously, so components no longer share failure domains. Event-driven architectures don't necessarily have higher latencies. Some round trips must still take place over TCP connections, but removing the dependencies between components amortises that cost over time. The latency can even be less perceptible to the user.

Request-Reply interactions are what everyone associates with the web and they must happen immediately to ensure a good user experience. Everything behind the scenes can be asynchronous and event-driven, but design decisions must be made according to the expectations of your system. A purely synchronous call stack will lead to a more fragile system that will require layers of backoffs and retries. Inverting the information flow through event-driven architectures will lead to much higher flexibility.

Common communication patterns

Byron references the book "Enterprise Integration Patterns" by Gregor Hohpe and Bobby Woolf, which was published 20 years ago. In this book, Gregor explains that when you design a system there are typically two components that need to talk to one another. All the work typically focuses on the components, but all the design decisions are actually in the communication between those components.

Byron mentions a few communication patterns:

  • Request-Reply defines an expected recipient of a message and the sender expects a reply back. Behind the scenes, especially with HTTP-based systems, a load balancer will distribute work in order to service that request.
  • One to Many or Fan Out broadcasts a single message to many subscribers. You will typically need a message broker that can handle the publish-subscribe semantics.
  • Similarly, the Many to One or Fan In approach defines one single consumer of a plethora of different types of events from different publishers.
  • Queue Group style of communication defines a single requestor whose work is distributed across the members of a group, so each message is handled by only one of them.
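The difference between fan-out and queue-group delivery can be sketched with a toy in-memory broker. This is not the NATS API: `ToyBroker` and its methods are hypothetical, subjects are matched exactly, and group members are picked round-robin purely to keep the example deterministic.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory broker; a sketch, not a NATS client."""

    def __init__(self):
        self._plain = defaultdict(list)  # subject -> handlers
        self._groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> handlers
        self._turn = defaultdict(int)  # (subject, group) -> deliveries so far

    def subscribe(self, subject, handler, queue=None):
        if queue is None:
            self._plain[subject].append(handler)
        else:
            self._groups[subject][queue].append(handler)

    def publish(self, subject, message):
        delivered = 0
        # Fan out: every plain subscriber receives its own copy.
        for handler in self._plain.get(subject, []):
            handler(message)
            delivered += 1
        # Queue group: exactly one member of each group receives the message.
        for group, members in self._groups.get(subject, {}).items():
            index = self._turn[(subject, group)] % len(members)
            self._turn[(subject, group)] += 1
            members[index](message)
            delivered += 1
        return delivered
```

With two plain subscribers and a two-member queue group on the same subject, each published message is delivered three times: once to each plain subscriber and once to a single member of the group.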

Many people will be familiar with Amazon SQS and Amazon SNS as event-driven technologies.

Introduction to NATS

Byron tells us that NATS is a high performance, open-source technology, enabling global connectivity of services and data, spanning from cloud to edge. He will unpack every single component of this one-line definition for us.

NATS was originally designed in 2011 by Derek Collison, who is also the founder of Synadia. It was designed as a low latency, high performance messaging broker to power Cloud Foundry. There was no message persistence, so it was Fire and Forget communication.

One of the key design decisions of NATS was the notion of Subject Based Addressing, which makes it possible to connect components without needing to know the addresses of the other components in the system. In NATS, you address a message by a subject. Subscribers can then subscribe to subjects, which also offer wildcard support. Conceptually, you get a location-transparent system, and subjects do not need to be created ahead of time.
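To make the wildcard idea concrete, here is a minimal sketch of subject matching. It assumes the documented NATS conventions: subjects are dot-separated tokens, `*` matches exactly one token, and `>` matches one or more trailing tokens. The function name `subject_matches` and the example subjects are made up for illustration; the real server does this matching internally.

```python
def subject_matches(pattern, subject):
    """Return True if a dot-separated subject matches a subscription pattern."""
    pattern_tokens = pattern.split(".")
    subject_tokens = subject.split(".")
    for i, token in enumerate(pattern_tokens):
        if token == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(pattern_tokens) - 1 and len(subject_tokens) > i
        if i >= len(subject_tokens):
            return False  # pattern is longer than the subject
        if token != "*" and token != subject_tokens[i]:
            return False  # literal token mismatch
    # No '>' seen, so both sides must have the same number of tokens.
    return len(pattern_tokens) == len(subject_tokens)
```

A subscriber on `orders.*` would see `orders.created` but not `orders.eu.created`, while a subscriber on `orders.>` would see both. Neither side needs to know where the other is running.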

NATS is a single 16MB Go binary with no external dependencies. This makes it possible to embed it in other programs or to deploy a supercluster spanning the entire globe. Leaf nodes run the same binary, but in a mode suited to edge locations that might be disconnected often. Examples of these kinds of use cases are oil rigs, low-orbit satellites and automobiles. NATS creates a hub-and-spoke model that allows you to extend to the edge.

Persistence was added through the JetStream project. It is part of the same binary as NATS and allows you to create streams and consumers. Streams bind one or more subjects together, and their responsibility is to persist the messages published on those subjects, so that messages are no longer dropped if no consumer is available. Consumers are fundamentally a view of the messages in a stream. They allow you to perform server-side filtering of messages, ensuring that you only spend time delivering the messages that are relevant to the subscribing service. Key-Value and Object-Store abstractions are also built on top of stream primitives.
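The relationship between a stream and its consumers can be sketched with two small classes. This is a toy model and not the JetStream API: `ToyStream` and `ToyConsumer` are hypothetical names, subjects are matched exactly rather than with wildcards, and everything lives in memory.

```python
class ToyStream:
    """Hypothetical stream: persists messages for the subjects bound to it."""

    def __init__(self, subjects):
        self.subjects = set(subjects)
        self.messages = []  # (sequence, subject, data), kept even with no consumer

    def publish(self, subject, data):
        if subject not in self.subjects:
            return None  # subject is not bound to this stream, nothing is stored
        sequence = len(self.messages) + 1
        self.messages.append((sequence, subject, data))
        return sequence


class ToyConsumer:
    """Hypothetical consumer: a filtered view with its own read position."""

    def __init__(self, stream, filter_subject=None):
        self.stream = stream
        self.filter_subject = filter_subject
        self.next_sequence = 1

    def fetch(self, batch=10):
        out = []
        while self.next_sequence <= len(self.stream.messages) and len(out) < batch:
            message = self.stream.messages[self.next_sequence - 1]
            self.next_sequence += 1
            # Filtering happens on the stream side; skipped messages are never delivered.
            if self.filter_subject in (None, message[1]):
                out.append(message)
        return out
```

A consumer created after the messages were published still receives them, and two consumers on the same stream keep independent positions and filters.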

A stream has two internal implementations for persistence: file store and in-memory store. When you create a stream, you can choose from a variety of configuration options, including the retention policy and the replication factor. You can specify the number of replicas for your data, and NATS replicates it using the Raft protocol. This is a purpose-built implementation that relies on the NATS messaging core. Raft lets a stream tolerate server outages without message loss, because a write only needs to be acknowledged by a quorum (a majority) of replicas rather than by all of them. The typical topology recommendation for NATS is a cluster with servers spread across availability zones within a region. You can deploy the cluster across multiple regions if you want to tolerate an entire region going offline.
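The quorum arithmetic behind the replication factor is simple enough to sketch. The helper names below are hypothetical and this is not NATS code; it only encodes the general Raft rule that a write needs a majority of replicas.

```python
def quorum(replicas):
    """Smallest majority of a replica set."""
    if replicas < 1:
        raise ValueError("a stream needs at least one replica")
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be offline while writes still succeed."""
    return replicas - quorum(replicas)


def can_commit(replicas, reachable):
    """True if enough replicas are reachable to acknowledge a write."""
    return reachable >= quorum(replicas)
```

With three replicas spread across three availability zones, the quorum is two, so one zone can go offline without blocking writes. Five replicas tolerate two failures.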

New features

Byron shares some upcoming NATS features in the 2.10 release:

  • The newly built auth callout lets you hook into custom identity providers alongside the NATS decentralised authorisation model. The response from the auth callout dynamically generates JWTs from the company-specific identity providers.
  • V2 networking or routing is a performance improvement. It will make it possible to route traffic through a small number of TCP connections by allowing services to be pinned to specific TCP connections.

The Synadia Control Plane was recently announced as generally available (GA). It provides an easy way to get started with NATS through a unified control plane with a single, easy-to-manage UI. It includes system-wide observability of servers and client connections, as well as best-practice alerting.

NATS is open source, and you can see how to get involved in the contributor guide. Byron got started with NATS this way himself. The NATS community is also very active on Slack, where you can get your questions answered.

Here are some further resources from Byron where you can learn more about NATS:

Written by

Adelina Simion, Technology Evangelist

Adelina is a polyglot engineer and developer relations professional, with a decade of technical experience at multiple startups in London. She started her career as a Java backend engineer, later moved to Go, and then transitioned to a full-time developer relations role. She has published multiple online courses about Go on the LinkedIn Learning platform, helping thousands of developers up-skill with Go. She has a passion for public speaking, having presented on cloud architectures at major European conferences. Adelina holds an MSc in Mathematical Modelling and Computing.