An Introduction to Amazon EventBridge – Software Architecture for Building Serverless Microservices

An Introduction to Amazon EventBridge

Amazon EventBridge is a fully managed serverless event bus that allows you to send events from multiple event producers, apply event filtering to detect events, perform data transformation where needed, and route events to one or more target applications or services (see Figure 3-39). It’s one of the core fully managed and serverless services from AWS that plays a pivotal role in architecting and building event-driven applications. As an architect or a developer, familiarity with the features and capabilities of EventBridge is crucial. If you are already familiar with EventBridge and its capabilities, you may skip this section.

Figure 3-39. The components of Amazon EventBridge (source: adapted from an image on the Amazon EventBridge web page)

The technical ecosystem of EventBridge can be divided into two main categories. The first comprises its primary functionality, such as:

  • The interface for ingesting events from various sources (applications and services)
  • The interface for delivering events to configured target applications or services (consumers)
  • Support for multiple custom event buses as event transportation channels
  • The ability to configure rules to identify events and route them to one or more targets

The second consists of features that are auxiliary (but still important), including:

  • Support for archiving and replaying events
  • The event schema registry
  • EventBridge Scheduler for scheduling tasks on a one-time or recurring basis
  • EventBridge Pipes for one-to-one event transport needs

Let’s take a look at some of these items, to give you an idea of how to get started with EventBridge.

Event buses in Amazon EventBridge

Every event sent to EventBridge is associated with an event bus. If you consider EventBridge as the overall event router ecosystem, then event buses are individual channels of event flow. Event producers choose which bus to send the events to, and you configure event routing on each bus.

The EventBridge service in every AWS account has a default event bus, which AWS uses to deliver events from many of its own services.

You can also create one or more custom event buses for your needs. In addition, to receive events from AWS EventBridge partners, you can configure a partner event source and send events to a partner event bus.


Event routing rules

The rules you create in EventBridge are the logic behind the filtering and routing of events that you associate with an event bus. These rules are effectively part of your application logic, and are designed, documented, deployed, and tested as such. A rule comprises three parts: the event filter pattern, event data transformation, and the target(s).

To filter events and route the matching ones to a target, you configure an event pattern as your filter condition. The sample pattern in Example 3-2 will match events like the one in Example 3-1 based on the domain, service, type, and payment_type attribute values.

Example 3-2. An example event filter pattern
{
  "detail": {
    "metadata": {
      "domain": ["ecommerce"],
      "service": ["service-payments"],
      "type": ["payment_received"]
    },
    "data": {
      "payment_type": ["creditcard"]
    }
  }
}

As part of each rule, you can perform simple data transformations. At the time of writing, for each rule you can add up to five targets to send matching events to.
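To make the filtering behavior concrete, here is a minimal sketch of the exact-match semantics in Python. This is a simplified model for illustration only (it ignores EventBridge's content-filtering operators such as prefix and numeric matching), not the service's actual implementation:

```python
# Simplified model of EventBridge pattern matching: every key in the
# pattern must exist in the event (AND), and the event's value must equal
# one of the values in the pattern's list (OR).

FILTER_PATTERN = {
    "detail": {
        "metadata": {
            "domain": ["ecommerce"],
            "service": ["service-payments"],
            "type": ["payment_received"],
        },
        "data": {"payment_type": ["creditcard"]},
    }
}

SAMPLE_EVENT = {
    "detail": {
        "metadata": {
            "domain": "ecommerce",
            "service": "service-payments",
            "type": "payment_received",
        },
        "data": {"payment_type": "creditcard", "amount": 20.99},
    }
}

def matches(pattern: dict, event: dict) -> bool:
    for key, condition in pattern.items():
        if key not in event:
            return False
        if isinstance(condition, dict):
            # Nested pattern: recurse into the corresponding event section
            if not isinstance(event[key], dict) or not matches(condition, event[key]):
                return False
        elif event[key] not in condition:
            # Leaf pattern: the event value must be one of the listed values
            return False
    return True
```

Here, `matches(FILTER_PATTERN, SAMPLE_EVENT)` succeeds because every filtered attribute is present with an allowed value, while an event missing any filtered attribute is rejected.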

An important fact to keep in mind is that EventBridge guarantees at least once delivery of events to targets. This means a target may receive an event more than once (i.e., it may receive duplicate events). You will learn how to handle this situation later in the chapter.

Event archiving and replay

In EventBridge, you can store events in one or more archives. The events you archive depend on the event filter pattern. For example, you could create an archive to store all the events that match the pattern shown in Example 3-2.

You can create multiple archives to cater to your needs. Then, based on your business requirements, you can identify the events within your bounded context that need archiving and send them to the appropriate archives using different filter conditions. Unless there is a specific requirement to archive all the events, keep your archives as lean as possible as a best practice. Figure 3-40 compares the different approaches.

To replay events from an archive, you specify the archive name and the time window. EventBridge reads the events from the archive and puts them onto the same event bus that originally emitted them. To differentiate a replayed event from the original event, EventBridge adds a replay-name attribute.
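A consumer that needs to treat replays differently from live traffic can check for that attribute. A minimal sketch, assuming the replay marker is surfaced to your handler as a top-level key on the event:

```python
def is_replay(event: dict) -> bool:
    # Replayed events carry a "replay-name" attribute; live events do not.
    return "replay-name" in event
```

A handler might, for example, skip side effects such as sending emails when `is_replay(event)` is true.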

Figure 3-40. Different event archiving approaches, from least to most favored


Event schema registry

Every event has a structure, defined by a schema. EventBridge provides the schema for all the AWS service events, and it can infer the schemas of any other events sent to an event bus. In addition, you can create or upload custom schemas for your events.

Schema registries are holding places or containers for schemas. As well as the default registries for built-in schemas, discovered schemas, and all schemas, you can create your own registries to provide groupings for your schemas.

Tip

EventBridge provides code bindings for schemas, which you can use to validate an event against its schema. This is useful to protect against introducing any breaking changes that might affect the downstream event consumers.
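The generated code bindings are language-specific; as a stand-in, the idea can be sketched with a hand-rolled check that an event carries the metadata fields its schema requires (the field names here are illustrative assumptions, not a published schema):

```python
REQUIRED_METADATA = ("domain", "service", "type")

def validate_event(event: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the event passes."""
    metadata = event.get("detail", {}).get("metadata", {})
    return [
        f"missing metadata field: {field}"
        for field in REQUIRED_METADATA
        if field not in metadata
    ]
```

Running such a check in the producer's build or deployment pipeline helps catch breaking changes before they reach downstream consumers.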

EventBridge Scheduler

EventBridge Scheduler is a way to configure tasks to be invoked asynchronously, on a schedule, from a central location. It is fully managed and serverless, allowing you to schedule millions of tasks for either one-time or recurring invocation. The schedules you configure are part of your architecture.

The EventBridge Scheduler can invoke more than 270 AWS services; it has a built-in retry mechanism and a flexible invocation time window.
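Schedules are expressed as strings such as `at(...)` for one-time invocations and `rate(...)` or `cron(...)` for recurring ones. The helpers below sketch how you might build these expressions before passing them to the Scheduler API (for instance, via boto3's `create_schedule`); the surrounding wiring is an assumption, though the expression formats follow the documented syntax:

```python
from datetime import datetime

def one_time_expression(when: datetime) -> str:
    # One-time schedules use the form at(yyyy-mm-ddThh:mm:ss)
    return f"at({when.strftime('%Y-%m-%dT%H:%M:%S')})"

def rate_expression(value: int, unit: str) -> str:
    # Recurring schedules use the form rate(value unit), e.g., rate(5 minutes)
    return f"rate({value} {unit})"
```

For example, `one_time_expression(datetime(2024, 5, 1, 9, 30))` yields a schedule that fires once at 09:30 on May 1, 2024 (in the schedule's configured time zone).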

EventBridge Pipes

Earlier, we discussed using EventBridge routing rules to filter events and send them to multiple targets. EventBridge Pipes, on the other hand, builds a one-to-one integration pipeline between an event publisher and a subscriber. Within a pipe, you have the option to perform event filtering, data transformation, and data enrichment (see Figure 3-41). This is quite a powerful feature, and it reduces the need for writing custom code in many use cases.

Figure 3-41. A representation of EventBridge Pipes integration between an event source and its target (source: adapted from an image on the Amazon EventBridge Pipes web page)
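Conceptually, a pipe applies optional filter, enrichment, and transformation stages between a single source and a single target. A pure-Python sketch of that flow (the stages here are hypothetical callables, not the Pipes API itself):

```python
def run_pipe(events, filter_fn, enrich_fn, transform_fn):
    """Apply the three pipe stages, in order, to each incoming event."""
    for event in events:
        if filter_fn(event):              # 1. keep only matching events
            enriched = enrich_fn(event)   # 2. add contextual data
            yield transform_fn(enriched)  # 3. reshape for the target
```

For example, a pipe might keep only `order_placed` events, enrich them with a currency, and emit a compact summary for the target.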


Event producers and event publishing best practices

Event producers are applications that create and publish events. As you develop on AWS, you publish your events to one of your custom event buses on Amazon EventBridge. Here are some best practices for event publishers to follow:

Event publishers should be agnostic of the consumers of their events.

One of the golden rules in event-driven architecture is that event producers remain agnostic of the consumers. Event producers should not make assumptions about who or what might consume their events, nor tailor the event data to specific consumers. This agnosticism lets you keep applications decoupled—one of the main benefits of EDA.

Tip

In its pure form, consumer agnosticism suggests the use of a publish-and-forget model. However, the reality, as you develop granular microservices in serverless, can be different. There will be situations (still within the loosely coupled services construct) where a publisher may want to know the outcome of the handling of an event by a downstream consumer so that it can update its status for recordkeeping, trigger an action, etc. The event types listed in “Differentiating event categories from event types” can be indicators for this purpose.

Every event should carry a clear identification of its origin.

The details of the domain, service, function, etc., are important information for identifying the origin of an event. Not every event needs to follow a strict hierarchical pattern for describing its origin, but carrying this information benefits cross-domain consumers, which use it to set their event filters as part of consumption.

In a secured and regulated environment, teams apply event encryption measures to protect data privacy. Often, third-party systems sign the event payload, and consumers perform IP address checks to validate the event origin before consumption.
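Several of these practices (origin identification, versioning, and a unique tracing identifier, all discussed in this section) can be combined in a small envelope builder. The structure below is illustrative, not a prescribed format:

```python
import uuid
from datetime import datetime, timezone

def build_event(domain: str, service: str, event_type: str, data: dict,
                version: str = "v1") -> dict:
    # Illustrative envelope: metadata identifies the event's origin and
    # carries a version and a unique trace id; data holds the minimal payload.
    return {
        "metadata": {
            "domain": domain,
            "service": service,
            "type": event_type,
            "version": version,
            "trace_id": str(uuid.uuid4()),
            "created_at": datetime.now(timezone.utc).isoformat(),
        },
        "data": data,
    }
```

A payments service could then emit `build_event("ecommerce", "service-payments", "payment_received", {"payment_type": "creditcard"})`, and every consumer receives the same self-describing structure.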

Treat domain events as data contracts that conform to event schemas.

With distributed services, event producers should conform the events to the published schema definitions, treating them as the equivalent of API contracts.

Versioning your events is essential to avoid introducing breaking changes.

Event producers should adhere to an agreed structure for uniformity across the organization.

As discussed earlier, uniformity in the event structure at the organizational, domain, or department level helps make the development process smoother in many ways.

It may be challenging to create a standard format for your events at the outset. You can evolve it as you gain experience and learn from others. Allow flexibility within the overall design to enable teams that need to accommodate information specific to them to do so.

An event should carry just the required data to denote the occurrence of the event.

Often it takes time to decide on the content of an event. If you follow the structure shown earlier, with metadata and data sections, start with the metadata, as you may already have clarity on most of those fields.

Begin from the context of when and where the event occurred, and build from there. It’s a good practice to include a minimal set of shareable data that is just enough to understand the event as an entity.

Event producers should add a unique tracing identifier for each event.

Including a unique identifier that can travel with the event to its consumers improves your application’s tracing capabilities and observability.

Be aware of the event payload size limit and service quota.

The maximum payload size of an event in Amazon EventBridge is 256 KB (at the time of writing). In high-volume event publishing use cases, consider the limit on how many events you can send to EventBridge per second, and have measures in place to avoid losing critical events if you exceed this limit.
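A producer can guard against oversized payloads before publishing. A rough pre-flight check, sketched below; note that the actual quota counts the whole entry (detail, detail-type, source, etc.), so this is an approximation:

```python
import json

MAX_ENTRY_SIZE_BYTES = 256 * 1024  # EventBridge limit at the time of writing

def fits_event_limit(detail: dict) -> bool:
    """Approximate check that the serialized detail stays within the limit."""
    return len(json.dumps(detail).encode("utf-8")) <= MAX_ENTRY_SIZE_BYTES
```

Oversized payloads are commonly handled by storing the large object elsewhere (for example, in object storage) and publishing an event that carries only a reference to it.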

Tip

When you publish events with sensitive data, you can add a metadata attribute—say, severity—to indicate the level of severity of the risk of this data being exposed, with values like RED, AMBER, and GREEN. You can then implement logic to prevent certain subscribers from receiving high-severity events, for example.

The gatekeeper event bus pattern described in Chapter 5 can make use of the severity classification of events to consider encryption measures when sharing events outside of its domain.


Event consumers and event consumption best practices

Event consumers are applications on the receiving end. They set up subscription policies to identify the events that are of interest to them. As you get started with Amazon EventBridge, here are a few tips and best practices for event consumers to keep in mind:

Consumer applications may receive duplicate events and should be idempotent.

In event-driven computing, event delivery is in most cases guaranteed to be at least once (as opposed to exactly once or at most once). If you don't properly account for this, it can have severe consequences. Imagine your bank account getting debited twice for a purchase you made!

Building idempotency into an application that reacts upon receipt of events is the most critical measure to implement.
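One common implementation records each processed event's unique id in a durable store (such as a DynamoDB table written with a conditional put) and skips any id it has already seen. A sketch with an in-memory set standing in for the durable store:

```python
processed_ids: set[str] = set()  # stand-in for a durable store, e.g., DynamoDB

def handle_once(event: dict, handler) -> bool:
    """Invoke handler only the first time an event id is seen."""
    event_id = event["id"]
    if event_id in processed_ids:
        return False  # duplicate delivery; safely ignored
    handler(event)
    processed_ids.add(event_id)
    return True
```

In a real system the check and the record must be atomic (a conditional write), so that two concurrent deliveries of the same event cannot both pass the check.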

Storing the event data while processing it has benefits.

Depending on the event consumer’s logic, the event processing may happen in near real time or with a delay. A practice often adopted by event subscribers is to store the event data—temporarily or long-term—before acting on it. This is a form of storage-first pattern, which you will learn about in Chapter 5.

There are several benefits to this practice. Primarily, it helps to alleviate the problem of events potentially being received more than once by providing a register that can be checked for duplicates before handling an event. In addition, storing the events eases the retry burden on the consumer application; if a downstream application goes down, for example, it won’t need to request that the producer resend all of the events that application needs to process.

Ordering of events is not guaranteed.

Maintaining the order of events in a distributed event-driven architecture with multiple publishers and subscribers is hard. EventBridge does not guarantee event ordering. If the order of events is crucial, you’ll need to work with the event producers to add sequence numbering. If that’s not possible, subscribers can implement sorting based on the event creation timestamps to put them into the correct order.
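If producers include ISO 8601 creation timestamps in UTC, a subscriber can restore order with a simple sort, since such timestamps compare chronologically as plain strings (clock skew between producers still limits the precision of this approach):

```python
def in_order(events: list[dict]) -> list[dict]:
    # ISO 8601 UTC timestamps sort chronologically as plain strings
    return sorted(events, key=lambda e: e["metadata"]["created_at"])
```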

Avoid modifying events before relaying them.

There are situations where applications engage in an asynchronous chain of actions known as the event relay pattern: the service receives an event, performs an action, and emits an event for a downstream service. In such situations, the subscriber should never modify the source event and publish a modified version. It must always emit a new event with its identity as publisher and the schema it is responsible for.
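A sketch of the relay rule: the subscriber emits a brand-new event under its own identity, keeping only a reference back to the event that caused it (the field names here are illustrative):

```python
def relay(received: dict, own_service: str, new_type: str, data: dict) -> dict:
    """Emit a new event rather than forwarding a modified copy of the source."""
    return {
        "metadata": {
            "service": own_service,  # the relaying service's own identity
            "type": new_type,
            "caused_by": received["metadata"].get("trace_id"),  # provenance link
        },
        "data": data,
    }
```

The `caused_by` reference preserves traceability across the chain without the subscriber ever republishing an event it does not own.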

Collect events that failed to reach the target consumer.

In a resilient event-driven architecture, a consumer may have all the measures it needs to process an event successfully—but what happens if the event gets lost in transit and does not arrive, or if the consumer experiences an unforeseen outage?

EventBridge retries event delivery until successful for up to 24 hours. If it fails to deliver an event to the target, it can send the failed event to a dead letter queue (DLQ) for later processing. As you saw earlier, you can also use EventBridge’s archive and replay feature to reduce the risk of missing critical business events.

Note

CloudEvents is a specification for describing event data in a common way. It’s supported by the Cloud Native Computing Foundation (CNCF). Adopting CloudEvents as the standard for defining and producing your event payloads will ensure your events remain interoperable, understandable, and predictable, particularly as usage increases across the domains in your organization.

AsyncAPI is the industry standard for defining asynchronous APIs, and it can be used to describe and document message-driven APIs in a machine-readable format. Whereas the CloudEvents specification constrains the schema of your event payloads, AsyncAPI helps you to document the API for producing and consuming your events. AsyncAPI is to event-driven interfaces as OpenAPI is to RESTful APIs.


The Importance of Event Sourcing in Serverless Development

Event sourcing is a way of capturing and persisting the changes happening in a system as a sequence of events.

Figure 3-3 showed a customer account service that emits account created, account updated, and account deleted events. Traditionally, when you store and update data in a table, it records the latest state of each entity. Table 3-1 shows what this might look like for the customer account service. There’s one record (row) per customer, storing the latest information for that customer.

Table 3-1. Sample rows from the Customer Account table

Customer ID  | First name | Last name | Address                 | DOB        | Status
100-255-8730 | Joe        | Bloke     | 99, Edge Lane, London   | 1966/04/12 | ACTIVE
100-735-6729 | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | DELETED

While Table 3-1 provides an up-to-date representation of each customer’s data, it does not reveal whether customers’ addresses have changed at any point. Event sourcing helps provide a different perspective on the data by capturing and persisting the domain events as they occur. If you look at the data in Table 3-2, you’ll see that it preserves the domain events related to a customer account. This data store acts as the source for the events if you ever want to reconstruct the activities of an account.

Table 3-2. Event source data store for the customer account service

PK           | SK                       | Event ID            | First name | Last name | Address                 | DOB        | Status
100-255-8730 | 2023-04-05T08:47:30.718Z | Hru343t5-jvcj       | Joe        | Bloke     | 99, Edge Lane, London   | 1966/04/12 | UPDATED
100-735-6729 | 2023-01-15T02:37:20.545Z | lgojk834sd3-r454    | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | DELETED
100-255-8730 | 2022-10-04T09:27:20.443Z | Jsd93ebhas-xdfgns   | Joe        | Bloke     | 34, Fine Way, Leeds     | 1966/04/12 | UPDATED
100-255-8730 | 2022-06-15T18:57:43.148Z | Zxjfie294hfd-kd9e7n | Joe        | Bloke     | 15, Nice Road, Cardiff  | 1966/04/12 | CREATED
100-735-6729 | 2009-11-29T20:49:40.003Z | skdj834sd3-j3ns     | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | CREATED


Uses for event sourcing

Although early thoughts on event sourcing focused on the ability to re-create the current state of an entity, many modern implementations use event sourcing for additional purposes, including:

Re-creating user session activities in a distributed event-driven system

Many applications capture user interactions in timeboxed sessions. A session usually starts at the point of a user signing into the application and stays active until they sign out, or the session expires.

Event sourcing is valuable here to help users resume from where they left off or resolve any queries or disputes, as the system can chart each user’s journey.

Enabling audit tracing in situations where you cannot fully utilize logs

While many applications rely on accumulated, centrally stored logs to trace details of system behaviors, customer activities, financial data flows, etc., enterprises need to comply with data privacy policies that prevent them from sending sensitive data and PII to the logs. With event sourcing, as the data resides inside the guarded cloud accounts, teams can build tools to reconstruct the flows from the event store.

Performing data analysis to gain insights

Data is a key driver behind many decisions in the modern digital business world. Event sourcing enables deeper insights and analytics at a fine-grained level. For example, the event store of a holiday booking system harvests every business event from several microservices that coordinate to help customers book their vacations. Often customers will spend time browsing through several destinations, offers, and customizable options, among other things, before completing the booking or, in some cases, abandoning it. The events that occur during this process carry clues that can be used, for example, to identify popular (and unpopular) destinations, packages, and offers.

Note

Since the conception of event sourcing a couple of decades ago, the emergence of the cloud and managed services has vastly changed the volume of data captured and the available ingestion mechanisms and storage options. The data models of many (but not all) modern applications accommodate storing the change history for a certain period alongside the actual data, as per the business requirements, to enable quick tracing of all the activities.


Architectural considerations for event sourcing

At a high level, the concept of event sourcing is simple—but its implementation requires careful planning. When distributed microservices come together to perform a business function, you face the challenge of having hundreds of events of different categories and types being produced and consumed by various services. In such a situation:

  • How do you identify which events to keep in an event store?
  • How do you collect all the related events in one place?
  • Should you keep an event store per microservice, bounded context, application, domain, or enterprise?
  • How do you handle encrypted and sensitive data?
  • How long do you keep the events in an event store?

Finding and implementing the answers to these critical questions involves several teams and business stakeholders working together. Let’s take a look at some of the options.

Dedicated microservice for event sourcing

Domain events flow via one or more event buses in a distributed service environment. With a dedicated microservice for event sourcing, you separate this concern from the other services and assign it to a single-purpose microservice. This service manages the rules to ingest the required events, performs the necessary data translations, owns one or more event stores, and manages data retention and transition policies, among other tasks.

Event store per bounded context

A well-defined bounded context will benefit from having its own event store, which can be helpful for auditing purposes or for reconstructing the events that led to the current state of the application or a particular business entity. For example, in the rewards system we looked at earlier in this chapter (Figure 3-36), you might want to have an event store to keep track of rewards updates. With an extendable event-driven architecture, it’s as simple as adding another set piece microservice for event sourcing, as shown in Figure 3-42.

Figure 3-42. Adding a dedicated rewards-audit microservice for event sourcing to the rewards system

Application-level event store

Many applications you interact with daily coordinate with several distributed services. An ecommerce domain, for example, has many subdomains and bounded contexts, as you saw back in Figure 2-3 (in “Domain-first”). Each bounded context can successfully implement its own event sourcing capability, as discussed in the previous subsection, but it can only capture its part in the broader application context.

As shown in Figure 3-43, your journey as an ecommerce customer purchasing items touches several bounded contexts—product details, stock, cart, payments, rewards, etc. To reconstruct the entire journey, you need events from all these areas. To plot a customer’s end-to-end journey, you must collate the sequence of necessary events. An application-level event store is beneficial in this use case.

Figure 3-43. An ecommerce customer’s end-to-end order journey, with the different touchpoints

Centralized event sourcing cloud account

So far, you have seen single-purpose dedicated microservice, bounded context, and application-level event sourcing scenarios. A centralized event store takes things to an even more advanced level, as shown in Figure 3-44. This is an adaptation of the centralized logging pattern, where enterprises use a consolidated central cloud account to stream all the CloudWatch logs from multiple accounts from different AWS Regions. It provides a single point of access for all their critical logs, allowing them to perform security audits, compliance checks, and business analysis.

Figure 3-44. A central cloud account for event sourcing

There are, however, substantial efforts and challenges involved in setting up a central event sourcing account and related services:

  • The first challenge is agreeing upon a way of sharing events. Not all organizations have a central event bus that touches every domain. EventBridge’s cross-account, cross-region event sharing is an ideal option here.
  • Identifying and sourcing the necessary events is the next challenge. A central repository is required in order to have visibility into the details of all the event definitions. EventBridge Schema Registry is useful, but it is per AWS account, and there is no central schema registry.
  • With several event categories and types, structuring the event store and deriving the appropriate data queries and access patterns to suit the business requirements requires careful planning. You may need multiple event stores and different types of data stores—SQL, NoSQL, object, etc.—depending on the volume of events and the frequency of data access.
  • Providing access to the event stores and events is a crucial element of this setup, with consideration given to data privacy, business confidentiality, regulatory compliance, and other critical measures.

Event sourcing is an important pattern and practice for teams building serverless applications. Even if your focus is primarily on delivering the core business features (to bring value), enabling features such as event sourcing is still crucial. As mentioned earlier, not every team will need the ability to reconstruct the application’s state based on the events; however, all teams will benefit from being able to use the event store for auditing and tracing critical business flows.


EventStorming

One of the classic problems in software engineering is balancing what's in the requirements with what gets implemented and delivered. Misunderstandings of business requirements and misalignments between what the business stakeholders want and what the engineering team actually builds are common in the software industry. Applying the first principles of serverless development brings clarity to what you are building, making it easier to align with the business needs. Developing iteratively and in small increments makes it easier to correct course when things go wrong, before mistakes become expensive.

You cannot expect every serverless engineer to have participated in requirements engineering workshops and UML modeling sessions or to understand domain-driven design. Often, engineers lack a complete understanding of why they are building what they are building. EventStorming is a collaborative activity that can help alleviate this problem.

What is EventStorming?

EventStorming is a collaborative, non-technical workshop format that brings together business and technology people to discuss, ideate, brainstorm, and model a business process or analyze a problem domain. Its inventor, Alberto Brandolini, drew his inspiration from domain-driven design. EventStorming is a fast, inexpensive activity that brings many thoughts to the board as a way of unearthing the details of a business domain using simple language that everybody understands. The two key elements of EventStorming are domain experts (contributors) and domain events (outcomes). Domain experts are subject matter experts (SMEs) who act as catalysts and leading contributors to the workshop. They bring domain knowledge to the process, answer questions, and explain business activities to everyone (especially the technical members). Domain events are significant events that reflect business facts at specific points. These events are identified and captured throughout the course of the workshop.

The EventStorming process looks at the business process as a series of domain events, arranges the events over a timeline, and depicts a story from start to finish. From the thoughts gathered and domain events identified, you begin to recognize the actors, commands, external systems, and, importantly, pivotal events that signal the change of context from one part to the other and indicate the border of a bounded context.

A command is a trigger or action that emits one or more domain events. For example, the success of a redeem reward command produces a reward-redeemed domain event. You will see the domain model emerging as aggregates (clusters of domain objects) as you identify the actors, commands, and domain events. In the previous example, the reward is an aggregate that receives a command and generates a domain event.

A full explanation of how you conduct an EventStorming workshop is beyond the scope of this book, but several resources are available. In addition to the ones listed on the website, Vlad Khononov’s book Learning Domain-Driven Design (O’Reilly) has a chapter on EventStorming.


The importance of EventStorming in serverless development

EventStorming is a great way to collaborate and to learn about business requirements, identify domain events, and shape the model before considering architecture and solution design. However, depending on the scale of the domain or product, the outcome of EventStorming could be high-level.

Say your organization is transforming the IT operations of its manufacturing division. The EventStorming exercise will bring together several domain experts, business stakeholders, enterprise and solution architects, engineering leads and engineers, product managers, UX designers, QA engineers, test specialists, etc. After a few days of collaboration, you identify various business process flows, domain events, model entities, and many bounded contexts, among other things. With clarity about the entire domain, you start assigning ownership—stream-aligned teams—to the bounded contexts.

These teams then delve into each bounded context to identify web applications, microservices, APIs, events, and architectural constructs to implement. While the artifacts from the domain-level EventStorming sessions form a base, serverless teams need more granular details. Hence, it is useful in serverless development if you employ EventStorming in two stages:

Domain-level EventStorming

According to Brandolini, this is the “Big Picture” EventStorming workshop aimed at identifying the business processes, domain events, commands, actors, aggregates, etc.

Development-level EventStorming

This is a smaller-scale activity that involves an engineering team, its business stakeholders, the product manager, and UX designers. It is similar to what Brandolini calls "Design Level EventStorming."

Here, the focus is on the bounded context and developments within it. The team identifies the internal process flows, local events, and separation of functionality and responsibilities. These become the early sketches for set-piece microservices, their interfaces, and event interactions. The outcome from the development-level EventStorming feeds into the solution design process (explained in Chapter 6) as engineers start thinking about the serverless architecture.

Let’s consider an example situation for development-level EventStorming:

Context: Figure 2-3 (in “Domain-first”) shows the breakdown of an ecommerce domain. A domain-level EventStorming workshop has identified the subdomains and bounded contexts. A stream-aligned team owns the user payments bounded context.

Use case: Due to customer demand and to prevent fraud, the stakeholders want to add a new feature where customers who call the customer support center to place orders over the phone can make their payments via a secure link emailed to them rather than providing the card number over the phone.

The proposed new feature only requires part of the ecommerce team to participate in a (development-level) EventStorming session. It is a small-scale activity within a bounded context with fewer participants.

Summary

You’ve just completed one of the most crucial chapters of this book on serverless development. The architectural thoughts, best practices, and recommendations you’ve learned here are essential whether you work as an independent consultant or as part of a team in a big enterprise. Irrespective of the organization’s size, your ambition is to architect solutions to the strengths of serverless. Business requirements and problem domains can be complex and hard to comprehend, just as in many other fields and walks of life. You can observe how people successfully solve non-software problems and apply those principles in your work.

Serverless architecture need not be a complex and tangled web of lines crisscrossing your entire organization. Your vision is to architect single-purpose, loosely coupled, distributed, and event-driven microservices as set pieces that are easier to conceive, develop, operate, observe, and evolve within the serverless technology ecosystem of your organization.

You will carry the learnings from these initial chapters with you as you go through the remainder of the book. You will begin to apply the architectural lessons in Chapter 5, which will teach you some core implementation patterns in serverless development. But first, the next chapter delves into one of the fundamental and critical topics in software development: security.