An Introduction to Amazon EventBridge

Amazon EventBridge is a fully managed serverless event bus that allows you to send events from multiple event producers, apply event filtering to detect events, perform data transformation where needed, and route events to one or more target applications or services (see Figure 3-39). It’s one of the core fully managed and serverless services from AWS that plays a pivotal role in architecting and building event-driven applications. As an architect or a developer, familiarity with the features and capabilities of EventBridge is crucial. If you are already familiar with EventBridge and its capabilities, you may skip this section.

Figure 3-39. The components of Amazon EventBridge (source: adapted from an image on the Amazon EventBridge web page)

The technical ecosystem of EventBridge can be divided into two main categories. The first comprises its primary functionality, such as:

  • The interface for ingesting events from various sources (applications and services)
  • The interface for delivering events to configured target applications or services (consumers)
  • Support for multiple custom event buses as event transportation channels
  • The ability to configure rules to identify events and route them to one or more targets

The second consists of features that are auxiliary (but still important), including:

  • Support for archiving and replaying events
  • The event schema registry
  • EventBridge Scheduler for scheduling tasks on a one-time or recurring basis
  • EventBridge Pipes for one-to-one event transport needs

Let’s take a look at some of these items, to give you an idea of how to get started with EventBridge.

Event buses in Amazon EventBridge

Every event sent to EventBridge is associated with an event bus. If you consider EventBridge as the overall event router ecosystem, then event buses are individual channels of event flow. Event producers choose which bus to send the events to, and you configure event routing on each bus.

The EventBridge service in every AWS account has a default event bus. AWS uses the default bus for all events from several of its services.

You can also create one or more custom event buses for your needs. In addition, to receive events from EventBridge SaaS partners, you can configure a partner event source, which delivers events to a partner event bus.
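
As a concrete illustration, here is a minimal sketch, using Python and boto3, of creating a custom event bus and publishing an event to it. The bus name and event content are assumptions for the example, and buses are typically created via infrastructure as code rather than at runtime.

import json

import boto3

events = boto3.client("events")

# One-time setup: create a custom event bus (usually done via IaC).
events.create_event_bus(Name="ecommerce-events")

# Publish an event to the custom bus.
events.put_events(
    Entries=[
        {
            "EventBusName": "ecommerce-events",
            "Source": "service-payments",
            "DetailType": "payment_received",
            "Detail": json.dumps(
                {
                    "metadata": {
                        "domain": "ecommerce",
                        "service": "service-payments",
                        "type": "payment_received",
                    },
                    "data": {"payment_type": "creditcard"},
                }
            ),
        }
    ]
)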

Event routing rules

The rules you create in EventBridge are the logic behind the filtering and routing of events that you associate with an event bus. These rules are effectively part of your application logic, and are designed, documented, deployed, and tested as such. A rule comprises three parts: the event filter pattern, event data transformation, and the target(s).

To filter an event in and send it to a target, you configure an event pattern as your filter condition. The sample pattern in Example 3-2 will match events like the one in Example 3-1 based on the domain, service, type, and payment_type attribute values.

Example 3-2. An example event filter pattern
{
  "detail": {
    "metadata": {
      "domain": ["ecommerce"],
      "service": ["service-payments"],
      "type": ["payment_received"]
    },
    "data": {
      "payment_type": ["creditcard"]
    }
  }
}

As part of each rule, you can perform simple data transformations. At the time of writing, for each rule you can add up to five targets to send matching events to.
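
To illustrate, the sketch below creates a rule with the pattern from Example 3-2 and attaches a single SQS queue target, applying a simple input transformation along the way. The rule name, queue ARN, and transformed event shape are assumptions for the example.

import json

import boto3

events = boto3.client("events")

# Create (or update) a rule on the custom bus using the Example 3-2 pattern.
events.put_rule(
    Name="card-payments-received",
    EventBusName="ecommerce-events",
    EventPattern=json.dumps(
        {
            "detail": {
                "metadata": {
                    "domain": ["ecommerce"],
                    "service": ["service-payments"],
                    "type": ["payment_received"],
                },
                "data": {"payment_type": ["creditcard"]},
            }
        }
    ),
)

# Attach a target (up to five per rule), reshaping the event before delivery.
events.put_targets(
    Rule="card-payments-received",
    EventBusName="ecommerce-events",
    Targets=[
        {
            "Id": "payments-queue",
            "Arn": "arn:aws:sqs:eu-west-1:123456789012:payments-queue",
            "InputTransformer": {
                "InputPathsMap": {"paymentType": "$.detail.data.payment_type"},
                "InputTemplate": '{"payment": <paymentType>}',
            },
        }
    ],
)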

An important fact to keep in mind is that EventBridge guarantees at-least-once delivery of events to targets. This means a target may receive an event more than once (i.e., it may receive duplicate events). You will learn how to handle this situation later in the chapter.
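
As a preview, a common defense is to make each target idempotent, for example by recording every event's unique id attribute with a conditional write and skipping IDs that have already been seen. The sketch below assumes a hypothetical DynamoDB table named processed-events and a process function that holds your business logic.

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("processed-events")

def process(event: dict) -> None:
    ...  # Your business logic goes here.

def handle_event(event: dict) -> None:
    try:
        # Record the event ID; this write fails if the ID was seen before.
        table.put_item(
            Item={"pk": event["id"]},
            ConditionExpression="attribute_not_exists(pk)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return  # Duplicate delivery; safely ignore.
        raise
    process(event)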

Event archiving and replay

In EventBridge, you can store events in one or more archives. The events you archive depend on the event filter pattern. For example, you could create an archive to store all the events that match the pattern shown in Example 3-2.

You can create multiple archives to cater to your needs. Then, based on your business requirements, you can identify the events within your bounded context that need archiving and send them to the appropriate archives using different filter conditions. Unless there is a specific requirement to archive all the events, keep your archives as lean as possible as a best practice. Figure 3-40 compares the different approaches.

To replay events from an archive, you specify the archive name and the time window. EventBridge reads the events from the archive and puts them back onto the event bus they were originally sent to. To differentiate a replayed event from the original, EventBridge adds a replay-name attribute.
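
The sketch below shows what creating an archive and replaying a time window from it might look like; the archive name, retention period, time window, and ARNs are placeholders.

import json
from datetime import datetime, timezone

import boto3

events = boto3.client("events")

BUS_ARN = "arn:aws:events:eu-west-1:123456789012:event-bus/ecommerce-events"

# Archive only the events matching a filter pattern, for 90 days.
events.create_archive(
    ArchiveName="card-payments-archive",
    EventSourceArn=BUS_ARN,
    EventPattern=json.dumps({"detail": {"data": {"payment_type": ["creditcard"]}}}),
    RetentionDays=90,
)

# Replay a time window of archived events back onto the source bus.
events.start_replay(
    ReplayName="card-payments-replay-april",
    EventSourceArn="arn:aws:events:eu-west-1:123456789012:archive/card-payments-archive",
    EventStartTime=datetime(2023, 4, 1, tzinfo=timezone.utc),
    EventEndTime=datetime(2023, 4, 30, tzinfo=timezone.utc),
    Destination={"Arn": BUS_ARN},
)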

Figure 3-40. Different event archiving approaches, from least to most favored

The Importance of Event Sourcing in Serverless Development

Event sourcing is a way of capturing and persisting the changes happening in a system as a sequence of events.

Figure 3-3 showed a customer account service that emits account created, account updated, and account deleted events. Traditionally, when you store and update data in a table, it records the latest state of each entity. Table 3-1 shows what this might look like for the customer account service. There’s one record (row) per customer, storing the latest information for that customer.

Table 3-1. Sample rows from the Customer Account table

Customer ID  | First name | Last name | Address                 | DOB        | Status
100-255-8730 | Joe        | Bloke     | 99, Edge Lane, London   | 1966/04/12 | ACTIVE
100-735-6729 | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | DELETED

While Table 3-1 provides an up-to-date representation of each customer’s data, it does not reveal whether customers’ addresses have changed at any point. Event sourcing helps provide a different perspective on the data by capturing and persisting the domain events as they occur. If you look at the data in Table 3-2, you’ll see that it preserves the domain events related to a customer account. This data store acts as the source for the events if you ever want to reconstruct the activities of an account.

Table 3-2. Event source data store for the customer account service

PK           | SK                       | Event ID            | First name | Last name | Address                 | DOB        | Status
100-255-8730 | 2023-04-05T08:47:30.718Z | Hru343t5-jvcj       | Joe        | Bloke     | 99, Edge Lane, London   | 1966/04/12 | UPDATED
100-735-6729 | 2023-01-15T02:37:20.545Z | lgojk834sd3-r454    | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | DELETED
100-255-8730 | 2022-10-04T09:27:20.443Z | Jsd93ebhas-xdfgns   | Joe        | Bloke     | 34, Fine Way, Leeds     | 1966/04/12 | UPDATED
100-255-8730 | 2022-06-15T18:57:43.148Z | Zxjfie294hfd-kd9e7n | Joe        | Bloke     | 15, Nice Road, Cardiff  | 1966/04/12 | CREATED
100-735-6729 | 2009-11-29T20:49:40.003Z | skdj834sd3-j3ns     | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | CREATED
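
In a DynamoDB event store shaped like Table 3-2, persisting a domain event is a single write, with the customer ID as the partition key and the event timestamp as the sort key. A minimal sketch, assuming a table named customer-account-events:

import boto3

table = boto3.resource("dynamodb").Table("customer-account-events")

# Append a domain event; existing events are never updated in place.
table.put_item(
    Item={
        "PK": "100-255-8730",              # Customer ID (partition key)
        "SK": "2023-04-05T08:47:30.718Z",  # Event timestamp (sort key)
        "EventID": "Hru343t5-jvcj",
        "FirstName": "Joe",
        "LastName": "Bloke",
        "Address": "99, Edge Lane, London",
        "DOB": "1966/04/12",
        "Status": "UPDATED",
    }
)

Querying the partition key then returns the full, time-ordered event history for that customer.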

Uses for event sourcing

Although early thoughts on event sourcing focused on the ability to re-create the current state of an entity, many modern implementations use event sourcing for additional purposes, including:

Re-creating user session activities in a distributed event-driven system

Many applications capture user interactions in timeboxed sessions. A session usually starts at the point of a user signing into the application and stays active until they sign out, or the session expires.

Event sourcing is valuable here to help users resume from where they left off or resolve any queries or disputes, as the system can chart each user’s journey.

Enabling audit tracing in situations where you cannot fully utilize logs

While many applications rely on accumulated, centrally stored logs to trace details of system behaviors, customer activities, financial data flows, etc., enterprises need to comply with data privacy policies that prevent them from sending sensitive data and PII to the logs. With event sourcing, as the data resides inside the guarded cloud accounts, teams can build tools to reconstruct the flows from the event store.

Performing data analysis to gain insights

Data is a key driver behind many decisions in the modern digital business world. Event sourcing enables deeper insights and analytics at a fine-grained level. For example, the event store of a holiday booking system harvests every business event from several microservices that coordinate to help customers book their vacations. Often customers will spend time browsing through several destinations, offers, and customizable options, among other things, before completing the booking or, in some cases, abandoning it. The events that occur during this process carry clues that can be used, for example, to identify popular (and unpopular) destinations, packages, and offers.

Note

Since event sourcing was conceived a couple of decades ago, the emergence of the cloud and managed services has vastly changed the volume of data captured and the available ingestion mechanisms and storage options. The data models of many (but not all) modern applications accommodate storing the change history for a certain period alongside the actual data, as per the business requirements, to enable quickly tracing all the activities.

Architectural considerations for event sourcing

At a high level, the concept of event sourcing is simple—but its implementation requires careful planning. When distributed microservices come together to perform a business function, you face the challenge of having hundreds of events of different categories and types being produced and consumed by various services. In such a situation:

  • How do you identify which events to keep in an event store?
  • How do you collect all the related events in one place?
  • Should you keep an event store per microservice, bounded context, application, domain, or enterprise?
  • How do you handle encrypted and sensitive data?
  • How long do you keep the events in an event store?

Finding and implementing the answers to these critical questions involves several teams and business stakeholders working together. Let’s take a look at some of the options.

Dedicated microservice for event sourcing

Domain events flow via one or more event buses in a distributed service environment. With a dedicated microservice for event sourcing, you separate this concern from the other services and assign it to a single-purpose microservice. This service manages the rules to ingest the required events, performs any necessary data translations, owns one or more event stores, and manages data retention and transition policies, among other tasks.

Event store per bounded context

A well-defined bounded context will benefit from having its own event store, which can be helpful for auditing purposes or for reconstructing the events that led to the current state of the application or a particular business entity. For example, in the rewards system we looked at earlier in this chapter (Figure 3-36), you might want to have an event store to keep track of rewards updates. With an extendable event-driven architecture, it’s as simple as adding another set piece microservice for event sourcing, as shown in Figure 3-42.

Figure 3-42. Adding a dedicated rewards-audit microservice for event sourcing to the rewards system

Application-level event store

Many applications you interact with daily coordinate with several distributed services. An ecommerce domain, for example, has many subdomains and bounded contexts, as you saw back in Figure 2-3 (in “Domain-first”). Each bounded context can successfully implement its own event sourcing capability, as discussed in the previous subsection, but it can only capture its part in the broader application context.

As shown in Figure 3-43, your journey as an ecommerce customer purchasing items touches several bounded contexts: product details, stock, cart, payments, rewards, etc. To reconstruct a customer's end-to-end journey, you must collate the necessary events from all of these areas. An application-level event store is beneficial in this use case.

Figure 3-43. An ecommerce customer’s end-to-end order journey, with the different touchpoints

Centralized event sourcing cloud account

So far, you have seen single-purpose dedicated microservice, bounded context, and application-level event sourcing scenarios. A centralized event store takes things to an even more advanced level, as shown in Figure 3-44. This is an adaptation of the centralized logging pattern, where enterprises use a consolidated central cloud account to stream all the CloudWatch logs from multiple accounts across different AWS Regions. It provides a single point of access for all their critical logs, allowing them to perform security audits, compliance checks, and business analysis.

Figure 3-44. A central cloud account for event sourcing

There are, however, substantial efforts and challenges involved in setting up a central event sourcing account and related services:

  • The first challenge is agreeing upon a way of sharing events. Not all organizations have a central event bus that touches every domain. EventBridge's cross-account, cross-Region event sharing is an ideal option here (see the sketch after this list).
  • Identifying and sourcing the necessary events is the next challenge. A central repository is required in order to have visibility into the details of all the event definitions. EventBridge Schema Registry is useful, but it is per AWS account, and there is no central schema registry.
  • With several event categories and types, structuring the event store and deriving the appropriate data queries and access patterns to suit the business requirements requires careful planning. You may need multiple event stores and different types of data stores—SQL, NoSQL, object, etc.—depending on the volume of events and the frequency of data access.
  • Providing access to the event stores and events is a crucial element of this setup, with consideration given to data privacy, business confidentiality, regulatory compliance, and other critical measures.
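
As a sketch of the first point above, a rule in a source account can target the central account's event bus directly. The rule is assumed to exist already, the ARNs are placeholders, and the central bus must also grant events:PutEvents to the source account via its resource policy.

import boto3

events = boto3.client("events")

# Forward matched events to the central account's event bus.
events.put_targets(
    Rule="forward-to-central-event-store",
    EventBusName="ecommerce-events",
    Targets=[
        {
            "Id": "central-event-sourcing-bus",
            "Arn": "arn:aws:events:eu-west-1:999999999999:event-bus/central-events",
            # Role in this account permitted to put events on the central bus.
            "RoleArn": "arn:aws:iam::123456789012:role/central-bus-put-events",
        }
    ],
)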

Event sourcing is an important pattern and practice for teams building serverless applications. Even if your focus is primarily on delivering the core business features (to bring value), enabling features such as event sourcing is still crucial. As mentioned earlier, not every team will need the ability to reconstruct the application’s state based on the events; however, all teams will benefit from being able to use the event store for auditing and tracing critical business flows.

Getting Started

Establishing a solid foundation for your serverless security practice is pivotal. Security can, and must, be a primary concern. And it is never too late to establish this foundation.

As previously alluded to, security must be a clearly defined process. It is not a case of completing a checklist, deploying a tool, or deferring to other teams. Security should be part of the design, development, testing, and operation of every part of your system.

Working within sound security frameworks that fit well with serverless and adopting sensible engineering habits, combined with all the support and expertise of your cloud provider, will go a long way toward ensuring your applications remain secure.

When applied to serverless software, two modern security trends can provide a solid foundation for securing your application: zero trust and the principle of least privilege. The next section examines these concepts.

Once you have established a zero trust, least privilege security framework, the next step is to identify the attack surface of your applications and the security threats that they are vulnerable to. Subsequent sections examine the most common serverless threats and the threat modeling process.

Optimism Is Greater than Pessimism

The Optimism Otter says: “People in our organisation need to move fast to meet the needs of our customers. The job of security is to help them move fast AND stay secure.”

Serverless enables rapid development; security specialists should support this pace, making it safer and more sustainable, and, above all, not slow it down.

Software engineers should delegate to security professionals whenever there is a clear need, whether for knowledge or for specialist services such as penetration testing and vulnerability scanning.

Combining the Zero Trust Security Model with Least Privilege Permissions

There are two modern cybersecurity principles that you can leverage as the cornerstones of your serverless security strategy: zero trust architecture and the principle of least privilege.

Zero trust architecture

The basic premise of zero trust security is to assume every connection to your system is a threat. Every single interface should then be protected by a layer of authentication (who are you?) and authorization (what do you want?). This applies both to public API endpoints (the perimeter in the traditional castle-and-moat model) and to private, internal interfaces, such as Lambda functions or DynamoDB tables. Zero trust controls access to each distinct resource in your application, whereas a castle-and-moat model only controls access to the resources at the perimeter of your application.

Imagine a knight errant galloping up to the castle walls, presenting likely-looking credentials to the guards and persuading them of their honorable intentions before confidently entering the castle across the lowered drawbridge. If these perimeter guards form the extent of the castle’s security, the knight is now free to roam the rooms, dungeons, and jewel store, collecting sensitive information for future raids or stealing valuable assets on the spot. If, however, each door or walkway had additional suspicious guards or sophisticated security controls that assumed zero trust by default, the knight would be entirely restricted and might even be deterred from infiltrating this castle at all.

Another scenario to keep in mind is a castle that cuts a single key for every heavy-duty door: should the knight gain access to one copy of this key, they’ll be able to open all the doors, no matter how thick or cumbersome. With zero trust, there’s a unique key for every door. Figure 4-2 shows how the castle-and-moat model compares to a zero trust architecture.

Figure 4-2. Castle-and-moat perimeter security compared to zero trust architecture

There are various applications of zero trust architecture, such as remote computing and enterprise network security. The next section briefly discusses how the zero trust model can be interpreted and applied to serverless applications.

Lambda execution roles

A key use of IAM roles in serverless applications is Lambda function execution roles. An execution role is attached to a Lambda function and grants the function the permissions necessary to execute correctly, including access to any other AWS resources that are required. For example, if the Lambda function uses the AWS SDK to make a DynamoDB request that inserts a record in a table, the execution role must include a policy with the dynamodb:PutItem action for the table resource.

The execution role is assumed by the Lambda service when performing control plane and data plane operations. The AWS Security Token Service (STS) is used to fetch short-lived, temporary security credentials which are made available via the function’s environment variables during invocation.

Each function in your application should have its own unique execution role with the minimum permissions required to perform its duty. In this way, single-purpose functions (introduced in Chapter 6) are also key to security: IAM permissions can be tightly scoped to the function and remain extremely restricted according to the limited functionality.
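
As an illustration, here is a sketch of attaching a least-privilege inline policy to a function's execution role, scoped to the single dynamodb:PutItem action on one table. The role, policy, and table names and the account details are hypothetical, and the role itself is assumed to exist.

import json

import boto3

iam = boto3.client("iam")

# Allow exactly one action on exactly one resource; no wildcards.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "dynamodb:PutItem",
            "Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/orders",
        }
    ],
}

# Attach the policy inline to the function's execution role.
iam.put_role_policy(
    RoleName="create-order-function-role",
    PolicyName="orders-table-put-item",
    PolicyDocument=json.dumps(policy),
)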

IAM guardrails

As you are no doubt beginning to notice, effective serverless security in the cloud is about basic security hygiene. Establishing guardrails for the use of AWS IAM is a core part of promoting a secure approach to everyday engineering activity. Here are some recommended guardrails:

Apply the principle of least privilege in policies.

IAM policies should only include the minimum set of permissions required for the associated resource to perform the necessary control or data plane operations. As a general rule, do not use wildcards (*) in your policy statements. Wildcards are the antithesis of least privilege, as they apply blanket permissions for actions and resources. Unless the action explicitly requires a wildcard, always be specific.

Avoid using managed IAM policies.

These are policies provided by AWS, and they’re often tempting shortcuts, especially when you’re just getting started or using a service for the first time. You can use these policies early in prototyping or development, but you should replace them with custom policies as soon as you understand the integration better. Because these policies are designed to be applied to generic scenarios, they are simply not restricted enough and will usually violate the principle of least privilege when applied to interactions within your application.

Prefer roles to users.

IAM users are issued with static, long-lived AWS access credentials (an access key ID and secret access key). These credentials can be used to directly access the application provider’s AWS account, including all the resources and data in that account. Depending on the associated IAM roles and policies, the authenticating user may even have the ability to create or destroy resources. Given the power they grant the holder, the use and distribution of static credentials must be limited to reduce the risk of unauthorized access. Where possible, restrict IAM users to an absolute minimum (or, even better, do not have any IAM users at all).

Prefer a role per resource.

Each resource in your application, such as an EventBridge rule, a Lambda function, and an SQS queue, should have its own unique role. Permissions for those roles should be fine-grained and least-privileged.

The AWS Shared Responsibility Model

AWS uses a shared responsibility model to delineate the security responsibilities of the cloud provider and the consumers of its services (see Figure 4-3). The important thing here is the shift in security responsibility to AWS when using cloud services. This shift increases when using fully managed serverless services, such as compute with AWS Lambda: AWS manages patching of the Lambda runtime, function execution isolation, and so on.

Serverless applications are made up of business logic, infrastructure definitions, and managed services. Ownership of these elements is split between AWS and the consumers of its public cloud services. As a serverless application engineer and AWS customer, you are responsible for the security of:

  • Your function code and third-party libraries used in that code
  • Configuration of the AWS resources used in your application
  • The IAM roles and policies governing access control to the resources and functions in your application

Figure 4-3. The cloud security shared responsibility model: you are responsible for security in the cloud, and AWS is responsible for security of the cloud

Think Like a Hacker

With your foundational zero trust, least privilege security strategy and a clear delineation of responsibility in place, the next step is to identify the potential attack vectors in your application and be aware of the possible threats to the security and integrity of your systems.

When you imagine the threats to your systems, you may picture bad actors who are external to your organization: hackers. While external threats certainly exist, they must not overshadow internal threats, which must also be guarded against. Internal threats could, of course, be deliberately malicious, but the more likely scenario is that vulnerabilities are introduced unintentionally. The engineers of an application can often be the architects of their own security flaws and data exposures, typically through weak or missing security configuration of cloud resources.

The popular depiction of a hacker performing an obvious denial of service attack on a web application or infiltrating a server firewall is still a very real possibility, but subtler attacks on the software supply chain are now just as likely. These insidious attacks involve embedding malicious code in third-party libraries and automating exploits remotely once the code is deployed in production workloads.

It is essential to adopt the mindset of a hacker and fully understand the potential threats to your serverless applications in order to properly defend against them.

Think before you install

You can start securing the serverless supply chain by scrutinizing packages before installing them. This is a simple suggestion that can make a real difference to securing your application’s supply chain, and to general maintenance at scale.

Use as few dependencies as possible, and be aware of dependencies that obfuscate the data and control flow of your app, such as middleware libraries. If it is a trivial task, try to do it yourself. It's also about trust: Do you trust the package? Do you trust the contributors?

Before you install the next package in your serverless application, adopt the following practices:

Analyze the GitHub repository.

Review the contributors to the package: more contributors means more scrutiny and collaboration. Check whether the repository uses verified commits. Assess the history of the package: How old is it? How many commits have been made? Analyze the repository activity to understand whether the package is actively maintained and used by the community. GitHub stars provide a crude indicator of popularity, and signals like the date of the most recent commit and the number of open issues and pull requests indicate usage. Also ensure the package's license adheres to any restrictions in place in your organization.

Use official package repositories.

Only obtain packages from official sources, such as NPM, PyPI, Maven, NuGet, or RubyGems, over secure (i.e., HTTPS) links. Prefer signed packages that can be verified for integrity and authenticity. For example, the JavaScript package manager NPM allows you to audit package signatures.

Review the dependency tree.

Be aware of the package’s dependencies and the entire dependency tree. Pick packages with zero runtime dependencies where available.

Try before you buy.

Try new packages on as isolated a scale as possible and delay rollout across the codebase for as long as possible, until you feel confident.

Check if you can do it yourself.

Don’t reinvent the wheel for the sake of it, but one very simple way of removing opaque third-party code is to not introduce it in the first place. Examine the source code to understand if the package is doing something simple that is easily portable to a first-party utility. Logging libraries are a perfect example: you can trivially implement your own logger rather than distributing a third-party library across your codebase.
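
For instance, a structured JSON logger can be a handful of first-party lines. This Python sketch is purely illustrative of the idea, not a prescribed implementation:

import json
import sys
from datetime import datetime, timezone

def log(level: str, message: str, **fields) -> None:
    # Emit one JSON object per line, ready for log aggregation.
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        **fields,
    }
    print(json.dumps(entry), file=sys.stdout)

log("INFO", "payment received", payment_type="creditcard")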

Make it easy to back out.

Development patterns like service isolation, single-responsibility Lambda functions, and limiting shared code (see Chapter 6 for more information on these patterns) make it easier to evolve your architecture and avoid pervasive antipatterns or vulnerable software taking over your codebase.

Lock to the latest.

Always use the latest version of the package, and always use an explicit version rather than a range or “latest” flag.

Uninstall any unused packages.

Always uninstall and clear unused packages from your dependencies manifest. Most modern compilers and bundlers will only include dependencies that are actually consumed by your code, but keeping your manifest clean adds extra safety and clarity.