An Introduction to Amazon EventBridge – Software Architecture for Building Serverless Microservices

An Introduction to Amazon EventBridge

Amazon EventBridge is a fully managed serverless event bus that allows you to send events from multiple event producers, apply event filtering to detect events, perform data transformation where needed, and route events to one or more target applications or services (see Figure 3-39). It’s one of the core fully managed and serverless services from AWS that plays a pivotal role in architecting and building event-driven applications. As an architect or a developer, familiarity with the features and capabilities of EventBridge is crucial. If you are already familiar with EventBridge and its capabilities, you may skip this section.

Figure 3-39. The components of Amazon EventBridge (source: adapted from an image on the Amazon EventBridge web page)

The technical ecosystem of EventBridge can be divided into two main categories. The first comprises its primary functionality, such as:

  • The interface for ingesting events from various sources (applications and services)
  • The interface for delivering events to configured target applications or services (consumers)
  • Support for multiple custom event buses as event transportation channels
  • The ability to configure rules to identify events and route them to one or more targets
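To make the rule-matching in the last item concrete, here is a minimal sketch (plain Python, not an AWS library) of how an event pattern selects events. Real EventBridge patterns also support prefix, numeric, and other matchers; this covers only the basic case where each field in the pattern lists its accepted values.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True if the event satisfies the rule's event pattern.

    Simplified EventBridge semantics: every field named in the pattern
    must be present in the event, and its value must be one of the
    accepted values listed in the pattern.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern, e.g. matching inside the "detail" object
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True

# A rule that routes order-created events from a hypothetical "orders" source
rule_pattern = {"source": ["orders"], "detail-type": ["OrderCreated"]}
event = {"source": "orders", "detail-type": "OrderCreated",
         "detail": {"orderId": "123"}}
```

An event from a different producer (say, `"source": "payments"`) would simply fail the match and not be routed by this rule.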

The second consists of features that are auxiliary (but still important), including:

  • Support for archiving and replaying events
  • The event schema registry
  • EventBridge Scheduler for scheduling tasks on a one-time or recurring basis
  • EventBridge Pipes for one-to-one event transport needs

Let’s take a look at some of these items, to give you an idea of how to get started with EventBridge.

Event buses in Amazon EventBridge

Every event sent to EventBridge is associated with an event bus. If you consider EventBridge as the overall event router ecosystem, then event buses are individual channels of event flow. Event producers choose which bus to send the events to, and you configure event routing on each bus.

The EventBridge service in every AWS account has a default event bus. AWS uses the default bus for all events from several of its services.

You can also create one or more custom event buses for your needs. In addition, to receive events from supported SaaS partner applications, you can configure a partner event source that sends events to a partner event bus.
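For illustration, here is a sketch of how a producer might construct an entry for the PutEvents API call that sends an event to a custom bus. The source, detail, and bus names are hypothetical; in a real application the entry would be passed to the EventBridge API, for example via boto3's `put_events`.

```python
import json

def make_put_events_entry(source, detail_type, detail, bus_name):
    """Build one entry for the EventBridge PutEvents API call."""
    return {
        "Source": source,              # identifies the event producer
        "DetailType": detail_type,     # describes the kind of event
        "Detail": json.dumps(detail),  # event payload, serialized as JSON
        "EventBusName": bus_name,      # the bus the event is sent to
    }

entry = make_put_events_entry(
    "com.example.orders",              # hypothetical producer
    "OrderCreated",
    {"orderId": "123", "total": 42},
    "orders-bus",                      # hypothetical custom bus
)
# In a real application, you would then send it with something like:
#   boto3.client("events").put_events(Entries=[entry])
```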


Event schema registry

Every event has a structure, defined by a schema. EventBridge provides the schema for all the AWS service events, and it can infer the schemas of any other events sent to an event bus. In addition, you can create or upload custom schemas for your events.

Schema registries are holding places or containers for schemas. As well as the default registries for built-in schemas, discovered schemas, and all schemas, you can create your own registries to provide groupings for your schemas.

Tip

EventBridge provides code bindings for schemas, which you can use to validate an event against its schema. This is useful to protect against introducing any breaking changes that might affect the downstream event consumers.
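As a rough illustration of what schema-based validation protects against: the generated code bindings give you typed classes, but the idea can be sketched with a hand-rolled check (the schema and event shown here are hypothetical, not EventBridge artifacts).

```python
# A minimal schema description: required fields and their expected types.
ORDER_CREATED_SCHEMA = {
    "source": str,
    "detail-type": str,
    "detail": dict,
}

def validate(event: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    for field, expected_type in schema.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems

ok = validate({"source": "orders", "detail-type": "OrderCreated",
               "detail": {"orderId": "1"}}, ORDER_CREATED_SCHEMA)
bad = validate({"source": "orders"}, ORDER_CREATED_SCHEMA)
```

Running such a check in your pipeline (or using the real code bindings) surfaces breaking schema changes before downstream consumers see them.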

EventBridge Scheduler

EventBridge Scheduler is a way to configure tasks to be invoked asynchronously, on a schedule, from a central location. It is fully managed and serverless, and it can schedule millions of tasks for either one-time or recurring invocation. The schedules you configure become part of your application's architecture.

The EventBridge Scheduler can invoke more than 270 AWS services; it has a built-in retry mechanism and a flexible invocation time window.
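EventBridge Scheduler accepts one-time `at(...)` and recurring `rate(...)` or `cron(...)` schedule expressions. The helpers below are a small illustrative sketch of building these strings; they are not part of any AWS SDK.

```python
from datetime import datetime

def one_time(when: datetime) -> str:
    """One-time schedule expression, e.g. at(2024-01-01T09:00:00)."""
    return f"at({when.strftime('%Y-%m-%dT%H:%M:%S')})"

def recurring(value: int, unit: str) -> str:
    """Recurring schedule expression, e.g. rate(5 minutes)."""
    return f"rate({value} {unit})"

expr = one_time(datetime(2024, 1, 1, 9, 0, 0))
rate = recurring(5, "minutes")
# These strings go into the ScheduleExpression parameter when creating
# a schedule (e.g., via boto3's scheduler.create_schedule).
```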

EventBridge Pipes

Earlier, we discussed using EventBridge routing rules to filter events and send them to multiple targets. EventBridge Pipes, on the other hand, builds a one-to-one integration pipeline between an event publisher and a subscriber. Within a pipe, you have the option to perform event filtering, data transformation, and data enrichment (see Figure 3-41). This is quite a powerful feature, and it reduces the need for writing custom code in many use cases.

Figure 3-41. A representation of EventBridge Pipes integration between an event source and its target (source: adapted from an image on the Amazon EventBridge Pipes web page)
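The filter, transform, and enrich stages of a pipe can be sketched as composed functions. This is purely illustrative plain Python; in a real pipe these stages are configured on the AWS side rather than coded by you.

```python
def pipe(events, *, filter_fn, transform_fn, enrich_fn):
    """Illustrative one-to-one pipe: filter, then transform, then enrich."""
    for event in events:
        if not filter_fn(event):
            continue  # drop events the filter rejects
        yield enrich_fn(transform_fn(event))

events = [
    {"type": "OrderCreated", "orderId": "1"},
    {"type": "Heartbeat"},  # filtered out, never reaches the target
]
out = list(pipe(
    events,
    filter_fn=lambda e: e["type"] == "OrderCreated",
    transform_fn=lambda e: {"id": e["orderId"]},        # reshape the payload
    enrich_fn=lambda e: {**e, "channel": "phone"},      # add looked-up data
))
```

Only the matching event flows through, reshaped and enriched, which is exactly the custom glue code a pipe saves you from writing.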


The Importance of Event Sourcing in Serverless Development

Event sourcing is a way of capturing and persisting the changes happening in a system as a sequence of events.

Figure 3-3 showed a customer account service that emits account created, account updated, and account deleted events. Traditionally, when you store and update data in a table, it records the latest state of each entity. Table 3-1 shows what this might look like for the customer account service. There’s one record (row) per customer, storing the latest information for that customer.

Table 3-1. Sample rows from the Customer Account table

| Customer ID  | First name | Last name | Address                 | DOB        | Status  |
|--------------|------------|-----------|-------------------------|------------|---------|
| 100-255-8730 | Joe        | Bloke     | 99, Edge Lane, London   | 1966/04/12 | ACTIVE  |
| 100-735-6729 | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | DELETED |

While Table 3-1 provides an up-to-date representation of each customer’s data, it does not reveal whether customers’ addresses have changed at any point. Event sourcing helps provide a different perspective on the data by capturing and persisting the domain events as they occur. If you look at the data in Table 3-2, you’ll see that it preserves the domain events related to a customer account. This data store acts as the source for the events if you ever want to reconstruct the activities of an account.

Table 3-2. Event source data store for the customer account service

| PK           | SK                       | Event ID            | First name | Last name | Address                 | DOB        | Status  |
|--------------|--------------------------|---------------------|------------|-----------|-------------------------|------------|---------|
| 100-255-8730 | 2023-04-05T08:47:30.718Z | Hru343t5-jvcj       | Joe        | Bloke     | 99, Edge Lane, London   | 1966/04/12 | UPDATED |
| 100-735-6729 | 2023-01-15T02:37:20.545Z | lgojk834sd3-r454    | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | DELETED |
| 100-255-8730 | 2022-10-04T09:27:20.443Z | Jsd93ebhas-xdfgns   | Joe        | Bloke     | 34, Fine Way, Leeds     | 1966/04/12 | UPDATED |
| 100-255-8730 | 2022-06-15T18:57:43.148Z | Zxjfie294hfd-kd9e7n | Joe        | Bloke     | 15, Nice Road, Cardiff  | 1966/04/12 | CREATED |
| 100-735-6729 | 2009-11-29T20:49:40.003Z | skdj834sd3-j3ns     | Biz        | Raj       | 12A, Top Street, Mumbai | 1995/06/15 | CREATED |
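The essence of event sourcing, reconstructing current state by replaying the stored events in order, can be sketched as follows. The rows are abbreviated versions of those in Table 3-2.

```python
def replay(events):
    """Reconstruct the latest state of each customer by replaying the
    event stream in timestamp (SK) order."""
    state = {}
    for event in sorted(events, key=lambda e: e["SK"]):
        if event["Status"] == "DELETED":
            state.pop(event["PK"], None)
        else:  # CREATED and UPDATED events carry the full snapshot
            state[event["PK"]] = {"Address": event["Address"]}
    return state

events = [  # abbreviated rows in the spirit of Table 3-2
    {"PK": "100-255-8730", "SK": "2022-06-15T18:57:43Z",
     "Address": "15, Nice Road, Cardiff", "Status": "CREATED"},
    {"PK": "100-255-8730", "SK": "2022-10-04T09:27:20Z",
     "Address": "34, Fine Way, Leeds", "Status": "UPDATED"},
    {"PK": "100-255-8730", "SK": "2023-04-05T08:47:30Z",
     "Address": "99, Edge Lane, London", "Status": "UPDATED"},
]
current = replay(events)  # the latest address wins
```

Replaying up to an earlier timestamp instead would reconstruct the account's state at that point in time, which is precisely what the flat table in Table 3-1 cannot do.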


EventStorming

One of the classic problems in software engineering is the gap between what the requirements specify and what actually gets implemented and delivered. Misunderstandings of business requirements and misalignments between what the business stakeholders want and what the engineering team actually builds are common in the software industry. Applying the first principles of serverless development brings clarity to what you are building, making it easier to align with the business needs. Developing iteratively and in small increments makes it easier to correct course when things go wrong, before mistakes become entrenched and expensive.

You cannot expect every serverless engineer to have participated in requirements engineering workshops and UML modeling sessions or to understand domain-driven design. Often, engineers lack a complete understanding of why they are building what they are building. EventStorming is a collaborative activity that can help alleviate this problem.

What is EventStorming?

EventStorming is a collaborative, non-technical workshop format that brings together business and technology people to discuss, ideate, brainstorm, and model a business process or analyze a problem domain. Its inventor, Alberto Brandolini, drew his inspiration from domain-driven design. EventStorming is a fast, inexpensive activity that brings many thoughts to the board as a way of unearthing the details of a business domain using simple language that everybody understands. The two key elements of EventStorming are domain experts (contributors) and domain events (outcomes). Domain experts are subject matter experts (SMEs) who act as catalysts and leading contributors to the workshop. They bring domain knowledge to the process, answer questions, and explain business activities to everyone (especially the technical members). Domain events are significant events that reflect business facts at specific points. These events are identified and captured throughout the course of the workshop.

The EventStorming process looks at the business process as a series of domain events, arranges the events over a timeline, and depicts a story from start to finish. From the thoughts gathered and domain events identified, you begin to recognize the actors, commands, external systems, and, importantly, pivotal events that signal the change of context from one part to the other and indicate the border of a bounded context.

A command is a trigger or action that emits one or more domain events. For example, the success of a redeem reward command produces a reward-redeemed domain event. You will see the domain model emerging as aggregates (clusters of domain objects) as you identify the actors, commands, and domain events. In the previous example, the reward is an aggregate that receives a command and generates a domain event.
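The redeem reward example can be sketched as a tiny aggregate whose command handler emits a domain event. All names here are illustrative, not part of any framework.

```python
class Reward:
    """A minimal aggregate: it guards its invariants and, on a successful
    command, produces a domain event describing the business fact."""

    def __init__(self, points: int):
        self.points = points

    def redeem(self, amount: int) -> dict:
        """Handle the redeem-reward command; emit a reward-redeemed event."""
        if amount > self.points:
            raise ValueError("insufficient points")
        self.points -= amount
        return {"type": "RewardRedeemed", "amount": amount,
                "remaining": self.points}

event = Reward(points=100).redeem(30)
```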

A full explanation of how you conduct an EventStorming workshop is beyond the scope of this book, but several resources are available. In addition to the ones listed on the EventStorming website, Vlad Khononov's book Learning Domain-Driven Design (O'Reilly) has a chapter on EventStorming.


The importance of EventStorming in serverless development

EventStorming is a great way to collaborate and to learn about business requirements, identify domain events, and shape the model before considering architecture and solution design. However, depending on the scale of the domain or product, the outcome of EventStorming could be high-level.

Say your organization is transforming the IT operations of its manufacturing division. The EventStorming exercise will bring together several domain experts, business stakeholders, enterprise and solution architects, engineering leads and engineers, product managers, UX designers, QA engineers, test specialists, etc. After a few days of collaboration, you identify various business process flows, domain events, model entities, and many bounded contexts, among other things. With clarity about the entire domain, you start assigning ownership—stream-aligned teams—to the bounded contexts.

These teams then delve into each bounded context to identify web applications, microservices, APIs, events, and architectural constructs to implement. While the artifacts from the domain-level EventStorming sessions form a base, serverless teams need more granular details. Hence, in serverless development it is useful to employ EventStorming in two stages:

Domain-level EventStorming

According to Brandolini, this is the “Big Picture” EventStorming workshop aimed at identifying the business processes, domain events, commands, actors, aggregates, etc.

Development-level EventStorming

This is a smaller-scale activity that involves an engineering team, its business stakeholders, the product manager, and UX designers. It is similar to what Brandolini calls "Design-Level EventStorming."

Here, the focus is on the bounded context and developments within it. The team identifies the internal process flows, local events, and separation of functionality and responsibilities. These become the early sketches for set-piece microservices, their interfaces, and event interactions. The outcome from the development-level EventStorming feeds into the solution design process (explained in Chapter 6) as engineers start thinking about the serverless architecture.

Let’s consider an example situation for development-level EventStorming:

Context: Figure 2-3 (in “Domain-first”) shows the breakdown of an ecommerce domain. A domain-level EventStorming workshop has identified the subdomains and bounded contexts. A stream-aligned team owns the user payments bounded context.

Use case: Due to customer demand and to prevent fraud, the stakeholders want to add a new feature where customers who call the customer support center to place orders over the phone can make their payments via a secure link emailed to them rather than providing the card number over the phone.

The proposed new feature only requires part of the ecommerce team to participate in a (development-level) EventStorming session. It is a small-scale activity within a bounded context with fewer participants.

Summary

You've just completed one of the most crucial chapters of this book on serverless development. The architectural thoughts, best practices, and recommendations you've learned here are essential whether you work as an independent consultant or as part of a team in a big enterprise. Irrespective of the organization's size, your ambition is to architect solutions that play to the strengths of serverless. Business requirements and problem domains can be complex and hard to comprehend, as they are in many other fields and walks of life. You can observe how people successfully solve non-software problems and apply those principles in your work.

Serverless architecture need not be a complex and tangled web of lines crisscrossing your entire organization. Your vision is to architect single-purpose, loosely coupled, distributed, and event-driven microservices as set pieces that are easier to conceive, develop, operate, observe, and evolve within the serverless technology ecosystem of your organization.

You will carry the learnings from these initial chapters with you as you go through the remainder of the book. You will begin to apply the architectural lessons in Chapter 5, which will teach you some core implementation patterns in serverless development. But first, the next chapter delves into one of the fundamental and critical topics in software development: security.

Getting Started – Serverless and Security

Getting Started

Establishing a solid foundation for your serverless security practice is pivotal. Security can, and must, be a primary concern. And it is never too late to establish this foundation.

As previously alluded to, security must be a clearly defined process. It is not a case of completing a checklist, deploying a tool, or deferring to other teams. Security should be part of the design, development, testing, and operation of every part of your system.

Working within sound security frameworks that fit well with serverless and adopting sensible engineering habits, combined with all the support and expertise of your cloud provider, will go a long way toward ensuring your applications remain secure.

When applied to serverless software, two modern security trends can provide a solid foundation for securing your application: zero trust and the principle of least privilege. The next section examines these concepts.

Once you have established a zero trust, least privilege security framework, the next step is to identify the attack surface of your applications and the security threats that they are vulnerable to. Subsequent sections examine the most common serverless threats and the threat modeling process.

Optimism Is Greater than Pessimism

The Optimism Otter says: “People in our organisation need to move fast to meet the needs of our customers. The job of security is to help them move fast AND stay secure.”

Serverless enables rapid development; security specialists should not only support this pace but also keep up with it. They should make that pace safer and more sustainable and, above all, not slow it down.

Software engineers should delegate to security professionals whenever there is a clear need, either through knowledge acquisition or services, such as penetration testing and vulnerability scanning.


The Power of AWS IAM

AWS IAM is the one service you will use everywhere—but it’s also often seen as one of the most complex. Therefore, it’s important to understand IAM and learn how to harness its power. (You don’t have to become an IAM expert, though—unless you want to, of course!)

The power of AWS IAM lies in roles and policies. Policies define the actions that can be taken on certain resources. For example, a policy could define the permission to put events onto a specific EventBridge event bus. Roles are collections of one or more policies. Roles can be attached to IAM users, but the more common pattern in a modern serverless application is to attach a role to a resource. In this way, an EventBridge rule can be granted permission to invoke a Lambda function, and that function can in turn be permitted to put items into a DynamoDB table.

IAM actions can be split into two categories: control plane actions and data plane actions. Control plane actions, such as CreateEventBus and CreateTable (e.g., used by an automated deployment role), manage resources. Data plane actions, such as PutEvents and GetItem (e.g., used by a Lambda execution role), interact with those resources.

Let’s take a look at a simple IAM policy statement and the elements it is composed of:
{
  "Sid": "ListObjectsInBucket",            # Statement ID, an optional identifier
                                           # for the policy statement
  "Action": "s3:ListBucket",               # AWS service API action(s) that will
                                           # be allowed or denied
  "Effect": "Allow",                       # Whether the statement should result
                                           # in an allow or a deny
  "Resource": "arn:aws:s3:::bucket-name",  # Amazon Resource Name (ARN) of the
                                           # resource(s) covered by the statement
  "Condition": {                           # Conditions for when the policy is
                                           # in effect
    "StringLike": {                        # Condition operator
      "s3:prefix": [                       # Condition key
        "photos/"                          # Condition value
      ]
    }
  }
}

See the AWS IAM documentation for full details of all the elements of an IAM policy.


Lambda execution roles

A key use of IAM roles in serverless applications is Lambda function execution roles. An execution role is attached to a Lambda function and grants the function the permissions necessary to execute correctly, including access to any other AWS resources that are required. For example, if the Lambda function uses the AWS SDK to make a DynamoDB request that inserts a record in a table, the execution role must include a policy with the dynamodb:PutItem action for the table resource.
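For that example, the execution role would contain a statement like the following, shown here as a Python dict. The account ID, region, and table name are placeholders.

```python
# Least-privilege policy for the example above: the function may only put
# items into one specific table (ARN shown with placeholder account/table).
TABLE_ARN = "arn:aws:dynamodb:eu-west-1:123456789012:table/orders"

execution_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "dynamodb:PutItem",  # only the one data plane action
            "Resource": TABLE_ARN,         # only the one table
        }
    ],
}
```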

The execution role is assumed by the Lambda service when performing control plane and data plane operations. The AWS Security Token Service (STS) is used to fetch short-lived, temporary security credentials which are made available via the function’s environment variables during invocation.

Each function in your application should have its own unique execution role with the minimum permissions required to perform its duty. In this way, single-purpose functions (introduced in Chapter 6) are also key to security: IAM permissions can be tightly scoped to the function and remain extremely restricted according to the limited functionality.

IAM guardrails

As you are no doubt beginning to notice, effective serverless security in the cloud is about basic security hygiene. Establishing guardrails for the use of AWS IAM is a core part of promoting a secure approach to everyday engineering activity. Here are some recommended guardrails:

Apply the principle of least privilege in policies.

IAM policies should only include the minimum set of permissions required for the associated resource to perform the necessary control or data plane operations. As a general rule, do not use wildcards (*) in your policy statements. Wildcards are the antithesis of least privilege, as they apply blanket permissions for actions and resources. Unless the action explicitly requires a wildcard, always be specific.
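As a rough illustration, a simple check like the one below can flag wildcard actions and resources in a policy document before deployment. This is a hand-rolled heuristic for the sake of the example, not a replacement for proper policy analysis tooling.

```python
def wildcard_findings(policy: dict) -> list:
    """Flag actions or resources in a policy document that use wildcards."""
    findings = []
    for stmt in policy.get("Statement", []):
        for field in ("Action", "Resource"):
            values = stmt.get(field, [])
            if isinstance(values, str):  # IAM allows a string or a list
                values = [values]
            for value in values:
                if "*" in value:
                    findings.append(f"{field} uses wildcard: {value}")
    return findings

risky = {"Statement": [{"Effect": "Allow", "Action": "s3:*",
                        "Resource": "*"}]}
findings = wildcard_findings(risky)
```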

Avoid using managed IAM policies.

These are policies provided by AWS, and they’re often tempting shortcuts, especially when you’re just getting started or using a service for the first time. You can use these policies early in prototyping or development, but you should replace them with custom policies as soon as you understand the integration better. Because these policies are designed to be applied to generic scenarios, they are simply not restricted enough and will usually violate the principle of least privilege when applied to interactions within your application.

Prefer roles to users.

IAM users are issued with static, long-lived AWS access credentials (an access key ID and secret access key). These credentials can be used to directly access the application provider’s AWS account, including all the resources and data in that account. Depending on the associated IAM roles and policies, the authenticating user may even have the ability to create or destroy resources. Given the power they grant the holder, the use and distribution of static credentials must be limited to reduce the risk of unauthorized access. Where possible, restrict IAM users to an absolute minimum (or, even better, do not have any IAM users at all).

Prefer a role per resource.

Each resource in your application, such as an EventBridge rule, a Lambda function, and an SQS queue, should have its own unique role. Permissions for those roles should be fine-grained and least-privileged.


Meet the OWASP Top 10

Cybersecurity is an incredibly well-researched area, with security professionals constantly assessing the ever-changing software landscape, identifying emerging risks and distributing preventative measures and advice. While as a modern serverless engineer you must accept the responsibility you have in securing the applications you build, it is absolutely crucial that you combine your own efforts with deference to professional advice and utilization of the extensive research that is publicly available.

Identifying the threats to the security of your software is one task that you should not attempt alone. There are several threat categorization frameworks available that can help here, but let’s focus on the OWASP Top 10.

The Open Web Application Security Project, or OWASP for short, is a “non-profit foundation that works to improve the security of software.” It does this primarily through community-led, open source projects, tools, and research. The OWASP Foundation has repeatedly published a list of the 10 most prevalent and critical security risks to web applications since 2003. The latest version, published in 2021, provides the most up-to-date list of security risks (at the time of writing).

While a serverless application will differ in some ways from a typical web application, Table 4-1 interprets the OWASP Top 10 through a serverless lens. Note that the list is in descending order, with the most critical application security risk, as classified by OWASP, listed first. We've added a "serverless risk level" as an indicator of the associated risk specific to serverless applications.

Table 4-1. Top 10 serverless application security risks

Broken access control (serverless risk level: Medium)
Threat: Access control is the gatekeeper to your application and its resources and data. Controlling access to your resources and assets allows you to restrict users of your application so that they cannot act outside of their intended permissions.
Mitigations: API authentication and authorization. Least-privilege, per-resource IAM roles.

Cryptographic failures (serverless risk level: Medium)
Threat: Weak or absent encryption of data, both in transit between components in your application and at rest in queues, buckets, and tables, is a major security risk.
Mitigations: Classify data being processed, stored, or transmitted. Identify sensitive data according to privacy laws, regulatory requirements, and business needs. Encrypt sensitive data as a minimum. Protect data in transit with HTTPS/TLS.

Injection (serverless risk level: High)
Threat: Injection of malicious code into an application via user-supplied data is a popular attack vector. Common attacks include SQL and NoSQL injection.
Mitigations: Validate and sanitize external data received by all entry points to your application, e.g., API requests and inbound events.

Insecure design (serverless risk level: Medium)
Threat: Implementing and operating an application that was not designed with security as a primary concern is risky, as it will be susceptible to gaps in the security posture.
Mitigations: Adopt a secure by design approach. Security must be considered during business requirements gathering and solution design and formalized via threat modeling.

Security misconfiguration (serverless risk level: Medium)
Threat: Misconfigurations of encryption, access control, and computational constraints represent vulnerabilities that can be exploited by attackers. Unintended public access to S3 buckets is a very common root cause of cloud data breaches. Lambda functions with excessive timeouts can be exploited to cause a DoS attack.
Mitigations: Define a paved road to secure configuration of cloud resources for engineers. Keep application features, components, and dependencies to a minimum.

Vulnerable and outdated components (serverless risk level: Low)
Threat: Continued use of vulnerable, unsupported, or outdated software (operating systems, web servers, databases, etc.) makes your application susceptible to attacks that exploit known vulnerabilities.
Mitigations: Delegate infrastructure management and security patching to AWS by using fully managed serverless services.

Identification and authentication failures (serverless risk level: Medium)
Threat: These failures can permit unauthorized usage of APIs and integrated resources, like Lambda functions, S3 buckets, or DynamoDB tables.
Mitigations: Leverage an access management service to apply proper, fine-grained authentication and authorization for API gateways. Rely on AWS IAM for inter-service communication.

Software and data integrity failures (serverless risk level: High)
Threat: The presence of vulnerabilities or exploits in third-party code is quickly becoming the most common risk to software applications. As application dependencies are bundled and executed with Lambda function code, they are granted the same permissions as your business logic.
Mitigations: Secure your software supply chain with automated dependency upgrades and other controls. Remove unused dependencies.

Security logging and monitoring failures (serverless risk level: Medium)
Threat: Attackers rely on the lack of monitoring and timely response to achieve their goals without being detected. Without logging and monitoring, breaches cannot be detected or analyzed, and logs of applications and APIs go unmonitored for suspicious activity.
Mitigations: Enable API Gateway execution and access logs. Use CloudTrail monitoring to identify and report abnormal behavior.

Server-side request forgery (SSRF) (serverless risk level: Low)
Threat: In AWS this primarily concerns a vulnerability with running web servers on EC2 instances; the most devastating example was the Capital One data breach in 2019. Serverless applications utilizing API Gateway and Lambda will not generally be susceptible to SSRF attacks.
Mitigations: Avoid accepting URLs in client inputs, always sanitize incoming request payloads, and never return raw HTTP responses to clients.

There are two further noteworthy security risks that are relevant to serverless applications:

Denial of service

This is a common attack where an API is constantly bombarded with bogus requests in order to disrupt the servicing of genuine requests. Public APIs will always face the possibility of DoS attacks. Your job is not always to completely prevent them, but to make them so tricky to execute that the deterrent alone becomes enough to secure the resources. Firewalls, rate limits, and resource throttle alarms (e.g., Lambda, DynamoDB) are all key measures to prevent DoS attacks.

Denial of wallet

This kind of attack is fairly unique to serverless applications, due to the pay-per-use pricing model and high scalability of managed services. Denial of wallet attacks target the constant execution of resources to accumulate a usage bill so high it will likely cause severe financial damage to the business.

Tip

Setting up budget alerts can help ensure you are alerted to denial of wallet attacks before they can escalate. See Chapter 9 for more details.

Now that you have an understanding of the common threats to a serverless application, next you will explore how to use the process of threat modeling to map these security risks to your applications.


Think before you install

You can start securing the serverless supply chain by scrutinizing packages before installing them. This is a simple suggestion that can make a real difference to securing your application’s supply chain, and to general maintenance at scale.

Use as few dependencies as necessary, and be aware of dependencies that obfuscate the data and control flow of your app, such as middleware libraries. If it is a trivial task, always try to do it yourself. It’s also about trust. Do you trust the package? Do you trust the contributors?

Before you install the next package in your serverless application, adopt the following practices:

Analyze the GitHub repository.

Review the contributors to the package: more contributors means more scrutiny and collaboration. Check whether the repository uses verified commits. Assess the history of the package: How old is it? How many commits have been made? Analyze the repository activity to understand whether the package is actively maintained and used by the community. GitHub stars provide a crude indicator of popularity, and signals such as the date of the most recent commit and the number of open issues and pull requests indicate usage. Also ensure the package's license adheres to any restrictions in place in your organization.

Use official package repositories.

Only obtain packages from official sources, such as NPM, PyPI, Maven, NuGet, or RubyGems, over secure (i.e., HTTPS) links. Prefer signed packages that can be verified for integrity and authenticity. For example, the JavaScript package manager NPM allows you to audit package signatures.

Review the dependency tree.

Be aware of the package’s dependencies and the entire dependency tree. Pick packages with zero runtime dependencies where available.

Try before you buy.

Try new packages on as isolated a scale as possible and delay rollout across the codebase for as long as possible, until you feel confident.

Check if you can do it yourself.

Don’t reinvent the wheel for the sake of it, but one very simple way of removing opaque third-party code is to not introduce it in the first place. Examine the source code to understand if the package is doing something simple that is easily portable to a first-party utility. Logging libraries are a perfect example: you can trivially implement your own logger rather than distributing a third-party library across your codebase.

Make it easy to back out.

Development patterns like service isolation, single-responsibility Lambda functions, and limiting shared code (see Chapter 6 for more information on these patterns) make it easier to evolve your architecture and avoid pervasive antipatterns or vulnerable software taking over your codebase.

Lock to the latest.

Always use the latest version of the package, and always use an explicit version rather than a range or “latest” flag.
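A quick heuristic check for unpinned, npm-style dependency specs might look like this. It is illustrative only; real manifests have more spec forms (tags, URLs, workspaces) than this covers, and the package names are hypothetical.

```python
def unpinned(dependencies: dict) -> list:
    """Flag dependency specs that are ranges or tags rather than exact
    versions (npm-style specs; a rough heuristic for illustration)."""
    bad = []
    for name, spec in dependencies.items():
        if spec == "latest" or spec.startswith(("^", "~", ">", "<", "*")):
            bad.append(name)
    return bad

deps = {"left-pad": "^1.3.0", "uuid": "9.0.0", "lodash": "latest"}
flagged = unpinned(deps)  # only "uuid" is pinned to an exact version
```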

Uninstall any unused packages.

Always uninstall and clear unused packages from your dependencies manifest. Most modern compilers and bundlers will only include dependencies that are actually consumed by your code, but keeping your manifest clean adds extra safety and clarity.