Exploring Cloud PaaS Services: Benefits, Drawbacks, and a Comparison of AWS, Azure, and GCP PaaS Solutions

Cloud Platform-as-a-Service (PaaS) is a type of cloud computing service that provides a platform for developers to build and deploy applications without the need for infrastructure management. PaaS solutions provide a complete platform for application development, testing, deployment, and scaling. In this article, we will discuss the benefits and drawbacks of PaaS, the problems it solves, and compare PaaS services offered by the top cloud providers, namely AWS, Azure, and GCP.

Benefits of PaaS:

Easy to use: PaaS makes it easy for developers to create and deploy applications, as the underlying infrastructure is abstracted away. Developers can focus on writing code and developing applications, without worrying about infrastructure management.

Reduced time-to-market: PaaS solutions provide pre-configured environments, libraries, and development tools, which makes it easy for developers to create and deploy applications quickly.

Scalability: PaaS platforms can easily scale up or down to meet the changing demands of the application. This makes it easy for businesses to handle traffic spikes, seasonal loads, and other sudden changes.

Cost-effective: PaaS solutions are usually offered on a pay-as-you-go model, which means businesses only pay for the resources they use. This can be more cost-effective than building and maintaining an in-house infrastructure.

Security: PaaS providers are responsible for the security of the underlying infrastructure, which means businesses can focus on securing their applications and data.

Drawbacks of PaaS:

Limited customization: PaaS solutions provide pre-configured environments, which may not be suitable for all applications. Businesses may have to compromise on customization and flexibility.

Dependency on the PaaS provider: PaaS solutions are tied to the provider’s infrastructure and services. Businesses may have to change their applications if they decide to switch providers.

Vendor lock-in: PaaS solutions can lead to vendor lock-in, as businesses may find it difficult to migrate their applications to a different provider or to an on-premises infrastructure.

Limited control: PaaS solutions abstract away the underlying infrastructure, which means businesses may have limited control over the infrastructure and services.

Problems solved by PaaS:

Infrastructure management: PaaS solutions abstract away the underlying infrastructure, which means businesses do not have to worry about infrastructure management.

Development and deployment: PaaS solutions provide pre-configured environments, libraries, and development tools, which makes it easy for developers to create and deploy applications.

Scalability: PaaS solutions can easily scale up or down to meet the changing demands of the application.

Security: PaaS providers are responsible for the security of the underlying infrastructure, which means businesses can focus on securing their applications and data.

AWS PaaS services:

AWS Elastic Beanstalk: Elastic Beanstalk is a PaaS solution that supports popular programming languages such as Java, .NET, PHP, Node.js, Python, Ruby, and Go. Elastic Beanstalk provides pre-configured environments, load balancing, auto-scaling, and monitoring.

AWS Lambda: Lambda is a serverless computing service that runs code in response to events and automatically scales up or down to meet the demands of the application. Lambda supports popular programming languages such as Java, .NET, Node.js, Python, Ruby, and Go.

Azure PaaS services:

Azure App Service: App Service is a PaaS solution that supports popular programming languages such as .NET, Java, Node.js, PHP, and Python. App Service provides pre-configured environments, load balancing, auto-scaling, and monitoring.

Azure Functions: Functions is a serverless computing service that runs code in response to events and automatically scales up or down to meet the demands of the application. Functions supports popular programming languages such as C#, Java, JavaScript, Python, and PowerShell.

GCP PaaS services:

Google App Engine: App Engine is a PaaS solution that supports popular programming languages such as Java, Python, PHP, Go, and Node.js. App Engine provides pre-configured environments, load balancing, auto-scaling, and monitoring.

Google Cloud Functions: Cloud Functions is a serverless computing service that runs code in response to events and automatically scales up or down to meet the demands of the application. Cloud Functions supports popular programming languages such as Node.js, Python, Go, Java, .NET, Ruby, and PHP.

Comparison of AWS, Azure, and GCP PaaS services:

Ease of use: All three cloud providers offer PaaS solutions that are easy to use and provide pre-configured environments, load balancing, auto-scaling, and monitoring.

Cost-effectiveness: All three cloud providers offer PaaS solutions on a pay-as-you-go model, which makes them cost-effective. However, pricing may vary depending on the specific service and usage.

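Customization and flexibility: GCP App Engine allows developers to use custom runtimes and libraries through its flexible environment. AWS Elastic Beanstalk and Azure App Service also accept container-based deployments, so the practical difference is one of degree and depends on how far the application departs from the pre-configured environments.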

Serverless computing: AWS Lambda and Azure Functions have historically offered a broader set of runtimes and features than GCP Cloud Functions, although the gap has narrowed. Check each provider’s current runtime list, event sources, and execution limits against the workload.

Vendor lock-in: All three cloud providers may lead to vendor lock-in, as businesses may find it difficult to migrate their applications to a different provider or to an on-premises infrastructure.

Based on the above comparison, the best PaaS service depends on the specific requirements and preferences of the business. In terms of popularity and breadth of offerings, AWS Elastic Beanstalk and Azure App Service are among the most widely used PaaS solutions.

In conclusion, PaaS solutions provide a platform for developers to build and deploy applications without the need for infrastructure management. PaaS solutions offer benefits such as ease of use, reduced time-to-market, scalability, cost-effectiveness, and security. However, PaaS solutions also have drawbacks such as limited customization, dependency on the PaaS provider, vendor lock-in, and limited control. AWS, Azure, and GCP offer PaaS services that are easy to use and cost-effective but differ in customization and serverless computing options, and all three carry a risk of vendor lock-in.

Single Page Application — With Worker Process

What is a Worker Process

A worker process is a type of computer process that runs in the background and performs tasks as assigned by a main process. It is typically used to offload tasks from a main process, allowing the main process to continue with other operations while the worker process handles the task in the background.

This article is part of a collection of web architecture pattern references:

Single Page Application Frontend
Backend for SPA and Mobile App (Managed Services)
Backend for SPA and Mobile App (Serverless)
Multi Page Application (Integrated Web)
Single Page Application with Worker Process

How it works

In cloud computing, worker processes can be implemented using messaging or queues. The main process adds tasks to a queue, and worker processes monitor the queue for new tasks. When a worker process finds a new task in the queue, it takes the task and performs the work. With a first-in, first-out queue, tasks are picked up in the order in which they were received, and multiple worker processes can operate in parallel to handle a large number of tasks. When several workers run in parallel, tasks may still finish out of order.

Examples of messaging systems used for worker process implementation in the cloud include RabbitMQ, Apache Kafka, and Amazon Simple Queue Service (SQS). The messaging system provides a way for the main process to communicate with the worker processes, allowing tasks to be dispatched and results to be returned.
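The queue-and-workers flow described above can be sketched in a single process. This is only an illustration: Python's in-memory queue stands in for a real broker such as RabbitMQ or SQS, and the names run_workers and handler are hypothetical.

```python
import queue
import threading


def run_workers(tasks, handler, worker_count=3):
    """Process every task with a pool of worker threads and return (task, result) pairs."""
    task_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = task_queue.get()
            if task is None:              # sentinel: no more work for this worker
                task_queue.task_done()
                break
            outcome = handler(task)
            with results_lock:
                results.append((task, outcome))
            task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()

    for task in tasks:                    # the "main process" enqueues the work
        task_queue.put(task)
    for _ in threads:                     # one sentinel per worker
        task_queue.put(None)

    task_queue.join()                     # wait until every queued item is handled
    for thread in threads:
        thread.join()
    return results
```

Tasks leave the queue in FIFO order, but with more than one worker the order in which results are appended is not guaranteed, which is why callers should not rely on the order of the returned list.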

Components

The main components of a worker process can include:

  • Task queue: A task queue is used to store the tasks that need to be performed. The worker processes monitor the queue for new tasks.

  • Task dispatcher: The task dispatcher is responsible for allocating tasks to the worker processes. It ensures that tasks are dispatched to the worker processes in a fair and efficient manner.

  • Worker processes: The worker processes are the actual processes that perform the tasks. They receive tasks from the task queue, perform the work, and return the results.

  • Result storage: A result storage mechanism is used to store the results of the tasks performed by the worker processes. This can be a database, a file system, or another type of data storage.

  • Error handling: A worker process should have error handling mechanisms in place to deal with unexpected errors that may occur during task processing. This can include retrying failed tasks or logging the error for later review.

  • Monitoring and reporting: A monitoring and reporting system is used to monitor the performance of the worker processes and provide feedback to the main process. This can include information about the number of tasks processed, the processing time for each task, and any errors that may have occurred.
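As a rough sketch of how the error handling, result storage, and monitoring components might fit together inside one worker, consider the following. The Worker and WorkerStats classes are hypothetical, a dictionary stands in for result storage, and the retry policy (a fixed number of attempts with no back-off) is an assumption chosen for brevity.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Callable

logger = logging.getLogger("worker")


@dataclass
class WorkerStats:
    processed: int = 0   # tasks that eventually succeeded
    failed: int = 0      # tasks that exhausted all attempts
    retries: int = 0     # extra attempts made after a failure


@dataclass
class Worker:
    handler: Callable[[Any], Any]
    max_attempts: int = 3
    results: dict = field(default_factory=dict)        # stand-in for result storage
    stats: WorkerStats = field(default_factory=WorkerStats)

    def process(self, task_id, payload):
        """Run the handler, retrying on error; return True if the task succeeded."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.results[task_id] = self.handler(payload)
                self.stats.processed += 1
                return True
            except Exception as exc:
                logger.warning("task %s attempt %d failed: %s", task_id, attempt, exc)
                if attempt < self.max_attempts:
                    self.stats.retries += 1
        self.stats.failed += 1
        return False
```

The stats object is what a monitoring system would read or export. In a real deployment, a task that exhausts its attempts is commonly moved to a dead-letter queue rather than only counted.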

Architecture

There are multiple architecture patterns for implementing a worker process, and the right one depends on the use case.

Use Case 1

An example use case can be sending newsletters to all subscribed users at 10 AM.

  • The application has to execute some long-running jobs at specific intervals.
  • The application should not use any compute resources when it is not executing a job.
  • Jobs should be triggered by external or internal events, or by schedulers.

In this use case we can use Amazon EventBridge Scheduler to trigger the jobs at specific intervals. Amazon ECS tasks can execute the jobs; once a job is done, the task ends and its compute resources are de-allocated. We don’t need any messaging or queue service in this case (that is covered in the next use case).
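A run-to-completion job of this kind could look roughly like the sketch below. The function names and the batch size are hypothetical, and send is whatever callable actually delivers one email. For the 10 AM newsletter, a schedule expression such as cron(0 10 * * ? *) would fire once a day at 10:00 in the time zone configured on the schedule.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_newsletter_job(subscribers, send, batch_size=100):
    """Send the newsletter batch by batch and return how many messages went out."""
    sent = 0
    for batch in batches(subscribers, batch_size):
        for address in batch:
            send(address)
            sent += 1
    return sent


def main(subscribers, send):
    """Entry point for the container: do the work, report, and return an exit code."""
    sent = run_newsletter_job(subscribers, send)
    print(f"sent {sent} newsletters")
    return 0  # exit code 0 marks the run as successful; the task then stops
```

Because the process exits as soon as main returns, nothing keeps running (or billing) between scheduled runs.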

The following diagram explains the architecture of the worker process along with other important components.

Single Page Application with Worker Process (Use Case 1)

Use Case 2

An example use case is sending a welcome message to users after they register successfully on the platform.

  • After registration completes, a message object containing the userId is pushed to a queue (e.g., SQS).
  • A Lambda function configured with the queue as its event source is invoked and sends the email to the user.
  • The function updates the database if required.
  • Because Lambda is a serverless component, cost is calculated from the number of invocations and their execution time.
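A minimal sketch of such a function is shown below, assuming the message body is JSON with a userId field. The send_welcome_email helper is hypothetical, and returning batchItemFailures only has an effect when partial batch responses (ReportBatchItemFailures) are enabled on the event source mapping.

```python
import json


def send_welcome_email(user_id):
    """Hypothetical helper; a real one would look up the user and call an email service."""
    print(f"welcome email queued for user {user_id}")


def handler(event, context):
    """Lambda entry point invoked with a batch of SQS messages."""
    failures = []
    for record in event.get("Records", []):
        try:
            message = json.loads(record["body"])
            send_welcome_email(message["userId"])
        except Exception:
            # Report only this message as failed so the rest of the batch is not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Messages reported as failed become visible on the queue again after the visibility timeout, so the email-sending step should tolerate being run more than once for the same user.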

The following diagram explains the architecture of the worker process along with other important components.

Single Page Application with Worker Process (Use Case 2)

Multi Page Application

Introduction

A Multi-Page Application (MPA) is a traditional web application that consists of multiple separate pages, with each page being loaded in full when a user navigates to it. MPAs are typically built using server-side technologies such as PHP, Ruby on Rails, or .NET, and rely on server-side rendering to generate HTML that is sent to the browser.

Each page in an MPA is a self-contained unit that operates independently of the other pages, and navigation between pages involves making a full round trip to the server. This can result in slower page load times, as the entire page must be reloaded every time the user navigates to a new page.

How it works

A multi-page application (MPA) consists of multiple pages served from a server to a client’s web browser, where each page has a distinct URL. These pages are generated dynamically on the server and then rendered in the browser, with each page request triggering a round trip to the server.

When a user requests a page, the server responds with the HTML, CSS, and JavaScript necessary to render the page, which is then displayed in the user’s web browser. Navigation between pages is achieved by the user clicking on links or by JavaScript code dynamically updating the URL and triggering a new page request to the server.
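To make the request cycle concrete, here is a small sketch of a server that returns a complete HTML document for every URL. It uses a plain WSGI callable; the page table and the names render_page and app are made up for illustration and are not tied to any framework.

```python
from html import escape

# Hypothetical site map: path -> (title, body text)
PAGES = {
    "/": ("Home", "Welcome to the store."),
    "/products": ("Products", "Our catalog."),
    "/orders": ("Orders", "Your order history."),
}


def render_page(path):
    """Return (status, html) for a full page, including the navigation links."""
    title, body = PAGES.get(path, ("Not Found", "No such page."))
    status = "200 OK" if path in PAGES else "404 Not Found"
    nav = " | ".join(
        f'<a href="{escape(p)}">{escape(t)}</a>' for p, (t, _) in PAGES.items()
    )
    html = (
        f"<!doctype html><html><head><title>{escape(title)}</title></head>"
        f"<body><nav>{nav}</nav><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>"
    )
    return status, html


def app(environ, start_response):
    """WSGI entry point: every request gets a complete HTML document."""
    status, html = render_page(environ.get("PATH_INFO", "/"))
    payload = html.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

Clicking a navigation link makes the browser request a new URL, and the server rebuilds the whole page, navigation included. The callable can be served locally with wsgiref.simple_server.make_server from the standard library.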

Components

The main components of a Multi-page application (MPA) are:

  • Server: A server that runs a server-side language, such as PHP, Node.js, or Ruby, and is responsible for handling HTTP requests and generating dynamic content.

  • Client: A web browser that displays the dynamic content generated by the server, such as HTML, CSS, and JavaScript.

  • Router: A component that handles navigation between pages and updates the URL to reflect the current page.

  • HTML, CSS, and JavaScript: The technologies used to create the user interface and dynamic behavior of the pages.

  • Database: A database that stores the application’s data, such as user information, product catalog, and order history. The server accesses this data to generate dynamic content for each page.

  • APIs: Interfaces that allow the server and client to exchange data, such as REST APIs that enable the client to retrieve data from the server and send data to the server.

These components work together to provide the user with a seamless experience as they navigate the application, with each page request triggering a round-trip to the server and a rendering of the updated content in the client’s web browser.

Architecture

All the above components are served from a single application.

Backend

There are different solutions and services for hosting the MPA backend in the cloud. Here are some popular options:

  • VM (e.g., EC2)
  • Container Services (e.g., ECS)
  • Managed Services (e.g., Elastic Beanstalk)
  • Serverless (e.g., Lambda)
  • Kubernetes (e.g., EKS)

Of these, this article covers only Managed Services.

Managed Services

Using managed services like Amazon Web Services (AWS) Elastic Beanstalk and Microsoft Azure App Service provides several benefits, including:

  • Simplified deployment and management: Managed services take care of infrastructure management and deployment, allowing developers to focus on writing code and delivering features.

  • Automated scaling: Managed services automatically scale application instances based on demand, providing high availability and performance without manual intervention.

  • Reduced operational overhead: By outsourcing infrastructure management, organizations can reduce operational overhead and focus on delivering value to their customers.

  • Improved security: Managed services provide built-in security features, such as automatic patching and secure data storage, reducing the security burden on organizations.

  • Easy integration with other services: Managed services integrate seamlessly with other services, allowing organizations to leverage the full suite of cloud services to build and deploy their applications.

  • Cost-effectiveness: Managed services offer a cost-effective solution for deploying and managing applications, with flexible pricing models and the ability to pay for only the resources used.

  • Global availability: Managed services are designed to be globally available, providing fast and reliable access to applications from anywhere in the world.

MPA Architecture (Managed Services)

How to Scale an AWS RDS MySQL Database Horizontally?

What is scalability

The scalability of an application is the measure of the number of client requests it can simultaneously handle. When a hardware resource runs out and can no longer handle requests, it is counted as the limit of scalability. When this limit of the resource is reached, the application can no longer handle additional requests. To efficiently handle additional requests, administrators should scale the infrastructure by adding more resources such as RAM, CPU, storage, network devices, etc. Horizontal and vertical scaling are the two methods implemented by administrators for capacity planning.

What is Horizontal Scaling?

Horizontal scaling is an approach of adding more devices to the infrastructure to increase the capacity and efficiently handle increasing traffic demands. As the name says, horizontal scaling is about expanding the capacity horizontally by adding extra servers. The load and processing power are shared among multiple servers within a system using a load balancer. It is also called scaling out.

What is Vertical Scaling?

Vertical scaling is a type of scalability wherein more computing and processing power is added to a machine to increase its performance. Also called scale-up, vertical scaling allows you to increase the machine’s capacity while maintaining resources within the same logical unit. The processor, memory, storage, and network capacity are increased in this approach.

Scalability Issues of RDBMS (Specific to MySQL)

As we discussed earlier, vertical scaling has hardware upper limits, and it also requires some downtime. In the database world we can afford neither, so we need to look into horizontal scaling options. For databases, horizontal scaling is usually based on partitioning the data (each partition contains only part of the data). Partitioning requires more effort and thought in the design and development phase; that is a separate topic and is not discussed here.

Scaling MySQL Using Read Replicas

The read replica feature allows you to replicate data from MySQL server to one or more read-only servers. Replicas are updated asynchronously using the MySQL engine’s native binary log file position-based replication technology.

In this case, we will create a Master-Slave architecture and route all the write queries to the Master instance and all the read queries to the Slave instances, which are replicated from the Master. We can have multiple Slave instances running at once and scale our read operations horizontally, but the Master can only be scaled vertically. Most databases are read-heavy, so this approach works for most use cases. Because replication is asynchronous, a Slave can briefly lag behind the Master.

Step 1: Application Development Considerations

While developing the application, we should follow the CQRS design pattern. CQRS stands for Command Query Responsibility Segregation, a pattern that separates read and update operations for a data store. Implementing CQRS in your application can maximize its performance, scalability, and security. The flexibility created by migrating to CQRS allows a system to better evolve over time and prevents update commands from causing merge conflicts at the domain level. In short, our application will have two connection strings: one for read operations and another for update operations.

As we have a single Master write node, we can use its connection string for update operations. We will have multiple read nodes (Slaves), so we need to set up a load balancer that distributes the load equally among them.
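The two-connection-string idea can be sketched as a small router that picks a host per statement. The hostnames used with it are placeholders, the class name DatabaseRouter is hypothetical, and classifying a statement by its first keyword is a simplifying assumption; many applications instead choose the connection explicitly in their data access layer.

```python
from dataclasses import dataclass

# Statements that modify data or schema must go to the Master.
WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP")


@dataclass(frozen=True)
class DatabaseRouter:
    write_host: str   # endpoint of the Master instance
    read_host: str    # load-balanced hostname in front of the read replicas

    def host_for(self, sql: str) -> str:
        """Return the host that should run this statement."""
        stripped = sql.strip()
        first_word = stripped.split(None, 1)[0].upper() if stripped else ""
        return self.write_host if first_word in WRITE_KEYWORDS else self.read_host
```

For example, DatabaseRouter("master.example.internal", "read.example.internal").host_for("SELECT 1") returns the read hostname. Reads that must see a row written moments earlier are a common exception and are usually sent to the Master, since the replicas are updated asynchronously.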

Step 2: Set Up a Load Balancer for the Read-Only Nodes

We can set up Amazon Route 53 weighted record sets to distribute requests across the read replicas. Within a Route 53 hosted zone, create an individual record set for each DNS endpoint associated with your read replicas. Then give them the same weight, and direct requests to the subdomain/endpoint of the record set.
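With weighted routing, each record is returned with a probability equal to its weight divided by the sum of all weights, so equal weights give an even split. The simulation below illustrates that arithmetic only; it does not call Route 53, and the replica names are placeholders.

```python
import random
from collections import Counter


def expected_share(weighted_records):
    """Return the long-run fraction of DNS lookups each endpoint should receive."""
    total = sum(weight for _, weight in weighted_records)
    return {endpoint: weight / total for endpoint, weight in weighted_records}


def pick_endpoint(weighted_records, rng=random):
    """Choose one endpoint with probability weight / sum(weights)."""
    endpoints = [endpoint for endpoint, _ in weighted_records]
    weights = [weight for _, weight in weighted_records]
    return rng.choices(endpoints, weights=weights, k=1)[0]


def simulate(weighted_records, lookups, seed=0):
    """Count how often each endpoint is chosen over a number of simulated lookups."""
    rng = random.Random(seed)
    return Counter(pick_endpoint(weighted_records, rng) for _ in range(lookups))
```

Note that the split applies per DNS lookup, not per SQL query: a client that caches the answer keeps using the same replica until the record's TTL expires, so a shorter TTL spreads load more evenly at the cost of more lookups.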

How to Create Read Replicas

This section assumes that you already have a MySQL RDS instance in your AWS account. Follow the steps below to create a read replica, and repeat them to create multiple replicas if required. To evaluate the load balancing feature, create at least two replicas.

  • Type rds in the AWS Console search box and select RDS
  • Select Databases from the left panel
  • Select the database for which you want to create a read replica
  • Click on the Actions menu and select Create read replica
  • Select the DB instance class
  • Set Publicly Accessible to Yes
  • Select the VPC security groups (you can select the same security group as your master node)
  • Enter the DB instance identifier and then click Create read replica

The read replica will be created in a few minutes. Repeat the above steps to create one more node.

Create DNS Based Load Balancer

To create a DNS based load balancer, you have to set up a hosted zone in Route 53. Follow the below steps to create a hosted zone and record set.

  • Type route 53 in AWS Console search box and select Route 53 from the result.
  • Click Create Hosted Zone

  • Enter the Domain name, Description is optional
  • Select the Public hosted zone in Type option
  • Click Create hosted zone

  • Now we need to create Records in the newly created hosted zone
  • Select Create Record
  • Enter a subdomain name in the Name field
  • Select CNAME as Type
  • For Value enter the endpoint DNS name of the first read replica
  • For TTL value, set a value that is appropriate for your needs
  • For Routing Policy, choose Weighted
  • In the Weight field, enter a value. Be sure to use the same value for each replica’s record set
  • Provide an Id for the Record set
  • Repeat the steps to create records for all the replicas. Keep the same name (subdomain) for all the records.
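The same records can also be described programmatically. The sketch below only builds the change batch structure that the Route 53 ChangeResourceRecordSets API expects; the function name, the default weight and TTL, and the replica-N identifiers are assumptions made for illustration.

```python
def weighted_cname_changes(record_name, replica_endpoints, weight=10, ttl=60):
    """Build a Route 53 change batch with one weighted CNAME record per replica."""
    changes = []
    for index, endpoint in enumerate(replica_endpoints, start=1):
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,                   # same subdomain for every record
                "Type": "CNAME",
                "SetIdentifier": f"replica-{index}",   # must be unique per record
                "Weight": weight,                      # same weight for every replica
                "TTL": ttl,
                "ResourceRecords": [{"Value": endpoint}],
            },
        })
    return {"Changes": changes}
```

If boto3 is available, the result could be passed as the ChangeBatch argument of the Route 53 client's change_resource_record_sets call, together with the HostedZoneId of the zone created above.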

Your hosted zone should now list one weighted CNAME record per replica, all sharing the same name.

Update the NS Record Entries at the Domain Registrar

Now copy all four NS record values from the hosted zone. Go to your domain registrar’s portal (GoDaddy, Google Domains, etc.) and enter them as the custom name servers. The changes may take some time to take effect.

Once your NS records are properly updated, you will be able to use the newly created subdomain as your read-only database hostname. You can use the same credentials as your master database to access the load-balanced read-only instances.

Now you can configure the Master hostname for create, update, and delete (CUD) operations and the load-balanced hostname for read operations.

AWS Serverless and DynamoDb Single Table Design using .Net 6 – Part 2

Introduction

This is a continuation of the previous article AWS Serverless and DynamoDb Single Table Design using .Net 6 – Part 1. In this part we’re going to create a sample Serverless application using DynamoDb and deploy that on AWS Lambda.

Tools

Configure

Configure the AWS Toolkit using the link. While creating the IAM user, make sure to attach the policies below.

AWS Policies
Image: 1 

We’ll be using this user for creating the serverless application and deploying the same from Visual Studio or dotnet tools command line interface.

Why Serverless

Serverless solutions offer technologies for running code, managing data, and integrating applications, all without managing servers. Serverless technologies feature automatic scaling, built-in high availability, and a pay-for-use billing model to increase agility and optimize costs. These technologies also eliminate infrastructure management tasks like capacity provisioning and patching, so you can focus on writing code that serves your customers. Some of the popular serverless solutions are AWS Lambda and Azure Functions.

Development

AWS Toolkit for Visual Studio provides many built-in templates for creating AWS based serverless applications quickly.

Create a new project in Visual Studio, type ‘serverless’ in the search box, and select ‘AWS Serverless Application (.NET Core – C#)’.

Image: 2

Enter the project name and continue.

Image: 3

Then select the ASP.NET Core Web API blueprint from the selection and click Finish.

Image: 4

Once the project is ready in Visual Studio, you can see a file called serverless.template. This is an AWS Serverless Application Model (SAM) template, an extension of AWS CloudFormation, for declaring your serverless functions and other AWS resources. Make sure to add two policies (AWSLambda_FullAccess and AmazonDynamoDBFullAccess) as shown below. These permissions are required for the Lambda function to read from and write to DynamoDb.

Image: 5

Then add the below NuGet packages:

  • AWSSDK.DynamoDBv2
  • Newtonsoft.Json
  • Swashbuckle.AspNetCore.SwaggerGen
  • Swashbuckle.AspNetCore.SwaggerUI

Create an Interface called IEmployeeDb to define the methods

public interface IEmployeeDb
{
    Task<IEnumerable<EmployeeModel>> GetAllReporteesAsync(string empCode);
    Task<EmployeeModel> GetEmployeeAsync(string empCode);
    Task SaveAsync(EmployeeModel model);
    Task SaveBatchAsync(List<EmployeeModel> models);
}

Create a class to implement the IEmployeeDb interface. The constructor would look like the below:

public EmployeeDb(ILogger<EmployeeDb> logger, IWebHostEnvironment configuration)
{
    // Comment out the four lines below if you're not using the DynamoDb local instance.
    if (configuration.IsDevelopment())
    {
        _clientConfig.ServiceURL = "http://localhost:8000";
    }
    _client = new AmazonDynamoDBClient(_clientConfig);
    _context = new DynamoDBContext(_client);
    _logger = logger;
}

We configured the ServiceURL to point to localhost in case we’re using a DynamoDb local instance. We also initialized the AmazonDynamoDBClient and DynamoDBContext. We’ll mainly use the high-level API, DynamoDBContext, for reading and writing data in DynamoDb.

The below methods are responsible for writing/saving the data:

public async Task SaveAsync(EmployeeModel model)
{
    await SaveInDbAsync(GetUserModelForSave(PrepareEmpModel(model)));
    await SaveInDbAsync(GetReporteeModelForSave(PrepareEmpModel(model)));
}

private async Task SaveInDbAsync(EmployeeModel model)
{
    await _context.SaveAsync(model);
    // Named placeholder so structured logging captures the employee code.
    _logger.LogInformation("Saved {EmployeeCode} successfully!", model.EmployeeCode);
}

private EmployeeModel PrepareEmpModel(EmployeeModel model)
{
    // Normalize both codes to upper case so that key lookups are case-insensitive.
    model.EmployeeCode = model.EmployeeCode?.ToUpper();
    model.ReportingManagerCode = model.ReportingManagerCode?.ToUpper();
    return model;
}

When saving a record, this method will actually insert two objects, one for user type and the other for reportee type. We discussed the reason and logic for creating two entries in the previous part.

In the below method we implemented the logic for fetching the employee by EmployeeCode:

public async Task<EmployeeModel> GetEmployeeAsync(string empCode)
{
    var result = await _context.LoadAsync<EmployeeModel>(empCode.ToUpper(), empCode.ToUpper());
    if (result != null)
        result.ReportingManagerCode = ""; // ReportingManagerCode was same as EmployeeCode, so just remove it
    return result;
}

The next method covers the logic for fetching the reportees by EmployeeCode:

public async Task<IEnumerable<EmployeeModel>> GetAllReporteesAsync(string empCode)
{
    var config = new DynamoDBOperationConfig
    {
        QueryFilter = new List<ScanCondition>
        {
            new ScanCondition("Type", ScanOperator.Equal, "Reportee"),
            new ScanCondition("LastWorkingDate", ScanOperator.IsNull)
        }
    };
    var result = await _context.QueryAsync<EmployeeModel>(empCode.ToUpper(), config).GetRemainingAsync();
    return PrepareReporteeReturnModel(result); // swap the EmployeeCode and ReportingManagerCode and return
}

All the other code fragments and complete solution can be downloaded from the GitHub repository.

Once you complete the development, you need to create a DynamoDb table in your AWS account. There are many ways to create a resource in AWS: you can use the CLI, the Console, an SDK, or even the Visual Studio Toolkit. Below is the CLI command for creating the table and setting up the partition key (pk) and sort key (sk).

aws dynamodb create-table --table-name employees --attribute-definitions AttributeName=EmployeeCode,AttributeType=S AttributeName=ReportingManagerCode,AttributeType=S --key-schema AttributeName=EmployeeCode,KeyType=HASH AttributeName=ReportingManagerCode,KeyType=RANGE --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 --table-class STANDARD

Now you can deploy the serverless application using either Visual Studio or the dotnet tools. To deploy using Visual Studio, right-click on the project and select Publish to AWS Lambda.

To deploy using dotnet tools you need to follow the below steps in the command line.

dotnet tool install -g Amazon.Lambda.Tools
cd "AWSServerlessDynamoDb/AWSServerlessDynamoDb" # or whatever the folder is
dotnet lambda deploy-serverless

After successful deployment, you will get a Lambda endpoint (ApiURL) as below:

Image: 6

You can access Swagger UI by appending /swagger to the above URL and test the APIs from there.

Complete source code can be found here.

Happy coding!!

AWS Serverless and DynamoDb Single Table Design using .Net 6 – Part 1

Introduction

When developing a high-performance, scalable application, teams often reach for the technologies below.

  • Serverless Functions or Lambdas
  • Cloud managed NoSQL databases like DynamoDb or CosmosDb
  • Database design strategies like Single Table Design

In this article we’ll cover Single Table Design. In the next part we’ll create a serverless application using .Net 6 and DynamoDb.

Use Case

Recently we worked on a social networking platform and used Single Table Design in that project. That use case is complex and would be overwhelming for a beginner, so let’s consider an imaginary use case instead (it may not be a perfect fit for Single Table Design): an Employee REST API, which will help us work through a basic Single Table Design. Here are the features of the API:

  • User will be able to add an Employee
  • User will be able to fetch the Employee details with the EmployeeCode
  • User will be able to fetch the immediate Reportees (for the sake of simplicity) of the Employee/Manager

Schema

EmployeeCode | EmailId | FirstName | LastName | ReportingManagerCode

In the RDBMS world, EmployeeCode would be the primary key and ReportingManagerCode would be a foreign key pointing to the same table, queried with a self join.

Single Table Design

In an RDBMS, we use multiple tables in a database. Those tables may be related through foreign keys, and we tend to normalize them up to a certain level to avoid duplicate storage as far as we can. In the NoSQL world (especially in DynamoDb), there are no foreign keys or joins (and there is a reason for that), and duplication is not a concern. In Single Table Design, we put all the entities (e.g., Post, User, Comment, Follower) in a single table and may use a ‘Type’ attribute to identify the entity.

Why

In a read-heavy database, many (possibly millions of) users access different content at the same time, so you have to fetch the data as fast as possible. To return data quickly, you have to minimize the number of database requests per API call. In an RDBMS, even when we make a single call, most queries have complex joins involving multiple tables, and as the data size increases these queries take more time.

If you have to fetch the posts of all the users that someone follows, then in a SQL-based database you have to join ‘Users’, ‘Followers’, ‘Posts’, ‘Comments’, etc. If you store the entities in separate DynamoDb tables, then you have to make multiple calls from your backend to DynamoDb, do some JSON manipulation, and return the result to the frontend. We cannot afford that many Db calls from the backend, so we need to get all the data in a single Db request.

How

In DynamoDb within each table, you must have a partition key, which is a string, numeric, or binary value. This key is a hash value used to locate items in constant time regardless of table size. It is conceptually different to an ID or primary key field in a SQL-based database and does not relate to data in other tables. When there is only a partition key, these values must be unique across items in a table.

Each table can optionally have a sort key. This lets you search and sort among the items that share a given partition key. While the partition key must match a single exact value, a sort key supports range and prefix conditions. It’s common to use a numeric sort key with timestamps to find items within a date range, or to use string operators such as begins_with to find data in hierarchical relationships.

With only a partition key and a sort key, the possible types of query are limited unless you duplicate data in the table. Duplication itself does little harm because storage is cheap, but keeping multiple copies in sync when data changes is another headache. To address this, DynamoDB also offers two types of indexes: Local Secondary Indexes (LSIs) and Global Secondary Indexes (GSIs). We can discuss these topics in a separate session.

Single Table Design is not ‘Agile’: you have to identify the data access patterns at the beginning of the project. Otherwise you may discover a use case later that requires an entire redesign of the data structure. So let’s identify our access patterns first.

Data Access Patterns

Our patterns are the GET API responses we discussed earlier. In our case it’s simple as of now.

  • User will be able to fetch the Employee details with the EmployeeCode
  • User will be able to fetch the immediate Reportees of the Employee/Manager
  • User will be able to add an Employee

Let’s create the table as follows. The partition key is EmployeeCode, and the sort key is ReportingManagerCode.

Table: 1

So far things look simple: we can get the entity based on pk. Let’s evaluate the next access pattern and come back here if required.

User will be able to fetch the immediate Reportees of the Employee/Manager

Suppose we need to fetch all the direct reportees of user 11. If we could query by the sort key alone, we could fetch all the reportees of 11 in a single query, but this is where the challenges start. You cannot query a DynamoDB table without an equality condition on the pk. So if you query pk=11, you’ll get only one record.

The next step is to evaluate whether LSIs or GSIs can solve the problem. An LSI is just another sort key, and a query against it still needs the pk, so an LSI won’t work here. A GSI lets you define one more pk and sk, but a query against the index uses the index’s keys, so you can’t use the main pk or sk in that query.

The next option is to duplicate the data, so let’s think about that. In the table structure above, one record was self-sufficient (it had both EmployeeCode and ReportingManagerCode), but in the format below we have separated the entity. We also added a type attribute to identify the type of entity.

Table: 2

In user entities, both pk and sk are the same (i.e. EmployeeCode). We duplicated the same entities, swapped the pk and sk, and assigned the type ‘reportee’. Now let’s evaluate the query. If we run a query with pk=11, we get three records.

Table: 3

One record is for the manager and the others are for the reportees; we can filter by type using filter expressions if required. So the second access pattern has been solved, but now we have a problem with the first access pattern. Our first query was pk=11, but that now returns 3 records, so we need to fix it. We can query with pk=11 and sk=11. Solved!
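To make the two queries concrete, here is a minimal in-memory sketch of the duplicated layout. It does not use the DynamoDB SDK: the Item record, the EmployeeTable class, and the sample employee codes 12 and 13 are assumptions made up for illustration, and Query only mimics the rule that a pk equality condition is mandatory while sk is optional.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the single table (not the DynamoDB SDK).
public record Item(string Pk, string Sk, string Type);

public static class EmployeeTable
{
    // Assumed sample data: employees 12 and 13 report to manager 11.
    public static readonly List<Item> Items = new()
    {
        new Item("11", "11", "user"),
        new Item("12", "12", "user"),
        new Item("13", "13", "user"),
        new Item("11", "12", "reportee"), // pk = manager, sk = reportee
        new Item("11", "13", "reportee"),
    };

    // Mimics a query: pk equality is mandatory, sk equality is optional.
    public static List<Item> Query(string pk, string? sk = null) =>
        Items.Where(i => i.Pk == pk && (sk is null || i.Sk == sk)).ToList();

    // Second access pattern: direct reportees, filtered by type after the query.
    public static List<Item> Reportees(string managerCode) =>
        Query(managerCode).Where(i => i.Type == "reportee").ToList();
}
```

With this sample data, EmployeeTable.Query("11") returns three items, EmployeeTable.Query("11", "11") returns only the user record, and EmployeeTable.Reportees("11") returns the two reportee records.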

Conclusions

The use case we discussed here was a very basic one, but we still had to take care of multiple things and duplicate the data. Duplicating the data further complicates operations such as updates and deletions. You may need to implement message queues like SQS together with a BackgroundService to solve that issue.

In the next article, we will cover the actual practical implementation with code samples.

Generic Message Queue implementation using AWS SQS and .Net 6 BackgroundService

Requirement

In most of the enterprise projects we develop, we have to implement some cross-cutting concerns like Audit Logs. The important factor is that application performance should not suffer because of these Audit Logs (or similar cross-cutting concerns). So how do we implement this without compromising performance?

BackgroundService

Background tasks and scheduled jobs are something you might need to use in any application, whether or not it follows the microservices architecture pattern. The difference when using a microservices architecture is that you can implement the background task in a separate process/container for hosting so you can scale it down/up based on your need.

From a generic point of view, in .NET we call these types of tasks Hosted Services, because they are services/logic that you host within your host/application/microservice. Note that in this case, the hosted service simply means a class with the background task logic.

In our sample we will configure the BackgroundService in the same Web API project for the sake of simplicity, but in a real production scenario you should consider a separate service. This way the BackgroundService can offload the Audit Log writing from the Web API. Now the challenge is: how should we send the Audit Log objects to the BackgroundService?

Message Queue

Message queues allow different parts of a system to communicate and process operations asynchronously. A message queue provides a lightweight buffer which temporarily stores messages, and endpoints that allow software components to connect to the queue in order to send and receive messages. The messages are usually small, and can be things like requests, replies, error messages, or just plain information. To send a message, a component called a producer adds a message to the queue. The message is stored on the queue until another component called a consumer retrieves the message and does something with it.

Different implementations of message queues exist, and multiple cloud providers offer their own. Here we’re using AWS SQS.

Steps

Create an ASP.NET Core Web API

Install Dependencies

Create a folder Services -> Contracts and create a Generic Interface called IMessageService as follows:

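public interface IMessageService<T>
{
    // Deletes a processed message; id is the receipt handle returned by ReceiveMessageAsync.
    Task DeleteMessageAsync(string id);

    // Returns the received messages keyed by receipt handle.
    Task<Dictionary<string, T?>> ReceiveMessageAsync(int maxMessages = 1);

    // Serializes the message and pushes it to the queue.
    Task SendMessage(T message);
}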

Now let’s create the SQS Message Service by implementing the above Interface

Let’s first create the constructor, which will create the AWS SQS client. Before doing that, we need to create an IAM user with the required permissions.

Create IAM user

  • Go to the AWS console and search IAM.
  • Click on the Users panel on the left.
  • Click on the Add User button.
  • Provide a user name, select the Access key - Programmatic access checkbox, and click the Next: Permissions button.
  • Click on the Attach existing policies directly tab.
  • Search AmazonSQSFullAccess and select that policy. We require FullAccess because we will be creating the Queue programmatically if that Queue does not exist.
  • Click on the Next button twice and finally click on the Create User button.
  • Copy the Access key ID and Secret access key and store them in a safe place.

Configure the AWS Credentials

There are multiple ways to configure the AWS credentials. I used AWS CLI, which can be downloaded from here. Once it’s downloaded and installed on your machine, run the command below in Terminal or Command Prompt.

aws configure

The above command will prompt the user for the following details. It stores the keys in the ~/.aws/credentials file and the region in the ~/.aws/config file. The AWS SDK will fetch these settings and create the clients.

  • AWS Access Key ID
  • AWS Secret Access Key
  • Default region name

Constructor code will look like the below:

public SqsGenericService(ILogger<SqsGenericService<T>> logger, IConfiguration configuration, IHostEnvironment env)
{
    _logger = logger;

    // Reads the AWS settings (profile, region) from configuration.
    var options = configuration.GetAWSOptions();

    // This queueName will be used to create the SQS Queue for each type of object in different environments
    var queueName = $"que-{env.EnvironmentName.ToLower()}-{typeof(T).Name.ToLower()}";

    _amazonSQSClient = options.CreateServiceClient<IAmazonSQS>();

    // Blocks on the async call because a constructor cannot await.
    _queueUrl = GetQueueUrl(queueName).Result;
}

Most of the code here is self-explanatory. The configuration.GetAWSOptions() call fetches the AWS configuration.

Dynamic Queue creation for each environment and entity

The queueName variable is built by concatenating the environment name and the name of the generic entity. The GetQueueUrl() method fetches the queue URL if the queue already exists; otherwise it creates the queue.
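As a small illustration of the naming scheme, the sketch below pulls the interpolated string into a helper so the result can be checked in isolation. The QueueNames class and its For method are hypothetical names introduced here (the original code builds the string inline in the constructor), and AuditLogModel is declared as an empty placeholder only to keep the snippet self-contained.

```csharp
using System;

public static class QueueNames
{
    // Hypothetical helper mirroring the constructor's interpolated string:
    // "que-" + environment name + "-" + entity type name, all lower case.
    public static string For<T>(string environmentName) =>
        $"que-{environmentName.ToLowerInvariant()}-{typeof(T).Name.ToLowerInvariant()}";
}

// Empty placeholder type, declared only so the sketch compiles on its own.
public class AuditLogModel { }
```

For example, QueueNames.For<AuditLogModel>("Development") yields que-development-auditlogmodel, so each environment and entity type gets its own queue. Keep in mind that SQS queue names are limited to 80 characters and may only contain alphanumeric characters, hyphens, and underscores, so long or generic type names may need extra handling.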

The next method is SendMessage. This method will accept a Generic message object and use AWS SQS Client to push the serialized object to the Queue.

public async Task SendMessage(T message)
{
    // Serialize the generic object to JSON before sending it to the queue.
    var messageBody = JsonConvert.SerializeObject(message);

    await _amazonSQSClient.SendMessageAsync(new SendMessageRequest
    {
        QueueUrl = _queueUrl,
        MessageBody = messageBody
    });

    _logger.LogInformation("Message {message} sent successfully to {_queueUrl}.", message, _queueUrl);
}

The next method is ReceiveMessageAsync. This method fetches the messages from the queue and converts them to a Dictionary of MessageReceiptHandle and MessageBody, because the consumer of this service requires the ReceiptHandle to delete the message after processing.
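The receive-then-delete contract can be sketched without AWS at all. The class below is a hypothetical in-memory stand-in, not the SQS implementation: InMemoryMessageService and InFlightCount are names invented for this example, it uses System.Text.Json instead of the JsonConvert call shown earlier so that it needs no extra packages, and the interface is repeated only to keep the snippet self-contained. It may be handy as a test double for the worker.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

public interface IMessageService<T>
{
    Task DeleteMessageAsync(string id);
    Task<Dictionary<string, T?>> ReceiveMessageAsync(int maxMessages = 1);
    Task SendMessage(T message);
}

// Hypothetical in-memory stand-in: a received message stays "in flight"
// until the consumer deletes it using its receipt handle.
public class InMemoryMessageService<T> : IMessageService<T>
{
    private readonly ConcurrentQueue<string> _pending = new();
    private readonly ConcurrentDictionary<string, string> _inFlight = new();

    public int InFlightCount => _inFlight.Count;

    public Task SendMessage(T message)
    {
        _pending.Enqueue(JsonSerializer.Serialize(message));
        return Task.CompletedTask;
    }

    public Task<Dictionary<string, T?>> ReceiveMessageAsync(int maxMessages = 1)
    {
        var result = new Dictionary<string, T?>();
        while (result.Count < maxMessages && _pending.TryDequeue(out var body))
        {
            // A fresh handle per receive plays the role of the receipt handle.
            var receiptHandle = Guid.NewGuid().ToString();
            _inFlight[receiptHandle] = body;
            result[receiptHandle] = JsonSerializer.Deserialize<T>(body);
        }
        return Task.FromResult(result);
    }

    public Task DeleteMessageAsync(string id)
    {
        _inFlight.TryRemove(id, out _);
        return Task.CompletedTask;
    }
}
```

The consumer iterates over the returned dictionary, processes each value, and passes the key back to DeleteMessageAsync, which is the same pattern the worker in the next section follows.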

Worker Process

To implement the worker process, as mentioned earlier, we decided to use BackgroundService. Here is the complete code for the AuditLogWorker class.

public class AuditLogWorker : BackgroundService
{
    private readonly ILogger<AuditLogWorker> _logger;
    private readonly IMessageService<AuditLogModel> _messageClient;

    public AuditLogWorker(ILogger<AuditLogWorker> logger, IMessageService<AuditLogModel> messageClient)
    {
        _logger = logger;
        _messageClient = messageClient;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            _logger.LogInformation("AuditLogWorker running at: {Time}", DateTime.Now);

            var messages = await _messageClient.ReceiveMessageAsync();
            foreach (var message in messages)
            {
                //You can write your custom logic here...
                _logger.LogInformation("AuditLogWorker processed message {userID}, {Action}", message.Value?.UserId, message.Value?.Message);

                // message.Key is the receipt handle needed to delete the message.
                await _messageClient.DeleteMessageAsync(message.Key);
            }

            //Delay can be set according to your business requirement.
            await Task.Delay(5000, stoppingToken);
        }
    }
}

Modify the Program.cs to add the necessary dependencies and configure the hosted service as below:

builder.Services.AddSingleton<IMessageService<AuditLogModel>, SqsGenericService<AuditLogModel>>();
builder.Services.AddHostedService<AuditLogWorker>();

Create a simple REST API method to accept an object and push that item to the queue and test it. The testing method will look like the below:

[HttpPost]
public async Task Post([FromBody] AuditLogModel model)
{
    await _messageClient.SendMessage(model);
    _logger.LogInformation("Message pushed to the queue successfully.");
}

That’s it folks, I hope everybody enjoyed the blog. The entire code can be downloaded from the GitHub repo

Happy coding!!!