CQRS with Event Sourcing: Why Is It a Good Fit for Your Analytics Needs?

March 3, 2017

Andrei Kaminski, .NET developer, Softeq

When business people hear the acronym “CQRS” or the term “Event sourcing” for the first time, these sound like nothing more than fancy, incomprehensible combinations of letters. They don’t ring a bell or give any clue about how they might help your business grow, improve operational efficiency, increase your company’s ROI, or provide actionable insights.

Indeed, these terms belong to the vocabulary of people who talk tech and are no strangers to application development. Yet they are worth bringing to the discussion table when you set out to create a custom solution for the unique challenges of your business. They name an application architecture pattern and a data representation pattern that can be a particularly good fit for your software initiatives, including those meant to solve your analytics problems.

So what on Earth does CQRS mean? How does it relate to Event sourcing? Why should you even care about both of these?

CQRS and Event Sourcing Basics, or When an N-tier Architecture Can Be an Outcast

To get to grips with CQRS and Event sourcing, let’s look at one of the most common software architectures on the Web. A typical web application has a classic N-tier architecture divided into three core layers: a user interface, business logic, and a data layer. Other layers, such as integration or transport, can also be part of the solution design to meet its functional and non-functional requirements, but they don’t change the main idea.

A CQRS-based architecture differs from the multitier option in one key respect: the business logic layer is divided into two autonomous parts. One processes user commands according to defined business rules; the other handles user queries. Commands are actions that change existing data and rewrite the system state, whereas queries are requests to read the system state and deliver data to a UI. Accordingly, CQRS stands for Command Query Responsibility Segregation.
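To make the split more tangible, here is a minimal sketch in Python. The names `PlaceOrder`, `CommandHandler`, and `QueryHandler` are illustrative assumptions, not part of any framework, and both sides share one in-memory dictionary purely to keep the example short; a real system would more likely give each side its own storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    """A command: a request to change the system state."""
    order_id: str
    product: str
    quantity: int


class CommandHandler:
    """Write side: checks business rules, then changes state."""

    def __init__(self, orders: dict):
        self._orders = orders

    def handle(self, command: PlaceOrder) -> None:
        if command.quantity <= 0:
            raise ValueError("quantity must be positive")
        if command.order_id in self._orders:
            raise ValueError("order already exists")
        self._orders[command.order_id] = {
            "product": command.product,
            "quantity": command.quantity,
        }


class QueryHandler:
    """Read side: returns data and never modifies it."""

    def __init__(self, orders: dict):
        self._orders = orders

    def orders_for_product(self, product: str) -> list:
        return sorted(
            order_id
            for order_id, order in self._orders.items()
            if order["product"] == product
        )


storage = {}  # assumption: one shared in-memory store, for brevity only
commands = CommandHandler(storage)
queries = QueryHandler(storage)

commands.handle(PlaceOrder("o-1", "kettle", 2))
commands.handle(PlaceOrder("o-2", "kettle", 1))
print(queries.orders_for_product("kettle"))  # ['o-1', 'o-2']
```

The business rules live only in the command handler, so the query handler stays trivial and can be changed or scaled without touching them.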

And where does Event sourcing fit in? Event sourcing works very well in tandem with a CQRS-based application architecture. It is a data representation pattern applied at the database level. Instead of storing the current state of a system in tables, the pattern records the outcome of each accepted command in the database as an event. The data in storage is thus a series of events, each describing a change that a command made to the system. These events can then be replayed to reproduce the latest state of the system, its state at any given moment in time, or the sequence of events for a reporting period. The database literally becomes a log of the data changes triggered by user commands. When chosen for an appropriate business context, Event sourcing can amplify the advantages of a CQRS-based solution, providing a hoard of useful data about system operation at a very fine level of detail.
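A rough sketch of this idea, assuming a hypothetical in-memory `EventStore` and two invented event types for a warehouse stock level, could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    sequence: int  # position in the log, used here in place of a timestamp
    kind: str      # "ItemsStocked" or "ItemsSold" (hypothetical event types)
    quantity: int


class EventStore:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, kind: str, quantity: int) -> Event:
        event = Event(len(self._events) + 1, kind, quantity)
        self._events.append(event)
        return event

    def read(self, up_to=None) -> list:
        """Return all events, or only those up to a given sequence number."""
        if up_to is None:
            return list(self._events)
        return [e for e in self._events if e.sequence <= up_to]


def replay(events) -> int:
    """Rebuild the stock level by applying every event in order."""
    stock = 0
    for event in events:
        if event.kind == "ItemsStocked":
            stock += event.quantity
        elif event.kind == "ItemsSold":
            stock -= event.quantity
    return stock


store = EventStore()
store.append("ItemsStocked", 20)
store.append("ItemsSold", 3)
store.append("ItemsSold", 4)

print(replay(store.read()))         # 13, the latest state
print(replay(store.read(up_to=2)))  # 17, the state after the second event
```

Nothing is ever overwritten: the current stock level is not stored anywhere, it is derived from the log whenever it is needed.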

But what’s wrong with the single business logic tier of an N-tier solution? Why would we want to split the business logic into two parts or store data as events? What does the business gain?

In a nutshell, in high-load scenarios with complex business logic, using a common N-tier approach with a single business logic tier to process both commands and queries (which, as a side note, usually arrive in very unequal numbers) will probably be cumbersome, and certainly not the easiest path to take as your business gains momentum and starts to grow. Let’s look at the potential benefits of CQRS and Event sourcing, and at what they offer for your enterprise analytics goals, in more detail.

Do You Really Have to Put All Your Eggs in One Basket?

First off, let us elaborate on why the CQRS pattern can give you a leg up with high-load systems and sophisticated business logic. As said before, a classic N-tier application sends both user queries and commands through the same business logic tier. This means the application spends the same computational resources on intricate business rules as on simple database reads and on more complex data queries, which are fundamentally different from commands by nature.

Given that commands are, as a rule, far fewer than queries, an N-tier solution will most probably use its application tier resources inefficiently: capacity provisioned for the more demanding command processing is spent on simple read requests.

For example, in a typical eCommerce solution, the number of user queries to display product catalogs and individual items will outweigh the number of placed orders, no matter how hard we try to convert each visit into a purchase. Only placing an order changes the application state and involves processing business rules.

The same inefficiency arises when you have to scale a high-load N-tier solution to address the growing needs of a business. Rather than adding capacity only to the part of the application that experiences a surge in traffic and leaving the part with a steady load unchanged, you have to scale the entire application tier, regardless of whether queries and commands both need the boost.

Added Level of Comfort and a Bundle for Future Growth

One disclaimer about the applicability of CQRS to high-load systems should be made straight away: CQRS is not an all-purpose architecture option, however productive it may look. It is not meant to be applied across an entire software solution, and it is unreasonable to apply it to a relatively small application. Your best bet is to spot the particular components of your system that will benefit from a CQRS-based implementation. The best candidate for the CQRS architecture is therefore a complex modular application in which each module is responsible for a certain business function and has its own bounded context.

For starters, let’s see what advantages CQRS and Event sourcing can give you in general:

– Targeted scalability and optimization

If the business logic serves different types of operations separately, you can optimize and refactor one part without having to touch its “counterpart”.

– Increased efficiency, performance and failover standards

Better optimization of each part of the business logic typically results in improved performance, increased application efficiency, and capacity to handle a higher load.

– Maintainability of code and its testability characteristics

When the code is not all put “into one pot”, and each part of the application logic is written using the semantics and syntax best suited to express its rules, the code becomes more succinct, comprehensible, and explicit. It can then be reused by non-technical people for other project needs, for example, automated test case generation.

– Improved communication between business and tech teams

Segregating queries and commands gives the business logic a clear structure, enabling a development team to focus on business rules while writing code. Approaching a tech problem from a business perspective streamlines communication between the two sides.

– Flexibility of technology stack choice

The same segregation principle gives you a free hand in choosing a technology stack; in fact, you are not tied to a single stack at all. You can go for NoSQL storage for the Write part and an RDBMS for the Read part, or for any other combination.

– Efficient development tasks segregation

Segregated responsibilities of business logic allow you to assign frontend and backend tasks to two different teams, even if they work remotely. These tasks can proceed in parallel because they do not overlap.

– Streamlined maintenance and support in the future

Clear code and a clear application tier structure make maintenance and support of your solution easier in the long run.

– Complete transaction and system log at your disposal

An event-based storage is essentially the most exhaustive history of your solution’s operation that you could have: whether you need software logs or transaction logs, it is all laid out in front of you.

– Easy bug reproduction and prompt fixes

With Event sourcing in place, whenever the application behaves erroneously, you can inspect the events stored in your database, replay them, understand where things went wrong, and quickly fix the problem.

– Ability to quickly restore system data after downtimes

Again, the Event sourcing pattern on the storage side enables you to replay events up to the latest state before an application crash and restore the solution after downtime within a relatively short period.
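As an illustration of the bug reproduction point, the sketch below replays a log one event at a time and stops at the first event that breaks a business invariant. The event types, the sample log, and the rule that stock can never be negative are assumptions made up for this example.

```python
def first_invalid_event(events):
    """Replay events in order; return the first one that breaks the invariant."""
    stock = 0
    for position, (kind, quantity) in enumerate(events, start=1):
        if kind == "ItemsStocked":
            stock += quantity
        elif kind == "ItemsSold":
            stock -= quantity
        if stock < 0:  # assumed invariant: stock can never be negative
            return position, (kind, quantity), stock
    return None  # the whole log replays cleanly


log = [
    ("ItemsStocked", 10),
    ("ItemsSold", 4),
    ("ItemsSold", 8),  # sells more than is left in stock
    ("ItemsStocked", 5),
]

print(first_invalid_event(log))  # (3, ('ItemsSold', 8), -2)
```

With a state-only database, the final stock figure would look plausible and the faulty sale would be invisible; the log shows exactly which event put the system into an invalid state.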

Finally, let’s get to the point and explore the opportunities that the “CQRS-Event sourcing” duo opens up in the enterprise web context. How can a company leverage these two patterns for in-house data analysis and data management? Here are a few points to highlight:

– Ample data, system operation statistics and user activity history easily extractable for further analytics, visualization and reporting

Storing data as events means having the raw history of all operations on your system at your disposal. This history can be quickly reproduced and rendered into a UI view of the required format, in line with your visualization standards.

For example, let’s turn again to an eCommerce application. Imagine that one day you need the numbers for shopping cart items that were never purchased and, after sitting idle in the cart for some time, were deleted from it. With a classic relational schema that registers only the state of the solution after the latest operation, you won’t be able to get this data straight away from an application that has already been around for some time. Removing all items from the shopping cart updates the corresponding row in a database table, the number of items in the cart becomes zero, and all previous history is lost. You would have to create a separate software component for this purpose, wait while it gathers data, and only then have information you can analyze and draw conclusions from. The event sourcing approach, conversely, keeps a record of the entire transaction history, including the deletion of shopping cart items. You can simply look back through the event log to get the stats you need, retrieving the data in short order by writing an appropriate event handler. As you can see, this is far less time-consuming.

Another point is that, with event-sourced data, your business analytics software can harvest data from the raw event log on the storage side and then denormalize it (adapt it) for use in different parts of your system by different end users. The data derived from events can be tailored to the requirements of UI views for certain user roles and give insights matching a particular business context.
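The shopping cart scenario can be sketched in a few lines. The event names and the sample log below are hypothetical; the point is only that the report is computed from history that a state-only schema would have discarded.

```python
from collections import Counter

# Hypothetical event log of a shop: (event type, cart id, product).
events = [
    ("ItemAddedToCart", "cart-1", "kettle"),
    ("ItemAddedToCart", "cart-1", "toaster"),
    ("ItemRemovedFromCart", "cart-1", "toaster"),
    ("ItemAddedToCart", "cart-2", "toaster"),
    ("OrderPlaced", "cart-1", None),
    ("ItemRemovedFromCart", "cart-2", "toaster"),
]


def removed_items_report(event_log) -> Counter:
    """Event handler that counts how often each product was removed from a cart."""
    removed = Counter()
    for kind, _cart_id, product in event_log:
        if kind == "ItemRemovedFromCart":
            removed[product] += 1
    return removed


report = removed_items_report(events)
print(report["toaster"])  # 2
print(report["kettle"])   # 0
```

Because the handler runs over events that are already stored, the report covers the whole lifetime of the application, not just the period after the question was first asked.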

– Simplified tech implementation due to a clear application tier structure and segregated logic for commands and queries

If the data you need goes beyond standard stats about the interaction with and operation of your application (as in the eCommerce example) and requires a large number of complex queries, which is the case for serious business analytics software, then the CQRS principle of segregating queries and commands will perform particularly well. The logic of the solution will be divided into two parts: the Write part and the Read part.

An OLAP system with its complex querying mechanisms maps well onto the Read part, providing data that is used directly as the final product of your analytical activities. The Write part continues handling commands and storing incoming events without any noise or interference from queries. The events can vary depending on the nature of your business: they can reflect the results of direct user commands, be sensor signals captured from your operational assets, or provide data about the usage and operation of your equipment and machinery. Everything hinges on what serves as a source of valuable, actionable data for a particular enterprise.

The recorded series of events becomes a well of knowledge that can be used to make predictions, put together equipment maintenance plans, forecast production capacity, and so on.

In the end, with such an implementation you will enjoy streamlined processing of business rules and a clear business logic structure.
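Here is a small sketch of how the Read part might turn raw sensor events into a query-friendly summary. The machine ids, the temperature readings, and the function name `project_temperature_stats` are invented for illustration; a production Read part would more likely feed an OLAP store than a Python dictionary.

```python
from collections import defaultdict

# Hypothetical sensor events: (machine id, temperature in degrees Celsius).
events = [
    ("press-1", 70),
    ("press-2", 64),
    ("press-1", 74),
    ("press-1", 78),
]


def project_temperature_stats(event_log) -> dict:
    """Fold raw events into one summary row per machine (the read model)."""
    readings = defaultdict(list)
    for machine, temperature in event_log:
        readings[machine].append(temperature)
    return {
        machine: {
            "count": len(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for machine, values in readings.items()
    }


read_model = project_temperature_stats(events)
print(read_model["press-1"])  # {'count': 3, 'max': 78, 'mean': 74.0}
```

The Write part only appends events; all aggregation happens on the Read side, so analytical queries never compete with incoming writes.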

– Real-time data signals delivered in no time to different parts of a distributed software system

The event sourcing pattern can be a particularly good fit when you have to deal with real-time data that should be fed into separate parts of a distributed software system. If your backend stores the results of commands as events, you can take advantage of multiple microservices (small, independently deployable services, each responsible for one part of your application). Each microservice can subscribe to the particular events it needs, and users of that part of the software can in turn subscribe to these events. This data (critical real-time signals from the end user’s perspective) is then pushed to web dashboards or mobile devices, notifying subscribers about events that matter to them. Again, the key benefit is that data retrieved from the same event log can be denormalized (formatted for a particular UI view) to meet the requirements of different parts of a distributed solution.

Such signals become alerts about predefined conditions: suspicious or unauthorized user actions, performance bottlenecks in the equipment that is part of your tech-enabled operation, or abnormal indicators that need attention. You can then write algorithms that scan this data, identify patterns, and generate reports providing insights into your business processes, manufacturing metrics, financial data, and more, depending on the needs of your business and the software you set out to build.
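The subscription mechanism can be sketched as follows. `EventBus` here is a hypothetical in-process stand-in for a real message broker, and the event type and payload fields are assumptions chosen for the example.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for a message broker (an assumption of this sketch)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Deliver the event to every handler registered for its type.
        for handler in self._subscribers.get(event_type, []):
            handler(payload)


dashboard_alerts = []
mobile_notifications = []

bus = EventBus()
# Two subscribers format the same event differently for their own view.
bus.subscribe(
    "TemperatureExceeded",
    lambda e: dashboard_alerts.append(f"{e['machine']}: {e['value']} C"),
)
bus.subscribe(
    "TemperatureExceeded",
    lambda e: mobile_notifications.append(f"Check {e['machine']} now"),
)

bus.publish("TemperatureExceeded", {"machine": "press-1", "value": 95})
bus.publish("OrderPlaced", {"order_id": "o-1"})  # no subscribers, so it is ignored

print(dashboard_alerts)      # ['press-1: 95 C']
print(mobile_notifications)  # ['Check press-1 now']
```

Each subscriber receives the same event but shapes it for its own audience, which is the denormalization described above in miniature.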

– Easier redesign of data storages and data migration

Hardly anyone would run a line-of-business application or enterprise analytics software, or a business in general, without plans for future growth. Growth is an intrinsic attribute of any business-focused solution, and BI and analytics are no exception. The “CQRS-Event sourcing” duo gives plenty of room for such growth, allowing for agility, scalability, and responsiveness to market demands.

With an N-tier option, growth is handled smoothly as long as scaling is linear, that is, backend instances are replicated without changing the data schema or the rules of interaction with and within the business logic. But real-life examples show that with incremental growth of application functionality, constantly evolving technology, and ever-changing product offerings on the market, you will eventually reach the point where linear scaling is no longer enough, and you will have to think about redesigning the data storage and migrating it (into the cloud, to another technology, etc.).

If you started with CQRS and Event sourcing from the very beginning, you have already avoided most of the daunting migration issues that show up when a data schema changes or when read and write requests are entangled in the same place. The chances that something will go south during this process are reduced. The event log can later be mapped onto whatever database structure you choose, and there is no need to work out how to refactor logic that handles both commands and queries.
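To show what mapping the event log onto a new database structure can mean in practice, here is a hedged sketch that rebuilds a read store from scratch by replaying events into a fresh SQLite table. The event types, the amounts, and the `order_totals` schema are assumptions for the example; only the standard `sqlite3` module is used.

```python
import sqlite3

# Hypothetical event log: (event type, order id, amount in cents).
events = [
    ("OrderPlaced", "o-1", 2500),
    ("OrderPlaced", "o-2", 1200),
    ("OrderRefunded", "o-2", 1200),
]


def rebuild_read_store(event_log) -> sqlite3.Connection:
    """Create a fresh store with the new schema and fill it by replaying events."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE order_totals (order_id TEXT PRIMARY KEY, net_cents INTEGER)"
    )
    for kind, order_id, amount in event_log:
        if kind == "OrderPlaced":
            db.execute("INSERT INTO order_totals VALUES (?, ?)", (order_id, amount))
        elif kind == "OrderRefunded":
            db.execute(
                "UPDATE order_totals SET net_cents = net_cents - ? WHERE order_id = ?",
                (amount, order_id),
            )
    db.commit()
    return db


db = rebuild_read_store(events)
rows = db.execute(
    "SELECT order_id, net_cents FROM order_totals ORDER BY order_id"
).fetchall()
print(rows)  # [('o-1', 2500), ('o-2', 0)]
```

If the schema changes again later, the old table can simply be dropped and a new one populated from the same log, without converting data between the two layouts.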

Summary

When a development team sees the potential benefits of a CQRS/Event sourcing implementation in terms of future-proofing and performance, rejecting the option right off the bat, based solely on third-party warnings about its proverbial complexity, may turn out to be short-sighted. Look into your domain carefully, study its requirements, and analyze the relationships both within the domain and with other domains that will be involved in the operation of the solution-to-be before you make the final decision. Both CQRS and Event sourcing may well be worth the effort and can save you time and money in the long run. The bottom line is that such decisions are made on a project-by-project basis, and these patterns are well worth considering for certain software solutions.

 

Andrei Kaminski is a leading .NET developer at Softeq with demonstrable experience both in front-end and back-end development, a certified Microsoft professional, a web enthusiast with a keen interest in opportunities of CQRS and Event sourcing and a can-do attitude to creating complex data-driven web solutions, business process automation applications, and enterprise software at large.

 
