An overview of Reaqtor AKA Cloud Native Rx
.NET Foundation Summit 2023
Endjin are proud to be a .NET Foundation Corporate Sponsor, as we are maintainers of Reactive Extensions for .NET, Reaqtor (both of which are part of the .NET Foundation), and over 50 of our own Open Source Projects.
In this talk, as part of the .NET Foundation Summit 2023, Ian Griffiths gives a high level overview of Reaqtor and explains how it evolves many of the core concepts in Reactive Extensions for .NET (Rx), to support Cloud Native Reactive Programming.
Transcript
Ian Griffiths
Reaqtor - reliable Rx at scale for high performance event processing. And my goal is for the meaning of that title to be understandable by the end of this talk. So what exactly is Reaqtor and why am I talking about it? Reaqtor is a system for stateful, reliable distributed processing of events, and it's based on the programming model that originated from the Reactive Extensions for .NET.
Rx, the Reactive Extensions for .NET has been a an open source project for a very long time now also under the .NET Foundation ownership. But more recently another project that grew out of the same source in Microsoft became open source and that's also under the ownership of the .NET Foundation.
And we, at endjin, my employer, we are the main maintainers of this, and we were particularly interested in it because it provides a model for doing. high volume, realtime online analytical processing of event driven sources. I'm gonna talk about some of the application types that we have found. It's to be applicable to but it's basically really good if you've got event sources that generate data on their own schedule, potentially at very high volumes, and you want to do analytics over it, where there might be very large numbers of distinct.
Entities of interest. So that's the broad idea behind it and to understand how it works and the, what the nature of Reaqtor is. It's useful to understand a little bit about where it came from and why it was invented. So this actually grew out of a project inside of Microsoft. So this was originally written by Microsoft.
It was an internal piece of technology and we at endjin, were. helping to try and get it actually open sourced over the course of several years, and it eventually came out in 2021 and it came originally out of the cloud programmability group. So the cloud program programmability group was formed back in the days when Azure was very new and Microsoft was concerned about how easy it would be for developers to exploit the potential of a cloud based programming model.
And so they were looking at potential new programming models that could help developers get to grips with this. And one of the things that they came up with was a project called Volta, which was talked about publicly, but it was never actually released. And I. Some of the ideas in it were a long way ahead of their time and some of it's now come to light in things like Blazor.
But one of the main things that did come out of it was Rx, the Reactive Extensions for .NET, but they were actually originally designed to support cross layer event communication. So one of the models they had in mind was that you might be building a user interface that would run in the browser.
You might have some sort of middle tier running in some servers on Azure in the backend, and then maybe other lower levels behind that. And they wanted to have a reliable way. to flow events up and down through the system. So if the user click to button on the web browser, that could be communicated down into the backend.
If your data model changed, it could push changes back up to the front end, and they wanted to unify this. They wanted to come up with a single conceptual model for representing this that would enable. Distributed flowing of events in a way that could be handled uniformly. And the Reactive Extensions for .NET were what came out of this, and that was what they were originally designed for.
They were designed essentially to enable remote ability of event streams. However, this same group then realized that the model they had they'd come up with was very flexible. It was actually able to express all sorts of interesting computations and the idea of event streams and subscriptions to event sources were in their own right and interesting entity to have on the back end.
And so that same team created a product internally called Reaqtor. So this was only used inside of Microsoft for quite a long time. They did talk about it. So if you've watched any, if you used to be a fan of Channel9 back in the day, you have probably come across this before. So Bart De Smet gave a lot of talks about this and they had hopes to release it.
It got used a lot internally. It was behind things like Cortana and it was used internally in Bing. used in bits of Microsoft Exchange. So it got a lot of use, but they never quite managed to get it out of the door. However, Fortunately after work, after some discussions with them about ownership and who was gonna maintain it, eventually it did get out then.
We now that's available and so we can build on top of it. The first thing to get to grips with before I talk about what Reaqtor is like and what sorts of things we can do with it, is the underlying model. If you are already familiar with Rx, the reactive extensions for.NET, then this will be revision.
But just in case you are not, the basic model presented by Rx is really very simple. It's one thing after another. . So it's a sequence of things. Those things might be timestamps, they might be measurements of a temperature, they might be atmospheric quality monitoring. They might be key presses. They might be cars driving past a beam in a car park.
It could be any event. And they'll, they just happen one after another. That's the basic programming model. So conceptually it's not very different from an IEnumerable of tea, it's just a sequence of things. But the critical feature of Rx is that the source decides when to produce an item. So unlike an IEnumerable of tea where you can just say, give me the next item now, please gimme the next item.
You can just write a loop to pull the items out with observables. The source tells you when it has an item. I, as a program, I can't force the user to click on a button right now. I have to wait till they do it. But when they do it, then that's their choice. So it's in the nature of observable sources that they produce values as they're good and ready.
And because the. Basic conceptual model is just a sequence of things. It's the same conceptual model that Iron Numerable has. We can use Link. One of the things the reactive extensions and also Reaqtor give you is a complete set of LINQ operators, which you can use to build up queries and to compose computations.
And so it's we're able to build On top of this, sorry, I just noticed there are some chats. Are we still having issues with the audio or is it okay now? I've just noticed this was from 10 minutes ago. It's all good. Continue. It's all good. Okay. I will continue. Thank you. So the we can build up computations on the link programming model, and this is basically what Rx is and Reaqtor basically makes that.
Distributed, reliable, and persistent. So at this point, a lot of people say isn't this just and they then insert some technology here. So some people say this is just message queuing, isn't it? To which the answer is no, not really. So with message queuing, essentially message queuing systems are all about the delivery of messages.
They provide a safe place to put messages and then there's a model for consuming them. And they generally give you guarantees, like there will be exactly once consumption or. At least once consumption of the messages. But they don't generally do any sort of computation over the messages. They don't generally inspect what's in the messages.
You can't give a message to some instructions to say please group the messages by the value inside this field here. And and then. , gimme a sort of a windowed aggregation over the things within each group. You can't do that sort of processing of the concepts. It's up to the application code to decide what to do with MessageQ.
So it really isn't like those at all. And it isn't really like event streaming systems either. So things like event hubs in Azure or Kafka, if you're in that world, those. Often get confused with message queues. They're actually very different concepts with event streams. The idea is that a cons individual consumers can see every single event in the stream, and there might be multiple consumers.
So that's quite a different idea from a message queue where there's generally one consumer per message. So with event streams, again, it tends to be a rule with these systems that they are all about convey. The message and they have no business with the payload. So generally speaking, if you look at these sorts of streaming systems, they don't know how to make decisions based on what is in the message.
There might be some decision making around things like the metadata attached to the message, but nothing that they can actually peek into the messages itself. Whereas Reaqtor can absolutely do that. Reaqtor you generally do describe computations that are gonna look in the messages flowing through the.
If you are into this space, you might well go. Okay. Is it like Spark structured streaming then? To which the answer is yes, a bit. You can solve similar application problems with each of these. However, there are some substantial differences with the programming model. The biggest one being that in Reaqtor we base everything around Expression Trees in .NET and they can be moved around and it's possible for the expressions to be rewritten and distributed automatically outside of query engine processing.
For example, you can build systems based on Reaqtor where some of the processing is pushed out into edge devices, like sensors out in the field, for example. But also the whole thing can be used entirely in process as well. As we'll see, the whole code base for Reaqtor has been structured in such a way that you don't have to use the processing environment that it provides.
You can actually use all the technology it has in isolation or in your own host process. So having said what it's not what is it? What sorts of things can we build with Reaqtor? I'll start with an example. That was one of the earlier applications that was a Reaqtor has actually been used for in practice.
So if you have. Any sort of digital calendar, most of the ones out there are able to give you notifications to say it's probably time to leave now if you're gonna meet, make that meeting. So you have services like if you've got an appointment that requires you to drive to the other side of town and there's a traffic jam your phone can give you a mesh.
You're saying you better leave now because it's gonna take you 20 minutes longer than you thought it was going to get to that meeting. And To be able to do this. We have potentially millions of entities in the system, millions of ac of appointments across all the users in let's say, outlook.com or whatever.
Or Microsoft 365. There's loads and loads of appointments all being monitored. Each of them has its own notion of what destination it's interested in, and they all might want to monitor local conditions, so they might want to look at event sources like traffic information. So they might be monitoring the traffic information on certain roads or certain routes, and so they might.
Repeatedly update the estimation of how long it's gonna take to get somewhere. And so this is very much an event driven system, so whenever the traffic conditions change, you need to change your estimation of when it's time to leave for an appointment. If someone changes the appointment, that also needs to be updated and it needs to scale.
There are millions of users of some of of exchange and they've each got, multiple appointments all set up in advance, possibly for years in advance. And these things all need to deliver the right notifications at the right moment. So that's one of the example scenarios. It's lots of events, but also Lots of different processing for those events.
So the key to understand about this style of thing is we're not just taking a huge amount of data and aggregating it down in a sort of funnel shape. We're not reducing a huge volume of things to a single report. The hugeness kind of flows through the system, millions of people and you've got millions of completely distinct, unique, customized notifications that need to be given out.
It's not just an aggregation with a sort of single page result. It's that ability to deal with the scale throughout the whole life cycle of the system is part of what makes these things challenging. So that was one of the things that Reaqtor has been used for extensively. And it was used to prove out the scale of the thing.
It demonstrated that it was definitely able to cope with tens of millions of things happening concurrently. But at endjin we have used this for some other applications. So one of the systems that we have built on top of this technology was an atmospheric monitoring system. So we ran a project with a local school where we were doing some.
Quality monitoring with a view to trying to work out whether traffic jams on an adjacent road were causing issues in classroom air quality. And we had these little monitor devices had bristling with little air sensors so they could measure particulates at various different sizes, CO2 levels, temperature, humidity and various other chemical sensors and things on them. And they would just report measurements every so often. We had these in a few different classrooms and we wanted to collect all the data and these were naturally modeled by. The Rx abstractions, just, they're just observable sequences of measurements. But what we also wanted to be able to do was to, excuse me, perform analysis over the stream of events as they were coming in.
We wanted to be able to define rules to say if these sorts of thresholds are, or things are changing faster than a certain rate, we would like to be able to respond to that. And Reaqtor was a natural fit for.
Another system we've used it for was on a project with a broadband provider. So they have millions of broadband routers in people's homes. And the, a volume of telemetry that those things are capable of producing is phenomenal. If you looked at every single message that every device in every person's home on a single broadband provider's network was sending at you it would just be, just an incomprehensibly large number of events. This company wanted to be able to perform analysis over these things in order to get ahead of problems in the network. So they wanted to find out that things were starting to go wrong before they started getting calls from customers. They wanted to be able to perform a live analysis. They could answer questions like, is this problem happening to just this customer? Or is this a problem affecting all the customers on a particular exchange, for example, because that's quite an important operational understanding. You don't wanna send 30 people out in vans to 30 separate customers when you could have sent one van to one exchange to look at the problem where it's really happening. So this ability to run anomaly detection algorithms to run moderately complicated computations across very large streams of data and to be able to. Analyze things at the level of individual devices of which there were millions, but also to aggregate and analyze at the level of exchanges of which there were hundreds.
That ability to span those sorts of scales was it was one of the things that made us want to use Reaqtor and is actually part of what drove us to try and get the thing open sourced in the first place. So those are some example kinds of applications that they might be used for. So with that in mind, let's look in a bit more detail at this so I can get to a point where I can show you a demo.
So what does Rx look like? It's pretty simple. You have a type which is actually baked into the .NET runtime libraries. It's themselves called IObservable<T>
. It's an interface and it just represents a sequence of things and has a single method subscribe. The only thing you can do to an observable is subscribe to it.
You can say, I am interested in the events that this observable has to offer, so I would like to subscribe to its events. And to do that you supply a thing called an. Which must implement the IObserver
interface, which requires you to do three things. You must supply an on next method that will be invoked every time the observable source emits some information.
So in the case of our atmospheric monitors, every time we took a reading from a sensor, the observing of any observers attached would just get the on next measurement of whatever it was. , there's also an on complete. Some sources naturally reach an end, and if they do, they call on complete on their observers.
And once they've done that, you'll never hear from them again. Or they might fail. They can say it's all gone wrong. Here's an exception. That should have an exception argument missing on the slide. Sorry. But yes, that will also be the last thing you'll ever hear from a source as well if it dies on you.
So that's the basic model. So now I'm gonna show that in the context of an actual application, because people often struggle a bit to see how this could be of any practical value. because it all seems a bit like it might be a little bit Mickey Mouse. So I'm gonna show you an example based around shipping data.
There's a system called AIS, which is the Automatic Identification System, which is the system that pretty much all ocean-going vessels are required to use. So if they've got GPS transmitters on them the thing they're actually broadcasting to tell you their location is AIS messages. Automatic identification system. At endjin we supply a couple of free open source libraries. These ones aren't under .NET Foundation thing, they're just free things We provide called AIS.NET that you can use to process these things. And in this dot net notebook, I'm using .NET Interactive, Polyglot Notebooks to write C# in a notebook.
I am going to consume AIS data from Norway why Norway? Because the Norwegian government very generously makes a free AIS stream available for all of the ships around Norwegian coastlines. So we have a receiver for that so we can connect to the Norwegian service to receive all the locations of all the ships anywhere near You can also run this notebook in a kind of offline mode with a pre-processed data if you prefer not to have it live. But this is gonna be live because I'm crazy. We're gonna wrap that around a thing that presents it in Rx terms, and then we're gonna show all this in a map. I wanna be able to show the location of the various ships on a map.
So I've got a cell here that just basically displays a map. We're gonna zoom in on Norway because that's where the data will be coming from. And scrolling down I need to hook up my.NET code to the JavaScript that's running the map. And then we are going to arrange to be able to send data from our T sharp code into that handler.
On the JavaScript side, this is just a bit of glue so we can connect things together. And then we get to the point of this, which is an Rx query. So this. Is where we're gonna use that receiver host. So the receiver host provides an observable source of messages. So this message is property is an IObservable<AISMessage>
, and we are gonna do a bit of work on it because it turns out that if you want to plot on the map, the location and name of a vessel you can't just read the location and name out of the AIS messages and there's a reason for that.
Ships with GPS transceivers in them. Send out their location relatively often. They send out messages with their location and speed and heading and a few other things like that. Maybe every few minutes, maybe every few seconds, depending on how fast they are moving. They also send out messages that describe the ship's name and its width and its length.
They don't send those out very often because ships don't tend to change size as often as they change position. Not unless you're having a really bad day in the ocean. , they tend to give us the vessel name type messages much more infrequently. So if we wanna show on a map names and locations, we had to somehow reconcile those two things.
We had to join them back together. So this is essentially doing a dynamic join over some data streams. So what we're gonna do is we are going to group the messages, first of all, by the unique vessel identification. The MMSI (Maritime Mobile Service Identity) it's a unique vessel identifier. So this is gonna give us an observable source where each thing that comes outta the observable is another observable, a grouped, one grouped by the vessel id.
And then from that stream, we're splitting it out into the, into three different message types. So vessel navigation ones are things like, how fast are we going? Where are we going? Which way are we pointing on those sorts of things. There's also a separate one for the vessel name and also, The vessel type, is it an oil tanker?
Is it a ferry, is it a jet ski? Or whatever it might be. We get all three of those and they, those might all turn up in completely separate radio messages and then, We're gonna use Rxs combined latest to say, look, I've got three streams here and I'd like you to combine the latest of value from each of those into a single two.
And that then becomes the output of our stream. So if I run this. That doesn't start yet. So this is an observable, but it doesn't do anything until someone actually observes it. So we then have to connect that up to all the code we had earlier. So this is the point where we're actually going to set up the subscription.
And the last thing I have to do is tell the receiver to start receiving because it isn't actually talking to Norway until I click this button here. So at this point as long as the demo works we should be live. So if I scroll back up to the map, Hey, it worked. Excellent. So ships are starting to appear on the map around the coast of Norway and also probably around Svalbard.
Yeah, there we go. So all the Norwegian owned waters these ships are starting to appear and as time goes on, we'll see more and more because as I said, they don't all report their name all the time. So we're seeing all the ones where we've had both the. And the location and also the type. So these are color coded and away.
I forget exactly what the colors mean, but you can tell what sort of vessel they are and over time they will move around if we sit and stare at them for long enough and more and more will appear so. This is a, an application of Rxs ability to do live processing. And I'm just gonna talk about it a little more because there's a really important point here when it comes to understanding the value that Reaqtor has to bring on top of Rx.
Just a recap AIS feed. So the raw message is coming out of the ships. You don't get all the information at once. Some of the messages tell you location. Some of them tell you the vessel name. Some of them tell you other things like the vessel type. Some of them tell you the dimensions and so on.
But we want. All of that combined for any given ship before we can show it on the map. So what we're gonna do is group them as you saw by vessel, and then use that to combine them. So initially, the first message comes in, and this is the important bit. The group by operator will see that message. It'll use our key look up and say, okay, the MMSI is 1234 in this case.
Have we seen that key before? And the answer in this case is no. So it goes, okay, we need to create a new group. It creates a new observable group for that. Next message comes in, that's 7977 different ones. He goes, oh, okay. Need another group. So we haven't seen that one before either. The next message that comes in, 1234, again, that goes out on the same group as the first one and the next one also, same id, although different message type.
And at this point we actually have enough information to report both the name and the location in the real code. It was actually combining three things, but there isn't enough space on the slide to show that. So we're ignoring the vessel type in this particular. If you look at the top, the very next message that's gonna come in is the name for the second ship.
So at that point, that's also got enough to report the combined name and location. And then as subsequent locations come in, it just combines it with the last message of the other type that it's seen as it goes on, and then we're able to process the results. That's all going on inside the head of that Rx query if you like.
So what does Reaqtor to add to this? Because that demo was entirely based around ordinary Rx using the System.Reactive NuGet package. I wasn't using Reaqtor in that demo. So why on earth was I showing it to you? One of the big things Reaqtor adds is handling state. So let's talk about state. Some operators build up state as they go. Actually, most interesting operators build up state as they go. So the group buy that, I used to work out which vessels have we already seen. It's gotta remember which vessels it's already seen. It needs to know which ones it's seen before, so it can send them out on the group it already created, and to know when to create new groups.
But simpler operators also often have. So take, this is one of the operators that's a standard link. One says, I'd like to take the first N items from a sequence and then just discard the rest. So if I see observable take eight, then we get 1, 2, 3, 4, 5, 6, 7, 8 items come in, and then it just stops at that point.
So it'll reach the end. So far so good, but, one of the things that Reaqtor gives us is reliability. So if my server fails, it's able to recover and continue from where it left off. Or even if there's no failure, even if it's just a matter of servicing. Even if all that's happening is that my operating system needs an update, or maybe the host software needs to be updated for one reason or another.
Machines do get rebooted from time to time. And if I set an appointment three years ago, I want the reminder to go off and I don't want to be relying on a machine having stayed up for three years just to make that happen. I need the entities in my system to outlive any single process, and that's a problem with classic Rx.
If you have an observable with classic Rx and you do observable take eight, and it starts receiving items, but then you have a reboot and you move into a different process, it goes back to the start. You're gonna get eight items, not from the start of the sequence. You're gonna get I eight items from whenever you constructed that particular observable.
That's not very, And this is one of the classic problems with standard Rx, which Reaqtor aims to solve. It's building in the ability to manage state. All interesting Rx queries tend to accumulate state. If you're doing anything of any value at all with Rx, then in all likelihood you're gonna build up some states that lives inside of the operators in the query that you have defined.
And the problem with classic Rx is that this state is ephemeral. It's inside the operators and there's no way of getting it out. So when. The process that's hosting these things exits, all that state is lost like tears in rain. So there's no way of actually freezing the state of some complicated Rx query and serializing it and then reconstituting that later on.
And this is one of the things that the Reaqtive codebase gives. Strictly speaking, it's the reactive bit with a queue reactive reactive LINQ in particular gives you a complete implementation of all of the standard LINQ operators for Rx. But it does them in a way that they're able to manage states.
They are able to be interrogated. You can ask them, what state are you? Can you write your state out into a store for me? And then later on you can construct brand new ones. Give them back the state store, they initially populated and they will get back into the same state. And so the reactive. Layer of the Reaqtor code base provides that implementation of Rx, that state enabled version of Rx, and then the Reaqtor layer sits on top of that and provides a check pointing host.
It provides a host environment, which knows how to use those state management APIs and can host any number of. Concurrent subscriptions and it's able to suspend them all from time to time, collect their state and build a checkpoint, serialize that out to disk or wherever, and then continue. And this is how we're able to build reliable host environments for Rx queries.
So I'm quickly gonna show you an example of that. So I have running on my computer a local environment that's running a Reaqtor query engine. And I'm gonna create about the simplest. Query you could possibly create here through this web front end. So I'm gonna create this that is gonna hit a break point, which I'm gonna move on to monitor that you can all see.
And this is going to use the Rx external client API to define a subscription that it wants set up. So it's saying, Here saying, okay, I'm gonna use the reactive client context. What's that? That's your way in to some Reaqtor environment. It's a bit like a DB context or an OP in entity framework. It's basically your connection to a particular implementation of Reaqtor.
And we're saying to that, I'd like a timer, please. I'd like that timer to wait 10 seconds and then go off every 20 seconds after that. And then I'm using the. Select operator to say what I'd like to do with the events that come out of the timer. This timer just produces a sequence of numbers, one after another on the schedule you specified.
And so this says, I'd like to take that number and turn it into a string. So this is gonna say, give me tick space n And then says, what? What am I gonna do with that? This is an observable. Actually it's an I async, reactive, observable, but logically speaking, it's an observable, I'll talk about the queue in a minute.
And then what do we need to do with an observable? We need to observe it. Nothing. It doesn't do anything until we subscribe and observe it to it. So we get hold of an observer. I'm actually using one that's gonna put all of its output straight into the diagnostic logging. It's actually gonna write events to Event Tracing for Windows. Because I didn't want to complicate this demo by having to set up. A load of things like event grids and event hubs and so on. So this is just working entirely inside the query engine. And then I'm gonna say, okay, I'd like to actually subscribe this observer to this observable that happens here.
And as soon as I do that that actually sets the thing up and it should be up and running. So if I let this run and then go back in, Two Visual studio and look at the diagnostic events. I need to zoom in a bit. Unfortunately, you can't change the font size on this bit of Visual Studio. I am hoping that is legible, right?
After some point we should see, oh, there it is, tick zero. There's my first tick. Tick zero. After another 20 seconds, we should see another one. The thing I wanted to do though, was to deliberately kill off the process that hosts this because the whole point of Reaqtor is it gives us reliability and durability.
I shouldn't be tied to the fact that this query is running in some particular host process. Right now, all these texts take zero, take one, take two. They're all sitting inside the same process. 703. Oh, find, zoom back out again and go and find my event. My task manager. I should be able to find that.
So where are we? Long way down 7360. That is Endjin.Reaqtor.SF.Query.EngineHost. Okay. So there it is. I'm gonna take that out and see what happens. So if I go back to this web browser, if I actually go in I'm hosting this inside a service fabric for various reasons. If. Come in here and find where my query engine host is running.
I've got a very simple exactly environment here. Normally you'd have loads of query engines. I've got just one in order to make it really easy to find for the purposes of this demo because I wanna destroy the node that's running on. So it's saying, okay, you're running on node two in our compute cluster, so fine, I'm gonna blow that away.
Let's go into number two. Time to go away. If I now cut. Oh, I've gotta say yes. I already mean it. So if I restart that node and then go and look here, I should see fairly soon. There I go. It's gone. I dunno if you can read that, but it's gone. The the process that was hosting that tick, 70360, whatever it was, has gone away.
But if I come back into here there's a certain amount of activity has gone on because it's panic. It's gone a bit. Ugh, because I rebooted everything. Let's see if that it was before or after I knocked it out. We're still in 70360 at that point. Okay. So that was before I managed to kill the thing off.
But fairly soon. We should see there every 20 seconds. I think I set them up fairly soon. We should see another tick coming through. Come on.
Oh yeah, everything goes a bit slur when you attach the debugger. All right. Yes, this is probably it recovering from a checkpoint. So it would go a bit faster than the real thing. Checkpointing generally happens in less than a second in the real system. I might actually have missed the tick in all of that.
Let me have a quick search through it. Oh, take eight. Ah, different process ID 1984. So yes, it's. Managed to move the whole subscription into a completely different process and with no loss of sequence. So the sequence is still counting up exactly as it should be even though it's managed to move from one process to another. So this is it's a very simple subscription.
Obviously I'm just holding tick events from a timer, but the point is for anything that you can set up a subscription for in Reaqtor it. Migrate the thing from one service to another if something goes down. And that's what enables the durability and the reliability. So to make all of this happen, The React to code base provides various things.
So I already mentioned the Reaqtive LINQ library. This is a re-implementation of Rx. So it's basically the most important operators from Rx re-implemented with stateful awareness. They're able to write out their current states to store, they're able to recover themselves from a state that was written into a store, so you can serialize them and then rehydrate them in some different process.
It. We also get the Check pointing query engine, which is the Reaqtor namespace. That's the thing that actually maintains all the subscriptions and interrogates them and builds up the checkpoint from time to time. So you've actually got an environment that is able to take advantage of those stateful operators.
But there are some other pieces as well that enable the distribution of compute. We've got something called Bonsai and we've got something called the Nuqleon data model. I'm gonna talk about the second one first. The data model enables us. To write strongly type queries against dot net types that represent the structure of our data streams, but to do so in a way that doesn't depend on those types actually being available on the query engine.
Because this is a classic problem. Whenever you try and remote bits of code around, if you have anything where you're trying to take some code and migrate it from here to there, which is essentially what we are doing. We're saying, I'm building a query that's on my client box, but I want that to run inside the query engine.
How's that gonna work if I don't have all the right DLLs installed on the target system? And then, Spark a solution to this is often you just deploy a, a jar file that has all of the types you need. That's one way of doing it. You just put the code you want into your query engine. But Reaqtor is designed not to need that.
Because Reaqtor is designed to enable you to do ad hoc querying and to change your mind about what the queries are dynamically. And it does this through a system that rewrites the queries to be structurally typed rather than based on. Particular net type, there's a whole system that enables it to essentially match the shape of the data without really being too fussed about what the actual type is.
And this massively simplifies the deployment that's required to work with the types you want to work with. So that's one part of it. But there's another thing called Bonsai that I want to talk. Bonsai is one of the key enabling features of the Reaqtor code base, and it enables expressions to be sent.
Over the network. So you can write a C# expression and Bonsai will turn that into a string of bites that can be sent over the network and that string of bites can then be turned back into an expression later on. I'm gonna very quickly show you this. So I have this copy of Visual Studio. Why can I not see that on the web browser?
Oh, it's caught up. There we go. So what I'm gonna do is I've got a function called shows Bonsai, which does the Bonsai serialization, and then just prints it onto the screen so you can see it. And I can pass in an expression. So if I want to see what X plus X looks like, then I can do that. So if I run as far as.
And step over that and then bring the window onto the right screen. That is a Bonsai representation of the expression. You can, if you squint, you can see it. So we've got it's basically js, om and this bit here says, here's the expression. It's a Lambda. Look at that. That's a Lambda arrow a lambda with two arguments.
The body of the lambda consists of a plus, and the left hand side of the plus the dollar signifies an argument. So it's the argument that positions zero. Here's the list of arguments at the ends, that's X. And then the right hand side of the plus is also an argument that positions there are also X. So that is a lambda evaluating to X plus X.
So it's converted this into a into a string. And if I copy that and stop this and paste that in here, actually it probably is already what it was. And I run a little further and I run this convert Bonsai to expression. So I've completely restarted the process now, so that this is, I'm feeling a string in, it's just that text that it printed out last time.
I'm feeding that in and it gives me back an expression object, which is a .NET expression saying X goes to X plus X. So essentially I've got the ability to serialize .NET expression. Trees to send those over the network. So that's a thing that isn't built in to .NET. That's a thing you couldn't do with .NET outta the box.
But it's a thing that the Bonsai part of the code base in the Reaqtor repo gives you. Why is it interesting? It means if you want to write queries to have moderately complicated code in them, though, that can be turned into an expression tree that can be serialized and then sent. Over to the query engine.
It gives us a way of getting our queries from our client's applications that run outside of the Reaqtor query engine into the query engine. So this whole expression here that says, gimme the observable, representing traffic information. I'd like to filter it down to just traffic flow data. This is slightly, but only slightly simplified version of the thing that does the, you have to leave now to make your appointment type notifications. So it's saying, I wanna know any time the traffic flow rate changes and when it does, I'm gonna calculate a new time it's gonna take me to get there and I'm gonna set off a timer that's gonna go when it's when it's time to leave. So all of this whole thing can be turned into a slightly bigger expression tree and sent into the query engine. So one of the things it enables them is distributed execution, but it also. Enables other tricks. It enables rewriting of the expression.
So because our queries are ex expressed in a symbolic format now it's possible for things to actually get rewritten. So most Reaqtor deployments in practice have a thing called a query coordinator. So when you, when I was talking to the query, the, sorry, the Reaqtor system, through that reactive client context, I wasn't talking directly to a query engine.
I was talking to a query coordinator and its. Is to inspect incoming subscriptions, work out what to do with them, and and work out where they're going to run. And as part of that, it might tear them into pieces. It might say okay, you are looking at traffic reports. Actually we've got a compute node over there.
That's already doing all the kind of filtering based on roots and so on. So we're gonna run that part of your query over there, and then we're gonna connect that to this other node that's gonna handle the rest of the query for you. Or it might do something like say, oh, okay, that query you've written that can actually be pushed out all the way to the edge device. So if I've got, for example one of these air sensor air quality monitors it might work out that the Air Quality Monitor device is perfectly capable of running the filtering that I want it to run. And so it can push that all the way down to the bottom, which might be really useful if you're running in a bandwidth constrained environment, which is often the case if you have sensors dotted all over the place.
So this ability to completely restructure the query is made easier by the existence of the fact that these things get turned into a serialized expression. So it means we don't have to have the actual code of the query deployed into the query engine in advance. We can just send it an expression that describes what we want.
It can optimize that. It can rewrite it or it can reject it if it doesn't like it. We don't need to put any code in advance into the query engine. We just send it the expression, and if it's happier. It runs it. Now. One other thing I wanna talk about I thought I deleted these slides, but I have to talk about it now because it's on here.
So here are some other libraries that do, or systems that do similar sorts of things. Broadly speaking to this I did have a section on how they're different, but I'm running outta time. So the main differences were half of these aren't dot net, so if you want to live in a.NET, that instantly rules out a load of these.
But are the ones that are left a couple of them. Broadly speaking, embrace the actor model, which is a perfectly reasonable thing to do, but it's very different from the Rx model. One of the benefits of the Rx model is that it's a much more constrained, formalized thing, which enables things like optimization and rewriting and distribution in a way that's much harder to do if your logic is expressed in just plain old compiled C#.
But also, One of the big benefits of the way that Reaqtor library has been structured is it's been designed to enable reuse of a lot of different scales. And the way this works is that there are, Essentially three layers to the libraries. So you've got low level, mid-level, and high level. The low level is called nucleon with a queue.
And this is where that bonsai feature I just described lives, and also it's where a whole load of expression tree rewriting algorithms live, and also a host of other utilities that I don't have time to go into. So all the stuff that was developed, that wasn't built into the .NET runtime, which the Reaqtor team found, Full has ended up in this Nucleon library, and you don't have to be using any of the rest of Reaqtor to use that stuff.
If Bonsai looks useful to you, you can use it in isolation because it's in this low level library. And then layer it on top of this we have the Reaqtive libraries. This is where you get the stateful implementation of Rx. So it's like regular Rx, but with state management. And then the layer on top of that is Reaqtor, which is where we have the endjin that's able to host long running subscriptions and it knows how to do check pointing and how to recover from a failure and all those kinds of things.
Now I've realized I've got to this point and have not explained why these Q's keep turning up. So before we stop, I'm just gonna explain why everything has a Q in its name. So in .Net runtime as it is now. We have the IEnumerable interface, which you can run queries over with LINQ, but there's also IQueryable.
And the difference between IEnumerable and IQueryable is that with IEnumerable, it's all just executable code. If you provide a, if you provide a filter to a wear clause, it's just a delegate. It's just a pointer to a function. But with I query. It's all written to accept expression tree. So rather than an actual delegate, what you pass in is an expression tree that describes the filter that you want.
And things like entity, framework can actually take that expression and rewrite it as SQL and then pass that into a SQL server or whatever to execute the query. So it's doing. Remote execution of code and distribution in a very similar way conceptually to what Rx is doing. It's what Reaqtor doing. But with Reaqtor, we actually turn it back into.NET code at the other end as well.
So we're sort of.NET at both ends, but it gets transformed on the way. But the queue in IQueryable. Has been kinda latched onto by all the interfaces in Rx that use Expression Trees rat.NET than just plain old functions. So we have, I observable is the plain old function based thing, but there's also IQbservable with a queue is the version that works on Expression Trees.
S.NEThe queue crops up to tell you that you are dealing with a thing that's able to do promoting of Expression Trees. So.NETat's what the queue signifies The queue is all about. The fact that this is a, this works on top of the, the.NET expression tree model to enable the distribution and rewriting and general optimization of queries across the servre farm.
Anyway, back to the point about libraries, the fact that it's all carved up in the way that it is, means that Reaqtor can actually run on a wide range of form factors. It was designed on the assumption that you had server farms with about, 40 machines in it hosting hundreds of thousands or millions of queries.
But it doesn't have to, it's perfectly happy to run on edge devices. It can easily run on a small raspberry pie, for example. It can even be induced to run on much smaller.NET embedded devices like a meadow. And. You can run smaller bits of the framework and that enables you to do this pushing out of queries down to the edge.
You don't have to run a full Reaqtor engine on your edge devices. You can just run little bits of the library and it can all work there. So that is one of the big features of this library as you can use as much or little of it as you like. With half a minute to go. Reaqtor gives us event processing with Rxs programming model, but the NCL and reactive layers of the library may give us stateful and distributed processing of Rx queries.
And then the Reaqtor layer on top of that gives us a hosting environment that provides reliability and scalability. Those are the websites for the repos for this and the information about it. I've left you no time at all for questions, but in my defense, we started five minutes late. Do we have any questions?
Eric Schneider:
And thank you for listening. Yes, we do. And actually it's been answered already and I see my video's frozen, so that's the best part. Oh, so I'm just gonna stop my camera. Welcome to Technical Difficulties, . And I'll put it here. Actually actually it's more of a comment because Rodney Littles, who we all know and love, he's a great guy.
He had a question where he says, So in this level, so is this on some level Rx with wire consistency built in? With wire consistency. I'm not quite sure what wire consistency means in that case. You start with the capital W so I'm wondering if that's a reference. Yeah, capital W It could be. I dunno about, but then he had a comment and I'll replace it with that.
Ian Griffiths
So Reaqtor knows how much states is being processed. This is amazing. So I think he answered his own question as you were going through the demo, which by the way was amazing. I loved it. . Oh, thank you. It was great. But we don't have a lot of questions. So I think people are afraid to ask questions. Which tends to happen.
Eric Schneider:
I do have something here from Johan when he said that Bonsai looks really useful. Thank you for sharing.
Ian Griffiths
Yeah, I think Bonsai a really critical enabler in this. And just even if that on its own had been released, that would've been really helpful. But yeah it's a there's more to say about it because actually it was designed to work across different languages as well.
Internally, Microsoft, I believe had implementations that use JavaScript as well. But yeah, the version that's been exposed is all net stuff. But in Principle, It can cross, language boundaries as well.
Eric Schneider: Gotcha. And to add here, this is why I love the internet. This is why I love a live show. Rodney just said. Yes. Amazing. I understood . So obviously if I wanna learn a little bit more about reactive, I wanna, I should go to https://reactive.net and go through do you have tutorials? What does that look like? Can you drive us here quick? Yeah, sure. So it has Which window has that opened in?
Ian Griffiths
Has it opened anything? That would be an, oh, there we go. So let me drag that onto the screen, oh, and it refused to go on the screen and wanted it to, there we are. So we have API reference documentation in there and also conceptual documentation. There is also An ebook is available, if you don't mind giving us your email address.
This is produced by endjin and it basically tells you all about how Reaqtor came up. There are various blog posts specifically about Reaqtor. Also if you go on the talks page there are various talks by Bart De Smet who is one of the principal people behind it. And also a few by me explaining and some by some of my colleagues at endjin explaining a whole load about all of this.
So if you want to get deep into it then reading those talks will be an excellent place to start. Yep. And I would add also that endjin is a sponsor of the.NET foundation. Yes. So we really appreciate that in all the commitment and. I know I'm talking with Howard about some other stuff that you guys want to contribute more with, which is great.
Eric Schneider:
So we may hopefully have a more announcements to that here in the coming weeks. And that is it. I'm wanna, I've been playing around, see if I can change my camera. A bit and it's not working the way I expect it. Eric, anything to add? Nothing nothing to add except our sincere Thanks Ian for that great presentation, including live action demos.
I was extremely impressed with the The real time tick events and I'm sure everybody was
Ian Griffiths:
I was extremely relieved that they worked!
Eric Schneider:
They worked and they worked well. Thank you. It was fascinating and excellent. And thanks for bearing with us. We started a little late but you got us back on track and I thank you very much.
Ian Griffiths:
Okay, thank you very much for listening to me. Great.