Archive for December, 2008

The State of the CEP

December 30th, 2008

Another year has passed in the world of Complex Event Processing and everything looks great, or does it?

The CEP Market

I have followed the CEP business for a while now. In 2001, the early days of CEP, when we started with ruleCore, there was basically nothing called CEP. Then suddenly I started to hear talk about Event Stream Processing, and after a while it changed to Complex Event Processing (CEP). About the same time the marketing departments (basically the founders plus friends) of a handful of startups went nuts. Bold claims were published almost daily, and "we are faster than you" was the typical marketing message for several years.

This year has been rather quiet from a marketing perspective, and the most balanced year so far. Very few bold claims of being first, being fastest or having the coolest features. Mostly rather boring press releases about new versions and some talk about new partnerships and customer stories.

Maybe the CEP companies started to use professional PR firms? Probably a good idea. But a bit boring, I miss the performance claim war and the personal touch of marketing from startups in their early days.

Also the marketing machines of IBM, Tibco, Progress and Oracle now have an event processing solution to deal with. It seems that event processing is hiding somewhere inside these giants below layers of SOA, BPM and other more mainstream technologies. Only to be seen occasionally. But I’m expecting to see more from these giants during 2009, I think they are just warming up and trying to understand what to do with this new thing called event processing. Maybe we will even see some announcements from Microsoft? (who will they buy to get up to speed?)

It seems that finance is the dominant market these vendors go after. Others, like Coral8, have started to broaden their message and talk more about applying event processing to business intelligence problems. Especially Coral8 seems to be heading in the right direction. It will be nice to see how their new strategy works, if they don’t get acquired by someone first.

There is also marketing activity from other vendors attempting to get the attention of telco, CRM, fraud detection and logistics users. It seems that everyone is still trying to figure out where to go and exactly what the best marketing message looks like.

I think 2009 will be a defining year for CEP and most vendors will discover where event processing solutions are most appreciated by the users. During 2009 I would expect most vendors to follow the money and go for sectors where they can quickly increase their revenue. In 2009 it’s time to start making money in the CEP business.

The SQL Based Approach

During 2008 we have seen a number of products getting more advanced and powerful, all based on the same idea: streams should be queried using the same type of queries that are used to query data at rest.

Most of these products have some kind of textual language which looks similar to SQL. A couple have graphical variations on the theme in the connect-the-boxes-with-arrows style. These vendors have added a lot of features to their own SQL-derived language in order to make it work with streams of data. Most languages are powerful and can be used to create complex systems with some ease. These SQL dialects can be found in products from Coral8, Esper, BEA WebLogic Event Server, StreamBase and Aleri.
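To make the "query a stream like a table" idea a bit more concrete, here is a minimal Python sketch of what a continuous query might compute: the average of the last three prices per symbol, re-evaluated for every arriving trade. The `Trade` tuple, the `continuous_avg` function and the sample data are made up for illustration and do not show any vendor's actual language.

```python
from collections import defaultdict, deque
from typing import Iterable, Iterator, NamedTuple, Tuple


class Trade(NamedTuple):
    symbol: str
    price: float


def continuous_avg(stream: Iterable[Trade], rows: int = 3) -> Iterator[Tuple[str, float]]:
    """Emit (symbol, average of the last `rows` prices) for every arriving trade."""
    # One sliding window per symbol; deque(maxlen=...) drops the oldest price itself.
    windows = defaultdict(lambda: deque(maxlen=rows))
    for trade in stream:
        window = windows[trade.symbol]
        window.append(trade.price)
        yield trade.symbol, sum(window) / len(window)


# Hypothetical sample feed.
trades = [Trade("ABC", 10.0), Trade("ABC", 20.0), Trade("XYZ", 5.0),
          Trade("ABC", 30.0), Trade("ABC", 40.0)]
results = list(continuous_avg(trades))
```

The point is that the query is registered once and the result is updated as data flows past, instead of the data being stored first and queried later.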

Personally I see that these SQL based products are great tools for creating data driven algorithms and performing computations on all kinds of data feeds. It’s no surprise that these vendors are targeting the financial markets. It seems that all vendors in this camp have had some success in attracting some customers from the financial sector.

It will be interesting to see what happens if (when?) the major RDBMS vendors announce the ability to register SQL queries in their databases and run them continuously. When this happens, the CEP vendors with SQL based products might have a hard time in making a point for a separate product for processing data streams. Their products will just look too similar to the newly added streaming SQL (or whatever the RDBMS vendors will call that new feature).

My guess is that we will see announcements from at least one major database vendor in 2009 with features for processing streaming data and continuously executing queries. It might not be a bold guess that Oracle will be the first one to implement something like this, followed by the regular open source copycats. When this happens I would not be surprised if some of these CEP vendors re-invent themselves as solution providers for the financial industry. After all, they have a great tool for solving problems in that domain.

Rules

A number of vendors have chosen a rule based approach to complex event processing. The ruleCore CEP Server, IBM Business Events and TIBCO BusinessEvents, Progress Apama and Agent Logic all use some form of rules.

In this category we have as many approaches to event processing as there are vendors. Depending on the background of the vendor you can see different aspects highlighted in the various products.

RuleCore provides a foundation for adding reactivity to event-driven SOA, with location aware event processing as its speciality. True to its heritage, Progress Apama talks mostly about algo trading and smart order routing, but is finding its way into other areas like telco and RFID. Agent Logic seems to focus on the GUI (can we see CIA/NSA operatives busy creating new rules each day?) in order to give the spooks a great tool to listen in on various telco trunks (I made this up, but if I were NSA/CIA that’s what I would use it for).

TIBCO and IBM feel a lot like the traditional message broker tools where you control the flow of events and take decisions depending on state. If you have been working with traditional message brokering tools you might wonder what the excitement is all about, as this sounds like old tools with a new name: now they are called events instead of messages.

Aleri has a combination of a SQL based language and a special purpose stream processing language called SPLASH, which I will continue to watch with interest.

At ruleCore we have spent the whole year improving the product by adding support for location (geospatial) aware rules. Basically native support for GPS events. So now we can easily create rules which track vehicles and detect deviations from expected behavior or security violations.

Geofencing is now really easy to do using Google Maps and the new support for geospatial rules. We also improved the context aware rule evaluation features and now have a really nice declarative rule evaluation model for tracking business entities.
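As a rough illustration of what a geofencing rule has to decide, here is a small Python sketch that reports a vehicle each time it moves from outside to inside a circular zone. This is not ruleCore's rule language; the `Zone` class, the function names and the coordinates are assumptions made up for the example, and a real zone would more likely be a polygon drawn on a map.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    zone_id: str
    lat: float
    lon: float
    radius_m: float


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula, spherical Earth)."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def zone_entries(zone, gps_events):
    """Yield the vehicle id each time a vehicle goes from outside to inside the zone."""
    inside = {}  # vehicle id -> was the last known position inside the zone?
    for vehicle_id, lat, lon in gps_events:
        now_inside = distance_m(zone.lat, zone.lon, lat, lon) <= zone.radius_m
        if now_inside and not inside.get(vehicle_id, False):
            yield vehicle_id
        inside[vehicle_id] = now_inside


# Hypothetical zone and GPS feed: far away, inside, still inside, far away, inside.
zone_a = Zone("A", 59.3293, 18.0686, 500.0)
feed = [("TRUCK-1", 59.4000, 18.0686), ("TRUCK-1", 59.3300, 18.0690),
        ("TRUCK-1", 59.3295, 18.0688), ("TRUCK-1", 59.4000, 18.0686),
        ("TRUCK-1", 59.3293, 18.0686)]
entered = list(zone_entries(zone_a, feed))
```

Note that the rule needs state per vehicle: a position inside the zone only counts as an entry if the previous position was outside.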

Others

Apart from the vendors I already mentioned there are also a small number of other vendors that I have not heard much about this year. Not sure if they are silent or just not on my radar screen due to lack of time. I wish I had more time to follow all the interesting companies doing event processing out there. There are Event Zero, Pion, Senactive, WestGlobal, RTM and maybe others which I can’t remember right now. A quick look at their web pages indicates that there is at least some activity going on and lots of ambition to bring event processing to the masses.

Customers?

So, how many customers are there out there using any of these CEP products?

I hate to sound pessimistic. But from a distance, I don’t see that many customers actually using CEP. I’m sure many of the vendors have a number of customers which might keep them busy for a while. But I think we could still fit all CEP users into a single room. Not sure if the small number of customers is a problem. There are not many CEP vendors to begin with, so even a small group of customers can keep us running.

If you explain the CEP concept to a group of developers I’m sure many of them go "hey, that’s something we’ve been doing for a long time". So I think there are many developers who are "doing CEP" but not actually using a tool specially made for event processing. The normal Java, EJB and .Net tools seem to work for now. All of them will eventually start to look at "real" CEP tools.

Community

In 2008 we saw a lot of activity in the Event Processing Technical Society. Its mission is

"To promote understanding and advancement of Event Processing technologies, to assist in the development of Standards to ensure long-term growth, and to provide a cooperative and inclusive environment for communication and learning".

It was founded by 29 companies which more or less define the whole event processing world as we see it today. During 2009 there will be lots of interesting activity in a number of workgroups. Interoperability is the one I’m going to watch closely and hopefully have time to take part in.

The Future

A couple of years ago, about 1998, a company called iSpheres developed a CEP product. Things were going fine and then management came and shared their infinite wisdom and called for a vertical marketing approach. Energy trading was the silver bullet! Energy in the form of electricity can’t be stored, so it has to be traded in real-time. So all forces were focused on solutions for trading electricity. Then came Enron…

Back to 2008 and the era of financial meltdown. Look at the current crop of CEP companies focusing on algorithmic trading and other solutions for the financial industry. Do I even need to point out the obvious here?

Some of the CEP companies will probably have a hard time during 2009. I suspect that at least a couple will go belly up, others will be acquired to avoid that fate and many others will have to fight for venture capital.

If I put on the customer/user hat for a while, I still can’t see a convincing message from many of the major CEP vendors. Most users that I talk to just don’t get it either. It is not clear to them why they should invest time and money in these products.

An informal survey that I have made shows clearly what I have suspected: there is simply no demand on a larger scale for event processing! Most users just conclude that the current generation of tools, like their favorite SQL database, application servers and programming environments, can do the job. I’m not saying that there is anything wrong with the current CEP tools. It’s just that very few feel that they have a need for this kind of tool.

The major challenge for the event processing vendors in 2009 will be to convince the users that they need a completely new technology to solve streaming problems. And yes, most users would benefit from stopping development of their own streaming solutions and buying one instead. Surely most vendors will win customers during 2009, but at a high cost and only after many sales calls and visits.

But I’m still optimistic. Event processing is like a heavy train with an undersized engine. It will take a while to accelerate, but once it has gained speed, it’s hard to stop. (Here I thought of inserting a joke about our state controlled railroads, but it didn’t seem fair to kick someone who is already lying down.)

The products of many CEP vendors have shown great improvements in 2008 and they start to look really good. So there’s nothing wrong with the technology.

Let’s see if the customers will find the world of CEP during 2009…

(On an unrelated note I would like to add a bold prediction that during 2009 we will see something like the 2000 dot-com bubble burst. There are just too many companies with very hard to understand business ideas. They all seem to have in common that everything is given away for free and revenue should be made by some other clever and complex scheme like advertising, services, or whatever. Sites like Google and Facebook will be the first ones to be hit by sudden problems and will drag most of the social-something and web2.0-something companies with them, creating the second IT crisis. Everyone with a business idea (that is, how to make money) that I can’t understand in 10 seconds will go down.)

Complex Event Processing

Events, Observations and Vehicle Location

December 25th, 2008

 I got some good comments on my last post on event identity from my readers, some of which can be found in the comment section of my previous post.

I must have been a bit tired when writing that last post as I complicated things when I started to think about event IDs. But thanks to my helpful readers, I’m now back on track again after a couple of nights of sleep and lots of xmas food high on carbs and fat…

A good way to think about event id seems to be this:

The event is a notification about an observation,

the observation observes a real-world event,

which can be observed by multiple observers,

thus multiple event objects can contain information about the same real world event…

 

With this in mind the event identity is nothing more than the identity of the observation.

The problem I started to think about in the last post is still a real problem but it happens so to speak at the next level.

What if we want our event processing rules to see only one copy of each real-world event? This would simplify the creation of rules in a good way. This kind of event fusion capability is something that would be a great feature in the event processing platform you are using…

 

Let’s look at an example…

 

To continue with our vehicle tracking example (we are doing lots of location aware stuff with ruleCore right now, so that’s why I’m stuck using vehicles in my examples all the time; the problem is the same independent of what you are tracking):

A common thing one does with trucks is to wait for them. So a typical reaction rule in ruleCore would be to detect the arrival of a truck, so you can do something else while waiting for it.

Optimally, the rule should be simple to write: 

"Trigger rule when vehicle enters zone A".

In a complex environment things are not that easy. We could detect the fact that a vehicle has entered zone A using a number of methods:

  1. Somebody scans the vehicle or its cargo using an RFID scanner which is known to be located inside zone A.
  2. Security notes that a vehicle has passed the gates.
  3. The driver calls in and says he’s arriving in a few minutes.
  4. We note that the driver’s cell phone is inside zone A, using the cell network’s positioning mechanism.
  5. And the obvious one – The GPS of the truck sends a location update from inside zone A.

Just in this simple example we have five types of events which could be used to determine that a vehicle is inside zone A. To make life easier for the rule designer, we would like to provide these five events as a unified ZoneEntry(ZoneId, VehicleID) event instead.

There are probably a number of solutions for this. Currently in ruleCore we use rule hierarchies to solve this. We have domain specific rules at a lower level feeding rules at a higher level with these high level events.
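The rule hierarchy idea can be sketched in a few lines of Python: one low-level, domain specific mapping per source produces the unified ZoneEntry, so the high-level rule only ever sees one event type. The event field names, the lookup tables and the `to_zone_entry` function are hypothetical and only serve the illustration; this is not how ruleCore rules are actually written.

```python
from typing import NamedTuple, Optional


class ZoneEntry(NamedTuple):
    zone_id: str
    vehicle_id: str


# Hypothetical lookup tables: where fixed equipment sits, and whose phone is whose.
SCANNER_ZONE = {"rfid-7": "A"}
GATE_ZONE = {"north-gate": "A"}
PHONE_VEHICLE = {"phone-42": "TRUCK-1"}


def to_zone_entry(event: dict) -> Optional[ZoneEntry]:
    """Low-level rule: map one domain specific event to a ZoneEntry, or None."""
    kind = event["type"]
    if kind == "RfidScan":
        zone = SCANNER_ZONE[event["scanner"]]
    elif kind == "GatePass":
        zone = GATE_ZONE[event["gate"]]
    elif kind in ("DriverCall", "CellPosition", "GpsUpdate"):
        zone = event["zone"]  # assumed to be resolved by the reporting system
    else:
        return None  # unknown event types are not zone entries
    vehicle = event.get("vehicle") or PHONE_VEHICLE[event["phone"]]
    return ZoneEntry(zone, vehicle)


# The five observation types from the list above, all about the same truck.
observations = [
    {"type": "RfidScan", "scanner": "rfid-7", "vehicle": "TRUCK-1"},
    {"type": "GatePass", "gate": "north-gate", "vehicle": "TRUCK-1"},
    {"type": "DriverCall", "zone": "A", "vehicle": "TRUCK-1"},
    {"type": "CellPosition", "zone": "A", "phone": "phone-42"},
    {"type": "GpsUpdate", "zone": "A", "vehicle": "TRUCK-1"},
]
entries = [to_zone_entry(e) for e in observations]
```

All five observations end up as the same ZoneEntry value, which is exactly why the next step, collapsing the duplicates, becomes interesting.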

But I really would like to find a generic solution for this one.

I think the military rooted sensor information fusion world has lots of good ideas on this one. Stay tuned…

Complex Event Processing

Event Models – Identity

December 22nd, 2008

A couple of days ago I wrote a bit about event models and tried to sort out what the fundamental properties of an event are.

One of those properties seems to be the identity of an event. This property is something I see as the major defining characteristic of event processing, the thing that sets event processing apart from mere data processing.

Normally data, like tuples in a database, are all anonymous. If you insert the value 123 into a database you can’t distinguish it from another 123 in the same database. The data all looks the same. But events report on real-world events and should be unique.

Each event (instance) should have its own universally unique id!

Technically this is very easy. Just let someone, commonly the event generator, assign a UUID to the event.

But, what’s in an event id?

First of all, if we add the requirement (which I think we should) that events are create-once-never-change type of things, it is enough in many cases to use the identity of an event in place of the event itself. Events can be replicated and distributed rather easily and found by their id as the search key. Basically we can store info such as “Event with id 1939 occurred because of event with id 1914” without storing the events themselves.
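A minimal Python sketch of these two ideas, assuming the event generator assigns a random UUID and that events are immutable once created. The `Event` class, its fields and the `caused_by` table are made up for the illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)  # frozen: create once, never change
class Event:
    event_type: str
    payload: tuple
    # The generator assigns a universally unique id when the event is created.
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))


# Causality is stored as ids only: effect id -> ids of the events that caused it.
caused_by = {}

camera = Event("CameraCapture", ("AND-239",))
radar = Event("RadarData", (145,))
violation = Event("SpeedingViolation", ("AND-239", 145))
caused_by[violation.event_id] = [camera.event_id, radar.event_id]
```

Because the events never change, the ids in `caused_by` stay valid no matter where the event bodies themselves are stored or replicated.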

I tend to think of events as “Notification about state change in some (business) entity”. With this in mind, the event id would be the id of a particular state change.

And here’s where things get tricky…

Let’s say we have the event:

  • Type: Collision
  • Payload: Vehicles that collided

To complicate things, or rather make them more realistic, let’s say this event can be reported by a number of event generator sources. This could happen, for example, because the collision is detected by a bystander, a surveillance camera and the vehicles themselves at the same time.

So we have at least four different entities which detect the same collision: a bystander, a camera and two or more vehicles.

Now, should all the entities generate an event with the same identity? After all, they are all creating a notification about the same thing?

If multiple event generators observe and report about the same event occurrence, they would obviously need to be coordinated if we require that they generate events with the same id. This is not practical or maybe not even possible. So we need some other solution for this. (hint: multi event fusion)

What if the different events describe a slightly different aspect of the collision? It’s still the same collision.

When creating event processing logic, like rules in ruleCore, you want to be shielded from this problem. The rules should see only one collision! Otherwise you will easily end up with rules triggering multiple times, and suddenly you have reports about multiple collisions when in reality there has been only one.
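One naive way to shield the rules is to fuse reports before the rules see them. The Python sketch below assumes that two reports describe the same collision when they name the same vehicles and arrive within a short time window; the `fuse` function, the field names and the ten second window are all assumptions made for the illustration, and real sensor fusion is considerably more involved.

```python
def fuse(reports, window_s=10.0):
    """Collapse reports about the same real-world event into one fused event.

    Reports are assumed to match when they share type and vehicles and arrive
    within `window_s` seconds of the first report of that occurrence.
    """
    fused = []
    for report in sorted(reports, key=lambda r: r["time"]):
        key = (report["type"], frozenset(report["vehicles"]))
        for event in fused:
            if event["key"] == key and report["time"] - event["first_seen"] <= window_s:
                event["observation_ids"].append(report["id"])  # same occurrence
                break
        else:
            # No matching occurrence yet: this report starts a new fused event.
            fused.append({"key": key, "first_seen": report["time"],
                          "observation_ids": [report["id"]]})
    return fused


# Hypothetical reports: four observers of one collision, plus a later collision.
reports = [
    {"id": "bystander-1", "type": "Collision", "vehicles": ["AND-239", "XYZ-001"], "time": 0.0},
    {"id": "camera-1", "type": "Collision", "vehicles": ["XYZ-001", "AND-239"], "time": 0.4},
    {"id": "car-1", "type": "Collision", "vehicles": ["AND-239", "XYZ-001"], "time": 1.2},
    {"id": "car-2", "type": "Collision", "vehicles": ["AND-239", "XYZ-001"], "time": 2.0},
    {"id": "camera-2", "type": "Collision", "vehicles": ["AND-239", "XYZ-001"], "time": 3600.0},
]
collisions = fuse(reports)
```

Each observation keeps its own id, while the rules only see the fused collisions. The ids of the contributing observations are kept so nothing is lost.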

Something as simple as an event id could have rather complex semantics and you should be aware of this when designing your event model.

Or is it just me complicating things?

 

 

 


Complex Event Processing

Let’s Dance – Diskotanssi

December 19th, 2008

 

While doing some googling on Erlang I found an interesting use of it: http://discoproject.org/, a project from Nokia.

It combines two completely different languages: Erlang and Python.

It also does this in a manner that is rather clever: Use Erlang for the overall picture and Python to do the actual work.

It seems like a nice way to do heavy crunching using MapReduce in a cloud like Amazon’s elastic computing cloud. Although not directly suitable for event processing, it seems like a nice way to deal with huge amounts of data. Maybe for offline event data analysis?

 


Complex Event Processing

Erlang – Take 1

December 18th, 2008

Now I have had some time to look at Erlang, from the event processing perspective.

The whole language feels a bit weird. That is, it’s not like any of the "normal" languages like Java, C#, C, Python. The first reaction is: how can you ever accomplish anything in this? But after a couple of hours of reading and setting your programming mind into a different mode, the zen of Erlang starts to appear.

First of all – The language seems to be very small. Basically, being a functional language, everything is done using functions. I have not yet found any loops, if-then-else or other more traditional language constructs.

What surprised me is that I actually know some Erlang already! We do lots of coding in Python and use a number of Python’s functional programming features. These seem more or less inspired by Erlang! Even the syntax looks the same in a few places.

If you look past the rather outdated Erlang syntax and just accept a number of "odd" features like the fact that variables can be set only once, there are a number of interesting features built right into the language.

From the event processing view one of the most interesting features of Erlang is that it favors designing your applications around a large number of processes. Basically every function you call can be a process, which allows for a different way of thinking about concurrency. It seems that starting 100k or so processes is not a big deal; compare this with the number of threads you can start in a conventional system.

Erlang marketing says that 99.9999999% uptime can be achieved. Which you really can’t say about a normal piece of software. Maybe these telco guys know something we other business software programmers don’t.

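Everything about Erlang feels a bit 1980, but there are some fundamental concepts which seem to be usable in the new world of cheap servers, where everyone is looking for better ways to get that free lunch for scalability…

A simple example of what Erlang code looks like:

%% Module name taken from the spawn(ipctest, consumer, []) call.
-module(ipctest).
-export([pingpong/0, consumer/0]).

%% Send N pings to a consumer process; return round trips per second.
pingpong() ->
    N = 100000,
    Pid = spawn(ipctest, consumer, []),
    Start = os:timestamp(),   % replaces the now deprecated erlang:now()
    Message = {ping, self()},
    dotimes(N, fun () ->
               Pid ! Message,
               receive pong -> ok end
           end),
    Stop = os:timestamp(),
    N / time_diff(Start, Stop).

%% Answer every ping with a pong; stop after acknowledging done.
consumer() ->
    receive
    message -> consumer();
    {done, Pid} -> Pid ! ok;
    {ping, Pid} ->
        Pid ! pong,
        consumer()
    end.

%% dotimes/2 and time_diff/2 were not part of the original listing;
%% these definitions are assumptions that make the module compile.
dotimes(0, _Fun) -> ok;
dotimes(N, Fun) -> Fun(), dotimes(N - 1, Fun).

%% Elapsed seconds between two {MegaSecs, Secs, MicroSecs} timestamps.
time_diff(Start, Stop) -> timer:now_diff(Stop, Start) / 1000000.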

Complex Event Processing

Event Model – What Are We Looking For?

December 16th, 2008

Today I woke up to the most unexpected news – There’s been an earthquake here in Sweden! For many of you this might not sound like something to get excited about. But considering that the last one was ages ago and this one was the biggest one in a hundred years, I’m for sure a bit excited.

Sweden is normally the most stable place on earth, both geographically and weather-wise (and politically, I wonder if there’s a correlation between these…). So every little blizzard or other act of nature is a big thing here.

I have been writing lately about event models and event semantics. But I think it’s time to step back a bit and try to define our goals with an event model.

I’ll start with the format of events, this is easier than semantics…

When we designed the event format for ruleCore we had a couple of things in mind and I think these are rather generic in nature:

  • First, use XML. For better or worse we are stuck with XML. There are so many tools and standards related to XML that it’s a good choice.
  • There should be a small number of required elements and attributes which must exist in all events. These are common to all events.
  • The inbound events should contain a user defined part which is as flexible as possible.
  • Both the common and user defined part should be verifiable against a schema.
  • Events are immutable, so they can’t be changed after creation. But, for practical purposes, systems might want to add meta-data to an event. Make sure there’s a well defined part to put all kinds of data without breaking anything.
  • It is very common to search for the event type of an inbound event in order to find its definition.

I’ll use an example to show what we ended up with. This might serve as a base for your own event format or model:

<SpeedingViolation
  rc:eventTimestamp="2009-11-21T13:35:16.398+01:00"
  rc:eventId="c4e00004-abb1-4bee-8c54-8efbb1a5178b"
  xmlns="http://www.rulecore.com/2008/user"
  xmlns:rc="http://www.rulecore.com/2008/base">

  <rc:EventHeader>

    <rc:SecurityInfo>
      <rc:Credentials>3536ab16-393b-4447-a892-0a3e161f23a6</rc:Credentials>
    </rc:SecurityInfo>

    <rc:CausedBy>
      <rc:Event eventId="87b353e6-765befcda05c" eventType="CameraCptr" index="1"/>
      <rc:Event eventId="0e04849e-39c1db18b5a1" eventType="RadarData" index="2"/>
    </rc:CausedBy>

  </rc:EventHeader>

  <rc:EventBody>

    <Road>Route 293</Road>
    <SpeedLimit>120</SpeedLimit>
    <Speed>145</Speed>
    <Vehicle>AND-239</Vehicle>

  </rc:EventBody>

</SpeedingViolation>

You can just ignore all the namespace stuff for now, that’s left for the advanced ruleCore course…

As you can see there are three major parts in the event:

  1. The root element with some attributes
  2. The EventHeader
  3. The EventBody

The idea here is that the root element gives away the type of the event. This allows for easy validation against a schema and allows us to find the event type in a well defined location. It’s always the root element, so it is easy to find using XPath.

Attributes – The attributes of the root element are the fundamental attributes of the event. They must be there for all events. We have chosen to have the timestamp and id of the event as the required attributes.

Header – The event header contains meta-data. It can be used for basically anything that is practical to attach to the event.

In this example we have security information, which is completely system dependent and only used by the ruleCore CEP Server hosted service. Other systems might use the header for other system info too. By clever use of namespaces it is possible to allow systems to add elements into the header without any risk of breaking things.

The CausedBy element is also meta-data which is convenient to attach to the event. There’s nothing that prevents this information from being stored in an external place. But we found it practical to keep this information with the event too. Here any system can find information about the events which caused this event to happen.

Body – The EventBody is the third and perhaps most interesting part of the event. This is the user defined payload. By smart design of the XML Schemas you can verify the contents of the body against a schema too. By proper namespace usage the content here will not clash with anything else either.

In this example the root element tells us that this is a speed violation event and the body contains information about the violation: the id of the vehicle, its speed and so on.
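To show how a consumer might pull the three parts out of such an event, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The XML is a shortened copy of the example above; the variable names are my own, and this is only one of many ways to read the format.

```python
import xml.etree.ElementTree as ET

RC = "http://www.rulecore.com/2008/base"    # namespace of the common parts
USER = "http://www.rulecore.com/2008/user"  # namespace of the user defined parts

EVENT_XML = """
<SpeedingViolation
  rc:eventTimestamp="2009-11-21T13:35:16.398+01:00"
  rc:eventId="c4e00004-abb1-4bee-8c54-8efbb1a5178b"
  xmlns="http://www.rulecore.com/2008/user"
  xmlns:rc="http://www.rulecore.com/2008/base">
  <rc:EventHeader>
    <rc:CausedBy>
      <rc:Event eventId="87b353e6-765befcda05c" eventType="CameraCptr" index="1"/>
      <rc:Event eventId="0e04849e-39c1db18b5a1" eventType="RadarData" index="2"/>
    </rc:CausedBy>
  </rc:EventHeader>
  <rc:EventBody>
    <Speed>145</Speed>
    <Vehicle>AND-239</Vehicle>
  </rc:EventBody>
</SpeedingViolation>
"""

root = ET.fromstring(EVENT_XML.strip())

# 1. Root element: ElementTree tags look like "{namespace}LocalName".
event_type = root.tag.split("}")[1]
event_id = root.get(f"{{{RC}}}eventId")

# 2. Header: the types of the events that caused this one.
causes = [e.get("eventType") for e in root.iter(f"{{{RC}}}Event")]

# 3. Body: the user defined payload lives in the user namespace.
speed = int(root.find(f"{{{RC}}}EventBody/{{{USER}}}Speed").text)
```

The event type is available without looking at any child element, which is the whole point of putting it in the root.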

 

 

 

 

 


Complex Event Processing

Events in XML – Two Options

December 14th, 2008

Option 1

<Event>
  <Header>
    <EventType>Exception</EventType>
    <EventId>92839289382</EventId>
  </Header>
  <Body>
    <Err>Marco messed up</Err>
  </Body>
</Event>

Option 2

<Exception id="98928392839">
  <Header>
  </Header>
  <Body>
    <Err>Marco messed up</Err>
  </Body>
</Exception>

Let’s get practical for a short while, before going on and talking about event semantics and event models.

If we agree that XML is one good way to represent events in most common cases, we have basically two ways of doing it. Above you can see two different approaches.

Option 1 uses a common XML element as the root and says that this is an "Event". The type, id and other properties can be defined in the header element of the event. The body would then contain a payload suitable for this particular type of event.

Option 2 uses the root element to name the type of the event. The properties of the event are given in the root element and the Body element contains the payload just as in option 1.

It might be a matter of taste which one to prefer; my taste is for option 2. It’s just more XMLish…

There are a number of features of it I like:

  • The event type is apparent directly from the root element. Many times processing is based on event type and it can be critical to find it easily
  • Having the root element define the event type lets us easily use a schema to verify it. The schema, if done properly, can contain two parts. One base which defines the overall structure of the event. That is the <Header> and the <Body> tags. The base also contains the definition of the header and which attributes to expect in the root element.
  • The extension to the base schema, specific for the event type, defines what can go into the Body.

So each event would have well defined properties and a well defined header part, while still having a totally flexible body.

At the same time everything can be validated against schemas.

It might not come as a surprise that we use option 2 in the ruleCore event model.

Option 1 is not completely wrong either, but it does not feel as XML Schema friendly to me and it requires some more parsing to get the event type.
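The difference in parsing effort can be seen in a few lines of Python, using the standard `xml.etree.ElementTree` module and the two example documents from above (the function names are made up for this sketch):

```python
import xml.etree.ElementTree as ET

OPTION_1 = (
    "<Event><Header><EventType>Exception</EventType>"
    "<EventId>92839289382</EventId></Header>"
    "<Body><Err>Marco messed up</Err></Body></Event>"
)
OPTION_2 = (
    '<Exception id="98928392839"><Header></Header>'
    "<Body><Err>Marco messed up</Err></Body></Exception>"
)


def event_type_option_1(xml_text):
    """Option 1: the type is a child element, so we have to look inside the header."""
    return ET.fromstring(xml_text).findtext("Header/EventType")


def event_type_option_2(xml_text):
    """Option 2: the type is simply the name of the root element."""
    return ET.fromstring(xml_text).tag


type_1 = event_type_option_1(OPTION_1)
type_2 = event_type_option_2(OPTION_2)
```

With a full parser the difference is small, but with option 2 a streaming parser knows the type as soon as it has seen the first tag.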

These are the two main ways I see of using XML. Do you have ideas for other ways of doing it?

That’s if you prefer XML. I can see that there is a place for more compact binary formats too. So the event model and the definition of semantics should be neutral to the format used to encode events. But for practical purposes XML might be a good candidate in most cases.

Complex Event Processing

Event Models – Continued

December 13th, 2008

I had some very good comments on my previous* post about event model semantics.

The issue: What are the fundamental properties of an event, and what are their semantics? The prime candidates seem to be:

  • Id
  • Type
  • Detection timestamp

Others, which could be present, but might not be required:

  • Location
  • Entity
  • Occurrence time stamp
  • Event class or category

Let’s look at some practical examples…

<ZoneEntry id="bd79d85a-12c1-4a1f-9047-3dc34fcab2cc" time="2008-12-13 20:45:44">
  <Header>

    add stuff here tomorrow
  </Header>
  <Body>
    add stuff here too tomorrow
  </Body>
</ZoneEntry>

I’m using XML here as an example, but it could as well be something else… In this example we have an event of type ZoneEntry, it was detected just a couple of minutes ago and its globally unique id is bd79d85a-12c1-4a1f-9047-3dc34fcab2cc.

As we can see, I envision the event to contain two containers: one is the header and the other is the body.

The header contains information about the event itself, for example which events it depends on, security related information, information about the event sending system and other meta data type of information.

The body is the actual payload of the event. 

This is a good start, let’s continue adding to it tomorrow. But until then, let’s enjoy the weekend.

 

 

* Due to an unfortunate snafu (that is, I messed things up) my previous posts are not available currently. They are not lost, just sitting in a SQL dump ready to be imported whenever I have the time to figure out how…

Complex Event Processing