Event Models – Identity

December 22nd, 2008

A couple of days ago I wrote a bit about event models and tried to sort out what the fundamental properties of an event are.

One of those properties seems to be the identity of an event. I see this property as one of the major defining characteristics of event processing, the thing that sets event processing apart from mere data processing.

Normally data, like tuples in a database, is anonymous. If you insert the value 123 into a database, you can’t distinguish it from another 123 in the same database. The data all looks the same. But events report on real-world occurrences and should be unique.

Each event (instance) should have its own universally unique id!

Technically this is very easy. Just let someone, commonly the event generator, assign a UUID to the event.
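As a minimal sketch of what that could look like at the event generator, assuming a plain Python event object (the `Event` class and its field names are made up for illustration and are not taken from any particular product):

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical event shape: a type name plus an arbitrary payload.
    event_type: str
    payload: dict
    # The generator stamps a random (version 4) UUID at creation time,
    # so no coordination with other generators is needed.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


# Two events with identical content are still distinguishable by id.
a = Event("Collision", {"vehicles": ["ABC123", "XYZ789"]})
b = Event("Collision", {"vehicles": ["ABC123", "XYZ789"]})
print(a.id != b.id)
```

Unlike two identical 123 values in a database, the two events above carry the same content but remain two distinct instances.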

But, what’s in an event id?

First of all, if we add the requirement (which I think we should) that events are create-once-never-change type of things, then in many cases it is enough to use the identity of an event in place of the event itself. Events can be replicated and distributed rather easily and found using their id as the search key. Basically we can store info such as “Event with id 1939 occurred because of event with id 1914” without storing the events themselves.
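A small sketch of that idea, assuming immutable events and a causality table that holds nothing but ids (the names `caused_by`, `event_store` and `causes_of`, and the event types, are hypothetical):

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)  # frozen: create once, never change
class Event:
    id: str
    event_type: str


# Causality is recorded purely in terms of ids: effect id -> cause ids.
caused_by = {"1939": ["1914"]}

# The event objects themselves can live somewhere else entirely
# (another node, an archive) and are looked up by id only when needed.
event_store = {
    "1914": Event("1914", "Cause"),
    "1939": Event("1939", "Effect"),
}


def causes_of(event_id):
    """Return the ids of the events that caused the given event."""
    return caused_by.get(event_id, [])


print(causes_of("1939"))

try:
    event_store["1914"].event_type = "Changed"
except FrozenInstanceError:
    print("events cannot be modified after creation")
```

Because the events never change, any copy found under a given id is as good as any other copy, which is what makes passing around only the id safe.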

I tend to think of events as “Notification about state change in some (business) entity”. With this in mind, the event id would be the id of a particular state change.

And here’s where things get tricky…

Let’s say we have the event:

  • Type: Collision
  • Payload: Vehicles that collided

To complicate things, or rather make them more realistic, let’s say this event can be reported by a number of event generators. This could happen, for example, because the collision is detected by a bystander, a surveillance camera and the vehicles themselves at the same time.

So we have at least four different entities (a bystander, a camera and two or more vehicles) which detect the same collision.

Now, should all the entities generate an event with the same identity? After all, they are all creating a notification about the same thing?

If multiple event generators observe and report about the same event occurrence, they would obviously need to be coordinated if we require that they generate events with the same id. This is not practical or maybe not even possible. So we need some other solution for this. (hint: multi event fusion)

What if the different events describe a slightly different aspect of the collision? It’s still the same collision.

When creating event processing logic, like rules in ruleCore, you want to be shielded from this problem. The rules should see only one collision! Otherwise you easily end up with rules triggering multiple times, and suddenly you have reports about multiple collisions when in reality there has been only one.
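One way to shield the rules is a small fusion step in front of them. The following is only a rough sketch under strong assumptions: every report lists the vehicles it saw, and two reports are taken to describe the same collision if they share a vehicle and arrive within a few seconds of each other. The `Report` class, the `fuse` function and the 5 second window are invented for this example; this is not how ruleCore does it.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    id: str               # each observation keeps its own id
    observer: str
    timestamp: float      # seconds
    vehicles: frozenset   # vehicle identifiers seen by this observer


def fuse(reports, window=5.0):
    """Group reports that overlap in vehicles and are close in time."""
    clusters = []
    for r in sorted(reports, key=lambda rep: rep.timestamp):
        for c in clusters:
            close = r.timestamp - c["last_seen"] <= window
            if close and c["vehicles"] & r.vehicles:
                c["vehicles"] |= r.vehicles
                c["last_seen"] = r.timestamp
                c["sources"].append(r.id)
                break
        else:
            # No matching cluster: treat this as a new collision.
            clusters.append({"vehicles": set(r.vehicles),
                             "last_seen": r.timestamp,
                             "sources": [r.id]})
    # Each summary is a new event with its own id, pointing back
    # at the observations it was built from.
    return [{"id": str(uuid.uuid4()),
             "type": "CollisionSummary",
             "vehicles": sorted(c["vehicles"]),
             "sources": c["sources"]} for c in clusters]


reports = [
    Report("r1", "bystander", 0.0, frozenset({"A", "B"})),
    Report("r2", "camera", 1.0, frozenset({"A", "B"})),
    Report("r3", "vehicle A", 0.5, frozenset({"A"})),
    Report("r4", "vehicle B", 0.6, frozenset({"B"})),
    Report("r5", "camera", 100.0, frozenset({"C", "D"})),
]
summaries = fuse(reports)
print(len(summaries))
```

The rules downstream would then subscribe to the summary events only, so the four reports about vehicles A and B reach them as a single collision.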

Something as simple as an event id can have rather complex semantics, and you should be aware of this when designing your event model.

Or is it just me complicating things?

  1. December 22nd, 2008 at 12:12 | #1

    A person has more than one Id document (passport, health card, social insurance..); they all describe the same person–with a bit different data–yet they have different Id numbers. However, the Id number actually belongs to the document, not to the person.

    I think that an EventId belongs to an event-object. For example, the event “temperature rose above 30 deg” reported by a sensor actually means that the sensor+circuit detected a temperature change. It MAY BE that the sensor is faulty and that another nearby sensor did not report anything. So the reported event actually belongs to the reporting device, not to the real-world phenomenon it is trying to describe.

  2. December 22nd, 2008 at 12:48 | #2

    I agree that event semantics may be complex. If you have a three-car pile-up, does that count as one collision or two, given that the third car hits a few seconds after the first two? Or three collisions, if the third car hits both of the first two cars?

    So you have to have some semantic model that provides a basis for counting how many collisions there are in the real world, let alone how many event reports are received. There is a semantic rule (sometimes called an identity rule, after Frege) as to when two things are the same, or the-same-again.

    An identity rule (together with a membership rule: which things count as collisions at all) should be part of your definition of “collision”.

    • marco
      December 22nd, 2008 at 13:03 | #3

      Richard, that would depend on when the event generator detects the event. Somehow the event is detected and a notification about it is created. Depending on the observer the notification might contain information about 2 or 3 cars, depending on when it was detected.

      This is by the way a good argument for Damir’s case – That the id is the identity of the event object. The actual event can never be identified across different observers anyway. So the best way might be to look at the id as the id of the event observation.

      It’s probably the task of some event unification/fusion front end to try to figure out which events are notifications about the same real-world event and generate some kind of aggregate of them. When I think of it, something like this could be done with a ruleCore rule; after all, it’s a situation we detect here. A situation which consists of a number of near-simultaneous reports about the same real-world activity. A perfect task for a rule to detect and aggregate into one outbound “collision summary” event.

  3. December 22nd, 2008 at 14:22 | #4

    Re: aggregation

    Marco, let us imagine a system like GM OnStar. The system reports that three vehicles crashed at the same GPS location. Does it mean they bumped into each other, or something else, like an icy road so that they all ended up in a ditch?

    Aggregation may be tricky. In my view, for aggregation to work, the programmer must actually know what lies on the level below an event (exactly how it is measured) and factor in any possible false readings and assumptions. Having a background in controls engineering, I frequently like to remind myself of Three Mile Island.

    http://en.wikipedia.org/wiki/Three_Mile_Island_accident#Accident_description

    In other words, an event-object is just an abstraction of how an observing device senses a real-world phenomenon, not the phenomenon itself. The distinction should always be clear in users’ minds.

  4. Hans
    December 22nd, 2008 at 15:37 | #5

    Of course I like unique IDs and I agree that they make soooo many things easier, even down to basic debugging.

    But taking a step back from the wish to have unique IDs, I’d say that this requirement should be dictated by the use cases. My opinion is that it’s best to relax as many requirements as possible to begin with and to only include those that are genuinely required to meet the use cases.

    For example, if you are interested in using the unique ID as a placeholder/primary key kind of thing… then maybe it would be Ok to have that ID assigned when an event is touched by the EP system (rather than at the source where it’s generated).

    For example, do the GPS events from trucks come with unique IDs? And does that matter? That’s not a rhetorical question, I don’t know the answer.

  5. marco
    December 22nd, 2008 at 23:04 | #6

    Damir, I think I start to agree with you… Event id should be the identity of the observation… That makes most sense.

    Aggregating multiple events into a common notification is clearly done by using one or several rules. There surely must be a way, expressed by rules, to describe how to do the aggregation. These rules could be context dependent or dependent on the goal of the aggregation.

    Hans, as for mandatory IDs. Don’t know how others do this. But in ruleCore we require an id for each event when the event hits the rules. It can be set by the sender if it wishes to record the id of an outbound (from event generator to ruleCore, so inbound for ruleCore) event, perhaps for knowing later on which rules that event triggered using the causality tracking feature. If there’s no id set in the event, the Event I/O frontend creates one upon arrival.

    Actually the most common case is that the event generator does not care about the id of the events it generates. So they get stamped by ruleCore, but for the processing and creation of rules to make sense we need an id for each event.

    Each outbound event does also have an id. But that id can be ignored by the receiver.

    So the event id is not required at the borders but is a must internally.

    The only case I see for event generators to actually generate an id is when they need to track causality or if a rule is created to trigger on a particular event instance, bad rule design but anyway…

    If you get an outbound event from ruleCore with a causality list you might want to keep that if you send an event back into ruleCore again. This keeps a nice CausedBy list in the events when they loop between event generators, receivers and ruleCore.

    So, no. Events from GPS do not have an id…

  6. Sue
    December 25th, 2008 at 09:34 | #7

    The ID is important in distributed EP solutions. The distribution of event information should be done using the ID of the event. The event object should only be distributed on demand.

  7. December 29th, 2008 at 16:17 | #8

    I think the semantics of the real-world phenomena (how many real-world collisions does this count as) is logically prior to the aggregation of the event notifications (how many messages do we receive, and do these refer to the same real world phenomenon). I think the latter may depend on when the event generator detects the event, but not the former.

    So I think I’m agreeing with Damir, and I concur with the relevance of the Three Mile Island example. As I recall, the operators were faced with dozens of simultaneous warning lights and were unable to make sense of them. (See analysis by Karl Weick.)
