Archive for the ‘Complex Event Processing’ Category

Who decides if A and B are related?

May 15th, 2009 3 comments

Here’s something that I have been thinking about lately.

Let’s say we have two events: A and B.

How are they related?

Scenario 1: I send event A into an event processing box and out comes B. By some information I have at hand, I can conclude that B is caused by A.

Scenario 2: A friend of mine sees event A going into the box, he also sees B coming out of it. But my friend has another set of information and concludes that A did not cause B.

So, two opposing views:

  1. A caused B
  2. A did not cause B

 

Are my friend and I both right?

If so, where should we record this causality information? 

A popular way is to record it in event B, where a field would record the fact that it was caused by A.

But what if event causality is a matter of perspective? Do we then need multiple databases of causality information, depending on who’s watching?
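
To make the question a bit more concrete, here is a minimal sketch of the alternative to a "caused by" field inside event B: causal claims are kept outside the events and keyed by observer. The `CausalityLog` class and all of its method names are made up for illustration, they do not come from any product.

```python
from collections import defaultdict


class CausalityLog:
    """Hypothetical store of causal claims, kept outside the events themselves."""

    def __init__(self):
        # observer -> {(cause_id, effect_id): True or False}
        self._claims = defaultdict(dict)

    def record(self, observer, cause_id, effect_id, caused):
        """Store one observer's verdict on whether cause_id caused effect_id."""
        self._claims[observer][(cause_id, effect_id)] = caused

    def view(self, observer, cause_id, effect_id):
        """Return the observer's verdict, or None if the observer made no claim."""
        return self._claims.get(observer, {}).get((cause_id, effect_id))

    def observers_disagree(self, cause_id, effect_id):
        """True if at least two observers recorded opposite verdicts."""
        verdicts = {
            claims[(cause_id, effect_id)]
            for claims in self._claims.values()
            if (cause_id, effect_id) in claims
        }
        return len(verdicts) > 1


log = CausalityLog()
log.record("me", "A", "B", True)        # scenario 1: A caused B
log.record("friend", "A", "B", False)   # scenario 2: A did not cause B
```

With this layout event B stays the same for everybody, and both views can coexist. Whether that is better than one database per observer is exactly the open question.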

 

 What do you think?

 

Today’s totally unrelated picture:

A totally unrelated lion


Call for Papers – RuleML 2009

May 13th, 2009 Comments off


                               Call for Papers
                                       
                                 RuleML 2009
                                 
   3rd International Symposium on Rules, Applications and Interoperability
                  November 5-7 2009, Las Vegas, Nevada, USA
                  
                           http://2009.ruleml.org/


****************************************************************************
*Latest news                                                               *
*Student grants of the value of up to $1000 plus free registration         *
*Keynote by Sandro Hawke, W3C RIF Contact -  The Future of Rule Interchange*
*Prestigious prizes and new categories in the RuleML Challenge             *
****************************************************************************


Sponsored by
============================================================================
Franz Inc
NICTA (National ICT Australia) Ltd
Corporate Semantic Web
Logic Programming Associated Ltd
Modelsystems Ltd
ruleCore
============================================================================

Overview and Aim
============================================================================
The International Symposium on Rules, Applications and Interoperability has
evolved from an annual series of international workshops since 2002,
international conferences in 2005 and 2006, and international symposia since
2007. This year, the 3rd International Symposium on Rules, Applications and
Interoperability (RuleML-2009) takes place in Las Vegas, Nevada, USA,
collocated with the 12th Business Rules Forum, the world's largest Business
Rules event. RuleML-2009 is devoted to practical distributed rule
technologies and rule-based applications which need language standards for
rules (inter)operating in, e.g., the Semantic Web, Multi-Agent Systems,
Event-Driven Architectures, and Service-Oriented Applications. 

The main goal of RuleML-2009 is to stimulate the cooperation and
interoperability between business and research, by bringing together rule
system providers, participants in rule standardization efforts, open source
communities, practitioners, and researchers. The concept of the symposium
has also advanced continuously in the face of extremely rapid progress in
practical rule and event processing technologies. As a result, RuleML-2009
will feature hands-on demonstrations and challenges alongside a wide range
of thematic tracks, and thus will be an exciting venue to exchange new 
ideas and experiences on all issues related to the engineering, management,
integration, interoperation and interchange of rules in open distributed 
environments such as the Web.

Conference Theme
============================================================================
This year, we particularly welcome submissions that address applications of
Web rule technologies for business and information systems. We invite you to
share your ideas, results, and experiences: as an industry practitioner,
rule system provider, technical expert and developer, rule user or 
researcher, exploring foundations, developing systems and applications, 
or using rule-based systems. We invite high-quality submissions related to 
(but not limited to) one or more of the following topics:

Track Topics
----------------------------------------------------------------------------

Rule Transformation and Extraction
  - Transformation and extraction with rule standards, such as SBVR, RIF and
    OCL
  - Extraction of rules from code
  - Transformation and extraction in the context of frameworks such as KDM
    (Knowledge Discovery meta-model)
  - Extraction of rules from natural language
  - Transformation of rules from one dialect into another

Rules and Uncertainty
  - Languages for the formalization of uncertainty rules
  - Probabilistic, fuzzy and other rule frameworks for reasoning with
    uncertain or incomplete information
  - Handling inconsistent or disparate rules using uncertainty
  - Uncertainty extensions of event processing rules, business rules,
    reactive rules, causal rules, derivation rules, association rules, 
    or transformation rules

Rules and Norms
  - Methodologies for modeling regulations using both ontologies and rules
  - Defeasibility: modeling rule exceptions and priority relations among
    rules
  - The relationship between rules and legal argumentation schemes
  - Rule language requirements for the "isomorphic" modeling of legislation
  - Rule based inference mechanism for legal reasoning
  - E-contracting and automated negotiations with rule-based declarative
    strategies

Rule-based Game AI
  - Rule-based movement, decision making, strategies, behavior design
  - Rule-based environmental programming, virtual reality
  - Rules for multi-agent/character games
  - Rules for serious games
  - Rule-based agent design

Rule-based Event Processing and Reaction Rules
  - Reaction rule languages and engines (production rules, ECA rules, logic
    event action formalisms, vocabularies/ontologies)
  - State management approaches and frameworks
  - Concurrency control and scalability
  - Event and action definition, detection, consumption, termination,
    lifecycle management
  - Dynamic rule-based workflows and intelligent event processing
    (rule-based CEP)
  - Non-functional requirements, use of annotations, metadata to capture
    those
  - Design time and execution time aspects of rule-based (Semantic) Business
    Processes Modeling and Management
  - Practical and business aspects of rule-based (Semantic) Business Process
    Management (business scenarios, case studies, use cases etc.)

Rules and Cross Industry Standards
  - Rules in Current Industry Standards, including:
      - XBRL: Extensible Business Reporting Language
      - MISMO: Mortgage Industry Standards Maintenance Org
      - FIXatdl: FIX Algorithmic Trading Definition Language
      - FpML: Financial products Markup Language
      - HL7: Health Level 7
      - Acord: Association for Cooperative Operations Research and
        Development (Insurance Industry)
  - Rules for Governance, Risk, & Compliance (GRC), e.g., rules for internal
    audit, SOX compliance, enterprise risk management (ERM), operational
    risk, etc
  - Rules and Corporate Actions

General Rule Topics
  - Rules and ontologies
  - Execution models, rule engines, and environments
  - From rules to FOL to modal logics
  - Rule-based reasoning with non-monotonic negation, modalities, deontic,
    temporal, priority, scoped or other rule qualification
  - Rule-based default reasoning with default logic, defeasible logic, and
    answer set programming
  - Graphical processing, modelling and rendering of rules
  - Rules in Semantic Web Technologies(SW), Artificial Intelligence (AI),
    Business Process Modeling (BPM), Cloud Computing (CC), Intelligent
    Agents, Model-Driven Architecture (MDA), Software Engineering (SE),
    Unified Modeling Language (UML), e-Learning, e-Commerce, ...
  - Miscellaneous rule topics

Case studies, experience reports, and industrial problem statements are
particularly encouraged.

RuleML-2009 Challenge
============================================================================
The RuleML-2009 Demo Challenge is one of the highlights of RuleML-2009. We
invite submissions of demos where rules are used in interesting and
practically relevant ways to, e.g., derive useful information, transform
knowledge, provide decision support and provide automated rule-based
monitoring, enforcement, validation or management of the behavioural logic
of the application. The Challenge offers participants the chance to
demonstrate their commercial and open source tools, use cases, and
applications. Submissions are solicited in these categories:

  - Benchmarks (test cases, suites) with evaluations of (their own, other)
    rule engines and/or rule translators, possibly drawing on our growing
    pool at http://ruleml-challenge.cs.nccu.edu.tw
  - Case studies (use cases) implemented via engines/translators employing
    rule standards such as RIF, RuleML, CLIPS, Common Logic, SBVR, and ISO
    Prolog.

We welcome all demos about tools and applications using rules such as:

  - Derivation rules, including query and integrity rules
  - ECA rules, including production rules, reaction rules, and rule-based
    CEP languages

Authors of demo are also invited to submit a Challenge demo paper for
publications in the conference proceedings, see the submission section below
for submission details. Prizes will be awarded to the two best applications
from each category. All accepted demos will be presented in a special
Challenge Session.

A submission to the RuleML Challenge has to meet the requirement that
declarative rules explicitly play a central role in the application.
Basically this means that: Rules are explicitly represented in a declarative
format and they are decoupled from the application (rather than being
compiled or hard-coded into the application logic).

The demo should preferably (but not necessarily) be embedded into a
web-based or distributed environment so that there will be a need for
features related to the RuleML conference topics, as listed in the call for
papers. For more details and the demo site web link please consult the
RuleML-2009 Challenge website: http://ruleml-challenge.cs.nccu.edu.tw

Student Grant Awards
============================================================================
Two travel grants are available to students who are authors or co-authors of
papers or demos accepted for presentation at the symposium. The grants
include free registration and cover travel expenses up to 1000 dollars.

Conference Language
============================================================================
The official language of the conference will be English.

Submission
============================================================================
Authors are invited to submit original contributions of practical relevance
and technical rigor in the field, experience reports and show case/use case
demonstrations of effective, practical, deployable rule-based technologies
or applications in distributed environments. Papers must be in English and
may be submitted at http://www.easychair.org/conferences/?conf=ruleml2009
as:

  - Full Papers (15 pages in the proceedings)
  - Short Papers (8 pages in the proceedings)
  - RuleML-2009 Challenge Demo Paper + Show Cases (3-5 pages in the
    proceedings)

Please upload all submissions as PDF files in LNCS format
(http://www.springer.de/comp/lncs/authors.html). To ensure high quality,
submitted papers will be carefully peer-reviewed by 3 PC members based on
originality, significance, technical soundness, and clarity of exposition.
Authors are requested to upload the abstracts of their papers before
June 9, 2009 and to upload their complete papers by June 16, 2009.

The selected Papers will be published in book form in the Springer Lecture
Notes in Computer Science (LNCS) series along with a CD with demo software
and documents. The best paper from all submissions will be determined by the
PC and a Best Paper Award will be handed over at the Symposium by a Sponsor.
All submissions must be done electronically. A selection of revised papers
will be resubmitted to a special issue of a journal.

Submissions to the RuleML Challenge 2009 consist of a demo paper of 3-5
pages, describing the demo show case, and a link to more information about
the demo/show case, e.g. a project site, an online demonstration, a
presentation about the demonstration, or a download site for the
demonstration. In case of product demos, the link can be password-protected:
please submit a password for anonymous login from any Web browser, giving us
the permission to pass the password on to 3 PC members. The submissions
should satisfy the minimal requirements defined in the topics of interest
and preferably exhibit some of the additional desiderata. The more
desiderata are met by an application, the higher the score will be. The
demos will be evaluated by the RuleML-2009 Program Committee and prizes will
be awarded to the two best applications, sponsored by the RuleML Inc.
non-profit organization.

Review Process
============================================================================
The submitted papers will pass the blind review process. At least three
members of the Program Committee will review each submission.

Important Dates:
============================================================================
Abstract submission deadline: June 9, 2009
Paper Submission deadline:    June 16, 2009
Notification of acceptance:   July 18, 2009
Camera ready due:             August 9, 2009
Symposium dates:              November 5-7, 2009
RuleML Challenge:             November 5, 2009

Conference Venue
============================================================================
RuleML-2009 will take place at the Bellagio in Las Vegas collocated with the
Business Rules Forum.

Keynote Speakers
============================================================================
  - Sandro Hawke, W3C RIF Team Contact
    The Future of Rule Interchange
  - TBA

Programme Committee
============================================================================

General Chair
--------------------
Adrian Paschke, Freie Universitaet Berlin, Germany

Program Chairs
--------------------
Guido Governatori, NICTA, Australia
John Hall, Model System, UK

Liaison Chair
--------------------
Hai Zhuge, Chinese Academy of Sciences

Publicity Chair
--------------------
William Langley, NRC-IRAP, Canada

Track Chairs

Rule Transformation and Extraction
--------------------
Erik Putrycz, Canada
Mark Linehan, IBM, USA

Rules and Uncertainty
--------------------
Matthias Nickles, University of Bath, UK
Davide Sottara, University Bologna, Italy

Rules and Norms
--------------------
Thomas Gordon, Fraunhofer FOKUS, Germany
Antonino Rotolo, CIRSFID, University of Bologna, Italy

Rule-based Game AI
--------------------
Benjamin Craig, National Research Council, Canada
Weichang Du, University of New Brunswick, Canada

Rule-based Event Processing and Reaction Rules
--------------------
Alex Kozlenkov, Betfair Ltd., UK
Adrian Paschke, Free Univ. Berlin, Germany

Rules and Cross Industry Standards
--------------------
Tracy Bost, Valocity, USA
Robert Golan, DBMind, USA

RuleML Challenge
--------------------
Yuh-Jong Hu, National Chengchi University, Taiwan
Ching-Long Yeh, Tatung University, Taiwan
Wolfgang Laun, Thales Rail Signalling Solutions GesmbH, Austria

RuleML 2009 Sponsors
============================================================================

Silver Sponsors
--------------------
Franz Inc
NICTA (National ICT Australia) Ltd
Corporate Semantic Web

Bronze Sponsors
--------------------
Logic Programming Associated Ltd
Modelsystems Ltd
ruleCore

RuleML 2009 Partners
============================================================================
W3C, World Wide Web Consortium
Belgian Business Rules Forum
MIT Sloan CIO Symposium
International Association for Artificial Intelligence and Law
Event Processing Technical Society
BPM Forum Belgium
October Rules Fest
SKG2009 5th International Conference on Semantic, Knowledge and Grid
RR-2009 3rd International Conference on Web Reasoning and Rule Systems

 


When CEP Might be a Bad Idea

May 10th, 2009 7 comments

Recently I was in a discussion with a group of developers. They had spent a few days trying to understand what’s behind the CEP TLA.

By some creative googling they had found all the vendors that I expected them to find when researching CEP: StreamBase, Aleri, ruleCore, Apama, Tibco BusinessEvents, IBM Business Events and Agent Logic.

This clever bunch of developers seemed to have a pretty good understanding of what CEP was. But one thing they really could not agree on was what kind of problems CEP is unsuitable for.

As one said, "From the vendors’ web sites I gather CEP can be used for just about anything apart from brewing coffee."

The rest of the discussion circled around the topic: When is CEP a bad idea?

To make a long discussion short, here’s what we came up with (I might have forgotten something, though…):

CEP is not a Good Idea:

  • When polling SQL will do the job
  • If you could do a normal report and mail it to people instead. This requires fairly stable data and a willingness to use old, boring (translation: productive and stable) technology
  • When nobody cares about getting the results in real-time
  • If your system is driven by user requests and there’s always a user waiting for an answer for each request.
  • When 100 lines of code will solve the same problem.
  • If you need an answer to a question at a specific point in time (this surprised me until I learned that the guy that came up with this was heavily into production rule systems)
  • Anything that needs hard real-time. All agreed that CEP was more for business type of stuff than actually controlling plants and machines.
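
As an illustration of the first bullet, here is a minimal sketch of what "polling SQL" can look like, using Python's built-in `sqlite3` module and an in-memory database. The `inbound_events` table, its columns and the `fetch_new_rows` helper are assumptions made up for this example.

```python
import sqlite3


def fetch_new_rows(conn, last_seen_id):
    """Return rows added since last_seen_id, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, payload FROM inbound_events WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    new_last = rows[-1][0] if rows else last_seen_id
    return rows, new_last


# Hypothetical table, filled with two rows so there is something to poll.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inbound_events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO inbound_events (payload) VALUES (?)",
    [("order",), ("error",)],
)

rows, last_id = fetch_new_rows(conn, 0)               # first poll sees both rows
again, last_id_again = fetch_new_rows(conn, last_id)  # second poll sees nothing new
```

A real poller would call `fetch_new_rows` in a loop with a sleep in between. If a delay of a few seconds or minutes is acceptable, that is often all the "event processing" you need.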

I think this was a good coffee break exercise, as the biggest problem in any real system that I have seen comes from using tools and products which are totally unsuitable for the task. So understanding when not to use a specific type of building block is critical for any architect and developer.

 

 

 

 


A new Trend in CEP

May 8th, 2009 2 comments

In the last few months there seems to be a new trend emerging in the world of CEP.

More and more vendors start to attach the CEP label to their software. Suddenly through some marketing magic, many vendors think they need to start selling something "CEP Enabled" or a product which "does CEP" or even a "CEP Engine". Could it be the same vendors which re-branded to SOA a couple of years ago? ;)

This in my view is good and bad.

Good, because it provides at least some sort of evidence that the idea and concept of CEP is getting some attention. The vendors must think that their software in category X will sell better with a nice shiny CEP sticker attached to it.

Bad, because it might lead us down the same path that killed SOA. If everything is CEP, there’s no way users will understand the point of CEP. The general definition "something that processes events" is so broad that basically everything can be turned into CEP at the marketing level.

But I think the Good outweighs the Bad. More talk about CEP gets the idea of real-time processing more attention, which can’t be bad…

 

 


When Will Oracle and Others add Streaming Queries?

May 5th, 2009 7 comments

For a long time now I have expected at least one of the big RDBMS vendors to add support for streaming queries into their base product.

To me it seems like a good idea, an idea which might require a lot of work with the internals of the database engine in order to make it work. But still I thought it would be worth the effort.

So… Oracle, Sybase, Microsoft and IBM – where are your CEP efforts?

If you could do both streaming queries and traditional SQL queries in Oracle, SQL Server, DB2 or any of the other major databases, I think user interest in CEP would rise by an order of magnitude.

Today we have a couple of small vendors (Aleri with their Coral8 engine, and StreamBase, for example) doing this with their own custom-built engines. If one could do streaming queries in a natural way in the existing RDBMS vendors’ products, we might end up in a situation where you don’t need a specialized engine for that. Good for the users, bad for a number of CEP vendors.

I think there are two possible explanations. Either it’s too hard to implement streaming queries in an RDBMS without affecting pure SQL performance and without turning the code into a mess, or the vendors are actually building this as we speak but it takes a bit longer than I thought.

My best guess is that Oracle will beat everybody to it. No idea when, but soon (or later) I think there will be streaming queries which feel as natural as the other type of queries.

If I know Oracle correctly, their solution will be tightly integrated with the RDBMS engine. You will probably be able to do a lot of fancy stuff with it, but in true Oracle style there will be lots of configuration and endless setup to do before it works (which translates to lots of nice billable hours for the consultants ;) ).

This would also solve the problem where your database sees lots and lots of updates and you need to react to a specific state in the RDBMS. Triggers can solve part of that problem, but are avoided by most developers, as far as I know. So rapidly changing data, like events appended to the INBOUND_EVENTS table, will (hopefully) in the future feed queries in a natural way, just like Coral8/Aleri and StreamBase do today.

 


So Why Not Use An In-Memory db Instead?

April 24th, 2009 11 comments

The world of CEP is constantly evolving and it will take some time before the world+dog agrees on what CEP is, and more importantly what it is not.

Nobody (well, there are people for everything…) would put heavy computation inside a normal SQL database in the year 2009. I suppose you could; PL/SQL, for example, can do just about anything. But it is not natural, and most developers feel instinctively that it’s wrong.

For CEP, developers and architects have not developed these instincts yet.

This is not a surprise, as the world of CEP evolves constantly and it’s still unclear what the current vendors’ event processing offerings will evolve into in 10 years or so. I’m sure there are new types of event processing tools being developed as we speak, adding even more concepts usable for event processing.

So basically every developer and vendor has their own idea on what CEP is and how it should be done and more importantly what kind of problems should be solved using CEP.

For me, questions and statements like "Could CEP have been used to stop the financial meltdown?" are in the same category as asking SQL to optimize a logistics chain. In my personal view of CEP, they just sound wrong. You need financial algorithms (or whatever they are called in that world) to solve problems in that domain. Maybe running inside a CEP product, but still not CEP.

In this confusion, the question I get asked most is: "Can’t we just use an in-memory database to do this?" I really do understand the question. People see all these SQL-based approaches and equate them with a special-purpose engine doing stuff in memory instead of storing data on disk. So what’s the big deal, they ask, and move along…

The short answer is, for some types of problems: go ahead and use a "normal" database, for example MySQL with an in-memory storage engine. After all, a server with 256 GB of memory does not cost that much anymore (quick check: a Dell R900 with 256 GB RAM and six cores is about $40k). So many of the performance benefits claimed by CEP vendors are rapidly decreasing in importance and, more importantly, are not worth paying for, as you can solve the problem by adding hardware. The group of customers which need more performance than a single server can deliver when doing standard in-memory SQL is constantly shrinking.

For a sub-set of problems, the conceptual power of SQL just is not enough. You can do lots of trickery with plain old SQL. But at some point it does not feel natural anymore and you need a PhD in SQL in order to understand what’s up.

Then you need something more powerful and conceptually more suitable for event processing. Even if you have only one event per second, your SQL can get totally unmaintainable. So the complexity lies in the rules you apply, not in the number of events processed.

The smart vendor and customer will soon understand this and start focusing on concepts tailored specially for event processing and not old re-used concepts which are tweaked to the max to work in a new and fundamentally different paradigm.

I’m sure this process is happening as we speak (or otherwise I predict that I can add more RIP into my vendors list) and I’m excited to see what will come out of this in the years to come.

 


Books Relevant to CEP

April 20th, 2009 Comments off

Today I thought I’d complete my draft on books that I think are relevant to CEP. To my great surprise Opher Etzion had a post on the exact same topic!! Scary, I need to start wearing that tin foil hat again…

So I deleted my draft and refer you to his blog instead: here.

The reason I thought I’d publish something that was more or less ready was that my blog writing quota went to answering a post on the complexevents.com forum, which contains an interesting discussion about hierarchical event processing. You can check it out here.

 

How this truck is relevant to CEP is left as an exercise to the interested reader:

A truck passing a geofence

 

Meanwhile I’m off to start my new life and prepare with some new gear for tomorrow’s gym session…

 


Is it OK to Re-publish an Event?

April 6th, 2009 9 comments

Here’s a question that I can’t make up my mind on. So help me out…

For the sake of argument, let’s agree that an event processing system/agent/application/engine works by consuming events, performing some processing and eventually publishing an event as a result.

With that in mind:

Is it generally allowed for the event processing system to (re)publish the same event it consumes?

For the question to make sense we also have to agree that any event processing system publishes an event when it wants to notify the outside world of an activity.

 


Why CEP is so difficult?

March 31st, 2009 5 comments

A while ago I was asked why CEP (Complex Event Processing) is so difficult.

Or rather, why is it so hard to build your own generic and functionally complete CEP server?

I think one of the answers is that you need to do two conflicting things:

  1. Process events in memory only
  2. Store events and state in persistent storage

 

You need to do 1. because of your need for speed.

You need to do 2. in order to build a reliable and predictable system.

 

The magic blend of these two is required for most CEP systems. This is where you need better than average architects and developers to produce anything usable.

 

Example – Why..

Just a simple example to understand why you need both. Say you are looking for a sequence of events:

     A,B,C,D,E

You expect this sequence to occur during a seven day period. When you see event A, you start looking for the rest of the events.

Let’s say you saw events B and C after a couple of days. Then the server is rebooted suddenly; just quickly, but your system is killed unexpectedly.

When your event processor is brought back online again you need to remember that you have seen A, B and C and are looking for the next event in the sequence which is D.

This forces you to save state information in some kind of persistent storage (database). If you are looking for thousands and thousands of long event sequences you will quickly get really bad performance if you keep writing state information to the database constantly.
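
Here is a minimal sketch of that idea, assuming a single sequence and a JSON file standing in for the database. The `SequenceDetector` class and the file name are made up for illustration, and the seven day window is left out to keep it short.

```python
import json
import os
import tempfile


class SequenceDetector:
    """Hypothetical detector that checkpoints its position after every match."""

    def __init__(self, sequence, state_path):
        self.sequence = list(sequence)   # assumed to be non-empty
        self.state_path = state_path
        self.position = self._load()     # index of the next expected event

    def _load(self):
        try:
            with open(self.state_path) as f:
                return json.load(f)["position"]
        except FileNotFoundError:
            return 0                     # nothing saved yet: wait for the first event

    def _save(self):
        # Write to a temporary file and rename it, so that a crash in the
        # middle of a write cannot leave a half-written state file behind.
        tmp_path = self.state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"position": self.position}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.state_path)

    def on_event(self, name):
        """Advance on the expected event; return True when the sequence completes."""
        if name != self.sequence[self.position]:
            return False                 # not the event we are waiting for
        self.position += 1
        done = self.position == len(self.sequence)
        if done:
            self.position = 0            # start looking for a new sequence
        self._save()
        return done


state_file = os.path.join(tempfile.mkdtemp(), "sequence_state.json")

before = SequenceDetector("ABCDE", state_file)
for name in "ABC":
    before.on_event(name)
del before                               # simulate the sudden reboot

after = SequenceDetector("ABCDE", state_file)
resumed_at = after.position              # the detector still expects D
completed = [after.on_event(name) for name in "DE"]
```

Note that every matched event costs a synchronous write to disk here. Multiply that by thousands of sequences and you have the performance problem described above.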

 

When You Don’t Need Persistence

There’s a special type of event processing where you don’t need persistent storage.

A class of CEP solutions, where the problems can best be described as data-driven algorithms, rarely requires persistence. Tools like Coral8/Aleri, Esper and StreamBase are commonly used to do this kind of processing, which is commonly done using some kind of extension to the SQL concept. Typical uses that I have seen have mostly simple queries which constantly produce output. If there are patterns in the queries, they tend to be simple and short-lived.

If you only need this kind of stateless processing then you can get really high performance compared to systems like ruleCore, where persistence is a fundamental concept and can’t really be switched off.

If you are shopping for a complete event processing platform you should at least evaluate everything with full persistence switched on, provided that the CEP system offers real, usable persistence. Not all do…

 

 


Sybase builds on Coral8 source

March 16th, 2009 9 comments

So this is interesting. If I understand this correctly, Sybase now has the same engine as the merged Aleri+Coral8. Or at least one of Aleri’s engines; I think they have two engines which they plan to support.

It will be interesting to see how Aleri handles competition from Sybase, as they could now, at least theoretically, build the same kind of solutions.

So first Coral8 sells its source code to Sybase and then Aleri buys Coral8. Interesting times…

"Sybase is entering the Complex Event Processing business with the introduction of Sybase CEP, which is being offered – initially at least – as an option to its RAP – The Trading Edition capital markets analytics platform. The product results from the vendor’s licensing of the source code for Coral8’s CEP engine," – from here

 

 


On the Coral8 and Aleri Merger

March 12th, 2009 11 comments

So it seems that there’s now one less company in the event processing space.

Aleri bought Coral8, in what the vendors call a merger. One company less in the CEP space is not necessarily a bad thing, as the resulting Aleri is much larger, which can be a good thing.

If someone had asked me last week which companies in the CEP space would merge – Aleri and Coral8 would have been my last guess. My best guess would have been a dashboard-focused BAM company acquiring an event processing company to get a good CEP engine.

This is a total surprise to me. I like both these companies and their approach to CEP. Both places have lots of nice people working for them too.

But I really don’t get this merger. Someone somewhere has a plan which they fail to communicate. I read the official merger announcements and read a lot of comments on blogs and on the web. But I still don’t get the point, and I would think I know more of these companies and their technology than the average Joe Programmer out there.

I’m pretty sure that there were a lot of intelligent people doing lots of thinking before they agreed on this one. So my guess is that there’s a clever plan for all this somewhere. I just don’t get it.

What I see is (was) two companies with similar approach to event processing and products which did more or less the same thing – if we look at the big picture. So why merge two companies which are so similar? Maybe there’s something in their marketing approaches and existing customer base that makes this a good idea?

From a technical point of view, I think the new merged Aleri will have tons of work trying to figure out how to merge these two codebases. This is commonly incredibly difficult to do, taking into account company politics, developer egos, and the fact that the companies are far away from each other. Something like this can take years to complete and consume insane amounts of resources. So, how is Aleri going to keep up the speed needed while doing this merge?

What code to keep, what to throw away? At the same time keeping those few customers happy and avoiding internal politics.

But hopefully Coral8 and Aleri can show us how a merger is done in an efficient way.

As I see it, the real value of Coral8 is their CCL execution engine, not the portal and other fancy graphical stuff added recently. Execution of CCL is what made Coral8 rock. So buying Coral8 and not using it would be a total waste of effort. Getting a really good streaming data processing engine is more or less the only thing I see which would make this merger make sense. If, and now I’m speculating, the Coral8 engine was seen as superior to Aleri’s own, it would make a great base to build on. The Coral8 engine is really good at processing data streams and probably the best data stream processing engine out there. Maybe complemented with a Splash module from Aleri?

Just some speculations, I’m excited to see what comes out of this.

 

 


Coral8 and Aleri is now one…

March 9th, 2009 Comments off

Difference Between Context and Situation

March 3rd, 2009 10 comments

A common task in many event processing systems is to detect patterns of events.

If combined, these patterns will eventually form a situation consisting of multiple patterns over time.

So basically a detected instance of a situation is a specific sequence of events.

For example:

Situation definition:

order(door_id, command), order(door_id, command), error(door_id, msg), or

order(door_id, command), error(door_id, msg), order(door_id, command)

A specific instance of this situation could be:

order(123, close)
order(123, open)
error(123, open_fail)

Telling us that the door closed but the open failed… So far, pretty simple concept…

But what about an environment in which we have thousands of doors?

Do we want our situation definition to apply to all of them?

Most likely no… Here’s where the context management comes in.

 

The idea is to detect the situation in a very specific and limited context. The context restricts which events are considered when evaluating the situation detection process.

A context could be defined in a number of ways:

  • The most obvious restriction is that the events talk about the same door. Here it’s encoded in the first parameter, 123.
  • The time of the day could be a restricting factor. Consider only events between 7am and 5pm.
  • Include events in the context only if they bring us information about doors which we have a service agreement on.
  • Include only events from doors which are closer than 3 miles from Gunnar (the guy who will fix any problems).

 

As you can see from above, the definitions of the situation and context are independent. This is important as it allows the situation to be detected in multiple contexts.

Also a rule, maybe a decision rule (not CEP style activation rule), could be used to determine which context should be used or you could simply pair up a situation with a context.

So, simply:

Situation – A sequence of Events.

Context – Limits which events are considered for the situation
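To make the separation a bit more concrete, here is a minimal Python sketch of the door example. Everything in it (the `Event` tuple, the `same_door` context and the `detect` function) is made up for illustration and is not taken from any product. The situation is the pair of event sequences from the definition above, and the context is just a filter deciding which events the situation gets to see.

```python
from collections import namedtuple

# Hypothetical event shape for illustration: kind is "order" or "error".
Event = namedtuple("Event", ["kind", "door_id", "detail"])

# Situation: the two event sequences from the definition above.
SITUATION = [
    ("order", "order", "error"),
    ("order", "error", "order"),
]

def same_door(door_id):
    """Context: only consider events that talk about the given door."""
    return lambda event: event.door_id == door_id

def detect(events, context, situation=SITUATION):
    """Return True if the events admitted by the context contain one of
    the situation's sequences as a contiguous run."""
    kinds = tuple(e.kind for e in events if context(e))
    return any(
        kinds[i:i + len(pattern)] == pattern
        for pattern in situation
        for i in range(len(kinds) - len(pattern) + 1)
    )

stream = [
    Event("order", 123, "close"),
    Event("order", 456, "open"),
    Event("order", 123, "open"),
    Event("error", 123, "open_fail"),
]

print(detect(stream, same_door(123)))  # True
print(detect(stream, same_door(456)))  # False
```

The same `SITUATION` can be paired with any other context function (time of day, service agreement, distance to Gunnar) without touching the situation definition.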

 

 

 


Syndera – R.I.P.

February 18th, 2009 7 comments

It seems that Syndera – www.syndera.com – was completely bought off* by Tibco and is now shut down.  My earlier speculations about Syndera were right. So now we have one less CEP company alive.

Hopefully the rest of the gang is doing fine. My Google Alerts have recently brought me some good news on Aleri and Coral8 so they are hopefully doing fine.

 [Edit: Here's a nice summary of Syndera]

 

*)  What is amazing is that a company like Syndera, which actually has (had) an idea on how to make money and had a pretty good story on the value they were providing, is only worth $1M (that’s what Tibco bought them for). I just don’t get this. (I’m not supposed to either, so it’s ok.) Compare that to Twitter, which is valued at $250M. To me Twitter has more or less no apparent value and there seems to be no idea on how to make money. Still investors push money into it. Are the investors totally insane? They could as well burn their money on a nice bonfire. Hopefully there are some investors left who are not on acid. I have the same distinct feeling now as I had just before the .com bubble burst. How long before this web 2.0 bubble bursts? There seem to be tons of companies like Twitter getting VC money currently. All these will die soon if you ask me.

 

 


New Event Model

February 16th, 2009 2 comments

Soon (whatever that means to a software developer…) we are releasing a new feature set for ruleCore. 

Note – I said feature set, not a new version.

One of the things about delivering software as a service (SaaS) [or use your favorite term for this] is that we should not bother our users with things like versions and upgrades. Versions, patches, upgrades, hardware sizing and all that are a thing of the past. It’s sooo 2005.

If you are to deliver software as a service, sorry I mean as a Cloud (these things change names faster than I can type), then I think you should go all the way and do only services and nothing else. I’m starting to realize that Cloud software (if it is still called Cloud) must be designed differently from ordinary Box software. So it’s best to do either one and not mix them.

Anyway. To my point. In the new feature set we are introducing a new format for events. It’s something I’d like to get comments on.

Previously you did something like:

Event
  Header
    Fixed stuff
  Body
    Any valid XML

 

It’s the any valid XML part which is modified. So now you have:

Event
  Header
    Fixed stuff

  Body
     Property A: 'value 1'
     Property B: 'value 2'
     ...

 

 

The change might not look so big, but the implications of it are.

Instead of seeing the event body as a block of XML we now see it as a collection of named properties. The properties can be any valid XML but must be mapped to a property name.

The old approach is to write an XPath which gets us an anonymous value. This value is directly used in rule evaluation.

The new approach uses the properties in rule evaluation. The value of the property is found in the same old way with an XPath. The difference is that the XPath is defined in the event definition. So for each event definition we have a number of XPaths which are used to find the values of the named properties.
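As a rough illustration of the idea, here is a small Python sketch. This is not ruleCore syntax. The event layout, the property names and the `extract_properties` helper are all assumptions made up for this example. The point is only that the XPaths live in the event definition, and rules see named properties instead of anonymous values.

```python
import xml.etree.ElementTree as ET

# Hypothetical event definition: property name -> XPath into the event.
# ElementTree only supports a limited XPath subset, which is enough here.
VEHICLE_EVENT_DEFINITION = {
    "vehicle": "./Body/Vehicle/Id",
    "color": "./Body/Vehicle/Color",
}

def extract_properties(event_xml, definition):
    """Resolve each named property by running its XPath against the event."""
    root = ET.fromstring(event_xml.strip())
    properties = {}
    for name, xpath in definition.items():
        node = root.find(xpath)
        properties[name] = node.text if node is not None else None
    return properties

# Made-up event instance, only to exercise the definition above.
event_xml = """
<Event>
  <Header><Type>Location</Type></Header>
  <Body>
    <Vehicle><Id>truck-7</Id><Color>green</Color></Vehicle>
  </Body>
</Event>
"""

props = extract_properties(event_xml, VEHICLE_EVENT_DEFINITION)
print(props)  # {'vehicle': 'truck-7', 'color': 'green'}
```

A rule would then refer to `props["color"]` and never to the XPath itself, so the XML layout can change without touching the rules.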

Another interesting feature is that ruleCore knows how to derive a property from another one. So if you say something like "vehicle SameColor color", where vehicle and color are properties, ruleCore knows how to go from the vehicle property to the color property. Which means that you can apply different relations to vehicles and ruleCore knows how to look up the appropriate property. For example:

- Entity SameSize Entity
- Entity InFrontOf Entity
- Entity Inside Entity
 

Here the relation SameSize would need to find the Weight property of both entities and the InFrontOf relation would need to find the Location property. The Inside relation would need to find the location and bounding box of the entities in order to figure out if the left hand entity is inside the right one.
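One way to picture this is a lookup table from relation to the property it needs. The Python sketch below is only an assumption about how such a mapping could look. The `RELATIONS` table and the `holds` function are invented for illustration and say nothing about how ruleCore does it internally.

```python
# Hypothetical registry: each relation names the property it needs
# and how to compare that property between two entities.
RELATIONS = {
    "SameColor": ("color", lambda a, b: a == b),
    "SameSize": ("weight", lambda a, b: a == b),
}

def holds(relation, left, right):
    """Look up the property the relation depends on and compare it."""
    prop, compare = RELATIONS[relation]
    return compare(left[prop], right[prop])

# Made-up entities, represented as plain property dictionaries.
truck = {"color": "green", "weight": 12000}
van = {"color": "green", "weight": 3500}

print(holds("SameColor", truck, van))  # True
print(holds("SameSize", truck, van))   # False
```

The rule author only writes "Entity SameColor Entity". Which property to fetch is decided by the relation, not by the rule.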

This makes things a little more complicated to set up, but much more flexible and powerful.

The point is that the properties are not just names. They have well defined semantics. Semantics that ruleCore knows a lot of, and can do a lot of interesting stuff with.

For example:

Let’s say that a property tells us that this is an event with a new location of a truck.

RuleCore would know the actual semantics of real-world trucks and can, based on that information, assume a lot of things. For example, we can make assumptions about maximum speed, and we know that trucks cannot physically move faster than that speed.

So if we see a truck position in London at noon we know it’s an error to see it in Barcelona one hour later.
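A plausibility check like this is easy to sketch in Python. The speed limit below is an assumed, purely illustrative number and the function names are made up. The distance is the usual great-circle (haversine) approximation.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_TRUCK_SPEED_KMH = 120.0  # assumed upper bound, purely illustrative

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_plausible(prev, curr, max_speed_kmh=MAX_TRUCK_SPEED_KMH):
    """prev and curr are (timestamp, lat, lon) tuples for the same truck."""
    hours = (curr[0] - prev[0]).total_seconds() / 3600.0
    if hours <= 0:
        return False  # out-of-order or duplicate timestamps are suspicious too
    distance = haversine_km(prev[1], prev[2], curr[1], curr[2])
    return distance / hours <= max_speed_kmh

london = (datetime(2009, 2, 16, 12, 0), 51.5074, -0.1278)
barcelona = (datetime(2009, 2, 16, 13, 0), 41.3851, 2.1734)

print(is_plausible(london, barcelona))  # False
```

With the semantics attached to the property, a check like this does not have to be written by the rule author for every truck rule.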

 

Weaving in semantics in this way opens up a lot of interesting ways to create rules. If we know that one event is from a tractor and the other one is from the trailer, we can use the relation coupled to understand if the tractor and trailer are attached to each other. Or rather, should be attached to each other. If they are, then we can assume that they should, for example, travel in the same direction just a couple of meters from each other. If they are not, someone just stole the trailer.

These properties can also be used to create a very rich context in which to evaluate the situation detection.

We could for example have an event view (that’s what we call the context) which contains only

  "location update events from heavy vehicles currently northbound on route E20 which have not stopped for the last hour"

Since another of the new features is that ruleCore knows how to read a map, it actually knows which vehicles are on a certain road. Neat if you ask me.

So what I’m currently thinking about is whether this new functionality, and the extra work involved in defining rules, is worth the effort. What do you think? Does this make any sense to you?

Being from Scandinavia, I like simple and elegant things. This adds some complexity but might actually make things more elegant if we manage to get it right…

 

 


Another SQL Based Vendor

February 2nd, 2009 8 comments

There seems to be faith in the SQL based approach to stream processing. Here’s another vendor with a variation on the same concept: SQL stream. I wish them the best of luck!

Now we have a number of SQL based approaches. Quickly, can you tell the difference between Coral8, StreamBase, Aleri, Esper and Oracle’s tools (whatever brand name they have currently)? It seems that the language concept is more or less the same. I know, the vendors are unlikely to agree, citing features which are unique to them. That is certainly true for a specific problem for which one of these tools might be preferable over another.

But still, for the prospective customer, the difference is not in the languages as such, but in everything that surrounds them: the execution engines, tools, user interfaces, fancy dashboards. Not to forget soft factors such as quality of support and perceived company stability.

It’s interesting to see that so many vendors go for the SQL based approach, as there are many voices in the community that really don’t think SQL is a good idea for event processing. Some bloggers are a bit subtle about their opinions (company politics perhaps?) but others, more independent, are rather clear about where they stand. For example, David Luckham, the father of the whole complex event processing field, writes:

The features for defining event patterns have improved, although I do wish some of them would shovel SQL below decks and provide higher level event pattern definition languages!

I both agree and disagree.

First, the SQL based tools today are really good at what they are used for. And that is NOT event processing. It is marketed as such, but it’s actually DATA STREAM processing, which sounds similar but is not event processing. For stream processing purposes, SQL rocks! For event processing, it can be made to do the job. But not in a natural way.

Personally I think that you need a better way to do EVENT processing than SQL. Using a query/poll based (SQL) concept and tweaking that into submission to handle event processing might not be the best option for processing in an event driven world.

I’ll repeat: Data stream processing is NOT event stream processing!

But as always, the users are always right. If they prefer SQL, then SQL it is. We’ll see when the market is mature enough to start doing real event processing instead of today’s stream processing.

The weird thing is that in the early days, say 2004 or so, many of the vendors actually marketed their tools as data stream processing tools. Then somehow marketing took over and decided that they should market the same tools as event processing tools instead. So in many eyes they now have a lousy event processing tool instead of an incredible data stream processing tool. The products are more or less the same though. Only marketing changed as I can see it. Confusing.

Anyway, that’s in theory. Maybe the users really don’t care if the vendors call it CEP. The CEP label seems to be stuck to so many things that it is starting to mean "any kind of data processing which is not like traditional database processing".

 

 


Context Aware Event Processing

January 17th, 2009 6 comments

In real life it’s common to see how different messages get a varying meaning when presented in different contexts.

For example, a quote out of context can more or less be turned into whatever the messenger wants it to be. The same goes for pictures, where the photographer can create totally different pictures of a scene by simply placing it into a different context. In photography this is done by carefully selecting your angle, background and crop.

That brings us to a very interesting concept in event processing, namely context aware event processing. It’s one of the two major ingredients in event processing. The other is the art of detecting interesting combinations of events.

The context is apparent when you process static data. The data you query forms the context. Your SQL query is executed in the context of the data in the database. It’s obvious that the same query results in different sets of answers when executed against different data. In event processing, the context is a bit more complex concept.

Creating Context

In event processing we need to create a context ourselves. Normally an event processing system is operating on a stream of events. This has to be turned into a meaningful context. You could of course process events one by one, but that tends to be so simple that you won’t actually need specialized CEP software for that.

For example

Let’s say you would like to be informed about trucks which stay still for more than one minute.

Depending on the context, this could mean different things:

  • If detected on a large road with heavy traffic, it’s something that needs to be attended to immediately.
  • If detected inside a city during rush hour, it’s not too much to care about and might only lead to an informative message that the transport is late.

These two contexts give the same situation completely different meanings. The context can also be used to ignore the situation if it happens in a certain context, like inside company premises.

Context Management

The context can be viewed as your dynamically managed virtual view into the stream of inbound events. The context contains a subset of events from the stream which, at any point in time, are somehow relevant as a group to the situation you are detecting.

The events which make up your context are normally somehow related using a number of common properties. Some examples:

  • Time – Manage the events in the context based on time constraints. For example only events from the last hour or those between nine and five.
  • Location – Events which notify us about changes in entities located in certain geographical areas, or perhaps those which are close to each other.
  • Content – Event content such as primary keys of different kinds can be used to keep only semantically related events available.
  • Relation – Include only events which are related somehow.  Relations can be dynamic and change over time. Example relations are caused by, is sub class of, is super class of, belongs to same category and so on.
  • Entity – Include only events which are notifications about state change in the same business entity. For example the same person, truck, order, business process, customer or server.

Your Evaluation Context

So what’s the point of all this context definition and management?

Situation detection (aka pattern detection) and other processing is performed in this context! 

This means that the same situation can be defined independently of the context.

This allows the same situation to be evaluated against different contexts, which provides a modular and powerful processing model.

In order for this to work smoothly, the context must be treated as a first class citizen in the event processing model used. A context should be defined totally independently of any other processing which is performed using it.

When a situation is detected it is in turn used as a context for any action you like to perform as a result of the detection. This new context, together with the original context in which the situation was detected, provides a good base for creating meaningful result events.

The context is very important when reporting about a detected situation. In the truck stop example above, it would not be terribly valuable to know "A truck stopped". You need more context! At least the id of the truck and the location and perhaps road number where it stopped and the driver’s name. Nice additions would also be how long it stood still and perhaps its movements during the last couple of minutes. All this information is picked up from the context while assembling the response.

All this context in the outbound (or whatever method your event processor uses) action event can be used to initiate a business process to handle the exceptional situation.

Example

I think most event processing tools have some kind of context management available or at least functionality to create and manage your own context. I’ll use Coral8 and ruleCore here as examples just because I know them best and don’t need to look up how they work. Maybe you can add more examples?

In Coral8 there’s a concept called windows. Normally each query contains a window which the query is executed against. There’s an extension of this in the later versions called named windows, which are defined independently of the query. These named windows can be used to hold a context. A number of queries can write data into them and they can also be read by multiple queries. The named window can be maintained using CCL, and thus there’s a very powerful and flexible way of deciding what events are present in the named window. For example (copied from the Coral8 docs):

CREATE WINDOW WindowTradesMicrosoft
SCHEMA 'WindowTradesMSFT.ccs'
KEEP 3 ROWS;

INSERT INTO WindowTradesMicrosoft
SELECT volume, price
FROM StockTrades
WHERE symbol = 'MSFT';

INSERT INTO AvgPriceMicrosoft
SELECT AVG(price)
FROM WindowTradesMicrosoft;
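
For readers who don’t speak CCL, roughly the same idea can be sketched in a few lines of Python. This is only an analogy and says nothing about how Coral8 actually executes the statements. The `on_trade` function is a made-up name, and the bounded deque plays the role of the named window with KEEP 3 ROWS.

```python
from collections import deque

# Rough analogue of the named window: keep only the last three MSFT trades.
window_trades_microsoft = deque(maxlen=3)

def on_trade(symbol, volume, price):
    """Insert matching trades into the window and return the current
    average price over the window, or None for non-matching trades."""
    if symbol != "MSFT":
        return None
    window_trades_microsoft.append((volume, price))
    prices = [p for _, p in window_trades_microsoft]
    return sum(prices) / len(prices)

trades = [
    ("MSFT", 100, 10.0),
    ("IBM", 50, 99.0),
    ("MSFT", 100, 20.0),
    ("MSFT", 100, 30.0),
    ("MSFT", 100, 40.0),
]

for symbol, volume, price in trades:
    print(symbol, on_trade(symbol, volume, price))
# The last line printed is: MSFT 30.0
```

Once the fourth MSFT trade arrives the oldest one falls out of the window, so the average only covers the three most recent trades.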

 

In ruleCore the context is called event stream views. A view is defined to include only events with certain properties. For example that they come from the same geographic zone or that they are all about the same entity. For example you could write something like:

 

<View name="Interesting Vehicles according to XYZ">

    <Type>
        <Event>Location</Event>
        <Event>X1</Event>
    </Type>


   <ZoneMatch/>


    <PropertyMatch property="Color"/>


    <MaxAge>00:10:00</MaxAge>


</View>

 

This creates a view (context) which contains events of type Location and X1 from the same zone with the same color during the last ten minutes.

In ruleCore we use a declarative approach to context management. In the view definition you specify what properties the events must have in order for them to be present in the view.
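To show what "declarative" means here, this is a small Python sketch of a view that is described only by the properties its events must have. It is not how ruleCore is implemented. The `View` class and its fields are invented for this example, zone matching is left out, and "same color" is simplified so that the first event in the view decides the color.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class View:
    """Hypothetical declarative view: events stay in the view only while
    they satisfy the definition (type, shared property, maximum age)."""
    event_types: set
    match_property: str
    max_age: timedelta
    events: list = field(default_factory=list)

    def offer(self, event, now):
        """Drop expired events, then admit the new event if it fits."""
        self.events = [e for e in self.events if now - e["time"] <= self.max_age]
        if event["type"] not in self.event_types:
            return False
        if now - event["time"] > self.max_age:
            return False
        if self.events and event[self.match_property] != self.events[0][self.match_property]:
            return False
        self.events.append(event)
        return True

t0 = datetime(2009, 1, 17, 12, 0)
view = View({"Location", "X1"}, "color", timedelta(minutes=10))

print(view.offer({"type": "Location", "color": "green", "time": t0}, now=t0))  # True
print(view.offer({"type": "Order", "color": "green", "time": t0}, now=t0))     # False
```

The definition never says how or when to remove events. Expiry and matching follow from the declared properties.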

Other event processing tools might also give the user the option to maintain the context using a procedural approach. You could basically have a small piece of code (or other procedural logic) which updates the view continuously. Using some kind of pseudo code from a fictional event processing system this could look like:

if event.type = "Location" and someComplexFunction(event)
    context.insert(event)

if event.color = "Green" and databaseLookUp(event)
    context.insert(event)

if event.color = "Red" and event.entity = "server1"
    context.insertOrReplace(event)

 

 

Rich context management is one of the things I think sets event processing apart from traditional data processing and you should spend some time in getting it right when designing an event processing system or using one!

 

(When put into context, the boring picture of the Swedish "no parking sign" gets a bit more interesting with the two cars parked in front of it. Also, published in the context of a blog written in English it can be a bit amusing too. (Utfart is Swedish for Exit, nothing else…))

 

 

 

 


Where’s CEP heading?

January 16th, 2009 Comments off

Last night some parts of Sweden had a blistering temperature of -38°C (-36°F). A pretty cold night by any standards.

If you like Swedish high-tech take a look at Boston Power. Probably the world’s best batteries, invented by Christina from Sweden.

Sadly Sweden lacks the start-up culture, so many companies do just like Boston Power: they invent here in Sweden and then move to the US to build a company out of the innovation.  Sweden pays the bills for education and research and the US gets the profits from the company building effort. There are a number of other companies in the US that look American but are actually Swedish.

I have seen a worrying trend in the CEP marketplace recently. The replication of the efforts done by companies which have produced message oriented middleware for integration purposes.

I took a quick look at the current vendors in this space and many of the advertised features compete directly with other message oriented products. Features like event routing/dispatching and transformation look just like a variation of the good old message oriented stuff known to everyone in the integration (EAI) world.

I think that those vendors which fail to realize that this is a problem already solved will have a hard time selling their products.

I asked a tech sales guy what the difference was between their old and widely used business integration tool and the new fancy event product. The products I asked about even look similar, and you can do most of the stuff in the old tool, but of course there everything is called messages and not events. The answer? Well, he was not sure exactly what the difference was once I started to ask about the details. The marketing of the products is certainly different, but for any practical purposes I saw no point in investing in the new tool, apart from the cool factor of having an event processing tool.

The other problem I see is that many vendors position their products in a way so that they are evaluated against traditional databases, BI, DW and other traditional tools. There’s no feeling of "you really, really, really can’t do this with your traditional tools". It works fine for those targeting the financial sector as there the pain of using the traditional tools is just too great.

So, are CEP and CEP vendors doomed? No, not at all. A couple of pure play CEP vendors will survive this crisis and CEP features will find their way into the traditional tools from many vendors. For example, the gains of using CEP to solve finance problems seem to be huge. Suddenly you can solve problems which are unsolvable without CEP. Those who get this will be rich …

 


StreamBase Gets Cash

January 10th, 2009 8 comments

It seems that there’s at least someone out there who thinks that event processing is a good place to invest money in.

StreamBase apparently got about $6M of Series D funding from Battery Ventures, Accel Partners, Bessemer Venture Partners, Highland Capital Partners and In-Q-Tel.

This means that StreamBase has received nearly $50M in funding since its start. I suppose VCs like to have something like a 10x return on their investment, so someday StreamBase needs to sell an awful lot of expensive licenses. Or perhaps somebody will realize that StreamBase is good value for money at $500M, instead of giving someone else those $500M to build competing software…

Being a systems programmer, I have no idea if this is a good thing. In one way it suggests that StreamBase does not sell enough to keep things running on licensing fees. On the other hand it could be seen as a good sign that a number of companies are investing in CEP. I suppose there are a number of brilliant thinkers at these companies who are sure, to the tune of $50M, that there will eventually be a healthy CEP business. Or they know something I don’t, which is rather probable.

Anyway, congratulations StreamBase! I’m pretty sure they are on track.

 

 

 

 


The State of the CEP

December 30th, 2008 1 comment

Another year has passed in the world of Complex Event Processing and everything looks great, or does it?

The CEP Market

I have followed the CEP business for a while now. In 2001, the early days of CEP, when we started with ruleCore, there was basically nothing called CEP. Then suddenly I started to hear this talk about Event Stream Processing, and after a while it changed to Complex Event Processing (CEP). About the same time, marketing departments (basically the founders plus friends) of a handful of startups went nuts. Bold claims were published almost daily and "we are faster than you" was the typical marketing message for several years.

This year has been rather quiet from a marketing perspective and the most balanced year so far. Very few bold claims of being first, being fastest or having the coolest features. Mostly rather boring press releases about new versions and some talk about new partnerships and customer stories.

Maybe the CEP companies started to use professional PR firms? Probably a good idea. But a bit boring, I miss the performance claim war and the personal touch of marketing from startups in their early days.

Also the marketing machines of IBM, Tibco, Progress and Oracle now have an event processing solution to deal with. It seems that event processing is hiding somewhere inside these giants below layers of SOA, BPM and other more mainstream technologies. Only to be seen occasionally. But I’m expecting to see more from these giants during 2009, I think they are just warming up and trying to understand what to do with this new thing called event processing. Maybe we will even see some announcements from Microsoft? (who will they buy to get up to speed?)

It seems that finance is the dominant market these vendors go after. Others, like Coral8, have started to broaden their message and talk more about applying event processing to business intelligence problems. Especially Coral8 seems to be heading in the right direction. It will be nice to see how their new strategy works, if they don’t get acquired by someone first.

There is also marketing activity from other vendors attempting to get the attention of telco, CRM, fraud detection and logistics users. It seems that everyone is still trying to figure out where to go and exactly what the best marketing message looks like.

I think 2009 will be a defining year for CEP and most vendors will discover where event processing solutions are most appreciated by the users. During 2009 I would expect most vendors to follow the money and go for sectors where they can quickly increase their revenue. In 2009 it’s time to start making money in the CEP business.

The SQL Based Approach

During 2008 we have seen a number of products getting more advanced and powerful, all based on the same idea that streams should be queried using the same type of queries which are used to query data at rest.

Most of these products have some kind of textual language which looks similar to SQL. A couple have graphical variations on the theme in the connect-the-boxes-with-arrows style. These vendors have added a lot of features to their own SQL-derived language in order to make it work with streams of data. Most languages are powerful and can be used to create complex systems with some ease. These SQL dialects can be found in products from Coral8, Esper, BEA WebLogic Event Server, StreamBase and Aleri.

Personally I see that these SQL based products are great tools for creating data driven algorithms and performing computations on all kinds of data feeds. It’s no surprise that these vendors are targeting the financial markets. It seems that all vendors in this camp have had some success in attracting some customers from the financial sector.

It will be interesting to see what happens if (when?) the major RDBMS vendors announce the ability to register SQL queries in their databases and run them continuously. When this happens, the CEP vendors with SQL based products might have a hard time in making a point for a separate product for processing data streams. Their products will just look too similar to the newly added streaming SQL (or whatever the RDBMS vendors will call that new feature).

My guess is that we will see announcements from at least one major database vendor in 2009 with features for processing streaming data and continuously executing queries. It might not be a bold guess that Oracle will be the first one implementing something like this and then followed by the regular open source copy cats. When this happens I would not be surprised if some of these CEP vendors re-invent themselves as solution providers for the financial industry. After all, they have a great tool for solving problems in that domain.

Rules

A number of vendors have chosen a rule based approach to complex event processing. The ruleCore CEP Server, IBM Business Events and TIBCO BusinessEvents, Progress Apama and Agent Logic all use some form of rules.

In this category we have as many approaches to event processing as there are vendors. Depending on the background of the vendor you can see different aspects highlighted in the various products.

RuleCore provides a foundation for adding reactivity to event-driven SOA, with location aware event processing as its speciality. True to its heritage, Progress Apama talks mostly about algo trading and smart order routing but is finding its way into other areas like Telco and RFID. Agent Logic seems to focus on the GUI (Can we see CIA/NSA operatives busy with creating new rules each day?) in order to give the spooks a great tool to listen in on various telco trunks (I made this up, but if I were NSA/CIA that’s what I would use it for).

TIBCO and IBM feel a lot like the traditional message broker tools where you control the flow of events and take decisions depending on state. If you have been working with traditional message brokering tools you might wonder what the excitement is all about, as this sounds like old tools with a new name: now they are called events instead of messages.

Aleri has a combination of a SQL based language and a special purpose stream processing language called SPLASH, which I will continue to watch with interest.

At ruleCore we have spent the whole year in improving the product by adding support for location (geospatial) aware rules. Basically native support for GPS events. So now we can easily create rules which track vehicles and detect deviations from expected behavior or security violations.

Geofencing is now really easy to do using Google Maps and the new support for geospatial rules. We also improved the context aware rule evaluation features and have now a really nice declarative rule evaluation model for tracking business entities.

Others

Apart from the vendors I already mentioned there are also a small number of other vendors that I have not heard much about this year. Not sure if they are silent or just not on my radar screen due to lack of time. I wish I had more time to follow all the interesting companies doing event processing out there. There are Event Zero, Pion, Senactive, WestGlobal, RTM and maybe others which I can’t remember right now. A quick look at their web pages indicates that there is at least some activity going on and lots of ambition to bring event processing to the masses.

Customers?

So, how many customers are there out there using any of these CEP products?

I hate to sound pessimistic. But from a distance, I don’t see that many customers actually using CEP. I’m sure many of the vendors have a number of customers which might keep them busy for a while. But I think we could still fit all CEP users into a single room. Not sure if the small number of customers is a problem. There are not many CEP vendors to begin with, so even a small group of customers can keep us running.

If you explain the CEP concept to a group of developers, I’m sure many of them go "hey, that’s something we’ve been doing for a long time". So I think there are many developers who are "doing CEP" but not actually using a tool specially made for event processing. The normal Java, EJB and .NET tools seem to work for now. They will all eventually start to look at "real" CEP tools.

Community

In 2008 we saw a lot of activity in the Event Processing Technical Society. Its mission is

"To promote understanding and advancement of Event Processing technologies, to assist in the development of Standards to ensure long-term growth, and to provide a cooperative and inclusive environment for communication and learning".

It was founded by 29 companies which more or less define the whole event processing world as we see it today. During 2009 there will be lots of interesting activity in a number of workgroups. Interoperability is the one I’m going to watch closely and hopefully have time to take part in.

The Future

A couple of years ago, around 1998, a company called iSpheres developed a CEP product. Things were going fine, and then management came and shared their infinite wisdom and called for a vertical marketing approach. Energy trading was the silver bullet! Energy in the form of electricity can’t be stored, so it has to be traded in real time. So all forces were focused on solutions for trading electricity. Then came Enron…

Back to 2008 and the era of financial meltdown. Look at the current crop of CEP companies focusing on algorithmic trading and other solutions for the financial industry. Do I even need to point out the obvious here?

Some of the CEP companies will probably have a hard time during 2009. I suspect that at least a couple will go belly up, others will be acquired to avoid that fate and many others will have to fight for venture capital.

If I put on the customer/user hat for a while, I still can’t see a convincing message from many of the major CEP vendors. Most users that I talk to just don’t get it either. It is not clear to them why they should invest time and money in these products.

An informal survey that I have made clearly shows what I have suspected: there is simply no demand on a larger scale for event processing! Most users just conclude that the current generation of tools, like their favorite SQL database, application servers and programming environments, can do the job. I’m not saying that there is anything wrong with the current CEP tools. It’s just that very few feel that they have a need for this kind of tool.

The major challenge for the event processing vendors in 2009 will be to convince the users that they need a completely new technology to solve streaming problems. And yes, most users would benefit from stopping development of their own streaming solutions and buying one instead. Surely most vendors will win customers during 2009, but at a high cost and only after many sales calls and visits.

But I’m still optimistic. Event processing is like a heavy train with an undersized engine. It will take a while to accelerate, but when it has gained speed, it’s hard to stop. (Here I thought of inserting a joke about our state-controlled railroads, but it didn’t seem fair to kick someone who is already lying down.)

The products of many CEP vendors have shown great improvements in 2008 and they start to look really good. So there’s nothing wrong with the technology.

Let’s see if the customers will find the world of CEP during 2009…

(On an unrelated note, I would like to add a bold prediction that during 2009 we will see something like the 2000 dot-com bubble burst. There are just too many companies with very hard to understand business ideas. They all seem to have in common that everything is given away for free and revenue should be made by some other clever and complex scheme like advertising, services, or whatever. Sites like Google and Facebook will be the first ones to be hit by sudden problems and will drag most of the social-something and web2.0-something companies with them, creating the second IT crisis. Everyone with a business idea (that is, how to make money) that I can’t understand in 10 seconds will go down.)

 

 

 


Events, Observations and Vehicle Location

December 25th, 2008 5 comments

I got some good comments from my readers on my last post on event identity; some of them can be found in its comment section.

I must have been a bit tired when writing that last post, as I complicated things when I started to think about event IDs. But thanks to my helpful readers, I’m now back on track again after a couple of nights of sleep and lots of xmas food high on carbs and fat…

A good way to think about event id seems to be this:

The event is a notification about an observation,

the observation observes a real-world event,

which can be observed by multiple observers,

thus multiple event objects can contain information about the same real world event…

 

With this in mind the event identity is nothing more than the identity of the observation.

The problem I started to think about in the last post is still a real problem, but it happens, so to speak, at the next level.

What if we want our event processing rules to see only one copy of each real-world event? This would simplify the creation of rules in a good way. This kind of event fusion capability is something that would be a great feature in the event processing platform you are using…

 

Let’s look at an example…

 

To continue with our vehicle tracking example: we are doing lots of location-aware stuff with ruleCore right now, so that’s why I’m stuck using vehicles in my examples all the time. But the problem is the same regardless of what you are tracking.

A common thing one does with trucks is to wait for them. So a typical reaction rule in ruleCore would be to detect the arrival of a truck, so you can do something else while waiting for it.

Optimally, the rule should be simple to write: 

"Trigger rule when vehicle enters zone A".

In a complex environment things are not that easy. We could detect the fact that a vehicle has entered zone A using a number of methods:

  1. Somebody scans the vehicle or its cargo using an RFID scanner which is known to be located inside zone A.
  2. Security notes that a vehicle has passed the gates.
  3. The driver calls in and says he’s arriving in a few minutes.
  4. We note that the driver’s cell phone is inside zone A, using the cell network’s positioning mechanism.
  5. And the obvious one – The GPS of the truck sends a location update from inside zone A.

Just in this simple example we have five types of events which could be used to determine that a vehicle is inside zone A. To make life easier for the rule designer, we would like to provide these five events as a unified ZoneEntry(ZoneId, VehicleID) event instead.

There are probably a number of solutions for this. Currently in ruleCore we use rule hierarchies to solve this. We have domain specific rules at lower level feeding rules at higher level with these high level events.
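One way to picture the rule hierarchy idea is as a small translation layer that sits below the business rules. The sketch below is in Python rather than ruleCore’s rule language, and the low-level event type names and field names (RfidScan, scanner_zone and so on) are made up for illustration. It also assumes each source has already resolved its observation to a zone id, which for a GPS update would require a geofence lookup first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ZoneEntry:
    """Unified high-level event seen by the business rules."""
    zone_id: str
    vehicle_id: str
    source: str  # which kind of low-level observation produced it


# Hypothetical low-level event types, one per detection method listed above.
# Each entry names the fields that hold the zone and the vehicle.
FIELD_MAP = {
    "RfidScan": ("scanner_zone", "vehicle"),
    "GatePassage": ("gate_zone", "plate"),
    "DriverCallIn": ("destination_zone", "vehicle"),
    "CellPosition": ("zone", "driver_vehicle"),
    "GpsUpdate": ("zone", "vehicle"),
}


def to_zone_entry(event: dict) -> Optional[ZoneEntry]:
    """Translate one low-level event into a ZoneEntry, or None if unknown."""
    fields = FIELD_MAP.get(event.get("type"))
    if fields is None:
        return None
    zone_field, vehicle_field = fields
    return ZoneEntry(event[zone_field], event[vehicle_field], event["type"])


raw_events = [
    {"type": "RfidScan", "scanner_zone": "A", "vehicle": "AND-239"},
    {"type": "GpsUpdate", "zone": "A", "vehicle": "AND-239"},
    {"type": "Heartbeat"},  # not a zone observation, so it is dropped
]
entries = [e for e in map(to_zone_entry, raw_events) if e is not None]
for entry in entries:
    print(entry)
```

The higher-level rule then only needs to know about ZoneEntry. Note that this still produces one ZoneEntry per observation, so the same arrival can show up more than once.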

But I really would like to find a generic solution for this one.

I think the military-rooted sensor information fusion world has lots of good ideas on this one. Stay tuned…

 


Event Models – Identity

December 22nd, 2008 8 comments

A couple of days ago I wrote a bit about event models and tried to sort out what the fundamental properties of an event are.

One of those properties seems to be the identity of an event. This property is something I see as the major defining characteristic of event processing: the thing that sets event processing apart from mere data processing.

Normally data, like tuples in a database, is anonymous. If you insert the value 123 into a database, you can’t distinguish it from another 123 in the same database. The data all looks the same. But events report on real-world events and should be unique.

Each event (instance) should have its own universally unique id!

Technically this is very easy. Just let someone, commonly the event generator, assign a UUID to the event.

But, what’s in an event id?

First of all, if we add the requirement (which I think we should) that events are create-once-never-change type of things, it is enough in many cases to use the identity of an event in place of the event itself. Events can be replicated and distributed rather easily and found using their id as the search key. Basically we can store info such as “Event with id 1939 occurred because of event with id 1914” without storing the events themselves.
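A minimal sketch of this idea in Python, assuming a simple in-memory store. The function names, the dictionary layout and the event types are illustrative only and not how ruleCore does it.

```python
import uuid
from types import MappingProxyType

event_store = {}   # event id -> event; could live on any replica
caused_by = {}     # effect id -> list of cause ids, no event bodies needed


def new_event(event_type, payload):
    """Create a read-only event with a fresh UUID and register it by id."""
    event = MappingProxyType(
        {"id": str(uuid.uuid4()), "type": event_type, "payload": payload}
    )
    event_store[event["id"]] = event
    return event


def record_cause(effect_id, cause_id):
    """Remember that one event was caused by another, using ids only."""
    caused_by.setdefault(effect_id, []).append(cause_id)


radar = new_event("RadarData", "145 km/h")
violation = new_event("SpeedingViolation", "AND-239")
record_cause(violation["id"], radar["id"])

# Resolve the causes of the violation using nothing but ids.
causes = [event_store[i]["type"] for i in caused_by[violation["id"]]]
print(causes)

try:
    violation["type"] = "Something else"
except TypeError:
    print("events are read-only after creation")
```

Because the causality table holds only ids, it can be kept in a different place than the events themselves.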

I tend to think of events as “Notification about state change in some (business) entity”. With this in mind, the event id would be the id of a particular state change.

And here’s where things get tricky…

Let’s say we have the event:

  • Type: Collision
  • Payload: Vehicles that collided

To complicate things, or rather make them more realistic, let’s say this event can be reported by a number of event generators. This could happen, for example, because the collision is detected by a bystander, a surveillance camera and the vehicles themselves at the same time.

So we have at least four different entities which detect the same collision.

Now, should all the entities generate an event with the same identity? After all, they are all creating a notification about the same thing?

If multiple event generators observe and report about the same event occurrence, they would obviously need to be coordinated if we require that they generate events with the same id. This is not practical or maybe not even possible. So we need some other solution for this. (hint: multi event fusion)

What if the different events describe a slightly different aspect of the collision? It’s still the same collision.

When creating event processing logic, like rules in ruleCore, you want to be shielded from this problem. The rules should see only one collision! Otherwise you will easily end up with rules triggering multiple times, and you suddenly have reports about multiple collisions when in reality there has been only one.

Something as simple as an event id can have rather complex semantics, and you should be aware of this when designing your event model.
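To make the problem concrete, here is a rough Python sketch of one possible fusion step. The matching rule is purely an assumption for illustration: two observations are treated as the same collision if they name the same vehicles and lie within 30 seconds of the first observation in the group. Real fusion logic would need to be much more careful.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    observation_id: str   # identity of this particular report
    source: str           # who observed it
    vehicles: frozenset   # vehicles involved in the collision
    timestamp: float      # seconds since some epoch


def fuse(observations, window=30.0):
    """Group observations that appear to describe the same collision."""
    collisions = []  # each item is the list of observations of one collision
    for obs in sorted(observations, key=lambda o: o.timestamp):
        for group in collisions:
            first = group[0]
            if (first.vehicles == obs.vehicles
                    and obs.timestamp - first.timestamp <= window):
                group.append(obs)
                break
        else:
            # No existing collision matched, so this is a new one.
            collisions.append([obs])
    return collisions


cars = frozenset({"AND-239", "XYZ-001"})  # made-up vehicle ids
observations = [
    Observation("o1", "bystander", cars, 100.0),
    Observation("o2", "camera", cars, 101.5),
    Observation("o3", "vehicle AND-239", cars, 100.2),
    Observation("o4", "camera", cars, 5000.0),  # a later, separate collision
]
groups = fuse(observations)
print(len(groups), "collisions from", len(observations), "observations")
```

Each observation keeps its own id, while the rules would be fed one collision per group.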

Or is it just me complicating things?

 

 

 


Let’s Dance – Diskotanssi

December 19th, 2008 2 comments

 

While doing some googling on Erlang I found an interesting use of it: http://discoproject.org/, a project from Nokia.

It combines two completely different languages: Erlang and Python.

It also does this in a manner that is rather clever: Use Erlang for the overall picture and Python to do the actual work.

It seems like a nice way to do heavy crunching using MapReduce in a cloud like Amazon’s elastic computing cloud. Although not directly suitable for event processing, it seems like a nice way to deal with huge amounts of data. Maybe for offline event data analysis?

 


Erlang – Take 1

December 18th, 2008 1 comment

Now I have had some time to look at Erlang from the event processing perspective.

The whole language feels a bit weird. That is, it’s not like any of the "normal" languages like Java, C#, C or Python. The first reaction is: how can you ever accomplish anything in this? But after a couple of hours of reading and setting your programming mind into a different mode, the zen of Erlang starts to appear.

First of all – The language seems to be very small. Basically, being a functional language, everything is done using functions. I have not yet found any loops, if-then-else or other more traditional language constructs.

What surprised me is that I actually know some Erlang already! We do lots of coding in Python and use a number of Python’s functional programming features. These seem more or less inspired by Erlang! Even the syntax looks the same in a few places.

If you look past the rather outdated Erlang syntax and just accept a number of "odd" features like the fact that variables can be set only once, there are a number of interesting features built right into the language.

From the event processing view, one of the most interesting features of Erlang is that it favors designing your applications around a large number of processes. Basically every function you call can be run as a process, which allows for a different way of thinking about concurrency. It seems that starting 100k or so processes is not a big deal; compare this with the number of threads you can start in a conventional system.

Erlang marketing says that 99.9999999% uptime can be achieved. Which you really can’t say about a normal piece of software. Maybe these telco guys know something we other business software programmers don’t.

Everything about Erlang feels a bit 1980, but there are some fundamental concepts which seem to be usable in the new world of cheap servers where everyone is looking for better ways to get that free lunch for scalability…

A simple example of what Erlang code looks like:

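-module(ipctest).
-export([pingpong/0, consumer/0]).

%% Measure message round trips per second against a consumer process.
pingpong() ->
    N = 100000,
    Pid = spawn(ipctest, consumer, []),
    Start = erlang:now(),
    Message = {ping, self()},
    dotimes(N, fun () ->
               Pid ! Message,
               receive pong -> ok end
           end),
    Stop = erlang:now(),
    N / time_diff(Start, Stop).

%% Answer every ping with a pong until a done message arrives.
consumer() ->
    receive
    message -> consumer();
    {done, Pid} -> Pid ! ok;
    {ping, Pid} ->
        Pid ! pong,
        consumer()
    end.

%% The two helpers below were not shown in the original snippet;
%% these are assumed implementations.
dotimes(0, _F) -> ok;
dotimes(N, F) -> F(), dotimes(N - 1, F).

%% Elapsed time in seconds between two erlang:now() timestamps.
time_diff(Start, Stop) -> timer:now_diff(Stop, Start) / 1000000.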


Event Model – What Are We Looking For?

December 16th, 2008 2 comments

Today I woke up to the most unexpected news: there’s been an earthquake here in Sweden! For many of you this might not sound like something to get excited about. But considering that the last one was ages ago and this one was the biggest in a hundred years, I’m for sure a bit excited.

Sweden is normally the most stable place on earth, geographically, weatherwise and politically (I wonder if there’s a correlation between these…). So every little blizzard or other act of nature is a big thing here.

I have been writing lately about event models and event semantics. But I think it’s time to step back a bit and try to define our goals with an event model.

I’ll start with the format of events; this is easier than semantics…

When we designed the event format for ruleCore we had a couple of things in mind and I think these are rather generic in nature:

  • First, use XML. For better or worse we are stuck with XML. There are so many tools and standards related to XML that it’s a good choice.
  • There should be a small number of required elements and attributes which must exist in all events. These are common to all events.
  • The inbound events should contain a user-defined part which is as flexible as possible.
  • Both the common and the user-defined part should be verifiable against a schema.
  • Events are immutable, so they can’t be changed after creation. But for practical purposes systems might want to add meta-data to an event. Make sure there’s a well-defined part to put all kinds of data without breaking anything.
  • It is very common to search for the event type of an inbound event in order to find its definition, so the type should be easy to locate.

 

I’ll use an example to show what we ended up with. This might serve as a base for your own event format or model:

<SpeedingViolation
  rc:eventTimestamp="2009-11-21T13:35:16.398+01:00"
  rc:eventId="c4e00004-abb1-4bee-8c54-8efbb1a5178b"
  xmlns="http://www.rulecore.com/2008/user"
  xmlns:rc="http://www.rulecore.com/2008/base">

  <rc:EventHeader>

    <rc:SecurityInfo>
      <rc:Credentials>3536ab16-393b-4447-a892-0a3e161f23a6</rc:Credentials>
    </rc:SecurityInfo>

    <rc:CausedBy>
      <rc:Event eventId="87b353e6-765befcda05c" eventType="CameraCptr" index="1"/>
      <rc:Event eventId="0e04849e-39c1db18b5a1" eventType="RadarData" index="2"/>
    </rc:CausedBy>

  </rc:EventHeader>

  <rc:EventBody>

    <Road>Route 293</Road>
    <SpeedLimit>120</SpeedLimit>
    <Speed>145</Speed>
    <Vehicle>AND-239</Vehicle>

  </rc:EventBody>

</SpeedingViolation>

You can just ignore all the namespace stuff for now, that’s left for the advanced ruleCore course…

As you can see there are three major parts in the event:

  1. The root element with some attributes
  2. The EventHeader
  3. The EventBody

The idea here is that the root element gives away the type of the event. This allows for easy validation against a schema and lets us find the event type in a well-defined location. It’s always the root element, so it is easy to find using XPath.

Attributes – The attributes of the root element are the fundamental attributes of the event. They must be there for all events. We have chosen to have the timestamp and id of the event as the required attributes.

Header – The event header contains meta-data. It can be used for basically anything that is practical to attach to the event.

In this example we have security information, which is completely system dependent and only used by the ruleCore CEP Server hosted service. Other systems might use the header for other system info too. By clever use of namespaces it is possible to allow systems to add elements into the header without any risk of breaking things.

The CausedBy element is also meta-data which is convenient to attach to the event. There’s nothing that prevents this information from being stored in an external place. But we found it practical to keep this information with the event too. Here any system can find information about the events which caused this event to happen.

Body – The EventBody is the third and perhaps most interesting part of the event. This is the user defined payload. By smart design of the XML Schemas you can verify the contents of the body against a schema too. By proper namespace usage the content here will not clash with anything else either.

In this example the root element tells us that this is a speeding violation event, and the body contains information about the violation: the id of the vehicle, its speed and so on.
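To show how the three parts can be picked apart by a consumer, here is a small Python sketch using the standard library’s ElementTree. It is not ruleCore code, just one way to read the format; the event is the same as above, with the SecurityInfo element left out to keep it short.

```python
import xml.etree.ElementTree as ET

RC = "http://www.rulecore.com/2008/base"

EVENT_XML = """\
<SpeedingViolation
  rc:eventTimestamp="2009-11-21T13:35:16.398+01:00"
  rc:eventId="c4e00004-abb1-4bee-8c54-8efbb1a5178b"
  xmlns="http://www.rulecore.com/2008/user"
  xmlns:rc="http://www.rulecore.com/2008/base">
  <rc:EventHeader>
    <rc:CausedBy>
      <rc:Event eventId="87b353e6-765befcda05c" eventType="CameraCptr" index="1"/>
      <rc:Event eventId="0e04849e-39c1db18b5a1" eventType="RadarData" index="2"/>
    </rc:CausedBy>
  </rc:EventHeader>
  <rc:EventBody>
    <Road>Route 293</Road>
    <SpeedLimit>120</SpeedLimit>
    <Speed>145</Speed>
    <Vehicle>AND-239</Vehicle>
  </rc:EventBody>
</SpeedingViolation>
"""

root = ET.fromstring(EVENT_XML)

# ElementTree spells a qualified name as "{namespace}local-name".
event_type = root.tag.split("}", 1)[1]
event_id = root.get(f"{{{RC}}}eventId")
timestamp = root.get(f"{{{RC}}}eventTimestamp")

ns = {"rc": RC}
causes = [
    e.get("eventType")
    for e in root.findall("rc:EventHeader/rc:CausedBy/rc:Event", ns)
]
# Strip the user namespace from the body element names.
body = {
    child.tag.split("}", 1)[1]: child.text
    for child in root.find("rc:EventBody", ns)
}

print(event_type, event_id, timestamp)
print(causes)
print(body)
```

The event type comes straight from the root element, without looking inside the header or body at all.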

 

 

 

 

 


Events in XML – Two Options

December 14th, 2008 8 comments

Option 1

<Event>
  <Header>
    <EventType>Exception</EventType>
    <EventId>92839289382</EventId>
  </Header>
  <Body>
      <Err>Marco messed up</Err>
  </Body>
</Event>

Option 2

<Exception id="98928392839">
  <Header>
  </Header>

  <Body>
      <Err>Marco messed up</Err>
  </Body>
</Exception>

Let’s get practical for a short while before going on to talk about event semantics and event models.

If we agree that XML is one good way to represent events in most common cases, we have basically two ways of doing it. Above you can see two different approaches.

Option 1 uses a common XML element as the root and says that this is an "Event". The type, id and other properties can be defined in the header element of the event. The body would then contain a payload suitable for this particular type of event.

Option 2 uses the root element to name the type of the event. The properties of the event are given in the root element and the Body element contains the payload just as in option 1.

It might be a matter of taste which one to prefer; my taste is for option 2. It’s just more XMLish…

There are a number of features of it I like:

  • The event type is apparent directly from the root element. Processing is often based on event type, and it can be critical to find it easily.
  • Having the root element define the event type lets us easily use a schema to verify it. The schema, if done properly, can contain two parts. One base which defines the overall structure of the event. That is the <Header> and the <Body> tags. The base also contains the definition of the header and which attributes to expect in the root element.
  • The extension to the base schema, specific for the event type, defines what can go into the Body.

So each event would have well-defined properties and a well-defined header part, while still having a totally flexible body.

At the same time everything can be validated against schemas.

It might not come as a surprise that we use option 2 in the ruleCore event model.

Option 1 is not completely wrong either, but it does not feel as XML Schema friendly to me, and it requires some more parsing to get the event type.

These are the two main ways I see of using XML. Do you have ideas for other ways of doing it?

That’s if you prefer XML. I can see that there is a place for more compact binary formats too. So the event model and the definition of semantics should be neutral to the format used to encode events. But for practical purposes XML might be a good candidate in most cases.
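The difference in parsing effort can be seen in a few lines of Python with the standard library’s ElementTree. The two helper function names are made up for this sketch; the XML is the two options from above.

```python
import xml.etree.ElementTree as ET

OPTION_1 = """\
<Event>
  <Header>
    <EventType>Exception</EventType>
    <EventId>92839289382</EventId>
  </Header>
  <Body><Err>Marco messed up</Err></Body>
</Event>
"""

OPTION_2 = """\
<Exception id="98928392839">
  <Header></Header>
  <Body><Err>Marco messed up</Err></Body>
</Exception>
"""


def event_type_option_1(xml_text: str) -> str:
    # The type is a child of the header, so we have to search for it.
    return ET.fromstring(xml_text).findtext("Header/EventType")


def event_type_option_2(xml_text: str) -> str:
    # The type is simply the name of the root element.
    return ET.fromstring(xml_text).tag


print(event_type_option_1(OPTION_1))
print(event_type_option_2(OPTION_2))
```

With a streaming parser the gap is bigger: option 2 gives you the type on the very first start tag, while option 1 requires reading into the header first.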

 

 


Event Models – Continued

December 13th, 2008 7 comments

I had some very good comments on my previous* post about event model semantics.

The issue: What are the fundamental properties of an event, and what are their semantics? The prime candidates seem to be:

  • Id
  • Type
  • Detection timestamp

Others, which could be present, but might not be required:

  • Location
  • Entity
  • Occurrence time stamp
  • Event class or category

Let’s look at some practical examples…

<ZoneEntry id="bd79d85a-12c1-4a1f-9047-3dc34fcab2cc" time="2008-12-13 20:45:44">
  <Header>

    add stuff here tomorrow
  </Header>
  <Body>
    add stuff here too tomorrow
  </Body>
</ZoneEntry>

I’m using XML here as an example, but it could as well be something else… In this example we have an event of type ZoneEntry, it was detected just a couple of minutes ago and its globally unique id is bd79d85a-12c1-4a1f-9047-3dc34fcab2cc.

As we can see, I envision the event to contain two containers: one is the header and the other is the body.

The header contains information about the event itself, for example which events it depends on, security related information, information about the event sending system and other meta data type of information.

The body is the actual payload of the event. 
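Independent of the XML encoding, the same model can be written down as a plain data structure. The Python sketch below is only one possible reading of the property lists above; the field names and the example values (gps-gateway, zone A) are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Event:
    # Required properties: every event has these.
    event_type: str
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    detected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    # Optional properties: present only when they make sense.
    occurred_at: Optional[datetime] = None
    location: Optional[str] = None
    entity: Optional[str] = None
    category: Optional[str] = None
    # The two containers: meta-data about the event, and the payload.
    header: dict = field(default_factory=dict)
    body: dict = field(default_factory=dict)


entry = Event(
    event_type="ZoneEntry",
    entity="AND-239",
    header={"source": "gps-gateway"},
    body={"zone": "A"},
)
print(entry.event_type, entry.event_id, entry.detected_at.isoformat())
```

Id and detection timestamp are filled in automatically here, which mirrors the idea that the event generator assigns them at creation time.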

This is a good start; let’s continue adding to it tomorrow. But until then, let’s enjoy the weekend.

 

 

* Due to an unfortunate snafu (that is, I messed things up) my previous posts are currently not available. They are not lost, just sitting in an SQL dump ready to be imported whenever I have the time to figure out how…
