Tuesday, March 27, 2012

Events: The Great De-coupler - Determine the Independence Score of an Event System

Events are the great de-coupler because their very nature changes our perspective from command-oriented computing to situation-oriented computing.

Think about it. Most of what we learned in CIS courses was how to help the computer issue commands. It is easier, in some senses, to just command the reaction that we want rather than put it in context so that the situation can be understood by people who didn't write the event publisher. When we simply command the reaction we want, we are increasing coupling. We are tying a dependency from the application that raised the event to the application that reacts to it. I contend that by properly creating the event object with the appropriate references, and an ontology that supports those references, we can decouple our events and our applications even further! We can even get to the point of automatically reacting to events that are created outside of our sphere of influence.

A goal of the event processing community should be to make semantically-aware events that can be consumed by an application or engine with no knowledge of or organizational affiliation to the event's publisher.

To start off, we need to design events to live outside of a particular application, department, organization and even a particular industry. We have already taken strides wherever taxonomies and ontologies for an industry have been created. They take the real-world nouns, processes, and state changes of the industry and give them a namespace so that all may call things by the same names. This trend promises to continue, and the definitions promise to get better.

The table below defines the various levels of independence for an event object. This isn't just about how to create a good event object that the system can use. It is about building an ecosystem of information around the set of events for a given organization, giving it a rich environment to influence.

For reference's sake:
  • Event: The change of state of a noun.
  • Event Object: A container that encapsulates the meta-data associated with an observation of an event

Decoupled: The event object [EO] must exist independent of any computing algorithm that would operate upon its content.
An event object must be independent of reactions to it.
This leaves the algorithms that manipulate the container (moving it, etc.). This doesn't mean that things can't operate on its content, just that the event object doesn't rely on the operation for its existence. (The obvious exception is the event creation algorithm.)
Subscribe-able: A mechanism must exist so that a subscription to the event object can be requested and the event object delivered upon publication. This is the classic event-driven architecture.
Descriptive not prescriptive: The event object must indicate what happened (state change/action/lack of state change) and not indicate how to react to what happened.

An event object must be independent of commands
This level separates awareness from commands: an event object should describe the situation, not issue a command. Plenty of messages are decoupled and subscribe-able, yet the message issues a command or constrains the reaction to a notification. Such messages would not satisfy this level.
  • EO must indicate/identify the noun (object) the event is in reference to
  • EO must indicate the state that the noun changed to
  • EO may indicate the state of the noun that the noun changed from
An event object must be independent of embedded knowledge in the ecosystem.
This basically says that an event must contain its own direct references. The Event Object should answer the questions: What happened? To whom? When? Where? What are the details? What is the certainty?...

All of these are referenced inside the event object.
Referenced: EO references must be governed by a taxonomy/ontology

An event object must be independent of semantic ambiguity
Referenced means that all pieces of context are referenced to some source outside the event. Similar to Fielding's REST concepts. It should be a URI or something that can identify the subject.
Dimensional: Nouns have defined dimensions that are made up of defined states or values. Since events are observations of the change of state of a noun, the referenced taxonomy should have well-defined dimensions, states, and values for all nouns under its jurisdiction. In my view, nouns have dimensions (attributes) that describe either a state or a property. At this level of maturity, the dimensions must be defined, as well as which legal states/values can exist in each dimension.
Cross-referenceable: The taxonomies must be systemically translatable to other taxonomies causing semantic interdependence.
By developing cross-referenced taxonomies, organizations outside of the sphere of influence of the event publisher can understand and utilize the events without requiring human intervention.
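
To make the levels above a little more concrete, here is a minimal Python sketch of an event object that is descriptive (it says what changed, not what to do about it) and referenced (its context points at identifiers outside the event). Every name in it, the field names, the is_referenced helper and the example.org URIs included, is my own assumption for illustration and not part of any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Prefixes accepted as "references to some source outside the event".
# This list is an assumption made for the sketch.
REFERENCE_PREFIXES = ("http://", "https://", "urn:")


@dataclass(frozen=True)
class EventObject:
    subject: str                       # the noun the event refers to
    dimension: str                     # which dimension of the noun changed
    to_state: str                      # the state the noun changed to
    observed_at: datetime              # when the observation was made
    observer: str                      # who or what observed the change
    from_state: Optional[str] = None   # optional: the state it changed from
    certainty: Optional[float] = None  # optional: confidence, 0.0 to 1.0
    # Deliberately no "command" or "action" field: the event describes
    # the situation and leaves the reaction to the subscriber.


def is_referenced(eo: EventObject) -> bool:
    """True when every piece of context is an external identifier."""
    refs = [eo.subject, eo.dimension, eo.to_state, eo.observer]
    if eo.from_state is not None:
        refs.append(eo.from_state)
    return all(ref.startswith(REFERENCE_PREFIXES) for ref in refs)
```

A check like is_referenced only looks at the shape of the references. Whether the taxonomy behind them is dimensional or cross-referenceable still has to be judged against the taxonomy itself.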

Monday, March 19, 2012

Finding Events: Top-Down vs. Bottom-Up

Whenever I talk to serious computer professionals, there is always the debate over which is better: Top-Down or Bottom-Up. Of course, like a good consultant, I find the answer obvious… it depends.

I think in the general computing realm there are practical reasons for doing it one way or the other, but there is also something to how the minds of different people work. My mind typically likes to think of things from the top down. I ask the question “what do I need to get (or become) what I (or the system) wants to be (or know).” My colleagues who prefer a bottom-up approach ask “what can I do with what I have”. As I contemplate events, these questions seem germane. Each viewpoint has value separately, but combining them allows for greater success. So this isn't really a competition between the two viewpoints but rather a recipe for putting things in place to allow the viewpoints to work in tandem.

I’ve been giving a lot of thought to how we do event processing. There are a number of great books on the topic (just a couple of examples: David Luckham, Opher Etzion & Peter Niblett, and Roy Schulte and Mani Chandy). There is a lot in these books about how to engineer the event reaction: how the machinery works. This is important because as an industry we want to build systems that react to stimuli and manipulate those event objects. I wrote a blog entry last year called the 4Ds: Detect, Derive, Decide and Do, which breaks down the areas of concern for manipulating and reacting to events. However, what I haven’t read much of yet is what an event is, metaphorically speaking. Sure, we have the EPTS glossary, which defines it. We just sort of know intrinsically. Yet we still question how to find events in our organization. As an event practitioner, what should I call an Event in my business, and how do I prepare myself to exploit the knowledge of their occurrence?

I won’t be so vain as to say that what I’m about to give is a methodology; it is nothing that formal. But these are my tricks of the trade; my mode of thinking. Take it for what it is worth, but I think if you can master this, then the practical stuff (getting the engineering to work) will be much easier and more beneficial.

To understand my mode of thought, treat the following as axioms:

  • All events have at least one subject which is the noun(s) whose state has changed.
  • What we manipulate within computer systems is not the event itself but rather observations of state changes for nouns that are significant enough to warrant attention.
  • There may be multiple observations and multiple observers of the same event, and each observation may be made from a different perspective.
  • All state changes to nouns occur as the result of some process; although the process may not be known and/or not in our control.

Here it goes:

Top-Down Event Thinking
The top-down mode of thinking starts by saying “I need to know that something happened.” That something could be that an order was fulfilled, fraud has occurred, the bridge has reached a critical stress point, etc. One thinks of this event typically because one wants a reaction. As shown in the 4Ds, one wants to do something as a result of the event. When the order is fulfilled, I want to issue a commission check, or when the bridge has reached a critical stress point, I want to get maintenance people over there.

The question for the event practitioner is “how do I know that the event occurred?”

There are three ways:
  • I can sense it (some instrumentation can create a signal when the event is detected)
  • I can be told it (some human or other system says that an event occurred)
  • I can derive it (some pattern of events occurred within a sphere of observation which allows that event occurrence to be derived)

The EPTS vocabulary says that the first two are “raw” events and the third is a “derived” event. From a top-down thinking perspective, raw events are kind of boring. I have some sensor or some outside observer who tells me “something happened” and I react. (To imagine raw events, think of a thermometer sensing the room’s temperature, or a data entry clerk hitting the submit button when the order leaves the building, i.e. is shipped.)

A derived event, from a top-down thinking perspective, is much more fun. When you can’t directly instrument to detect that something is occurring, or be told by some other observer, then you have to figure out “what would tell me the same thing?” In my 4D post, I mentioned a use-case involving a toll road wanting to know if someone is speeding. Police officers with speed guns have direct instrumentation of a car’s speed, but the problem is they can’t be everywhere. In addition, people aren’t going to tell you “I sped”. Instead, the event practitioner has to find a way using the set of available, observable events. In the example case, the observable event is the time at which an individual car passed a known location. By determining the elapsed time between two such event observations and knowing the distance between the two locations, an average speed can be calculated and compared to the legal limit.
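
The toll-road derivation can be sketched in a few lines of Python. This is only an illustration of the arithmetic (distance divided by elapsed time, compared against a limit); the class names, the kilometre markers and the units are assumptions of mine, not something taken from a real tolling system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TollPointPassed:
    """Raw event: a car was observed at a known location (hypothetical)."""
    vehicle_id: str
    location_km: float      # position of the toll point along the road
    observed_at: datetime


@dataclass(frozen=True)
class VehicleSpeeding:
    """Derived event: the average speed between two points exceeded the limit."""
    vehicle_id: str
    average_kmh: float
    limit_kmh: float


def derive_speeding(first: TollPointPassed,
                    second: TollPointPassed,
                    limit_kmh: float) -> Optional[VehicleSpeeding]:
    """Derive a speeding event from two raw observations of the same car."""
    if first.vehicle_id != second.vehicle_id:
        return None  # the observations are about different nouns
    elapsed_hours = (second.observed_at - first.observed_at).total_seconds() / 3600
    if elapsed_hours <= 0:
        return None  # observations out of order or simultaneous
    distance_km = abs(second.location_km - first.location_km)
    average_kmh = distance_km / elapsed_hours
    if average_kmh > limit_kmh:
        return VehicleSpeeding(first.vehicle_id, average_kmh, limit_kmh)
    return None
```

Note that this only ever yields an average. A car that sped for half the stretch and crawled for the other half will not be derived as speeding, which is a limit of the observable events, not of the code.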

The quest for top-down events usually comes because someone wants some sort of reaction to occur. In every industry there are many times people will say “if only I knew ___, I could ____”. As event practitioners, we should capture and contemplate these statements. Business analysts and architects should be able to access that list, and the event practitioner should always be probing to understand what raw and derived events are being generated by various projects. One of the event practitioner's key responsibilities is to be an activist: take an enterprise-wide view of events and get the projects/applications to publish events that may be useful to the enterprise but are not yet useful to the individual project.

I suggest having a wiki page (or however you like to organize things) where you can quickly capture the desired events and the corresponding reaction. Organize an index by both the event and the reaction. As your business analysts and architects go through their work, they should be checking against this list. Also, as other events cause the same reaction or other reactions to the same event are discovered, they can be documented.
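
If a wiki feels too loose, the same two-way index is easy to sketch in Python. The class and method names below are made up for illustration, and the event names simply reuse the examples from earlier in this post.

```python
from collections import defaultdict


class EventWishList:
    """Desired events and reactions, indexed in both directions (a sketch)."""

    def __init__(self):
        self.reactions_by_event = defaultdict(set)
        self.events_by_reaction = defaultdict(set)

    def record(self, event: str, reaction: str) -> None:
        """Capture one "if only I knew <event>, I could <reaction>" statement."""
        self.reactions_by_event[event].add(reaction)
        self.events_by_reaction[reaction].add(event)


wish_list = EventWishList()
wish_list.record("Order fulfilled", "Issue commission check")
wish_list.record("Bridge reached critical stress point", "Dispatch maintenance crew")
# A second event that turns out to call for the same reaction:
wish_list.record("Bridge inspection overdue", "Dispatch maintenance crew")
```

Keeping both indexes means an analyst can start from either side: “who cares about this event?” or “what could trigger this reaction?”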

Multiple levels of non-raw events
It is very important to realize that when you consider a target event (one that you want to react to), you might be able to determine a pattern of events that will derive your target event, yet lack the ability to observe all of the events that make up the pattern. To illustrate: we may know that event A occurs whenever event B and event C occur in order within a certain period of time. However, we don't yet have a means of observing event B, so B itself has to be derived from other events. This happens all the time because of an old technique called “divide and conquer.” You’re dividing the problem up into smaller problems and then solving the set of smaller problems.
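
As a rough sketch of the “B then C within a certain period of time” pattern, assuming observations arrive as simple (name, timestamp) pairs (my simplification, not how any particular event processing engine represents them):

```python
from datetime import datetime, timedelta


def derive_a(observations, window: timedelta):
    """Return the times at which A can be derived.

    A is derived whenever a C is observed after a B, no later than
    `window` after that B. Each B is consumed by the first C that follows it.
    """
    derived_at = []
    pending_b = None  # timestamp of the most recent unmatched B
    for name, observed_at in sorted(observations, key=lambda o: o[1]):
        if name == "B":
            pending_b = observed_at
        elif name == "C" and pending_b is not None:
            if observed_at - pending_b <= window:
                derived_at.append(observed_at)
            pending_b = None
    return derived_at
```

Real engines offer far richer pattern languages than this. The point is only that A never has to be observed directly, as long as B and C can be.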

As the event correlation occurs in more and more layers, you can imagine a tree of events: Event A is the root, and its branches show that B and C occur. B and C are in turn shown by their own branches, and so on. At the leaves of this tree are raw, directly observable events. So, one of the main jobs of an event practitioner is to provide a mechanism to capture these trees, and therefore these event patterns should also have a section in your event catalog.
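
One way to capture such a tree, sketched in Python under my own naming (EventNode, raw_leaves and missing_observations are illustrative, not an established vocabulary):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventNode:
    name: str
    derived_from: List["EventNode"] = field(default_factory=list)
    observable: bool = False  # only meaningful for leaves (raw events)


def raw_leaves(node: EventNode) -> List[EventNode]:
    """Walk the tree and return the raw events at its leaves."""
    if not node.derived_from:
        return [node]
    leaves = []
    for child in node.derived_from:
        leaves.extend(raw_leaves(child))
    return leaves


def missing_observations(node: EventNode) -> List[str]:
    """Names of raw events the target depends on that we cannot yet observe."""
    return [leaf.name for leaf in raw_leaves(node) if not leaf.observable]


# A is derived from B and C; B cannot be observed directly, so it is
# derived from D and E. Only E still lacks instrumentation.
d = EventNode("D", observable=True)
e = EventNode("E", observable=False)
b = EventNode("B", derived_from=[d, e])
c = EventNode("C", observable=True)
a = EventNode("A", derived_from=[b, c])
```

Asking the tree for its missing observations gives you a to-do list: the raw events that some project has to start publishing before the target event can be derived.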

Bottom-Up Event Thinking
Bottom-up thinking deals with manipulating the events that I can observe and determining how to use them to give me more information, i.e. I see that every time B and C happen, A occurs.

There are thousands if not millions of events in any organization that occur every day. Only some of them are collected in a manner that can be manipulated in an event processing system; most are not. What you want to be able to do is to discover which state changes have high value, and archive what they mean, where they occur, and in relation to which nouns and processes. To accomplish this you need three things: 1) a catalog of these events that is easy to reference, 2) a very good semantic meaning of the event, and 3) a sense of the object (noun) that the event has as its subject.

Catalog of Events
A catalog of events is not something, to my knowledge, that you can purchase directly. To do it correctly, it needs to be associated with a good Master Data Management strategy and tooling. But even having this as a wiki or a spreadsheet is leaps better than not having it. So what do you keep in your catalog? Thinking carefully about this will be the difference between success and delusions of adequacy.

Name: The bard’s question of “what is in a name?” still rings true. The name has to describe in a couple of words what this is all about. I like to use the format Noun_ToStateChange or Noun_RelationshipToStateChange_Object. A couple of examples: CustomerOrder_Received, Wife_Married_Husband. The point is not to make it drawn out but to give a sense for why this is important.

Description: Details, details, details. In this section you should describe, from the viewpoint of the consumer of the event, what this means. An example for CustomerOrder_Received might be “The order was received by the business. No processing has occurred to fulfill the customer’s order other than its entry in the order entry system. This event will be published regardless of the channel through which the order was received.” The idea is to remove any ambiguity that may arise because of the name.

Observer/Publisher: Who/What is responsible for observing and publishing this event? An example for CustomerOrder_Received might be “The system that is the system of record for the order will publish its receipt.”

Notification Channels: This is a list (maybe a list of one) of where this event will be published. It should include the type of channel (ESB, WebService, MQ, EPN …) and corresponding information like the topic or queue name. The idea is twofold: to tell potential/required publishers where to publish, and to tell potential subscribers where to listen.

State Transition Diagram: A State Transition Diagram (STD) should be developed for all of the major nouns of the organization. If it is determined that an event should be published about a noun, then part of the analysis definitely should be an STD along the noun dimension(s) that the particular state change represents. This field should indicate the location (typically a URL) of the STD.
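
For those who would rather keep the catalog in code than in a spreadsheet, the fields above map onto a small record. The sketch below is only one possible shape: the class names, the parse_event_name helper, the topic name and the example.org URL are all placeholders I made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NotificationChannel:
    kind: str      # e.g. "ESB", "WebService", "MQ", "EPN"
    address: str   # topic name, queue name, endpoint, ...


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    description: str
    observer_publisher: str
    channels: List[NotificationChannel]
    state_transition_diagram_url: str


def parse_event_name(name: str) -> Tuple[str, str, Optional[str]]:
    """Split a name in the Noun_ToStateChange or
    Noun_RelationshipToStateChange_Object format into its parts."""
    parts = name.split("_")
    if len(parts) == 2:
        return parts[0], parts[1], None
    if len(parts) == 3:
        return parts[0], parts[1], parts[2]
    raise ValueError(f"unexpected event name format: {name!r}")


entry = CatalogEntry(
    name="CustomerOrder_Received",
    description=("The order was received by the business. No processing has "
                 "occurred other than its entry in the order entry system."),
    observer_publisher="The system of record for the order",
    channels=[NotificationChannel("MQ", "orders.received")],  # placeholder topic
    state_transition_diagram_url="https://example.org/std/customer-order",  # placeholder
)
```

Because the noun is the first part of the name, parse_event_name also gives you a cheap way to group catalog entries by the noun they are about.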

Good semantic description of the event

The question you are trying to answer is “what does the event mean?” On the surface this seems simple. But different viewpoints are going to think differently about the same occurrence. An example… A customer of mine received a report, we’ll call it report CR, from their clients via several integrators. So one file may have the report from several clients. My customer published an event “CR report file received”. I was confused when I saw this event. Was it saying that the physical file from an integrator was received, and therefore the system needed to tear it apart into each client’s CR report? Or was it publishing one event for every client who had their CR report in that file from the integrator? A good semantic description (meaning) in an accessible place would have cleared up my confusion.

The object (noun) the event has as its subject.
One of my base tenets is that all events are state changes of some noun. The noun may be esoteric, but it is still a noun. Therefore each event has a subject, and that subject is a noun that the business cares enough about to track. As such, your event catalog should have an index on the nouns of the organization. A clever company will someday understand the relationship events have with an organization’s nouns, which are managed in the master data management system, and therefore this clever company will include an event catalog in their MDM.

Finding Events
Here are some good places to start capturing “events”. High-value processes have two important characteristics: first, they deal with things that are already important to the organization, and second, because they are processes, we know that something “kicked them off”; find what did. Another good place to find events is the swivel-chair integrations that occur. However, the largest stash of events is the ERP system, or whatever fundamental system keeps things running.