Architecture Design Record - Event Storage¶

Problem¶

The elos shall archive and store events based on a configurable rule set to provide later retrieval of events or to browse and search for specific events matching a certain pattern.

The difference between events provided by the event logging versus the event processing subsystem, is that the event logging system can be searched without a subscription and therefore enables a client to retrieve historical events he never subscribed for.

We can make a categorization of events in

Historical events
Real time(*) events .

(* real time in the meaning of the selected subscription and event retrieval strategy of the elos client)

All events published via elos are passed to the event logging subsystem where the events shall be prepared for later retrieval. Due to the possible amount of events that can occur and the limits , the following problems come up:

Requirements on the event logging subsystem:

Healthiness or harmfulness for the hardware (flash wear by log storage)
Flexible to available space (store all relevant events)
Reliability that an event is stored
Reliability that an event is read as stored (protection from modification (Integrity))
Speed - persisting
Speed - reading / searching
reliability of storage (protection from loosing of relevant events (Availability))
protection of access to events (Confidentiality)

The security targets Confidentiality, Integrity and Availability (CIA) are addressed here.

Influencing factors¶

The following constraints have to be taken into account for a suitable solution:

Live time of Flash storage, less writes of blocks as possible
Save against power loss – ensure atomic writes of events, do not corrupt event storage due to incomplete writes
Different sizes of available storage
Different types of storage

Assumptions¶

The following assumptions have been made in the decision process:

Flash devices or similar (e.g. EEPROM) are used for storage
Available storage capacity for event storage is somewhere between a few 100 Bytes (e.g. EEPROM) and several GBs.
The storage time of events ranges from 0 (not stored at all) up to theoretically unlimited (hence >15a)
Events that are older than their intended storage time shall be actively deleted to save space. (retention policy)

Considered Alternatives¶

1) Distributed event storage¶

The basic idea is to not restrict the final used storage technology for events to a single one. The canonical format of the events contains already suitable machine readable information to divide events in several storage classes like:

shall be kept as long as possible – event or payload is unique and probably indispensable
event should be kept – but it is not critical , can be dropped in favor of more important messages
event can be kept – not necessary to survive restarts no need to persist
event is discarded right away - never stored

There are even more categorizations thinkable.

The distributed event storage shall be capable of matching incoming events against configurable event storage groups and link those groups to specific storage backends. These backends can be of different types and locations. A backend shall be defined by the following attributes:

retention policy
criticality assurance (atomic writes, read reliability, …)

(may be not needed as only ‘criticality assurance’ is the important part)

storage technology (DBMS, Plain-File, RAW-Device, …. )
backing device type)

After the definition of event storage classes for historical events and the characterisation of event storage backends the remaining task for event logging system is to do the classification of an event against the given storage classes and then do the mapping to the corresponding storage backend.

overview distributed event log storage

To store and load events two different process are necessary.

Event Classification and Mapping Engine¶

Each event must be mapped to a event storage class, which is linked to a event storage backend. This shall be done by by a configurable rule set of RPN-Filters

Search-Engine¶

The process to load events is a kind of inverted store process. To search for events it must be predicted which storage backends shall be queried.

The implementation of both could use dynamic libraries to allow dynamic configuration and extension.

pros

not bound to a single storage technology, can provide benefits of different storage backends as needed
easy extensible if new storage technologies become available it can be combined or replaces former ones

cons

overhead due to necessary classification of incoming events
- But: some kind of filtering is necessary anyway in any solution
probably a large dependency list, due to unbound possible storage technologies used
- But: They shall be configurable as dynamic loadable plugins (Dynamic Shared Objects) and therefore the system integration design process is responsible to make a well balanced selection.

2) Direct event log storage¶

This solution is straight forward by selecting a single storage technology as storage backend and pass each event through to be persisted on the final backing storage. Optionally a filter mechanic can be used to reduce the amount or to decide if an event shall be stored or not.

pros

simplicity, straight forward take the event and store it without further processing
less dependencies – only one storage technology involved

cons

less flexible and hard if not impossible to find the single storage technology that serves all the requirements (except an implementation of solution one is taken as backend)

Decision¶

We choose the distributed event storage approach, number 1.

Rationale¶

The distributed approach gives us the opportunity to continuously develop and evaluate new storage backends and combine them with the current implementation. If only one back end is configured the solution 1 becomes equal to solution 2 from the behavioral point of view.