EQUIP refactoring notes

Chris Greenhalgh, 2004-12-20, updated 2005-01

Introduction

The design goals for EQUIP have shifted over time. In its first version it was something of an over-arching and all-inclusive framework, with IDL, code loading, etc. Over time the emphasis has moved to ease of use for both programmers and users. For some forms of use this is supported in ECT through its hosting of standard components and provision of GUI tools. However, this still leaves ECT in a framework/hosting role. In addition, we wish to consider easier use of EQUIP/ECT from non-framework applications across a range of languages and platforms, e.g. C/C++ applications (such as Chromium and/or OpenGL applications), C# applications, and applications on less capable platforms (PDAs, phones).

Goals

Facets

One way of looking at the essence of EQUIP is as a transparently distributable Model in the sense of the Model-View-Controller pattern.

EQUIP combines a number of functions/roles that could/should be more clearly separated (for extension, management, etc.), including:

Although EQUIP allowed arbitrary extensions to objects (methods, etc.), they are first and foremost Data Objects.

A dataspace is a (logical??) bag for putting objects in.

As in JMS and Hibernate, access to objects in a dataspace should probably be managed via single-threaded and generally short-lived sessions (or similar). This provides a clean threading/coordination model and can support transactions.

It feels like it could be important to have multiple simultaneous views on the 'same' dataspace, e.g. how another process is seeing this, how I am seeing it now, how I was seeing it then... It should be possible to compare and diff these as well.
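As a minimal sketch of comparing two views, assume (purely for illustration) that a view is a dictionary from object id to a dictionary of fields. The names `snapshot` and `diff` are hypothetical and not part of EQUIP.

```python
from copy import deepcopy


def snapshot(dataspace):
    """Return an independent copy of a dataspace's current contents."""
    return deepcopy(dataspace)


def diff(old_view, new_view):
    """Compare two views keyed by object id: added, removed, changed."""
    added = {k: new_view[k] for k in new_view.keys() - old_view.keys()}
    removed = {k: old_view[k] for k in old_view.keys() - new_view.keys()}
    changed = {
        k: (old_view[k], new_view[k])
        for k in old_view.keys() & new_view.keys()
        if old_view[k] != new_view[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# "How I was seeing it then" versus "how I am seeing it now".
space = {"a": {"x": 1}, "b": {"x": 2}}
then = snapshot(space)
space["a"]["x"] = 10      # modify an existing object
del space["b"]            # delete an object
space["c"] = {"x": 3}     # add a new object
now = snapshot(space)
delta = diff(then, now)
```

Keying by id sidesteps the harder question of when an object is the 'same' but 'changed'; a real design would have to settle that first.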

What can we say about the Data Objects that EQUIP2 might 'manage'?

What will happen to them? when? how? where do they come from? where do they go?

  1. an application obtains a reference to a Data Space object.
  2. it establishes a connection (or something like that) to the Data Space
  3. in the course of some activity it opens a session with that connection
  4. within that session it
    1. creates some Data Objects and adds them
    2. looks up some Data Objects by query of some sort (template match, field match, key??)
    3. modifies some such Data Objects
    4. deletes some such Data Objects
  5. closes the session (or aborts it)
  6. ...
  7. closes the connection
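The steps above might look roughly like the following in code. This is only an in-memory sketch: the class names (`DataSpace`, `Connection`, `Session`), the method names, and the choice to buffer changes until commit are all assumptions made for illustration, not an existing EQUIP API.

```python
import itertools


class DataSpace:
    """Hypothetical in-memory data space: a bag of objects keyed by id."""

    def __init__(self):
        self.objects = {}
        self._ids = itertools.count(1)

    def connect(self):
        return Connection(self)


class Connection:
    def __init__(self, space):
        self.space = space
        self.closed = False

    def open_session(self):
        if self.closed:
            raise RuntimeError("connection is closed")
        return Session(self.space)

    def close(self):
        self.closed = True


class Session:
    """Short-lived unit of work; changes are buffered until commit."""

    def __init__(self, space):
        self.space = space
        self.pending = []   # buffered (operation, id, value) tuples

    def add(self, obj):
        oid = next(self.space._ids)
        self.pending.append(("put", oid, dict(obj)))
        return oid

    def query(self, **template):
        """Field match against committed objects only."""
        return [
            oid for oid, obj in self.space.objects.items()
            if all(obj.get(k) == v for k, v in template.items())
        ]

    def modify(self, oid, **fields):
        # Builds on the committed state, not on this session's pending changes.
        updated = {**self.space.objects[oid], **fields}
        self.pending.append(("put", oid, updated))

    def delete(self, oid):
        self.pending.append(("del", oid, None))

    def commit(self):
        for op, oid, value in self.pending:
            if op == "put":
                self.space.objects[oid] = value
            else:
                self.space.objects.pop(oid, None)
        self.pending = []

    def abort(self):
        self.pending = []


space = DataSpace()                 # step 1
conn = space.connect()              # step 2
s1 = conn.open_session()            # step 3
door = s1.add({"kind": "sensor", "name": "door"})      # step 4.1
window = s1.add({"kind": "sensor", "name": "window"})
s1.commit()                         # step 5 (close)

s2 = conn.open_session()
found = s2.query(kind="sensor")     # step 4.2
s2.modify(door, name="front door")  # step 4.3
s2.delete(window)                   # step 4.4
s2.commit()

s3 = conn.open_session()
s3.delete(door)
s3.abort()                          # step 5 (abort): nothing is applied
conn.close()                        # step 7
```

Buffering changes in the session is one simple way to get abort for free; it says nothing yet about isolation between concurrent sessions.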

On the responsive side...

  1. ...
  2. it establishes a long-running session (session factory? monitor?) and configures it to run certain code under certain circumstances
    1. esp. particular changes in the Data Objects in the data space
    2. timer events?
  3. as these occur the corresponding code is run within appropriately constructed session(s)
  4. ...
  5. closes the session/unconfigures the monitor/whatever
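The responsive pattern could be sketched as a monitor registered against the data space. Everything here is hypothetical (`MonitoredSpace`, `monitor`, `unmonitor`, the event dictionary), and as a simplification the handler receives the space directly rather than a freshly constructed session.

```python
class MonitoredSpace:
    """Hypothetical data space that notifies monitors of changes."""

    def __init__(self):
        self.objects = {}
        self.monitors = []

    def monitor(self, predicate, handler):
        """Register handler(space, event) for events matching predicate."""
        entry = (predicate, handler)
        self.monitors.append(entry)
        return entry

    def unmonitor(self, entry):
        self.monitors.remove(entry)

    def put(self, oid, obj):
        kind = "modified" if oid in self.objects else "added"
        self.objects[oid] = obj
        self._notify({"kind": kind, "id": oid, "object": obj})

    def _notify(self, event):
        # Iterate over a copy so handlers may register or remove monitors.
        for predicate, handler in list(self.monitors):
            if predicate(event):
                handler(self, event)


space = MonitoredSpace()
seen = []
entry = space.monitor(
    lambda e: e["object"].get("kind") == "sensor",    # circumstances of interest
    lambda sp, e: seen.append((e["kind"], e["id"])),  # code to run
)
space.put("s1", {"kind": "sensor", "value": 1})
space.put("s1", {"kind": "sensor", "value": 2})
space.put("n1", {"kind": "note"})                  # does not match the predicate
space.unmonitor(entry)
space.put("s2", {"kind": "sensor", "value": 3})    # monitor already removed
```

Timer events are not covered; they would need a separate event source feeding the same handlers.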

Following Hibernate, we might identify 'managed' Data Objects as those currently actively managed by EQUIP, i.e. within one or more sessions. These might be instrumented, tracked, etc. Other objects are ignored by EQUIP and receive no special treatment.

Orthogonally the process may configure/manage...

  1. the data space
    1. the persistence of Data Objects and changes
    2. relationships to other data spaces
    3. indexing
  2. the relationship(s) between sessions
    1. the replication or communication of Data Objects, changes, etc.
      1. including prioritisation, reliability, discard policies, etc.
    2. the durability, reliability, consistency etc. of changes made in sessions
    3. the links between changes and events/triggers
    4. fetching/caching policy in relation to sessions and also queries

Data Object structure issues... matching (search/query) and mutation both expose an implicit or explicit model of the internal structure of a Data Object, i.e. what are the terms/elements used to describe a 'match', and when is an object the 'same' but 'changed', or a 'new' object? Different technologies and approaches have different internal/structural models, e.g.

Communication

It is important to:

Walk-through:

  1. a client application wishes to perform an RPC-like operation on a particular server...
  2. it places a request object in the client dataspace which is the outbound queue to that particular server
  3. the communication manager determines (following poll or DS change notification) that there are outbound requests...
  4. the communication manager performs a scheduling round against candidate messages, and may select and schedule one or more communication activities [suppose it schedules this request...]
  5. the task manager allocates a thread to the scheduled communication activity
  6. the communication activity attempts an HTTP post (say) to the identified server [how did it know to use an HTTP post? how did it get the server URL? how does it know whether to do this reliably?]
  7. if a response is successfully received then...??
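One scheduling round of the walk-through above might be sketched as follows. The outbound dataspace is modelled as a plain list of request dictionaries, a thread pool stands in for the task manager, and `fake_send` stands in for the HTTP post; `run_round`, the `state` field and the FIFO policy are all assumptions. Storing the response on the request is just one possible answer to the open question in step 7.

```python
from concurrent.futures import ThreadPoolExecutor


def run_round(outbound, send, executor, batch=1):
    """One scheduling round: select up to `batch` pending requests and send them.

    `outbound` is a list of request dicts acting as the outbound data space,
    `send` is the transport callable, `executor` plays the task manager.
    """
    pending = [r for r in outbound if r["state"] == "pending"]
    selected = pending[:batch]          # trivial FIFO policy (an assumption)
    for request in selected:
        request["state"] = "sending"
    # Each selected activity is handed to a worker thread.
    futures = [(r, executor.submit(send, r)) for r in selected]
    for request, future in futures:
        try:
            request["response"] = future.result()
            request["state"] = "done"
        except Exception as exc:        # leave the request for a later round
            request["error"] = str(exc)
            request["state"] = "pending"
    return len(selected)


def fake_send(request):
    """Stand-in for the HTTP post; fails for one request to show a retry case."""
    if request["body"] == "bad":
        raise IOError("server unreachable")
    return "ok:" + request["body"]


outbound = [
    {"body": "hello", "state": "pending"},
    {"body": "bad", "state": "pending"},
]
with ThreadPoolExecutor(max_workers=2) as task_manager:
    sent = run_round(outbound, fake_send, task_manager, batch=2)
```

The sketch does not say how the transport, server URL or reliability requirement are chosen; in this form they would have to travel with the request or with the queue.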

Tricky questions about relationship to persistence, transactions, failures/restarts, continuations...

So, in the outbound dataspace we need:

Should metadata be an additional orthogonal capability of the dataspace, or should a metadata-holding type be used in the first place?