EQUIP2 is a completely new version of EQUIP which focusses on (what I consider to be) the key essence of EQUIP: providing an integrated state/event model for developing loosely coupled (usually distributed) applications. This document introduces the key element(s) of EQUIP2, in particular the main Java API.
NOTE: many of the current links in this file assume that you are
looking at the version in the CVS repository with the rest of the
source also checked out (see Availability).
Related documents (see also index.html):
At the heart of EQUIP2 is the notion of a dataspace, which is a
logical container for data. At least initially (and as in EQUIP1) the
kind of data that can be placed in a dataspace is objects. In many
ways, the dataspace model can be seen as a particular Data Access
Object approach. See also the master plan, below.
Dataspaces can be created in various ways, there can be several
within a single process, they can have different characteristics (e.g.
persistence) and can support different object classes. However they all
support a common API, equip2.core.IDataspace. The
intention is that dataspaces will normally be created by some kind of
application/container configuration mechanism, rather than by explicit
application code. Application code will then obtain references to
dataspaces using a standard naming/lookup facility, such as the equip2.naming cross-platform
subset of the Java Naming and Directory Interface (JNDI, javax.naming) API. However, in
the first examples of EQUIP2 you may well find dataspaces being created
explicitly.
Some initial examples of concrete dataspace implementations are:
In many cases a dataspace is local to a single process and
inter-process communication (including, for example, distributed
dataspace synchronisation) is left to non-EQUIP2 code. The main
exception to this is the (planned) Remote Dataspace facility, which
uses a standard client-server RPC model to access a dataspace in a
remote process. This facility should be used with care as it requires
good and consistent network connectivity to work well, and can
suffer serious performance degradation with certain
kinds of concurrent use.
Application code only interacts with the dataspace in the context of a "session". A session should be a fast and self-contained unit of work performed on/with the dataspace, and has an explicit beginning and ending. If a dataspace supports transactions then each session will be a single transaction. Client code obtains a session handle (interface equip2.core.ISession) from a dataspace using the method:
package equip2.core;
public interface IDataspace {
public ISession getSession();
....
}
Sessions are managed on a per-thread basis, so the same thread can
obtain the same session reference within a nested method call.
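One plausible way to get this per-thread behaviour is a ThreadLocal, as in the minimal sketch below. This is an illustration only and not necessarily how any EQUIP2 dataspace does it; the class names PerThreadSessions and Session are hypothetical stand-ins, not part of the EQUIP2 API.

```java
/** Sketch: hands out one session object per calling thread. */
class PerThreadSessions {
    /** Placeholder for a real session implementation (hypothetical). */
    static final class Session { }

    // Each thread lazily gets its own Session the first time it asks.
    private final ThreadLocal<Session> current = ThreadLocal.withInitial(Session::new);

    /** Returns the calling thread's session; nested calls see the same object. */
    Session getSession() {
        return current.get();
    }
}
```

With this arrangement a method deep in a call chain can call getSession() itself rather than having the session passed down as a parameter.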
The session interface is the main interface used by EQUIP2
applications. A session is begun and ended using the following methods:
package equip2.core;
public interface ISession {
// session types:
public static final int READ_WRITE = 1;
public static final int READ_ONLY = 2;
public static final int WRITE_ONLY = 3;
public void begin();
public void begin(int sessionType);
public void end();
public void abort();
public boolean isActive();
....
}
Each active session is single threaded, i.e. the same thread which
calls begin() must make
all subsequent calls within that session, including end(). This makes it much
easier to integrate with transaction-based systems. Between calling begin() and end() the session will return isActive() as true.
By default, calling begin()
gives a read/write session, in which any operation can be performed.
Some dataspace implementations may be able to provide additional
optimisations if sessions are explicitly begun as ISession.READ_ONLY or ISession.WRITE_ONLY (or in some
cases may only support read-only or write-only sessions). Using a
read-only session also allows the session implementation to avoid
caching (and copying) objects returned to the application and checking
for changes to those objects, giving performance and memory utilisation
benefits. Not copying these objects could also allow some dataspace
implementations to implement lazy fetching of results (although no
current dataspace does this because of issues if such objects are used
after the session ends).
There are two main groups of operations that can be performed on a
dataspace, within a session:
If any exceptions occur during a session within ISession operations
then these should be detected by the session and cause it to abort
itself. These will be rethrown to the application code as
RuntimeExceptions. Incorrect uses of the API (e.g. calling synchronous
operations other than between begin and end) are currently also
signalled by RuntimeExceptions.
If an error or other condition is detected by the application
code during a session then it should call abort() to terminate the
session without making any changes to the dataspace. After calling abort (or
after any exception thrown from a session method) the session will be
ended and no session operations can be performed until begin() is called again to
start a new session (the same session object may be reused).
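The begin/end/abort life cycle described above can be sketched as follows. To keep the example self-contained it does not use the real equip2.core.ISession: SessionLike copies a few of the method signatures quoted earlier, and ToySession is a hypothetical in-memory stand-in that only applies additions when end() is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a subset of equip2.core.ISession (not the real interface). */
interface SessionLike {
    void begin();
    void end();
    void abort();
    boolean isActive();
    void add(Object object);
}

/** Toy in-memory session, present only to make the sketch runnable. */
class ToySession implements SessionLike {
    private final List<Object> store;                 // committed contents
    private final List<Object> pending = new ArrayList<>();
    private boolean active;

    ToySession(List<Object> store) { this.store = store; }

    public void begin() { pending.clear(); active = true; }
    public void end() { store.addAll(pending); pending.clear(); active = false; }
    public void abort() { pending.clear(); active = false; }
    public boolean isActive() { return active; }
    public void add(Object object) {
        if (!active) throw new IllegalStateException("no active session");
        pending.add(object);
    }
}

class SessionPattern {
    /** Runs one short unit of work; leaves the store untouched if it fails. */
    static boolean addAll(SessionLike session, Object[] objects) {
        session.begin();
        try {
            for (Object o : objects) {
                if (o == null) throw new IllegalArgumentException("null object");
                session.add(o);
            }
            session.end();
            return true;
        } catch (RuntimeException e) {
            // A session that threw should already have ended itself; abort()
            // is only needed for errors detected by the application code.
            if (session.isActive()) session.abort();
            return false;
        }
    }
}
```

After a failed call the same session object can be reused: the next begin() starts a fresh session.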
Note that (at present, anyway) in the case of a single thread
with multiple concurrent sessions on (different dataspaces in) a single
remote dataspace server, the failure or aborting of any one session
will result in all of that thread's sessions in that remote dataspace
server being immediately aborted.
Note that if a session remains active (after calling begin() and before calling end()) for "too long"
(currently about 5 seconds) then it will be aborted within the
dataspace.
This is used to detect unended sessions due to bugs or remote
connection failures. Sessions should ALWAYS be fast. Currently the
abort will call Thread.interrupt()
on the unended session thread; note that this may result in
an InterruptedException at some later stage in that thread (for example the next time it blocks in wait or sleep)! This might
be a bad idea, so this might change; it is there at the moment because
it is the only way to push any kind of failure report to the thread
asynchronously.
In the case of an interrupt being raised by a remote dataspace the
interrupt will be re-thrown in the local process as a RuntimeException
from the next operation which the original thread attempts on a
dataspace in the same remote server.
If communication with a remote dataspace is lost then the
client-side communications will raise an exception after some timeout
period has elapsed. Currently with PART communications this is 15
seconds. Independently the server-side will use the timeout mechanism
described above to time out any in-progress session(s) from that client.
The data objects which are used with a dataspace in EQUIP2 can be
Plain Old Java Objects (POJOs) or JavaBeans, and do not need to
implement any particular interface or extend any particular base class
(although in general java.lang.Object
methods equals(Object)
and hashCode() must be
implemented).
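As a minimal sketch of such a data object, the class below is a plain bean with consistent equals(Object) and hashCode() implementations. The class name NameValue and its two properties are hypothetical, chosen only for illustration. It uses java.util.Objects and so assumes J2SE 7 or later; on J2ME the null checks would have to be written by hand.

```java
import java.util.Objects;

/** Hypothetical data object: a simple name-value pair. */
class NameValue {
    private String name;
    private String value;

    NameValue() { }  // no-argument constructor, as beans normally provide

    NameValue(String name, String value) {
        this.name = name;
        this.value = value;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getValue() { return value; }
    public void setValue(String value) { this.value = value; }

    /** Two pairs are equal when both name and value are equal. */
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NameValue)) return false;
        NameValue other = (NameValue) o;
        return Objects.equals(name, other.name) && Objects.equals(value, other.value);
    }

    /** Must agree with equals: equal objects give equal hash codes. */
    @Override
    public int hashCode() {
        return Objects.hash(name, value);
    }
}
```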
See Simple_Bean_Generator.html
for details of a simple utility to create suitable Java beans and
helper classes from a simple XML class description.
On J2ME each application-specific class to be used with
EQUIP2 has to have an additional helper class which provides
class-specific support to EQUIP2, including:
NOTE: at present class-specific helpers are also required on J2SE
until reflection-based helpers are implemented.
A number of standard object classes are provided/supported:
NOTE: at present float, double, java.lang.Float and java.lang.Double are not supported to provide compatibility with J2ME CLDC 1.0. This should probably get fixed before too long (perhaps by adopting a float support library for CLDC 1.0).
The current session interface and approach is inspired by the Hibernate Object/Relational
mapping system. Typically you will define your own object classes
which are used in your application code; to use them in an EQUIP2
dataspace you will also have to provide suitable helper
classes (or make use of the standard ones provided).
Any object may be either "unmanaged" or "managed" by an active
dataspace session. A managed object is "part" of the dataspace, and at
the end of the session the current state of the object will be
preserved in the dataspace. An unmanaged object is completely
independent of the dataspace, and the application can do whatever it
wants with it.
An object is added to the dataspace (becoming managed) using the equip2.core.ISession method:
public void add(Object object);
It is an error to add an object to a dataspace if there is already
an equal object in it (as determined by the object helper's equality
test, usually the object's own equals(Object o) method).
NOTE: at present this error may NOT be detected, or
may not be detected consistently; this should change in the future, so
don't do it. Use addOrUpdate instead
if appropriate.
NOTE: errors such as this MAY be detected and signalled (by an
exception) immediately, or only when an attempt is made to end the
session.
Once the session is ended the object (like all objects) becomes
unmanaged again.
An object can be deleted from the dataspace using the equip2.core.ISession method:
public void remove(Object object);
The object helper's equality test (usually the object's own equals(Object o) method) is
used to find the object to delete from the dataspace. It is an error to
remove an object which is not present in the dataspace, but as already
noted this error may not be detected/signalled until the session is
ending. After calling remove the
object is immediately unmanaged.
An unmanaged object which is equal to an object already in the
dataspace can be made managed using the equip2.core.ISession method:
public void update(Object object);
A common example of this is that the application has retained a
reference to an object that was added in a previous session; since that
session has finished the object is now unmanaged, but can be
reassociated with the new session using update.
NOTE: if an object's helper defines an identity for the object which
is not the whole object (see above,
e.g. just the name of a name-value pair) then this is used by remove and update to determine which, if
any, object/value currently in the dataspace corresponds to the object
being removed or updated. For example, suppose a data object is
modified by the application between sessions and then update is called
with the modified object: reassociating the object with its previously
known value in the dataspace depends on the identity being unchanged. If
the update is successful then the object in the dataspace will be
changed (replaced) at the end of the session to reflect its value at
that time, i.e. including the changes made between the sessions.
WARNING: using update can result in
loss of data! If the object was changed in the dataspace between
the original get and this update (e.g. by a concurrent or interleaved
activity) then those interleaved changes will be lost! So only use
update if you are certain that (a) there could not have been other
changes to that object in the mean time or (b) you definitely want to
discard any changes made in the mean time. Otherwise, you will need to
re-get the object and (currently manually) merge the changes into the
latest value.
It is possible to check whether a particular object is currently
managed using the equip2.core.ISession
method:
public boolean isManaged(Object object);
This is actually equivalent to the code fragment "session.matchUnique(object)==object"
(see below).
The following convenience method performs an add or update according
to whether the object was previously known to the dataspace or not; after
the method the application is sure that the object is managed:
public void addOrUpdate(Object object);
Most access to information in the dataspace is by querying. However,
for object classes which define an identity (see above) the
following equip2.core.ISession
method retrieves the corresponding object/value from the dataspace, or
null if it is unknown:
public Object get(String classname, Object identity);
public Object get(Class clazz, Object identity);
For object classes with no specific identity the whole object/value
is compared with the identity. You must provide the class of the object
for which you are searching. Currently only an object of exactly that
class will be matched/returned.
NOTE: all object references returned from dataspace query operations
are managed objects (until the end of that session).
The main (current) query operation is equip2.core.ISession method:
public Object[] match(Object template);
public Object[] match(Object template, boolean readOnly); // see below
public java.util.Enumeration matchReadonly(Object template); // see below
This returns a (possibly zero sized) array of managed objects which
match the given template object, as determined by the template object
helper's match method.
NOTE: it is probably NOT a good idea to define custom match methods at
the moment, since the kind of matching currently supported is likely to
also be assumed/embodied in various dataspace implementations.
As noted above, the current
query/matching rules are:
The following equip2.core.ISession
convenience method throws a MatchUniqueException if there is more than
one match (returns null if there is no match):
public Object matchUnique(Object template) throws equip2.core.MatchUniqueException;
The count operation
returns only the number of values that would have been returned by a
call to match with the
same argument:
public int count(Object template);
public int count(Object template, boolean readOnly); // see below
public int countReadonly(Object template); // see below
With some dataspace implementations this may be more efficient than
using match (or matchReadonly) and discarding
the actual values.
The following equip2.core.ISession
operations support (essentially) the normal match and count signatures, but allow a
read-only operation to be performed within a READ_WRITE (or of course READ_ONLY) session:
public Object[] match(Object template, boolean readOnly);
public int count(Object template, boolean readOnly);
As with results in a read-only session, a match with readOnly true returns the resulting objects in an unmanaged state, i.e. not cached by the session and not checked for updates or removal. This normally makes this operation more efficient, but means that the results will not take account of changes being made to (copies of) those objects within the current session.
The following equip2.core.ISession
operation returns
a java.util.Enumeration over
the matched objects, rather than an array; it also returns the
resulting objects in an unmanaged state. This allows some dataspace
implementations to stage the
loading of results from persistent storage, reducing memory
requirements when fetching large result sets:
public java.util.Enumeration matchReadonly(Object template);
The application can suggest preferred data fetch sizes to the
dataspace implementation (only some dataspaces may implement this)
using the ISession methods:
public void setFetchSize(int size);
public int getFetchSize();
At present (2006-10-03), NO dataspace implementation
has special support for matchReadonly,
but all implementations support the API operation (even if it is only
implemented using a normal match).
The default dataspace implements matchReadonly using a normal
match (working notes on
this aspect of EQUIP2 can be found in EQUIP2_Large_Dataspace_notes.html).
Note that QueryTemplate
setMaxResults/setFirstResult functions cannot be optimised by
the dataspace when used with a read-write match (all values have to be
fetched).
As noted above, it is also preferable to begin a session explicitly
as READ_ONLY or WRITE_ONLY when relevant, as
the former avoids caching fetched objects for all operations (match and get, as well as matchReadonly), and both give
a dataspace the opportunity to perform additional optimisations (depending
on the dataspace implementation). As noted above, the copying and
caching required for a READ_WRITE
session also defeats the lazy loading strategies used by the
Hibernate dataspace implementation.
Some dataspace implementations (most likely Hibernate) may also
provide some additional optimisation for read-only count compared to match (see above). The QueryTemplate methods setFirstResult and setMaxResults (see below) may also be used (generally
with addOrder) to
constrain the number of objects actually constructed by a match
operation. Note that such optimisations are normally ONLY
effective in a READ_ONLY
session or with read-only match/count.
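To make the intended effect of setFirstResult and setMaxResults concrete, the sketch below applies the same windowing to an ordinary in-memory list: order the values, skip to the first index (counting from 0), then take at most the maximum count, where 0 means unbounded. It illustrates the semantics only; the class ResultWindow and its page method are hypothetical and are not part of EQUIP2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch of first-result / max-results windowing over ordered values. */
class ResultWindow {
    /**
     * Returns the values in ascending order, starting at index firstResult
     * and containing at most maxResults entries (0 or less means unbounded).
     */
    static List<Integer> page(List<Integer> values, int firstResult, int maxResults) {
        List<Integer> sorted = new ArrayList<>(values);
        Collections.sort(sorted);                       // the "addOrder" step
        int from = Math.min(Math.max(firstResult, 0), sorted.size());
        int to = (maxResults <= 0)
                ? sorted.size()
                : (int) Math.min((long) from + maxResults, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

Without a defined order the window is arbitrary, which is why the first/max settings are generally combined with addOrder.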
package equip2.core;
public class QueryTemplate {
public QueryTemplate(Class clazz);
// (currently) simple-typed properties only...
public QueryTemplate addConstraintEq(String propertyName, Object value); // == value
public QueryTemplate addConstraintNe(String propertyName, Object value); // != value
public QueryTemplate addConstraintLt(String propertyName, Object value); // < value
public QueryTemplate addConstraintLe(String propertyName, Object value); // <= value
public QueryTemplate addConstraintGt(String propertyName, Object value); // > value
public QueryTemplate addConstraintGe(String propertyName, Object value); // >= value
public QueryTemplate addConstraintIsNull(String propertyName); // == null
public QueryTemplate addConstraintNotNull(String propertyName); // != null
public QueryTemplate addConstraintContains(String propertyName, Object value); // Set.contains(value)
public QueryTemplate addConstraintLike(String propertyName, String value); // % wildcard, _ any char
public QueryTemplate addConstraintIn(String propertyName, Object values[]); // == any element of values[]
public QueryTemplate addConstraintNotIn(String propertyName, Object values[]); // != every element of values[]
public QueryTemplate addOrder(String propertyName); // ascending
public QueryTemplate addOrder(String propertyName, boolean descending); // descending if true
// the following options, intended for working with larger data sets, are only
// optimised by matchReadonly/match(readOnly)/READ_ONLY sessions.
public QueryTemplate setFirstResult(int index); // count from 0
public QueryTemplate setMaxResults(int count); // defaults to unbounded ('0')
}
For example, a template to match objects of class "mypackage.MyClass", with a property "age" between 10 and 20 (exclusive):
session.match(new QueryTemplate(Class.forName("mypackage.MyClass"))
.addConstraintGt("age", new Integer(10))
.addConstraintLt("age", new Integer(20)));
Simple coercions are applied to the specified values, so for example
(the string) "10" can be compared with an integer or float-valued
property.
Depending on the dataspace implementation such queries may be mapped
to comparable direct queries on the underlying storage mechanism, e.g.
the Hibernate-based persistent dataspace should map a QueryTemplate to a Hibernate Criteria query.
Note that there is currently (2006-04-24) apparently no way to map
an addConstraintContains
query to a native Hibernate Criteria query; there is a work-around in
place but it relies on post-filtering the values returned by Hibernate
(less efficient).
Normally Java string comparison is case sensitive (binary), and so
equality constraints and order are normally also case sensitive.
Sometimes case-insensitive matching or ordering is desired. In
this case a forced-case version of the property should also be defined
for the object, for use with case-insensitive comparison or ordering.
This may be supported by custom coding of the case-sensitive property's
setter method to ensure that the forced case version is always set. It
is recommended that lower case is used as the forced case by default.
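A minimal sketch of this forced-case approach is shown below. The class Person and its properties name and nameLower are hypothetical; the point is only that the setter for the case-sensitive property keeps the lower-case copy in step.

```java
import java.util.Locale;

/** Hypothetical bean with a case-sensitive name and a forced-lower-case copy. */
class Person {
    private String name;
    private String nameLower;

    public String getName() { return name; }

    /** Setting the name always refreshes the forced-case version too. */
    public void setName(String name) {
        this.name = name;
        this.nameLower = (name == null) ? null : name.toLowerCase(Locale.ROOT);
    }

    public String getNameLower() { return nameLower; }

    /** Kept as a normal bean setter, but it still forces the case. */
    public void setNameLower(String nameLower) {
        this.nameLower = (nameLower == null) ? null : nameLower.toLowerCase(Locale.ROOT);
    }
}
```

A case-insensitive query or ordering would then refer to nameLower rather than name, with the search value itself lower-cased before it is placed in the template. Locale.ROOT is used so that the result does not depend on the default locale of the machine.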
The equip2.core.ISession
interface provides access to asynchronous dataspace monitoring
facilities via the method:
public IEventManagement getEventManagement();
The equip2.core.IEventManagement
interface returned allows monitors for dataspace events to be
registered with the dataspace, allowing an application to respond
asynchronously to changes in the dataspace. The supported dataspace
events/interfaces are described below.
The simplest dataspace change event is
equip2.core.DataspaceChangeEvent,
whose public interface is:
package equip2.core;
public class DataspaceChangeEvent {
public IDataspace getDataspace();
}
This simply reports that some change has been made to the dataspace.
The corresponding listener interface is:
package equip2.core;
public interface IDataspaceChangeListener {
public void dataspaceChanged(DataspaceChangeEvent dce);
}
The un/registration methods in the equip2.core.IEventManagement
interface are:
public void addIDataspaceChangeListener(IDataspaceChangeListener listener);
public void removeIDataspaceChangeListener(IDataspaceChangeListener listener);
NOTE: all listeners are called outside of any session: if a listener
wishes to perform any operations on the dataspace then it must begin
(and end) a new session.
NOTE: all listeners are called after
the session which caused the change being reported. Other listeners may
have already been called and made further changes to the dataspace.
Concurrent threads manipulating the dataspace may also have begun (and
perhaps ended) sessions on the same dataspace since the event which
originally caused this change event to be generated.
The more complex dataspace change event is
equip2.core.DataspaceObjectsEvent,
whose public interface is:
package equip2.core;
public class DataspaceObjectsEvent {
public IDataspace getDataspace();
public Enumeration getDataspaceObjectEvents();
}
This reports a list of specific object changes that have been made to
the dataspace.
The enumeration is a list of equip2.core.DataspaceObjectEvent instances
(note the lack of plural), the public interface of which is:
package equip2.core;
public class DataspaceObjectEvent {
public IDataspace getDataspace();
public int getApparentChange();
public int getRealChange();
public Object getOldValue();
public Object getNewValue();
}
The "apparent" change (i.e. how the listener will normally regard
the event) is one of:
DataspaceObjectEvent.ADDED
DataspaceObjectEvent.REMOVED
DataspaceObjectEvent.MODIFIED
As with EQUIP v.1 item monitors (dataspace item listeners), a
listener will always see an ADDED event for an object, zero or more
MODIFIED events, and finally a REMOVED event, in that order. The "real"
change (i.e. what triggered the event) is one of:
DataspaceObjectEvent.OBJECT_ADDED
DataspaceObjectEvent.OBJECT_REMOVED
DataspaceObjectEvent.OBJECT_MODIFIED
DataspaceObjectEvent.LISTENER_ADDED
DataspaceObjectEvent.LISTENER_REMOVED
The LISTENER_ADDED change
occurs when a listener is first registered, and is scheduled to
be informed of matching objects that existed in the dataspace at the
time of registering the listener.
Note: LISTENER_REMOVED events are
not currently supported, and would have to be delivered after the listener
has been removed.
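Because apparent changes always arrive as ADDED, then zero or more MODIFIED, then REMOVED, a listener can keep a local copy of the objects it cares about by applying each event in turn. The sketch below shows that bookkeeping in isolation. The class LocalMirror, the numeric constant values and the use of an application-chosen identity key are all assumptions made for illustration; a real listener would take the change type and values from DataspaceObjectEvent.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: keeps a local copy of objects by applying apparent changes in order. */
class LocalMirror {
    // Hypothetical values standing in for the DataspaceObjectEvent constants.
    static final int ADDED = 1;
    static final int REMOVED = 2;
    static final int MODIFIED = 3;

    private final Map<Object, Object> byIdentity = new HashMap<>();

    /** identity is whatever key the application uses to recognise the object. */
    void apply(int apparentChange, Object identity, Object newValue) {
        switch (apparentChange) {
            case ADDED:
            case MODIFIED:
                byIdentity.put(identity, newValue);   // record the latest value
                break;
            case REMOVED:
                byIdentity.remove(identity);          // forget the object
                break;
            default:
                throw new IllegalArgumentException("unknown change " + apparentChange);
        }
    }

    int size() { return byIdentity.size(); }

    Object get(Object identity) { return byIdentity.get(identity); }
}
```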
The corresponding listener interface is:
package equip2.core;
public interface IDataspaceObjectsListener {
public void objectsChanged(DataspaceObjectsEvent dose);
}
The un/registration methods in the equip2.core.IEventManagement
interface are:
public void addIDataspaceObjectsListener(IDataspaceObjectsListener listener);
public void addIDataspaceObjectsListener(IDataspaceObjectsListener listener,
Object templateValue);
public void addIDataspaceObjectsListener(IDataspaceObjectsListener listener,
Object templateValue, int realChangeMask);
public void removeIDataspaceObjectsListener(IDataspaceObjectsListener listener);
Note: these add listener methods should be called OUTSIDE of any
dataspace session, as they will create their own temporary internal
session if interest is registered in LISTENER_ADDED events.
The templateValue is
used in the same way as the argument to "match(Object template)" in the
synchronous API (above), and constrains the
listener to only be called with events for objects which match the
provided template. The default is null, i.e. match any object.
The realChangeMask is
used to select a subset of possible change events, according to their
real change type. It is a bit-mask built up from:
DataspaceObjectsEvent.OBJECT_ADDED_MASK
DataspaceObjectsEvent.OBJECT_REMOVED_MASK
DataspaceObjectsEvent.OBJECT_MODIFIED_MASK
DataspaceObjectsEvent.LISTENER_ADDED_MASK
DataspaceObjectsEvent.LISTENER_REMOVED_MASK
The default is all EXCEPT for
LISTENER_REMOVED_MASK, which is also currently unsupported (see
note above).
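As a small illustration of how such a mask is combined and tested, the sketch below uses locally defined constants. The numeric values are hypothetical (the real constants are defined by EQUIP2 and may differ); only the bit-wise OR and AND pattern is the point.

```java
/** Sketch of building and testing a real-change bit-mask. */
class ChangeMasks {
    // Hypothetical single-bit values; the real EQUIP2 constants may differ.
    static final int OBJECT_ADDED_MASK = 1;
    static final int OBJECT_REMOVED_MASK = 2;
    static final int OBJECT_MODIFIED_MASK = 4;
    static final int LISTENER_ADDED_MASK = 8;
    static final int LISTENER_REMOVED_MASK = 16;

    /** The default described in the text: everything except LISTENER_REMOVED_MASK. */
    static final int DEFAULT_MASK = OBJECT_ADDED_MASK | OBJECT_REMOVED_MASK
            | OBJECT_MODIFIED_MASK | LISTENER_ADDED_MASK;

    /** True if the change identified by changeBit is selected by mask. */
    static boolean selects(int mask, int changeBit) {
        return (mask & changeBit) != 0;
    }
}
```

For example, a listener interested only in real additions and removals would pass OBJECT_ADDED_MASK | OBJECT_REMOVED_MASK as its realChangeMask.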
NOTE: all listeners are called outside of any session: if a listener
wishes to perform any operations on the dataspace then it must begin
(and end) a new session.
NOTE: because listeners are called after the sessions which caused
the events being reported (as noted above),
the listener must NOT assume that a session created within the callback
will find the dataspace to be the same as that determined from the
events received up to the present time: there may be other dataspace
change events already scheduled for delivery to this listener but not
yet delivered. However, the system guarantees that all relevant changes
will be delivered to the listener in the order in which they occurred.
Listeners are silently unregistered if they throw an exception. The
following IEventManagement methods
allow a client to check whether a listener is currently (still)
registered:
public boolean isIDataspaceChangeListener(IDataspaceChangeListener listener);
public boolean isIDataspaceObjectsListener(IDataspaceObjectsListener listener);
In the remote dataspace case (currently only supported by PART communications) an exception may be raised by the communications attempting to deliver the notification, resulting in a listener's removal without the client's direct knowledge. In most cases clients can assume that communication is bidirectional, so the failure and unregistration of a listener is likely to coincide with the general failure of other operations performed using the same DataspaceConnection with which that listener was registered. In addition, the above operations can be used to explicitly poll the server to check for such a failure. If clients know that (in some particular application) there will be regular updates to the dataspace, then a client-side timeout between calls to the listener will alert the client to the likely failure of communications to the listener.
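The client-side timeout mentioned above can be as simple as remembering when the listener was last called. The sketch below is one hypothetical way to do that; the class ListenerWatchdog is not part of EQUIP2, and the current time is passed in explicitly (normally System.currentTimeMillis()) so that the logic is easy to test.

```java
/** Sketch: detects a listener that has not been called for too long. */
class ListenerWatchdog {
    private final long timeoutMillis;
    private long lastCallMillis;

    ListenerWatchdog(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastCallMillis = nowMillis;
    }

    /** Call this from the listener callback each time it is invoked. */
    synchronized void touch(long nowMillis) {
        lastCallMillis = nowMillis;
    }

    /** True if no callback has been recorded within the timeout. */
    synchronized boolean isOverdue(long nowMillis) {
        return nowMillis - lastCallMillis > timeoutMillis;
    }
}
```

When isOverdue reports true the client might poll the server with isIDataspaceObjectsListener, or simply re-establish the connection and register the listener again.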
It is expected that in many cases applications will define their own
inter-process communications, with semi-independent dataspaces on each
device. This allows applications to optimise their own communication
and replication, e.g. to manage network usage and cope with
intermittent connectivity. At some point in the future suggested
methods and mechanisms for doing this may be included in EQUIP2.
However, there are also situations in which a common and simple
facility to access remote dataspaces is useful, e.g. for management and
configuration, or for server-side integration. In these situations
communication bandwidth is regarded as relatively plentiful and
reliable. This is comparable to JDBC for standardised access to remote
Relational Databases. To support this the following API facilities are
provided.
Note that this is a pluggable framework, which requires
protocol-specific classes to be loaded at runtime. One implementation is a native EQUIP2
version using EQUIP2 RPC support (see EQUIP2_Remote_Dataspace_Protocol.html),
currently over TCP. The other is part of the experimental EQUIP2/PART
integration, which may be found in the part
subdirectory. The PART implementations are currently fragile, e.g. if a
process terminates during a session the system will remain deadlocked,
or if a connection is temporarily lost it will not recover (and may not
signal that loss in some cases, e.g. listeners only).
This is done via the equip2.net.DataspaceServer
class:
package equip2.net;
public class DataspaceServer {
public static DataspaceServer getDataspaceServer(String url);
public void addDataspace(String name, IDataspace dataspace);
}
One or more dataspace server objects are created using the getDataspaceServer static
method; the url specifies how the server may be contacted. For example,
URLs beginning with "equip2:"
will be passed to the equip2 implementation, which at least
supports TCP socket communication (with a different URL scheme to PART,
following J2ME, e.g. "equip2:socket://:1002"
for a server). URLs beginning with "part:"
will be passed to the IPerG PART platform implementation, which at
least supports TCP socket communication (e.g. "part:tcp://localhost:1002") and
Bluetooth communication.
Local dataspaces are then registered with the dataspace server to
make them accessible to remote processes using the addDataspace method. The name argument is used to
identify that particular dataspace within the local server, and is
required by remote clients when connecting to the dataspace.
This is done via the equip2.net.RemoteDataspace
and equip2.net.DataspaceConnection
classes:
package equip2.net;
public class RemoteDataspace {
public static RemoteDataspace getRemoteDataspace(String serverurl, String dataspacename);
public DataspaceConnection getConnection() throws IllegalArgumentException,
equip2.io.ConnectionNotFoundException, java.io.IOException;
}
The getRemoteDataspace static
method is used to obtain a RemoteDataspace
object corresponding to the remote dataspace, as identified by
the URL required to contact the dataspace server and the name of the
dataspace within that server (see above). For native EQUIP2 remote
access the dataspace URL is of the form "equip2:socket://hostname:port";
for EQUIP2/PART the remote dataspace URL is of the form "part:tcp://hostname:port".
An actual connection to the server and hence the remote dataspace is
obtained using the RemoteDataspace's
getConnection() method.
The equip2.net.DataspaceConnection
obtained implements the standard equip2.core.IDataspace
interface in the usual way, except that requests are despatched to and
handled by the remote dataspace.
Note: the overheads of network communication will make using a
remote dataspace relatively slow. Since the current dataspace
implementations do NOT allow sessions to be active concurrently you
should minimise the use of remote dataspace access and the duration
of individual sessions, or responsiveness will be lost for all users of
the dataspace (including local ones).
I need to add another facility to (probably) RemoteDataspace to
allow a remote client to list the dataspaces known to a particular
dataspace server.
Some standard/helper applications:
Other examples are the test applications:
CVSROOT :pserver:anonymous@dumas.mrl.nott.ac.uk:/mrl/src/cvsroot
Directory Equator/equip2
It will be licensed under the (new) BSD open source licence
(boilerplates not yet in place). By making any CVS contributions you
assert that that contribution can also be licensed in this way :-)