Assumptions
Many assumptions have been made in the formulation of this protocol. This
section documents the assumptions that are felt to be significant.
Hardware Assumptions
As would be expected, a huge range of devices could be
encountered as clients using this protocol. We describe three machines:
a low-end specification, a "median" specification, and the emerging
higher-capacity machines.
A "low-end" machine, which will often be encountered in existing implementations
and which may need to have the Argo protocol implemented on it
in situ, would be:
- 16MHz Z80 or HC11 8-bit processor
- 32K to 128K RAM
- 32K Flash ROM
The "typical" machine found in an external environment, requiring ruggedised
construction, would be:
- 33MHz 386EX 16 bit processor
- 1MB RAM
- 1MB Flash ROM
A "high-end" machine, starting to find its way into limited deployment
currently, would be:
- 300 MHz Pentium 32-bit processor
- RAM ???
- ROM ???
Network Assumptions
It is assumed that the network connecting the client devices to the brokers
supports TCP/IP.
All multi-byte data values will be sent in Network Order (Big-Endian).
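As an illustration of Network Order, the following minimal Python sketch encodes and decodes a 16-bit value with the standard `struct` module. The helper names `pack_u16` and `unpack_u16` are hypothetical and are not defined by this specification.

```python
import struct


def pack_u16(value: int) -> bytes:
    """Encode an unsigned 16-bit integer in Network Order (big-endian)."""
    # ">" selects big-endian byte order, "H" an unsigned 16-bit integer.
    return struct.pack(">H", value)


def unpack_u16(data: bytes) -> int:
    """Decode an unsigned 16-bit big-endian integer from exactly two bytes."""
    (value,) = struct.unpack(">H", data)
    return value
```

With this encoding the most significant byte is transmitted first, so the value 0x1234 goes on the wire as the byte 0x12 followed by 0x34.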
Often, communications will be over satellite links, incurring high per-byte
costs. Typical latency on such a link would be 600ms in each direction.
Serial communications are typically implemented as SLIP at 600 baud.
It is interesting to note that typical Service Level Agreements for
satellite communications links offer 99.9% availability, equivalent to
roughly 9 hours of downtime per year.
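The downtime figure follows directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name `annual_downtime_hours` is hypothetical):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year a link may be unavailable at the given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return HOURS_PER_YEAR * unavailable_fraction
```

For 99.9% availability this gives about 8.76 hours per year.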
The types of communications fabrics typically encountered include:
- microwave
- fibre
- leased line
- geostationary satellite (GEO)
- low earth orbit satellite (LEO)
- Asynchronous Transfer Mode (ATM)
- Radio Frequency (RF)
Software Assumptions
Embedded systems are today programmed almost exclusively in C.
Currently no production machines have sufficient CPU or memory to support
a Java Virtual Machine; however, progress in Embedded Java is being closely
monitored.
Operating System software is generally proprietary and minimalist in
nature, though there is increasing interest in making use of Linux on higher-end
systems.
It is important to remember that this is a specialised protocol for
communication from remote, "low end" devices, into a message broker. There
is no assumption that this protocol will be used for general application
programming on workstation class machines.
The Message Identifier used for QoS 1 and QoS 2 message delivery is
a rolling 16 bit value. This assumes that not more than 65,536 messages
will be "in flight", i.e. in the process of being exchanged between
sender and receiver, at any time. For more information, see the Quality
of Service section.
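A rolling 16-bit value can be sketched as a counter that wraps after 65,536 steps. The sketch below is illustrative only: the name `next_message_id` is hypothetical, and it assumes the identifier wraps from 65535 back to 0, which this section does not state explicitly.

```python
MESSAGE_ID_MODULUS = 1 << 16  # 65,536 distinct 16-bit values


def next_message_id(current: int) -> int:
    """Return the identifier that follows `current`, wrapping at 16 bits."""
    # Assumption: the counter wraps from 65535 back to 0.
    return (current + 1) % MESSAGE_ID_MODULUS
```

Because identifiers repeat after one full cycle, an identifier can only be reused safely once the earlier message carrying it is no longer in flight.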
It is assumed that messages requiring acknowledgement will "overlap",
in the sense that several messages could be sent out without waiting for
any of the responses to come back. This is in contrast to strictly "serial"
communication, where each message would have to be dealt with in
its entirety before moving on to the next.
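One way to picture overlapped acknowledgement is a table of outstanding messages keyed by Message Identifier. The class below is a hedged sketch, not part of the protocol: `InFlightTracker` and its method names are hypothetical, and the caller is assumed to supply the identifiers.

```python
from typing import Dict, List


class InFlightTracker:
    """Tracks messages sent but not yet acknowledged, keyed by Message Identifier."""

    def __init__(self) -> None:
        self._pending: Dict[int, bytes] = {}

    def sent(self, message_id: int, payload: bytes) -> None:
        """Record a message that has gone out and now awaits acknowledgement."""
        if message_id in self._pending:
            raise ValueError("Message Identifier %d is still in flight" % message_id)
        self._pending[message_id] = payload

    def acknowledged(self, message_id: int) -> bytes:
        """Remove and return the payload whose acknowledgement has arrived."""
        return self._pending.pop(message_id)

    def in_flight(self) -> List[int]:
        """Identifiers of messages still awaiting acknowledgement, in sorted order."""
        return sorted(self._pending)
```

Acknowledgements may arrive in any order; each one is matched to its message purely by identifier, so the sender never has to stall waiting for one specific response.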
All strings will be encoded using UTF-8. The
only exception is that the Topic field in a PUBLISH message
may undergo compression. See the PUBLISH section
for more details.
Quality of Service Assumptions
The assumptions made for level of assurance in message delivery are covered
in the Quality of Service section.
Scaling Assumptions
It is useful to define typical "low", "medium" and "high" sets of scaling
assumptions.
Scale | Industry Example | Typical message traffic
low | Gas | up to 120 bytes at intervals between 2 mins and 1 hour, median 15 mins
medium | Liquids | up to 120 bytes every 5 seconds per site
high | Electricity substation | messages can be milliseconds apart, with very high peak traffic
A typical system, involving a number of clients communicating
with a single broker, would comprise 500 to 1000 client devices,
with an average of 50 messages per second, peaking at 200
messages per second.
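To relate the broker-level figures to individual devices, the average gap between messages from one client can be estimated. This is a rough sketch that assumes traffic is spread evenly across clients, which real deployments may not satisfy; the function name `seconds_between_messages` is hypothetical.

```python
def seconds_between_messages(clients: int, messages_per_second: float) -> float:
    """Average gap, in seconds, between messages from a single client.

    Assumes the broker's aggregate message rate is shared evenly by all clients.
    """
    return clients / messages_per_second
```

Under that assumption, 500 to 1000 clients producing 50 messages per second in total corresponds to each client sending one message every 10 to 20 seconds on average; at the 200 messages per second peak with 1000 clients, the gap falls to 5 seconds.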
It should be noted that for different industry applications, message
arrival rates, message sizes and peak traffic volumes differ widely.
Discussion
- ASC on concurrent messages
  There is a question about the use of concurrent writes versus sequential
  blocking for guaranteed messages: i.e. do we wait for an ACK for a packet
  before sending the next one, or is it desirable to transmit a sequence
  of messages and gather ACKs asynchronously? Clearly, in cases where connections
  are charged by the second, it is preferable not to have to wait for each
  message to commit at the broker end before we get the next one into the
  comms pipeline. However, there are implications for the need for sequence
  IDs, and also for responses to other protocol messages (such as Subscribe),
  which would be resolved in different ways, based on the serialisation policy
  of the stream. For example, we would need a Subscribe-specific ACK packet
  type if there were overlapped acknowledgement, so as not to confuse a mid-stream
  Subscribe acknowledgement with the ACK from a guaranteed packet.
- Comment
  Messages can be overlapped, and hence the Message Identifier is needed.
- ASC on retries
  How often are retries attempted on failed packets? Is it on a timer, in
  which case is it a fixed period, or is it only on notification of an error
  on the communication line?
- AN comment
  It is an application-specific parameter, and no specific retry interval
  is defined by this specification. Typical values can range from hundreds of
  milliseconds to some minutes.
- ASC on UTF
  Is UTF-8 the most appropriate character encoding? This boils down to size
  of messages and hence cost, versus the National Language requirements of
  an international marketplace.
- AN comment
  Even in non-English speaking countries, strings are still specified in
  English, and so in fact ASCII with either a 0x00 terminator or a preceding
  length byte would be perfectly adequate. However, UTF-8 does not impose a
  significant overhead, and in light of IBM's NLS requirements it would be
  appropriate to use the UTF-8 encoding.
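The overhead point can be checked with a small sketch: UTF-8 encodes every ASCII character as a single byte, so English-only strings cost nothing extra, and only non-ASCII characters take additional bytes. The function name `utf8_overhead` and the sample strings are hypothetical.

```python
def utf8_overhead(text: str) -> int:
    """Extra bytes UTF-8 needs compared with one byte per character."""
    return len(text.encode("utf-8")) - len(text)


# A plain ASCII string occupies exactly one byte per character.
ascii_overhead = utf8_overhead("pump1/pressure")

# "é" is encoded as two bytes, so this string needs one extra byte.
accented_overhead = utf8_overhead("température")
```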
Last Modified: 27-Mar-99