Assumptions

Many assumptions have been made in the formulation of this protocol. This section documents the assumptions that are felt to be significant.

Hardware Assumptions

As would be expected, a huge range of devices could be encountered as clients using this protocol. We describe three machines: a low-end specification, a "median" specification, and the emerging higher-capacity machines.

A "low-end" machine, which will often be encountered in existing implementations and which may need to have the Argo protocol implemented on it in situ, would be:

16MHz Z80 or HC11 8-bit processor
32K to 128K RAM
32K Flash ROM

The "typical" machine found in an external environment, requiring ruggedised construction, would be:

33MHz 386EX 16-bit processor
1MB RAM
1MB Flash ROM

A "high-end" machine, starting to find its way into limited deployment currently, would be:

300 MHz Pentium 32-bit processor
RAM ???
ROM ???

Network Assumptions

It is assumed that the network connecting the client devices to the brokers supports TCP/IP.

All multi-byte data values will be sent in Network Order (Big-Endian).
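
As an illustration of the byte-order rule, the following is a minimal sketch of writing and reading a 16-bit value in Network Order without depending on the byte order of the host processor. The function names put_u16_be and get_u16_be are hypothetical and are not defined by this specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 16-bit value in network order,
 * most significant byte first, regardless of host byte order. */
static void put_u16_be(uint8_t *buf, uint16_t value)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(value >> 8);    /* high byte goes out first */
    buf[1] = (uint8_t)(value & 0xFFu); /* low byte follows */
}

/* Hypothetical helper: read a 16-bit network-order value back. */
static uint16_t get_u16_be(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
}
```

Shifting and masking, rather than casting a pointer to a wider type, keeps the code portable across the 8-bit, 16-bit and 32-bit processors described under Hardware Assumptions.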

Often, communications will be over satellite links, incurring high per-byte costs. Typical latency on such a link would be 600ms in each direction.

Serial communications are typically implemented as SLIP at 600 baud.
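
To give a feel for what such a link implies, the sketch below estimates how long a payload occupies the line. It rests on assumptions that this document does not state: one bit per symbol, 10 line bits per byte (one start bit, 8 data bits, one stop bit), and no allowance for SLIP framing or TCP/IP headers, so real figures will be higher. The function name serial_tx_ms is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical estimate of the time, in milliseconds, needed to clock
 * payload_bytes onto a serial line. Assumes 10 line bits per byte
 * (start + 8 data + stop) and ignores SLIP and TCP/IP overhead. */
static uint32_t serial_tx_ms(uint32_t payload_bytes, uint32_t bits_per_second)
{
    assert(bits_per_second > 0);
    /* Use a 64-bit intermediate so large payloads do not overflow. */
    return (uint32_t)(((uint64_t)payload_bytes * 10u * 1000u) / bits_per_second);
}
```

Under these assumptions, serial_tx_ms(120, 600) evaluates to 2000, i.e. a 120-byte payload would occupy a 600 baud line for about two seconds before any protocol overhead is counted.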

It is interesting to note that typical Service Level Agreements for satellite communications links offer 99.9% availability, which is equivalent to roughly 9 hours of downtime per year (0.1% of 8,760 hours is 8.76 hours).

The types of communications fabrics typically encountered include:

microwave
fibre
leased line
geostationary satellite (GEO)
low earth orbit satellite (LEO)
Asynchronous Transfer Mode (ATM)
Radio Frequency (RF)

Software Assumptions

Embedded systems are today programmed almost exclusively in C.

Currently, no production machines have sufficient CPU or memory to support a Java Virtual Machine; however, progress in Embedded Java is being closely monitored.

Operating system software is generally proprietary and minimalist in nature, though there is increasing interest in making use of Linux on higher-end systems.

It is important to remember that this is a specialised protocol for communication from remote, "low end" devices, into a message broker. There is no assumption that this protocol will be used for general application programming on workstation class machines.

The Message Identifier used for QoS 1 and QoS 2 message delivery is a rolling 16-bit value. This assumes that not more than 65,536 messages will be "in flight", i.e. in the process of being exchanged between sender and receiver, at any time. For more information, see the Quality of Service section.
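
A rolling 16-bit identifier can be sketched as a counter that wraps back to zero after 65,535. The type msg_id_gen and the function msg_id_next are hypothetical names used only for illustration. Whether any identifier value is reserved is not stated in this section, so this sketch assumes all 65,536 values are usable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator state: the next identifier to hand out. */
typedef struct {
    uint16_t next;
} msg_id_gen;

/* Return the current identifier and advance the counter.
 * Storing the result back into a uint16_t makes it wrap
 * from 65535 to 0, which gives the "rolling" behaviour. */
static uint16_t msg_id_next(msg_id_gen *g)
{
    assert(g != NULL);
    return g->next++;
}
```

Because the value rolls over, an identifier is only unambiguous while fewer than 65,536 messages are outstanding, which is the assumption stated above.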

It is assumed that messages requiring acknowledgement will "overlap", in the sense that several messages could be sent out without waiting for any of the responses to come back. This is in contrast to strictly "serial" communication, where each message would have to be dealt with in its entirety before moving on to the next.
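
One way a small client might keep track of overlapped messages is a fixed-size table of outstanding Message Identifiers, sketched below. The names inflight_table, inflight_add and inflight_ack, and the limit MAX_IN_FLIGHT, are hypothetical; this specification does not mandate any particular bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IN_FLIGHT 4 /* hypothetical limit, sized for a small device */

/* Hypothetical table of messages sent but not yet acknowledged. */
typedef struct {
    uint16_t id[MAX_IN_FLIGHT];
    bool     used[MAX_IN_FLIGHT];
} inflight_table;

/* Record a message as sent. Returns false if the table is full,
 * in which case the caller must wait for an acknowledgement. */
static bool inflight_add(inflight_table *t, uint16_t msg_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (!t->used[i]) {
            t->used[i] = true;
            t->id[i] = msg_id;
            return true;
        }
    }
    return false;
}

/* Match an acknowledgement to an outstanding message, in any order.
 * Returns false if no outstanding message has this identifier. */
static bool inflight_ack(inflight_table *t, uint16_t msg_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (t->used[i] && t->id[i] == msg_id) {
            t->used[i] = false;
            return true;
        }
    }
    return false;
}
```

A table initialised to all zeros is empty. Acknowledgements may be matched in any order, which is what distinguishes overlapped delivery from strictly serial delivery.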

All strings will be encoded using UTF-8. The only modification to this is that the Topic field in a PUBLISH message may undergo compression. See the PUBLISH section for more details.

Quality of Service Assumptions

The assumptions made for level of assurance in message delivery are covered in the Quality of Service section.

Scaling Assumptions

It is useful to define typical "low", "medium" and "high" sets of scaling assumptions.

Scale	Industry Example	Typical message traffic
low	Gas	up to 120 bytes at intervals between 2 mins and 1 hour, median 15 mins.
medium	Liquids	up to 120 bytes every 5 seconds per site
high	Electricity substation	messages can be milliseconds apart, with very high peak traffic.

A typical system, involving a number of clients communicating with a single broker, would comprise 500 to 1000 client devices, with an average of 50 messages per second, peaking at 200 messages per second.

It should be noted that for different industry applications, message arrival rates, message sizes and peak traffic volumes differ widely.

Discussion

ASC on concurrent messages: There is a question about the use of concurrent writes versus sequential blocking for guaranteed messages: i.e. do we wait for an ACK for a packet before sending the next one, or is it desirable to transmit a sequence of messages and gather ACKs asynchronously? Clearly, in cases where connections are charged by the second, it is preferable not to have to wait for each message to commit at the broker end before we get the next one into the comms pipeline. However, there are implications for the need for sequence IDs, and also for responses to other protocol messages (such as Subscribe), which would be resolved in different ways, based on the serialisation policy of the stream. For example, we would need a Subscribe-specific ACK packet type if there were overlapped acknowledgement, so as not to confuse a mid-stream Subscribe acknowledgement with the ACK from a guaranteed packet.
Comment: Messages can be overlapped, and hence the Message Identifier is needed.
ASC on retries: How often are retries attempted on failed packets? Is it on a timer, in which case is it a fixed period, or is it only on notification of an error on the communication line?
Comment: It is an application-specific parameter, and no specific retry interval is defined by this specification. Typical values can range from hundreds of milliseconds to some minutes.
ASC on UTF: Is UTF-8 the most appropriate character encoding? This boils down to the size of messages, and hence cost, versus the National Language requirements of an international marketplace.
AN comment: Even in non-English speaking countries, strings are still specified in English, and so in fact ASCII with either a 0x00 terminator or a preceding length byte would be perfectly adequate. However, UTF does not impose a significant overhead, and in light of IBM's NLS requirements it would be appropriate to use the UTF encoding.

Last Modified: 27-Mar-99