Assumptions

Many assumptions have been made in the formulation of this protocol. This section documents the assumptions that are felt to be significant.
 
 

Hardware Assumptions

As would be expected, there is a huge range of devices that could be encountered as clients using this protocol. We describe three machines: a low-end machine, a "median" specification, and the emerging higher-capacity machines.

A "low-end" machine, of the kind often encountered in existing implementations and which may need to have the Argo protocol implemented on it in situ, would be:
 

The "typical" machine found in an external environment, requiring ruggedised construction, would be:

A "high-end" machine, currently starting to find its way into limited deployment, would be:
 

Network Assumptions

It is assumed that the network connecting the client devices to the brokers supports TCP/IP.

All multi-byte data values will be sent in Network Order (Big-Endian).
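As an illustration of the byte ordering, a minimal sketch in C of writing and reading a 16-bit value in Network Order follows. The helper names put_u16_be and get_u16_be are hypothetical and are not defined by this protocol; shifting and masking are used so that the result does not depend on the byte order of the host CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write a 16-bit value into buf in Network Order,
 * i.e. most significant byte first. */
static void put_u16_be(uint8_t *buf, uint16_t value)
{
    assert(buf != NULL);
    buf[0] = (uint8_t)(value >> 8);   /* high byte goes on the wire first */
    buf[1] = (uint8_t)(value & 0xFF); /* low byte follows */
}

/* Hypothetical helper: read a 16-bit Network Order value back out of buf. */
static uint16_t get_u16_be(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
}
```

With these helpers, the value 0x1234 is sent as the byte 0x12 followed by the byte 0x34.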

Often, communications will be over satellite links, incurring high per-byte costs. Typical latency on such a link would be 600ms in each direction.

Serial communications are typically implemented as SLIP at 600 baud.
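To give a feel for what 600 baud means in practice, the sketch below estimates how long a payload takes to clock over such a link. It rests on assumptions not stated in this document: one bit per symbol (so baud equals bits per second), 10 bits on the wire per byte (one start bit, eight data bits, one stop bit), and no allowance for SLIP framing or TCP/IP headers. The function name serial_tx_time_ms is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: asynchronous framing of 1 start + 8 data + 1 stop bit. */
#define BITS_PER_BYTE_ON_WIRE 10u

/* Milliseconds needed to clock payload_bytes over a link running at
 * baud bits per second, rounded down. Ignores SLIP framing and
 * TCP/IP header overhead. */
static uint32_t serial_tx_time_ms(uint32_t payload_bytes, uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)payload_bytes * BITS_PER_BYTE_ON_WIRE * 1000u) / baud);
}
```

Under these assumptions a 120 byte payload occupies a 600 baud line for about 2 seconds (serial_tx_time_ms(120, 600) returns 2000), which is one reason message size matters so much on these links.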

It is interesting to note that typical Service Level Agreements for satellite communications links offer 99.9% availability, equivalent to roughly 9 hours of downtime per year.
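As a quick check of the downtime figure, assuming a 365-day year:

```latex
\[
(1 - 0.999) \times 365 \times 24\ \text{hours} = 8.76\ \text{hours per year}
\]
```

This rounds to the 9 hours quoted above.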

The types of communications fabrics typically encountered include:

Software Assumptions

Embedded systems are today programmed almost exclusively in C.

Currently no production machines have sufficient CPU or memory to support a Java Virtual Machine; however, progress in Embedded Java is being closely monitored.

Operating System software is generally proprietary and minimalist in nature, though there is increasing interest in making use of Linux on higher-end systems.

It is important to remember that this is a specialised protocol for communication from remote, "low end" devices, into a message broker. There is no assumption that this protocol will be used for general application programming on workstation class machines.

The Message Identifier used for QoS 1 and QoS 2 message delivery is a rolling 16-bit value. This assumes that no more than 65,536 messages will be "in flight", i.e. in the process of being exchanged between sender and receiver, at any one time. For more information, see the Quality of Service section.
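A rolling 16-bit identifier can be sketched in C as below. The type msg_id_counter and the function next_msg_id are hypothetical names for illustration. The sketch uses all 65,536 values; whether any value is reserved is a matter for the Quality of Service section and is not assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state: the last identifier handed out. */
typedef struct {
    uint16_t last_id;
} msg_id_counter;

/* Return the next Message Identifier. The result is truncated to
 * 16 bits, so 65535 is followed by 0, which gives the "rolling"
 * behaviour. */
static uint16_t next_msg_id(msg_id_counter *c)
{
    assert(c != NULL);
    c->last_id = (uint16_t)(c->last_id + 1u);
    return c->last_id;
}
```

Because the value rolls over, an identifier can only be reused safely once the earlier message carrying it is no longer in flight.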

It is assumed that messages requiring acknowledgement will "overlap", in the sense that several messages could be sent out without waiting for any of the responses to come back. This is in contrast to strictly "serial" communication, where each message would have to be dealt with in its entirety before moving on to the next.
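One way a client might keep track of overlapped messages is a small table of outstanding Message Identifiers, sketched below. The names in_flight_table, in_flight_add and in_flight_ack, and the cap of 8 entries, are assumptions made for this illustration and are not part of the protocol. Acknowledgements are matched by identifier, so they may arrive in any order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_IN_FLIGHT 8 /* hypothetical cap, sized for a small device */

typedef struct {
    uint16_t ids[MAX_IN_FLIGHT];
    bool     used[MAX_IN_FLIGHT];
} in_flight_table;

/* Mark every slot as free. */
static void in_flight_init(in_flight_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
}

/* Record msg_id as sent but not yet acknowledged.
 * Returns false if the table is already full. */
static bool in_flight_add(in_flight_table *t, uint16_t msg_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (!t->used[i]) {
            t->ids[i] = msg_id;
            t->used[i] = true;
            return true;
        }
    }
    return false;
}

/* Match an incoming acknowledgement against the table and free its slot.
 * Returns false if msg_id is not pending. */
static bool in_flight_ack(in_flight_table *t, uint16_t msg_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (t->used[i] && t->ids[i] == msg_id) {
            t->used[i] = false;
            return true;
        }
    }
    return false;
}

/* Number of messages still awaiting acknowledgement. */
static size_t in_flight_count(const in_flight_table *t)
{
    size_t n = 0;
    assert(t != NULL);
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (t->used[i]) {
            n++;
        }
    }
    return n;
}
```

A strictly "serial" sender corresponds to the special case where MAX_IN_FLIGHT is 1.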

All strings will be encoded using UTF-8. The only modification to this is that the Topic field in a PUBLISH message may undergo compression. See the PUBLISH section for more details.
 

Quality of Service Assumptions

The assumptions made for level of assurance in message delivery are covered in the Quality of Service section.
 

Scaling Assumptions

It is useful to define typical "low", "medium" and "high" sets of scaling assumptions.
 
Scale    Industry Example    Typical message traffic
low      Gas                 up to 120 bytes at intervals between 2 mins and 1 hour, median 15 mins.
medium   Liquids             up to 120 bytes every 5 seconds per site
high     Electricity         substation messages can be milliseconds apart, with very high peak traffic.

A typical system, involving a number of clients communicating with a single broker, would comprise 500 to 1000 client devices, with an average of 50 messages per second, peaking at 200 messages per second.
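The figures for a typical system can be turned into rough capacity numbers. The sketch below assumes, purely for illustration, that every message carries the 120 byte maximum quoted for the low and medium scales, and it ignores protocol header overhead. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Payload bytes arriving at the broker each second. */
static uint32_t payload_bytes_per_sec(uint32_t msgs_per_sec, uint32_t bytes_per_msg)
{
    return msgs_per_sec * bytes_per_msg;
}

/* Mean number of seconds between messages from any one client,
 * assuming traffic is spread evenly across all clients. */
static uint32_t mean_secs_between_msgs_per_client(uint32_t clients, uint32_t msgs_per_sec)
{
    assert(msgs_per_sec > 0);
    return clients / msgs_per_sec;
}
```

Under these assumptions the broker handles about 6,000 payload bytes per second on average and 24,000 at peak, and at the average rate each client sends roughly one message every 10 seconds (500 clients) to 20 seconds (1000 clients).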

It should be noted that for different industry applications, message arrival rates, message sizes and peak traffic volumes differ widely.
 


Discussion

ASC on concurrent messages
There is a question about the use of concurrent writes versus sequential blocking for guaranteed messages: i.e. do we wait for an ACK for a packet before sending the next one, or is it desirable to transmit a sequence of messages and gather ACKs asynchronously? Clearly, in cases where connections are charged by the second, it is preferable not to have to wait for each message to commit at the broker end before we get the next one into the comms pipeline. However, there are implications for the need for sequence IDs, and also for responses to other protocol messages (such as Subscribe), which would be resolved in different ways depending on the serialisation policy of the stream. For example, we would need a Subscribe-specific ACK packet type if acknowledgements were overlapped, so as not to confuse a mid-stream Subscribe acknowledgement with the ACK from a guaranteed packet.
Comment
Messages can be overlapped, and hence the Message Identifier is needed.
ASC on retries
How often are retries attempted on failed packets? Is it on a timer, in which case is it a fixed period, or is it only on notification of an error on the communication line?

AN Comment
The retry interval is an application-specific parameter, and no specific value is defined by this specification. Typical values can range from hundreds of milliseconds to some minutes.
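For the timer-driven case, a retry check might be sketched as follows. The function name retry_due and the use of a free-running 32-bit millisecond clock are assumptions for this illustration; the interval itself remains an application choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true once at least interval_ms milliseconds have passed since
 * the message was sent at sent_ms. Unsigned subtraction keeps the
 * comparison correct when the 32-bit millisecond clock wraps round. */
static bool retry_due(uint32_t now_ms, uint32_t sent_ms, uint32_t interval_ms)
{
    assert(interval_ms > 0); /* a zero interval would retry continuously */
    return (uint32_t)(now_ms - sent_ms) >= interval_ms;
}
```

A client would call this periodically for each unacknowledged message, passing whatever interval the application has chosen.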
ASC on UTF
Is UTF-8 the most appropriate character encoding? This boils down to size of messages and hence cost, versus the National Language requirements of an international marketplace.
AN Comment
Even in non-English speaking countries, strings are still specified in English, and so in fact ASCII with either a 0x00 terminator or a preceding length byte would be perfectly adequate. However, UTF-8 does not impose a significant overhead, and in light of IBM's NLS requirements it would be appropriate to use the UTF-8 encoding.




Last Modified: 27-Mar-99