Protocol exploration

Post date: Nov 9, 2019 6:38:57 PM

At this point I started to fear that the UPS I bought does not use the 'Smart' protocol that is documented everywhere on the web and supported by NUT and apcupsd. That would mean the smart UPS is effectively a dumb brick for my application, as I would have no feedback on the grid power or the remaining runtime. To verify, I wanted to look at the serial data going between the UPS and PowerChute. After fiddling with some tools, I had success with 'HHD Free Device Monitoring Studio', as it transparently shows the data and the underlying system calls to the port API while PowerChute is running. Alternatively, one could make a tap adapter such as this one.

From the captured data, it was immediately clear that a purely binary protocol is used, and that the port runs at 9600 bps with 8 data bits, no parity and 1 stop bit (8N1). A sample exported capture is attached to this post.

From the captured traffic, we can conclude that, after an initial reset (0xF7 0xFD), the host PC sends a kind of 'next' command (0xFE), after which the UPS replies with a fixed-length (19-byte) message. The serial sniffer also handily shows the ASCII representation of the byte stream, revealing a few recognizable strings, such as the serial number, the model name and the SKU. This makes it possible to deduce the message format:

[ Msg ID | 16 byte data | 2 byte checksum? ]

The first byte has to be some kind of message ID, as it increments from one message to the next, and after passing ID 0x7E the sequence restarts with the same message content. Since the readable ASCII strings sit in the 16 bytes in between, the last 2 bytes are likely some kind of checksum or CRC. Since I could not identify any type of CRC with my limited knowledge, I posted a question on Stack Overflow. One user was kind enough to conclude from the samples that it could not be a CRC.
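As a rough illustration of that layout, a captured 19-byte message could be split into its three assumed fields like this. The names `Frame` and `split_frame` are made up for this sketch, and the field boundaries are only the guess described above, not a documented format.

```python
from typing import NamedTuple

MSG_LEN = 19  # fixed message length observed in the capture


class Frame(NamedTuple):
    msg_id: int   # first byte, increments from message to message
    data: bytes   # 16 payload bytes (some contain readable ASCII)
    check: bytes  # last 2 bytes, presumed checksum


def split_frame(raw: bytes) -> Frame:
    """Split one raw 19-byte message into the guessed fields."""
    if len(raw) != MSG_LEN:
        raise ValueError(f"expected {MSG_LEN} bytes, got {len(raw)}")
    return Frame(msg_id=raw[0], data=raw[1:17], check=raw[17:19])
```

Calling `split_frame()` on each captured message makes it easy to print the payload as ASCII and line up the trailing two bytes for comparison.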

While waiting for responses, I quickly cobbled together a Python script to imitate the protocol. This seemed to work, but after passing message ID 0x7E, the UPS would simply stop responding to my 'next' commands. Comparing with the captured traffic from PowerChute, a crucial command seems to be sent to the UPS after message 0x7E. At this point I was kind of deflated. I needed a way to identify the checksum if I ever wanted to send messages back to the UPS and make the stream continue.
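The polling exchange can be sketched roughly as follows. This is not the original script, just a minimal reconstruction of the sequence seen in the capture. `poll_messages` is a hypothetical name, and `port` is assumed to be any object with `write()` and `read(n)` methods, for example a pyserial `serial.Serial` opened at 9600 baud with a read timeout. Whether the UPS sends anything in reply to the reset is not known here, so the sketch does not wait for one.

```python
RESET = bytes([0xF7, 0xFD])  # initial reset seen in the capture
NEXT = bytes([0xFE])         # 'next' command sent by the host
MSG_LEN = 19                 # fixed reply length


def poll_messages(port, count):
    """Send a reset, then request up to `count` messages.

    Stops early if a reply is shorter than 19 bytes, which is what
    a read timeout looks like when the UPS stops responding.
    """
    port.write(RESET)
    frames = []
    for _ in range(count):
        port.write(NEXT)
        raw = port.read(MSG_LEN)
        if len(raw) < MSG_LEN:
            break
        frames.append(raw)
    return frames
```

A loop like this stalls once the UPS expects the extra command after message 0x7E, since it only ever sends 'next'.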

Eventually I decided to run as many relevant Python checksum implementations as I could find on the received data, in the hope of finding a match. Since these are relatively low-end UPSs, I didn't expect 32-bit checksums. In the end, I found a match with 'Fletcher's 8-bit checksum', as implemented by hashpy.

The checksum matches if the first 17 bytes of each message are used in the Fletcher 8-bit checksum.
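For reference, a Fletcher checksum with two 8-bit running sums can be written in a few lines of plain Python. This is a sketch and not the hashpy code. It assumes the common variant with modulus 255 (often called Fletcher-16). Both the modulus and the order in which the two sums appear as the last two message bytes are assumptions that should be checked against real captures.

```python
def fletcher_8bit_sums(data: bytes, modulus: int = 255):
    """Return (sum1, sum2) of a Fletcher checksum with 8-bit sums.

    sum1 is the running sum of the bytes, sum2 the running sum of
    sum1. modulus=255 is the usual choice, assumed here.
    """
    sum1 = 0
    sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % modulus
        sum2 = (sum2 + sum1) % modulus
    return sum1, sum2


def checksum_matches(raw: bytes) -> bool:
    """Check a 19-byte message: sums over bytes 0..16 vs bytes 17..18.

    Both byte orders are accepted, since the order is an assumption.
    """
    s1, s2 = fletcher_8bit_sums(raw[:17])
    tail = raw[17:19]
    return tail == bytes([s1, s2]) or tail == bytes([s2, s1])
```

As a sanity check of the algorithm itself, `fletcher_8bit_sums(b"abcde")` gives `(240, 200)`, which corresponds to the well-known Fletcher-16 test value 0xC8F0.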

Oh, and apparently this binary protocol is referred to as Microlink. In fact, APC put up a silly explanation for switching to this proprietary protocol. But I just want to interface my UPS with a microcontroller's UART...