+++
title = "Studying a communication protocol"
summary = "Step 2: Using a shark to sniff packets"
date = "2024-05-01"
tags = ["Reverse Engineering", "Attendance Reader", "TCP", "Sniffing", "Wireshark"]
categories = ["Projects"]
series = ["Attendance Reader"]
series_order = 2
+++

In the previous article, we started studying how the attendance reader client works, and we even attempted to decompile its executable. In this article, I'd like to explore the communication protocol that the client uses to talk to the reader.

There are basically two reasons why I didn't immediately reverse-engineer the protocol:

1. If I could decompile the executable code, I could create an alternative client much more easily;
2. Sometimes it's not possible (not easily, at least) to *sniff* a communication because of [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security).

However, decompiling DLLs is far from easy because:

> There's no magic "go back" button, there's a "generate shitty C code with
> random-ass variable names" button, but that's not a very good button
>
> **fasterthanlime** in the [How does the detour crate work?](https://www.youtube.com/watch?v=aLeMCUXFJwY&t=174s) video

If you're interested, the NSA has developed its own decompiler called [Ghidra](https://ghidra-sre.org/), check it out.

## Client configuration

In the last article, we only installed the client for Windows but never opened it. Since we need a client that can actually interact with the reader to intercept the communication, I reopened my VM with [Windows 10 AME](https://archive.org/details/windows10-ame-21h1-2021-08-09/) and finished configuring the client:

{{< carousel images="images/01-client-setup/*" aspectRatio="16-9" interval="1000" >}}

Once the configuration is completed (and after manually modifying some configuration files because the client still couldn't see the reader on the network), we can request the reader's data over the network.
After opening the client **as an administrator**, pressing the button to download data, and waiting **two minutes**, a total of 3543 attendances appeared on the screen.

Something's odd: why does it take two minutes to transfer the equivalent of a file weighing just under 200 kiB? Doing some quick math:

{{< katex >}}
$$
\frac{3543\ \textrm{rows}}{120\ \textrm{seconds}} \cdot \sim 460\ \textrm{bit per row} \approx 13.26\ \textrm{kib/s}
$$

13 kibps of useful throughput on a 100 Mbps connection? ***This sucks!*** I don't want to know what disaster of Italian corporate coding could have caused this, but I have a feeling I'm about to find out...

## *The quieter you become...*

To analyze the network, I will use [Wireshark](https://wireshark.org), a very popular tool for this type of operation. After installing it and adding our user to the `wireshark` group, we can run it and begin to *sniff* all packets on our network interface.

![Wireshark in operation](images/02-wireshark-working.png "Here's Wireshark listening to all the packets circulating on my network.")

If this is your first time using a tool like this, you might notice that even in a small Local Area Network there are a lot of packets flying around, far too many to analyze individually. This is where filters come in and save the day. If we type the following string into the filter bar, followed by the reader's IP address:

```
ip.addr == 
```

We will see only packets that come *from* or are directed *to* the specified IP address. We can also restrict the view to traffic that passes through a specific TCP port by adding the port number to a second condition:

```
ip.addr == && tcp.port == 
```

Filters in Wireshark are a vast topic; here's a [link to the official documentation](https://wiki.wireshark.org/DisplayFilters) for those interested.

Once we start recording with the correct filters, we can start another full scan of attendances on the official client, and we should see the packet exchange between the client and the device in real time.
![Wireshark with the IP filter](images/03-wireshark-with-filter.png "The packets exchanged between the client and the device.")

At the end of the process, we've recorded an astonishing 14,423 packets, carrying 3,543 attendances. *Things just get stranger...*

By taking a quick look at the traffic, we can deduce a few things:

1. The transport layer uses the TCP protocol on port `5005`;
2. [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security) is not used, *phew*;
3. There are at least three phases:
   * An initial setup phase;
   * A second phase in which data is exchanged with few but large packets;
   * A third phase with many but small packets, where you can occasionally observe employee names in ASCII.

!["test" user in the ASCII box](images/04-test-name.png "A familiar name appears in the ASCII box at the bottom right.")

To study the protocol in more depth, we need only the content of the TCP packets. This is where Wireshark comes in handy again. If we right-click a packet from the TCP communication we're interested in and select `Follow` > `TCP Stream`, Wireshark will automatically extract the payload of every packet in that stream and show only the layer-7 traffic.

If we view the data as `Raw`, Wireshark will display the exchanged data in hexadecimal format, with messages sent by the client in red and responses from the attendance reader in blue. Now we can copy the payloads into our preferred text editor and start to study the protocol.

![The TCP stream shown by Wireshark](images/05-wireshark-tcp-stream.png "This is what the message exchange looks like when we open the TCP packets.")

## Fuck around and find out

Now we just need to understand the communication protocol, which, unfortunately, isn't in a text-based format like ASCII or UTF-8. It may seem complex, but it only took me an afternoon to work out enough of it for what I need to do.
### Requests

Client requests are all 16 bytes long and have this structure:

```regex
^55aa([0-9a-f]{24})([0-9a-f]{4})$
```

* The first two bytes are always `55 aa` (`01010101 10101010` in binary);
* The next 12 bytes specify the client command. I will call them "payload" from now on;
* Finally, there are two **little-endian** bytes indicating the packet number, starting from `00 00`.

I noticed that the server doesn't check whether the last two bytes are sent sequentially, so they can remain at `00 00` throughout the message exchange.

### Responses

Server responses do not have a fixed length and are divided into two parts, which I will call "header" and "payload." The header is always present and is 10 bytes long (two marker bytes plus eight bytes of data), while the payload can be absent. When there's no payload, the message acts like a kind of `null`/`ACK`.

```regex
^aa55([0-9a-f]{16})(?:55aa([0-9a-f]+))?$
```

* The first two bytes are always `aa 55` (`10101010 01010101` in binary);
* The following eight bytes are the header data. Usually, they are `01 01 00 00 00 00 00 00`, but they can change;
* If a payload is present, the message continues with `55 aa` (`01010101 10101010` in binary);
* The remaining bytes are the payload.

---

### Ping

If we want to perform a "ping" and check if the server responds, we can send a request with the payload set to `01 80 00 00 00 00 00 00 00 00 00 00`:

```
55aa0180000000000000000000000100
aa550101000000000000
```

The server will then respond with a packet without a payload and the header set to `01 01 00 00 00 00 00 00`.

### Employee name

Knowing the ID of an employee, it's possible to ask the server for their name by sending a request with a payload set to `01 c7 xx xx xx xx 00 00 00 00 14 00`, where `xx xx xx xx` is a 32-bit **little-endian** integer representing the employee ID.
```
55aa01c7xxxxxxxx0000000014000100
aa55010100000000000055aaxxxxxxxxxxxxxxxxxxxx4c0000000000595a7c7c0000
```

The first 10 bytes of the payload contain the employee's name; if it's shorter than 10 characters, the remaining space will be filled with null terminators (`\0`).

These messages make up almost the entirety of the third phase I described earlier, the one with many but small messages. This suggests that the client quickly dumps the attendance data, then spends two whole minutes downloading the employee's name **for each attendance**, even if it's been requested before. Someone should teach these developers the concept of [memoization](https://en.wikipedia.org/wiki/Memoization)...

### Total number of records

To ask for the total number of attendances registered on the device, you need to send a request with a payload of `01 b4 08 00 00 00 00 00 ff ff 00 00`:

```
55aa01b4080000000000ffff00000100
aa550101xxxx00000000
```

Where `xx xx` is the number of saved attendances represented as a 16-bit **little-endian** integer. A maximum of 65535 records seems a bit low, but I guess it's a future-me problem.

### Downloading all records

The list of all attendances must be downloaded in blocks, by repeatedly requesting 1024-byte blocks from the server (about 85.3 attendances at a time, since each record is 12 bytes long) until the entire list is extracted.

To do this, we first have to request the total number of attendances, then send a request with a payload of `01 a4 00 00 00 00 xx xx 00 00 00 04`, where `xx xx` is the total number of attendances in **little-endian**.

```
55aa01a400000000xxxx000000040100
aa55010100000000000055aa ...
```

The server will respond with a 1026-byte payload, containing the initial records followed by two zero bytes.
We can request another 1026-byte block by sending a request with a payload of `01 a4 00 00 00 00 00 00 xx xx 00 04`, where `xx xx` is a **little-endian** integer starting from `01 00`:

```
55aa01a4000000000000010000040100
aa55010100000000000055aa ...
```

Once the records are finished, the server will start sending padding bytes set to `ff` to fill the block up to 1026 bytes.

### Record structure

Once we have all the record blocks, we can break them down into individual records, each one 12 bytes long. I wasn't able to understand what all the bytes represent, but the important ones are:

```regex
..([26ae]).{5}([0-9a-f]{8})([0-9a-f]{8})
```

* The second byte's two most significant bits indicate if the record represents an entry or an exit:
  * If it's `00`, it's the first entry;
  * If it's `01`, it's the first exit;
  * If it's `10`, it's the second entry;
  * If it's `11`, it's the second exit;
* The second-to-last four bytes represent the employee ID (in **little-endian**);
* The last four bytes represent the date and time of the attendance (in **little-endian**).

Initially I thought the date was represented as a UNIX timestamp, but it seems to have this format when shown as big-endian:

* The first 6 bits represent the minutes;
* The next 5 bits represent the hours;
* The next 5 bits represent the days;
* The next 4 bits represent the months;
* The last 12 bits represent the years.

---

I suspect that the first four bytes of each attendance may contain:

* The seconds;
* The recording method (if the employee checked in with the PIN, fingerprint, or the badge);
* The recorder ID.

But since these aren't very important fields, I've decided to ignore them for now.
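
To make the framing rules above a bit more concrete, here's a minimal Python sketch of how requests could be built and how responses and records could be split. Every function and constant name here is my own invention for illustration, not something taken from the official client. The date is deliberately left as a packed 32-bit integer, because I haven't pinned down the bit order with enough confidence to decode it in code.

```python
import struct

REQUEST_MARKER = bytes.fromhex("55aa")   # opens every client request
RESPONSE_MARKER = bytes.fromhex("aa55")  # opens every server response

# Meaning of the two most significant bits of a record's second byte.
RECORD_KINDS = ("first entry", "first exit", "second entry", "second exit")


def build_request(payload: bytes, packet_number: int = 0) -> bytes:
    """Wrap a 12-byte command payload into a 16-byte request."""
    if len(payload) != 12:
        raise ValueError("the command payload must be exactly 12 bytes")
    # Marker, payload, then the packet number as a little-endian 16-bit integer.
    return REQUEST_MARKER + payload + struct.pack("<H", packet_number)


def name_request_payload(employee_id: int) -> bytes:
    """Payload asking for the name tied to a 32-bit little-endian employee ID."""
    return (
        bytes.fromhex("01c7")
        + struct.pack("<I", employee_id)
        + bytes.fromhex("000000001400")
    )


def split_response(raw: bytes) -> tuple[bytes, bytes]:
    """Return (header data, payload); the payload is empty for ACK-like replies."""
    if len(raw) < 10 or raw[:2] != RESPONSE_MARKER:
        raise ValueError("not a server response")
    header, rest = raw[2:10], raw[10:]
    if not rest:
        return header, b""
    if rest[:2] != REQUEST_MARKER:  # a payload is introduced by 55 aa
        raise ValueError("unexpected bytes after the header")
    return header, rest[2:]


def parse_record(record: bytes) -> dict:
    """Decode the fields of a 12-byte record that are understood so far."""
    if len(record) != 12:
        raise ValueError("a record must be exactly 12 bytes")
    employee_id, raw_timestamp = struct.unpack("<II", record[4:12])
    return {
        "kind": RECORD_KINDS[record[1] >> 6],
        "employee_id": employee_id,
        "raw_timestamp": raw_timestamp,  # still packed, not decoded here
    }


ping = build_request(bytes.fromhex("018000000000000000000000"), packet_number=1)
print(ping.hex())  # → 55aa0180000000000000000000000100
```

The printed ping request matches the one captured with Wireshark, which is a decent sanity check for the framing. Keep in mind that records are 12 bytes long while blocks carry 1024 bytes of data, so a record can straddle two blocks: the blocks have to be joined before slicing them into records.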
## Testing using the terminal

If you want to test communication without writing any program that sends bytes over a TCP socket, you can use some basic core utilities like `netcat` and `xxd`:

```shell
# If you're using Bash or Zsh
function send_bytes { echo -n "$3" | xxd -r -p | timeout 1 nc "$1" "$2" | xxd; }

# If you're using Fish
function send_bytes -a ip port data
    echo -n "$data" | xxd -r -p | timeout 1 nc "$ip" "$port" | xxd
end

send_bytes 127.0.0.1 5005 55aa0180000000000000000000000100
```

Trying some requests from the examples above, I can confirm everything seems to work correctly.

In the next article, we'll see how to create a small Rust library to extract data from the reader.