+++
title = "Studying a communication protocol"
summary = "Step 2: Using a shark to sniff packets"
date = "2024-05-01"

tags = ["Reverse Engineering", "Attendance Reader", "TCP", "Sniffing", "Wireshark"]
categories = ["Projects"]
series = ["Attendance Reader"]
series_order = 2
+++

In the previous article, we started studying how the attendance reader client
works and even attempted to decompile its executable. In this article, I'd like
to explore the communication protocol that the client uses to talk to the
reader.

There are basically two reasons why I didn't immediately reverse-engineer the
protocol:

1. If I could decompile the executable code, I could create an alternative
   client much more easily;
2. Sometimes it's not possible (not easily, at least) to *sniff* a
   communication because of
   [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security).

However, decompiling DLLs is far from easy because:

> There's no magic "go back" button, there's a "generate shitty C code with
> random-ass variable names" button, but that's not a very good button
>
> **fasterthanlime** in the [How does the detour crate
> work?](https://www.youtube.com/watch?v=aLeMCUXFJwY&t=174s) video

If you're interested, the NSA has developed its own decompiler called
[Ghidra](https://ghidra-sre.org/); check it out.

## Client configuration

In the last article, we only installed the client for Windows but never opened
it.

Since we need a client that can actually interact with the reader to intercept
the communication, I reopened my VM with [Windows 10
AME](https://archive.org/details/windows10-ame-21h1-2021-08-09/) and finished
configuring the client:

{{< carousel images="images/01-client-setup/*" aspectRatio="16-9"
interval="1000" >}}

Once the configuration is complete (and after manually modifying some
configuration files because the client still couldn't see the reader on the
network), we can request the reader's data over the network.

After opening the client **as an administrator**, pressing the button to
download data, and waiting **two minutes**, a total of 3543 attendances
appeared on the screen.

Something's odd: why does it take two minutes to transfer the equivalent of a
file weighing just under 200 KiB?

Doing some quick math:

{{< katex >}}
$$
\frac{3543\ \textrm{rows}}{120\ \textrm{seconds}} \cdot {\sim}460\ \textrm{bit
per row} \approx 13.26\ \textrm{Kibit/s}
$$

13 Kibit/s of useful throughput on a 100 Mbps connection? ***This sucks!***

I don't want to know what disaster of Italian corporate coding could have
caused this, but I have a feeling I'm about to find out...

## *The quieter you become...*

To analyze the network, I will use [Wireshark](https://wireshark.org), a very
popular tool for this kind of work.

After installing it and adding our user to the `wireshark` group, we can run it
and begin to *sniff* all packets on our network interface.

*[Image: Wireshark in operation]*

If this is your first time using a tool like this, you might notice that even
in a small Local Area Network there are a lot of packets flying around, far too
many to analyze individually.

This is where filters save the day. If we type the following string into the
filter bar:

```
ip.addr == <Device's IP>
```

We will see only packets that come *from* or are directed *to* the specified IP
address. We can also filter traffic that passes through a specific TCP port
with:

```
ip.addr == <IP> && tcp.port == <Port>
```

Filters in Wireshark are a vast topic; here's a [link to the official
documentation](https://wiki.wireshark.org/DisplayFilters) for those interested.

Once we start recording with the correct filters, we can start another full
scan of attendances on the official client, and we should see the packet
exchange between the client and the device in real time.

*[Image: Wireshark with the IP filter]*

At the end of the process, we've recorded an astonishing 14,423 packets,
carrying 3,543 attendances. *Things just get stranger...*

By taking a quick look at the traffic, we can deduce a few things:

1. The transport layer uses the TCP protocol on port `5005`;
2. [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security) is not used,
   *phew*;
3. There are at least three phases:
   * An initial setup phase;
   * A second phase in which data is exchanged with a few large packets;
   * A third phase with many small packets, where you can occasionally
     observe employee names in ASCII.

*[Image: "test" user in the ASCII box]*

To study the protocol in more depth, we need only the content of the TCP
packets. This is where Wireshark comes in handy.

If we select a packet from the TCP communication we're interested in,
right-click it, and select `Follow` > `TCP Stream`, Wireshark will automatically
extract the payload of all packets and show only the layer-7 traffic.

If we view the data as `Raw`, Wireshark will display the exchanged data in
hexadecimal format, with messages sent by the client in red and responses from
the attendance reader in blue.

Now we can copy the payloads into our preferred text editor and start to study
the protocol.

*[Image: The TCP stream shown by Wireshark]*

## Fuck around and find out

Now we just need to understand the communication protocol, which,
unfortunately, isn't in a text-based format like ASCII or UTF-8.

It may seem complex, but it only took me an afternoon to understand enough of
it for what I need to do.

### Requests

Client requests are all 16 bytes long and have this structure:

```regex
^55aa([0-9a-f]{24})([0-9a-f]{4})$
```

* The first two bytes are always `55 aa` (`01010101 10101010` in binary);
* The next 12 bytes specify the client command. I will call them "payload" from
  now on;
* Finally, there are two bytes holding the packet number as a
  **little-endian** integer, starting from `00 00`.

I noticed that the server doesn't check whether the packet numbers in the last
two bytes are sequential, so they can remain at `00 00` throughout the message
exchange.

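
The request layout above can be sketched in a few lines of Python. This is only my own illustration, not code taken from the official client: `build_request` and `REQUEST_MAGIC` are names I made up, and the payload used in the example is the ping payload described further down in this article.

```python
import struct

REQUEST_MAGIC = bytes.fromhex("55aa")  # every request starts with these two bytes


def build_request(payload: bytes, packet_number: int = 0) -> bytes:
    """Assemble a 16-byte request: magic + 12-byte payload + packet number."""
    if len(payload) != 12:
        raise ValueError("the payload must be exactly 12 bytes long")
    # "<H" packs an unsigned 16-bit integer in little-endian byte order.
    return REQUEST_MAGIC + payload + struct.pack("<H", packet_number)


# Ping payload, sent here as packet number 1.
ping = build_request(bytes.fromhex("01 80 00 00 00 00 00 00 00 00 00 00"), 1)
print(ping.hex())
```

Since the server doesn't seem to check the packet number, leaving the default of `0` should work just as well.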
### Responses

Server responses do not have a fixed length and are divided into two parts,
which I will call "header" and "payload." The header is always present and is
10 bytes long (the `aa 55` marker plus eight header bytes), while the payload
can be absent.

When there's no payload, the message acts as a kind of `null`/`ACK`.

```regex
^aa55([0-9a-f]{16})(?:55aa([0-9a-f]+))?$
```

* The first two bytes are always `aa 55` (`10101010 01010101` in binary);
* The following eight bytes are the header. Usually, they are
  `01 01 00 00 00 00 00 00`, but they can change;
* If a payload is present, the message continues with `55 aa` (`01010101
  10101010` in binary);
* The remaining bytes are the payload.

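
Here is a rough Python sketch of how a response could be split according to the structure above. The function and constant names are my own, and it assumes the whole response has already been read from the socket into a single `bytes` object, which TCP does not guarantee on its own.

```python
from typing import Optional, Tuple

RESPONSE_MAGIC = bytes.fromhex("aa55")  # start of every response
PAYLOAD_MAGIC = bytes.fromhex("55aa")   # marks the start of the payload, if any


def parse_response(raw: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split a raw response into (eight header bytes, payload or None)."""
    if len(raw) < 10 or raw[:2] != RESPONSE_MAGIC:
        raise ValueError("not a valid response")
    header, rest = raw[2:10], raw[10:]
    if not rest:
        return header, None  # header only: the null/ACK case
    if rest[:2] != PAYLOAD_MAGIC:
        raise ValueError("payload marker 55 aa not found")
    return header, rest[2:]


# A response without payload, like the one the server sends back to a ping.
header, payload = parse_response(bytes.fromhex("aa550101000000000000"))
print(header.hex(), payload)
```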
---

### Ping

If we want to perform a "ping" and check if the server responds, we can send a
request with the payload set to `01 80 00 00 00 00 00 00 00 00 00 00`:

```
55aa0180000000000000000000000100
aa550101000000000000
```

The server will then respond with a packet without a payload and the header set
to `01 01 00 00 00 00 00 00`.

### Employee name

Knowing the ID of an employee, it's possible to ask the server for their name
by sending a request with a payload set to
`01 c7 xx xx xx xx 00 00 00 00 14 00`, where `xx xx xx xx` is a 32-bit
**little-endian** integer representing the employee ID.

```
55aa01c7xxxxxxxx0000000014000100
aa55010100000000000055aaxxxxxxxxxxxxxxxxxxxx4c0000000000595a7c7c0000
```

If the response header is set to `01 00 00 00 00 00 00 00`, the employee was
not found. However, if it is set to `01 01 00 00 00 00 00 00`, then the first
10 bytes of the payload represent the employee's name.

If the name is shorter than 10 characters, the remaining space will be filled
with null bytes (`\0`).

These messages make up almost the entire third phase I described in the
previous section, the one with many small messages. This suggests that the
client quickly dumps the attendance data, then spends two whole minutes
downloading the employee's name **for each attendance**, even if it's been
requested before. Someone should teach these developers the concept of
[memoization](https://en.wikipedia.org/wiki/Memoization)...

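
To make this concrete, here's a small Python sketch of the name lookup plus the memoization the official client is missing. All the names are mine; `fetch_name` in particular is a hypothetical callable standing in for the actual network round trip. I'm also assuming names are plain ASCII and that any header other than `01 01 00 00 00 00 00 00` means "not found".

```python
import struct

FOUND_HEADER = bytes.fromhex("0101000000000000")


def name_request_payload(employee_id: int) -> bytes:
    """Payload: 01 c7, the ID as a little-endian u32, then 00 00 00 00 14 00."""
    return (
        bytes.fromhex("01c7")
        + struct.pack("<I", employee_id)
        + bytes.fromhex("000000001400")
    )


def decode_name(header: bytes, payload):
    """Return the employee name, or None when the lookup failed."""
    if header != FOUND_HEADER or payload is None:
        return None
    # The name sits in the first 10 bytes, padded with null bytes.
    return payload[:10].rstrip(b"\x00").decode("ascii", errors="replace")


def make_cached_lookup(fetch_name):
    """Wrap fetch_name (hypothetical network call) so each ID is asked once."""
    cache = {}

    def lookup(employee_id: int):
        if employee_id not in cache:
            cache[employee_id] = fetch_name(employee_id)
        return cache[employee_id]

    return lookup
```

With a cache like this, the number of name requests depends on the number of distinct employees instead of the number of attendances.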
### Total number of records

To ask for the total number of attendances registered on the device, you need
to send a request with a payload of `01 b4 08 00 00 00 00 00 ff ff 00 00`:

```
55aa01b4080000000000ffff00000100
aa550101xxxx00000000
```

Where `xx xx` is the number of saved attendances represented as a 16-bit
**little-endian** integer.

A maximum of 65535 attendances seems a bit low, but I guess it's a future-me
problem.

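
Reading the counter out of the response takes one `struct` call. The sketch below is my own illustration (the function name is made up), and the response in the example is a hand-built one encoding 3543, not a capture from the device.

```python
import struct

COUNT_REQUEST_PAYLOAD = bytes.fromhex("01 b4 08 00 00 00 00 00 ff ff 00 00")


def parse_record_count(response: bytes) -> int:
    """Read the 16-bit little-endian count that follows aa 55 01 01."""
    if len(response) < 10 or response[:4] != bytes.fromhex("aa550101"):
        raise ValueError("unexpected response")
    # The count sits at byte offset 4 of the response.
    (count,) = struct.unpack_from("<H", response, 4)
    return count


# 3543 is 0x0dd7, so in little-endian it is transmitted as d7 0d.
print(parse_record_count(bytes.fromhex("aa550101d70d00000000")))
```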
### Downloading all records

The list of all attendances must be downloaded in blocks, by repeatedly
requesting 1024-byte blocks from the server (about 85.3 attendances at a time)
until the entire list is extracted.

To do this, we first have to request the total number of attendances, then send
a request with a payload of `01 a4 00 00 00 00 xx xx 00 00 00 04`, where
`xx xx` is the total number of attendances in **little-endian**.

```
55aa01a400000000xxxx000000040100
aa55010100000000000055aa ...
```

The server will respond with a 1026-byte payload, containing the initial
records followed by two zero bytes.

We can request another 1026-byte block by sending a request with a payload of
`01 a4 00 00 00 00 00 00 xx xx 00 04`, where `xx xx` is a **little-endian**
block index starting from `01 00`:

```
55aa01a4000000000000010000040100
aa55010100000000000055aa ...
```

Once the records are finished, the server will start sending padding bytes set
to `ff` to reach 1026 bytes.

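
Putting the block logic together, a sketch in Python could look like this. The helper names are mine, and there's one assumption I haven't verified beyond the numbers above: since 1024 is not a multiple of 12, I assume records simply continue across block boundaries, so the blocks have to be glued together before being split into records.

```python
import math
import struct

BLOCK_SIZE = 1024  # record bytes per block (the payload adds two trailing bytes)
RECORD_SIZE = 12   # bytes per attendance record


def first_block_payload(total_records: int) -> bytes:
    """01 a4 00 00 00 00, the total count (little-endian u16), 00 00 00 04."""
    return (
        bytes.fromhex("01a400000000")
        + struct.pack("<H", total_records)
        + bytes.fromhex("00000004")
    )


def next_block_payload(block_index: int) -> bytes:
    """01 a4 00 00 00 00 00 00, the block index (little-endian u16), 00 04."""
    return (
        bytes.fromhex("01a4000000000000")
        + struct.pack("<H", block_index)
        + bytes.fromhex("0004")
    )


def blocks_needed(total_records: int) -> int:
    """Number of blocks to request, including the first one."""
    return math.ceil(total_records * RECORD_SIZE / BLOCK_SIZE)


def join_blocks(payloads, total_records: int) -> bytes:
    """Drop the two trailing bytes of each payload and trim the ff padding."""
    data = b"".join(p[:BLOCK_SIZE] for p in payloads)
    return data[: total_records * RECORD_SIZE]
```

For the 3543 attendances on my device, `blocks_needed` gives 42: one initial request followed by follow-up requests for block indexes 1 to 41.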
### Record structure

Once we have all the record blocks, we can break them down into individual
records, each one 12 bytes long. I wasn't able to understand what all the
bytes represent, but the important ones are:

```regex
..([26ae]).{5}([0-9a-f]{8})([0-9a-f]{8})
```

* The second byte's two most significant bits indicate if the record
  represents an entry or an exit:
  * If it's `00`, it's the first entry;
  * If it's `01`, it's the first exit;
  * If it's `10`, it's the second entry;
  * If it's `11`, it's the second exit;
* The second-to-last four bytes represent the employee ID (in
  **little-endian**);
* The last four bytes represent the date and time of the attendance (in
  **little-endian**).

Initially I thought the date was represented as a Unix timestamp, but it seems
to have this layout when the value is written out in big-endian order (most
significant bit first):

* The first 6 bits represent the minutes;
* The next 5 bits represent the hours;
* The next 5 bits represent the days;
* The next 4 bits represent the months;
* The last 12 bits represent the years.

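
The layout above translates into a handful of shifts and masks. This Python sketch is my own reading of it: the `Attendance` tuple and `decode_record` are made-up names, the fields are returned as raw numbers without assuming any offset, and the bytes I couldn't decode are simply skipped.

```python
import struct
from typing import NamedTuple

DIRECTIONS = ("first entry", "first exit", "second entry", "second exit")


class Attendance(NamedTuple):
    direction: str
    employee_id: int
    year: int
    month: int
    day: int
    hour: int
    minute: int


def decode_record(record: bytes) -> Attendance:
    """Decode the known fields of one 12-byte attendance record."""
    if len(record) != 12:
        raise ValueError("a record must be exactly 12 bytes long")
    # The two most significant bits of the second byte select the direction.
    direction = DIRECTIONS[record[1] >> 6]
    # Bytes 4-7: employee ID, bytes 8-11: packed date, both little-endian u32.
    employee_id, stamp = struct.unpack_from("<II", record, 4)
    return Attendance(
        direction=direction,
        employee_id=employee_id,
        minute=stamp >> 26,          # first 6 bits
        hour=(stamp >> 21) & 0x1F,   # next 5 bits
        day=(stamp >> 16) & 0x1F,    # next 5 bits
        month=(stamp >> 12) & 0x0F,  # next 4 bits
        year=stamp & 0xFFF,          # last 12 bits
    )
```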
---

I suspect that the first four bytes of each attendance may contain:

* The seconds;
* The recording method (if the employee checked in with the PIN, fingerprint,
  or the badge);
* The recorder ID.

But since these aren't very important fields, I've decided to ignore them for
now.

## Testing using the terminal

If you want to test the communication without writing a program that sends
bytes over a TCP socket, you can use common command-line utilities like
`netcat` and `xxd`:

```shell
# Usage: send_bytes <ip> <port> <hex string>
# Converts the hex string to raw bytes, sends them, and hex-dumps the reply.

# If you're using Bash or Zsh
function send_bytes { echo -n "$3" | xxd -r -p | timeout 1 nc "$1" "$2" | xxd; }

# If you're using Fish
function send_bytes -a ip port data
    echo -n "$data" | xxd -r -p | timeout 1 nc "$ip" "$port" | xxd
end

# Example: send the ping request
send_bytes 127.0.0.1 5005 55aa0180000000000000000000000100
```

Trying some requests from the examples above, I can confirm everything seems to
work correctly. In the next article, we'll see how to create a small Rust
library to extract data from the reader.