+++
title = "Re-implementing a protocol in Rust"
summary = "Step 3: Creating a library based on reverse engineering"
date = "2024-08-01"

tags = ["Library", "Attendance Reader", "TCP", "Rust"]
categories = ["Projects"]

series = ["Attendance Reader"]
series_order = 3
+++

In the previous article, we managed to understand the meaning of the packets
exchanged between the official client and the attendance reader. There is only
one thing left to do: **Rewrite the API in Rust!**

![Rewrite it in Rust](images/01-rewrite-it-in-rust.jpg)

## Recreating the Official API

To start, let's [install Rust](https://rust-lang.org/tools/install/) and create
a new project with Cargo, Rust's package manager:

```shell
cargo new r701
```

We can then open the project with our
[text editor of choice](https://neovim.io/).

Since we need to create a library, let's create the file `src/lib.rs` and start
writing the struct that will describe our reader:

```rust
// src/lib.rs

use std::io::Result;
use std::net::{TcpStream, ToSocketAddrs};

#[derive(Debug)]
pub struct R701 {
    tcp_stream: TcpStream,
    sequence_number: u16,
}

impl R701 {
    pub fn connect(connection_info: impl ToSocketAddrs) -> Result<Self> {
        // Create a new R701 struct
        let mut new = Self {
            tcp_stream: TcpStream::connect(connection_info)?,
            sequence_number: 0,
        };

        // Try to ping the endpoint
        new.ping()?;

        Ok(new)
    }
}
```

Our struct contains two fields:

* `tcp_stream`, which contains the descriptor of the connection to our reader;
* `sequence_number`, which stores the number of the last packet sent.

To test if our struct connects correctly, we can modify the file `src/main.rs`
so that it connects to our endpoint:

```rust
// src/main.rs

use r701::R701;

fn main() {
    let r701 = R701::connect("127.0.0.1:5005").unwrap();
    println!("{:?}", r701);
}
```

If we now run `cargo run`...

![Output of cargo run](images/02-tcp-working.png "Here is the client connecting")

**Hurray!** Our client successfully connects to the TCP server!

The next step will be to use
[std::net::TcpStream](https://doc.rust-lang.org/std/net/struct.TcpStream.html)
to execute the queries we derived from our attempt at
[reverse engineering](/posts/2024/05/studying-a-communication-protocol) and
to obtain and process the responses.

Since
[all requests have a standard structure](/posts/2024/05/studying-a-communication-protocol#requests),
we can create a method that takes as input the payload of a request
(a reference to an array of 12 `u8`) and returns a `Vec<u8>` containing the
response. The method relies on the `Write` and `BufRead` traits and on
`BufReader`, so all three must be imported from `std::io` at the top of the
file:

```rust { hl_lines=["6-23"] }
// src/lib.rs

impl R701 {
    // ...

    pub fn request(&mut self, payload: &[u8; 12]) -> Result<Vec<u8>> {
        // Create a blank request
        let mut request = [0x55, 0xaa, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

        // Insert the payload
        request[2..14].clone_from_slice(payload);

        // Insert the sequence number
        request[14..].clone_from_slice(&self.sequence_number.to_le_bytes());
        self.sequence_number += 1;

        // Send the request
        self.tcp_stream.write_all(&request)?;

        // Create a buffer and return the response
        let mut buffer = BufReader::new(&self.tcp_stream);
        Ok(buffer.fill_buf()?.to_vec())
    }
}
```

We can verify that everything works correctly by sending a
[ping packet](/posts/2024/05/studying-a-communication-protocol#ping) and
expecting the correct response. Note that `r701` must now be declared as
mutable, because `request()` takes `&mut self`:

```rust { hl_lines=["7-10"] }
// src/main.rs

use r701::R701;

fn main() {
    let mut r701 = R701::connect("127.0.0.1:5005").unwrap();
    assert_eq!(
        r701.request(&[0x01, 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]).unwrap(),
        [0xaa, 0x55, 0x01, 0x01, 0, 0, 0, 0, 0, 0],
    );
}
```

We could even make ping a method in our struct:

```rust { hl_lines=["6-16"] }
// src/lib.rs

impl R701 {
    // ...
    // Needs `use std::io::{Error, ErrorKind::InvalidData};` at the top of the file
    pub fn ping(&mut self) -> Result<()> {
        // Create a request with a payload of `01 80 00 00 00 00 00 00 00 00 00 00`
        let response = self.request(&[0x01, 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])?;

        // If the response is not `aa 55 01 01 00 00 00 00 00 00` then return an error
        if response != [0xaa, 0x55, 0x01, 0x01, 0, 0, 0, 0, 0, 0] {
            return Err(Error::new(InvalidData, "Malformed response"));
        }

        Ok(())
    }
}
```

In the same way, we can also create methods to obtain the
[name of an employee](/posts/2024/05/studying-a-communication-protocol#employee-name),
the
[total number of records](/posts/2024/05/studying-a-communication-protocol#total-number-of-records),
and a
[block of records](/posts/2024/05/studying-a-communication-protocol#downloading-all-records).

If you are interested, all the source code is already available at
[nicolabelluti/r701](https://git.nicolabelluti.me/nicolabelluti/r701/src/branch/main/src/r701.rs).

{{< gitea server="https://git.nicolabelluti.me" repo="nicolabelluti/r701" >}}

## Extracting Attendances via the TryInto Trait

Once we have created the method that allows us to extract a block of
attendances, we need to find an idiomatic way to transform it from an array of
bytes into a struct that represents a single attendance.

To start, let's do some *refactoring* by renaming `src/lib.rs` to `src/r701.rs`
and creating a new `src/lib.rs` containing these lines:

```rust
// src/lib.rs
mod r701;

pub use r701::R701;
```

This way, the external interface of our library will not change, but we can
organize our code into different files.

Let's add the file `src/record.rs` and include it in `src/lib.rs`:

```rust { hl_lines=[3,6] }
// src/lib.rs
mod r701;
mod record;

pub use r701::R701;
pub use record::{Record, Clock};
```

```rust
// src/record.rs

use chrono::{DateTime, Local, TimeZone};

// `Copy` lets us cast a `Clock` to `u8` later; `Debug` and `PartialEq`
// are needed to compare records with `assert_eq!` in the tests
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Clock {
    FirstIn,
    FirstOut,
    SecondIn,
    SecondOut,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Record {
    pub employee_id: u32,
    pub clock: Clock,
    pub datetime: DateTime<Local>,
}
```

With this code, we have defined the structure of a record, which, as we
mentioned in the previous article, consists of the employee ID, the date and
time it was recorded, and the state (whether it is the first entry, the first
exit, the second entry, or the second exit).

Since we don't want to
[go crazy managing time](https://www.youtube.com/watch?v=-5wpm-gesOY), let's
import the [chrono](https://crates.io/crates/chrono/) *crate* for date
management:

```shell
cargo add chrono --no-default-features --features clock
```

To facilitate the conversion from a byte slice to our `Record` struct, we can
implement the `TryFrom` trait, which also gives us
[TryInto](https://doc.rust-lang.org/std/convert/trait.TryInto.html) for free:

```rust
// src/record.rs

impl TryFrom<&[u8]> for Record {
    type Error = &'static str;

    fn try_from(record_bytes: &[u8]) -> Result<Self, Self::Error> {
        // ...
    }
}
```

The finished code is available
[here](https://git.nicolabelluti.me/nicolabelluti/r701/src/branch/main/src/record.rs#L32).

We can test if the conversion is correct through a simple test:

```rust
// src/record.rs

// ...

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn valid_record_conversion() {
        let record_bytes: &[u8] = &[0x10, 0x23, 0x0b, 0x1d, 0x01, 0, 0, 0, 0xb2, 0x17, 0x01, 0];

        assert_eq!(
            record_bytes.try_into(),
            Ok(Record {
                employee_id: 1,
                clock: Clock::FirstIn,
                datetime: Local.with_ymd_and_hms(1970, 1, 1, 0, 0, 0).single().unwrap(),
            })
        )
    }
}
```

## Putting It All Together with Iterators

Once we have found a way to extract bytes from the device and a way to convert
them into a struct, we need to find the idiomatic way to combine the two, and
this is where iterators come into play.

To implement the
[Iterator](https://doc.rust-lang.org/std/iter/trait.Iterator.html) trait, we
only need to define the `next()` method, which, starting from the first element,
returns the next element. Once this method is defined, we will have access to
many other tools, such as
[map()](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map),
[filter()](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.filter),
[fold()](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.fold),
and, if we import the [itertools](https://crates.io/crates/itertools) *crate*,
also
[sorted()](https://docs.rs/itertools/0.12.1/itertools/trait.Itertools.html#method.sorted)
and
[into_group_map_by()](https://docs.rs/itertools/0.12.1/itertools/trait.Itertools.html#method.into_group_map_by),
just to name a few.
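
To make the "define `next()`, get the rest for free" idea concrete before
applying it to the reader, here is a minimal, self-contained sketch. The
`Countdown` struct and the `sum_of_even_squares` function are hypothetical names
used only for illustration; they are not part of the r701 library.

```rust
/// Hypothetical iterator used only for illustration:
/// yields `remaining`, `remaining - 1`, ..., down to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    // `next()` is the only method we have to write ourselves
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            // Returning `None` signals that the iteration is over
            return None;
        }

        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}

/// Sums the squares of the even numbers produced by a countdown from `start`.
/// `filter()`, `map()` and `fold()` all come from the `Iterator` trait.
fn sum_of_even_squares(start: u32) -> u32 {
    Countdown { remaining: start }
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .fold(0, |acc, n| acc + n)
}
```

Calling `Countdown { remaining: 3 }.collect::<Vec<_>>()` yields `[3, 2, 1]`, even
though we never wrote a `collect()` method: the trait provides it on top of
`next()`.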

First, let's create a new struct `RecordIterator` with a `from()` constructor
that allows us to generate an iterator by taking a mutable reference to an
`R701` struct as input:

```rust { hl_lines=[4,8] }
// src/lib.rs
mod r701;
mod record;
mod record_iterator;

pub use r701::R701;
pub use record::{Record, Clock};
pub use record_iterator::RecordIterator;
```

```rust
// src/record_iterator.rs

use crate::R701;
use std::io::Result;

#[derive(Debug)]
pub struct RecordIterator<'a> {
    r701: &'a mut R701,
    input_buffer: Vec<u8>,
    sequence_number: u16,
    total_records: u16,
    record_count: u16,
}

impl<'a> RecordIterator<'a> {
    pub fn from(r701: &'a mut R701) -> Result<Self> {
        // ...
    }
}
```

The `from()` method asks the reader for the total number of records and for the
first block of attendances, saving them respectively in the `total_records`
variable and the `input_buffer` vector.

The `next()` method of the `Iterator` trait will then take the first 12 bytes
from the `input_buffer` and transform them into a `Record` struct using the
conversion that we implemented in the previous section. When the `input_buffer`
is empty, the iterator asks the reader for another block of attendances, until
all of them have been read.

If you are interested, all the code is already
[available on Git](https://git.nicolabelluti.me/nicolabelluti/r701/src/branch/main/src/record_iterator.rs).

```rust
// src/record_iterator.rs

// ...

impl<'a> Iterator for RecordIterator<'a> {
    type Item = Record;

    fn next(&mut self) -> Option<Self::Item> {
        // ...
    }
}
```

Just for completeness, we can implement an `into_record_iter` method in the
`R701` struct to simplify the use of the iterator:

```rust { hl_lines=[2,"9-11"] }
// src/r701.rs
use crate::RecordIterator;

// ...

impl R701 {
    // ...
    /// Returns an iterator over the records stored in the reader
    pub fn into_record_iter(&mut self) -> Result<RecordIterator<'_>> {
        RecordIterator::from(self)
    }
}
```

## Making Everything *Blazingly Fast*

First, let's write a `main()` that produces output with the same structure as
the `AGLog_001.txt` file we saw in the
[first chapter](/posts/2024/04/reverse-engineering-an-attendance-reader/#dumping-the-records-via-usb)
of this series:

```rust
// src/main.rs
use r701::R701;

fn main() {
    let mut r701 = R701::connect("127.0.0.1:5005").unwrap();

    println!("No\tMchn\tEnNo\t\tName\t\tMode\tIOMd\tDateTime\t");
    r701.into_record_iter()
        .unwrap()
        .collect::<Vec<_>>()
        .iter()
        .enumerate()
        .for_each(|(id, record)| {
            let name = r701
                .get_name(record.employee_id)
                .unwrap()
                .unwrap_or(format!("user #{}", record.employee_id));

            println!(
                "{:0>6}\t{}\t{:0>9}\t{: <10}\t{}\t{}\t{}",
                id + 1,
                1,
                record.employee_id,
                name,
                35,
                record.clock as u8,
                record.datetime.format("%Y/%m/%d %H:%M:%S"),
            );
        });
}
```

With this `main()`, we can obtain all the records in just under a minute, which
is half the time taken by the
[official closed-source client](/posts/2024/05/studying-a-communication-protocol/#client-configuration).

We are slightly cheating, as our client cannot extract the ID of the recorder,
the attendance recording method, and the seconds of the `DateTime` field, but
for now we can ignore them as they are superfluous fields.

### Memoizing Employee Names

To speed things up even more, we could avoid asking the reader for the name of
the employee for each record.

We can create a `HashMap` of names and, for each record, check if the name is
already present in it. If not, we can ask the reader for the employee's name and
then save it in the `HashMap`. This way, we reduce the number of requests to the
minimum required.
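
The effect of this cache is easy to check in isolation, without a reader
attached. The following is only a sketch under stated assumptions: the `lookup`
closure stands in for the slow request to the reader, and `resolve_names` and
`count_lookups` are hypothetical helpers that do not exist in the r701 library.

```rust
use std::collections::HashMap;

/// Resolves a name for every ID in `ids`, calling `lookup` at most once
/// per distinct ID. `lookup` is a stand-in for the request to the reader.
fn resolve_names<F>(ids: &[u32], mut lookup: F) -> Vec<String>
where
    F: FnMut(u32) -> String,
{
    let mut cache: HashMap<u32, String> = HashMap::new();

    ids.iter()
        .map(|&id| {
            // `lookup` only runs when the ID is not in the cache yet
            cache.entry(id).or_insert_with(|| lookup(id)).clone()
        })
        .collect()
}

/// Returns how many times the (simulated) lookup actually ran for `ids`.
fn count_lookups(ids: &[u32]) -> usize {
    let mut calls = 0;
    let _names = resolve_names(ids, |id| {
        calls += 1;
        format!("user #{id}")
    });
    calls
}
```

For the six IDs `[1, 2, 1, 1, 2, 3]`, `count_lookups` returns `3`: one lookup
per distinct employee instead of one per record. Applied to our `main()`, the
same pattern looks like this: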

```rust { hl_lines=[3,6,"16-20"] }
// src/main.rs
use r701::R701;
use std::collections::HashMap;

fn main() {
    let mut names = HashMap::new();
    let mut r701 = R701::connect("127.0.0.1:5005").unwrap();

    println!("No\tMchn\tEnNo\t\tName\t\tMode\tIOMd\tDateTime\t");
    r701.into_record_iter()
        .unwrap()
        .collect::<Vec<_>>()
        .iter()
        .enumerate()
        .for_each(|(id, record)| {
            let name = names.entry(record.employee_id).or_insert_with(|| {
                r701.get_name(record.employee_id)
                    .unwrap()
                    .unwrap_or(format!("user #{}", record.employee_id))
            });

            // ...
        });
}
```

With this simple modification, we go from obtaining all records in a minute to
obtaining them in **one second**. Now that is *blazingly fast*!

### Limiting Attendance Reading to a Certain Time Frame

Since I am only interested in the data from the last month, we can use the
[take_while()](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.take_while)
and
[skip_while()](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.skip_while)
methods to exclude all records prior to last month and to stop the iterator once
all relevant records have been extracted:

```rust { hl_lines=[4,"7-8","16-17"] }
// src/main.rs
use r701::R701;
use std::collections::HashMap;
use chrono::{Local, TimeZone};

fn main() {
    let start = Local.with_ymd_and_hms(2024, 7, 1, 0, 0, 0).unwrap();
    let end = Local.with_ymd_and_hms(2024, 8, 1, 0, 0, 0).unwrap();

    let mut names = HashMap::new();
    let mut r701 = R701::connect("127.0.0.1:5005").unwrap();

    println!("No\tMchn\tEnNo\t\tName\t\tMode\tIOMd\tDateTime\t");
    r701.into_record_iter()
        .unwrap()
        .take_while(|record| record.datetime < end)
        .skip_while(|record| record.datetime < start)
        .collect::<Vec<_>>()
        .iter()
        .enumerate()
        .for_each(|(id, record)| {
            // ...
        });
}
```

This modification does not improve performance in any way, but there is one last
very simple improvement we can apply for this specific use case...

### Reading Records in Reverse

Instead of starting from the first record ever registered and excluding all
records until we reach the first of the month we're interested in, we could read
the records in reverse, starting from the most recent one and going back to the
oldest.

This improvement requires
[a few modifications](https://git.nicolabelluti.me/nicolabelluti/r701/compare/f0ac5fe7..0dd05c0d#diff-44adb0ed617220e3fd4a4bbb2e361059ac47d9c4),
but it is worth it considering that it reduces the time from just under a second
to **0.2 seconds**!

```rust { hl_lines=["11-12",15] }
// src/main.rs

// ...

fn main() {
    // ...

    println!("No\tMchn\tEnNo\t\tName\t\tMode\tIOMd\tDateTime\t");
    r701.into_record_iter()
        .unwrap()
        .take_while(|record| record.datetime >= start)
        .skip_while(|record| record.datetime >= end)
        .collect::<Vec<_>>()
        .iter()
        .rev()
        .enumerate()
        .for_each(|(id, record)| {
            // ...
        });
}
```
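
To see why the reverse scan has less work to do, here is a small sketch that
uses plain integers in place of records and counts how many items each strategy
pulls from its source. It assumes, as the article does, that records are stored
in chronological order; `forward_scan` and `reverse_scan` are hypothetical names
that are not part of the r701 library.

```rust
/// Scans `timestamps` (sorted oldest to newest) forwards and returns the items
/// in `start..end`, together with how many items were pulled from the source.
fn forward_scan(timestamps: &[u32], start: u32, end: u32) -> (Vec<u32>, usize) {
    let mut pulled = 0;
    let selected: Vec<u32> = timestamps
        .iter()
        .copied()
        .inspect(|_| pulled += 1) // Count every item the source has to produce
        .take_while(|&t| t < end)
        .skip_while(|&t| t < start)
        .collect();

    (selected, pulled)
}

/// Same selection, but scanning from the newest item back to the oldest.
fn reverse_scan(timestamps: &[u32], start: u32, end: u32) -> (Vec<u32>, usize) {
    let mut pulled = 0;
    let mut selected: Vec<u32> = timestamps
        .iter()
        .rev()
        .copied()
        .inspect(|_| pulled += 1)
        .take_while(|&t| t >= start) // Stop as soon as we are older than `start`
        .skip_while(|&t| t >= end) // Discard what is newer than the window
        .collect();

    // Put the selection back in chronological order
    selected.reverse();
    (selected, pulled)
}
```

With the timestamps `1..=100` and the window `90..95`, both functions select
`[90, 91, 92, 93, 94]`, but the forward scan pulls 95 items while the reverse
scan pulls only 12. On the real reader the saving shows up as fewer blocks
requested over the network, so the exact gain depends on how recent the window
is.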