Skip to content

Latest commit

 

History

History
32 lines (28 loc) · 1.5 KB

File metadata and controls

32 lines (28 loc) · 1.5 KB

Readme

How to run?

To build the binary:

cargo build --release

To run the binary:

cd src && cargo run --release

You should change your location to src/ because I hard-coded the file name. Also, the predicted label for each test samples is output to the console

How to time it?

You should first build the binary, and then:

cd src && time cargo run --release

Summary

Debug build is much much slower than release build. Don’t use debug build! In v0, I implemented kNN with k=1, then use built-in sort for each test sample when k=7, it runs for about 24 seconds. In v1, I first considered using space partitioning tree, however, space paritioning tree will enforce several constraints on the distance function, so I choose a simpler method because I want the program to be simple. I choose to use a thiry-party crate that implements pattern defeating quick select algorithm, this is the go-to method when you just want to get k-largest item in a sequence but don’t want to sort it. The run time shrinks to around 14 seconds In v2, I add parallel processing to classification of each independent test samples, the program used about 6 seconds. In v3, I noticed that the classification used less than 1 second in v2, but it’s the file parsing that takes the most time, so I used the same trick in v2 to parallelize the parsing function, and the final program takes about 0.5 second to run.

History version

  • v0.rs
  • v1.rs
  • v2.rs
  • v3.rs(main.rs)