Routing is a critical part of the FPGA CAD flow. Its solution quality greatly impacts design speed and power, and its run time significantly contributes to overall compile time. As FPGA design size grows but per-core CPU performance stagnates, parallel FPGA routing algorithms that can leverage multiple compute cores become increasingly valuable. Most prior work in parallel FPGA routing uses coarse-grained approaches that route nonoverlapping nets in parallel; this work targets a complementary fine-grained form of parallelism in which the shortest-path algorithms that complete a single connection are multi-threaded. We speed up several related shortest path algorithms (Dijkstra’s, A*, and directed) by utilizing a concurrencyfriendly but weakly ordered data structure, the MultiQueue, and enhance the algorithms to compensate for its imperfect ordering of partial routings. Compared to the VTR 8+ router, these techniques achieve routing time reductions of 18.7× and 13.2× on average over the Titan benchmark suite when using Dijkstra’s and A* path search, respectively, on a 12-core CPU. These parallel algorithms achieve wirelength and critical path delay quality comparable to the serial router; they are also deterministic and serially equivalent. When applied to directed search routing, the parallel path search achieves a speed-up of 1.98× and a slightly higher quality than the serial VTR router, but is nondeterministic. Thanks to queue improvements, our router at 1-thread is 1.7× faster than VTR’s for Dijkstra’s and A* search, but comparable in run time for directed search.
I had the privilege of presenting this research paper at the International Conference on Field Programmable Technology (FPT) 2024 in the stunning city of Sydney, Australia. FPT 2024 received 69 paper submissions, and I’m honored that this work was one of the 19 papers accepted (27.5% acceptance rate)—where this paper was one of 3 papers nominated for the FPT 2024 Best Paper Award!