Avoiding Rust Deadlocks via Visualizing Lifetime
- Team Name: Song's research group at Pennsylvania State University
- Payment Address: TBD
- Level: 3
Project Overview πβ
Overviewβ
Rust is a new programming language designed to provide both safety guarantees that are like high-level languages and performance guarantees that are like low-level languages. To achieve this design purpose, Rust leverages static compiler checks to rule out severe memory and thread safety issues at the compilation time. At runtime, Rust behaves like C/C++ and could deliver performance that is as good as C/C++. Due to its safety and performance benefits, Rust has seen increasing adoption in building low-level systems software, such as OSs and browsers. Rust is also used to implement many software systems in the Web3 technology stack (e.g., substrate, polkadot, ink!).
Rust's compiler checks are based on a suite of ownership and lifetime rules. The basic rule allows one value to only have one owner variable, and the value is dropped (freed) when its owner variable ends its lifetime. Rust extends the basic rule to allow ownership to be moved and borrowed, while still guaranteeing all accesses to a value to be within its owner variable's lifetime scope. Besides safety checks, lifetime is also used for automated resource management. For example, there is no explicit Unlock() in Rust. A Lock() function call returns a reference to the protected variable, and when the reference ends its lifetime, Rust automatically releases the acquired lock (by implicitly calling Unlock()).
Rust's lifetime rules are complex and different from all other existing languages. It is challenging for Rust programmers to infer where a variable's lifetime ends. As a result, it is not uncommon for programmers to incorrectly identify the location where an implicit unlock is called. When a lock is held longer than programmers' expectation, the same lock may be acquired again or a different lock may be acquired before releasing the acquired lock, leading to a double-lock error or a lock-in-conflicting-orders error.
In our previous work [1], we conducted an empirical study on real-world Rust concurrency bugs. We inspected GitHub commit logs for five Rust applications and five Rust libraries to collect previously fixed concurrency bugs. In total, we found 37 deadlocks due to the misunderstanding of where the implicit Unlock() is called, including 30 double locks and seven locks acquired in conflicting orders. These deadlocks constitute almost all lock-related concurrency bugs (37/38) in our collection. They are all from popular Rust software systems (e.g., Servo, Parity-Ethereum, TiKV, Redox) and have severely hurt the reliability of those systems before they were fixed.
A brief description of the project. We propose to build an IDE tool for visualizing the lifetime scope of a user-selected Rust variable. We believe our tool can help Rust programmers avoid deadlocks at the development stage. After writing a piece of code involving a mutex, a programmer can select the return value of a locking operation or the locking operation itself (when the return is not saved to a variable). Our tool will visualize the lifetime scope of the return value (i.e., the critical section). The programmer can then inspect whether the end of the critical section is expected. In addition, our tool will conduct deadlock detection for the selected critical section and provide detailed debugging information for identified bugs, such as highlighting blocking operations or function calls leading to blocking operations.
How our tool will be integrated into Substrate/Polkadot? Both Substrate and Polkadot are implemented in Rust. Previously, double locks or locks in conflicting orders were fixed in Substrate [2, 3]. After applying our prototype, we identified four previously unknown double locks in Substrate or the dependent libraries of Substrate/Polkadot. We reported detected bugs. All of them were confirmed and fixed by developers [4, 5, 6]. We believe our tool will effectively prevent Substrate/Polkadot programmers from making similar mistakes and other types of mistakes our tool will reveal.
Why are we interested in creating this project? We are interested in building the tool due to three reasons. First, our previous empirical study shows that deadlocks due to the misunderstanding of Rust's lifetime rules are common in Rust programs. Visualizing lifetime can avoid these bugs during the development stage, benefiting the whole Rust community. Second, the misunderstanding of Rust's lifetime rules can also cause memory bugs such as use-after-free and double free. Thus, the proposed tool has the potential to combat memory bugs. Third, the experience of building the proposed tool can inspire similar tools for other programming languages featuring lifetime (e.g., Kotlin).
[1] Boqin Qin, Yilun Chen, Zeming Yu, Linhai Song, and Yiying Zhang. βUnderstanding Memory and Thread Safety Practices and Issues in Real-World Rust Programs.β In PLDI'2020.
[2] https://github.com/paritytech/substrate/pull/197
[3] https://github.com/paritytech/substrate/pull/6225/commits/61e3b8d53674687790d2b30bc450cd59e09f563d
[4] https://github.com/paritytech/parity-db/pull/8
[5] https://github.com/paritytech/substrate/pull/6277
[6] https://github.com/paritytech/parity-common/pull/396
Project Detailsβ
What have we already done? We have built a prototype of the proposed tool. Our prototype can visualize a selected variable and conduct double-lock detection. We published a demonstration paper at CCS'2020 to describe the prototype. The paper can be found here: https://songlh.github.io/paper/vr.pdf. We also recorded a video to explain the prototype, and the video can be found here: https://youtu.be/L5F_XCOrJTQ.
We applied the double-lock detection component to Substrate, Polkadot, and ink!. We found four previously unknown deadlocks. One is in Substrate. The other three are in the dependent libraries of Substrate or Polkadot. We reported all the detected bugs. All of them were fixed by developers based on our reporting. The information of the detected bugs is listed as follows:
[PR-1] https://github.com/paritytech/parity-db/pull/8
[PR-2] https://github.com/paritytech/substrate/pull/6277
[PR-3] https://github.com/paritytech/parity-common/pull/396
What are we going to do? We propose to extend the prototype along four directions:
First, we will integrate our existing implementation of lifetime computation and deadlock detection to rust-analyzer, so that our proposed technique can easily cooperate with different text editors.
Second, we will detect more types of deadlock bugs. Specifically, we will add the detection of locks with conflicting orders and misuse of mutex and non-mutex synchronization primitives (e.g., channel, conditional variable).
Third, we will identify and visualize more blocking operations that can potentially lead to deadlocks in a selected critical section such as receiving from a channel and waiting on a conditional variable.
Fourth, we will implement the visualization functionality by parsing the analysis results generated by rust-analyzer in a text editor (i.e., VS Code) and document our tool.
Ecosystem Fitβ
There is no existing project similar to ours.
Team π₯β
Team membersβ
- Name of team leader: Linhai Song
- Names of team members: Linhai Song, Yiying Zhang, Ziyi Zhang
Contactβ
- Contact Name: Linhai Song
- Contact Email: songlh@ist.psu.edu
- Website: https://songlh.github.io/
Legal Structureβ
- Registered Address: Information of our legal structure will be disclosed privately.
- Registered Legal Entity: Pennsylvania State University
Team's experienceβ
The team conducted an empirical study on memory bugs and concurrency bugs in real-world Rust programs. The study was published in PLDI'2020. Through this project, the team has built a comprehensive understanding of common errors made by programmers when coding Rust.
The team built a prototype for the proposed tool. The prototype was published in the demonstration track of CCS'2020, demonstrating the team's capability to build the proposed technique.
The team has another research paper on understanding concurrency bugs in Go published in ASPLOS'2019. The team has more than three years' research experience on concurrency bugs.
Team member Linhai Song has 10 years of expertise in programming analysis, and has published at top programming languages and software engineering conferences (e.g., PLDI, ICSE, FSE, OOPSLA).
Team member Yiying Zhang has conducted various systems research with papers published at OSDI and SOSP.
Team Code Reposβ
- LDoctor (ICSE '17)
- Rust-Study (PLDI '20)
- LegoOS (OSDI '18)
- pDPM (ATC '20)
- Pythia (USENIX SEC '20)
- LITE (SOSP '17)
- Hotpot (SoCC '17)
Please also provide the GitHub accounts of all team members. If they contain no activity, references to projects hosted elsewhere or live are also fine.
Team LinkedIn Profiles (if available)β
Development Status πβ
We have built a prototype of the proposed tool. We wrote one paper and recorded one video to describe the prototype.
We applied the bug detection component of the prototype to Substrate, Polkadot, and ink!. We found four previously unknown deadlocks. We reported all the detected bugs and all of them were fixed based on our reporting [PR-1, PR-2, PR-3].
[PR-1] https://github.com/paritytech/parity-db/pull/8
[PR-2] https://github.com/paritytech/substrate/pull/6277
[PR-3] https://github.com/paritytech/parity-common/pull/396