Breakthroughs in the State Key Laboratory of Processors: Automated Processor Design

Date: Jul 28, 2024

Processor design is a highly challenging and labor-intensive task. In the conventional CPU design flow, a team of talented engineers uses formal programming languages (e.g., Verilog, Chisel, or C/C++) to implement the circuit logic of a CPU from its design specifications. Billions of dedicated test cases, each pairing inputs with their expected outputs, are then developed to verify the functionality of the circuit logic and to support manual debugging. The conventional design flow therefore remains highly complex and requires substantial manual effort in hand-written coding.

The goal of automated processor design is to automatically generate processor logic that meets functional and performance requirements. By removing tedious manual programming, automated processor design can significantly accelerate design iterations. Ever since A. Church, one of the founders of computer science, posed the "Church problem" in 1957, automated processor design has been a long-term vision in artificial intelligence. However, because processor circuits demand very high accuracy and have a very large design space, state-of-the-art automated methods have been unable to guarantee this accuracy even on much easier small-scale cases. As a result, processor logic has so far been designed only by human experts, which has become the efficiency bottleneck of the entire design process.

Led by Prof. Chen Yunji, director of the State Key Laboratory of Processors, the team focuses on solving these accuracy and scale challenges. They proposed a verification-centered approach to automated processor design: starting from random circuits, a machine-learning method automatically iterates through verification, debugging, and repair until it obtains a target circuit that meets all design requirements. Each iteration can improve the design quality.
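The shape of such a verify, debug, and repair loop can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions, not the team's method: the "circuit" is a plain truth table, the requirement is a hypothetical 3-input parity function, and the repair step simply overwrites a bounded number of failing entries per round. All names (`spec`, `verify`, `design`, `repairs_per_round`) are invented for this sketch.

```python
import random
from itertools import product

def spec(bits):
    """Hypothetical design requirement: 3-input parity."""
    return bits[0] ^ bits[1] ^ bits[2]

# Test cases pair each input with its expected output.
TESTS = [(bits, spec(bits)) for bits in product((0, 1), repeat=3)]

def verify(circuit):
    """Return the test cases that the candidate circuit currently fails."""
    return [(x, y) for x, y in TESTS if circuit[x] != y]

def design(seed=0, repairs_per_round=2):
    """Start from a random truth table and repair it round by round."""
    rng = random.Random(seed)
    circuit = {x: rng.randint(0, 1) for x, _ in TESTS}  # random starting circuit
    failures_per_round = []
    while True:
        failing = verify(circuit)                        # verification
        failures_per_round.append(len(failing))
        if not failing:
            return circuit, failures_per_round
        for x, y in failing[:repairs_per_round]:         # bounded repair step
            circuit[x] = y

circuit, failures_per_round = design()
print(failures_per_round)
```

Because a repair in this toy never breaks a passing test, the number of failures shrinks every round until verification succeeds. A real flow would repair a compact circuit representation rather than a full truth table, which is impractical at CPU scale.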

Specifically, the team formalized the automatic processor design problem as a large-scale logic representation problem learned from input/output examples, which are easily obtained from the large number of legacy or automatically generated test cases. To address the accuracy challenge, the team proposed to automatically design CPU-scale circuit logic with a novel graph structure called the Binary Speculation Diagram (BSD). A BSD is an approximate form of the well-known Binary Decision Diagram (BDD). It has been proved theoretically that as a BSD is expanded, its accuracy increases gradually up to 100%. Therefore, by starting from a compact approximation and continuously expanding the BSD until the strict accuracy constraints are met, the automated design flow iteratively arrives at a verified implementation of a large-scale circuit. This method automatically designed a general-purpose processor with over 4 million logic gates, named Enlightenment-1, within 5 hours, increasing the circuit scale by about 1000x over the state of the art. Enlightenment-1 is the world's first processor chip designed fully automatically, without human intervention. It successfully runs the Linux operating system, and its Dhrystone performance is comparable to that of the Intel 486. The related paper, 'Automated CPU Design by Learning from Input-Output Examples', has been accepted by IJCAI 2024.
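The idea that accuracy grows as the diagram is expanded can be illustrated with a toy sketch. The code below is not the team's BSD algorithm. It is a simplified decision tree built by Shannon expansion in which every unexpanded leaf "speculates" a constant output, namely the majority output of the examples that reach it. The target function, the worst-leaf-first expansion order, and all names are assumptions made for illustration. Unlike a BDD, the tree shares no nodes.

```python
from itertools import product

def target(bits):
    """Hypothetical 3-input function standing in for the circuit to learn."""
    return (bits[0] & bits[1]) ^ bits[2]

# Input/output examples; assumed consistent (one output per input).
EXAMPLES = [(bits, target(bits)) for bits in product((0, 1), repeat=3)]

def leaf_errors(examples):
    """Examples a leaf gets wrong when it speculates the majority output."""
    ones = sum(y for _, y in examples)
    return min(ones, len(examples) - ones)

def expand_until_exact(examples):
    """Expand the worst leaf first; record accuracy after every step."""
    # A leaf is (index of the next variable to split on, examples reaching it).
    leaves = [(0, examples)]
    accuracy = []
    while True:
        errors = sum(leaf_errors(ex) for _, ex in leaves)
        accuracy.append(1 - errors / len(examples))
        if errors == 0:
            return leaves, accuracy
        worst = max(range(len(leaves)), key=lambda i: leaf_errors(leaves[i][1]))
        depth, ex = leaves.pop(worst)
        # Shannon expansion: split the leaf on input variable `depth`.
        leaves.append((depth + 1, [(x, y) for x, y in ex if x[depth] == 0]))
        leaves.append((depth + 1, [(x, y) for x, y in ex if x[depth] == 1]))

leaves, accuracy = expand_until_exact(EXAMPLES)
print(accuracy)
```

Splitting a leaf can never increase its error count, so the recorded accuracy is non-decreasing and reaches 1.0 once every leaf contains examples with a single output. The sketch measures accuracy only on the given examples and says nothing about how the actual BSD keeps the representation compact at CPU scale.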

To further improve the performance of automatically designed processors, the team proposed an automatic pipelining method based on gate-level dependency analysis. Data dependency analysis is key to pipeline design. Conventional data dependency analysis can only be performed at a high level and so overlooks some opportunities in the netlist structure; this method instead performs data flow analysis automatically at the fine-grained gate level. Based on this analysis, a fine-grained pipeline control unit is constructed with the help of the proposed BSD. While preserving design functionality, gate-level forwarding and speculation improve program execution efficiency, achieving an average performance improvement of 1.57x. More importantly, in some cases the method finds better pipeline designs than human designers, with an average throughput improvement of 31%. The related paper, 'Revisiting Automatic Pipelining: Gate-level Forwarding and Speculation', has been accepted by DAC 2024.
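To make the difference between word-level and gate-level dependency analysis concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the method from the paper: the netlist is a hypothetical 2-bit adder written as a dictionary, and `support` simply computes the set of primary inputs in each signal's transitive fan-in.

```python
from functools import lru_cache

# Hypothetical gate-level netlist of a 2-bit adder (carry-out omitted).
# Each entry maps a gate output to (gate type, input signals).
NETLIST = {
    "sum0":  ("XOR", ("a0", "b0")),
    "carry": ("AND", ("a0", "b0")),
    "half1": ("XOR", ("a1", "b1")),
    "sum1":  ("XOR", ("half1", "carry")),
}

@lru_cache(maxsize=None)
def support(signal):
    """Set of primary inputs that `signal` transitively depends on."""
    if signal not in NETLIST:  # not driven by any gate: a primary input
        return frozenset([signal])
    _, fan_in = NETLIST[signal]
    return frozenset().union(*(support(s) for s in fan_in))

for out in ("sum0", "sum1"):
    print(out, sorted(support(out)))
```

A word-level view would record only that the 2-bit sum depends on both operands. The gate-level view shows that `sum0` depends on `a0` and `b0` alone, so a consumer that needs only that bit does not have to wait for the upper operand bits. Finer-grained facts of this kind are what gate-level forwarding could exploit, although the paper's actual analysis and control logic are not reproduced here.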

Related links:

[1] Cheng, S. et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2306.12456 (2023)

