## Education

### **Georgia Institute of Technology** | Atlanta, GA

Ph.D. in Computer Science | Sep 2021 - Now

- Advisor: Prof. Hyesoon Kim; Concentration: Computer Architecture

### **University of Michigan** | Ann Arbor, MI

Bachelor of Science in Computer Engineering | Sep 2018 - May 2020

### **Shanghai Jiao Tong University (SJTU)** | Shanghai, China

Bachelor of Science in Electronic and Computer Engineering | Sep 2016 - May 2018

- Dual degree program at University of Michigan and SJTU

## Research Projects

### **Modeling Offload Decisions for Computational Storage** | Mar 2022 - Now

Advisor: Prof. Hyesoon Kim | Atlanta, GA

- Researched heterogeneous system scheduling with a focus on computational storage
- Designed a performance model and profiling mechanism for the Samsung SmartSSD
- Studied SSD internal architecture and CXL technology to exploit the potential performance benefits of accelerator integration

### **Dynamic Graph Processing with Processing-in-Memory (PIM)** | Sep 2021 - Feb 2022

Advisor: Prof. Hyesoon Kim | Atlanta, GA

- Researched graph-related memory enhancements and PIM simulators
- Developed utility tools to profile dynamic graph benchmarks with MultiPIM and studied the effects of dataset size, graph partitioning, and thread allocation

### **Tail Latency Optimization Leveraging Programmable NIC** | Sep 2021 - Dec 2021

Advisor: Prof. Alexandros Daglis | Atlanta, GA

- Developed multi-threaded RDMA benchmarks with OpenMP to emulate request service time and measure end-to-end request latency
- Studied the relationship between service time distribution and tail latency using load-latency graphs

### **Cross Page Translation Speculation** | Aug 2019 - Apr 2020

Advisor: Prof. Trevor Mudge | Ann Arbor, MI

- Researched state-of-the-art memory architectures, including 3D-stacked last-level caches, page table walk management units, virtualization, and processing-in-memory technology
- Proposed a speculative Translation Lookaside Buffer (TLB) and nearest-entry TLB translation prediction that exploit contiguity in physical address allocation
- Profiled graph benchmarks of varying graph sizes with Linux perf, analyzing TLB and page table walk performance in both host and virtualized environments
- Customized the DynamoRIO memory system simulator to model the speculative TLB
- Work published in SAMOS 2020: "CoPTA: Contiguous Pattern Speculating TLB Architecture"

### **Reconfigurable Error Correction Code (ECC) Accelerator** | Jan 2019 - Apr 2020

Advisor: Prof. Hun-Seok Kim | Ann Arbor, MI

- Studied and implemented ECC algorithms including Polar, LDPC, and Turbo/Viterbi
- Proposed a PE-memory interconnect architecture for general belief-propagation ECC algorithms with deterministic control patterns; mapped three ECC algorithms onto the same proposed architecture
- Simulated the architecture in MATLAB to evaluate decoding bit error rate performance; implemented it in Verilog to evaluate decoder power, frequency, and throughput
- Work published in ISLPED 2022: "A Unified Forward Error Correction Accelerator for Multi-Mode Turbo, LDPC, and Polar Decoding"

## Work Experience

### **Systems Technology Research Intern** | San Jose, CA

Samsung Semiconductor | *May 2022 - Aug 2022*

- Studied Samsung's SmartSSD prototype product and designed a performance model that predicts acceleration performance based on device experiments
- Designed and developed a tool with Intel Pin to divide workloads into separate regions and profile them for offload performance prediction

### **Hardware Engineer** | Palo Alto, CA

SambaNova Systems | *Jul 2020 - Aug 2021*

- Designed and optimized hardware programming for layer normalization on the RDU accelerator platform with in-house assembly and integrated the template across the software stack
- Implemented DDR shim RTL for a new chip generation; adapted DDR4 to dual-channel DDR5
- Optimized the DDR shim microarchitecture: proposed fine-grained credit management to adapt to more requesters; proposed a FIFO and arbitration structure to improve shim response throughput
- Automated performance analysis of parallelism in convolution templates with Python; collected throughput and resource utilization data and analyzed patterns for optimal resource efficiency

## Projects

### **Defending Against Contention-Based Side-Channel Attacks in On-Chip Networks** | Oct 2021 - Mar 2022

- Customized Garnet from gem5 to reproduce network-contention-based attacks
- Implemented a defense and analyzed its performance tradeoffs

### **R10K-Style Out-of-Order RISC-V Processor** | EECS 470 | Jan 2019 - May 2019

- Led a team of five in defining the high-level pipeline module abstraction
- Implemented an arbitrary-way superscalar R10K-style out-of-order processor in SystemVerilog with Synopsys tools, running at 127 MHz with a CPI of 2.0
- Analyzed the performance tradeoffs of victim cache design, superscalar width, and load/store queue design in terms of clock period and CPI

## Experiences

### **Teaching Assistant** | Computer Architecture | Sep 2019 - May 2020

- Coordinated course projects and held office hours
- Assigned and graded homework and exam problems

### **Writing Consultant** | *English Writing Center* | Aug 2017 - Aug 2018

- Advised SJTU students on writing topics including thesis development, structure building, and use of language
- Attended weekly training with the consultant group to improve peer communication skills
- Designed workshops on writing topics such as database searching

## Leadership

### **Odyssey of the Mind Student Club** | May 2017 - Aug 2018

President | Shanghai, China

- Led 10+ members in the creativity competition "Odyssey of the Mind"
- Guided weekly brainstorming events
- Designed, built, and tested 3 functional prototype vehicles based on an Arduino microcontroller
- Developed and tuned a PID control algorithm that uses a gyroscope to control the vehicle's trajectory

## Skills

**Software Programming Languages**: C/C++, Python, Shell, LaTeX, MATLAB, ARM Assembly

**Hardware Programming Languages**: Verilog HDL, SystemVerilog, Tcl

**Developer Tools**: LLVM, VS Code, Cadence, Synopsys, Xilinx Vivado/Vitis, Jupyter Notebook

**Architecture Tools**: Intel Pin, gem5 (Garnet), MultiPIM, Ramulator PIM, DynamoRIO