Yuxuan Zhang

I'm a fifth-year PhD student in the CIS department at Penn. My advisor is Sebastian Angel.

I'm broadly interested in bridging the gap between data center applications and server processors through an intersection of techniques in OS, compilers, and hardware.

I currently build systems that leverage both hardware and software techniques to improve application performance at runtime.

Here are my research statement and my resume.

Education

PhD in Computer and Information Science, University of Pennsylvania [...]
- OCOLOS: Online COde Layout OptimizationS
  Built a tool that optimizes the code layout of datacenter applications at runtime in three steps: profiling and analyzing the application, producing an optimized binary, and inserting the machine code from the optimized binary into the target application process.
- RPG2: Robust Profile-Guided Runtime Prefetch Generation
  Built an online data cache prefetching system that profiles and analyzes data memory access behavior, then decides whether and where to insert prefetch instructions into the running process.
- Merging serverless functions
  Building a feedback-driven serverless runtime that merges caller-callee serverless functions into the same address space, based on the call frequency and the call graph generated from distributed tracing, in order to reduce RPC overhead.
MS in Electrical Engineering, University of Michigan, Ann Arbor [...]
- Two-way superscalar R10K Out-of-Order processor
  - Implemented a 2-way set-associative, non-blocking, write-back data cache and its cache controller, which tracks the status of outstanding cache misses.
  - Implemented key components of the OoO processor, including the Reservation Station, hardware register map table, Reorder Buffer, and Load-Store Queue.
  - Modified visual debugging tools and redesigned the testbench to support performance analysis of the OoO processor.
- Design and Verify a Cache Coherency Protocol
  - Designed an invalidation-based MOESI cache coherence protocol with self-downgrade for a multicore memory system, and verified it with the enumerative model checker Murphi.
- Wikipedia Search Engine
  - Built a scalable search engine that supports information retrieval based on both tf-idf and PageRank scores.
  - Indexed webpages with the Hadoop MapReduce framework to scale to large corpus sizes.
  - Built a new search engine interface with two special features: user-driven scoring and summarization.
BS in Electrical Engineering, Harbin Institute of Technology

Publications

OCOLOS: Online COde Layout OptimizationS
[paper] [code] [slides]
Y. Zhang, T. A. Khan, G. Pokam, B. Kasikci, H. Litz, J. Devietti.
Proc. International Symposium on Microarchitecture (MICRO), Oct. 2022.
Online COde Layout OptimizationS via Ocolos
[paper]
Y. Zhang, T. A. Khan, G. Pokam, B. Kasikci, H. Litz, J. Devietti.
IEEE Micro "Top Picks From the 2022 Computer Architecture Conferences", May 2023.
RPG2: Robust Profile-Guided Runtime Prefetch Generation
[paper] [code] [slides] [poster]
Y. Zhang, N. Sobotka, S. Park, S. Jamilan, T. A. Khan, B. Kasikci, G. Pokam, H. Litz, J. Devietti.
Proc. International Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), May 2024.

Employment History

Software Engineer Intern, VMware [...]
- Monitor Team, Boston, 05.2022 - 08.2022
- Prevalidation during Pre-copy of memory pages
Research Intern, Microsoft Research Asia [...]
- Network Research Group, Beijing, 01.2018 - 07.2019
- GLane on GPU
Software Engineer Intern, NVIDIA

Miscellaneous

I enjoy writing fan fiction (website in Chinese), and my self-published fan fiction has sold more than 1,100 copies in China.
I'm also learning Japanese in the UPenn Japanese Language Program. I have passed JLPT N3, and this year I'm aiming for N2.
I studied Chinese dance for 7 years and played the piano for 5 years, and received certificates from the Beijing Dance Academy and the Central Conservatory of Music.