Published in 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC), 2014
Delays;Regulators;Fuzzy logic;Network-on-chip;Throughput;Pragmatics;Multiprocessor interconnection;Network-on-Chip;Chip Multiprocessor;Flow Regulation;Fuzzy Logic
Recommended citation: Y. Yao and Z. Lu, "Fuzzy Flow Regulation for Network-on-Chip based Chip Multiprocessors systems," 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC), Singapore, 2014, pp. 343-348, doi: 10.1109/ASPDAC.2014.6742913.
Download Paper
Published in 2014 Eighth IEEE/ACM International Symposium on Networks-on-Chip (NoCS), 2014
Stochastic processes;Delays;Interference;Calculus;Analytical models;Servers;System-on-chip
Recommended citation: Z. Lu, Y. Yao and Y. Jiang, "Towards stochastic delay bound analysis for Network-on-Chip," 2014 Eighth IEEE/ACM International Symposium on Networks-on-Chip (NoCS), Ferrara, Italy, 2014, pp. 64-71, doi: 10.1109/NOCS.2014.7008763.
Download Paper
Published in 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2016
DVFS, Multi-core
Top conference publication-HPCA
Recommended citation: Y. Yao and Z. Lu, "DVFS for NoCs in CMPs: A thread voting approach," 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), Barcelona, Spain, 2016, pp. 309-320, doi: 10.1109/HPCA.2016.7446074.
Download Paper
Published in 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2016
Switches;Resource management;Delays;Load modeling;Nickel;Tuning;Benchmark testing
Recommended citation: Y. Yao and Z. Lu, "Memory-access aware DVFS for network-on-chip in CMPs," 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, 2016, pp. 1433-1436.
Download Paper
Published in 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016
Critical Section; CMP; NoC; OS
Top conference publication-ISCA
Recommended citation: Y. Yao and Z. Lu, "Opportunistic Competition Overhead Reduction for Expediting Critical Section in NoC Based CMPs," 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul, Korea (South), 2016, pp. 279-290, doi: 10.1109/ISCA.2016.33.
Download Paper
Published in ACM Transactions on Architecture and Code Optimization (TACO), Volume 13, Issue 4 Article No.: 53, Pages 1 - 27, 2016
computer architecture, performance fairness, quality of service
Recommended citation: Zhonghai Lu and Yuan Yao. 2016. Aggregate Flow-Based Performance Fairness in CMPs. ACM Trans. Archit. Code Optim. 13, 4, Article 53 (December 2016), 27 pages. https://doi.org/10.1145/3014429
Download Paper
Published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Volume: 25, Issue: 2, February 2017), 2017
Delays;System performance;IP networks;Nickel;Regulators;Calculus;Network-on-chip;Chip multi/many-core processor (CMP);fuzzy control;multi/many-processor systems-on-chip (MPSoC);network-on-chip (NoC);traffic engineering
Recommended citation: Z. Lu and Y. Yao, "Dynamic Traffic Regulation in NoC-Based Systems," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 25, no. 2, pp. 556-569, Feb. 2017, doi: 10.1109/
Download Paper
Published in IEEE Transactions on Computers (Volume: 66, Issue: 11, 01 November 2017), 2017
Energy efficiency;Power demand;Measurement;Benchmark testing;Network-on-chip;Program processors;Performance evaluation;power efficiency;DVFS;network-on-chip (NoC);CMP
Top transaction publication-TC
Recommended citation: Z. Lu and Y. Yao, "Marginal Performance: Formalizing and Quantifying Power Over/Under Provisioning in NoC DVFS," in IEEE Transactions on Computers, vol. 66, no. 11, pp. 1903-1917, 1 Nov. 2017, doi: 10.1109/TC.2017.2715018.
Download Paper
Published in 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2018
Instruction sets;Spinning;Liquid crystal on silicon;Coherence;Acceleration;Routing protocols;In Network Packet Generation;Critical Section;Synchronisation Primitive;Cache Coherency;Network on Chip;CMP
Top conference publication-HPCA
Best-paper candidate
Recommended citation: Y. Yao and Z. Lu, "iNPG: Accelerating Critical Section Access with In-network Packet Generation for NoC Based Many-Cores," 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), Vienna, 2018, pp. 15-26, doi: 10.1109/HPCA.2018.00012.
Download Paper
Published in IEEE Transactions on Computers (Volume: 67, Issue: 10, 01 October 2018), 2018
Measurement;Message systems;System-on-chip;Instruction sets;Voltage control;Load modeling;Power system management;Chip manycore processor (CMP);DVFS;network on chip (NoC);power/energy efficiency
Top transaction publication-TC
Recommended citation: Z. Lu and Y. Yao, "Thread Voting DVFS for Manycore NoCs," in IEEE Transactions on Computers, vol. 67, no. 10, pp. 1506-1524, 1 Oct. 2018, doi: 10.1109/TC.2018.2827039.
Download Paper
Published in IEEE Transactions on Computers (Volume: 69, Issue: 3, 01 March 2020), 2020
Power demand;Message systems;Tuning;Thermal management;Monitoring;Energy consumption;Power system management;Manycore processor;DVFS;NoC;power efficiency;CMP
Top transaction publication-TC
Recommended citation: Y. Yao and Z. Lu, "Pursuing Extreme Power Efficiency With PPCC Guided NoC DVFS," in IEEE Transactions on Computers, vol. 69, no. 3, pp. 410-426, 1 March 2020, doi: 10.1109/TC.2019.2949807.
Download Paper
Published in 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2021
Protocols;Program processors;Nonvolatile memory;Computational modeling;Semantics;Coherence;Computer architecture;non-volatile memory;persistent memory;persistency;total store order;coherence
Top conference publication-HPCA
I am co-first author
Recommended citation: P. Ekemark, Y. Yao, A. Ros, K. Sagonas and S. Kaxiras, "TSOPER: Efficient Coherence-Based Strict Persistency," 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Seoul, Korea (South), 2021, pp. 125-138, doi: 10.1109/HPCA51647.2021.00021.
Download Paper
Published in IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS, Volume: 13, Issue: 1, March 2023), 2023
Dynamic voltage scaling, multiprocessor interconnection, automata.
Recommended citation: Y. Yao, "Game-of-Life Temperature-Aware DVFS Strategy for Tile-Based Chip Many-Core Processors," in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 1, pp. 58-72, March 2023, doi: 10.1109/JETCAS.2023.3244763
Download Paper
Published in IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS, Volume: 13, Issue: 1, March 2023), 2023
Artificial intelligence, artificial neural networks, AI accelerators.
Recommended citation: Y. Yao, "SE-CNN: Convolution Neural Network Acceleration via Symbolic Value Prediction," in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 1, pp. 73-85, March 2023, doi: 10.1109/JETCAS.2023.3244767.
Download Paper
Published in Proceedings of the 2023 International Conference on Embedded Wireless Systems and Networks (EWSN), 2023
sensor network;store buffer;silent store;low power devices
Recommended citation: Weining Song, Stefanos Kaxiras, Luca Mottola, Thiemo Voigt, and Yuan Yao, "Silent Stores in the Battery-less Internet of Things: A Good Idea?" in Proceedings of the 2023 International Conference on Embedded Wireless Systems and Networks (EWSN), Association for Computing Machinery, New York, NY, USA, 40-45.
Download Paper
Published in Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems (SenSys), 2024
Task decoupling, Internet of Things (IoT), energy harvesting, intermittent computing
SenSys is a top conference in wireless sensor systems.
Recommended citation: Weining Song, Stefanos Kaxiras, Thiemo Voigt, Yuan Yao, and Luca Mottola. 2024. TaDA: Task Decoupling Architecture for the Battery-less Internet of Things. In Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems (SenSys). Association for Computing Machinery, New York, NY, USA, 409–421. https://doi.org/10.1145/3666025.3699347
Download Paper
Published in 2024 IEEE 36th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2024
Bit-Parallel, Energy-Efficient Multiply Accumulate, Deep Neural Networks
Best paper candidate
Recommended citation: Y. Yao, X. Chen, H. Atmer and S. Kaxiras, "TangramFP: Energy-Efficient, Bit-Parallel, Multiply-Accumulate for Deep Neural Networks," 2024 IEEE 36th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Hilo, HI, USA, 2024, pp. 1-12, doi: 10.1109/SBAC-PAD63648.2024.00009.
Download Paper
Published in 39th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2025
Early acceptance paper [to appear]
Recommended citation: R. Aligholipour, P. Aimoniotis, S. Kaxiras and Y. Yao, "RXT: RefleXive address Translation for Pointer-Chasing Workloads," 39th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Milan, Italy, 2025.
Download Paper
Undergraduate course, Uppsala University, Department of IT, 2024
TL;DR An introduction to computer architecture using the MIPS ISA, covering the ideas behind the microarchitecture that implements the ISA.
I have been responsible for the course since 2021.
Course’s webpage at Uppsala University
Graduate course, Uppsala University, Department of IT, 2024
TL;DR Join us to gain hands-on experience in FPGA acceleration, neural network optimization, and hardware-software co-design, while mastering the Xilinx Zynq-7000 FPGA system! 🚀
I have been responsible for the course since 2020.
Course’s webpage at Uppsala University
Published:
In this project I modified PARSEC-3.0 benchmarks using static linking with gem5 hooks for the x86_64 architecture.
Published:
WHISPER benchmark suite with static linkage.
Published:
Personal configurations for some of my favorite GNU tools.
Published:
TangramFP: Energy-Efficient, Bit-Parallel Multiply-Accumulate for Deep Neural Networks