Performance Analysis - yszheda/wiki GitHub Wiki

Roofline Performance Model

Profiling

Profilers

Latency

Clock Cycle

TMA (Top-down Microarchitecture Analysis) Method






Bad Speculation

  1. Branch Mispredicts
  2. Machine Clears: e.g. memory ordering violations, self-modifying code, loads from illegal address ranges

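Retiring

  1. Base
  2. Microcode Sequencer (MS): handles CISC instructions that the default decoders do not support, such as instructions that repeat a move operation over a string, the CPUID instruction, and so on. The micro-ops for these instruction types are all generated by the MS.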
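
The categories above sit under the four top-level TMA buckets (Frontend Bound, Bad Speculation, Retiring, Backend Bound). As a rough illustration, the sketch below computes that top-level split from raw counter values. It assumes a core that issues 4 micro-ops per cycle and uses the Intel event names commonly quoted for such cores. The function name `tma_level1`, the dictionary layout, and the sample numbers are hypothetical and are not taken from this page.

```python
# Minimal sketch of the top-level TMA breakdown.
# Assumptions: a 4-wide issue pipeline and Intel-style event names;
# other microarchitectures use different widths and events.

def tma_level1(counters, pipeline_width=4):
    """Return the fraction of pipeline slots in each top-level TMA category."""
    # Total issue slots available over the measured interval.
    slots = pipeline_width * counters["CPU_CLK_UNHALTED.THREAD"]

    # Slots where the frontend failed to deliver a micro-op to a ready backend.
    frontend_bound = counters["IDQ_UOPS_NOT_DELIVERED.CORE"] / slots

    # Slots that did useful work (micro-ops that eventually retired).
    retiring = counters["UOPS_RETIRED.RETIRE_SLOTS"] / slots

    # Slots wasted on micro-ops that never retired, plus recovery bubbles
    # after branch mispredicts and machine clears.
    bad_speculation = (
        counters["UOPS_ISSUED.ANY"]
        - counters["UOPS_RETIRED.RETIRE_SLOTS"]
        + pipeline_width * counters["INT_MISC.RECOVERY_CYCLES"]
    ) / slots

    # Whatever remains is attributed to backend stalls.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)

    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


# Hypothetical counter values, chosen only to make the arithmetic easy to follow.
SAMPLE_COUNTERS = {
    "CPU_CLK_UNHALTED.THREAD": 1000,
    "IDQ_UOPS_NOT_DELIVERED.CORE": 800,
    "UOPS_ISSUED.ANY": 2200,
    "UOPS_RETIRED.RETIRE_SLOTS": 2000,
    "INT_MISC.RECOVERY_CYCLES": 50,
}

if __name__ == "__main__":
    for category, fraction in tma_level1(SAMPLE_COUNTERS).items():
        print(f"{category}: {fraction:.1%}")
```

In practice a profiler derives these values for you. The point of the sketch is only to show that the four fractions partition the available issue slots, so a high value in one bucket indicates where to drill down next.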