Optimization - gusenov/kb GitHub Wiki
- Denis Bakhvalov | Performance explained easy
- The Mature Optimization Handbook (Carlos Bueno, Facebook’s Performance team)
- wiki.c2.com/?PrematureOptimization
- Software optimization resources
- Building a Continuous Profiler
- ByteByteGo / What are the top cache strategies?
Wikipedia
- Category:Computer performance
- Category:Computer optimization
- Category:Software optimization
- CPU cache
- Category:Cache (computing)
- False sharing is a performance-degrading usage pattern that can arise in systems with distributed, coherent caches at the size of the smallest resource block managed by the caching mechanism.
- Locality of reference (principle of locality) is the tendency of a processor to access the same set of memory locations repetitively over a short period of time.
- Locality is a type of predictable behavior that occurs in computer systems. Systems that exhibit strong locality of reference are great candidates for performance optimization through techniques such as caching, memory prefetching, and advanced branch predictors in the pipeline of a processor core.
- Program parallelization
- Instrumentation (computer programming) refers to measuring a product's performance in order to diagnose errors and to write trace information.
- DynInst can be very useful when developing performance measurement tools, debuggers, and simulators.
- Performance measurement
- Performance management
- Program optimization
- Optimizing compiler
- Template:Compiler optimizations
- Partial evaluation is a technique for several different types of program optimization by specialization. The most straightforward application is to produce new programs that run faster than the originals while being guaranteed to behave in the same way.
- Inline expansion is a manual or compiler optimization that replaces a function call site with the body of the called function.
- Zero-copy describes computer operations in which the CPU does not perform the task of copying data from one memory area to another or in which unnecessary data copies are avoided.
- Cache-oblivious algorithm (or cache-transcendent algorithm) is an algorithm designed to take advantage of a processor cache without having the size of the cache (or the length of the cache lines, etc.) as an explicit parameter.
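The partial evaluation entry above can be made concrete with a small sketch. This is an illustration only, not taken from any of the linked sources: `power` and `specialize_power` are hypothetical names, and the example assumes the exponent is a non-negative integer known ahead of time. The general function loops at run time; the specialized one is generated once with the loop already unrolled, and it is expected to return the same results.

```python
def power(base, exponent):
    """General version: the exponent is only known at call time."""
    result = 1
    for _ in range(exponent):
        result *= base
    return result


def specialize_power(exponent):
    """Generate a function specialized for one fixed, non-negative exponent.

    The loop over the exponent runs here, at specialization time,
    so the generated function contains only straight-line multiplication.
    """
    if exponent < 0:
        raise ValueError("this sketch assumes a non-negative exponent")
    body = " * ".join(["base"] * exponent) if exponent > 0 else "1"
    source = f"def power_{exponent}(base):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)  # compile the generated source into a function
    return namespace[f"power_{exponent}"]


cube = specialize_power(3)          # generated body: return base * base * base
assert cube(5) == power(5, 3)       # same behavior as the general version
```

Whether the specialized version is actually faster depends on the interpreter or compiler; the point of the sketch is only the shape of the transformation.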
Mechanical Sympathy
- Martin Thompson
- Google Groups / mechanical-sympathy
- bugzmanov/mechanical_sympathy: Curated list of resources dedicated to hardware and low level design
- YouTube
Intel
- Twitter / @IntelDevTools
- Intel® Power Gadget is a software-based power usage monitoring tool enabled for Intel® Core™ processors (from 2nd Generation up to 10th Generation Intel® Core™ processors).
Quotes
- "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." (Donald Knuth)
Courses
- MIT OpenCourseWare
- Performance Engineering of Software Systems
- CO 339 – Performance Engineering by Holger Pirk
- Enhancing Program Performance with Logic Models
I/O
- jrtechs.net: Multi Threaded File IO
- multiple threads don’t increase file throughput
- An HDD can only read one file at a time. Adding more CPU cores into the mix would actually slow down file ingest because the HDD would have to take turns reading fragments of different files. The seek time of the HDD would heavily degrade performance.
- Using the same number of threads as your CPU has is the most efficient way to read in files.
- Using more threads will decrease the idle time of the HDD.
- If you use more threads than your CPU has, you will suffer performance-wise because the threads will be idle while they wait for each other to finish.
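The claims in the notes above are easiest to judge by measuring on your own hardware. Below is a minimal sketch of such a measurement, assuming Python's standard `concurrent.futures.ThreadPoolExecutor`; the helper names (`read_size`, `read_all`, `make_sample_files`, `benchmark`) and the file sizes are hypothetical and are not taken from the linked article. It reads the same set of files with different thread-pool sizes and reports the elapsed time for each.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def read_size(path):
    """Read one file fully and return the number of bytes read."""
    with open(path, "rb") as f:
        return len(f.read())


def read_all(paths, workers):
    """Read every file using a pool of `workers` threads; return total bytes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_size, paths))


def make_sample_files(directory, count=8, size=64 * 1024):
    """Create `count` files of `size` random bytes each (illustrative data)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"sample_{i}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


def benchmark(paths, worker_counts):
    """Return {workers: seconds} for reading all paths with each pool size."""
    timings = {}
    for workers in worker_counts:
        start = time.perf_counter()
        read_all(paths, workers)
        timings[workers] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    cpu_threads = os.cpu_count() or 1
    with tempfile.TemporaryDirectory() as tmp:
        files = make_sample_files(tmp)
        counts = [1, cpu_threads, 4 * cpu_threads]
        for workers, seconds in benchmark(files, counts).items():
            print(f"{workers:>3} threads: {seconds:.4f} s")
```

Freshly written files are usually served from the operating system's page cache, so this sketch as written mostly measures memory and thread overhead. To compare HDD behavior, point `benchmark` at large existing files that have not been read recently.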