Hacker News

b0a04gl · today at 6:12 AM · 0 replies

A while back I ran a model scoring job where each row did feature lookups across different memory regions. The access pattern changed every run since it depended on previous feature values. Average latency stayed fine, but total run time kept jumping: sometimes 6s, sometimes 8s on the same input. Perf counters looked flat, yet thread scheduling kept shifting. I reordered the lookups to cut down on memory jumps. After that, threads aligned better, the scheduler ran smoother, and run time got stable across batches. The gain wasn't from faster memory access; what I observed is that the problem isn't memory being slower, it's memory being less predictable.
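A minimal sketch of the reordering idea, in NumPy. The function names and the scoring logic here are hypothetical (the original job isn't shown); the point is only that when the aggregation is order-independent, you can sort the lookup indices so the gather walks memory mostly forward instead of jumping between distant regions:

```python
import numpy as np

def score_rows(features: np.ndarray, lookup_ids: np.ndarray) -> float:
    # Gather feature values in the order the ids arrive and sum them.
    # Hypothetical stand-in for the per-row feature lookups described above.
    return float(features[lookup_ids].sum())

def score_rows_reordered(features: np.ndarray, lookup_ids: np.ndarray) -> float:
    # Sort the indices first so the gather proceeds in roughly
    # address order. The sum is order-independent, so the result is
    # the same; only the memory access pattern (and thus cache and
    # prefetcher behavior) changes.
    return float(features[np.sort(lookup_ids)].sum())

rng = np.random.default_rng(0)
features = rng.standard_normal(1_000_000)
ids = rng.integers(0, features.size, size=100_000)

# Same result either way, up to float summation order.
assert np.isclose(score_rows(features, ids), score_rows_reordered(features, ids))
```

This trick only applies when the computation doesn't care about lookup order; in the original job the pattern depended on previous feature values, so the reordering presumably had to respect those dependencies rather than being a blind sort like this.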