
The idea of FizzBuzz is this: if I say 1, you say 1; if I say 2, you say 2; if I say 3, you say Fizz; if I say 4, you say 4; if I say 5, you say Buzz; if I say 6, you say Fizz; if I say 10, you say Buzz; and if I say 15, you say FizzBuzz, and so it continues. Let us jump into solving one of the Code Kata exercises, FizzBuzz, by applying TDD.

Alex states that the ultimate performance bottleneck is the throughput of the CPU's L2 cache. Alex did attempt to build a multi-threaded version but was unable to find any performance improvement over the single-threaded version. The reason stated was that the cost of communication between CPU cores is high enough that any gains from parallel computation are ultimately negated.
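For reference, the rules described above can be written as a plain, unoptimised function. This is only a minimal sketch of the exercise itself, not the optimised implementation discussed in this post; the names `fizzbuzz` and `fizzbuzz_lines` are chosen here for illustration.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz answer for a single number."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fizzbuzz_lines(limit: int) -> str:
    """Return the answers for 1..limit, one per line."""
    return "".join(fizzbuzz(i) + "\n" for i in range(1, limit + 1))


if __name__ == "__main__":
    print(fizzbuzz_lines(15), end="")
```

In a TDD workflow, each rule (3, 5, 15, and everything else) would typically start life as a failing assertion such as `assert fizzbuzz(15) == "FizzBuzz"` before the matching branch is written.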

The application produces 64 bytes of FizzBuzz for every 4 CPU clock cycles. The line number is corrected after every 512 bytes of output.
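To put those figures in context, here is a back-of-the-envelope calculation. It assumes the 64-bytes-per-4-cycles rate is sustained continuously on one core, takes 1 GB as 10^9 bytes, and uses the 3.4 GHz base and 4.9 GHz boost clocks quoted for the benchmark machine in this post. It is an idealised estimate, not a measurement.

```python
BYTES_PER_BURST = 64      # bytes of output per burst (from the post)
CYCLES_PER_BURST = 4      # clock cycles per burst (from the post)
CORRECTION_BYTES = 512    # line number corrected every 512 bytes (from the post)


def ideal_throughput_gb_per_s(clock_ghz: float) -> float:
    """Ideal output rate in GB/s (1 GB = 10**9 bytes) at a given clock speed."""
    bytes_per_cycle = BYTES_PER_BURST / CYCLES_PER_BURST  # 16 bytes per cycle
    return bytes_per_cycle * clock_ghz


# Cycles between line-number corrections under the same assumption.
cycles_between_corrections = CORRECTION_BYTES // BYTES_PER_BURST * CYCLES_PER_BURST

if __name__ == "__main__":
    print(f"{ideal_throughput_gb_per_s(3.4):.1f} GB/s at base clock")   # 54.4
    print(f"{ideal_throughput_gb_per_s(4.9):.1f} GB/s at boost clock")  # 78.4
    print(f"{cycles_between_corrections} cycles between corrections")   # 32
```

Under these assumptions, the reported 56 GB/s sits between the two idealised figures.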

In this post, I'll examine some of the optimisations found in the fastest FizzBuzz implementation to date.

There is a bytecode generator that produces batches of 600 lines at a time using SIMD instructions. Each 32 bytes of bytecode can be interpreted and have their output stored with just 4 CPU instructions. During each 600-line generation, an approximation of the line number is also produced. Some calculations have also been hard-coded into the bytecode in a way not dissimilar to how JIT compilers operate.

    and OUTPUT_PTR, -(2 << 20) // rewind to the start of the buffer

Alex is a reserve on the UK Olympic Maths Team and has a degree in electronic engineering.
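The `and OUTPUT_PTR, -(2 << 20)` instruction quoted above rewinds a pointer with a single bitwise AND. Since `2 << 20` is 2 MiB (2^21 bytes), ANDing with its negation clears the low 21 bits and rounds the pointer down to a 2 MiB boundary. The sketch below mimics this in Python. It assumes the buffer starts on a 2 MiB-aligned address and is no larger than 2 MiB; the base address is a made-up value for illustration.

```python
BUFFER_ALIGN = 2 << 20  # 2 MiB, i.e. 2**21 bytes


def rewind(ptr: int) -> int:
    """Round ptr down to the nearest 2 MiB boundary by clearing its low 21 bits."""
    return ptr & -BUFFER_ALIGN


if __name__ == "__main__":
    base = 0x7F0000200000              # hypothetical 2 MiB-aligned buffer start
    ptr = base + 123_456               # somewhere inside the buffer
    print(hex(rewind(ptr)))            # 0x7f0000200000
```

The appeal of this pattern is that no comparison or branch is needed to find the start of the buffer, as long as the alignment assumption holds.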

A year ago, Omer Tuchfeld started a coding contest on Stack Exchange's Code Golf site to see who could write the fastest version of FizzBuzz. Submissions are benchmarked on Omer's computer, which has a 16-core, 32-thread AMD 5950X CPU running with a base clock of 3.4 GHz and a boost clock of upwards of 4.9 GHz, 8 MB of L2 cache and 32 GB of 3.6 GHz DDR4 RAM.

As of this writing, the third-fastest submission is written in Rust and produces output at a rate of 3 GB/s, the second is written in C and produces output at a rate of 41 GB/s, and the fastest is written in Assembler and produces output at a rate of 56 GB/s.

The developer behind the Assembler version is Alex Smith, a doctoral researcher studying for a PhD in the School of Computer Science at the University of Birmingham in the UK.
