The One Billion Row Challenge
At the beginning of 2024, Gunnar Morling posted a challenge for Java developers on their blog: How fast can you parse a file that contains one billion rows of temperature data and calculate the min, mean, and max temperature per weather station? At first I was not really interested in the challenge because I was occupied with other projects at the time. Plus, I am not really a Java developer, so I filed it under the "interesting to observe" category of my brain and left it at that. Time went on and I continued my journey to learn performance-aware C programming. One day I found myself asking the question: How fast can I count newline characters in a text file? While I was working on a few benchmarks to figure this out, I remembered the One Billion Row Challenge and wondered: How fast can I actually parse one billion rows in C?
This site contains a collection of my attempts at the challenge, sorted by date, with the latest attempt at the top. Dates are formatted as yyyy-mm-dd (year-month-day). The challenge is still ongoing for me, so I will update this list every now and then when I finish a new attempt. The goal is not to produce the fastest C version of the challenge, but to get more familiar with performance-aware programming in C.