Several commenters on Reddit’s story proposed absurdly inefficient solutions to the ITA ‘word numbers’ puzzle (If the integers from 1 to 999,999,999 are written as words, sorted alphabetically, and concatenated, what is the 51 billionth letter?). Some solutions took hours, others over a day. One person said their C solution took 5 hours.
Here’s my solution, as promised in my previous posts. This algo will generate and alphabetically sort one billion numbers in:
danielharan$ time ruby finder.rb
number:sixhundredseventysixmillionsevenhundredfortysixthousandfivehundredseventyfive
sum: 413540008163475743real 0m1.453s
user 0m1.391s
sys 0m0.019s
There you have it: proof that Ruby is 12,000 times faster than C!
My code is available on refactormycode: Integer puzzle (scroll down a bit). Ok, if you dig, you’ll see that I don’t actually alphabetically sort all those numbers. I don’t even properly sum them. Here’s the function that adds all the numbers in a ‘thousand range’:
def child_values @thousands * 1_000_000 + 499500 end
A couple people that saw this code found it pretty confusing. Let’s consider the sum of numbers from 0 to 999 and the ranges 1,000 to 1,999 and 2,000 to 2,999, arranging them side by side and added:
0 1,000 2,000 1 1,001 2,001 2 1,002 2,002 3 1,003 2,003 … … … 999 1,999 2,999 499500 1,499,500 2,499,500
The reference to Gauss in my first post wasn’t haphazard. The difference between the first and second column is 1,000, repeated 1,000 times - naturally, the sum is 1,000,000 higher. This is what is expressed in the function above.
The same holds true for millions, although the numbers are quite a bit bigger. Since the puzzle also requires that we sum the size of the strings version of those numbers, I also worked out similar functions to calculate the sum of all strings in a range.
The range classes also help in one other regard: generating the sequence in alphabetic order. Trying to alphabetically sort one billion strings would take an absurd amount of memory and computing power, so any reasonable solution has to have a way to generate them already sorted. The code in finder.rb initiates the stream. Range classes are enumerable, and respond to each, taking advantage of the fact that all numbers that start with ‘onemillion’ or ‘eightthousand’ will be grouped together in a sorted stream. Sorting over 2,997 instead of 999,999,999 strings is a definite speed boost.
This puzzle has been a great learning exercise. Ruby is a wonderful language for quickly testing out algorithms, and plenty fast enough if you pick the right one for the task. My code still has room for improvement, however I consider it fairly readable if you understand the problem, and the shortcuts I explain above.
3 comments ↓
In the same vein, follow the website link to view ‘proof’ that java is ten times faster than C. For your convenience, I can better that a factor ten by running it on my new workstation.
Honestly, wouldn’t you think that choice of algorithm has anything to do with the performance improvement? Or am I just a sucker for your troll?
Kind regards,
Wim.
I agree, language performance matters much less than choosing an appropriate algorithm.
The big win for me is being able to tweak that algorithm with a higher-level language. That iteration process would be far more painful in verbose languages.
Hiya Daniel,
My comment states: ‘java is ten times faster than C’, I meant ‘faster than ruby’. I can’t believe I made that typo; in fact I’m pretty sure that I didn’t make it. I really hope that the text didn’t change somewhere between me typing it and getting it displayed here. I will assume good intentions.
Kind regards,
Wim.
Leave a Comment