Ruby / ActiveRecord is too slow

I just got my hands on a postal code file. First, I cut out the provinces I don’t need, keeping only the lines whose postal code starts with G, H or J:

grep '^"[GHJ]' POSTALCODEWORLD-CA-GOLD.CSV >> quebec.csv

My first strategy was to use ActiveRecord to create the records. The console code looked like this:

>> post_codes.each_line do |line|
?>   fields = line.split(",")
>>   PostCode.create(:postal_code => fields[0],...,
                     :street_to_suffix => fields[21])
>> end
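
One thing to watch in a loop like this: the fields in the file are wrapped in double quotes, so a plain split(",") keeps the quote characters and also breaks on any comma inside a field. Here is a minimal sketch of the difference, assuming a recent Ruby with the standard csv library. The sample line is made up for illustration and is not taken from the real file.

```ruby
require "csv"

# Hypothetical sample line in the same shape as the file:
# comma-separated, double-quoted fields, "\r\n" line ending.
line = %("H0H 0H0","Example, Street","QC"\r\n)

# Naive split: quotes stay in the values, and the comma inside
# the second field produces an extra element.
naive = line.chomp.split(",")

# CSV-aware parsing: quotes are stripped and the embedded comma survives.
parsed = CSV.parse_line(line.chomp)

puts naive.inspect
puts parsed.inspect
```

With real data, the parsed array could be passed to PostCode.create in place of the split fields.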

I was too lazy to write all that myself, so the create call was generated. Performance was a problem, though: there are 300k lines in that file, and fewer than 4k records per minute were being inserted into the database. So I loaded the file straight from MySQL’s command line instead:

LOAD DATA LOCAL INFILE '/Users/daniel/projects/[top_secret]/quebec.csv'
-> into table post_codes
-> FIELDS TERMINATED BY ','
-> ENCLOSED BY '"'
-> LINES TERMINATED BY '\r\n'
-> (postal_code,...,
street_to_suffix)
Query OK, 301571 rows affected, 7866 warnings (10.70 sec)
Records: 301571 Deleted: 0 Skipped: 0 Warnings: 0

From more than 4,500 seconds to 10.70 seconds. Ruby and ActiveRecord can be incredibly slow. And I don’t care, because they’re pretty and easy to use. When I do need execution speed, I’ll use a shortcut; otherwise, my happiness and code writing speed are more important.
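
For the curious, the numbers work out roughly as follows. This is a back-of-the-envelope sketch using only the figures quoted in this post; the variable names are mine, and the ActiveRecord rate is an upper bound, so the real gap is at least this large.

```ruby
rows               = 301_571 # rows reported by LOAD DATA
ar_rows_per_minute = 4_000   # ActiveRecord managed fewer than this
load_data_seconds  = 10.70   # time reported by MySQL

# Lower bound on the ActiveRecord run time, in seconds.
ar_seconds = rows / ar_rows_per_minute.to_f * 60

# Lower bound on the speedup from switching to LOAD DATA.
speedup = ar_seconds / load_data_seconds

puts format("ActiveRecord: at least %.0f seconds", ar_seconds) # about 4524
puts format("Speedup: at least %.0fx", speedup)                # about 423
```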

1 comment so far ↓

#1 Gregory on 06.26.07 at 6:25 pm

Ooookay, so what’s the point?

Basically, you’re saying that MySQL is faster than MySQL with a database abstraction layer on top of it. Of course it is! ActiveRecord is not optimized for that. And by the way, ActiveRecord has to be compatible with many databases other than MySQL, so that kind of thing can’t be implemented easily.

If all you need to do is insert 300,000 records into a table, then you don’t need ActiveRecord.
