ruby: performance comparison of rexml and libxml
Posted by phillip Sun, 18 Mar 2007 10:20:00 GMT
update: here’s the same for PHP’s XML Parser.
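a quick comparison of REXML and libxml, the two libraries available for processing XML in ruby, shows dramatic performance differences: libxml loads a file roughly 50 times faster and evaluates a simple xpath expression over 1000 times faster.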
am i missing something? is there a fundamental flaw in the test? of course REXML is pure ruby, while libxml is C; but can the difference really be so huge?
loading an xml file
| file size | libxml (ms) | REXML (ms) | factor (REXML / libxml) |
| 10KB | 0,83 | 39,17 | 47,0 |
| 100KB | 6,67 | 306,56 | 46,0 |
| 1.6MB | 71,88 | 3954,21 | 55,0 |
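the factor column appears to be the REXML time divided by the libxml time. a minimal sketch that recomputes it from the rounded timings in the table above (the hash layout and variable names are just for illustration; results can differ slightly from the published column because the inputs are already rounded):

```ruby
# timings in ms from the "loading an xml file" table
# (decimal commas written as decimal points)
timings = {
  '10KB'  => { libxml: 0.83,  rexml: 39.17 },
  '100KB' => { libxml: 6.67,  rexml: 306.56 },
  '1.6MB' => { libxml: 71.88, rexml: 3954.21 }
}

# factor = REXML time / libxml time, rounded to one decimal place
factors = timings.transform_values { |t| (t[:rexml] / t[:libxml]).round(1) }

factors.each { |size, factor| puts "#{size}: REXML takes #{factor}x as long" }
```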
simple xpath expression
| file size | libxml (ms) | REXML (ms) | factor (REXML / libxml) |
| 10KB | 0,12 | 124,68 | 1004,7 |
| 100KB | 0,67 | 678,11 | 1016,8 |
| 1.6MB | 6,21 | 22578,18 | 3633,6 |
the test code
# runs the block 10 times and prints the average time per run in milliseconds
def benchmark
  start = Time.now.to_f
  10.times { yield }
  puts(((Time.now.to_f - start) / 10) * 1000)
end
doc = nil
# exclude the effect of filesystem caching (makes sense?)
File.read('products.xml')
#
# libxml
#
require 'rubygems'
require 'xml/libxml'
benchmark do
  doc = XML::Document.file("products.xml")
end
benchmark do
  doc.find('//articles/article/shortdesc').each do |node|
    #puts node.content
  end
end
#
# rexml
#
require "rexml/document"
benchmark do
  doc = REXML::Document.new File.read("products.xml")
end
benchmark do
  doc.elements.each("//articles/article/shortdesc") do |node|
    #puts node.text
  end
end
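
products.xml itself is not included here, so the numbers above cannot be rerun as-is. the following is a minimal sketch of the REXML half only, under some assumptions: the `<catalog>` root, the generated content and the helper names `sample_xml` and `average_ms` are made up for illustration; only the xpath expression comes from the test code above. it uses ruby's stdlib `Benchmark.realtime` instead of subtracting `Time` values by hand, and counts the matches so an empty result set would be noticed.

```ruby
require 'benchmark'
require 'rexml/document'

# hypothetical stand-in for products.xml: only the element names used in
# the xpath come from the post; the <catalog> root and contents are made up
def sample_xml(count)
  articles = (1..count).map do |i|
    "<article><shortdesc>item #{i}</shortdesc></article>"
  end
  "<catalog><articles>#{articles.join}</articles></catalog>"
end

# average wall-clock time of the block, in milliseconds
def average_ms(runs = 10)
  total = Benchmark.realtime { runs.times { yield } }
  (total / runs) * 1000
end

xml = sample_xml(200)
doc = nil
matches = 0

# parse from a string that is already in memory, so disk access is not timed
parse_ms = average_ms { doc = REXML::Document.new(xml) }

# same xpath expression as in the test code above
xpath_ms = average_ms do
  matches = REXML::XPath.match(doc, '//articles/article/shortdesc').size
end

puts format('parse: %.2f ms, xpath: %.2f ms, matches: %d',
            parse_ms, xpath_ms, matches)
```

the absolute timings will depend on the machine, the ruby version and the document, so they are not comparable to the tables above; the sketch is only meant as a starting point for checking the test setup.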