I stumbled upon RDF the other day and thought it was a really interesting and useful concept. It took me way too long to figure out how to integrate a parser into my own code (Ruby on Rails) so I could play around with it and do something useful. Maybe this will help someone else.

The best tool I found for the job was RDF.rb.

Here are the gems you need (you can just put these in your Gemfile):

gem 'rdf'
gem 'rdf-rdfa'

There are plenty of gems that build on the RDF gem, but I mainly used the RDFa parser. They’re listed on the RubyForge page.

Now you can do something like this to print every RDFa statement found at a given URL:

RDF::RDFa::Reader.open("http://www.tripadvisor.com/Hotel_Review-g186525-d280839-Reviews-Gerald_s_Place-Edinburgh_Scotland.html") do |reader|
  reader.each_statement do |statement|
    puts statement
  end
end

And if you’re looking for a particular property, you can check whether each statement’s predicate matches it. TripAdvisor happens to use the data-vocabulary.org vocabulary. RDF.rb doesn’t define it by default, but defining it yourself is easy enough:

DV = RDF::Vocabulary.new("http://rdf.data-vocabulary.org/#")
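If that one-liner feels a bit magic, it helped me to think of a vocabulary as little more than a base URI that property names get appended to. Here’s a rough plain-Ruby sketch of that idea, no gems required. TinyVocabulary is a made-up class for illustration only; it is not how RDF.rb implements RDF::Vocabulary, and it returns plain strings rather than URI objects.

```ruby
# Illustration only: TinyVocabulary is a hypothetical stand-in,
# not RDF.rb's RDF::Vocabulary.
class TinyVocabulary
  def initialize(base_uri)
    @base_uri = base_uri
  end

  # Explicit lookup: vocab[:name] appends the term to the base URI.
  def [](term)
    "#{@base_uri}#{term}"
  end

  # Lets vocab.name behave like vocab[:name] for argument-less calls.
  def method_missing(term, *args)
    args.empty? ? self[term] : super
  end
end

dv = TinyVocabulary.new("http://rdf.data-vocabulary.org/#")
puts dv.name    # http://rdf.data-vocabulary.org/#name
puts dv[:name]  # same URI, using the explicit lookup
```

So DV.name is really just shorthand for the full property URI, which is what each statement’s predicate gets compared against.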

So now you can do:

RDF::RDFa::Reader.open("http://www.tripadvisor.com/Hotel_Review-g186525-d280839-Reviews-Gerald_s_Place-Edinburgh_Scotland.html") do |reader|
  reader.each_statement do |statement|
    if statement.predicate == DV.name
      puts statement.object.to_s
    end
  end
end

You should see “Gerald’s Place” as the output.
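If you’d rather collect the values than print them, the same predicate check works with select and map. Here’s a small sketch using hand-made statements so it runs without hitting the network. Triple and the sample data are made up for illustration; they only mimic the subject/predicate/object shape of the statements the reader yields.

```ruby
# Hypothetical stand-in for a parsed statement (subject, predicate, object).
Triple = Struct.new(:subject, :predicate, :object)

DV_BASE = "http://rdf.data-vocabulary.org/#"

# Made-up sample data, shaped like what a parser might yield for a hotel page.
statements = [
  Triple.new("_:hotel", "#{DV_BASE}name", "Gerald's Place"),
  Triple.new("_:hotel", "#{DV_BASE}locality", "Edinburgh")
]

# Keep only the statements whose predicate is the "name" property,
# then pull out their objects as strings.
names = statements
  .select { |s| s.predicate == "#{DV_BASE}name" }
  .map { |s| s.object.to_s }

puts names.inspect  # ["Gerald's Place"]
```

This is handy when a page has several matches for the same property and you want to work with all of them afterwards.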

Some other great resources I’ve found on the subject, besides the ones already mentioned:
Google’s help page for RDFa
Screencast that shows how to parse Best Buy product reviews