Skin of Stars


Kevin Carmody on machines, media and miscellanea.

PHP Script for RSS auto-discovery and OPML file generation

Hey All,

I recently got a reasonably sized list of blog URLs. What I wanted was to import them all into a feed reader (via OPML). There seemed to be a lack of conversion scripts for the whole batch job of URL -> find RSS link -> feed reader import file (I may be wrong, please let me know if I am :) ), so I made one in PHP. I guess this is like an automatic blogroller. I have only used it as a command-line script, and I’m not recommending you use it in ‘the wild’, as one might say: I made little concession to security because I had a trusted list of URLs.

There are basically three steps to this:

  1. Take an input file of newline separated URLs, in my case blogs.
  2. Find (auto-discover) the associated RSS feed of each blog URL
  3. Output an OPML file that you can use to import into a feed reader
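
Steps 1 and 3 can be sketched in a few lines. This is a minimal Ruby sketch rather than the PHP script itself: the helper names `parse_url_list` and `build_opml`, the `:title`/`:html_url`/`:xml_url` hash shape and the default list title are all assumptions made for illustration.

```ruby
require 'cgi'

# Step 1: turn newline-separated text into a clean list of URLs,
# dropping blank lines and surrounding whitespace.
def parse_url_list(text)
  text.each_line.map(&:strip).reject(&:empty?)
end

# Step 3: build an OPML document from already-discovered feeds.
# Each feed is a hash with :title, :html_url and :xml_url keys
# (a hypothetical shape chosen for this sketch).
def build_opml(feeds, list_title = 'Imported blogs')
  outlines = feeds.map do |feed|
    format('    <outline type="rss" text="%s" htmlUrl="%s" xmlUrl="%s"/>',
           CGI.escapeHTML(feed[:title]),
           CGI.escapeHTML(feed[:html_url]),
           CGI.escapeHTML(feed[:xml_url]))
  end
  <<~OPML
    <?xml version="1.0" encoding="UTF-8"?>
    <opml version="2.0">
      <head>
        <title>#{CGI.escapeHTML(list_title)}</title>
      </head>
      <body>
    #{outlines.join("\n")}
      </body>
    </opml>
  OPML
end

urls = parse_url_list("http://example.com/blog/\n\nhttp://example.org/\n")
feeds = [{ title: 'Tom & Jerry', html_url: urls.first,
           xml_url: 'http://example.com/blog/feed' }]
puts build_opml(feeds)
```

Titles and URLs are escaped before they go into attributes, since an unescaped `&` in a feed URL is enough to make a feed reader reject the whole file.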

What it does:

  • Takes a well-formed list of newline separated blog URLs and turns it into an OPML file
  • If the URL source doesn’t contain a <link> to an RSS feed in the head it doesn’t add it to the OPML
  • Detects the <title> and adds that to the OPML text field, or uses the URL if <title> isn’t present
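
The discovery rules in the list above can be sketched as follows. Again this is an illustrative Ruby sketch, not the PHP script: `tag_attributes` and `discover_feed` are hypothetical names, the regular expressions are a deliberate simplification of real HTML parsing, and resolving relative `href` values against the page URL is my assumption about what a reader would want.

```ruby
require 'uri'

# Parse the quoted attributes of a single tag into a hash with lowercase keys.
def tag_attributes(tag)
  pairs = tag.scan(/([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/)
  pairs.each_with_object({}) do |(name, double_quoted, single_quoted), attrs|
    attrs[name.downcase] = double_quoted || single_quoted
  end
end

# Return { title:, feed_url: } for a page, or nil when the <head>
# contains no <link> to an RSS feed (such pages are left out of the OPML).
def discover_feed(html, page_url)
  head = html[/<head\b.*?<\/head>/mi]
  return nil unless head

  link = head.scan(/<link\b[^>]*>/i).map { |tag| tag_attributes(tag) }.find do |attrs|
    attrs['rel'].to_s.downcase.split.include?('alternate') &&
      attrs['type'].to_s.downcase == 'application/rss+xml' &&
      attrs['href']
  end
  return nil unless link

  # Fall back to the page URL when there is no usable <title>.
  title = head[/<title[^>]*>(.*?)<\/title>/mi, 1].to_s.strip
  { title: title.empty? ? page_url : title,
    feed_url: URI.join(page_url, link['href']).to_s }
end

page = '<html><head><title>My Blog</title>' \
       '<link rel="alternate" type="application/rss+xml" href="/feed.xml">' \
       '</head><body></body></html>'
p discover_feed(page, 'http://example.com/blog/')
```

For anything beyond a trusted list a real HTML parser would be a safer choice than regular expressions.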

What it doesn’t do:

  • Check that the RSS feed is valid XML
  • Any other checking really :)
  • Come with any sort of warranty/guarantee

Some of the key functions are from Keith Devens’ work. Thanks.

Ruby On Rails, RSS and Atom feed parsing with Feed Normalizer and subsequent storage

I’ve battled with this for days, but I finally know how to parse feeds and store them in a database in Ruby on Rails. This won’t be of much interest to the casual reader, but if you are scouring the web for an answer (as I was) then you will probably find it very useful:


require 'feed-normalizer'
require 'open-uri'
require 'rss/2.0'

class Feed < ActiveRecord::Base
  require_association 'post'

  belongs_to :user
  has_many :posts, :dependent => :destroy

  # put some other stuff here for feed validation etc

  # Re-fetch every feed in the database.
  def refresh_all
    refresh(Feed.find(:all))
  end

  # Fetch and parse each feed, then store its entries as posts.
  def refresh(feeds)
    feeds.each do |feed|
      rss = FeedNormalizer::FeedNormalizer.parse open(feed.uri)
      next if rss.nil? # skip feeds that could not be parsed

      rss.entries.each do |item|
        post = Post.new(:feed_id => feed.id)
        # Use || for the fallbacks: "or" binds more loosely than "=", so
        # "post.title = item.title or 'no title'" never assigns the default.
        post.link = item.url || raise("post has no link tag")
        post.title = item.title || "no title"
        post.content = item.content || "no text"
        post.created_at = item.date_published if item.date_published
        post.save
      end
    end
  end
end
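
If you want to poke at the entry-to-post mapping without Rails, a database or a network connection, it can be pulled out into plain Ruby. This is only a sketch: `Entry` is a hypothetical stand-in for a FeedNormalizer entry (it exposes the same `url`, `title`, `content` and `date_published` readers used in the model), and `post_attributes` is a helper name I have made up for illustration.

```ruby
# Hypothetical stand-in for a FeedNormalizer entry, exposing the same
# readers the model code relies on.
Entry = Struct.new(:url, :title, :content, :date_published)

# Hypothetical helper: turn one entry into a hash of post attributes,
# applying the same defaults as the model.
def post_attributes(entry, feed_id)
  raise 'post has no link tag' unless entry.url

  attrs = {
    feed_id: feed_id,
    link: entry.url,
    title: entry.title || 'no title',
    content: entry.content || 'no text'
  }
  # Only set created_at when the feed supplied a publication date.
  attrs[:created_at] = entry.date_published if entry.date_published
  attrs
end

# An entry with a link and content but no title or date.
p post_attributes(Entry.new('http://example.com/1', nil, 'Hello'), 7)
```

A hash like this could be handed straight to `Post.new`, which keeps the defaulting logic easy to check on its own.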

About

My name is Kevin Carmody and I live in Oxford, United Kingdom. I am a web developer with a penchant for community sites and a pedantry for open standards.

This here is a collection of my thoughts and musings, a spot for pooling a little of what's rattling around. Thanks for taking the time to visit and I hope you enjoy your stay.