A web feed (commonly known as an RSS feed) is a data format used for providing users with frequently updated content. Content distributors syndicate a web feed, thereby allowing users to subscribe to it. Making a collection of web feeds accessible in one spot is known as aggregation, which is performed by a news aggregator. A web feed is also sometimes referred to as a syndicated feed.
RSS (an acronym for Rich Site Summary, also expanded as Really Simple Syndication) is a structured format for delivering regularly changing web content. Many news-related sites, weblogs and other online publishers syndicate their content as an RSS feed to whoever wants it. ATOM feeds are less commonly used nowadays, and OPML is an outline format for exchanging lists of feeds rather than a feed format in its own right, so RSS is the de facto syndicated content format.
Content delivered by a web feed is typically either HTML (webpage content) or links to webpages and other kinds of digital media. Websites that provide web feeds to notify users of content updates often include only summaries in the feed rather than the full content, so that readers must go to the website to read the full article.
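To make the structure concrete, here is a minimal sketch of reading the items of an RSS 2.0 document with PHP's SimpleXML extension. The feed content, the example.com URLs and the `read_rss_items()` helper are all made up for illustration; a real feed would be downloaded first and may carry extra elements that this sketch ignores.

```php
<?php
// A small, made-up RSS 2.0 document used only for illustration.
$rss = <<<'XML'
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Latest headlines</description>
    <item>
      <title>First article</title>
      <link>http://www.example.com/articles/1</link>
      <description>A short summary of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>http://www.example.com/articles/2</link>
      <description>A short summary of the second article.</description>
    </item>
  </channel>
</rss>
XML;

/**
 * Extract title, link and summary for every item of an RSS 2.0 document.
 * Hypothetical helper; returns an empty array when the XML cannot be parsed.
 */
function read_rss_items(string $xml): array
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel)) {
        return [];
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title'   => (string) $item->title,
            'link'    => (string) $item->link,
            'summary' => (string) $item->description,
        ];
    }
    return $items;
}

// Print one line per item: the headline and the page holding the full article.
foreach (read_rss_items($rss) as $item) {
    echo $item['title'], ' -> ', $item['link'], PHP_EOL;
}
```

Each item carries only a summary in its description element, which is why the link back to the website matters to the reader.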
A few years ago, I was asked to develop an automated tool for aggregating news from several related sites, so that it could be delivered to users by email (a sort of Google Newsstand). The class described here can be used to find the RSS feeds associated with a page, as well as ATOM links and OPML outline documents.
To make the code more manageable, I created a PHP class that retrieves a given page (using cURL) and parses its head section to obtain the list of links to the associated RSS, ATOM and OPML documents. The URLs of any links found are returned in an array. This RSS feeds finder class was published on PHPClasses and earned an Innovation Award. You can find the class in a GitHub repository, where you can contribute to make it better.
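The head-parsing step can be sketched roughly as follows. This is not the published class's code: `find_feed_links()` is a hypothetical helper, the MIME types are the ones conventionally used for feed autodiscovery (the OPML type in particular is an assumption), and the HTML is passed in as a string so the example runs without a network request, whereas the class itself fetches the page with cURL.

```php
<?php
/**
 * Collect feed-related <link> elements from an HTML document.
 * Hypothetical helper for illustration; matches on the type attribute only.
 */
function find_feed_links(string $html): array
{
    // MIME types conventionally advertised for each kind of document.
    $types = [
        'application/rss+xml'  => 'rss',
        'application/atom+xml' => 'atom',
        'text/x-opml'          => 'opml', // assumption: sites vary here
    ];

    $found = ['rss' => [], 'atom' => [], 'opml' => []];
    if (trim($html) === '') {
        return $found; // DOMDocument::loadHTML() rejects empty input
    }

    // Real-world HTML is rarely valid, so keep parser errors internal.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Scans every <link> in the document, not only those inside <head>.
    foreach ($dom->getElementsByTagName('link') as $link) {
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));
        if ($href !== '' && isset($types[$type])) {
            $found[$types[$type]][] = $href;
        }
    }
    return $found;
}

// Made-up page used only to exercise the helper.
$html = <<<'HTML'
<html><head>
<title>Example</title>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://www.example.com/feed.rss">
<link rel="alternate" type="application/atom+xml" title="Atom" href="http://www.example.com/feed.atom">
<link rel="outline" type="text/x-opml" title="Blogroll" href="http://www.example.com/blogroll.opml">
</head><body></body></html>
HTML;

print_r(find_feed_links($html));
```

The sketch returns href values exactly as written in the page; relative URLs would still need to be resolved against the page address before they can be fetched.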
To prevent any copyright issues, the class first checks the site's robots.txt file to see whether it is allowed to parse the site's pages before attempting to retrieve the specified page.
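A robots.txt check of this kind could look roughly like the sketch below. How the published class actually interprets the file is not shown here; `is_path_allowed()` is a hypothetical helper that is deliberately simplified: it only honours the rules under `User-agent: *`, uses plain prefix matching, and ignores wildcards and `Allow` directives. The robots.txt body is passed in as a string so the example needs no network access.

```php
<?php
/**
 * Decide whether $path may be fetched according to the "User-agent: *"
 * rules of a robots.txt body. Simplified: prefix matching only, no
 * wildcards, no Allow directives, no agent-specific groups.
 */
function is_path_allowed(string $robotsTxt, string $path): bool
{
    $applies = false;      // are we inside a group that targets "*"?
    $lastWasAgent = false; // consecutive User-agent lines share one group

    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*$/', '', $line)); // drop comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            // A User-agent line that follows rule lines starts a new group.
            $applies = ($lastWasAgent && $applies) || $value === '*';
            $lastWasAgent = true;
            continue;
        }
        $lastWasAgent = false;

        // An empty Disallow value means nothing is disallowed.
        if ($applies && $field === 'disallow' && $value !== ''
            && strpos($path, $value) === 0) {
            return false;
        }
    }
    return true;
}

// Made-up robots.txt content for illustration.
$robots = "User-agent: BadBot\nDisallow: /\n\n"
        . "User-agent: *\nDisallow: /private/\nDisallow: /tmp/ # scratch area\n";

var_dump(is_path_allowed($robots, '/2006/10/opml_autodiscov.html')); // bool(true)
var_dump(is_path_allowed($robots, '/private/report.html'));          // bool(false)
```

Running the check before the page request keeps a disallowed page from ever being downloaded, which matches the order described above.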
RSS Feeds retrieval example:
# Add the feed finder class (require the class file from the repository here)
# Create a new class instance
$find_links = new Find_RSS_Links('http://www.cleverclogs.org/2006/10/opml_autodiscov.html');
$links = $find_links->getLinks();
# Display the RSS, ATOM and OPML links found on the page
print_r($links);