Patent on How to Identify a Web Page as a Blog

Mar 10, 2011



In our continuing series of blog entries focused on patents about blogs, here is an interesting patent on how to identify a web page as a blog. While we human beings can usually recognize a blog when we are looking at one, web crawlers and other automated systems need a more systematic way to do so.


US7565350 B2


Identifying a web page as belonging to a blog


MICROSOFT CORP


Abstract: A machine learning classifier is used to determine whether a web page belongs to a blog, based on a number of characteristics of web pages (e.g., presence of words such as “permalink”, or being hosted on a known blogging site). The classifier may be initially trained using human-judged examples. After classifying web pages as being blog pages, the blog pages may be further identified or categorized as top level blogs based on their URLs, for example.
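To make the abstract a little more concrete, here is a rough sketch of the kind of classifier it describes. This is not the method claimed in the patent, just a minimal illustration under stated assumptions: the host list, the cue words other than "permalink", the labeled examples, the choice of logistic regression, and the "top level" URL rule are all hypothetical and chosen only for the demonstration.

```python
import math
from urllib.parse import urlparse

# Hypothetical list of blogging hosts (the abstract only says "a known blogging site").
KNOWN_BLOG_HOSTS = ("blogspot.com", "wordpress.com", "typepad.com")
# Hypothetical cue words; only "permalink" is named in the abstract.
CUE_WORDS = ("permalink", "trackback", "posted by")


def extract_features(url, text):
    """Turn a page into a small numeric feature vector."""
    host = urlparse(url).hostname or ""
    lowered = text.lower()
    on_blog_host = any(host == h or host.endswith("." + h) for h in KNOWN_BLOG_HOSTS)
    features = [1.0]  # bias term
    features.append(1.0 if on_blog_host else 0.0)
    features.extend(1.0 if word in lowered else 0.0 for word in CUE_WORDS)
    return features


def predict(weights, features):
    """Return the estimated probability that the page belongs to a blog."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, learning_rate=0.5):
    """Fit a logistic regression with plain stochastic gradient ascent."""
    weights = [0.0] * (2 + len(CUE_WORDS))
    for _ in range(epochs):
        for url, text, label in examples:
            x = extract_features(url, text)
            error = label - predict(weights, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights


def is_top_level_blog(url):
    """Assumed rule: a blog page is 'top level' if its URL has no path or query."""
    parsed = urlparse(url)
    return parsed.path in ("", "/") and not parsed.query


# Made-up, human-judged examples: (url, page text, 1 = blog, 0 = not a blog).
LABELED = [
    ("http://alice.blogspot.com/2011/03/post.html", "Posted by Alice. Permalink | Trackback", 1),
    ("http://example.org/journal/", "Permalink posted by Bob", 1),
    ("http://shop.example.com/cart", "Add to cart. Checkout now.", 0),
    ("http://news.example.com/story", "Reuters reports markets rose.", 0),
]

weights = train(LABELED)
blog_score = predict(weights, extract_features(LABELED[0][0], LABELED[0][1]))
shop_score = predict(weights, extract_features(LABELED[2][0], LABELED[2][1]))
print(f"blog page score: {blog_score:.2f}, shop page score: {shop_score:.2f}")
print(is_top_level_blog("http://alice.blogspot.com/"))
```

A real system would presumably use far more features and training data; the point here is only the shape of the pipeline: extract page characteristics, train on human-judged examples, classify, then inspect the URL of pages classified as blogs.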


- J.A.




