Yahoo! Research Papers On Web Spam And Information Retrieval
January 15th, 2007
Yahoo! Research has published two important papers: one on the problems of distributed information retrieval, and one on using web topology to detect web spam. The first, 'Challenges in Distributed Information Retrieval', is by Flavio Junqueira, Ricardo Baeza-Yates, Fabrizio Silvestri, Vassilis Plachouras and Carlos Castillo. With the number of web sites growing at a great rate and over 20 billion pages already indexed, the centralized systems of the search engines will not be able to handle such a large amount of data, and fully distributed search engines will be required. In this paper the researchers bring together recent research results and discuss the challenges that a distributed Web retrieval system faces.
The other paper, 'Know your Neighbors: Web Spam Detection using the Web Topology', is by Vanessa Murdock, Carlos Castillo, Fabrizio Silvestri, Debora Donato and Aristides Gionis, who “present a spam detection system that uses the topology of the Web graph by exploiting the link dependencies among the Web pages, and the content of the pages themselves. We find that linked hosts tend to belong to the same class: either both are spam or both are non-spam. We demonstrate three methods of incorporating the Web graph topology into the predictions obtained by our base classifier: (i) clustering the host graph, and assigning the label of all hosts in the cluster by majority vote, (ii) propagating the predicted labels to neighboring hosts, and (iii) using the predicted labels of neighboring hosts as new features and retraining the classifier. The result is an accurate system for detecting Web spam that can be applied in practice to large-scale Web data.”
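To make the three ideas in the abstract a little more concrete, here is a rough Python sketch. It is not the authors' implementation. It follows only the wording of the abstract, and every function name, the toy host names and the `weight` parameter are assumptions made up for illustration. In particular, the propagation step is replaced by a simple one-step averaging with neighbors, which is a stand-in and may differ from what the paper actually does.

```python
from collections import Counter, defaultdict


def smooth_by_cluster(pred, cluster_of):
    """Idea (i): give every host the majority label of its cluster.

    pred maps host -> predicted label; cluster_of maps host -> cluster id.
    Both are hypothetical inputs (e.g. from a base classifier and a
    graph clustering step that are not shown here).
    """
    votes = defaultdict(Counter)
    for host, label in pred.items():
        votes[cluster_of[host]][label] += 1
    return {h: votes[cluster_of[h]].most_common(1)[0][0] for h in pred}


def propagate_scores(score, neighbors, weight=0.5):
    """Idea (ii), simplified: blend a host's spam score with the mean
    score of its linked hosts. One averaging step only; this is an
    assumption, not the propagation scheme from the paper."""
    out = {}
    for host, s in score.items():
        nbrs = [score[n] for n in neighbors.get(host, []) if n in score]
        if nbrs:
            out[host] = (1 - weight) * s + weight * sum(nbrs) / len(nbrs)
        else:
            out[host] = s  # isolated host keeps its own score
    return out


def neighbor_features(score, neighbors):
    """Idea (iii): compute the mean neighbor score per host, to be added
    as an extra feature before retraining a classifier (retraining is
    not shown). Hosts without scored neighbors get None."""
    feats = {}
    for host in score:
        nbrs = [score[n] for n in neighbors.get(host, []) if n in score]
        feats[host] = sum(nbrs) / len(nbrs) if nbrs else None
    return feats


# Toy data: made-up hosts, labels, clusters and links.
pred = {"a": "spam", "b": "spam", "c": "nonspam", "d": "nonspam", "e": "nonspam"}
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
score = {"a": 0.9, "b": 0.8, "c": 0.2, "d": 0.1}
neighbors = {"c": ["a", "b"], "d": []}

smoothed = smooth_by_cluster(pred, cluster_of)
propagated = propagate_scores(score, neighbors)
features = neighbor_features(score, neighbors)
```

In the toy data, host "c" is predicted non-spam but sits in a cluster with, and links to, two hosts predicted as spam. The cluster vote flips its label, and the averaging step pulls its score toward its neighbors. That is the "linked hosts tend to belong to the same class" observation put to work.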