Though a distant third place to Google, Microsoft thinks it can teach its rival a thing or two about searching the Internet.
A big part of Google’s rise to search engine leadership was an algorithm called PageRank that assesses a specific page’s importance by how many other Web pages link to it and by the importance of those linking pages. Microsoft researchers and academic collaborators, though, detailed an idea this week they call BrowseRank that seeks to bring more of a human touch to that assessment.
(Credit: Microsoft Research Asia)
Essentially, the researchers tested out a system that replaces PageRank’s link graph–a mathematical model of the hyperlinked connections of the Internet–with what they call a user browsing graph that ranks Web pages by people’s behavior.
“The more visits of the page made by the users and the longer time periods spent by the users on the page, the more likely the page is important. We can leverage hundreds of millions of users’ implicit voting on page importance,” the researchers said in BrowseRank: Letting Web Users Vote for Page Importance, a paper from the SIGIR (Special Interest Group on Information Retrieval) conference this week in Singapore. Authors are Bin Gao, Tie-Yan Liu, and Hang Li from Microsoft Research Asia and Ying Zhang of Nankai University, Zhiming Ma of the Chinese Academy of Sciences, and Shuyuan He of Peking University.
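To make the "implicit voting" idea concrete, here is a deliberately simplified sketch, not the model from the paper. It assumes a hypothetical log of (page, seconds spent) records and scores each page by its share of the total time users spent browsing, so that both more visits and longer visits raise the score. The function name, the log format, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict


def browse_scores(visits):
    """Score pages by their share of total user time.

    visits: iterable of (page, seconds_on_page) records (hypothetical format).
    Returns {page: score}; scores sum to 1 when any time was recorded.
    """
    total_time = defaultdict(float)
    for page, seconds in visits:
        # Every visit adds its dwell time, so frequent visits
        # and long visits both push a page's total up.
        total_time[page] += seconds

    grand_total = sum(total_time.values())
    if grand_total == 0:
        return {page: 0.0 for page in total_time}
    return {page: t / grand_total for page, t in total_time.items()}


# Made-up browsing log: "news" is visited often and read at length,
# "spam" is visited but abandoned within seconds.
log = [("news", 120), ("news", 90), ("spam", 2),
       ("blog", 60), ("spam", 3), ("news", 25)]
scores = browse_scores(log)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

In this toy log the page that users abandon after a few seconds ends up with the lowest score even though it was visited twice, which is the intuition behind weighting by time spent.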
Search is of tremendous importance to the Internet for many reasons. For one thing, search engines are highly influential middlemen that steer users to Web sites they may not be able to find on their own. For another, queries typed into search engines can be powerful–and in Google’s case highly profitable–indications of what type of advertisement to place next to the search results.
But Microsoft lags leader Google and No. 2 Yahoo in search. It’s trying hard to catch up, for example with unsuccessful proposals to acquire Yahoo or its search business that would have cost the company billions of dollars. And Microsoft just bought search start-up Powerset.
Google isn’t putting all its eggs in the PageRank basket, though.
“It’s important to keep in mind that PageRank is just one of more than 200 signals we use to determine the ranking of a Web site,” the company said in a statement. “Search remains at the core of everything Google does, and we are always working to improve it.”
The Microsoft researchers argue that PageRank has a number of problems. For one thing, people can game the system by building bogus Web sites called link farms. Those sites feature hyperlinks that point to a Web page whose importance a person wants to inflate so it appears higher in search results. Another PageRank issue is that the indexing process doesn’t take into account the time a user spends on a particular site.
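The link-farm problem can be illustrated with a small textbook-style PageRank sketch. This is not Google's production code: the damping factor of 0.85, the iteration count, and the tiny made-up link graphs are all assumptions chosen for illustration. The example computes scores for three honest pages, then adds 20 hypothetical farm pages that all link to one target.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank.

    links: {page: [pages it links to]} (hypothetical toy graph).
    Returns {page: rank}; ranks sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Three honest pages: "b" and "target" each get one link, from "a".
honest = {"a": ["b", "target"], "b": ["a"], "target": ["a"]}
# A made-up link farm: 20 bogus pages that all point at "target".
farm = {f"farm{i}": ["target"] for i in range(20)}

before = pagerank(honest)
after = pagerank({**honest, **farm})
print("before:", round(before["target"], 4), "vs b:", round(before["b"], 4))
print("after: ", round(after["target"], 4), "vs b:", round(after["b"], 4))
```

Before the farm is added, "target" and "b" are tied because each receives the same single link. Afterward, "target" outranks "b" purely because of the bogus inbound links, which is the kind of manipulation the researchers describe.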
But user behavior, monitored in anonymous form by Web servers and Web browser plug-ins, can be better, the authors argue.
“Experimental results show that BrowseRank can achieve better performance than existing methods, including PageRank…in important page finding, spam page fighting, and relevance ranking.”
The researchers gathered their data from “an extremely large group of users under legal agreements with them,” according to the paper.
There’s no denying PageRank is useful, though, and such algorithms could be added into a larger formula for determining which sites come out on top of search results.
“It is also possible to combine link graph and user behavior data to compute page importance,” the researchers said. “We will not discuss more about this possibility in this paper, and simply leave it as future work.”
Bringing research to fruition
It can be a long time before research comes to fruition, but funding a group of researchers can be much less expensive than acquiring other companies. No doubt Microsoft, especially after years of effort and its thwarted overtures to Yahoo, would like to see its in-house search efforts bring Google to its knees.
When accused of being dominant, Google representatives often argue the company could lose its search dominance if somebody else builds a better mousetrap and Internet users divert their path to that other door. “If Microsoft or Yahoo are successful in providing similar or better web search results or more relevant advertisements, or in leveraging their platforms or products to make their Web search or advertising services easier to access, we could experience a significant decline in user traffic or the size of the Google (ad) Network,” it said in its most recent quarterly report.
The top players are a moving target, though. Yahoo is hoping to improve search with three efforts: BOSS (build your own search service), which lets others employ Yahoo search results along with its search ads; SearchMonkey, which lets content publishers build elaborate mini-Web pages into search results; and Glue Pages, which present a smorgasbord of related content alongside search results.
And Google invests heavily, too. Its biggest research team is devoted to search, and the company updated its search formula more than 100 times in the second quarter. And researchers have huge infrastructure at their disposal to try new ideas.
“My group at Google has at its disposal many thousands of machines, with storage measured in petabytes,” Udi Manber, head of Google’s search quality, said of Google’s search research infrastructure in a June talk. And, he added, engineers are empowered to try their results, with meetings once or twice a week to see how well they worked: “There is no separation of research and development. Everyone does both.”