Make Your Users Get What They Want

This page collects several ideas for ranking and categorizing content on a web site so that users can find the information they are looking for. The basic premise is that by watching what users actually do, and counting those actions as implicit votes, we can learn far more precisely how valuable content is to users than by asking them to score items explicitly.


As is well known in the human sciences, and in information science in particular, categorization is inherently problematic and not necessarily helpful to users. Coming up with proper categories and mapping the real world onto them is essentially an unsolvable problem, and doing it in a way that generates value for the user is hard. The categories chosen must rest on deep knowledge of the actual content you want to categorize. The best approach is probably to start with a limited set of categories and let that set evolve as the content develops and you learn more about both the content and your users.

We need ways to let not only the set of available categories evolve over time, but also the categorization of individual items. For the former, it’s reasonable to let your users suggest new categories. For the latter, you should at least give each mapping between a content item and a category a weight: a number, either positive or negative, that says how strong the relation between the content and the category is.

But categorizing content shouldn’t be solely in the hands of authors and moderators. Empower your readers by letting them suggest how a particular content item should be categorized. The readers’ categorizations should then adjust the weights of the content-to-category mappings in some balanced way, so that the mappings gradually get better for the readers.
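One way to read "in some balanced way" is to let each reader suggestion move a weight only a small fraction of the way toward full agreement or full disagreement, so that no single reader can flip a mapping. The sketch below assumes weights live in the range -1 to 1; the names `weights`, `apply_suggestion` and the `RATE` value are hypothetical and not part of the original idea.

```python
from collections import defaultdict

# Assumed step size: how far a single reader suggestion moves a weight.
RATE = 0.1

# weights[(item_id, category)] -> float in [-1.0, 1.0]; 0.0 means "no known relation".
weights = defaultdict(float)


def apply_suggestion(item_id, category, fits, rate=RATE):
    """Nudge a mapping weight toward +1 (reader says it fits) or -1 (it does not)."""
    target = 1.0 if fits else -1.0
    key = (item_id, category)
    # Move a fixed fraction of the remaining distance to the target,
    # so the weight approaches the target without ever passing it.
    weights[key] += rate * (target - weights[key])
    return weights[key]
```

Because every step covers only a fraction of the remaining distance, many readers have to agree before a mapping becomes strong, and a few dissenting readers pull it back only slightly.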

Scoring Content

We should always strive to score content based on all the information we can get. Higher-scoring content should bubble to the top whenever the user is searching or browsing for content.

Scoring can happen along several dimensions, depending on what the user is looking for. Sometimes we will want to favor new content. Sometimes we will want to favor content that has generated a lot of comments. When the user is browsing a specific category, we’ll want the weight of the mapping between a content item and that category to influence the score. And we’ll try to associate a notion of “quality” with each piece of content, and score high-quality content higher. This is what we use votes for.
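The dimensions above could be combined as a simple weighted sum, where the coefficients are tuned to what the user is currently doing. This is only a sketch: the `Item` fields, the coefficient names, the exponential freshness decay and the one-week half-life are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    age_days: float         # time since publication
    comment_count: int      # number of comments on the item
    category_weight: float  # weight of the item-to-category mapping, -1..1
    quality: float          # accumulated vote-based quality value


def score(item, w_fresh=1.0, w_comments=1.0, w_category=1.0, w_quality=1.0,
          half_life_days=7.0):
    """Combine the scoring dimensions into one number (higher is better)."""
    # Freshness halves every half_life_days.
    freshness = 0.5 ** (item.age_days / half_life_days)
    # Logarithm so the thousandth comment counts for less than the first.
    discussion = math.log1p(item.comment_count)
    return (w_fresh * freshness
            + w_comments * discussion
            + w_category * item.category_weight
            + w_quality * item.quality)
```

A "what's new" listing might raise `w_fresh`, while a category page might set `w_fresh=0.0` and raise `w_category`; sorting items by `score` in descending order then gives the listing.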

Explicit Votes

Explicit votes (“rank this item from 0 to 5”) are not very interesting. When given the option of awarding from zero to five stars to an item, most readers go for either zero or five. If their feeling about the content is somewhere in between, they probably won’t rank it. The deeper problem with a bare zero-to-five vote is that it doesn’t say anything about why. So an explicit rank should always be accompanied by a comment. It goes without saying that all content should be commentable.

Implicit Votes

Implicit voting is much more interesting. Implicit voting is what happens when you email a link to a friend. You’re telling your friend to look at this because you think he’d find it interesting. We want to capture those implicit votes as best we can, and record them as testimony to the item’s “quality”. Thus, we want to offer a “send this page to a friend” option on every content page. Whenever a page is recommended, we record it and increase the “quality value” of that item. Since most people have a limited set of people that they regularly recommend pages to, we can increase usability by showing the last five or so people they’ve emailed recommendations to and letting them recommend with one click (hope <a href="http://www.noamazon.com/">Amazon</a> won’t sue us for this).
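The bookkeeping behind "send this page to a friend" might look roughly like the sketch below. It only covers the two things described above, bumping the quality value and remembering recent recipients; sending the actual email is left out. The names `quality`, `recent_recipients`, `recommend` and `MAX_RECENT` are hypothetical, as is the choice of adding exactly one point per recommendation.

```python
from collections import defaultdict

MAX_RECENT = 5  # "the last five or so people"

quality = defaultdict(int)             # item_id -> quality value
recent_recipients = defaultdict(list)  # sender -> recipients, most recent first


def recommend(sender, recipient, item_id):
    """Record one recommendation as an implicit vote for item_id."""
    quality[item_id] += 1

    # Move the recipient to the front of the sender's one-click list.
    recents = recent_recipients[sender]
    if recipient in recents:
        recents.remove(recipient)
    recents.insert(0, recipient)
    del recents[MAX_RECENT:]  # keep only the most recent few
```

The content page can then render `recent_recipients[sender]` as the one-click list.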

Links are another form of implicit vote. If someone chooses to link to our content, it’s probably because it’s valuable to them, i.e. high quality. So we should catch Referer headers on incoming requests and record each inbound link as a vote.
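Catching Referer headers could be sketched as follows. The host name `www.example.com` is a placeholder for our own site, and counting each referring host only once per item (rather than once per click) is an assumption made here so that one busy linking page does not dominate; the original idea does not specify either.

```python
from urllib.parse import urlsplit

OUR_HOST = "www.example.com"  # placeholder for the site's own host name

link_votes = {}  # item_id -> set of external hosts that link to the item


def external_referrer(referer_header, our_host=OUR_HOST):
    """Return the referring host if the request came from another site, else None."""
    if not referer_header:
        return None
    host = urlsplit(referer_header).hostname  # lowercased; None if unparseable
    if host is None or host == our_host:
        return None
    return host


def record_link_vote(item_id, referer_header):
    """Record an inbound link as a vote. Returns True if a vote was recorded."""
    host = external_referrer(referer_header)
    if host is None:
        return False
    link_votes.setdefault(item_id, set()).add(host)
    return True
```

The number of link votes for an item is then `len(link_votes.get(item_id, ()))`. Referer headers are supplied by the browser and can be missing or forged, so this signal is best treated as noisy.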

Bookmarks are another form of link. We should have a link on every content page that lets users add a “virtual bookmark” to that content item, and we record the act as a vote. The bookmarks can be stored on the same site, so the user has a <a href="/bookmarks">bookmarks page</a> there, or we can offer to integrate with Yahoo! Bookmarks or other sites that users might prefer.

By allowing authors of content to include links to other items, we might even be able to do something à la <a href="http://www.google.com/why_use.html">Google</a> and build up a web of votes from content items to each other. A link is a vote. A link from a highly scoring author weighs more.
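A web of votes between items can be scored with a PageRank-style iteration, in which an item passes its own score along its outgoing links. The sketch below is a textbook simplification, not a description of what Google actually runs, and it does not model authors: an item's own score stands in for "a highly scoring author". The function name, the damping factor of 0.85 and the iteration count are assumptions.

```python
def link_scores(links, damping=0.85, iterations=50):
    """links: dict mapping an item to the list of items it links to.

    Returns a dict mapping every item to a score; the scores sum to 1.
    """
    items = set(links)
    for targets in links.values():
        items.update(targets)
    if not items:
        return {}

    n = len(items)
    scores = {item: 1.0 / n for item in items}
    for _ in range(iterations):
        # Items with no outgoing links spread their score evenly over all items.
        dangling = sum(scores[i] for i in items if not links.get(i))
        new = {item: (1.0 - damping) / n + damping * dangling / n for item in items}
        # Every link passes on an equal share of the linking item's score.
        for source, targets in links.items():
            for target in targets:
                new[target] += damping * scores[source] / len(targets)
        scores = new
    return scores
```

For example, with `links = {"a": ["c"], "b": ["c"], "c": []}`, item `c` ends up with a higher score than `a` or `b`, because both of them vote for it.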

Scoring the Voters

The next level of indirection is when we start <a href="http://www.epinions.com/help/index.html?show=web_of_trust">ranking rankers</a>. There are two things we want to do here. First, some people just say ridiculous things whenever they open their mouths. We want to downplay their votes so that they count for less. Second, not all people agree in their opinions. We want to establish links between users that say “user x generally agrees with user y”.

We do this by letting users say whether a comment was useful to them or not, on a per-comment basis. Such a vote influences both the score of the comment in question and the score of the author of that comment. This will let us bubble interesting comments to the top, and give more weight to votes from highly valued people.
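A minimal sketch of that feedback loop: a usefulness vote changes the comment's score by the voter's current standing, and a fraction of the same change flows on to the comment's author. The starting reputation of 1.0, the `author_share` of 0.1 and all names are assumptions chosen for illustration.

```python
from collections import defaultdict

comment_score = defaultdict(float)     # comment_id -> score
user_score = defaultdict(lambda: 1.0)  # user_id -> reputation, starting at 1.0


def rate_comment(voter, comment_id, author, useful, author_share=0.1):
    """Record a useful / not useful vote on a comment."""
    # A voter's vote counts in proportion to their own score, never negatively.
    weight = max(user_score[voter], 0.0)
    delta = weight if useful else -weight
    comment_score[comment_id] += delta
    # Part of the vote also rubs off on the author of the comment.
    user_score[author] += author_share * delta
```

Users whose comments are repeatedly marked as not useful drift toward a score of zero, at which point their own votes stop counting.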

We also let users explicitly say that they trust some other person. That indicates an agreement in preferences and will directly influence the scoring of content for that particular user. It will also give the person being trusted a higher score.
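Trust can personalize scoring by making votes from people the viewer trusts count for more. The sketch below assumes votes are +1 or -1 and that a trusted voter simply counts double; the function name, the `trust_bonus` value and the data shapes are hypothetical.

```python
def personal_score(viewer, votes, trusts, trust_bonus=2.0):
    """Score one item for one viewer.

    votes:  dict mapping voter -> +1 or -1 for the item being scored.
    trusts: dict mapping user -> set of users that user has said they trust.
    """
    trusted = trusts.get(viewer, set())
    total = 0.0
    for voter, value in votes.items():
        # Votes from people the viewer trusts are multiplied by trust_bonus.
        total += value * (trust_bonus if voter in trusted else 1.0)
    return total
```

With `votes = {"ann": 1, "bob": -1}` and `trusts = {"carol": {"ann"}}`, the item scores higher for `carol`, who trusts `ann`, than for a viewer who trusts nobody.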

Recording User Interest

As the user searches and browses, we note which search keywords and categories the user likes. That way, we can tailor content scores even further to a specific user’s liking. With some clever <a href="http://click.arsdigita.com/doc/clickstream">clickstream analysis</a>, we should even be able to estimate how much time the user spends on individual pages, and use that to figure out whether the user liked what he found or not. That’s even more implicit voting.
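One common way to estimate time on a page from a clickstream is to take the gap between a request and the same user's next request. The sketch below assumes the log is available as `(user, timestamp_seconds, page)` tuples and treats gaps longer than 30 minutes as the user having left; both the input shape and the cutoff are assumptions, and this is not the interface of any particular clickstream package.

```python
from collections import defaultdict


def dwell_times(clicks, max_gap=1800):
    """clicks: iterable of (user, timestamp_seconds, page) tuples.

    Returns {(user, page): estimated seconds spent on that page}.
    """
    by_user = defaultdict(list)
    for user, ts, page in clicks:
        by_user[user].append((ts, page))

    totals = defaultdict(float)
    for user, events in by_user.items():
        events.sort()  # order each user's requests by time
        # Pair every request with the same user's next request.
        for (ts, page), (next_ts, _next_page) in zip(events, events[1:]):
            gap = next_ts - ts
            if gap <= max_gap:  # longer gaps: assume the user walked away
                totals[(user, page)] += gap
    return dict(totals)
```

The last page of a visit gets no estimate, because there is no following request to measure against. A long dwell time is also only a hint: the user may have been reading closely or may simply have been distracted.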

