Interface advice: Categorizing into many, many categories

I have a client that categorizes their documents.



Great, you think. Well, they categorize their documents in one or more of almost 3,000 categories. That’s something that’s hard to build a nice, simple interface for.



Here are some possible approaches that I and the few people I’ve asked have come up with:



1. Present a single page with all 3,000 categories, displayed hierarchically, each with a checkbox next to it. Yeah, that’s what we have today. Let me just say that it doesn’t work great, and that we didn’t start out with 3,000 categories.



2. Let people navigate to the category first (think the Yahoo! directory) and add the document there. We have that, too, but that only helps you choose the first category, not the additional ones.



3. Present a series of drop-downs. First we show you one with the top-level. When you choose there, we show you another one with the subcategories of your first choice. We keep doing that until there are no children or you choose “Add category”. Yes, documents can be added to any category, including those that have subcategories.



4. A variant of option 3 is to use multi-select boxes instead of drop-downs, mimicking the OS X Finder interface.



5. Use a dynamic Windows Explorer-style tree, like XLoadTree.



6. Live substring-based search. Good if you know what you’re looking for, not good for browsing. And it short-circuits the structure, searching just the category names and not their relationships. This seems useful, but it’s an add-on to another solution, not a solution in itself.
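One way to soften the "short-circuits the structure" drawback of option 6 is to show each match together with its full path. Here is a minimal sketch in Python, assuming a hypothetical in-memory taxonomy in which each category has an id, a name, and a parent id. The `Category` class, the function names, and the sample data are all invented for illustration and are not the client's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Category:
    id: int
    name: str
    parent_id: Optional[int]  # None for top-level categories


def category_path(cat_id: int, by_id: dict[int, Category]) -> list[str]:
    """Return the names from the top-level ancestor down to the category."""
    names: list[str] = []
    current: Optional[int] = cat_id
    while current is not None:
        cat = by_id[current]
        names.append(cat.name)
        current = cat.parent_id
    return list(reversed(names))


def search_categories(query: str, by_id: dict[int, Category], limit: int = 20) -> list[str]:
    """Case-insensitive substring match on names, returned as full paths."""
    needle = query.strip().lower()
    if not needle:
        return []
    hits = [
        " > ".join(category_path(cat.id, by_id))
        for cat in by_id.values()
        if needle in cat.name.lower()
    ]
    return sorted(hits)[:limit]


# Hypothetical sample data, invented for illustration only.
CATEGORIES = [
    Category(1, "Campaigns", None),
    Category(2, "Denmark", 1),
    Category(3, "Oceans", 2),
    Category(4, "Offices", None),
    Category(5, "Denmark", 4),
]
BY_ID = {c.id: c for c in CATEGORIES}
```

Showing the path lets two categories with the same name in different branches be told apart in the result list, which a plain name match would not do.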



I’d like to ask your advice. Please send me your suggestions. Screenshots or links to interfaces that solve this well, whether web or not, would be fantastic. Post in the comments or email me, and I’ll put it up and share.

21 comments

John Sequeira
 

Here's an implementation of faceted navigation that might inspire a solution to your many, many categories problem: http://orange.sims.berkeley.edu/cgi-bin/flamenco.cgi/nobel/Flamenco. It's open source: http://flamenco.berkeley.edu/download.html. I'm a big believer in faceted navigation (see http://www.jsequeira.com/cgi-bin/virtualization ).
Kai
 

If you are using a tree structure for your categories and the user needs to select related categories for a content item, you could first present the user with a list of sibling categories (i.e. categories with the same parent as the primary category of the item). Presumably the list of siblings would be relatively short, and relevant too.
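A minimal sketch of the sibling lookup in Python, assuming the tree is stored as a hypothetical mapping from each category name to its parent's name. The `PARENT` data and the `sibling_categories` function are invented for illustration.

```python
from typing import Optional

# Hypothetical taxonomy: category name -> parent name (None for top level).
PARENT: dict[str, Optional[str]] = {
    "Campaigns": None,
    "Oceans": "Campaigns",
    "Forests": "Campaigns",
    "Climate": "Campaigns",
    "Whaling": "Oceans",
}


def sibling_categories(primary: str, parent: dict[str, Optional[str]]) -> list[str]:
    """Categories sharing the primary category's parent, excluding itself."""
    target = parent[primary]
    return sorted(
        name for name, p in parent.items()
        if p == target and name != primary
    )
```

With this sample data, `sibling_categories("Oceans", PARENT)` would offer the other children of "Campaigns" as candidates for the additional categories.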
Michael Yoon
 

Option #4 sounds like the best to me, of the ones you mention, perhaps something like http://johnvey.com/features/deliciousdirector/
Dave Bauer
 

How about using the existing category information to suggest categories, with a dynamically loaded tree as the fallback if that doesn't help? That is, once you have chosen the first category, find out what other categories the documents in that category are also filed under, and suggest those.
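A minimal sketch of such a suggestion step in Python, assuming the existing assignments are available as a hypothetical mapping from document id to its set of category names. The data and the `suggest_categories` function are invented for illustration. It simply counts how often other categories appear alongside the chosen one.

```python
from collections import Counter

# Hypothetical data: document id -> set of category names already assigned.
DOC_CATEGORIES: dict[str, set[str]] = {
    "doc1": {"Oceans", "Whaling", "Japan"},
    "doc2": {"Oceans", "Whaling"},
    "doc3": {"Oceans", "Overfishing"},
    "doc4": {"Forests", "Amazon"},
}


def suggest_categories(chosen: str, doc_categories: dict[str, set[str]], limit: int = 5) -> list[str]:
    """Rank other categories by how often they co-occur with `chosen`."""
    counts = Counter()
    for cats in doc_categories.values():
        if chosen in cats:
            counts.update(cats - {chosen})
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]
```

An empty result would be the signal to fall back to the tree.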
Hartvig
 

How about fewer? If you have 3k categories, in what scenario are they used, and why is there a need for 3,000? Has the need been tested, possibly by simulating a cut down to 100 categories? That would be the first place to look: with 3,000 categories it would probably take longer to pick categories than to write the original draft.
Thijs van der Vossen
 

Why does <a href="http://weblog.greenpeace.org/it/2006/08/advice_sought.html">Greenpeace</a> think they need 3000 categories for organizing their documents?
Tanya
 

@1 What John said. I would see if the categories (or perhaps they're really descriptors) can be sorted to allow parametric browsing.
Lars Pind
 

That's a perfectly valid question. I think that if the taxonomy is clear enough, that can be completely reasonable. I think the "ICD":http://en.wikipedia.org/wiki/ICD has thousands and thousands of categories. If people understand the taxonomy and know how to navigate it, it can be the right thing. I don't know if it's right for Greenpeace, and frankly, I don't think it matters. Whether there are 500 or 3,000, we still need a good interface, and the one we have wasn't good even when the number of categories was in the low hundreds (sorry, Yon!). But the current set of categories has evolved from a much smaller set, and it hasn't been completely mindless: there's been card sorting and stuff :)
Lars Pind
 

@Michael Yoon: Thanks a bunch for the link. Seeing something in action is great, and this is a very good way to solve it.
James Melzer
 

More use case info would be helpful. Are there catalogers that catalog everything, or are these random business users cataloging for themselves? If the former, then known-item searching is probably the fastest and best interface. They'll know the taxonomy or have a paper copy to refer to taped to the wall of their cube. On the other hand, if this is for lots of 'amateur' end users cataloging their own materials, I'd go the opposite direction. Their first few visits, show them the whole taxonomy (which sucks, as you said) but remember what categories they used. After a few visits, show their favorite categories first, with the option to see or search the entire list. Chances are, they'll be using the same small set of categories over and over, so this will speed up their work a lot (and make the interface simpler and faster).
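A minimal sketch of the "favorites first" idea in Python, assuming picks are recorded per user in memory. The `CategoryPicker` class and its method names are hypothetical, and a real system would persist the counts somewhere.

```python
from collections import Counter


class CategoryPicker:
    """Remember which categories a user picks and list their favorites first."""

    def __init__(self, all_categories: list[str]) -> None:
        self.all_categories = sorted(all_categories)
        self.usage: dict[str, Counter] = {}  # user -> pick counts

    def record_pick(self, user: str, category: str) -> None:
        self.usage.setdefault(user, Counter())[category] += 1

    def choices_for(self, user: str, top_n: int = 3) -> list[str]:
        """Up to top_n most-used categories, then the rest alphabetically."""
        counts = self.usage.get(user, Counter())
        # Sort by descending count, then by name so ties are deterministic.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        favorites = [name for name, _ in ranked[:top_n]]
        rest = [c for c in self.all_categories if c not in favorites]
        return favorites + rest
```

A user with no recorded picks simply sees the whole list in alphabetical order, which matches the "first few visits" case.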
Hamilton
 

http://developer.yahoo.com/yui/examples/treeview/ If you go with the dynamic tree option, you might find the above useful.
Eric Reiss
 

You don’t say how many individual documents are represented by these 3,000 categories. However, the sheer number of categories at the top-most level is clearly unwieldy. Most of your proposed solutions build on display mechanisms (Yahoo directory, XLoadTree, drop-downs, etc.). But for serendipitous navigation, you need to start by rethinking the basic categories, establishing broader categories at the top, but you already know this. Depending on the number of individual documents, you might emulate Amazon’s collaborative filtering method. This allows people to search for a term and then surf through related items. It’s not a true faceted classification system, nor is it hierarchical, but it does combine the best of a couple of different information-seeking worlds.
Lars Pind
 

There are 5 categories at the top level. At the second level, there are at most 30 subcategories, and they represent geography. Also, I should mention that each local office gets its own corner of the taxonomy that it controls, and people generally don't have to deal with the corners controlled by other offices, which cuts down the number of categories a person has to deal with quite a bit. Note that I'm not talking about the interface for browsing the documents in these categories. What I'm looking for is the interface for choosing a category for a document that you already have on the screen.
Thijs van der Vossen
 

It would be great if you could post a screenshot of the current interface and/or show us the current list or tree of categories.
Lars Pind
 

@Thijs: Unfortunately, I'm not in a position to grant that, and the person that is, isn't back until Monday.
Thijs van der Vossen
 

I know, I've been trying to get hold of him myself too... :-)
Koyan
 

I would do it like this: http://developer.yahoo.com/yui/examples/treeview/default.html?mode=dist with check boxes right before the names. You can still leave the clients free to do whatever categorisation they want (since I doubt you will get them to make the number of categories smaller), and you can present them with a fast-loading page. Now, if they are willing to pay more, you can add functionality like "remember the last categories I had open", etc.
Thijs van der Vossen
 

Ok, so you have a tree of hierarchical categories where you must be able to select multiple nodes? How about <a href="http://stuff.vandervossen.net/external/2006/treeform.mov">this (QuickTime)</a>?
Lars Pind
 

@Thijs: That's pretty neat. You built that? Is there code somewhere?
Thijs van der Vossen
 

It's just a nested list with checkboxes inside and some CSS to show the list item you're hovering over. Very simple, really.
Martin
 

Ok, OK, I'll post a screenshot. It'll be at http://weblog.greenpeace.org/it shortly... First - thanks for all the suggestions - much appreciated! To answer some of the questions... Control of the taxonomy is devolved out to the users of the system because otherwise the system ends up with too few users - and legislating for everything folks come up with is close to impossible. The 3000 categories represent just over 12000 documents - which isn't a bad ratio, although some tidying up will probably help. There is also the issue that 27 local offices have access to the system, effectively creating the need for 27 sets of 'how to find the office' or 'what to do when the photocopier breaks' documents.
