How many documents can you read and classify in a day?
A person might be able to scan a thousand or so documents or webpages in a day and decide whether each one is relevant to a given research topic, such as the yogurt market in France, the mobile ringtone market in China, or whatever topic you want the latest information on. A human researcher can read only so many pages per minute, but she will have a fairly low error rate (relevant documents classified as irrelevant, or vice versa). After all, she has some idea of what she is looking for, and she can decide in real time how far to extend the boundaries of the topic as she reads more documents and learns more about it herself.
A machine programmed once (a fully automated system), for example with a set of keywords to search for across the internet or a corporate intranet, can classify some tens of thousands of documents as relevant (depending on the specificity of the topic and how much information about it is out there). After it has picked the low-hanging fruit, however, the rate of misclassified documents begins to go up. It has no way to adjust its idea of what it is looking for to include documents that may well be relevant to your topic but that don’t contain enough of the keywords, or precisely the same keywords, from the set you initially identified.
A machine that works together with a human researcher in a continuous feedback loop (a semi-automated system) can adjust its model of the relevant research topic in real time, based on implicit and explicit feedback. It can therefore deploy its more robust model across a much larger set of documents and webpages while keeping the number of misclassified documents down to levels close to what a human researcher would produce, if a human could surf the internet or a corporate intranet and read and process content as fast as a machine.
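To make the difference between the two kinds of machine concrete, here is a minimal sketch in Python. It is not the Conatix implementation, which this text does not describe; the class name FeedbackClassifier, the threshold and step values, and the sample sentences are all assumptions chosen for illustration. The classifier starts from a fixed keyword list, as a fully automated system would, and then adjusts its term weights whenever a researcher marks a document as relevant or irrelevant.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class FeedbackClassifier:
    """Hypothetical keyword scorer that learns from researcher feedback."""

    def __init__(self, keywords, threshold=1.0, step=0.5):
        # Every initial keyword starts with the same weight (an assumption).
        self.weights = {k.lower(): 1.0 for k in keywords}
        self.threshold = threshold
        self.step = step

    def score(self, text):
        # Sum the weights of the distinct known terms found in the text.
        return sum(self.weights.get(t, 0.0) for t in set(tokenize(text)))

    def is_relevant(self, text):
        return self.score(text) >= self.threshold

    def feedback(self, text, relevant):
        # Explicit feedback: raise the weight of every term in a document
        # marked relevant, lower it for a document marked irrelevant.
        delta = self.step if relevant else -self.step
        for t in set(tokenize(text)):
            self.weights[t] = self.weights.get(t, 0.0) + delta


clf = FeedbackClassifier(["yogurt"])
new_doc = "Fromage frais sales climb in Lyon"

before = clf.is_relevant(new_doc)  # no keyword match yet
clf.feedback("French yogurt and fromage frais sales rose", relevant=True)
after = clf.is_relevant(new_doc)   # related terms now carry weight

print(before, after)  # prints: False True
```

Without the feedback call, the classifier behaves like the machine programmed once: the second document never mentions "yogurt", so it is missed. After a single judgment from the researcher, terms such as "fromage" and "frais" carry enough weight to pull it in. A real system would also need to filter common words such as "and" and to make use of implicit feedback, both of which this sketch leaves out.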
In addition to being a powerful tool for finding and pulling in content relevant to your research topic, the Conatix system streamlines the business research workflow to help you organize, use and share that content with the other members of your research team.