Xerox Scientists Apply Insights From Ethnography to Develop New Way to Categorize Documents

July 14th, 2005

Employing the same ethnographic methods used to observe the social order on a Polynesian atoll or document the culture of natives in southern Siberia, Xerox Corporation (NYSE: XRX) scientists have injected more human know-how into text mining, the practice of using computer analysis of documents to extract new information. The result is better categorization, with higher-quality, customized results.

In a paper titled “Work Practice in Research: A Case Study,” presented today at the International Council on Systems Engineering symposium, Nathaniel G. Martin, an ethnographer and computer scientist in the Xerox Innovation Group in Webster, N.Y., described the new technology.

Categorization is a powerful form of text mining. It associates a document with subject categories that a computer learns from a “training set” of documents that a subject matter expert has classified by hand. The new software program improves the speed and accuracy of categorizing systems because it helps the subject matter expert interactively create the training set, choosing and refining the categories and the conditions under which they are applied.
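
To make the idea of learning categories from a hand-classified training set concrete, here is a minimal sketch in Python. It is not the Xerox software and does not reproduce its interactive step; it assumes a plain naive Bayes word-count model, and the function names, categories, and sample documents are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately simple: lowercase and split on whitespace.
    return text.lower().split()


def train(training_set):
    """Build word statistics from (text, category) pairs labelled by hand."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    vocab = set()
    for text, category in training_set:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab


def categorize(text, model):
    """Return the category with the highest naive Bayes log score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Prior: share of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing so an unseen word/category pair is not zero.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training set, classified by hand by a subject matter expert.
training_set = [
    ("printer paper jam in tray two", "hardware"),
    ("toner cartridge is empty", "hardware"),
    ("driver install fails on startup", "software"),
    ("application crashes after update", "software"),
]

model = train(training_set)
print(categorize("paper jam again", model))
print(categorize("driver crashes", model))
```

In this toy setup the quality of the result depends entirely on the labelled examples, which is why a tool that helps the expert build and refine the training set could matter, especially for short documents that contain few words to match on.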

It’s a technique that could improve results from traditional categorizing systems and is particularly useful for classifying short documents, according to Martin.





