Finding Useful Content

Written by Nexcerpt on May 14th, 2010 in Patterns & News.

Nexcerpt does an enormous amount of work every day, automatically.  The service visits over 6,000 sources, each on a schedule adapted to the volume and timing of the articles that source typically offers.  In that process, it notices between 50,000 and 70,000 new articles each day.  Overall, the system observes well over one million clickable links daily, then recognizes the few among them (approximately five percent) that qualify as new, useful, meaningful articles.

Some rules for recognizing the most current and important articles are simple and obvious.  Others are extremely complex; the system may need days or weeks, and hundreds of examples, to learn certain rules.  Since its first day of testing (with 100 sources, in August 2001), Nexcerpt has considered at least three billion links, choosing 125 million articles from among them.  It knows a lot about how to recognize valuable content.
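
For readers who want to check the arithmetic, here is a minimal sketch that turns the figures quoted in this post into selection rates.  The function and constant names are ours, chosen only for illustration; they are not part of Nexcerpt.  Because the link counts are stated as lower bounds ("well over one million", "at least three billion"), the rates it computes are upper bounds.

```python
def selection_rate(articles: int, links: int) -> float:
    """Return the fraction of observed links that were kept as articles."""
    if links <= 0:
        raise ValueError("links must be positive")
    return articles / links


# Figures quoted in the post. Link counts are lower bounds,
# so every rate computed from them is an upper bound.
DAILY_LINKS = 1_000_000            # "well over one million" per day
DAILY_ARTICLES_LOW = 50_000        # low end of the daily article range
DAILY_ARTICLES_HIGH = 70_000       # high end of the daily article range
LIFETIME_LINKS = 3_000_000_000     # "at least three billion" since 2001
LIFETIME_ARTICLES = 125_000_000    # articles chosen since 2001

daily_low = selection_rate(DAILY_ARTICLES_LOW, DAILY_LINKS)
daily_high = selection_rate(DAILY_ARTICLES_HIGH, DAILY_LINKS)
lifetime = selection_rate(LIFETIME_ARTICLES, LIFETIME_LINKS)

print(f"daily rate (upper bounds): {daily_low:.1%} to {daily_high:.1%}")
print(f"lifetime rate (upper bound): {lifetime:.1%}")
```

The daily figures work out to at most five to seven percent, and the lifetime figures to a little over four percent, which is in line with the "approximately five percent" quoted above.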

The system continues to learn; we continue to manually test, groom, and improve the thousands of business rules that populate the database.  We encourage all our clients to report examples of links, articles, or excerpts that don’t seem up to Nexcerpt’s high standards.  We’ll do all we can to adapt…
