The Open House Project from The Sunlight Foundation

Technology Notes, February 12th

February 12th, 2008 by John Wonderlich

Reuters introduced a new entity extraction tool called Calais. Through an API or a web-based submission form, it takes text, recognizes the entities in it, and outputs an RDF file of those entities, complete with URIs.  I wonder if entity extraction will become like spellchecking: widely available and free.  I also wonder if tools that use structured semantic data will adapt to take advantage of what will likely become redundant extraction tools, much like running redundant OCR, but for semantic elements.  Could multiple extraction tools also lead to a sort of consensus-building around markup languages, where the tools legitimize certain modes of reference by virtue of reliably recognizing them, in perhaps the same way search engines normalize hypertext links?
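To make the redundancy idea a little more concrete, here is a minimal Python sketch of majority voting across several extractors, in the spirit of redundant OCR. Everything in it is an assumption for illustration: the tool outputs are made-up lists of (name, type) pairs, not real Calais results, and `consensus_entities` is a hypothetical helper, not part of any extraction API.

```python
from collections import Counter

def consensus_entities(extractions, min_votes=2):
    """Keep entities reported by at least `min_votes` extractors.

    `extractions` is a list with one entry per tool; each entry is a
    list of (name, type) pairs. This shape is assumed for the sketch.
    """
    votes = Counter()
    for entities in extractions:
        # Normalize names and use a set so each tool votes once per entity.
        votes.update({(name.strip().lower(), kind) for name, kind in entities})
    return sorted(entity for entity, n in votes.items() if n >= min_votes)

# Hypothetical outputs from three extraction tools run on the same text.
tool_a = [("Reuters", "Company"), ("Calais", "Product")]
tool_b = [("reuters", "Company"), ("Sunlight Foundation", "Organization")]
tool_c = [("Reuters", "Company"), ("Calais", "Product"), ("RDF", "Technology")]

agreed = consensus_entities([tool_a, tool_b, tool_c])
print(agreed)
```

Entities that only one tool reports are dropped, so the forms that survive are the ones several tools recognize reliably, which is roughly the consensus effect wondered about above.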

Next: I happened across Twiddla.com yesterday, a free, shareable web-based whiteboard that lets you doodle on or annotate web pages with others in a chat room.  Highly amusing, and probably useful too.  I seem to remember something like this coming out a year ago, a tool that added chat to every web page, but I don’t remember what it was called and haven’t seen it since.  Guess it wasn’t that successful.

I’ve also been familiarizing myself with the work of dataportability.org, which promotes open standards for data sharing.  I especially like OPML for its simplicity and usefulness (I’ll be sharing a big OPML file here soon), and hope that APML includes an element for sharing one’s OPML along with one’s other social information.
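For a sense of how simple OPML is, here is a rough Python sketch that builds a tiny subscription list using only the standard library. The feed name and URL are placeholders, and `build_opml` is a hypothetical helper written for this example; the element and attribute names (`opml`, `head`, `body`, `outline`, `text`, `type`, `xmlUrl`) follow the usual OPML 2.0 subscription-list layout.

```python
import xml.etree.ElementTree as ET

def build_opml(title, feeds):
    """Return an OPML 2.0 subscription list as a string.

    `feeds` is a list of (name, feed_url) pairs.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, url in feeds:
        # One <outline> element per feed; xmlUrl points at the feed itself.
        ET.SubElement(body, "outline", text=name, type="rss", xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")

# Placeholder feed, not a real subscription.
feeds = [("Example Feed", "http://example.com/feed.xml")]
opml_text = build_opml("My Subscriptions", feeds)
print(opml_text)
```

The whole format is a title plus a list of outline elements, which is a big part of why it is so easy to share.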

Tags: OpenHouse · ampl · calais · opml · rdf · semantic · xml
