
Beyond the Fat Wire

Wednesday, March 16, 2005

Remixing Culture with RDF

Remixing Culture with RDF: Running a Semantic Web Search in the Wild
Matthew Haughey, Creative Director, Creative Commons
Mike Linksvayer, CTO, Creative Commons

Creative Commons

Licenses and Metadata
  • tracking the license
  • format of the work
  • permissions, requirements
  • extra metadata
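
To make the four bullets above concrete, here is a rough sketch of the kind of record a search engine might keep per work. This is my own illustration, not anything shown in the talk: the class name `WorkMetadata`, its fields, and the sample values are all hypothetical. The permission and requirement names mirror terms from the Creative Commons RDF vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class WorkMetadata:
    """Hypothetical per-work record; field names are illustrative only."""
    license_url: str                             # tracking the license
    work_format: str                             # format of the work (a MIME type here)
    permits: set = field(default_factory=set)    # permissions
    requires: set = field(default_factory=set)   # requirements
    extra: dict = field(default_factory=dict)    # extra metadata (title, author, ...)

    def allows(self, action):
        """True if the license explicitly permits the given action."""
        return action in self.permits


# Made-up example record for an attribution-licensed photo.
photo = WorkMetadata(
    license_url="http://creativecommons.org/licenses/by/2.0/",
    work_format="image/jpeg",
    permits={"Reproduction", "Distribution", "DerivativeWorks"},
    requires={"Attribution", "Notice"},
    extra={"title": "Example photo"},
)
```

With a record like this, "can I remix it?" becomes `photo.allows("DerivativeWorks")`.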

Why use the Semantic Web?

  • small organization
  • natural language search not good for plain text metadata
  • decentralization -- other search engines could use it
  • existing RDF toolkits could be used

Metadata format
  • publishers and search engine needs
  • considered html head elements
  • considered robots.txt hacks
  • considered data in extra files/link element
  • chose RDF in HTML
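
A minimal sketch of what "RDF in HTML" means for a crawler, assuming the RDF block is wrapped in an HTML comment so browsers ignore it. The sample page, the example.org URL, and the function name `extract_licenses` are made up for illustration; this is not the parser Creative Commons or Nutch actually uses. It relies only on Python's standard `re` and `xml.etree.ElementTree` modules.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC_NS = "http://web.resource.org/cc/"

# Illustrative page: the RDF chunk sits inside an HTML comment.
SAMPLE_PAGE = """<html><body>
<p>My photo essay.</p>
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="http://example.org/essay">
    <license rdf:resource="http://creativecommons.org/licenses/by/2.0/" />
  </Work>
  <License rdf:about="http://creativecommons.org/licenses/by/2.0/">
    <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
    <requires rdf:resource="http://web.resource.org/cc/Attribution" />
  </License>
</rdf:RDF>
-->
</body></html>"""

# Non-greedy match so several RDF chunks on one page are found separately.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)


def extract_licenses(html):
    """Return (work_url, license_url) pairs from RDF chunks embedded in html."""
    pairs = []
    for block in RDF_BLOCK.findall(html):
        try:
            root = ET.fromstring(block)
        except ET.ParseError:
            continue  # skip malformed chunks rather than fail the whole page
        for work in root.findall(f"{{{CC_NS}}}Work"):
            about = work.get(f"{{{RDF_NS}}}about")
            for lic in work.findall(f"{{{CC_NS}}}license"):
                pairs.append((about, lic.get(f"{{{RDF_NS}}}resource")))
    return pairs
```

Calling `extract_licenses(SAMPLE_PAGE)` yields one pair linking the essay to its license URL. Pulling the chunk out with a regex first means the rest of the page does not have to be well-formed XML.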

Metadata format II
  • ease of use primary goal - copy/paste button and rdf in one chunk
  • any custom elements automated by license app
  • (one more, but I missed it ... it's hard to type while balancing an 8 lb laptop on one knee)

Nutch -- open source web crawler/indexer/query interface, aims to be massively scalable, built on the Lucene Java text search library

lucene.apache.org

Oregon State used Nutch to replace their Google search appliances

osuosl.org/news_folder/nutch

the future of Creative Commons metadata: RDF/A, Semantic XHTML, GRDDL (pron: griddle)

Conclusions:
  • semantic web lets anyone use the entire web as a db
  • nutch is a mostly prebuilt app for your domain
  • domain specific search engines without the infrastructure of a search engine company
  • solves semantic catch 22: publishing data/consuming data
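
The "web as a db" idea can be sketched as a query over subject/predicate/object triples collected by a crawler. This is my own toy illustration, not code from the talk: the triples, the example.org URLs, and the function `works_permitting` are invented, and a real setup would use an RDF toolkit and a proper index rather than a Python list.

```python
CC = "http://web.resource.org/cc/"
BY = "http://creativecommons.org/licenses/by/2.0/"
BY_ND = "http://creativecommons.org/licenses/by-nd/2.0/"

# Made-up triples, as if gathered from many independently published pages.
TRIPLES = [
    ("http://example.org/song", CC + "license", BY),
    ("http://example.org/photo", CC + "license", BY_ND),
    (BY, CC + "permits", CC + "DerivativeWorks"),
    (BY, CC + "permits", CC + "Reproduction"),
    (BY_ND, CC + "permits", CC + "Reproduction"),
]


def works_permitting(triples, permission):
    """Return sorted URLs of works whose license permits the given action."""
    # Step 1: find every license that grants the permission.
    licenses = {s for s, p, o in triples
                if p == CC + "permits" and o == permission}
    # Step 2: find every work published under one of those licenses.
    return sorted(s for s, p, o in triples
                  if p == CC + "license" and o in licenses)
```

Asking for `works_permitting(TRIPLES, CC + "DerivativeWorks")` returns only the song, which is the "find me something I can remix" search in miniature.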
