VLOG - Auto-Tagging, the quest for Compounds!

In this video, I explain the concept of auto-tagging: the task of automatically extracting terms from unstructured text.

Terminology mining, term extraction, term recognition, or glossary extraction, is a subtask of information extraction. The goal of terminology extraction is to automatically extract relevant terms from a given corpus.

In the semantic web era, a growing number of communities and networked enterprises started to access and interoperate through the internet. Modeling these communities and their information needs is important for several web applications, like topic-driven web crawlers, web services, recommender systems, etc. The development of terminology extraction is essential to the language industry.

From Wikipedia - Terminology extraction

In this log, I speak about Auto-tagging:

  • Why we need it
  • What the challenge is
  • The different approaches: naive term frequency versus a global semantic web service
  • The key role of compounds (i.e., two-word tags)
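
To make the naive term-frequency approach and the role of compounds concrete, here is a minimal sketch that counts single words and adjacent two-word pairs in a text. The function name `extract_tags`, the tiny stopword list, and the tokenization rules are illustrative assumptions, not the method presented in the video.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}

def extract_tags(text, top_n=5):
    """Return the top_n most frequent single words and two-word compounds."""
    # Split on punctuation so that compounds never span a sentence boundary.
    fragments = re.split(r"[.,;:!?()\n]+", text.lower())
    counts = Counter()
    for fragment in fragments:
        words = re.findall(r"[a-z][a-z'-]*", fragment)
        # Single-word candidates: every word that is not a stopword.
        counts.update(w for w in words if w not in STOPWORDS)
        # Compound candidates: adjacent pairs where neither word is a stopword.
        for first, second in zip(words, words[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[f"{first} {second}"] += 1
    return counts.most_common(top_n)

sample = "Tag clouds are popular. Tag clouds help navigation. The semantic web is growing."
print(extract_tags(sample, top_n=3))
```

With this sample, the compound "tag clouds" is counted as a candidate in its own right, alongside "tag" and "clouds". Raw frequency alone cannot tell which of the three is the better tag, which is one reason the naive approach falls short.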


VLOG - Search & Tag Clouds

This is my first video log. I will try to address topics more openly and frequently this way.

In this log, I speak about Search and Tag Clouds:

  • Difference between Search and Tag Clouds
  • Similarity between Search Clouds and Search Autocomplete Suggestions
  • Potential to extend their usage for navigation
  • The challenge of mixing search clouds and tag clouds together (usage frequency versus document frequency)
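
The difficulty in mixing the two clouds is that usage frequency (how often a term is searched) and document frequency (how many documents carry a tag) live on different scales. The sketch below shows one possible way to combine them, assuming each is normalized by its own maximum and blended with a weight `alpha`. The function name `blend_clouds`, the normalization, and the sample numbers are hypothetical and are not taken from the video.

```python
def blend_clouds(search_counts, doc_counts, alpha=0.5):
    """Combine usage frequency and document frequency into one weight per term.

    search_counts: term -> number of times users searched for it (usage).
    doc_counts:    term -> number of documents tagged with it.
    alpha:         share given to usage; (1 - alpha) goes to document frequency.
    """
    # Normalize each source by its own maximum so both fall in the range [0, 1].
    max_search = max(search_counts.values(), default=0) or 1
    max_doc = max(doc_counts.values(), default=0) or 1
    terms = set(search_counts) | set(doc_counts)
    return {
        term: alpha * search_counts.get(term, 0) / max_search
        + (1 - alpha) * doc_counts.get(term, 0) / max_doc
        for term in terms
    }

# Made-up sample numbers, for illustration only.
searches = {"python": 10, "tags": 5}
documents = {"tags": 40, "clouds": 20}
print(blend_clouds(searches, documents))
```

A term that appears in only one of the two sources still receives a weight, so a mixed cloud can show heavily searched terms that few documents are tagged with, and the other way round.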


Tag clouds - what is at stake?

Echoing the ever-increasing popularity of tag clouds, the emerging field of auto-tagging aims to populate these attractive visual components without requiring each page to be tagged by hand. While many approaches attempt this, most do not address the real underlying technical challenges. To evaluate how tag clouds can deliver their full potential, we have to analyze what is at stake for end users and understand why tag clouds could make a real difference in the way people access information.

In this new domain, where art meets technology, visualization meets data mining, and repetitive manual effort meets automation, I felt it was interesting to inspect the different components and their roles, and to figure out what is new and what is old, what is solved and what is not, what is possible and what is pure fantasy…

[Image: Wordle tag cloud]

