After reviewing an article posted by Nova Spivack (cool timeline graph - check it out) concerning the evolution of what he is titling the "WebOS", my mind has been reeling.
Much of his discussion concerning Web 3.0 circles around the notion of web semantics, or describing web content in computer-friendly ways, such as OWL, perhaps Microformats, etc. I have to say that I couldn't agree more with him. As content becomes more and more dense and the web becomes more and more crowded intellectually, it will behoove us to describe the nature of our information with something richer than simple text or HTML.
One of the difficulties that I have run into with myotherskills.com is the search functionality. Namely, the problem of linguistic semantics, or how words relate to one another. I wanted to design a search that stepped away from prescribed, categorized, or tagged data and allowed people to use natural language to describe their skills/needs. I would let the application do the grunt work of categorizing by indexing (using Ruby's Ferret) the skill descriptions and titles. This, although very effective, still has an inherent problem.
When a person knows exactly what skill they are looking for they can easily search for it. But let's consider an example of why that isn't much help.
Let's say that John is on myotherskills.com looking for a carpenter to help build his new home. He plugs the word 'carpenter' into the search bar with a proximity of one mile and hits enter. Jamie is a 20-year veteran contractor and can build a house in nothing flat. Unfortunately, when she entered her skill as a contractor she did not use the word 'carpenter' to describe it. She lives within one mile of John, but will never be found even though they both meant the same thing.
Now you and I know that a 'contractor' and a 'carpenter' are linguistically similar. You could use either to describe the skill you are looking for. But the computer is looking for exact word matches (although it is possible to extend the search to cover plurals). There is a huge disconnect. Even if we use good web semantics and mark up the two words in our code as 'skill_descriptors' with OWL or some other semantic markup, the disconnect still exists.
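To make the disconnect concrete, here is a minimal sketch of an exact-word search. It is not the actual myotherskills.com code; the `tokenize` and `exact_search` helpers and the sample listings are made up for illustration.

```ruby
# Split text into lowercase word tokens.
def tokenize(text)
  text.downcase.scan(/[a-z]+/)
end

# Return the names of people whose description contains the exact query word.
def exact_search(query, skills)
  word = query.downcase
  matches = skills.select { |s| tokenize(s[:description]).include?(word) }
  matches.map { |s| s[:name] }
end

# Hypothetical listings, invented for this example.
skills = [
  { name: "Jamie", description: "General contractor, 20 years of building houses" },
  { name: "Pat",   description: "Carpenter specializing in decks and framing" }
]

p exact_search("carpenter", skills)   # Jamie is not in the results
```

Jamie only turns up if John happens to type 'contractor', which is exactly the problem.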
I believe that one of the greatest assets to the next generation of the web will be to encourage semantics not only in the content presentation, but also within the search.
So what could the future look like with a Semantic Search Service?
Let's borrow our example from above. John performs his search again using the word 'carpenter.' Now for futuristic fun let's say that our application takes the word 'carpenter' and sends it to a web service that compares the word to other similar words, perhaps even in different languages. The web service returns an array (list) of words or phrases that match the meaning of 'carpenter'. We then take that array and compare it to the skills listed in our database. Voilà! We find that Jamie's skill description contains the word 'contractor', which was in our array from the web service. John and Jamie meet, build a new house and live happily ever after!
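The flow above could be sketched roughly like this. No such service exists in this post, so `related_words` is a hypothetical stand-in backed by a small hard-coded table; a real version would make a network call instead. The `semantic_search` name and the sample listings are likewise assumptions.

```ruby
require "set"

# Hypothetical stand-in for the imagined semantic web service.
# It returns the query word plus words with a similar meaning.
def related_words(word)
  table = {
    "carpenter" => %w[carpenter contractor builder woodworker]
  }
  key = word.downcase
  table.fetch(key, [key])
end

# Expand the query into related words, then match any of them
# against the words in each skill description.
def semantic_search(query, skills)
  terms = related_words(query).to_set
  matches = skills.select do |s|
    s[:description].downcase.scan(/[a-z]+/).any? { |w| terms.include?(w) }
  end
  matches.map { |s| s[:name] }
end

# Hypothetical listings, invented for this example.
skills = [
  { name: "Jamie", description: "Contractor, 20 years of building houses" },
  { name: "Sam",   description: "Piano teacher" }
]

p semantic_search("carpenter", skills)   # Jamie is found this time
```

Keeping the lookup behind one small function means the table could later be swapped for whatever service eventually provides the word list, without touching the search itself.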
How great would that be?! The semantic search web service would be a substantial benefit to our next generation web-driven world.
The trick is how to build it. I've seen many trying to use complex algorithms to analyze language for meaning, etc. Reverse engineering language by applying complex mathematical formulas in an attempt to derive meaning is certainly interesting. That method would be very useful in translation, but still does not help provide an immediate array of associative words and phrases.
Then there is the 'reach out and solicit help from normal people' solution Google used in their image tagging game. Something similar could be used to help develop word associations. Say two people look at one word and try to list as many words as they can associate with it within 10 seconds.
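As a rough sketch of how such a game might feed a word list, assume an association only counts when both players type the same word, and that agreement is tallied across rounds. The `record_round` helper and the sample answers are invented for illustration.

```ruby
# Record one round: both players saw `prompt`. Keep only the words
# that both of them typed, and bump the tally for each agreed word.
def record_round(associations, prompt, player_a, player_b)
  agreed = player_a.map(&:downcase) & player_b.map(&:downcase)
  agreed.each { |word| associations[prompt][word] += 1 }
  agreed
end

# prompt word => { associated word => number of rounds it was agreed on }
associations = Hash.new { |hash, key| hash[key] = Hash.new(0) }

record_round(associations, "carpenter", %w[contractor builder wood], %w[Builder hammer contractor])
record_round(associations, "carpenter", %w[builder joiner], %w[builder nails])

# 'builder' was agreed on in both rounds, 'contractor' in one.
p associations["carpenter"]
```

Words with a high tally would be the ones worth returning from a semantic search service.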
Also, a spider program that travels through the web in real time, comparing words in context (say, the words found on carpentry websites), would be far more useful, since language evolves just as the web and technology do.
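One simple way to compare words in context is to count which words share a page with the target word. The sketch below assumes the spider has already fetched the page text, so the crawling itself is left out; the `cooccurrences` helper, the stop-word list, and the sample pages are all made up.

```ruby
# Very small stop-word list, for illustration only.
STOP_WORDS = %w[a an and the for of to in on].freeze

# Count how many pages each word shares with `target`.
def cooccurrences(target, pages)
  counts = Hash.new(0)
  pages.each do |text|
    words = text.downcase.scan(/[a-z]+/).uniq - STOP_WORDS
    next unless words.include?(target)
    (words - [target]).each { |word| counts[word] += 1 }
  end
  counts
end

# Hypothetical page texts standing in for crawled carpentry websites.
pages = [
  "The carpenter and the contractor framed the house",
  "A contractor hired a carpenter for the roof",
  "Piano lessons for beginners"
]

counts = cooccurrences("carpenter", pages)
p counts.max_by { |_word, n| n }   # the word that shares the most pages with 'carpenter'
```

Raw counts like these are a crude signal, but re-running the crawl would let the associations shift as the language does.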
Of course, sitting down at the computer with a thesaurus, dictionary, and language lexicon and entering data could do the same thing...but who has time? ;)
Just some thoughts...thanks for reading. ;)