Let's say we have a database table called responses, where each row contains an id, a true/false flag, and a paragraph of text.
|1||true||I have a great experience. I was treated very well. The person was very nice|
|2||false||I had a terrible experience. I was not treated very well. I thought person was very mean.|
We give each word an id in one table. Let's call it the words table.
We go through each row and get all of its words. If a word does not exist in the words table yet, we add it (a new id is created for that word).
Then we take every combination of 2 words in that paragraph and add it to an occurrences table.
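Here is a rough sketch of those steps in Python, using plain dicts and lists in place of real database tables. The tokenize helper, the lowercasing, and the choice to count each distinct pair once per paragraph are all my assumptions, not settled decisions.

```python
import itertools
import re

# Sample rows shaped like the responses table: (id, flag, text).
responses = [
    (1, True, "I have a great experience. I was treated very well. The person was very nice"),
    (2, False, "I had a terrible experience. I was not treated very well. I thought person was very mean."),
]


def tokenize(text):
    # Assumption: lowercase everything and keep runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


words = {}        # word -> id, stands in for the words table
occurrences = []  # (response_id, flag, word_id_a, word_id_b) rows

for response_id, flag, text in responses:
    tokens = tokenize(text)
    for token in tokens:
        if token not in words:
            words[token] = len(words) + 1  # a new id for a new word
    # Every combination of 2 distinct words in the paragraph.
    # Assumption: each pair is recorded once per paragraph.
    unique_ids = sorted({words[t] for t in tokens})
    for a, b in itertools.combinations(unique_ids, 2):
        occurrences.append((response_id, flag, a, b))
```

Sorting the ids before pairing means a pair is always stored smaller id first, so the same two words never show up as two different rows.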
Question: Do we want to count the same word in the same response more than once in the relationships? For example, the word ‘I’ occurs three times in the second response, so ‘I’ and ‘experience’ would be paired three times.
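To make the two options concrete, here is a small sketch that counts pairs both ways for the second response. The token list is typed out by hand (lowercased, punctuation dropped), and the names once and every are just for illustration.

```python
from collections import Counter
from itertools import combinations

tokens = (
    "i had a terrible experience i was not treated very well "
    "i thought person was very mean"
).split()

# Option A: count a pair once per response, however often the words repeat.
once = Counter(combinations(sorted(set(tokens)), 2))

# Option B: count every pairing of token positions, so repeats add up.
every = Counter(
    tuple(sorted((a, b)))
    for a, b in combinations(tokens, 2)
    if a != b  # skip a word paired with another copy of itself
)

print(once[("experience", "i")])   # 1
print(every[("experience", "i")])  # 3
```

Option B will probably let very common words like ‘I’ dominate the rankings, which might be a reason to prefer option A or to filter those words out.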
Basically, we then get all the true occurrences and rank them by count. We do the same with the false occurrences, and we can present the results however we want.
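The ranking step could look something like this. The occurrence rows below are made up for the example (they are not real results), and the rank function is a hypothetical helper.

```python
from collections import Counter

# Made-up occurrence rows for illustration: (flag, word_a, word_b).
occurrences = [
    (True, "great", "experience"),
    (True, "great", "experience"),
    (True, "nice", "person"),
    (False, "terrible", "experience"),
    (False, "mean", "person"),
    (False, "terrible", "experience"),
    (False, "terrible", "experience"),
]


def rank(rows, flag):
    # Count each word pair among rows with the given flag, highest count first.
    counts = Counter((a, b) for f, a, b in rows if f == flag)
    return counts.most_common()


true_ranked = rank(occurrences, True)
false_ranked = rank(occurrences, False)

for pair, count in false_ranked:
    print(pair, count)
```

With a real database this would likely be a GROUP BY on the word pair with an ORDER BY on the count, filtered by the true/false flag.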
Blogs / Pictures (of what I might want)
High Resolution Maps of Science
Algorithms extracting linguistic relations and their evaluation
Rapid Miner – open source data mining tool, Java-based, has filtering options.
Kind of Related But Very Interesting
Visual Thesaurus – We could do something similar to this, but you would also pick a minimum threshold and it would show all the related words that meet it.