Tween · posted by vaibhav bhawsar 97 days ago

Tween allows one to sow words and ideas as starting points for exploring conversations. It allows one to get a feel for what might be happening around them. Tween is a zeitgeist tool.

More documentation coming soon. For now, the screenshots have notes at the bottom.

Tween was realized during my internship at the erstwhile Yahoo! Design Innovation Group (yHaus). Thank you, everyone at yHaus.


Multi-Headed · posted by vaibhav bhawsar 493 days ago

Multi Headed Hydra Proto ~ a generative book

genBookMEGA.pdf

The idea that an autobiography can be not only of a person but also of data, ideas, concepts or other inanimate objects is of interest to us. We are interested in looking at how certain ideas have prevailed on the web over the years: ideas of emotion, conflict, markets, networks and humor, to name a few. The generative book is a narrative object that elicits such ideas/concepts/identities that have existed on the web for years and that continue to be produced or appended to by people in present-day cyberspace.

Why the form of a book?
The Internet is a fascinating source of content, even more so since content today is generated by millions of people from certain parts of the world. Though the Internet and the form it has taken are very different from a book, we are interested in re-examining one of the oldest methods of storing knowledge in the light of hypertext, networks, content syndication, concept databases such as WordNet, and user-generated content. In such a scenario, what would a book like this look like and contain? We are interested not only in what the book contains but also in how its content is generated and presented. For example, the contents page or the narration is something we would like to revisit in a book like this.

What is generative about the book?
Every time the program is run it produces a unique book. No two books produced by this program are the same. It renders layouts, spreads, colours, type and other graphic elements depending on a set of relationships built from the queried data. The book is generated using an idea as a seed.

What is this queried data?
The queried data would come from sites and search engines such as Yahoo, Flickr, Craigslist, Twitter, Slashdot, Wikipedia, open social networks, blogs, archives or the Wayback Machine, to name a few. This would mean using open APIs and RSS feeds, and some scraping wherever required.

Example:
The seed idea being “Love”
The program would start with a concept search on LOVE using WordNet. It would query for the synsets, hypernyms and hyponyms of the word LOVE. This is the phase where the program expands the keyword by looking at its synonyms, antonyms and related words. This is essentially where we would like to create a context for querying information about the seed idea (in this case LOVE). Using the returned WordNet results, the program would then query other sites with all of those synonyms, antonyms and similar concepts.
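
The expansion phase described above could be sketched roughly as follows. This is only an illustration: the `RELATED` map is a hypothetical, hand-written stand-in for real WordNet lookups (its entries are made up, not actual WordNet output), and `SeedExpander` and `expand` are names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the seed-expansion phase. RELATED is a hypothetical stand-in
// for WordNet queries; its entries are invented for illustration only.
class SeedExpander {
    static final Map<String, List<String>> RELATED = Map.of(
        "love", List.of("affection", "passion", "hate"),
        "affection", List.of("fondness", "love"));

    // Collect the seed plus related words up to `depth` hops, without duplicates.
    static Set<String> expand(String seed, int depth) {
        Set<String> seen = new LinkedHashSet<>();
        List<String> frontier = List.of(seed.toLowerCase());
        seen.addAll(frontier);
        for (int hop = 0; hop < depth; hop++) {
            List<String> next = new ArrayList<>();
            for (String word : frontier) {
                for (String related : RELATED.getOrDefault(word, List.of())) {
                    // Set.add returns false for words we have already seen.
                    if (seen.add(related)) {
                        next.add(related);
                    }
                }
            }
            frontier = next;
        }
        return seen;
    }

    public static void main(String[] args) {
        // Each expanded term would become one query against a site or feed.
        for (String term : expand("LOVE", 2)) {
            System.out.println("query: " + term);
        }
    }
}
```

In a real version the map lookup would be replaced by calls to a WordNet library, and each term would feed the site queries described next.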

Wordnet and Visualthesaurus · posted by vaibhav bhawsar 525 days ago

I never knew that visualthesaurus was built around wordnet. Visualthesaurus presents a visual mapping of word relationships and an interface to navigate those relationships. In the beginning it was a free service, but now they offer commercial versions of the product.
You can choose to display nouns, adjectives, verbs and adverbs and the relationships between them.
At the visualthesaurus core, these are the relationships that you can enable or disable on the map for the word you enter.

I am not sure how each of these options maps to wordnet functions such as hypernyms, hyponyms and meronyms, but here is my attempt at relating the visualthesaurus functions/relationships to those of wordnet.


Is a Part of -> Meronym: something X is part of something Y. Paper is part of a book. Ink is part of a pen. Earth is part of the Milky Way, etc.


Entails -> If you are doing X then you are certainly doing Y
or, from the Wikipedia entry on wordnet:
entailment: the verb Y is entailed by X if by doing X you must be doing Y (to sleep is entailed by to snore)


Verb-groups as coordinate terms: those verbs sharing a common hypernym (not sure)


Domain Category, Domain Region and Domain Usage all seem to be some sort of hypernym and hyponym relationships.

Here are some further explanations of the relationships I found on the visualthesaurus site.

AboutYouMeEverything? · posted by vaibhav bhawsar 542 days ago

So I wrote a Google-search-based scraping program for the Programming A to Z midterm. The idea was to find as much as one could about a person through the internet. The scraping program would rummage through texts online and bring back every sentence in which a given name appeared. So, for example, if you entered your name it would bring back all the sentences online that contain it. I initially intended this to work specifically on a social network, more in the spirit of bringing back all the conversations about you (or even someone else).
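
The sentence-filtering step can be illustrated with a short sketch. It is not the code in aboutyou.zip: it skips the search and scraping entirely and assumes the page text is already in a string. The class and method names are made up for this example, and the sentence splitter is deliberately naive (it breaks after `.`, `!` or `?` followed by whitespace, so abbreviations will trip it up).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch: keep only the sentences of a text that mention a given name.
class AboutYouSketch {
    static List<String> sentencesMentioning(String text, String name) {
        List<String> hits = new ArrayList<>();
        String needle = name.toLowerCase(Locale.ROOT);
        // Naive split: a sentence ends at . ! or ? followed by whitespace.
        for (String sentence : text.split("(?<=[.!?])\\s+")) {
            String trimmed = sentence.trim();
            // Plain substring match, so "Ada" would also match "Canada".
            if (!trimmed.isEmpty() && trimmed.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(trimmed);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String page = "Ada wrote programs. Nobody else did! People still read Ada today.";
        for (String s : sentencesMentioning(page, "ada")) {
            System.out.println(s);
        }
    }
}
```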

Source files – aboutyou.zip

aboutWhereareyou-print.txt

aboutSaddamSaid-print.txt

aboutIamPrint.txt

aboutcoconut.txt

about-Ifeellike-print.txt

Punk n Love Texts · posted by vaibhav bhawsar 560 days ago

Result1 Result2

Here are some results of training the Bayesian filter on a bunch of love/mush song lyrics and another bunch of punk lyrics.
I then ran the filter on texts like President Bush's State of the Union address, the original Quit India Resolution, the Koran, and a couple of manifestos (yes, including the Communist Manifesto again!!). Some funny, some obvious and some not-so-obvious results follow.

Is the Bayesian analysis binary? Does it only output something like “this is good” and “this is bad”?

CONSIDER:
PersonA style of writing is Atype
PersonB style of writing is Btype
PersonC style of writing is Ctype
PersonD style of writing is Dtype

Now that we can say we know how each person writes (Atype, Btype, Ctype, etc.), is it possible to input a text and compute which style it is closest to?

As of now, in the spam filter exercise we know whether a given text is spam or not. So it's a boolean result: true or false. But in the example above we have four possible outcomes (excluding the result “this text does not match any of the four styles”). So how do we build logic that computes more than two outcomes using Bayesian analysis?
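
One common answer is a multi-class naive Bayes classifier: keep a separate word count table per style, score the input text against every style, and pick the highest score. The sketch below is a minimal version of that idea and not the spam filter from class. The class name, the labels and the tiny training texts are all made up, and it uses add-one smoothing and log probabilities as simplifying assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of a multi-class naive Bayes text classifier (any number of labels).
class StyleClassifier {
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    private static String[] tokenize(String text) {
        return text.toLowerCase().split("[^a-z']+");
    }

    // Add one training text for the given label (style).
    void train(String label, String text) {
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
    }

    // log P(label) + sum over words of log P(word | label), add-one smoothed.
    double score(String label, String text) {
        Map<String, Integer> counts = wordCounts.get(label);
        double denominator = totalWords.getOrDefault(label, 0) + vocabulary.size();
        double logProb = Math.log(docCounts.get(label) / (double) totalDocs);
        for (String word : tokenize(text)) {
            if (word.isEmpty()) continue;
            logProb += Math.log((counts.getOrDefault(word, 0) + 1) / denominator);
        }
        return logProb;
    }

    // Return the label with the highest score, or null if nothing was trained.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : wordCounts.keySet()) {
            double s = score(label, text);
            if (s > bestScore) {
                bestScore = s;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        StyleClassifier c = new StyleClassifier();
        c.train("love", "my heart my love my darling forever");
        c.train("punk", "riot anarchy smash the state");
        c.train("manifesto", "workers of the world unite class struggle");
        System.out.println(c.classify("love my darling"));
        System.out.println(c.classify("smash the state"));
    }
}
```

A "none of the above" outcome could be approximated by rejecting texts whose best score is not clearly ahead of the runner-up, though where to put that cutoff is a judgment call.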

MSC:
This message was not spam according to the filter, while Gmail says it's spam! I say it's spam! Maybe the filter needs a larger training set.
————————————————————————————————————-
Hi,
Save over 50% on your medication
http://www.ledrx .com
Remove space in the above link

The rain was still drumming heavily against the high, dark glass.
Another clap of thunder shook the windows, and the stormy ceiling
flashed, illuminating the golden plates as the remains of the first
————————————————————————————————————-

Random Communist Treemap or Anymap · posted by vaibhav bhawsar 567 days ago

this is mad map Trying different intensities Treemap from Dan's code

Learning treemaps using Java!! I do understand them better than I did last week. I tried to move the text console output into the treemap visualization itself. Somewhat uncontrollable things happened, and here are the results.
Some of the screenshots show high-frequency words larger than the others. Some words get too small to read or see.

I will keep working on this. Ah, but it was a pleasure doing some object-oriented programming. I keep forgetting the scope of variables and objects, though, and end up going in circles over the code.
Next I would like to trace the search path to a word. In other words, I would pull out the path traversed in a given search.

Here are the files, edited to work under Eclipse by importing the Processing core library:
Concordance.java
Tree.java

You And I · posted by vaibhav bhawsar 574 days ago

Update:
You and I now replaces the word ‘you’ with ‘I’ and ‘I’ with ‘you’, along with a couple of other words.

I tried to swap all the first-person words with their second-person counterparts and vice versa, for example every “I” to “YOU” and every “YOU” to “I”. I didn’t quite succeed.
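
One pitfall with this kind of swap is that two replace passes undo each other: turning every “I” into “you” and then every “you” into “I” leaves only “I”. A way around it is to do all replacements in a single pass. The sketch below shows that idea and is not the code in YouAndI.java. The word list is a small assumption of mine, and English stays ambiguous: “you” can become “I” or “me” depending on its role in the sentence, which this sketch does not try to resolve.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: swap first- and second-person words in one pass over the text.
class PronounSwapSketch {
    // Keys are lowercase; the word list is illustrative, not exhaustive.
    static final Map<String, String> SWAPS = Map.of(
        "i am", "you are", "you are", "I am",
        "i", "you", "you", "I",
        "my", "your", "your", "my",
        "me", "you",
        "mine", "yours", "yours", "mine");

    // Two-word phrases come first so "I am" wins over plain "I".
    static final Pattern WORDS = Pattern.compile(
        "\\b(I am|you are|I|you|my|your|me|mine|yours)\\b",
        Pattern.CASE_INSENSITIVE);

    static String swap(String text) {
        Matcher m = WORDS.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Each match is replaced exactly once, so nothing is swapped back.
            String replacement = SWAPS.get(m.group(1).toLowerCase());
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(swap("I am sure you like my book"));
    }
}
```

Original capitalization is not preserved, and contractions such as “I'm” come out wrong, so the output still needs a human eye.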

Here is the input text:
YouAndI.java

And here is the output
spam_output.txt

IF THEN PLAY - Interactive Fiction · posted by vaibhav bhawsar 581 days ago

Not having entered an IF world before, being in the world of Edifice by Lucian Smith was amusing, new and soul crushing! It was amusing because I could move, in some ways more literally, using text, compared to films or other narrative-driven mediums. It was new because I had to align myself with the whole syntax of moving textually. It took me some time to figure out what the commands did, and I often kept bumping into valleys, cliffs, rocks, rivers and stones!! I had no sense of direction.
I have to be honest and tell you that I couldn’t figure out the motive(s) of the game. It was a little too obscure for me. More than any other command I loved ‘examine’. It's a command curious in nature, and it helped me a lot in navigating through the story/puzzle. There were the mysterious others who I PAYED ATTENTION TO, but it said THE OTHERS ARE RIGHT HERE. So I haven’t yet figured out who the ‘others’ are. All I know is that they are harmless! There will be more of IF. HELP!!!

wordCount · posted by vaibhav bhawsar 581 days ago

word count on The Communist Manifesto word count on The Hacker Manifesto

Counting the number of times a word occurs in a given text. This method does not use regular expressions and instead works by comparing one word at a time with the remaining text (string). There is also a blacklist of words that are not counted. It also allows one to limit the results, e.g. show only words that occur more than x times in the text.
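
A rough sketch of this approach is shown below. It is my reading of the description above rather than the contents of CountWords.java: the class and method names, the character-scanning tokenizer and the sample blacklist are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: count words without regular expressions, with a blacklist
// and a "more than x occurrences" cutoff.
class CountWordsSketch {
    // Break the text into lowercase words by scanning characters (no regex).
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    // Compare each word with the rest of the text; keep counts above moreThan.
    static Map<String, Integer> count(String text, Set<String> blacklist, int moreThan) {
        List<String> ws = words(text);
        boolean[] counted = new boolean[ws.size()];
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < ws.size(); i++) {
            if (counted[i]) continue; // already counted as an earlier word
            String word = ws.get(i);
            int n = 1;
            for (int j = i + 1; j < ws.size(); j++) {
                if (ws.get(j).equals(word)) {
                    n++;
                    counted[j] = true;
                }
            }
            if (!blacklist.contains(word) && n > moreThan) {
                result.put(word, n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "The workers and the owners. The class, the CLASS!";
        System.out.println(count(text, Set.of("the", "and"), 1));
    }
}
```

Comparing every word with the remaining text is quadratic in the number of words, which is fine for a manifesto-sized text; a hash map of counts would do the same job in one pass.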

CountWords.java

before and after the manifesto · posted by vaibhav bhawsar 583 days ago

communist manifesto - they communist manifesto - power communist manifesto - we communist manifesto - class communist manifesto - free communist manifesto - freedom

Out of curiosity I wanted to run the Communist Manifesto through a text parser that I am learning to write in Dan Shiffman’s class, Programming A to Z.
Here are some images of the results I got.

For Dan Shiffman’s class Programming A to Z