Shekispeaks


Ideas, opinions, travel, technology, programming

May 23, 2012 at 11:43am


Map Reduce a Local File?

Here is what I want to do. I have a large data file (more than 50 GB), and I run scripts over it to get some numbers out of it.

A traditional script reads the file sequentially and performs some operation as it goes. Making it multi-threaded does not help much: the file is still accessed sequentially, so the threads end up blocked on it. Writing such a multi-threaded program is also very painful.

Is there a way to open the file at different offsets, basically load it in chunks, and process the chunks individually? In other words, run Mapper and Reducer jobs on a local file without the hoopla of Hadoop: Hadoop without the network.

Does this make sense, or am I bat shit crazy? Are there any tools that allow me to do this?
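
To make the idea concrete, here is a rough sketch of what "Hadoop without the network" might look like in plain Python. It is only an illustration under a few assumptions that are mine: the file is line-oriented, the job is "count lines by their first whitespace-separated field", and the names chunk_offsets, map_chunk and map_reduce are made up for this sketch. Each worker opens the file itself and seeks to its own byte range, so no single reader has to hand lines out to the others.

```python
import os
from collections import Counter
from multiprocessing import Pool


def chunk_offsets(path, n_chunks):
    """Split the file into (start, end) byte ranges that end on line boundaries."""
    size = os.path.getsize(path)
    step = max(1, size // n_chunks)
    bounds = [0]
    with open(path, "rb") as f:
        while bounds[-1] < size:
            target = bounds[-1] + step
            if target >= size:
                bounds.append(size)
                break
            f.seek(target)
            f.readline()  # move forward to the end of the line we landed in
            bounds.append(f.tell())
    return list(zip(bounds[:-1], bounds[1:]))


def map_chunk(task):
    """Mapper: count first fields (as bytes) for the lines in one byte range."""
    path, start, end = task
    counts = Counter()
    with open(path, "rb") as f:
        f.seek(start)
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            fields = line.split()
            if fields:
                counts[fields[0]] += 1
    return counts


def map_reduce(path, n_workers=4):
    """Run one mapper per chunk in a process pool, then merge the partial counts."""
    tasks = [(path, start, end) for start, end in chunk_offsets(path, n_workers)]
    total = Counter()
    with Pool(n_workers) as pool:
        for partial in pool.imap_unordered(map_chunk, tasks):
            total.update(partial)  # reducer: add up the per-chunk counters
    return total
```

Calling map_reduce("big.log") from under an `if __name__ == "__main__":` guard would return one merged Counter. Whether this is actually faster than a single sequential pass depends on the disk: if the job is I/O-bound on one spinning disk, parallel seeks may not help, while CPU-heavy per-line work is where separate processes should pay off.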

February 22, 2012 at 11:09am


It is better to be a human being dissatisfied than a pig satisfied; better to be a Socrates dissatisfied than a fool satisfied. And if the fool or the pig thinks otherwise, it is because they have no experience of the better part.

—JOHN STUART MILL, Utilitarianism

— http://logback.qos.ch/manual/groovy.html

January 21, 2012 at 9:56am


Sentiment Analysis

While I was reading up on sentiment analysis, I came across this on Stack Overflow (link):

A linguistics professor was lecturing to her class one day. “In English,” she said, “A double negative forms a positive. In some languages, though, such as Russian, a double negative is still a negative. However, there is no language wherein a double positive can form a negative.”

A voice from the back of the room piped up, “Yeah …right.”

I think semantic analysis is a tough cookie.
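
To see why, here is a toy word-counting scorer, a minimal sketch and not how any real sentiment library works. The tiny LEXICON, the NEGATORS set and the naive_sentiment function are all invented for this example. It handles a simple negation, but it has no way to notice sarcasm.

```python
import re

# Hypothetical toy lexicon: +1 for "positive" words, -1 for "negative" ones.
LEXICON = {"yeah": 1, "right": 1, "good": 1, "great": 1,
           "bad": -1, "wrong": -1, "terrible": -1}
NEGATORS = {"not", "no", "never"}


def naive_sentiment(text):
    """Sum word scores, flipping the sign of a word that follows a negator."""
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NEGATORS:
            negate = not negate  # two negators in a row cancel out
            continue
        value = LEXICON.get(word, 0)
        if value:
            score += -value if negate else value
            negate = False
    return score


print(naive_sentiment("not bad"))         # 1: the negation is flipped correctly
print(naive_sentiment("Yeah …right."))    # 2: scored as positive, sarcasm missed
```

The double positive comes out as the most positive sentence of all, which is exactly the wrong answer.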