Thursday, March 31, 2011

How do I become a data scientist - from Quora

http://www.quora.com/How-do-I-become-a-data-scientist

This question on Quora was answered very well, and it gave me some ideas.

First, I should really learn Hadoop, study statistics, learn more about MapReduce, take the plunge into R, and so on.

Data science is probably going to be a very cool thing to do over the next 10 years.

Wednesday, March 30, 2011

A quote from an O'Reilly article that I found interesting

"Entrepreneurship is another piece of the puzzle. Patil's first flippant answer to "what kind of person are you looking for when you hire a data scientist?" was "someone you would start a company with." That's an important insight: we're entering the era of products that are built on data."

Source: http://radar.oreilly.com/2010/06/what-is-data-science.html

Data scientists are becoming increasingly popular and are in high demand. I really want to find someone I can partner with in the near future, and I hope to find that partner internationally.

After all, start-up founders usually had a good partner from the very beginning (Microsoft, Google, etc.).

Data Scraping - Screen Scraping

A new term I learned today: screen scraping.

Monday, March 28, 2011

Blekko Video



I found this Blekko video interesting.

Sunday, March 27, 2011

Can you benefit from content scraped from your site? - Cool video Part 2



Cool video I found on YouTube



Cool video I found.

"How would you run your own online marketing company?"

Friday, March 18, 2011

A useful link

http://d.hatena.ne.jp/mjmania/touch/20090205/1233766538

Monday, March 7, 2011

A quote from Daniel Dennett

This quote from Daniel Dennett, which I came across in Japanese translation, inspired me.

"A scholar is just a library's way of making another library."

Link of Interest.

http://www.infed.org/thinkers/et-lewin.htm

About Kurt Lewin

Friday, March 4, 2011

Thoughts about Search Engines in 2011

My thoughts about search engines in the year 2011.

I was studying Google's PageRank and Beyond by Amy N. Langville and Carl D. Meyer, and here are the thoughts that popped into my head.

* To combat unethical SEO, search engines must make their spiders and indexers smarter.

* Most search engines generate revenue by selling profile data to interested parties.

* Users search for one of two things:

1. Authoritative pages, for deep searches or research.
2. Hub pages, for broad searches.
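
The authority/hub distinction above uses the same vocabulary as the HITS algorithm, so here is a rough sketch of how such scores could be computed. I am assuming HITS is the relevant technique here; the function name `hits`, the toy link graph, and the page names are all made up for illustration, and this is not how any real search engine is known to be implemented.

```python
import math

def hits(links, iterations=50):
    """Rough HITS sketch. `links` maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auths[dst] += hubs[src]

        # A page's hub score is the sum of the authority scores of pages it links to.
        new_hubs = {p: sum(new_auths[d] for d in links.get(p, [])) for p in pages}

        # Normalize so the scores do not grow without bound.
        a_norm = math.sqrt(sum(v * v for v in new_auths.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hubs.values())) or 1.0
        auths = {p: v / a_norm for p, v in new_auths.items()}
        hubs = {p: v / h_norm for p, v in new_hubs.items()}

    return hubs, auths

# Hypothetical toy graph: a directory page and a blog both link to papers.
toy_links = {
    "directory": ["paper_a", "paper_b"],
    "blog": ["paper_a"],
}
hubs, auths = hits(toy_links)
print("best hub:", max(hubs, key=hubs.get))
print("best authority:", max(auths, key=auths.get))
```

In this toy graph the page that links out the most ends up as the best hub, and the page that gets linked to the most ends up as the best authority, which is the split between broad search and deep search described above.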

My idea: create a ranking method that extracts only interesting pages.
The definition of "interesting" is rather broad, but in this case I mean sites that are buried in the deep web yet high in quality. In a word: RARE sites.