
Getting 1400MB/s throughput on an SSD.

This SSD has a throughput of 1400MB/s and I want it so much. ;) It uses a PCI Express slot instead of SATA.

But the thing is, it's ULTRA expensive. It costs 998,000 yen incl. tax!

I guess I will wait until the price drops to 1/10 of that....

Well, if somebody with an unlimited budget is thinking of scaling up storage, a 1400MB/s SSD with a capacity of 300GB might be worth the cost.

Or if somebody is kind enough to donate one of these to me, I will be eternally grateful.




Some Basics and Application You Should Know When Making a Search Engine.

Some basic keywords I would like to introduce to you:

Outbound Link
Inbound Link
Keyword Density

One simple algorithm I am sharing with you is the keyword density calculation. To put it simply, count how many keywords there are in total, count how many times the keyword you care about appears, and express the second number as a percentage of the first.

density = k / ak x 100, where k is the number of occurrences of the keyword and ak is the total number of keywords in the whole set.
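
The calculation above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine does it: the function name keyword_density, the sample sentence, and the simple rule of treating every run of letters, digits, or apostrophes as one keyword are all my own assumptions.

```python
import re
from collections import Counter


def keyword_density(text: str, keyword: str) -> float:
    """Return how often `keyword` occurs in `text`, as a percentage of all words."""
    # Assumption: a "keyword" is any lowercase run of letters, digits or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        # Avoid dividing by zero on empty input.
        return 0.0
    counts = Counter(words)
    # k / ak * 100, with k = counts[keyword] and ak = len(words)
    return 100.0 * counts[keyword.lower()] / len(words)


sample = "Search engines rank pages. A search engine scores every page."
density = keyword_density(sample, "search")
print(f"{density:.1f}%")
```

Note that this matches whole words only, so "engines" and "engine" are counted as different keywords. A real system would probably apply stemming and drop stop words first.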

Stop Word - Very common words such as "the" or "of" that are usually left out of an index. You should check this out.
Score - Very important in ranking. I would say this is the most basic, but also the hardest, part of a search engine.

Definition of "score" according to SearchEngineDictionary.com

"Search engines usually arrange search results from the most relevant to the least relevant (as determined by the search engine's algorithm). In order to rank documents, the search engine assigns a score to each page and those with the highest scores are listed first. Most search engines simply give the maximum score to the most relevant document and score all other relevant documents relative to that document. Others compare all documents to a theoretically perfect document. The score of a web page therefore refers to its relevance as perceived by a specific search engine."

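The first approach in that definition (give the most relevant document the maximum score and score every other document relative to it) can be sketched like this. It is only an illustration under my own assumptions: the function name relative_scores, the raw score values, and the choice of 1.0 as the maximum score are all made up, and how the raw scores are produced in the first place is the hard part that this sketch skips.

```python
from typing import Dict, List, Tuple


def relative_scores(raw_scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank documents and rescale scores so that the best document gets 1.0."""
    if not raw_scores:
        return []
    top = max(raw_scores.values())
    if top <= 0:
        # Nothing is relevant: keep a stable order and give every document 0.0.
        return [(doc, 0.0) for doc in sorted(raw_scores)]
    # Highest raw score first, as in a result page.
    ranked = sorted(raw_scores.items(), key=lambda item: item[1], reverse=True)
    return [(doc, score / top) for doc, score in ranked]


# Hypothetical raw relevance scores for three pages.
results = relative_scores({"page-a": 2.0, "page-b": 4.0, "page-c": 1.0})
for doc, score in results:
    print(doc, score)
```

The other approach mentioned in the definition, comparing against a theoretically perfect document, would divide by a fixed best-possible score instead of by the top score in the result set.
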
Definition of "scored keyword phrase" according to SearchEngineDictionary.com

"Name given to phrases that searchers use that are tracked by a system the records the number of times the phrase was used in a search, also known as the score."

One interesting technology I found is Lucandra, a Cassandra-based Lucene backend.

Making a search engine will require months of testing and improvement; it's almost a never-ending cycle. BUT it is well worth it. Monetization might be a good goal, but the main goal should be to make the search experience superb.

Adding lots of optional search methods or lots of data sources (Twitter, for example) is a good idea.

Competition is harsh, too, as small search engines attract only a tiny percentage of overall search engine traffic. Entering a niche is a very good idea.
I think starting from a different approach than a traditional search engine is crucial to achieving success in this world.

Hadoop O'Reilly Webcast

Cluster Computing and MapReduce Lecture 4

Another interesting video worth watching.

Hadoop Visualization

Hadoop Visualization - interesting video.


Keywords I'm interested in.

Here are some keywords I am interested in learning.




Note to self: interesting server node: https://supcom.hgc.jp/japanese/sys_const/000012.html

Supercomputer (with GPGPU) for rent: http://itpro.nikkeibp.co.jp/article/NEWS/20101102/353730/



"Fortune favors the bold." - Desiderius Erasmus


