(2013) Top popular languages for analytics / data mining / data science

A poll conducted by KDnuggets recently asked a question that I believe many people like me are interested in: What programming/statistics languages you used for an analytics / data mining / data science work in 2013?

The results are shown below. I’m glad that I know all of the top 4 languages and use them almost every day. I’m also teaching myself Hadoop, which I believe is the future of data management.

How about you guys?


[who you should follow] The Most Influential in Big Data on Twitter

I’m a big fan of Twitter and I also like big data. Finding people who are good at big data to follow on Twitter is a headache for me, because there are way too many people there.

Fortunately, Big Data Republic solved this problem for me. They ran a poll to figure out who is the most influential in big data on Twitter. Here is the list; you can scroll down to see all of it.

[iframe src="http://groups.peerindex.com/bigdatarepublic/big-data-100/embed" width="600" height="1180" scrolling="yes"]






The Most Important Algorithms

We all know that computer programming is a core skill for a data scientist, and algorithms are the foundation of computer science. So I bet you have asked this question: what are the most important algorithms?

Dr. Christoph Koutschan from RICAM (Johann Radon Institute for Computational and Applied Mathematics) conducted a survey to answer this question. The results are not out yet, and it is really difficult to reach a consensus on such a big question, but here I list all the candidates in his survey and hope you can find some that you are familiar with and use every day.

1. A* search algorithm 
Graph search algorithm that finds a path from a given initial node to a given goal node. It employs a heuristic estimate that ranks each node by an estimate of the best route that goes through that node. It visits the nodes in order of this heuristic estimate. The A* algorithm is therefore an example of best-first search.
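
To make the description above concrete, here is a minimal Python sketch of A*. It is only an illustration, not code from the survey: the function name a_star, the adjacency-dict graph format, and the heuristic lookup table are all assumptions made for this example. It also assumes the heuristic never overestimates the remaining cost, which is what lets the first path that reaches the goal be a cheapest one.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) for a cheapest path from start to goal, or None.

    graph:     dict mapping node -> list of (neighbor, edge_cost) pairs (assumed format)
    heuristic: dict mapping node -> estimated remaining cost to the goal (assumed format)
    """
    # Heap entries are (f, g, node, path), where g is the cost so far
    # and f = g + heuristic estimate is the ranking used to pick the next node.
    frontier = [(heuristic[start], 0, start, [start])]
    best_cost = {start: 0}

    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # a cheaper route to this node was already found
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                estimate = new_cost + heuristic[neighbor]
                heapq.heappush(frontier, (estimate, new_cost, neighbor, path + [neighbor]))
    return None  # the goal cannot be reached from start


# A small made-up graph for illustration.
example_graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
example_heuristic = {"A": 3, "B": 2, "C": 1, "D": 0}

print(a_star(example_graph, example_heuristic, "A", "D"))
```

On this toy graph the route A, B, C, D costs 4, which beats both A, C, D (cost 5) and A, B, D (cost 6).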

2. Beam Search
Beam search is a search algorithm that is an optimization of best-first search. Like best-first search, it uses a heuristic function to evaluate the promise of each node it examines. Beam search, however, only unfolds the first m most promising nodes at each depth, where m is a fixed number, the beam width.
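
As a rough sketch of the idea above, the Python function below keeps only the beam_width most promising nodes at each depth. Everything here is an assumption made for illustration: the function name beam_search, the callback parameters expand, score, and is_goal, and the toy number puzzle used to exercise it.

```python
import heapq


def beam_search(start, expand, score, is_goal, beam_width, max_depth=50):
    """Search level by level, keeping only the beam_width best nodes per depth.

    expand(node)  -> iterable of successor nodes
    score(node)   -> heuristic value, lower means more promising
    is_goal(node) -> True if the node is a solution
    Returns a goal node, or None if none is found within max_depth levels.
    """
    beam = [start]
    for _ in range(max_depth):
        for node in beam:
            if is_goal(node):
                return node
        # Generate every successor of the current beam ...
        candidates = [child for node in beam for child in expand(node)]
        if not candidates:
            return None
        # ... and keep only the beam_width most promising ones.
        beam = heapq.nsmallest(beam_width, candidates, key=score)
    return None


# Toy puzzle (made up for this example): reach 10 from 1 using "+1" and "*2".
# Each node is the tuple of numbers visited so far.
TARGET = 10


def expand(path):
    last = path[-1]
    return [path + (last + 1,), path + (last * 2,)]


def score(path):
    return abs(TARGET - path[-1])


def is_goal(path):
    return path[-1] == TARGET


print(beam_search((1,), expand, score, is_goal, beam_width=2))
```

Because everything outside the beam is thrown away, beam search trades completeness for memory: with a narrow beam it can discard the only branch that leads to a goal.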

3. Binary search
Technique for finding a particular value in a sorted linear array, by ruling out half of the data at each step.
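
Here is a minimal Python sketch of the halving step. The function name binary_search and the convention of returning -1 for a missing value are choices made for this example, and the input is assumed to be sorted in ascending order.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent.

    sorted_values must already be sorted in ascending order.
    """
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


print(binary_search([1, 3, 5, 7, 9, 11], 7))
```

In everyday Python code, the standard library's bisect module covers the same ground; the hand-written loop is shown only to make the halving visible.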