(2013) Top popular languages for analytics / data mining / data science

A recent KDnuggets poll asked a question that I believe many people like me will be interested in: "What programming/statistics languages you used for an analytics / data mining / data science work in 2013?"

The results are shown below. I'm glad that I know all of the top 4 languages and use them pretty much every day. I'm also teaching myself Hadoop, which I believe is the future of data management.

How about you guys?


The Most Important Algorithms

We all know that computer programming is a core skill for a data scientist, and algorithms are the foundation of computer science. So I bet you have asked yourself this question: what are the most important algorithms?

Dr. Christoph Koutschan from RICAM (Johann Radon Institute for Computational and Applied Mathematics) conducted a survey to answer this question. Although the results are not out yet, and it is really difficult to reach a consensus on such a big question, here I list all the candidates in his survey in the hope that you will find some that you are familiar with and use every day.

1. A* search algorithm 
Graph search algorithm that finds a path from a given initial node to a given goal node. It employs a heuristic estimate that ranks each node by an estimate of the best route that goes through that node. It visits the nodes in order of this heuristic estimate. The A* algorithm is therefore an example of best-first search.
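To make the description concrete, here is a minimal Python sketch of A*. It is only an illustration, not taken from the survey: the graph format (a dict of weighted edges), the function name a_star, and the example data are all my own assumptions. Each node is ranked by the cost paid so far plus the heuristic estimate of the remaining cost, and the node with the lowest rank is visited first.

```python
import heapq

def a_star(graph, start, goal, heuristic):
    """Return (path, cost) from start to goal, or (None, inf) if unreachable.

    graph: dict mapping node -> list of (neighbor, edge_cost) pairs (assumed format).
    heuristic: function node -> estimated remaining cost to the goal.
    """
    # Heap entries are (estimated total cost, cost so far, node).
    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], cost
        if cost > best_cost[node]:
            continue  # stale heap entry; a cheaper route to this node was found
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(
                    open_heap, (new_cost + heuristic(neighbor), new_cost, neighbor)
                )
    return None, float("inf")

# Hypothetical example data: a small weighted graph and heuristic estimates.
example_graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
example_h = {"A": 3, "B": 2, "C": 1, "D": 0}

print(a_star(example_graph, "A", "D", example_h.get))
```

The estimates in example_h never overestimate the true remaining cost, which is the condition under which A* is guaranteed to return a cheapest path.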

2. Beam Search
Beam search is a search algorithm that is an optimization of best-first search. Like best-first search, it uses a heuristic function to evaluate the promise of each node it examines. Beam search, however, only unfolds the first m most promising nodes at each depth, where m is a fixed number, the beam width.
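A rough Python sketch of the idea follows. The function name beam_search, its parameters, the max_depth safety limit, and the toy number puzzle are all assumptions made for illustration. At each depth the sketch expands the current beam, sorts the new nodes by the heuristic, and keeps only the beam_width most promising ones.

```python
def beam_search(start, is_goal, successors, heuristic, beam_width, max_depth=50):
    """Return a path from start to a goal node, or None if the beam misses it.

    successors: function node -> iterable of child nodes (assumed interface).
    heuristic: function node -> score, where lower means more promising.
    max_depth: safety limit added for this sketch so the loop always ends.
    """
    parent = {start: None}  # also serves as the set of nodes already seen
    beam = [start]
    for _ in range(max_depth):
        for node in beam:
            if is_goal(node):
                # Rebuild the path by following parent links back to start.
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
        # Expand every node in the current beam.
        candidates = []
        for node in beam:
            for child in successors(node):
                if child not in parent:
                    parent[child] = node
                    candidates.append(child)
        if not candidates:
            return None
        # Keep only the beam_width most promising nodes at this depth.
        candidates.sort(key=heuristic)
        beam = candidates[:beam_width]
    return None

# Hypothetical toy problem: reach 10 from 1 using the moves "+1" and "*2".
target = 10
path = beam_search(
    start=1,
    is_goal=lambda n: n == target,
    successors=lambda n: [n + 1, n * 2],
    heuristic=lambda n: abs(target - n),
    beam_width=2,
)
print(path)
```

Because nodes outside the beam are thrown away, beam search saves memory but can miss the goal or return a path that is not the shortest one, which is the price of the optimization.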

3. Binary search
Technique for finding a particular value in a sorted linear array by ruling out half of the data at each step.
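As a quick illustration, here is a minimal Python sketch. The function name binary_search and the convention of returning -1 for a missing value are my own choices. It assumes the input is already sorted in ascending order, since that is what lets each comparison rule out half of the remaining data.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent.

    Assumes sorted_values is sorted in ascending order.
    """
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            lo = mid + 1  # target can only be in the upper half
        else:
            hi = mid - 1  # target can only be in the lower half
    return -1

values = [1, 3, 5, 7, 9, 11]
print(binary_search(values, 7))
print(binary_search(values, 4))
```

In everyday Python code, the standard library's bisect module covers the same ground, so a hand-written version like this is mostly useful for understanding the idea.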