Popular algorithms for data science – an introduction

A variety of Machine Learning and data mining algorithms are available for creating valuable analytic platforms. The goals an organization establishes will determine which algorithms are used to sort and process the available information. Some algorithms were developed specifically to deal with business problems. Others were designed to augment existing algorithms, or to perform in new ways. According to Moretto, some algorithms will be more appropriate than others. There is a wide range to choose from, with uses that run from recognizing faces to reminding clients they have an appointment.

Algorithm models take different shapes, depending on their purpose. They can come as a collection of scenarios, an advanced mathematical analysis, or even a decision tree. Running different algorithms on the same data and comparing the results can reveal surprising things about that data, and these comparisons give a manager more insight into business problems and their solutions. Some models function best only for certain data and analyses. For example, classification algorithms with decision rules can be used to screen out problems, such as a loan applicant with a high probability of defaulting.
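
To make the loan-screening idea concrete, here is a minimal sketch of a one-rule classifier in Python. It learns a single cutoff on a debt-to-income ratio from labeled examples and flags applicants above it. The data, the choice of feature, and the names `learn_threshold` and `screen` are all assumptions made for illustration; they do not come from Moretto's presentation, and a real screening model would use many more features and far more data.

```python
# Toy history of (debt_to_income_ratio, defaulted) pairs.
# These values are invented for illustration only.
APPLICANTS = [
    (0.10, False), (0.15, False), (0.22, False), (0.30, False),
    (0.35, True), (0.41, False), (0.48, True), (0.55, True), (0.62, True),
]


def learn_threshold(rows):
    """Pick the cutoff that misclassifies the fewest rows under the rule
    'predict default when ratio > cutoff'."""
    values = sorted(ratio for ratio, _ in rows)
    # Candidate cutoffs are the midpoints between neighbouring ratios.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        return sum((ratio > cutoff) != defaulted for ratio, defaulted in rows)

    # On ties, min() keeps the first (lowest) candidate.
    return min(candidates, key=errors)


def screen(ratio, cutoff):
    """Apply the learned decision rule to one applicant."""
    return "review" if ratio > cutoff else "pass"


cutoff = learn_threshold(APPLICANTS)
for ratio in (0.20, 0.50):
    print(f"ratio={ratio:.2f} -> {screen(ratio, cutoff)}")
```

A single rule like this is the simplest form of a decision tree. Its appeal for screening is that a manager can read the rule directly and question it.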

Unsupervised clustering algorithms can be used to find relationships within an organization’s dataset. These algorithms can find different kinds of groupings within a customer base, or help decide which customers and services can be grouped together. Because it does not require labeled examples, an unsupervised clustering approach can offer some distinct advantages over supervised learning approaches. One example is the way novel applications can be discovered by studying how the connections are grouped when a new cluster is formed.
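
As a rough illustration of grouping a customer base, the sketch below runs a bare-bones K Means loop in plain Python. The customer records (visits per month, average spend), the function name `kmeans`, and the choice to seed the centroids with the first k points are assumptions made to keep the example small and repeatable. Production work would normally rely on a tested library and scale the features first.

```python
import math

# Invented customer records: (visits per month, average spend).
CUSTOMERS = [(1, 20), (9, 200), (2, 25), (10, 220), (1, 30), (8, 210)]


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    # Assumption: seed with the first k points so the run is deterministic.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # no movement, so the loop has converged
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


centroids, labels = kmeans(CUSTOMERS, k=2)
print(centroids)
print(labels)
```

No labels are supplied anywhere in the input. The two groups (occasional low spenders and frequent high spenders) emerge from the distances alone, which is what separates this approach from supervised classification.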

Laila Moretto covered the primary uses of many algorithms in her presentation (see the video link at the bottom for a deeper discussion of each algorithm), including:

  • K Means Clustering
  • Association Rules
  • Linear Regression
  • Logistic Regression
  • Naïve Bayesian Classifier
  • Decision Trees
  • Time Series Analysis
  • Text Analysis
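
To give a feel for one entry on this list, the following sketch fits a simple Linear Regression using the closed-form least-squares formulas. The figures (advertising spend against sales) and the name `fit_line` are invented for illustration and are not taken from the presentation.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: advertising spend and the resulting sales, in arbitrary units.
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
print(f"sales is roughly {slope:.1f} * spend + {intercept:.1f}")
```

The fitted line can then be used to estimate sales at a spend level that has not been observed, which is the typical business use of this technique.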

Choosing Data Scientists for Employment

Businesses such as Facebook and Google have numerous Data Scientists on their staff. Companies like Target and Macy’s are moving in that direction. The skills of Data Scientists are necessary in setting up the data system, in choosing an algorithm, and in interpreting the results. Choosing the right algorithms for an organization involves a combination of science and art. The “artistic” part is based on data mining experience, combined with knowledge of the business and its customer base. These abilities play a crucial role in choosing an algorithm model capable of answering business queries accurately. For this to happen, a competent staff of Data Scientists needs to be in place.

Laila Moretto offers the following suggestions for interviewing a Data Scientist:

  • Ask, “Was your education more related to Machine Learning, or decision-making analytics?” (A business may need one of each, or more.)
  • Look for graduates who have done Machine Learning projects or capstone projects, or who have worked in competitions. (Essentially, people with some hands-on experience.)
  • Look for graduates who have done internships in areas similar to the ones being planned.

The use of Big Data, when coupled with Data Science, allows organizations to make more intelligent decisions. As both fields have evolved, the enterprises that adopt them have seen a rapid increase in insights. Learning to understand Big Data and hiring a competent staff are key to staying on the cutting edge in the information age.
