### How to Apply Text Distances and Fuzzy Joins

Edit Distance for Measuring Text Distance. Today we will talk about text similarities and how we can “calculate” a

### How to build Stacked Ensemble Models in R

In this post, we will show you how you can easily apply Stacked Ensemble Models in R using the H2O package.

### Computer Vision: Face Detection in OpenCV

This post is a practical example of how we can use OpenCV with Python for detecting faces in a video.

### Bookmaker’s Margin in Multiple Bets

In the previous post we explained how to calculate the Bookmaker’s Margin for a single bet. Now, we are going

### Dplyr Pipes In Python Using Pandas

One of the advantages of R is the data manipulation process using the dplyr library. It has a fast, easy

### Bookmaker’s Margin and Arbitrage Betting

Calculate Bookmaker’s Margin. Betting companies make a great profit due to their margins, which means that the gambler is

### Applications of Exponential Distribution

The Exponential Distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process

### How to Rename and Relevel Factors in R

A “special” data structure in R is the “factor”. We are going to provide some examples of how we can

### Applications of Poisson Distribution

The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in

### PySpark: Logistic Regression with TF-IDF on N-Grams

This post is about how to run a classification algorithm, and more specifically a logistic regression, of a “Ham or

### Binomial Distribution: Min Number of Shots Needed to Score at Least Once

For Data Science positions, interviewers often ask questions about probability that require some knowledge of statistics. Today, we