Hack: The “count(case when … else … end)” in dplyr

When I run quires in SQL (or even HiveQL, Spark SQL and so on), it is quite common to use the syntax of count(case when.. else ... end). Today, I will provide you an example of how you run this type of commands in dplyr.

Let’s start:

library(sqldf)
library(dplyr)


df<-data.frame(id = 1:10,
               gender = c("m","m","m","f","f","f","m","f","f","f"),
               amt= c(5,20,30,10,20,50,5,20,10,30))

df

Let’s get the count and the sum per gender in different columns in SQL.

sqldf("select count(case when gender='m' then id else null end) as male_cnt,
              count(case when gender='f' then id else null end) as female_cnt,
              sum(case when gender='m' then amt else 0 end) as male_amt,
              sum(case when gender='f' then amt else 0 end) as female_amt
              from df")

Output:

  male_cnt female_cnt male_amt female_amt
1        4          6       60        140

Let’s get the same output in dplyr. We will need to subset the data frame based on one column.

df%>%summarise(male_cnt=length(id[gender=="m"]),
               female_cnt=length(id[gender=="f"]),
               male_amt=sum(amt[gender=="m"]),
               female_amt=sum(amt[gender=="f"])
               )

Output:

  male_cnt female_cnt male_amt female_amt
1        4          6       60        140

Share This Post

2 thoughts on “Hack: The “count(case when … else … end)” in dplyr”

Bill Venables

October 19, 2020 at 12:34 am

The awkwardness with the R version mainly comes from requiring the result to be in an awkward form. A more usual way of achieving an equivalent result in a better form would be to use

df %>%
group_by(gender) %>%
summarise(n = n(), amt = sum(amt), .groups = ‘drop’)
Reply
- George Pipis
  
  October 19, 2020 at 8:31 am
  
  The output is in a different format. You can use the tidyR package to reshape the results. However, our goal here was to show the equivalent of “case when” in dplyr
  Reply

Get updates and learn from the best

More To Explore

Python

Image Captioning with HuggingFace

Image captioning with AI is a fascinating application of artificial intelligence (AI) that involves generating textual descriptions for images automatically.

George Pipis March 21, 2024

Python

Intro to Chatbots with HuggingFace

In this tutorial, we will show you how to use the Transformers library from HuggingFace to build chatbot pipelines. Let’s

George Pipis March 15, 2024

Hack: The “count(case when … else … end)” in dplyr

Share This Post

2 thoughts on “Hack: The “count(case when … else … end)” in dplyr”

Leave a Comment Cancel reply

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Image Captioning with HuggingFace

Intro to Chatbots with HuggingFace

Hack: The “count(case when … else … end)” in dplyr

Share This Post

2 thoughts on “Hack: The “count(case when … else … end)” in dplyr”

Leave a Comment Cancel reply

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Image Captioning with HuggingFace

Intro to Chatbots with HuggingFace

#Tag Cloud ☁️