Avoid apply() function in large datasets

Tags: apply, row mins

When we are dealing with large datasets and there is a need to calculate some values like the row/column min/max/rank/mean etc we should avoid the apply function because it takes a lot of time. Instead, we can use the matrixStats package and its corresponding functions. Let’s provide some comparisons.

Example of Minimum value per Row

Assume that we want to get the minimum value of each row from a 500 x 500 matrix. Let’s compare the performance of the apply function from the base package versus the rowMins function from the matrixStats package.

library(matrixStats)
library(microbenchmark)
library(ggplot2)

x <- matrix( rnorm(5000 * 5000), ncol = 5000 )

tm <- microbenchmark(apply(x,1,min),
                     rowMins(x),
                     times = 100L
                    )

tm

Unit: milliseconds
             expr      min         lq       mean    median        uq       max neval
 apply(x, 1, min) 981.6283 1034.98050 1078.04485 1065.4163 1107.9962 1327.9284   100
       rowMins(x)  42.1838   43.80065   46.55752   45.2255   47.6249   81.3097   100

As we can see from the output above, the apply function was 23 times slower than the rowMins. Below we represent the violin plot

autoplot(tm)

Tags: apply, row mins

Share This Post

Get updates and learn from the best

More To Explore

Python

Image Captioning with HuggingFace

Image captioning with AI is a fascinating application of artificial intelligence (AI) that involves generating textual descriptions for images automatically.

George Pipis March 21, 2024

Python

Intro to Chatbots with HuggingFace

In this tutorial, we will show you how to use the Transformers library from HuggingFace to build chatbot pipelines. Let’s

George Pipis March 15, 2024

Avoid apply() function in large datasets

Example of Minimum value per Row

Share This Post

Leave a Comment Cancel reply

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Image Captioning with HuggingFace

Intro to Chatbots with HuggingFace

Avoid apply() function in large datasets

Example of Minimum value per Row

Share This Post

Leave a Comment Cancel reply

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Image Captioning with HuggingFace

Intro to Chatbots with HuggingFace

#Tag Cloud ☁️