Predictive Hacks

Measure the Effectiveness of Covid-19 Vaccinations

In this post, we will try to measure the effectiveness of Covid-19 vaccinations. The analysis is very basic and does not support firm conclusions. It rests on the assumption that as more people get vaccinated, we should expect fewer daily cases and deaths. Some caveats that we need to mention:

  • We do not take into consideration the seasonality of Covid-19. We know that during summer we should expect fewer daily cases.
  • We do not take into consideration the measures imposed by the government, such as lockdowns.
  • The percentage of the population who got vaccinated is a monotonically increasing series.
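
The last caveat matters because a series that can only go up will correlate strongly with any other series that happens to trend in one direction over the same period, whatever the cause. A minimal sketch with made-up numbers (both series below are hypothetical; the decaying curve merely stands in for a wave that fades for any reason):

```python
import numpy as np
import pandas as pd

# Hypothetical data: 60 days of a steadily rising percentage and an
# independent curve that decays on its own.
days = pd.date_range("2021-01-10", periods=60, freq="D")
rising_share = pd.Series(np.linspace(1.0, 50.0, 60), index=days)
fading_wave = pd.Series(5000 * np.exp(-np.arange(60) / 30), index=days)

# Pearson correlation between the two series
spurious_corr = rising_share.corr(fading_wave)
print(round(spurious_corr, 3))
```

The printed value is strongly negative even though neither series was built from the other, which is why the correlations in this post should be read with caution.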

Get the Covid-19 Data and Run the Analysis

Our sources of data are Our World in Data and Worldometers. For this example, we will consider Israel and the percentage of its people who are fully vaccinated, taken from Our World in Data. We will also use the covid-daily Python library to get the data from Worldometers. Note that we have data up to 2021-03-18, and by that date Israel had fully vaccinated 51.77% of its population. Finally, keep in mind that the records of fully vaccinated people start on 2021-01-10.

Fully Vaccinated vs New Cases

import pandas as pd
import covid_daily

# Daily new cases for Israel from Worldometers
KPI = covid_daily.data(country='israel', chart='graph-cases-daily', as_json=False)

Vaccinated = pd.read_csv("share-people-fully-vaccinated-covid.csv")

# Get Data only for Israel
Vaccinated.query("Entity=='Israel'" , inplace = True)

# Start Measuring from the time the vaccines started
Vaccinated.query("people_fully_vaccinated_per_hundred>0" , inplace = True)

# Convert the Day to Datetime Index
Vaccinated['Day'] = pd.DatetimeIndex(Vaccinated['Day'])

# Join the data
merged = KPI.merge(Vaccinated, how='inner', left_index=True, right_on='Day')

# Use the date as the index
merged.index = merged.Day
merged.index.name = 'Date'


Scatter Plot

From the scatter plot below, we can see a negative correlation between New Cases and the Percentage of Fully Vaccinated People.

merged.plot.scatter('people_fully_vaccinated_per_hundred', 'Novel Coronavirus Daily Cases',  
                    figsize=(12,8), title = "Israel: Daily Cases vs %Population Vaccinated")

Smooth the Data with a Rolling Window of 7 Days

Reported daily cases are noisy because of the way they are measured. For that reason, it is better to smooth the data by taking the average over a rolling window of 7 days for both variables.
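
As a toy illustration of what a 7-day rolling mean does, here is a sketch on made-up counts with a repeating weekly dip (the numbers are hypothetical and only mimic days with lower reporting):

```python
import numpy as np
import pandas as pd

# Hypothetical counts: five days at 100 followed by two low-reporting days
# at 40, repeated for four weeks.
days = pd.date_range("2021-01-11", periods=28, freq="D")
daily = pd.Series([100, 100, 100, 100, 100, 40, 40] * 4, index=days, dtype=float)

# Every 7-day window contains one full weekly cycle, so the dip averages out.
# The first 6 values are NaN (incomplete windows) and are dropped.
smoothed = daily.rolling(7).mean().dropna()

print(daily.std(), smoothed.std())
```

The smoothed series is essentially flat, while the raw series swings between 40 and 100. The same call is applied to the merged Israel data next.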

rolling_merged = merged[['people_fully_vaccinated_per_hundred', 'Novel Coronavirus Daily Cases']].rolling(7).mean().dropna()

rolling_merged.plot.scatter('people_fully_vaccinated_per_hundred', 'Novel Coronavirus Daily Cases',  
                    figsize=(12,8), title = "Israel: Daily Cases vs %Population Vaccinated")

Now we have a clearer view, and the data are clearly strongly negatively correlated. Let’s also get the correlation for both the raw and the smoothed data:

For the smoothed data the correlation is -0.965, and for the raw data it is -0.787.
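
The code that produced these figures is not shown here. A minimal sketch of how such correlations could be computed with pandas, assuming Pearson correlation (the default of Series.corr). The helper name raw_and_smoothed_corr is hypothetical, and the data frame is synthetic, standing in for merged:

```python
import numpy as np
import pandas as pd

def raw_and_smoothed_corr(df, x_col, y_col, window=7):
    """Return (raw, smoothed) Pearson correlations between two columns."""
    raw = df[x_col].corr(df[y_col])
    smoothed_df = df[[x_col, y_col]].rolling(window).mean().dropna()
    smoothed = smoothed_df[x_col].corr(smoothed_df[y_col])
    return raw, smoothed

# Synthetic stand-in for `merged` (made-up numbers, not the Israel data)
rng = np.random.default_rng(0)
days = pd.date_range("2021-01-10", periods=68, freq="D")
vaccinated = np.linspace(0.5, 50.0, len(days))
cases = 8000 - 120 * vaccinated + rng.normal(0, 800, len(days))
demo = pd.DataFrame(
    {
        "people_fully_vaccinated_per_hundred": vaccinated,
        "Novel Coronavirus Daily Cases": cases,
    },
    index=days,
)

raw_corr, smoothed_corr = raw_and_smoothed_corr(
    demo, "people_fully_vaccinated_per_hundred", "Novel Coronavirus Daily Cases"
)
print(round(raw_corr, 3), round(smoothed_corr, 3))
```

With the real data, the same helper could be called on merged with the column names used above.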

Fully Vaccinated vs Deaths

Let’s run the same analysis, but this time by considering the daily deaths.

# Daily deaths for Israel from Worldometers
KPI = covid_daily.data(country='israel', chart='graph-deaths-daily', as_json=False)

# Join the data
merged = KPI.merge(Vaccinated, how='inner', left_index=True, right_on='Day')

# Use the date as the index
merged.index = merged.Day
merged.index.name = 'Date'

merged.plot.scatter('people_fully_vaccinated_per_hundred', 'Novel Coronavirus Daily Deaths',  
                    figsize=(12,8), title = "Israel: Daily Deaths vs %Population Vaccinated")

# Smooth both variables with a 7-day rolling mean
rolling_merged = merged[['people_fully_vaccinated_per_hundred', 'Novel Coronavirus Daily Deaths']].rolling(7).mean().dropna()

rolling_merged.plot.scatter('people_fully_vaccinated_per_hundred', 'Novel Coronavirus Daily Deaths',  
                    figsize=(12,8), title = "Israel: Daily Deaths vs %Population Vaccinated")

As we can see, the correlation is also negative for the daily deaths: it is -0.8987 in the smoothed data and -0.603 in the raw data.

Finally, we can run a rolling correlation on the smoothed data, using a rolling window of 30 days.

rolling_merged['people_fully_vaccinated_per_hundred'].\
rolling(30).corr(rolling_merged['Novel Coronavirus Daily Deaths']).\
dropna().plot(figsize=(12,8), title="Rolling Correlation")
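
To make the behaviour of a rolling correlation concrete, here is a small sketch on made-up series (hypothetical data, not the Israel figures) in which the relationship between the two variables flips halfway through:

```python
import numpy as np
import pandas as pd

# Hypothetical series: y falls as x rises for the first 45 days,
# then rises together with x for the last 45 days.
t = np.arange(90)
days = pd.date_range("2021-01-01", periods=90, freq="D")
x = pd.Series(t.astype(float), index=days)
y = pd.Series(np.where(t < 45, -t, t - 90).astype(float), index=days)

# Pearson correlation inside each 30-day window; the first 29 windows
# are incomplete (NaN) and are dropped.
rolling_corr = x.rolling(30).corr(y).dropna()

print(rolling_corr.iloc[0], rolling_corr.iloc[-1])
```

The first window lies entirely in the falling phase and gives a value close to -1, while the last lies entirely in the rising phase and gives a value close to +1. A single overall correlation would hide this change, which is what the rolling version is meant to expose.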

The Takeaway

We considered Israel because it is by far the country with the highest rate of fully vaccinated people. We could also have run the analysis on people vaccinated with at least one dose. A fair criticism of this analysis is that daily cases and deaths may have decreased for reasons other than vaccinations. However, we cannot ignore this strongly negative correlation, especially during February and March, which are months when we typically observe more Covid-19 cases. It will be interesting to keep monitoring this correlation, especially once 70% of the population has been vaccinated, a number associated with the so-called “herd immunity”.

Extra Part

For convenience, let’s have a look at the daily new cases and deaths, both in the raw format and smoothed with a rolling window of 7 days (moving average).

covid_daily.data(country='israel', chart='graph-cases-daily', as_json=False).plot(figsize=(12,8), title="New Cases")

covid_daily.data(country='israel', chart='graph-cases-daily', as_json=False).rolling(7).mean().plot(figsize=(12,8), title="New Cases with Moving Average of 7 days")

covid_daily.data(country='israel', chart='graph-deaths-daily', as_json=False).plot(figsize=(12,8), title="Deaths")

covid_daily.data(country='israel', chart='graph-deaths-daily', as_json=False).rolling(7).mean().plot(figsize=(12,8), title="Deaths with Moving Average of 7 days")
