In this post, we will try to measure the effectiveness of Covid-19 vaccinations. The analysis is very basic and we cannot draw safe conclusions from it. It rests on the assumption that as more people get vaccinated, we expect fewer daily cases and deaths. Some things that we need to mention:
- We do not take into consideration the seasonality of Covid-19. We know that during summer we should expect fewer daily cases.
- We do not take into consideration the measures imposed by the government, such as lockdowns.
- The percentage of the population who got vaccinated is a monotonically increasing series, so it will correlate with any variable that trends steadily over the same period.
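To make the last caveat concrete, here is a minimal sketch using made-up numbers (not the Israel data). It pairs a steadily increasing series with a series that declines for an unrelated reason. The names `pearson`, `vaccinated_pct` and `unrelated_cases` are hypothetical and exist only for this illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(0)
days = 60

# Hypothetical vaccination share: strictly increasing from 0% to 50%.
vaccinated_pct = [50 * d / (days - 1) for d in range(days)]

# Hypothetical daily cases that fall over time for an unrelated reason,
# with some random noise added.
unrelated_cases = [8000 - 100 * d + random.gauss(0, 300) for d in range(days)]

r = pearson(vaccinated_pct, unrelated_cases)
print(f"correlation: {r:.3f}")
```

The coefficient comes out strongly negative even though, by construction, one series has no effect on the other. This is why the correlations reported later in the post should be read as descriptive only.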
Get the Covid-19 Data and Run the Analysis
Our data sources are Our World in Data and Worldometers. For this example, we consider Israel and the percentage of its population that is fully vaccinated, which comes from Our World in Data (the share-people-fully-vaccinated-covid.csv file used below). We also use the covid-daily Python library to get the daily figures from Worldometers. Note that we have data up to 2021-03-18, by which time Israel had fully vaccinated 51.77% of its population. Finally, keep in mind that the records of fully vaccinated people start on 2021-01-10.
Fully Vaccinated vs New Cases
import pandas as pd
import covid_daily

KPI = covid_daily.data(country='israel', chart='graph-cases-daily', as_json=False)
Vaccinated = pd.read_csv("share-people-fully-vaccinated-covid.csv")

# Get data only for Israel
Vaccinated.query("Entity=='Israel'", inplace=True)

# Start measuring from the time the vaccinations started
Vaccinated.query("people_fully_vaccinated_per_hundred>0", inplace=True)

# Convert the Day column to datetime
Vaccinated['Day'] = pd.DatetimeIndex(Vaccinated['Day'])

# Join the data: KPI is indexed by date, Vaccinated has the date in 'Day'
merged = KPI.merge(Vaccinated, how='inner', left_index=True, right_on='Day')
merged.index = merged.Day
merged.index.name = 'Date'
merged
Scatter Plot
From the scatter plot below, we can see a negative correlation between New Cases and the Percentage of Fully Vaccinated People.
merged.plot.scatter(
    'people_fully_vaccinated_per_hundred',
    'Novel Coronavirus Daily Cases',
    figsize=(12,8),
    title="Israel: Daily Cases vs %Population Vaccinated",
)
Smooth the Data with a Rolling Window of 7 Days
Daily case counts are noisy because of the way they are measured and reported. For that reason, it is better to smooth the data by taking the average over a rolling window of 7 days for both variables.
rolling_merged = merged[[
    'people_fully_vaccinated_per_hundred',
    'Novel Coronavirus Daily Cases',
]].rolling(7).mean().dropna()

rolling_merged.plot.scatter(
    'people_fully_vaccinated_per_hundred',
    'Novel Coronavirus Daily Cases',
    figsize=(12,8),
    title="Israel: Daily Cases vs %Population Vaccinated",
)
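To see what `rolling(7).mean().dropna()` does, here is a small sketch on synthetic counts (illustrative values, not real data) that dip on two days out of every seven. The series name `daily` and its values are assumptions made for this example.

```python
import pandas as pd

# Synthetic daily counts with a repeating 7-day pattern (not real data).
dates = pd.date_range("2021-01-10", periods=14, freq="D")
daily = pd.Series([100, 100, 100, 100, 100, 30, 30] * 2, index=dates, name="cases")

# Each value is the mean of the current day and the 6 days before it.
smoothed = daily.rolling(7).mean()

# The first 6 entries are NaN because a full 7-day window is not yet available;
# dropna() removes them, which is why the smoothed frame is shorter than the raw one.
print(smoothed.isna().sum(), "missing values before dropna()")
print(smoothed.dropna())
```

Because every 7-day window contains exactly one full cycle of the pattern, the weekly dip disappears from the smoothed series. The cost is that the first six days are lost.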
Now we have a clearer view, and we can see that the data are strongly negatively correlated. Let’s also get the correlation for both the raw and the smoothed data:
For the smoothed data the correlation is -0.965 and for the raw data it is -0.787.
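The post does not show the code behind these two numbers. One way they could be computed is with the pandas `Series.corr` method, sketched below. The helper `raw_and_smoothed_corr` is a hypothetical name, and the demo frame is synthetic stand-in data, so its output will not match the figures above.

```python
import numpy as np
import pandas as pd


def raw_and_smoothed_corr(df, x_col, y_col, window=7):
    """Return (raw, smoothed) Pearson correlations between two columns."""
    raw = df[x_col].corr(df[y_col])
    smooth = df[[x_col, y_col]].rolling(window).mean().dropna()
    return raw, smooth[x_col].corr(smooth[y_col])


# Synthetic stand-in for `merged`: a rising vaccination share and noisy,
# declining daily cases (assumed values, for illustration only).
rng = np.random.default_rng(0)
n = 60
demo = pd.DataFrame({
    "people_fully_vaccinated_per_hundred": np.linspace(0.1, 50, n),
    "Novel Coronavirus Daily Cases": 8000 - 100 * np.arange(n) + rng.normal(0, 1500, n),
})

raw_r, smooth_r = raw_and_smoothed_corr(
    demo,
    "people_fully_vaccinated_per_hundred",
    "Novel Coronavirus Daily Cases",
)
print(f"raw: {raw_r:.3f}, smoothed: {smooth_r:.3f}")
```

With the real data, the same helper would be called on `merged`. Smoothing averages out much of the noise, so the smoothed coefficient is usually larger in magnitude than the raw one, as it is in the numbers reported above.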
Fully Vaccinated vs Deaths
Let’s run the same analysis, but this time by considering the daily deaths.
KPI = covid_daily.data(country='israel', chart='graph-deaths-daily', as_json=False)

# Join the data
merged = KPI.merge(Vaccinated, how='inner', left_index=True, right_on='Day')
merged.index = merged.Day
merged.index.name = 'Date'

merged.plot.scatter(
    'people_fully_vaccinated_per_hundred',
    'Novel Coronavirus Daily Deaths',
    figsize=(12,8),
    title="Israel: Daily Deaths vs %Population Vaccinated",
)

rolling_merged = merged[[
    'people_fully_vaccinated_per_hundred',
    'Novel Coronavirus Daily Deaths',
]].rolling(7).mean().dropna()

rolling_merged.plot.scatter(
    'people_fully_vaccinated_per_hundred',
    'Novel Coronavirus Daily Deaths',
    figsize=(12,8),
    title="Israel: Daily Deaths vs %Population Vaccinated",
)
As we can see, the correlation is also negative when we consider the daily deaths: it is -0.8987 for the smoothed data and -0.603 for the raw data.
Finally, we can run a rolling correlation on the smoothed data by considering a rolling window of 30 days.
(
    rolling_merged['people_fully_vaccinated_per_hundred']
    .rolling(30)
    .corr(rolling_merged['Novel Coronavirus Daily Deaths'])
    .dropna()
    .plot(figsize=(12,8), title="Rolling Correlation")
)
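A rolling correlation is useful because a single overall coefficient can hide changes over time. The sketch below uses synthetic series (assumed values, not the Israel data). Here `y` rises together with `x` for the first 30 days and then falls, so the 30-day rolling correlation moves from positive to negative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2021-01-10", periods=60, freq="D")

# x increases steadily, like a vaccination share.
x = pd.Series(np.linspace(0.1, 50, 60), index=dates)

# y increases for the first 30 days and decreases for the last 30.
y = pd.Series(
    np.concatenate([np.arange(30), 30 - np.arange(30)]),
    index=dates,
    dtype=float,
)

# Each value is the Pearson correlation over the current day and the 29 before it;
# the first 29 entries are NaN because the window is not yet full.
rolling_corr = x.rolling(30).corr(y)

print(rolling_corr.dropna().iloc[[0, -1]])
```

The first full window covers only the rising part of `y` and the last covers only the falling part. The rolling coefficient therefore swings from one extreme to the other, while the correlation over all 60 days would sit somewhere in between.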
The Takeaway
We considered Israel because it is by far the country with the highest rate of fully vaccinated people. We could also have run the analysis on people vaccinated with at least one dose. A fair criticism of this analysis is that daily cases and deaths may have decreased for reasons other than vaccinations. However, we cannot neglect this strongly negative correlation, especially during February and March, months in which we would expect more Covid-19 cases. It will be interesting to keep monitoring this correlation, especially once 70% of the population has been vaccinated, a figure associated with so-called “herd immunity”.
Extra Part
For convenience, let’s have a look at the daily new cases and deaths, both in raw form and as a 7-day moving average.
# Fetch each series once and reuse it for both plots
cases = covid_daily.data(country='israel', chart='graph-cases-daily', as_json=False)
deaths = covid_daily.data(country='israel', chart='graph-deaths-daily', as_json=False)

cases.plot(figsize=(12,8), title="New Cases")
cases.rolling(7).mean().plot(figsize=(12,8), title="New Cases with Moving Average of 7 days")

deaths.plot(figsize=(12,8), title="Deaths")
deaths.rolling(7).mean().plot(figsize=(12,8), title="Deaths with Moving Average of 7 days")