Predictive Hacks

# How to Rename and Relevel Factors in R

Factors are a special data structure in R for categorical data: each value is stored as an integer code that points to a label, called a level. Below are some examples of how to rename and relevel factors. We will work with the following data:

```r
df <- data.frame(
  ID = 1:10,
  Gender = factor(c("M", "M", "M", "", "F", "F", "M", "", "F", "F")),
  AgeGroup = factor(c("[60+]", "[26-35]", "[NA]", "[36-45]", "[46-60]",
                      "[26-35]", "[NA]", "[18-25]", "[26-35]", "[26-35]"))
)
```
```
> df
   ID Gender AgeGroup
1   1      M    [60+]
2   2      M  [26-35]
3   3      M     [NA]
4   4         [36-45]
5   5      F  [46-60]
6   6      F  [26-35]
7   7      M     [NA]
8   8         [18-25]
9   9      F  [26-35]
10 10      F  [26-35]
```
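Before changing anything, it can help to look at how R has stored the two factors. The snippet below is a minimal sketch that rebuilds the two columns as standalone vectors (the names `gender` and `age_group` are ours, not part of the original data frame) and inspects their levels and integer codes.

```r
# Same values as the example data, rebuilt so the snippet runs on its own
gender <- factor(c("M", "M", "M", "", "F", "F", "M", "", "F", "F"))
age_group <- factor(c("[60+]", "[26-35]", "[NA]", "[36-45]", "[46-60]",
                      "[26-35]", "[NA]", "[18-25]", "[26-35]", "[26-35]"))

levels(gender)       # the empty string is a level of its own
levels(age_group)    # levels are sorted, not kept in order of appearance
table(gender)        # number of observations per level
as.integer(gender)   # the integer codes stored behind the labels
```

The sorted order of the levels matters later, when levels are replaced by position.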

## Rename Factors

Let’s say that we want to replace the empty-string level of Gender with “U”, for Unknown:

```r
levels(df$Gender)[levels(df$Gender) == ""] <- "U"
```
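To see what this kind of assignment does, here is a small sketch on a standalone vector (`gender` and `codes_before` are illustrative names). It suggests that renaming a level changes only the label: every observation with that level is updated at once and the underlying integer codes stay the same.

```r
gender <- factor(c("M", "M", "M", "", "F", "F", "M", "", "F", "F"))
codes_before <- as.integer(gender)

# Rename the empty-string level; its position among the levels is kept
levels(gender)[levels(gender) == ""] <- "U"

table(gender)
# Only the label changed; the integer codes are untouched
identical(codes_before, as.integer(gender))
```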

Now let’s say that we want to merge the age groups, so that the new categories are “[18-35]”, “[35+]” and “[NA]”. Assigning the same new name to several levels merges them:

```r
levels(df$AgeGroup)[levels(df$AgeGroup) == "[18-25]"] <- "[18-35]"
levels(df$AgeGroup)[levels(df$AgeGroup) == "[26-35]"] <- "[18-35]"

levels(df$AgeGroup)[levels(df$AgeGroup) == "[36-45]"] <- "[35+]"
levels(df$AgeGroup)[levels(df$AgeGroup) == "[46-60]"] <- "[35+]"
levels(df$AgeGroup)[levels(df$AgeGroup) == "[60+]"] <- "[35+]"
```
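The same merge can also be written more compactly. Base R lets you assign a named list to `levels()`, where each name is a new level and each element lists the old levels it absorbs. The sketch below works on a standalone vector (`age_group` is our name for it) rather than on `df`.

```r
age_group <- factor(c("[60+]", "[26-35]", "[NA]", "[36-45]", "[46-60]",
                      "[26-35]", "[NA]", "[18-25]", "[26-35]", "[26-35]"))

# Each list name is a new level; its value lists the old levels to merge
levels(age_group) <- list(
  "[18-35]" = c("[18-25]", "[26-35]"),
  "[35+]"   = c("[36-45]", "[46-60]", "[60+]"),
  "[NA]"    = "[NA]"   # old levels left out of the list would become NA
)

table(age_group)
```

Because the mapping is by name, it does not depend on the order of the existing levels. The new levels appear in the order they are given in the list.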

Note that we could have done this in a single assignment, but that is risky: the new names are matched to the existing levels by position, so the result is wrong if the levels are in a different order than we expect.

```r
levels(df$AgeGroup) <- c("[18-35]", "[18-35]", "[35+]", "[35+]", "[35+]", "[NA]")
```
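One way to reduce that risk is to check the current level order before assigning by position. This is a sketch on a standalone vector; `age_group` and `expected` are illustrative names.

```r
age_group <- factor(c("[60+]", "[26-35]", "[NA]", "[36-45]", "[46-60]",
                      "[26-35]", "[NA]", "[18-25]", "[26-35]", "[26-35]"))

# The order the positional assignment below relies on
expected <- c("[18-25]", "[26-35]", "[36-45]", "[46-60]", "[60+]", "[NA]")

# Stop early if the current levels are not in that order
stopifnot(identical(levels(age_group), expected))

levels(age_group) <- c("[18-35]", "[18-35]", "[35+]", "[35+]", "[35+]", "[NA]")
levels(age_group)
```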

After applying the changes above, we get the following data:

```
> df
   ID Gender AgeGroup
1   1      M    [35+]
2   2      M  [18-35]
3   3      M     [NA]
4   4      U    [35+]
5   5      F    [35+]
6   6      F  [18-35]
7   7      M     [NA]
8   8      U  [18-35]
9   9      F  [18-35]
10 10      F  [18-35]
```

## Relevel Factors

Let’s say that we want the “[NA]” age group to appear first. We can rebuild the factor and list the levels in the order we want:

```r
df$AgeGroup <- factor(df$AgeGroup, levels = c("[NA]", "[18-35]", "[35+]"))
```
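Two things are worth checking with this approach: the observations keep their labels, and any level that is missing or misspelled in `levels` silently turns into `NA`. The sketch below uses a short standalone vector; `age_group`, `reordered` and `typo` are illustrative names.

```r
age_group <- factor(c("[35+]", "[18-35]", "[NA]", "[35+]", "[35+]"))

reordered <- factor(age_group, levels = c("[NA]", "[18-35]", "[35+]"))
# Labels per observation are unchanged; only the level order differs
identical(as.character(age_group), as.character(reordered))
levels(reordered)

# A misspelled level ("[35 +]") leaves "[35+]" unmatched, so it becomes NA
typo <- factor(age_group, levels = c("[NA]", "[18-35]", "[35 +]"))
sum(is.na(typo))
```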

Another way to change the order is to use `relevel()`, which moves one particular level to the front of the list and leaves the others in their existing order. (This does not work for ordered factors.) Let’s say that we want the “F” Gender first:

```r
df$Gender <- relevel(df$Gender, ref = "F")
```
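The sketch below illustrates both points on a standalone vector (`gender`, `gender_ord` and `result` are illustrative names): `relevel()` moves the reference level to the front, and it raises an error for an ordered factor, in which case the full order can be given to `factor()` instead.

```r
gender <- factor(c("M", "M", "M", "U", "F", "F", "M", "U", "F", "F"),
                 levels = c("U", "F", "M"))

# "F" moves to the front; the remaining levels keep their order
levels(relevel(gender, ref = "F"))

# relevel() refuses ordered factors, so catch the error instead of stopping
gender_ord <- factor(gender, levels = c("U", "F", "M"), ordered = TRUE)
result <- tryCatch(relevel(gender_ord, ref = "F"),
                   error = function(e) "error")
result

# For an ordered factor, spell out the full order with factor()
gender_ord <- factor(gender_ord, levels = c("F", "U", "M"), ordered = TRUE)
levels(gender_ord)
```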

After applying these changes, `str()` shows the new level order of both factors:

```
> str(df)
'data.frame':	10 obs. of  3 variables:
 $ ID      : int  1 2 3 4 5 6 7 8 9 10
 $ Gender  : Factor w/ 3 levels "F","U","M": 3 3 3 2 1 1 3 2 1 1
 $ AgeGroup: Factor w/ 3 levels "[NA]","[18-35]",..: 3 2 1 3 3 2 1 2 2 2
```
