R recode a character column in a data table

I have a column in a data table that contains variable names. The column is called nutrient. For display purposes I want to replace values in this column, for example changing "vit_c_mg" to "Vitamin C". I have a list of the old names (list1) and a list of the new names (list2). I could do something like

for (i in seq_along(list1)) {
    DT[nutrient %in% list1[i], nutrient := list2[i]]
}

but there must be a better data.table way of doing this.

2 answers

#1


I happen to have a small data.table named dt:

dt
    x y z          d1 d2
 1: 1 1 b 0.948027912  1
 2: 2 2 a 0.926351588  1
 3: 4 1 a 0.555704929  1
 4: 4 1 a 0.987548561  1
 5: 2 1 a 0.093421508  1

It's pretty easy to use an existing column value to index a translation table:

 dt[ , z := c(a="v",b="w")[z] ]

> dt
    x y z          d1 d2
 1: 1 1 w 0.948027912  1
 2: 2 2 v 0.926351588  1
 3: 4 1 v 0.555704929  1
 4: 4 1 v 0.987548561  1
 5: 2 1 v 0.093421508  1

The values of nutrient should match the names in the translation vector. Every current value in the column needs a name in the vector, or you will get NAs. (It might be safer to create a new column before discarding the old values.)
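Applied to the nutrient column from the question, the same idea might look like the sketch below. The sample data, the contents of list1 and list2, and the nutrient_label column name are all assumptions made up for illustration; only DT, nutrient, list1 and list2 come from the question. It writes to a new column and falls back to the old name where the lookup has no entry, so nothing is lost to NA.

```r
library(data.table)

# Hypothetical data: only the column name `nutrient` comes from the question
DT <- data.table(nutrient = c("vit_c_mg", "iron_mg", "vit_c_mg", "zinc_mg"))

# Hypothetical old and new names, standing in for list1 and list2
list1 <- c("vit_c_mg", "iron_mg")
list2 <- c("Vitamin C", "Iron")

# Named lookup vector: names are the old values, elements are the new ones
lookup <- setNames(list2, list1)

# Index the lookup by the current values and store the result in a new
# column (nutrient_label is an assumed name) so the originals are kept
DT[, nutrient_label := unname(lookup[nutrient])]

# Values missing from the lookup ("zinc_mg" here) come back as NA;
# fall back to the original name for those rows
DT[is.na(nutrient_label), nutrient_label := nutrient]
```

Once nutrient_label looks right, the old column can be dropped or overwritten. If every value is guaranteed to be in the lookup, the fallback line is unnecessary.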
