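I have a data frame in which one of the columns is Id, and some of its values were garbled while the data was being recorded.
Here is a sample of the data: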
dput(df)
structure(list(Id = c("'110171786'", "'1103fbfd5'", "'0700edf6dc'",
"'1103fad09'", "'01103fc9bb'", "''", "''", "0000fba2b'", "'01103fb169'",
"'01103fd723'", "'01103f9c34'", "''", "''", "''", "'01103fc088'",
"'01103fa6d8'", "'01103fb374'", "'01103fce8c'", "'01103f955d'",
"'011016e633'", "'01103fa0da'", "''", "''", "''", "'01103fa4bd'",
"'01103fb5c4'", "'01103fd0d7'", "'01103f9e2e'", "'01103fc657'",
"'01103fd4d1'", "'011016e78e'", "'01103fbda2'", "'01103fbae7'",
"'011016ee23'", "'01103fc847'", "'01103fbfbb'", "''", "'01103fb8bb'",
"'01103fc853'", "''", "'01103fbcd5'", "'011016e690'", "'01103fb253'",
"'01103fcb19'", "'01103fb446'", "'01103fa4fa'", "'011016cfbd'",
"'01103fd250'", "'01103fac7d'", "'011016a86e'"), Weight = c(11.5,
11.3, 11.3, 10.6, 10.6, 8.9, 18.7, 10.9, 11.3, 18.9, 18.9, 8.6,
8.8, 8.4, 11, 10.4, 10.4, 10.8, 11.2, 11, 10.3, 9.5, 8.1, 9.3,
10.2, 10.5, 11.2, 21.9, 18, 17.8, 11.3, 11.5, 10.8, 10.5, 12.8,
10.9, 8.9, 10.3, 10.8, 8.9, 10.9, 9.9, 19, 11.6, 11.3, 11.7,
10.9, 12.1, 11.3, 10.6)), class = "data.frame", row.names = c(NA,
-50L))
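What I want to do is search the Id column and fix the following errors:
- Some values are missing the zero at the front. All of these now start with 1, which makes them easy to find. So basically anything that is 9 characters long and starts with 1 needs a 0 as its first character.
- Some values are shorter than 10 characters; these need to be removed.
- Some values have more than one leading 0; these need to be removed.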
A uj5u.com user replied:
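# Prepend the missing 0 to any Id that is 9 characters long and starts with 1
df$Id <- gsub("^('?)(1.{8}')$", "\\10\\2", df$Id)
# Print the rows that remain after dropping empty Ids and Ids with more than one leading 0
df[ !grepl("^'?(00|'$)", df$Id),]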
# Id Weight
# 1 '0110171786' 11.5
# 2 '01103fbfd5' 11.3
# 3 '0700edf6dc' 11.3
# 4 '01103fad09' 10.6
# 5 '01103fc9bb' 10.6
# 9 '01103fb169' 11.3
# 10 '01103fd723' 18.9
# 11 '01103f9c34' 18.9
# 15 '01103fc088' 11.0
# 16 '01103fa6d8' 10.4
# 17 '01103fb374' 10.4
# 18 '01103fce8c' 10.8
# 19 '01103f955d' 11.2
# 20 '011016e633' 11.0
# 21 '01103fa0da' 10.3
# 25 '01103fa4bd' 10.2
# 26 '01103fb5c4' 10.5
# 27 '01103fd0d7' 11.2
# 28 '01103f9e2e' 21.9
# 29 '01103fc657' 18.0
# 30 '01103fd4d1' 17.8
# 31 '011016e78e' 11.3
# 32 '01103fbda2' 11.5
# 33 '01103fbae7' 10.8
# 34 '011016ee23' 10.5
# 35 '01103fc847' 12.8
# 36 '01103fbfbb' 10.9
# 38 '01103fb8bb' 10.3
# 39 '01103fc853' 10.8
# 41 '01103fbcd5' 10.9
# 42 '011016e690' 9.9
# 43 '01103fb253' 19.0
# 44 '01103fcb19' 11.6
# 45 '01103fb446' 11.3
# 46 '01103fa4fa' 11.7
# 47 '011016cfbd' 10.9
# 48 '01103fd250' 12.1
# 49 '01103fac7d' 11.3
# 50 '011016a86e' 10.6
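For comparison, the three rules from the question can also be written as explicit length and prefix checks instead of regular expressions. This is only a sketch under a few assumptions: clean_ids() is a hypothetical helper name, the quote characters are treated as decoration and stripped before counting, and every kept Id is wrapped in quotes again. The five rows are copied from the question's data.

```r
# Five rows taken from the question's data (the quote characters are part of Id).
df_small <- data.frame(
  Id = c("'110171786'", "'0700edf6dc'", "''", "0000fba2b'", "'01103fc9bb'"),
  Weight = c(11.5, 11.3, 8.9, 10.9, 10.6),
  stringsAsFactors = FALSE
)

# clean_ids() is a hypothetical helper, not part of the original answer.
clean_ids <- function(d) {
  # Strip the literal quote characters so lengths refer to the bare Id.
  core <- gsub("'", "", d$Id, fixed = TRUE)
  # Rule 1: 9 characters starting with "1" get the missing "0" prepended.
  needs_zero <- nchar(core) == 9 & startsWith(core, "1")
  core <- ifelse(needs_zero, paste0("0", core), core)
  # Rules 2 and 3: keep only 10-character Ids that do not start with "00".
  keep <- nchar(core) == 10 & !startsWith(core, "00")
  d$Id <- paste0("'", core, "'")
  d[keep, , drop = FALSE]
}

cleaned <- clean_ids(df_small)
print(cleaned)
```

Unlike the regex filter above, this version drops every Id whose unquoted length is not exactly 10, so a short value that does not start with 00 is removed as well. Whether that stricter reading is wanted depends on the data.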
