I have a txt file whose lines look like this:
"Hello I'm in Tensorflow"
"My name is foo"
'Mr "alias" is running'
...
As you can see, each line contains exactly one string. When I try to create a tf.data.Dataset from it, the output looks like this:
conver = tf.data.TextLineDataset('path_to.txt')
for utter in conver:
    print(utter)
    break
# tf.Tensor(b'"Hello I'm in Tensorflow"', shape=(), dtype=string)
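Notice that the double quotes " are still present at the start and end of the string (in addition to the ' that belongs to the tensor's printed representation). The output I want is: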
# tf.Tensor(b'Hello I'm in Tensorflow', shape=(), dtype=string)
That is, without the quotes. Thanks in advance.
Answer from a uj5u.com user:
You can use tf.strings.regex_replace:
import tensorflow as tf
conver = tf.data.TextLineDataset('/content/text.txt')
def remove_quotes(text):
    # Remove every double quote, then every single quote
    text = tf.strings.regex_replace(text, '\"', '')
    text = tf.strings.regex_replace(text, '\'', '')
    return text
conver = conver.map(remove_quotes)
for s in conver:
    print(s)
tf.Tensor(b'Hello Im in Tensorflow', shape=(), dtype=string)
tf.Tensor(b'My name is foo', shape=(), dtype=string)
tf.Tensor(b'Mr alias is running', shape=(), dtype=string)
Or, if you only want to remove the leading and trailing quotes, try the following instead:
text = tf.strings.regex_replace(text, '^[\"\']*|[\"\']*$', '')
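To see what the anchored pattern does before wiring it into the dataset, it can be tried on plain Python strings with the standard re module. This is only a sketch: tf.strings.regex_replace uses RE2 syntax rather than Python's re, and the assumption here is that this simple pattern behaves the same in both. The helper name strip_edge_quotes is made up for illustration.

```python
import re

# Same pattern as in the answer above: runs of quotes at the very start
# or the very end of the line. Quotes in the middle are left untouched.
EDGE_QUOTES = re.compile(r'^["\']*|["\']*$')

def strip_edge_quotes(line: str) -> str:
    """Remove leading and trailing quote characters, keep inner ones."""
    return EDGE_QUOTES.sub('', line)

# The three sample lines from the question, as Python strings.
lines = [
    '"Hello I\'m in Tensorflow"',
    '"My name is foo"',
    '\'Mr "alias" is running\'',
]
for line in lines:
    print(strip_edge_quotes(line))
```

Unlike the two-step version that removes every quote, this keeps the apostrophe in I'm and the inner quotes around alias.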
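Answer from a uj5u.com user:
eval() would do this, since each line is a valid Python string literal. The dataset yields tf.Tensor objects, so each one has to be converted to a str first; ast.literal_eval is used here in place of eval() because it only evaluates literals.
import ast
for utter in conver:
    print(ast.literal_eval(utter.numpy().decode('utf-8')))
    break
Or you can simply use replace:
for utter in conver:
    print(utter.numpy().decode('utf-8').replace('"', ''))
    break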
Answer from a uj5u.com user:
If you want to keep quotes that are not at the start or end of the string:
for utter in conver:
    s = utter.numpy().decode('utf-8')  # tf.Tensor -> Python str
    print(''.join(c for i, c in enumerate(s) if not (c == '"' and i in (0, len(s) - 1))))
    break
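The comprehension above can also be written as a small helper. This is a sketch, assuming each line has already been converted to a Python str (for example with utter.numpy().decode('utf-8')). The name strip_outer_quotes is hypothetical, and unlike the comprehension it strips only when the first and last characters are the same quote character, which also covers the single-quoted line from the question.

```python
def strip_outer_quotes(line: str) -> str:
    """Drop one pair of matching outer quotes; leave inner quotes alone."""
    if len(line) >= 2 and line[0] == line[-1] and line[0] in ('"', "'"):
        return line[1:-1]
    return line

# Sample lines from the question, plus one line without any quotes.
samples = [
    '"Hello I\'m in Tensorflow"',
    '\'Mr "alias" is running\'',
    'no quotes',
]
for line in samples:
    print(strip_outer_quotes(line))
```

A line with a quote on only one side is returned unchanged, which avoids trimming text that merely happens to end in a quote.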
