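I want to pull the words specified in a list out of the strings in a pandas column and build another column from them. My example is inspired by the question "python pandas if column string contains word flag":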
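import numpy as np
import pandas as pd

listing = ['test', 'big']
df = pd.DataFrame({'Title': ['small test', 'huge Test', 'big', 'nothing', np.nan, 'a', 'b']})
# Flag rows whose Title contains any word from the list (case-insensitive; NaN counts as no match)
df['Test_Flag'] = np.where(df['Title'].str.contains('|'.join(listing), case=False, na=False), 'T', '')
print(df)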
        Title Test_Flag
0  small test         T
1   huge Test         T
2         big         T
3     nothing
4         NaN
5           a
6           b
But what if I want to put the actual word that was found from the list in the new column instead of 'T'? In other words, I want this result:
        Title Test_Flag
0  small test      test
1   huge Test      test
2         big       big
3     nothing
4         NaN
5           a
6           b
Answer from a uj5u.com user:
Using the .apply method with a custom function should give you what you are looking for:
import pandas as pd
import numpy as np

# Define the listing list with the words you want to extract
listing = ['test', 'big']

# Define the DataFrame
df = pd.DataFrame({'Title': ['small test', 'huge Test', 'big', 'nothing', np.nan, 'a', 'b']})

# Define the function which takes a string and a list of words to extract as inputs
def listing_splitter(text, listing):
    # Try except to handle np.nans in input (a float has no .lower() method)
    try:
        # Extract the list of flags (case-insensitive substring match)
        flags = [l for l in listing if l in text.lower()]
        # If any flags were extracted then return the list
        if flags:
            return flags
        # Otherwise return np.nan
        else:
            return np.nan
    except AttributeError:
        return np.nan

# Apply the function to the column
df['Test_Flag'] = df['Title'].apply(lambda x: listing_splitter(x, listing))
print(df)
Output:
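        Title Test_Flag
0  small test    [test]
1   huge Test    [test]
2         big     [big]
3     nothing       NaN
4         NaN       NaN
5           a       NaN
6           b       NaN
Note that the function matches substrings, so a title such as 'smalltest' (not part of the DataFrame above) would also be flagged with [test].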
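The question asked for the matched word itself as a plain string, with an empty string where nothing matched. A minimal sketch of a vectorized alternative, assuming only the first matching word per row is needed, uses Series.str.extract with an alternation pattern built from the list. The variable names pattern and found are illustrative and not part of the original answer.

```python
import re

import numpy as np
import pandas as pd

listing = ['test', 'big']
df = pd.DataFrame({'Title': ['small test', 'huge Test', 'big', 'nothing', np.nan, 'a', 'b']})

# Build one alternation pattern with a single capture group, e.g. "(test|big)".
# re.escape guards against list words that contain regex metacharacters.
pattern = '(' + '|'.join(re.escape(word) for word in listing) + ')'

# With one capture group and expand=False, str.extract returns a Series
# holding the first match in each row, or NaN when nothing matches.
found = df['Title'].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Lower-case the match so 'Test' becomes 'test', and use '' for non-matches.
df['Test_Flag'] = found.str.lower().fillna('')

print(df)
```

Like the .apply version, this matches substrings. If only whole words should count, one option is to wrap the group in word boundaries, for example r'\b(' + ... + r')\b'.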
