How do I perform this split in Python?

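I am trying to tag the data in a table. The word index should restart for every row, and each column should get its own Enum class.

So far, I have used the same Enum class for every column.

A solution that handles the columns separately as lists would also work. But what is the best way to solve this?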

import pandas as pd
from enum import Enum


df = pd.DataFrame({'first': ['product and other', 'product2 and other', 'price'], 'second':['product and prices', 'price2', 'product3 and price']})
df

class Tipos(Enum):
    B = 1
    I = 2
    L = 3

for index, row in df.iterrows():
    sentencas = row.values
    for sentenca in sentencas:
        for pos, palavra in enumerate(sentenca.split()):
            print(f"{palavra} {Tipos(pos + 1).name}")

Result:

                first              second
0   product and other  product and prices
1  product2 and other              price2
2               price  product3 and price

product B
and I
other L
product B
and I
prices L
product2 B
and I
other L
price2 B
price B
product3 B
and I
price L

Expected result:

        Word       Ent
0    product   B_first
1        and   I_first
2      other   L_first
3    product  B_second
4        and  I_second
5     prices  L_second
6   product2   B_first
7        and   I_first
8      other   L_first
9     price2  B_second
10     price   B_first
11  product3  B_second
12       and  I_second
13     price  L_second

# Here the sequence is B_first, I_first, L_first, L_first, ... and when the column changes it becomes B_second, I_second, L_second, ...

Answer from a uj5u.com user:

Instead of an Enum, you can use a dict mapping. If you flatten the DataFrame, you can avoid the loop:

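# Assumed dict (missing from the post): 0-based word position -> tag
Tipos = {0: 'B', 1: 'I', 2: 'L'}
# Flatten to one word per row, ordered by original row and then by column
out = df.unstack().str.split().explode().sort_index(level=1).to_frame('Word')
# Number the words in each (column, row) cell, map to a tag, append the column name
out['Ent'] = out.groupby(level=[0, 1]).cumcount().map(Tipos) \
                   + '_' + out.index.get_level_values(0)
out = out.reset_index(drop=True)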

Output:

>>> out
        Word       Ent
0    product   B_first
1        and   I_first
2      other   L_first
3    product  B_second
4        and  I_second
5     prices  L_second
6   product2   B_first
7        and   I_first
8      other   L_first
9     price2  B_second
10     price   B_first
11  product3  B_second
12       and  I_second
13     price  L_second
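
If you would rather keep the Enum from the question, the lookup dict can be derived from it instead of being written by hand. The sketch below is one possible way to do that, not part of the original answer: `tag_words`, `pos_to_name`, and `last_pos` are hypothetical names, and it assumes that any word past the third should keep the last tag (the "L_first, L_first..." behaviour mentioned in the question).

```python
from enum import Enum

import pandas as pd


class Tipos(Enum):
    B = 1
    I = 2
    L = 3


def tag_words(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per word, tagged as <Tipos name>_<column>."""
    # Derive the 0-based position -> name lookup from the Enum:
    # {0: 'B', 1: 'I', 2: 'L'}
    pos_to_name = {member.value - 1: member.name for member in Tipos}
    last_pos = max(pos_to_name)

    # Flatten to a Series indexed by (column, row), split each cell into
    # words, and order the result by original row, then by column.
    out = (
        df.unstack()
        .str.split()
        .explode()
        .sort_index(level=1)
        .to_frame("Word")
    )

    # Position of each word inside its own cell; positions past the last
    # tag are capped (assumption: extra words repeat the final tag).
    pos = out.groupby(level=[0, 1]).cumcount().clip(upper=last_pos)
    out["Ent"] = pos.map(pos_to_name) + "_" + out.index.get_level_values(0)
    return out.reset_index(drop=True)


if __name__ == "__main__":
    df = pd.DataFrame({
        "first": ["product and other", "product2 and other", "price"],
        "second": ["product and prices", "price2", "product3 and price"],
    })
    print(tag_words(df))
```

Without the `clip` call, a cell with more than three words would get NaN in `Ent` for the fourth word onward, because the position 3 has no entry in the dict.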
