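Capture everything before the second and third space in Pandas

I know how to split a string at the first space. My question is how to split at the second and third space and capture everything that comes before each of them.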

import pandas as pd

df = pd.DataFrame({"cid" : {0 : "cd1", 1 : "cd2", 2 : "cd3"},
                   "Name" : {0 : "John Maike Leiws", 1 : "Katie Sue Adam", 2 : "Tanaka Ubri Kse Suri"}}).set_index(['cid'])

                     Name
cid
cd1      John Maike Leiws
cd2        Katie Sue Adam
cd3  Tanaka Ubri Kse Suri

df['split_one'] = df.Name.str.split().str[0]

Expected output:

                     Name split_one    split_two       split_three
cid
cd1      John Maike Leiws      John   John Maike  John Maike Leiws
cd2        Katie Sue Adam     Katie    Katie Sue    Katie Sue Adam
cd3  Tanaka Ubri Kse Suri    Tanaka  Tanaka Ubri   Tanaka Ubri Kse

Answer from a uj5u.com user:

Slice the split result through the str accessor, then rejoin the pieces with Series.str.join:

s = df.Name.str.split()
df['split_one'] = s.str[0]
df['split_two'] = s.str[:2].str.join(' ')
df['split_three'] = s.str[:3].str.join(' ')
print (df)
                     Name split_one    split_two       split_three
cid                                                               
cd1      John Maike Leiws      John   John Maike  John Maike Leiws
cd2        Katie Sue Adam     Katie    Katie Sue    Katie Sue Adam
cd3  Tanaka Ubri Kse Suri    Tanaka  Tanaka Ubri   Tanaka Ubri Kse
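
The slice-and-join idea above generalizes to any number of leading words. The following is only a minimal sketch: the helper name first_n_words and the labels dictionary are hypothetical and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["John Maike Leiws", "Katie Sue Adam", "Tanaka Ubri Kse Suri"]},
    index=pd.Index(["cd1", "cd2", "cd3"], name="cid"),
)


def first_n_words(names: pd.Series, n: int) -> pd.Series:
    """Return the first n whitespace-separated words of each string (hypothetical helper)."""
    # Split on whitespace, keep the first n tokens, glue them back with single spaces.
    return names.str.split().str[:n].str.join(" ")


# Hypothetical column labels; any naming scheme would work.
labels = {1: "split_one", 2: "split_two", 3: "split_three"}
for n, label in labels.items():
    df[label] = first_n_words(df["Name"], n)

print(df)
```

Because slicing a list never fails on short input, a name with fewer than n words is returned whole rather than producing a missing value.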

Answer from a uj5u.com user:

A simple way to do this with a regular expression is to use nested capture groups:

df['Name'].str.extract(r'(((\S+)\s\S+)\s\S+)')

Output:

                    0            1       2
cid                                       
cd1  John Maike Leiws   John Maike    John
cd2    Katie Sue Adam    Katie Sue   Katie
cd3   Tanaka Ubri Kse  Tanaka Ubri  Tanaka

To add these as columns, just reverse the column order:

df[['split_one', 'split_two', 'split_three']] = df['Name'].str.extract(r'(((\S+)\s\S+)\s\S+)').iloc[:, ::-1]

Output:

                     Name split_one    split_two       split_three
cid                                                               
cd1      John Maike Leiws      John   John Maike  John Maike Leiws
cd2        Katie Sue Adam     Katie    Katie Sue    Katie Sue Adam
cd3  Tanaka Ubri Kse Suri    Tanaka  Tanaka Ubri   Tanaka Ubri Kse
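
To see why the column order has to be reversed, it can help to look at how the nested groups are numbered. This is just an illustrative sketch using Python's re module on one sample name; the variable names are made up for the example.

```python
import re

import pandas as pd

pattern = re.compile(r"(((\S+)\s\S+)\s\S+)")
m = pattern.match("Tanaka Ubri Kse Suri")

# Groups are numbered by the position of their opening parenthesis,
# so the outermost (longest) group comes first.
print(m.group(1))  # first three words
print(m.group(2))  # first two words
print(m.group(3))  # first word

# A name with fewer than three words does not match the pattern at all,
# so str.extract returns missing values for every group in that row.
short = pd.Series(["Katie Sue"]).str.extract(r"(((\S+)\s\S+)\s\S+)")
print(short.isna().all().all())
```

Since the outermost group is number 1, str.extract puts the three-word capture in column 0, which is why the answer flips the columns before assigning them.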

Answer from a uj5u.com user:

I don't know whether you are looking for something general or something simple. Here is a simple way.

df = pd.DataFrame({"cid" : {0 : "cd1", 1 : "cd2", 2 : "cd3"},
                   "Name" : {0 : "John Maike Leiws", 1 : "Katie Sue Adam", 2 : "Tanaka Ubri Kse Suri"}}).set_index(['cid'])

# Keep the .str accessor of the split result so each word can be picked by position
s = df.Name.str.split().str
df['split_one'] = s[0]
df['split_two'] = s[0] + ' ' + s[1]
df['split_three'] = s[0] + ' ' + s[1] + ' ' + s[2]
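
One difference from the slice-and-join approach shows up when a name has fewer words than expected. The sketch below uses a made-up two-word name (not from the question's data) to compare the two behaviors.

```python
import pandas as pd

# Sample data for illustration only; the second name has just two words.
names = pd.Series(["John Maike Leiws", "Katie Sue"])
words = names.str.split()

# Concatenating individual words: a missing third word is NaN,
# and NaN propagates through string concatenation.
concatenated = words.str[0] + " " + words.str[1] + " " + words.str[2]

# Slicing and joining keeps whatever words exist.
joined = words.str[:3].str.join(" ")

print(concatenated.tolist())
print(joined.tolist())
```

If every name is known to have at least three words, both approaches give the same result; otherwise the choice depends on whether a missing value or a shorter string is the more useful outcome.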
