Parsing list elements into multiple lists in Python

I managed to extract a list from a data source. The list elements are formatted as follows (note that the first number is not an index):

0                   cheese    100
1                   cheddar cheese    1100
2                   gorgonzola    1300
3                   smoked cheese    200

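and so on.

This means that, when printed, a line contains "0                   cheese    100", including all the spaces.

What I want to do is parse each entry and split it into two lists. I don't need the first number. Instead, I want the cheese type and the number that follows it.

For example: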

cheese
cheddar cheese
gorgonzola
smoked cheese

and:

100
1100
1300
200

The end goal is to be able to assign these two lists to columns in a pd.DataFrame so that each can be processed in its own way.

Any help is much appreciated.

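A uj5u.com user replied:

If the goal is a DataFrame, why not build it directly instead of making two lists? If you turn your strings into a Series, you can use pandas.Series.str.extract() to split them into the columns you want: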

import pandas as pd

s = '''0                   cheese    100
1                   cheddar cheese    1100
2                   gorgonzola    1300
3                   smoked cheese    200'''

pd.Series(s.split('\n')).str.extract(r'.*?\s+(?P<type>.*?)\s+(?P<value>\d+)')

This gives a DataFrame:

    type             value
0   cheese           100
1   cheddar cheese   1100
2   gorgonzola       1300
3   smoked cheese    200
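
One detail worth knowing: the columns returned by str.extract() hold strings, so value is text rather than a number. A minimal sketch of converting it, assuming every line matches the pattern (a line that does not match produces a missing value, which would need handling before the conversion); the variable names lines and pattern are illustrative:

```python
import pandas as pd

lines = [
    "0                   cheese    100",
    "1                   cheddar cheese    1100",
    "2                   gorgonzola    1300",
    "3                   smoked cheese    200",
]

# Skip the leading number, then capture the name and the trailing digits.
pattern = r'.*?\s+(?P<type>.*?)\s+(?P<value>\d+)'
df = pd.Series(lines).str.extract(pattern)

# Extracted groups are strings; convert the numeric column before doing arithmetic.
df['value'] = df['value'].astype(int)

print(df.dtypes)
print(df['value'].sum())
```

After the conversion, numeric operations such as sum() or sorting by value behave as expected.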

A uj5u.com user replied:

IIUC, your strings are the elements of a list. You can use re.split to split wherever two or more whitespace characters occur:

import re
import pandas as pd

your_list = [
  "0                   cheese    100",
  "1                   cheddar cheese    1100",
  "2                   gorgonzola    1300",
  "3                   smoked cheese    200",
]

df = pd.DataFrame([re.split(r'\s{2,}', s)[1:] for s in your_list], columns=["type", "value"])

Output:

             type value
0          cheese   100
1  cheddar cheese  1100
2      gorgonzola  1300
3   smoked cheese   200
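
If the two plain lists from the question are wanted rather than a DataFrame, the same split can feed them directly. A minimal sketch, assuming every entry has exactly three fields separated by two or more spaces; the names rows, cheese_types and values are illustrative:

```python
import re

your_list = [
    "0                   cheese    100",
    "1                   cheddar cheese    1100",
    "2                   gorgonzola    1300",
    "3                   smoked cheese    200",
]

# Split on runs of two or more whitespace characters and drop the leading number.
rows = [re.split(r'\s{2,}', s.strip())[1:] for s in your_list]

# Transpose the (name, number) pairs into two parallel lists.
cheese_types, values = (list(col) for col in zip(*rows))

# The numbers are still strings at this point; convert them to integers.
values = [int(v) for v in values]

print(cheese_types)
print(values)
```

Splitting on two or more spaces is what keeps multi-word names such as "cheddar cheese" intact, since the words inside a name are separated by a single space.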

A uj5u.com user replied:

I think something along these lines might work:

import pandas as pd
import re
mylist=['0 cheese 100','1 cheddar cheese 200']


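numbers = '[0-9]'

# The last whitespace-separated token of each entry is the trailing number.
list1 = [i.split()[-1] for i in mylist]
# Removing every digit and trimming the surrounding spaces leaves the cheese name.
list2 = [re.sub(numbers, '', i).strip() for i in mylist]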


your_df=pd.DataFrame({'name1':list1,'name2':list2})
your_df
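
Removing every digit also changes names that contain digits themselves. A sketch of a stricter variant, assuming each entry starts with a number and ends with a number; the entry '2 gouda 24 months 300' is hypothetical data made up for illustration, and the names entry, names and values are not from the answer:

```python
import re

# The last entry is hypothetical: its name contains digits.
mylist = ['0 cheese 100', '1 cheddar cheese 200', '2 gouda 24 months 300']

# Anchored pattern: leading number, name (matched lazily), trailing number.
entry = re.compile(r'^\s*\d+\s+(?P<name>.*?)\s+(?P<value>\d+)\s*$')

names, values = [], []
for item in mylist:
    m = entry.match(item)
    if m is None:
        continue  # skip lines that do not fit the assumed layout
    names.append(m.group('name'))
    values.append(int(m.group('value')))

print(names)
print(values)
```

Because the pattern is anchored at both ends, only the first and last numbers are treated as separate fields and any digits in between stay part of the name.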

A uj5u.com user replied:

I can suggest this simple solution:

lines = [
         "1                   cheddar cheese    1100 ",
         "2                   gorgonzola    1300 ",
         "3                   smoked cheese    200",
        ]

for line in lines:
  words = line.strip().split()
  print( ' '.join( words[1:-1]), words[-1])

Result:

cheddar cheese 1100
gorgonzola 1300
smoked cheese 200

A uj5u.com user replied:

If you have:

text = '''0                   cheese    100
1                   cheddar cheese    1100
2                   gorgonzola    1300
3                   smoked cheese    200'''

# OR

your_list = [
 '0                   cheese    100',
 '1                   cheddar cheese    1100',
 '2                   gorgonzola    1300',
 '3                   smoked cheese    200'
]

text = '\n'.join(your_list)

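then do:

from io import StringIO
import pandas as pd

# Fields are separated by two or more whitespace characters; a regex separator needs the python engine
df = pd.read_csv(StringIO(text), sep=r'\s\s+', names=['col1', 'col2'], engine='python')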
print(df)

Output:

             col1  col2
0          cheese   100
1  cheddar cheese  1100
2      gorgonzola  1300
3   smoked cheese   200

This treats the first number as the index, but you can reset it with df = df.reset_index(drop=True) if needed.

A uj5u.com user replied:

You can achieve this by using slicing:


inList = ['0                   cheese    100', '1                   cheddar cheese    1100', '2                   gorgonzola    1300', '3                   smoked cheese    200']

cheese = []
prices = []

for i in inList:
    # Drop the leading number and the padding (indices 0-19), then reverse what is left
    temp = i[:19:-1]
    counter = 0
    counter2 = 0
    for char in temp:  # temp is reversed, so the number (e.g. '100' for 'cheese') comes first, reversed
        if char.isdigit():
            counter += 1
        else:  # the first non-digit character marks the end of the number
            # The number occupies positions 0 to counter; flip it back and store it
            prices.append(temp[:counter][::-1])

            # Everything after the number is the reversed cheese name, preceded by spaces
            cheeseWithSpace = temp[counter:]
            for char in cheeseWithSpace:
                if char == ' ':  # count the spaces in front of the name
                    counter2 += 1
                else:  # first non-space character: the cheese name begins here
                    # Cut off the spaces, flip the name back and store it
                    cheese.append(cheeseWithSpace[counter2:][::-1])
                    break
            break

print(prices)
print(cheese)

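See the comments in the code to understand the approach. Basically, you reverse the strings with [::-1] to make them easier to process, then strip off each part one at a time.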

Please credit the source when reposting. Link to this article: https://www.uj5u.com/caozuo/525825.html
