Appending strings to an empty Pandas column in a for loop

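This code uses OCR to read text from the URLs in the list "url_list". I am trying to append the output, the string 'txt', to an empty Pandas column 'url_text'. However, the code does not append anything to the "url_text" column. Why?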

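import cv2
import pandas as pd
import pytesseract
from skimage import io  # assumption: io.imread below is scikit-image's; the original omits its imports

df = pd.read_csv(r'path') # main dataframe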

df['url_text'] = "" # create empty column that will later contain the text of the url_image

url_list = (df.iloc[:, 5]).tolist() # convert column with urls to a list 

print(url_list)

['https://pbs.twimg.com/media/ExwMPFDUYAEHKn0.jpg', 
'https://pbs.twimg.com/media/ExuBd4-WQAMgTTR.jpg', 
'https://pbs.twimg.com/media/ExuBd5BXMAU2-p_.jpg', 
' ',
'https://pbs.twimg.com/media/Ext0Np0WYAEUBXy.jpg', 
'https://pbs.twimg.com/media/ExsJrOtWUAMgVxk.jpg', 
'https://pbs.twimg.com/media/ExrGetoWUAEhOt0.jpg',
' ',
' ']

for img_url in url_list: # loop over all urls in list url_list
    try:
        img = io.imread(img_url) # convert image/url to cv2/numpy.ndarray format

        # Preprocessing of image
        gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        (h, w) = gry.shape[:2]
        gry = cv2.resize(gry, (w*3, h*3))
        thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

        txt = pytesseract.image_to_string(thr)  # read tweet image text

        df['url_text'].append(txt)

        print(txt)
    except: # ignore any errors. Some of the rows do not contain a URL, causing the loop to fail
        pass

print(df)

A uj5u.com user replied:

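I can't test this, but try the following: you may need to build the list first and then add it to df as a new column (I convert the list itself to a DataFrame and then concatenate it with the original df).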

txtlst=[]
for img_url in url_list: # loop over all urls in list url_list
    try:
        img = io.imread(img_url) # convert image/url to cv2/numpy.ndarray format

        # Preprocessing of image
        gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        (h, w) = gry.shape[:2]
        gry = cv2.resize(gry, (w*3, h*3))
        thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

        txt = pytesseract.image_to_string(thr)  # read tweet image text
        txtlst.append(txt)


        print(txt)
    except Exception: # some rows do not contain a URL; store an empty string so the list stays aligned with df
        txtlst.append("")
dftxt=pd.DataFrame({"url_text":txtlst})
df=pd.concat([df, dftxt], axis=1)
print(df)
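
One caveat worth checking, sketched here with made-up data: if df already has a 'url_text' column (the question creates one before the loop), pd.concat with axis=1 keeps both columns under the same name. Assigning the list directly overwrites the existing column instead. The frame contents and the names combined and assigned are assumptions for illustration only.

```python
import pandas as pd

# Made-up stand-in for the question's dataframe, including the
# empty 'url_text' column it creates before the loop
df = pd.DataFrame({"url": ["u1", "u2", "u3"]})
df["url_text"] = ""

txtlst = ["a", "", "c"]  # placeholder OCR results, one per row

# concat along axis=1 does not merge same-named columns
dftxt = pd.DataFrame({"url_text": txtlst})
combined = pd.concat([df, dftxt], axis=1)
print(list(combined.columns))

# Direct assignment replaces the existing column in place
assigned = df.copy()
assigned["url_text"] = txtlst
print(list(assigned.columns))
```

Dropping the line that pre-creates the empty column is another way to avoid the duplicate.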

A uj5u.com user replied:

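As the documentation for Series.append() points out, the append call only works between two Series. Passing a plain string raises an exception, which the bare except in the loop silently swallows, so the column never changes.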
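
A minimal sketch of why the question's loop leaves the column untouched, using a made-up two-row frame and placeholder strings in place of real OCR output. The exception type depends on the installed pandas version (Series.append was removed in pandas 2.0), so the sketch records the exception name rather than assuming one.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe
df = pd.DataFrame({"url": ["first.jpg", "second.jpg"]})
df["url_text"] = ""

errors = []
for txt in ["text one", "text two"]:  # placeholder OCR results
    try:
        # Series.append is not list.append: it never accepted a plain string
        df["url_text"].append(txt)
    except Exception as exc:
        # The question's bare `except: pass` hides this failure
        errors.append(type(exc).__name__)

print(errors)  # one exception name per iteration
print(df)      # 'url_text' still holds empty strings
```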

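A better approach is to create an empty list outside the loop, append each string to that list inside the loop, and then assign the list to the column with df["url_list"] = list_of_urls. This is also much faster at runtime than repeatedly appending two Series together.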

url_list = []

for ...:
    ...
    url_list.append(url_text)

df["url_list"] = url_list
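
The pattern above can be sketched end to end as follows. The read_text helper is a hypothetical stand-in for the download and OCR steps so that the example runs without network access or Tesseract; the sample URLs and the name texts are likewise made up for illustration.

```python
import pandas as pd


def read_text(img_url):
    """Hypothetical stand-in for the io.imread + pytesseract steps."""
    if not img_url.strip():
        raise ValueError("row has no URL")
    return "text from " + img_url


df = pd.DataFrame({"url": ["a.jpg", " ", "b.jpg"]})  # made-up sample rows

texts = []
for img_url in df["url"]:
    try:
        texts.append(read_text(img_url))
    except ValueError:
        # Append a placeholder so the list keeps one entry per row
        texts.append("")

# One assignment after the loop; the list length must match len(df)
df["url_text"] = texts
print(df)
```

Catching only the expected exception, rather than using a bare except, keeps unrelated bugs visible while still skipping rows without a URL.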
