How to use Python and BeautifulSoup to split a name into 3 cells for an Excel sheet

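I'm trying to scrape names and import them into an Excel sheet for later use. The problem is that I need them in 3 separate cells: first, last, and initial. The script searches for the keyword "EST OF" and prints the whole line, which contains the full name followed by "EST OF". I need it to:

Remove "EST OF" from the end.
Split the full name into 3 parts so it can be exported to the worksheet.

Here's the code: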

#!python
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import NoSuchElementException
from random import randint
import pickle
import datetime
import os
import time
import sys
import openpyxl
from openpyxl import Workbook
import re

url = 'https://www.miamidade.gov/global/home.page'

current_time = datetime.datetime.now()
current_time.strftime("%m/%d/%Y")
options = webdriver.ChromeOptions()
options.headless = True
chromedriver = "chromedriver.exe"
number = "2080"
driver = webdriver.Chrome(chromedriver) #chromedriver
driver.get(url)
pickle.dump(driver.get_cookies() , open("cookies.pkl","wb"))
time.sleep(3)
nav1 = driver.find_element_by_xpath('/html/body/div[2]/div/div[1]/div/header/div[2]/nav/div/div[1]/div/div[1]/a').click()
time.sleep(1)
nav2 = driver.find_element_by_xpath('/html/body/div[2]/div/div[1]/div/header/div[2]/div[2]/div/div/div/ul/li[1]/button').click()
propsrch1 = driver.find_element_by_xpath('/html/body/div[2]/div/div[1]/div/header/div[2]/div[2]/div/div/div/ul/li[1]/ul/li[2]/ul/li[5]/a').click()

time.sleep(2)
propsrch2 = driver.find_element_by_xpath('/html/body/div[2]/div/main/div[2]/div/div[2]/div/div[1]/div[1]/ul/li[1]/span/a').click()
time.sleep(5)



subdivision = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[1]/ul/li[3]/a').click()
searchbar = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[1]/div[2]/div[2]/div/div[3]/div/input')
time.sleep(2)
searchbar.send_keys("RICHMOND HGTS")
search = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[1]/div[2]/div[2]/div/div[3]/div/span/button/span').click()
time.sleep(10)
table = driver.find_element_by_xpath('/html/body/div/div[2]/div[3]/div[1]/div[2]/div[4]/a').click()
main_window_handle = None
while not main_window_handle:
    main_window_handle = driver.current_window_handle
#driver.find_element_by_xpath(u'//a[text()="click here"]').click()
signin_window_handle = None
while not signin_window_handle:
    for handle in driver.window_handles:
        if handle != main_window_handle:
            signin_window_handle = handle
            break
driver.switch_to.window(signin_window_handle)
time.sleep(20)
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')

keyword = 'est of'
#keywords = soup.find(keyword)
counts = soup.find_all(text=re.compile("EST OF"))
for count in counts:
    print(count)

Right now it prints to cmd so I can see that it works. The output looks like this:

GRACE K ROLLE EST OF    
ETHEL H FIFE EST OF 
BARBARA J BROUSSARD EST OF  
CLEMENTINA D RAHMING EST OF 
CHARLES B  CAMBRIDGE JR EST OF  
EMILY STATEN EST OF 
HATTIE S KING  EST OF

What is the best way to split the names?

A uj5u.com user replied:

You can use the split method to split the line on spaces:

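for count in counts:
    # split() with no argument also collapses repeated spaces
    parts = count.split()
    # assumes every line starts with first name, middle initial, last name
    first_name = parts[0]
    mid_name = parts[1]
    last_name = parts[2]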
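
The fixed indexes above only work when every line has exactly a first name, a middle initial, and a last name. In the sample output, EMILY STATEN has no initial and CHARLES B CAMBRIDGE JR has a suffix. A minimal sketch that tolerates both is shown here; the function name `split_name` and the rule "a one-letter second word is the initial" are assumptions made for illustration, not something from the original question.

```python
def split_name(raw):
    """Return (first, initial, last) from a line such as 'GRACE K ROLLE EST OF'."""
    words = raw.split()               # collapses repeated spaces and trims the ends
    if words[-2:] == ["EST", "OF"]:   # drop the trailing marker if present
        words = words[:-2]
    if not words:
        return ("", "", "")
    first = words[0]
    # Assumption: a one-letter second word is the middle initial.
    if len(words) > 2 and len(words[1]) == 1:
        initial = words[1]
        last = " ".join(words[2:])    # keeps a suffix such as 'JR' with the last name
    else:
        initial = ""
        last = " ".join(words[1:])
    return (first, initial, last)


if __name__ == "__main__":
    for line in ["GRACE K ROLLE EST OF    ", "EMILY STATEN EST OF "]:
        print(split_name(line))
```

Returning a fixed 3-tuple means each name always fills the same three columns, even when the initial is missing.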

Another uj5u.com user replied:

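If you know the name is always 3 words separated by spaces, you can use count.split()[:3]. Calling split() with no argument is safer than split(' ') here, because some lines contain double spaces and split(' ') would return empty strings for them.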

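If you don't know how many words the name has, remove the trailing "EST OF" first and then split. Be careful with count.rstrip('EST OF'): rstrip treats its argument as a set of characters, not as a suffix, so it also eats trailing name letters such as the final E of ROLLE. On Python 3.9+ you can use count.strip().removesuffix('EST OF').split(); on older versions, re.sub(r'\s*EST OF\s*$', '', count).split() does the same job.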
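
Since the goal is an Excel sheet, here is a rough sketch of removing the suffix and writing one word per cell with openpyxl, which the question's script already imports. The helper names `strip_suffix`, `name_rows`, and `save_rows`, and the file name `names.xlsx`, are made up for this example.

```python
import re

# Matches "EST OF" at the end of the line, plus any whitespace around it.
_SUFFIX = re.compile(r"\s*EST OF\s*$")


def strip_suffix(line):
    """Remove a trailing 'EST OF' as a whole suffix, not character by character."""
    return _SUFFIX.sub("", line).strip()


def name_rows(lines):
    """Turn scraped lines into lists of words, one list per spreadsheet row."""
    return [strip_suffix(line).split() for line in lines]


def save_rows(rows, path="names.xlsx"):
    """Write each row to the active worksheet, one word per cell."""
    from openpyxl import Workbook  # third-party: pip install openpyxl

    wb = Workbook()
    ws = wb.active
    for row in rows:
        ws.append(row)  # append() adds the list as the next row
    wb.save(path)


if __name__ == "__main__":
    scraped = ["GRACE K ROLLE EST OF    ", "HATTIE S KING  EST OF"]
    save_rows(name_rows(scraped))
```

In the question's script, `counts` could be passed to `name_rows` directly, since the strings returned by `soup.find_all` behave like ordinary `str` objects. Names without an initial or with a suffix like JR will produce fewer or more than three cells, so map them onto fixed columns first if the layout matters.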
