
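The first step of the CNKI crawler: entering the search conditions
Selenium simulates mouse clicks to automate a series of actions: choosing the field for each search term, typing the term, choosing exact or fuzzy matching, setting the logical operator, and clicking the search button.
All you need to do is supply the search conditions. Each clause has the form field:term(match mode), and clauses are joined by AND, OR, or NOT. In the example below, 摘要 is the abstract field and 精確 means exact matching: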
search_words = '摘要:地理探測器(精確) OR 摘要:geodetector(精確)'
First, convert the search conditions into 4-tuples: (logical operator, search field, search term, exact|fuzzy)
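# 'BEG' marks the first condition, which has no preceding logical operator
search_words = 'BEG '+search_words
pieces = search_words.split(' ')
conditions = []
for p in pieces:
    if p in ['BEG', 'OR', 'AND','NOT']:
        # a logical operator starts a new condition
        conditions.append([p])
    else:
        # '摘要:geodetector(精確)' -> ['摘要', 'geodetector', '精確']
        conditions[-1] += p.replace(')','').replace('(',':').split(':')
print(conditions)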
'''
[['BEG', '摘要', '地理探測器', '精確'], ['OR', '摘要', 'geodetector', '精確']]
'''
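The parsing step above can also be wrapped in a small function so that malformed input fails early. This is a minimal sketch rather than the author's code: the name parse_conditions and the error checks are assumptions added for illustration. It assumes that search terms contain no spaces and that clauses use the full-width colon and ASCII parentheses exactly as in the example.

```python
LOGICAL_TOKENS = ('BEG', 'OR', 'AND', 'NOT')

def parse_conditions(search_words):
    """Split a query string into [logic, field, term, match] lists."""
    conditions = []
    # 'BEG' marks the first condition, which has no preceding operator.
    for piece in ('BEG ' + search_words).split():
        if piece in LOGICAL_TOKENS:
            conditions.append([piece])
        else:
            # '摘要:geodetector(精確)' -> ['摘要', 'geodetector', '精確']
            parts = piece.replace(')', '').replace('(', ':').split(':')
            if len(parts) != 3:
                raise ValueError('expected field:term(match), got %r' % piece)
            conditions[-1] += parts
    # Every condition must end up with exactly four elements.
    for cond in conditions:
        if len(cond) != 4:
            raise ValueError('incomplete condition: %r' % cond)
    return conditions

print(parse_conditions('摘要:geodetector(精確) AND 篇名:GIS(模糊)'))
# [['BEG', '摘要', 'geodetector', '精確'], ['AND', '篇名', 'GIS', '模糊']]
```

A clause without a match mode, or two operators in a row, raises ValueError instead of producing a short list that would only fail later during the clicking stage.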
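Then the series of clicks begins. The dictionaries below map each search field and match mode to the value attribute used on the page, and each logical operator to its position in the dropdown: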
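search_type = {
    "主題":"SU",       # subject
    "篇關摘":"TKA",    # title, keywords and abstract
    "關鍵詞":"KY",     # keywords
    "篇名":"TI",       # title
    "全文":"FT",       # full text
    "作者":"AU",       # author
    "第一作者":"FI",   # first author
    "通訊作者":"RP",   # corresponding author
    "作者單位":"AF",   # author affiliation
    "基金":"FU",       # fund
    "摘要":"AB",       # abstract
    "小標題":"CO",     # subheading
    "參考文獻":"RF",   # references
    "分類號":"CLC",    # classification number
    "文獻來源":"LY",   # source
    "DOI":"DOI"
}
search_fuzzy = {
    "精確" : "=",      # exact match
    "模糊" :"%"        # fuzzy match
}
logical_id = {
    "AND": 0,
    "OR":1,
    "NOT":2
}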
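One way to use these three tables is to translate each parsed condition into the values the page expects before any clicking starts, so that a typo in a field name surfaces as a clear error rather than a failed element lookup. The sketch below is only an illustration: resolve_condition is a hypothetical helper, not part of the original script, and only a few entries of each table are repeated so that the block runs on its own.

```python
# Subset of the lookup tables from the article, repeated so this runs standalone.
search_type = {"主題": "SU", "篇名": "TI", "摘要": "AB"}
search_fuzzy = {"精確": "=", "模糊": "%"}
logical_id = {"AND": 0, "OR": 1, "NOT": 2}

def resolve_condition(condition):
    """Map [logic, field, term, match] to (option index, field code, term, match code).

    The option index is None for the first row ('BEG'), which has no operator.
    """
    logic, field, term, match = condition
    if logic != 'BEG' and logic not in logical_id:
        raise KeyError('unknown logical operator: %r' % logic)
    if field not in search_type:
        raise KeyError('unknown search field: %r' % field)
    if match not in search_fuzzy:
        raise KeyError('unknown match mode: %r' % match)
    return (logical_id.get(logic), search_type[field], term, search_fuzzy[match])

print(resolve_condition(['OR', '摘要', 'geodetector', '精確']))
# (1, 'AB', 'geodetector', '=')
```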
import time
from selenium.webdriver.common.by import By

sleep_time = 0.5  # pause (seconds) after each click so the dropdown can render
# driver is the Selenium WebDriver that already has the search page open
search_middle = driver.find_element(By.CLASS_NAME, 'search-middle')
dds = search_middle.find_elements(By.TAG_NAME, 'dd')  # one <dd> per condition row
if len(dds) < len(conditions):
    pass  # more conditions than rows on the page; adding rows is not handled here
for i in range(len(conditions)):
    if i > 0:
        # choose the logical operator (AND / OR / NOT) for every row after the first
        logical_list = dds[i].find_element(By.XPATH, './/div[@class="sort logical"]')
        logical_list.click()
        time.sleep(sleep_time)
        options = logical_list.find_elements(By.XPATH, './/a')
        options[logical_id[conditions[i][0]]].click()
        time.sleep(sleep_time)
    # choose the search field
    dds[i].find_element(By.XPATH, './/div[@class="sort reopt"]').click()
    time.sleep(sleep_time)
    dds[i].find_element(
        By.XPATH, './/a[@value="{}"]'.format(search_type[conditions[i][1]])
    ).click()
    time.sleep(sleep_time)
    # type the search term
    term_input = dds[i].find_element(By.TAG_NAME, 'input')
    term_input.clear()
    term_input.send_keys(conditions[i][2])
    # choose exact or fuzzy matching
    dds[i].find_element(By.XPATH, './/div[@class="sort special"]')\
        .find_element(By.CLASS_NAME, 'sort-default').click()
    time.sleep(sleep_time)
    dds[i].find_element(
        By.XPATH, './/a[@value="{}"]'.format(search_fuzzy[conditions[i][3]])
    ).click()
    time.sleep(sleep_time)
# submit the search and give the results page time to load
driver.find_element(By.CLASS_NAME, 'search-buttons').click()
time.sleep(5)
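The fixed time.sleep pauses above work, but they waste time on a fast page and can still fail on a slow one. A common alternative is to poll until the thing you are waiting for is actually there. The helper below is a minimal, library-independent sketch of that idea. The name wait_until and the fake page are assumptions made for illustration; in a real script, Selenium's own WebDriverWait class serves the same purpose.

```python
import time

def wait_until(probe, timeout=10.0, poll=0.5):
    """Call probe() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f s' % timeout)
        time.sleep(poll)

# Stand-in for a page whose rows appear only on the third check.
attempts = []
def fake_rows():
    attempts.append(1)
    return ['dd'] * 2 if len(attempts) >= 3 else []

rows = wait_until(fake_rows, timeout=2.0, poll=0.01)
print(len(rows), len(attempts))
# 2 3
```

In the crawler, probe would be a small function that looks up the dd rows or the dropdown options and returns them once they exist.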

The complete code is free to download after following the author:
https://download.csdn.net/download/itnerd/12832133
Note: the code was written in Jupyter Notebook, a handy tool that is well worth learning.
