Check one column of lists against another column of lists and return values for multiple columns

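I am comparing the list values in sb_list against those in psr_list. If every item of a list in sb_list['ASINs'] is found in any single list from psr_list['Child ASIN'], sb_list['bucket'] is marked 'clean'. This part of the code works fine.

The problem I'm having is populating sb_list['Group']. If ['bucket'] is marked 'clean', sb_list['Group'] should equal the psr_list['Group'] value of the row where the match was found.

I'm trying to run a function that checks each list in sb_list['ASINs'] against every list in psr_list and returns a tuple: the first value is 'clean' or 'mixed', and the second value is the psr_list['Group'] value of the matching row, if a match was found.

This is similar to another question I asked a few weeks ago, but different enough that I think it deserves its own post.

Data: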

import pandas as pd

list1 = [
    ['1', ['hi', 'there', '10', '14', '15']],
    ['2',  ['7', '13', '25', '46', '50']],
    ['3',  ['hello', 'du', '6', '19', '36']],
    ['4',  ['hi', '19', '24', '26', '29']]]

psr_list = pd.DataFrame(list1, columns =['Group', 'Child ASIN']) 

list2 = [
    ['a', ['hi', 'there']],
    ['r',  ['hello', 'du', 'th']],
    ['e',  ['hello', '9']],
    ['f',  ['hello', '6', '36']],
    ['w',  ['hello', '6', '37']],
    ['a',  ['24', '29']],
    ['q',  ['hi', '14', '15']]]

sb_list = pd.DataFrame(list2, columns =['camp', 'ASINs']) 
sb_list['bucket'] = ""
sb_list['Group'] = ""

My attempt:

def process(psr_asin_list, sb_ap_asin_list):
  return [compare(psr_asin_list, sb_sp_row) for sb_sp_row in sb_ap_asin_list]

def compare(psr_asin_list, sb_sp_row):
  counter = 0
  while counter < psr_asin_list.shape[0]:
    if all(asins in psr_asin_list[counter] for asins in sb_sp_row): return ('clean', psr_asin_list['Group'])
    counter += 1
  return ('mixed', '')


sb_list['bucket'] = process(psr_list['Child ASIN'].to_numpy(), sb_list['ASINs'].to_numpy())[0]
sb_list['Group'] = process(psr_list['Child ASIN'].to_numpy(), sb_list['ASINs'].to_numpy())[1]

Expected output:

  camp            ASINs bucket Group
0    a      [hi, there]  clean     1
1    r  [hello, du, th]  mixed
2    e       [hello, 9]  mixed
3    f   [hello, 6, 36]  clean     3
4    w   [hello, 6, 37]  mixed
5    a         [24, 29]  clean     4
6    q     [hi, 14, 15]  clean     1

Answer from a uj5u.com user:

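You can use set.issubset in a list comprehension to check whether each list in sb_list is contained in any list in psr_list. If such a list exists, take the 'Group' value from the row where it was found; otherwise fill in ''. Note that this assumes only one list in psr_list contains a given list from sb_list; if several do, the first match is used.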
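As a quick illustration of the subset check itself, here is a minimal sketch. The variable names are hypothetical and the values simply mirror the sample data in the question.

```python
# Hypothetical sample values mirroring the question's data.
child_asins = ['hi', 'there', '10', '14', '15']

# set.issubset accepts any iterable, so the right-hand side can stay a list.
full_match = set(['hi', 'there']).issubset(child_asins)
partial_match = set(['hi', '99']).issubset(child_asins)

print(full_match)     # every item is present in child_asins
print(partial_match)  # '99' is missing, so this is not a subset
```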

bucket is then filled in depending on whether a 'Group' value was found:

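import numpy as np

def get_group(asin):
    # Boolean mask: True for each psr_list row whose 'Child ASIN' list contains every item of asin
    mask = [set(asin).issubset(y) for y in psr_list['Child ASIN'].tolist()]
    group = psr_list.loc[mask, 'Group']
    # Take the first matching 'Group', or '' when no row matches
    return group.iat[0] if not group.empty else ''

sb_list['Group'] = sb_list['ASINs'].apply(get_group)
sb_list['bucket'] = np.where(sb_list['Group']=='', 'mixed', 'clean')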

Output:

  camp            ASINs bucket Group
0    a      [hi, there]  clean     1
1    r  [hello, du, th]  mixed      
2    e       [hello, 9]  mixed      
3    f   [hello, 6, 36]  clean     3
4    w   [hello, 6, 37]  mixed      
5    a         [24, 29]  clean     4
6    q     [hi, 14, 15]  clean     1
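For readers who prefer the tuple-returning shape of the original attempt, the same result can probably be reached in a single pass. The following is only a sketch under assumptions: the helper name classify is hypothetical, the data is copied from the question, and when several psr_list rows match, the first one wins.

```python
import pandas as pd

# Data copied from the question.
psr_list = pd.DataFrame(
    [['1', ['hi', 'there', '10', '14', '15']],
     ['2', ['7', '13', '25', '46', '50']],
     ['3', ['hello', 'du', '6', '19', '36']],
     ['4', ['hi', '19', '24', '26', '29']]],
    columns=['Group', 'Child ASIN'])

sb_list = pd.DataFrame(
    [['a', ['hi', 'there']],
     ['r', ['hello', 'du', 'th']],
     ['e', ['hello', '9']],
     ['f', ['hello', '6', '36']],
     ['w', ['hello', '6', '37']],
     ['a', ['24', '29']],
     ['q', ['hi', '14', '15']]],
    columns=['camp', 'ASINs'])

def classify(asins):
    """Return (bucket, group) for one sb_list row (hypothetical helper)."""
    wanted = set(asins)
    # Walk Group and Child ASIN together so the matching Group is at hand.
    for group, child_asins in zip(psr_list['Group'], psr_list['Child ASIN']):
        if wanted.issubset(child_asins):
            return 'clean', group  # assumption: the first matching row wins
    return 'mixed', ''

# Build one (bucket, group) tuple per row, then unzip into two columns.
results = [classify(asins) for asins in sb_list['ASINs']]
buckets, groups = zip(*results)
sb_list['bucket'] = list(buckets)
sb_list['Group'] = list(groups)
print(sb_list)
```

Pairing the two psr_list columns with zip keeps the group label next to the list being tested, which is the piece the original attempt lost by passing only the 'Child ASIN' array to the function.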

