List comprehension: comparing one list against another by index

I have a list containing lists of tweets:

twitter_dataset_list = [['322185112684994561', '@Bill_Porter nice to know that your site is back :-)'], ['322185112684994545', 'I had a bad day']]

I want to compare the message in each element against the following lists to see whether it is positive or negative:

positive_keyword_list = ['nice']

negative_keyword_list = ['bad']

If a tweet is positive or negative, I want to append a flag to its inner list, like this:

[['322185112684994561', '@Bill_Porter nice to know that your site is back :-)', 1], ['322185112684994545', 'I had a bad day', -1]]

This is what I have so far, but I am not sure how to iterate and index into the sublists:

for element in twitter_dataset_list:
    if any(word in twitter_dataset_list[0][1] for word in positive_keyword_list) == True:
        twitter_dataset_list.append('1')
    elif any(word in twitter_dataset_list[0][1] for word in negative_keyword_list) == True:
        twitter_dataset_list.append('-1')
    else:
        twitter_dataset_list[0][1].append('0')

print(twitter_dataset_list)

So how do I iterate over twitter_dataset_list?

Answer from a uj5u.com user:

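First, the enumerate function is useful here, because it gives you both the index and the value as you iterate over the list.

Second, you can unpack each element on the fly with the for i, (id, text) in syntax.

Finally, you can use _, by convention, for any unpacked value that you do not actually use in the loop. (Here I do not need the ID, so I name it _ to signal that it is unused.)

For more details on the different ways to unpack in loops, see the Data Structures section of the Python documentation.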

for i, (_, text) in enumerate(twitter_dataset_list):
    if any(word in text for word in positive_keyword_list):
        twitter_dataset_list[i].append(1)
    elif any(word in text for word in negative_keyword_list):
        twitter_dataset_list[i].append(-1)
    else:
        twitter_dataset_list[i].append(0)
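
To see the loop above run end to end, here is a self-contained sketch that uses the sample data from the question. The wrapper function label_in_place is a hypothetical name introduced only for this illustration; it is not part of the original answer.

```python
def label_in_place(rows, positive, negative):
    """Append 1, -1, or 0 to each [id, text] row, mutating the rows."""
    for i, (_, text) in enumerate(rows):
        # `word in text` is a substring test, so 'nice' also matches 'nicer'.
        if any(word in text for word in positive):
            rows[i].append(1)
        elif any(word in text for word in negative):
            rows[i].append(-1)
        else:
            rows[i].append(0)
    return rows


twitter_dataset_list = [
    ['322185112684994561', '@Bill_Porter nice to know that your site is back :-)'],
    ['322185112684994545', 'I had a bad day'],
]
label_in_place(twitter_dataset_list, ['nice'], ['bad'])
print(twitter_dataset_list)
```

Because the loop mutates each row, running it a second time on the same list raises a ValueError: each row then holds three elements, which no longer unpack into (_, text).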

Answer from a uj5u.com user:

I would suggest not modifying the original data, and building a new list instead:

positive_keyword_set = {"nice",}
negative_keyword_set = {"bad",}

tweets_with_sentiments = []
for tweet_id, tweet in twitter_dataset_list:
    sentiment = 0
    words = tweet.lower().split()
    if negative_keyword_set.intersection(words):
        sentiment = -1
    elif positive_keyword_set.intersection(words):
        sentiment = 1
    tweets_with_sentiments.append([tweet_id, tweet, sentiment])

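Note that I also converted your keyword lists to sets. This allows O(1) lookups, because the values stored in a set are hashed. It also lets you simply call set.intersection() with the words of the tweet to find the keywords: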

>>> tweet = '@Bill_Porter nice to know that your site is back :-)'

>>> tweet.lower().split()
['@bill_porter',
 'nice',
 'to',
 'know',
 'that',
 'your',
 'site',
 'is',
 'back',
 ':-)']

>>> positive_keyword_set.intersection(tweet.split())
{'nice'}
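
One case the whitespace split above does not cover is punctuation attached to a word, such as 'nice!'. The following is a minimal sketch, assuming it is acceptable to strip ASCII punctuation from both ends of every token; the helper name tokenize is hypothetical and does not appear in the original answer.

```python
import string


def tokenize(tweet):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (token.strip(string.punctuation) for token in tweet.lower().split())
    # Drop tokens that were pure punctuation, such as ':-)'.
    return {token for token in tokens if token}


positive_keyword_set = {"nice"}

plain_words = "That was nice!".lower().split()
print(positive_keyword_set.intersection(plain_words))                 # set()
print(positive_keyword_set.intersection(tokenize("That was nice!")))  # {'nice'}
```

Note that stripping also turns '@bill_porter' into 'bill_porter', which may or may not be what you want for mentions and hashtags.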

In fact, I would suggest using a dict to store the tweet sentiments:

positive_keyword_set = {"nice",}
negative_keyword_set = {"bad",}

tweets_with_sentiments = {}
for tweet_id, tweet in twitter_dataset_list:
    sentiment = 0
    if negative_keyword_set.intersection(tweet.split()):
        sentiment = -1
    elif positive_keyword_set.intersection(tweet.split()):
        sentiment = 1
    tweets_with_sentiments[int(tweet_id)] = dict(tweet=tweet, sentiment=sentiment)

Your data structure can now be accessed in O(1) by tweet ID:

>>> tweets_with_sentiments
{322185112684994561: {'tweet': '@Bill_Porter nice to know that your site is back :-)', 'sentiment': 1},
 322185112684994545: {'tweet': 'I had a bad day', 'sentiment': -1}}

>>> tweets_with_sentiments[322185112684994561]["sentiment"]
1
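
With this dict layout, selecting tweets by sentiment becomes a short comprehension. The sketch below rebuilds the sample output shown above by hand; the name negative_ids is only illustrative.

```python
tweets_with_sentiments = {
    322185112684994561: {
        'tweet': '@Bill_Porter nice to know that your site is back :-)',
        'sentiment': 1,
    },
    322185112684994545: {'tweet': 'I had a bad day', 'sentiment': -1},
}

# Collect the IDs of every tweet flagged as negative.
negative_ids = [
    tweet_id
    for tweet_id, record in tweets_with_sentiments.items()
    if record['sentiment'] == -1
]
print(negative_ids)  # [322185112684994545]
```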

Answer from a uj5u.com user:

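I would create a function to handle the sentiment, since I assume this part of the code is likely to evolve (I use the Blobtext lib for a similar application):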

twitter_dataset_list = [['322185112684994561', '@Bill_Porter nice to know that your site is back :-)'],
                        ['322185112684994545', 'I had a bad day']]

def text_positivity(tweet_text:str)->list:
    # https://www.adamsmith.haus/python/answers/how-to-check-if-a-string-contains-an-element-from-a-list-in-python#:~:text=Use any() to check,to build the generator expression.
    positive_keyword_list = ['nice']
    negative_keyword_list = ['bad']
    if any(' ' + keyword.lower() + ' ' in tweet_text.lower() for keyword in positive_keyword_list):
        return [1]
    if any(' ' + keyword.lower() + ' ' in tweet_text.lower() for keyword in negative_keyword_list):
        return [-1]
    return [0]

twitter_dataset_list = [tweet_details + text_positivity(tweet_text=tweet_details[1]) for tweet_details in twitter_dataset_list]
print(twitter_dataset_list)

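I edited the answer to take into account Kumar's comment about partial matches. It also uses lower() so that uppercase and lowercase keywords both match.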
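
Padding the keyword with spaces avoids partial matches, but it also misses a keyword at the very start or end of the text, or one next to punctuation (for example, ' bad ' is not found in 'bad day'). A possible alternative, sketched here with the standard re module and word boundaries, is shown below. The keyword parameters and the inner helper contains_any are assumptions made for this illustration and are not part of the original answer.

```python
import re


def text_positivity(tweet_text, positive_keywords=('nice',), negative_keywords=('bad',)):
    """Return [1], [-1], or [0] using whole-word, case-insensitive matching."""

    def contains_any(keywords):
        if not keywords:
            return False
        # \b matches a word boundary, so 'bad' does not match 'badminton'.
        pattern = r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b'
        return re.search(pattern, tweet_text, flags=re.IGNORECASE) is not None

    if contains_any(positive_keywords):
        return [1]
    if contains_any(negative_keywords):
        return [-1]
    return [0]


print(text_positivity('Bad day'))           # [-1]
print(text_positivity('badminton is fun'))  # [0]
print(text_positivity('So nice!'))          # [1]
```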
