Extracting a key from a multi-level (scraped), complex JSON file with Python

I have a multi-level, complex JSON file, twitter.json, and I want to extract only the author IDs from it.

My file 'twitter.json' looks like this:

[
[
    {
        "tweets_results": [
            {
                "meta": {
                    "result_count": 0
                }
            }
        ],
        "youtube_link": "www.youtube.com/channel/UCl4GlGXR0ED6AUJU1kRhRzQ"
    }
],
[
    {
        "tweets_results": [
            {
                "data": [
                    {
                        "author_id": "125959599",
                        "created_at": "2021-06-12T15:16:40.000Z",
                        "id": "1403732993269649410",
                        "in_reply_to_user_id": "125959599",
                        "lang": "pt",
                        "public_metrics": {
                            "like_count": 0,
                            "quote_count": 0,
                            "reply_count": 1,
                            "retweet_count": 0
                        },
                        "source": "Twitter for Android",
                        "text": "?? Canais do YouTube:\n\n1 - Alexandre Garcia: Canal de Brasília"
                    },
                    {
                        "author_id": "521827796",
                        "created_at": "2021-06-07T20:23:08.000Z",
                        "id": "1401998177943834626",
                        "in_reply_to_user_id": "623794755",
                        "lang": "und",
                        "public_metrics": {
                            "like_count": 0,
                            "quote_count": 0,
                            "reply_count": 0,
                            "retweet_count": 0
                        },
                        "source": "TweetDeck",
                        "text": "@thelittlecouto"
                    }
                ],
                "meta": {
                    "newest_id": "1426546114115870722",
                    "oldest_id": "1367808835403063298",
                    "result_count": 7
                }
            }
        ],
        "youtube_link": "www.youtube.com/channel/UCm0yTweyAa0PwEIp0l3N_gA"
    }
]
]

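I have read many similar SO questions, including but not limited to:

Accessing a key of a multi-level JSON file in Python

Multi-level JSON dictionary: can't extract a key into a new dictionary

How to extract a single value from a JSON response?

How to get fields and values from a specific key of a JSON file in Python

How to select a specific key/value of an object in JSON via Python

Python: getting all values of a specific key from JSON

However, the JSON in those questions has a very simple structure, and I ran into errors when I tried to replicate the answers.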

From what I have read, contents.tweets_results.data.author_id is how the reference should work. I am loading the file with contents = json.load(open("twitter.json")). Any help would be appreciated.
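
For illustration, here is a minimal sketch of what that dotted path corresponds to in plain Python indexing. It assumes a trimmed-down stand-in for twitter.json (same nesting, most fields omitted) and parses it from a string so that it runs on its own; with the real file, json.load on the opened file would take the place of json.loads.

```python
import json

# Trimmed-down stand-in for twitter.json: same nesting, most fields omitted.
raw = """
[
  [{"tweets_results": [{"meta": {"result_count": 0}}]}],
  [{"tweets_results": [{"data": [{"author_id": "125959599"},
                                 {"author_id": "521827796"}],
                        "meta": {"result_count": 7}}]}]
]
"""
contents = json.loads(raw)

# contents is a list, so attribute-style access such as
# contents.tweets_results does not work. Every level is indexed explicitly:
# outer list -> inner list -> dict -> list -> dict -> list -> dict.
first_author = contents[1][0]["tweets_results"][0]["data"][0]["author_id"]
print(first_author)
```

The hard-coded indexes only reach a single tweet; collecting every author_id needs a loop or a query over each list level.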

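Edit: the code from both @sammywemmy and @balderman worked for me. I accepted @sammywemmy's answer because that is the code I used, but I would like to credit both of them somehow.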

Answer from a uj5u.com user:

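Your data has a path: you have a list nested inside a list; inside the inner list there is a dict with a tweets_results key whose value is a list of dicts; one of those dicts has a data key, which holds a list/array of dictionaries, and one of the keys in each dictionary is author_id. We can model the path (sort of) as: '[][].tweets_results[].data[].author_id'

Read it as a sequence of steps. Hit the first list, then the inner list, then access the tweets_results key, then its list of values; within that list, access the data key, and in the list of values associated with data, access author_id.
With this path, one can use jmespath to pull out the author_ids: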
# pip install jmespath
import jmespath

# compile the expression once, similar to re.compile
expression = jmespath.compile('[][].tweets_results[].data[].author_id')
# data is the parsed content of twitter.json
expression.search(data)
['125959599', '521827796']

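jmespath is very useful if you want to build a data structure out of nested dictionaries; however, if you only care about the author_id values, you can use nested_lookup instead; it searches recursively for the key and returns the values:
# pip install nested-lookup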
from nested_lookup import nested_lookup
nested_lookup('author_id', data)
['125959599', '521827796']
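
To make the idea of a recursive key search concrete, here is a rough standard-library sketch. It is not how nested_lookup is implemented; find_key is a hypothetical helper written for this example, and the sample is a trimmed-down stand-in for the data above.

```python
def find_key(obj, key):
    """Yield every value stored under `key` at any depth of dicts and lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            # Keep descending: the key may also appear deeper inside the value.
            yield from find_key(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_key(item, key)
    # Strings, numbers and None contain nothing to search.


# Trimmed-down stand-in for the parsed twitter.json.
sample = [
    [{"tweets_results": [{"meta": {"result_count": 0}}]}],
    [{"tweets_results": [{"data": [
        {"author_id": "125959599", "in_reply_to_user_id": "125959599"},
        {"author_id": "521827796", "in_reply_to_user_id": "623794755"},
    ]}]}],
]

author_ids = list(find_key(sample, "author_id"))
print(author_ids)
```

Only exact key matches are collected, so in_reply_to_user_id is ignored even though its values look like author IDs.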

Answer from a uj5u.com user:
See below (no external lib involved)
data = [
[
    {
        "tweets_results": [
            {
                "meta": {
                    "result_count": 0
                }
            }
        ],
        "youtube_link": "www.youtube.com/channel/UCl4GlGXR0ED6AUJU1kRhRzQ"
    }
],
[
    {
        "tweets_results": [
            {
                "data": [
                    {
                        "author_id": "125959599",
                        "created_at": "2021-06-12T15:16:40.000Z",
                        "id": "1403732993269649410",
                        "in_reply_to_user_id": "125959599",
                        "lang": "pt",
                        "public_metrics": {
                            "like_count": 0,
                            "quote_count": 0,
                            "reply_count": 1,
                            "retweet_count": 0
                        },
                        "source": "Twitter for Android",
                        "text": "?? Canais do YouTube:\n\n1 - Alexandre Garcia: Canal de Brasília"
                    },
                    {
                        "author_id": "521827796",
                        "created_at": "2021-06-07T20:23:08.000Z",
                        "id": "1401998177943834626",
                        "in_reply_to_user_id": "623794755",
                        "lang": "und",
                        "public_metrics": {
                            "like_count": 0,
                            "quote_count": 0,
                            "reply_count": 0,
                            "retweet_count": 0
                        },
                        "source": "TweetDeck",
                        "text": "@thelittlecouto"
                    }
                ],
                "meta": {
                    "newest_id": "1426546114115870722",
                    "oldest_id": "1367808835403063298",
                    "result_count": 7
                }
            }
        ],
        "youtube_link": "www.youtube.com/channel/UCm0yTweyAa0PwEIp0l3N_gA"
    }
]
]

ids = []
for entry in data:
    for sub in entry:
        result = sub['tweets_results']
        # skip results that carry only 'meta' and no 'data' list
        if result[0].get('data'):
            info = result[0]['data']
            for item in info:
                ids.append(item.get('author_id', 'not_found'))
print(ids)

Output
['125959599', '521827796']
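
The loop above reads only the first element of each tweets_results list. As a hedged variation, the sketch below walks every element instead. collect_author_ids is a hypothetical helper name, and the sample is invented for illustration: its second result object and the tweet without an author_id do not occur in the original file.

```python
def collect_author_ids(data):
    """Collect author_id from every tweet in every tweets_results entry."""
    ids = []
    for entry in data:                  # outer list
        for sub in entry:               # inner list of dicts
            # Missing keys fall back to an empty list, so nothing raises.
            for result in sub.get("tweets_results", []):
                for item in result.get("data", []):
                    ids.append(item.get("author_id", "not_found"))
    return ids


# Invented sample: the second block holds two result objects, and one
# tweet lacks author_id, to exercise the fallback value.
sample = [
    [{"tweets_results": [{"meta": {"result_count": 0}}]}],
    [{"tweets_results": [
        {"data": [{"author_id": "125959599"}]},
        {"data": [{"author_id": "521827796"}, {"id": "1"}]},
    ]}],
]

ids = collect_author_ids(sample)
print(ids)
```

Using .get with a default at each level keeps the traversal tolerant of entries that have only meta and no data, which is the case for the first block of the file.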







        