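I'm trying to pull Facebook data for a business I run and load it into a pandas DataFrame. Some posts have comments and others don't, and I want a single DataFrame that covers both.
The JSON I have looks like this: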
{'data': [{'id': 'user_id_post_id1'},
{'id': 'user_id_post_id2'},
{'id': 'user_id_post_id3'},
{'comments': {'data': [{'created_time': '2022-11-09T00:15:29+0000',
'message': 'comment_id',
'id': 'user_who_commented_the_id_comment_id'}]},
'id': 'user_id_post_id4'},
{'id': 'user_id_post_id5'}...]}
I'm trying to get a pandas DataFrame that looks like this:
df = pd.DataFrame(data = data)
print(df)
0  User ID and Post ID  comment      Commenter_id
1  user_id_post_id      0 or N/A     0 or N/A
2  user_id_post_id1     0 or N/A     0 or N/A
2  user_id_post_id2     0 or N/A     0 or N/A
3  user_id_post_id3     Comment_id   user_who_commented_the_id_comment_id
4  user_id_post_id3     Comment_id*  user_who_commented_the_id_comment_id
2  user_id_post_id4     0 or N/A     0 or N/A
* means another comment under the same User ID and Post ID
And so on.
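I know how to do this when the JSON isn't doubly nested, but I'm running into trouble when trying to append the nested part. I tried this command, and it didn't work: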
df = pd.json_normalize(data=JSON_Name["data"]["comments"])
and I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_1/FileName.py in <module>
----> 1 df = pd.json_normalize(data=basic_insight["data"]["comments"])
TypeError: list indices must be integers or slices, not str
Any help would be appreciated!
Answer from a uj5u.com user:
Try:
import pandas as pd

data = {
    "data": [
        {"id": "user_id_post_id1"},
        {"id": "user_id_post_id2"},
        {"id": "user_id_post_id3"},
        {
            "comments": {
                "data": [
                    {
                        "created_time": "2022-11-09T00:15:29+0000",
                        "message": "comment_id",
                        "id": "user_who_commented_the_id_comment_id",
                    }
                ]
            },
            "id": "user_id_post_id4",
        },
        {"id": "user_id_post_id5"},
    ]
}

# Pair each post ID with its list of comments (None when the post has none).
tmp = [
    {
        "User ID and Post ID": d["id"],
        "Commenter_id": d.get("comments", {}).get("data"),
    }
    for d in data["data"]
]

# explode() turns each list of comments into one row per comment.
df = pd.DataFrame(tmp).explode("Commenter_id")
# Each cell now holds one comment dict (or a missing value); .str[key] extracts a field.
df["comment"] = df["Commenter_id"].str["message"]
df["Commenter_id"] = df["Commenter_id"].str["id"]
print(df)
Prints:
  User ID and Post ID                          Commenter_id     comment
0    user_id_post_id1                                  None        None
1    user_id_post_id2                                  None        None
2    user_id_post_id3                                  None        None
3    user_id_post_id4  user_who_commented_the_id_comment_id  comment_id
4    user_id_post_id5                                  None        None
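For context, the call in the question fails because the value under "data" is a list of posts, so it has to be iterated (or indexed with an integer) before "comments" can be looked up on any one post. As an alternative to explode, here is a minimal sketch that builds the rows with a plain loop. The helper name posts_to_frame and the sample comment values are hypothetical, chosen only for illustration, and the payload is assumed to have the same shape as the one in the question.

```python
import pandas as pd

# Sample payload shaped like the one in the question (values are placeholders).
data = {
    "data": [
        {"id": "user_id_post_id1"},
        {
            "comments": {
                "data": [
                    {"message": "first comment", "id": "commenter_a_comment_1"},
                    {"message": "second comment", "id": "commenter_b_comment_2"},
                ]
            },
            "id": "user_id_post_id4",
        },
    ]
}

COLUMNS = ["User ID and Post ID", "comment", "Commenter_id"]


def posts_to_frame(payload):
    """Hypothetical helper: one row per comment, one empty row per uncommented post."""
    rows = []
    for post in payload["data"]:  # payload["data"] is a list, so iterate over it
        # Fall back to a single placeholder so posts without comments keep a row.
        comments = post.get("comments", {}).get("data") or [None]
        for comment in comments:
            rows.append(
                {
                    "User ID and Post ID": post["id"],
                    "comment": comment["message"] if comment else None,
                    "Commenter_id": comment["id"] if comment else None,
                }
            )
    return pd.DataFrame(rows, columns=COLUMNS)


df = posts_to_frame(data)
print(df)
```

The loop is more verbose than the explode version, but it makes the handling of posts without comments explicit and fixes the column order in one place.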
