I'm working with a lot of JSON data. Two structures in particular look like this:
users = [
    { "id": 1, "name": "Greg Harris", "roles": ["mega-user"] },
    { "id": 2, "name": "Sarah Smith", "roles": ["charger", "rider"] },
    { "id": 7, "name": "Jack Snow", "roles": ["rider"] },
    { "id": 11, "name": "NA", "roles": [] },
    { "id": 18, "name": "Tiffany Denson", "roles": ["beta tester"] },
]
and this:
users2 = [
    {
        'id': 1,
        'name': 'Employee #1',
        'customer_id': 1,
        'activated_on': datetime.date(2018, 11, 4),
        'deactivated_on': datetime.date(2019, 1, 10)
    },
    {
        'id': 2,
        'name': 'Employee #2',
        'customer_id': 1,
        'activated_on': datetime.date(2018, 12, 4),
        'deactivated_on': None
    }
]
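I need to know how to iterate over them efficiently and perform some calculations.
For the first JSON, how do I iterate over the list of dictionaries in Python and extract only the "name" of the users that have "rider" in their "roles" list?
For the second JSON, taking it as the basic structure, I want to calculate the daily charge for active subscribers for every day of a month. I want to determine which users are active on a given day, then multiply the number of users active that day by the daily rate to get that day's total. A subscription costs $4/month, so a single day would look like this: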
2019-01-01 2 active users * $0.129032258 = $0.258064516 (subtotal: $0.258064516)
and then I want the total for the whole month.
users2 may also be empty, so I need to handle that case.
For the first one I tried something like this:
for d in users:
    if 'rider' in d['roles']:
        print(d['name'])
This seems to work, but I'm not sure whether there is a better way to go about it. For the second part I'm truly lost on how to approach it.
Please help. Thanks.
Reply from a uj5u.com user:
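For the first one, your solution looks fine and doesn't need to change.
If you like, you can write it as a list comprehension (but you don't have to):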
selected = [person['name'] for person in users if 'rider' in person['roles']]
for name in selected:
    print(name)
Full working code:
users = [
    { "id": 1, "name": "Greg Harris", "roles": ["mega-user"] },
    { "id": 2, "name": "Sarah Smith", "roles": ["charger", "rider"] },
    { "id": 7, "name": "Jack Snow", "roles": ["rider"] },
    { "id": 11, "name": "NA", "roles": [] },
    { "id": 18, "name": "Tiffany Denson", "roles": ["beta tester"] },
]

#selected = []
#for person in users:
#    if 'rider' in person['roles']:
#        selected.append(person['name'])

selected = [person['name'] for person in users if 'rider' in person['roles']]

#print(selected)
for name in selected:
    print(name)
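If some records might not have a "roles" key at all, a slightly more defensive variant of the same comprehension may help. This is only a sketch: the helper name `names_with_role` and the extra record with id 99 are made up for illustration and are not part of the question's data.

```python
def names_with_role(users, role):
    """Return the names of users whose 'roles' list contains `role`."""
    # .get() with a default tolerates records that have no 'roles' key
    return [u['name'] for u in users if role in u.get('roles', [])]


users = [
    {"id": 1, "name": "Greg Harris", "roles": ["mega-user"]},
    {"id": 2, "name": "Sarah Smith", "roles": ["charger", "rider"]},
    {"id": 7, "name": "Jack Snow", "roles": ["rider"]},
    {"id": 11, "name": "NA", "roles": []},
    {"id": 18, "name": "Tiffany Denson", "roles": ["beta tester"]},
    {"id": 99, "name": "No Roles Key"},  # hypothetical record, not in the original data
]

print(names_with_role(users, 'rider'))  # ['Sarah Smith', 'Jack Snow']
print(names_with_role([], 'rider'))     # []
```

Wrapping the comprehension in a function also makes it easy to reuse for other roles.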
The same with pandas:
import pandas as pd
users = [
    { "id": 1, "name": "Greg Harris", "roles": ["mega-user"] },
    { "id": 2, "name": "Sarah Smith", "roles": ["charger", "rider"] },
    { "id": 7, "name": "Jack Snow", "roles": ["rider"] },
    { "id": 11, "name": "NA", "roles": [] },
    { "id": 18, "name": "Tiffany Denson", "roles": ["beta tester"] },
]
df = pd.DataFrame(users)
print('\n--- dataframe ---\n')
print(df)
mask = df['roles'].apply(lambda x: 'rider' in x)
print('\n--- mask ---\n')
print(mask)
selected = df[ mask ]
print('\n--- selected ---\n')
print(selected['name'])
Result:

--- dataframe ---

   id            name             roles
0   1     Greg Harris       [mega-user]
1   2     Sarah Smith  [charger, rider]
2   7       Jack Snow           [rider]
3  11              NA                []
4  18  Tiffany Denson     [beta tester]

--- mask ---

0    False
1     True
2     True
3    False
4    False
Name: roles, dtype: bool

--- selected ---

1    Sarah Smith
2      Jack Snow
Name: name, dtype: object
The second one probably needs nested for loops, because it has to run through the individual days and, for each day, check every user.
import datetime

users2 = [
    {
        'id': 1,
        'name': 'Employee #1',
        'customer_id': 1,
        'activated_on': datetime.date(2018, 11, 4),
        'deactivated_on': datetime.date(2019, 1, 10)
    },
    {
        'id': 2,
        'name': 'Employee #2',
        'customer_id': 1,
        'activated_on': datetime.date(2018, 12, 4),
        'deactivated_on': None
    }
]

date = datetime.date(2018, 12, 31)  # day before the month starts
one_day = datetime.timedelta(days=1)
price = 0.129032258  # $4 / 31 days
subtotal = 0

for x in range(31):
    count = 0        # count persons
    date += one_day  # get next date
    # check every person
    for person in users2:
        if (person['activated_on'] < date) and (person['deactivated_on'] is None or person['deactivated_on'] > date):
            count += 1
    # display result for one date
    total = count * price
    subtotal += total
    print(f'{date} | {count:2} active users * ${price:.2f} = {total:.2f} (subtotal: {subtotal:.2f})')
Result:
2019-01-01 | 2 active users * $0.13 = 0.26 (subtotal: 0.26)
2019-01-02 | 2 active users * $0.13 = 0.26 (subtotal: 0.52)
2019-01-03 | 2 active users * $0.13 = 0.26 (subtotal: 0.77)
2019-01-04 | 2 active users * $0.13 = 0.26 (subtotal: 1.03)
2019-01-05 | 2 active users * $0.13 = 0.26 (subtotal: 1.29)
2019-01-06 | 2 active users * $0.13 = 0.26 (subtotal: 1.55)
2019-01-07 | 2 active users * $0.13 = 0.26 (subtotal: 1.81)
2019-01-08 | 2 active users * $0.13 = 0.26 (subtotal: 2.06)
2019-01-09 | 2 active users * $0.13 = 0.26 (subtotal: 2.32)
2019-01-10 | 1 active users * $0.13 = 0.13 (subtotal: 2.45)
2019-01-11 | 1 active users * $0.13 = 0.13 (subtotal: 2.58)
2019-01-12 | 1 active users * $0.13 = 0.13 (subtotal: 2.71)
2019-01-13 | 1 active users * $0.13 = 0.13 (subtotal: 2.84)
2019-01-14 | 1 active users * $0.13 = 0.13 (subtotal: 2.97)
2019-01-15 | 1 active users * $0.13 = 0.13 (subtotal: 3.10)
2019-01-16 | 1 active users * $0.13 = 0.13 (subtotal: 3.23)
2019-01-17 | 1 active users * $0.13 = 0.13 (subtotal: 3.35)
2019-01-18 | 1 active users * $0.13 = 0.13 (subtotal: 3.48)
2019-01-19 | 1 active users * $0.13 = 0.13 (subtotal: 3.61)
2019-01-20 | 1 active users * $0.13 = 0.13 (subtotal: 3.74)
2019-01-21 | 1 active users * $0.13 = 0.13 (subtotal: 3.87)
2019-01-22 | 1 active users * $0.13 = 0.13 (subtotal: 4.00)
2019-01-23 | 1 active users * $0.13 = 0.13 (subtotal: 4.13)
2019-01-24 | 1 active users * $0.13 = 0.13 (subtotal: 4.26)
2019-01-25 | 1 active users * $0.13 = 0.13 (subtotal: 4.39)
2019-01-26 | 1 active users * $0.13 = 0.13 (subtotal: 4.52)
2019-01-27 | 1 active users * $0.13 = 0.13 (subtotal: 4.65)
2019-01-28 | 1 active users * $0.13 = 0.13 (subtotal: 4.77)
2019-01-29 | 1 active users * $0.13 = 0.13 (subtotal: 4.90)
2019-01-30 | 1 active users * $0.13 = 0.13 (subtotal: 5.03)
2019-01-31 | 1 active users * $0.13 = 0.13 (subtotal: 5.16)
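The question also asks about an empty users2 and about the total for a whole month. One possible way to cover both is to wrap the nested loop in a function and derive the number of days from the calendar instead of hard-coding 31. This is a sketch under assumptions: the function name `monthly_total` and its parameters are invented for this example, and it keeps the same strict `<` / `>` comparisons as the loop above.

```python
import calendar
import datetime


def monthly_total(users, year, month, monthly_price=4.0):
    """Sum the per-day charges for one calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    daily_rate = monthly_price / days_in_month
    total = 0.0
    for day in range(1, days_in_month + 1):
        date = datetime.date(year, month, day)
        # same activity test as the nested loop (strict comparisons)
        count = sum(
            1 for p in users
            if p['activated_on'] < date
            and (p['deactivated_on'] is None or p['deactivated_on'] > date)
        )
        total += count * daily_rate
    return total


users2 = [
    {'id': 1, 'name': 'Employee #1', 'customer_id': 1,
     'activated_on': datetime.date(2018, 11, 4),
     'deactivated_on': datetime.date(2019, 1, 10)},
    {'id': 2, 'name': 'Employee #2', 'customer_id': 1,
     'activated_on': datetime.date(2018, 12, 4),
     'deactivated_on': None},
]

print(f'{monthly_total(users2, 2019, 1):.2f}')  # 5.16
print(monthly_total([], 2019, 1))               # 0.0
```

With an empty list the inner count is always 0, so the function returns 0.0 without any special-casing.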
The same calculation with pandas, using date_range:
import pandas as pd
import datetime

users2 = [
    {
        'id': 1,
        'name': 'Employee #1',
        'customer_id': 1,
        'activated_on': datetime.date(2018, 11, 4),
        'deactivated_on': datetime.date(2019, 1, 10)
    },
    {
        'id': 2,
        'name': 'Employee #2',
        'customer_id': 1,
        'activated_on': datetime.date(2018, 12, 4),
        'deactivated_on': None
    }
]

df = pd.DataFrame(users2)
# convert to datetime64 so the columns can be compared with the Timestamps
# produced by date_range (a missing deactivation date, None, becomes NaT)
df['activated_on'] = pd.to_datetime(df['activated_on'])
df['deactivated_on'] = pd.to_datetime(df['deactivated_on'])

print('\n--- dataframe ---\n')
print(df)
print()

price = 0.129032258  # 4/31  # $4 / 31 days
subtotal = 0

for date in pd.date_range('2019-01-01', periods=31):
    #print('\n===== date:', date, '=====\n')
    mask1 = (df['activated_on'] < date)
    mask2 = (df['deactivated_on'].isnull())
    mask3 = (df['deactivated_on'] > date)
    #print(mask1)
    #print(mask2)
    #print(mask3)
    mask = mask1 & (mask2 | mask3)
    #print('\n--- mask ---\n')
    #print(mask)
    selected = df[ mask ]
    count = len(selected)
    total = count * price
    subtotal += total
    print(f'{date.date()} | {count:2} active users * ${price:.2f} = {total:.2f} (subtotal: {subtotal:.2f})')
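Both versions above check every user on every day. Since each user is active over one contiguous date range, the monthly total could also be computed by counting each user's active days directly. The sketch below is an alternative, not what the answer uses; the helper name `active_days_in_month` is hypothetical, and it assumes the same rule as the loops (a user counts as active strictly after `activated_on` and strictly before `deactivated_on`).

```python
import calendar
import datetime

ONE_DAY = datetime.timedelta(days=1)


def active_days_in_month(person, year, month):
    """Count the days of the month on which `person` is active."""
    first = datetime.date(year, month, 1)
    last = datetime.date(year, month, calendar.monthrange(year, month)[1])
    # first active day is the day after activation
    start = max(first, person['activated_on'] + ONE_DAY)
    # last active day is the day before deactivation (if any)
    if person['deactivated_on'] is None:
        end = last
    else:
        end = min(last, person['deactivated_on'] - ONE_DAY)
    return max(0, (end - start).days + 1)


users2 = [
    {'id': 1, 'name': 'Employee #1', 'customer_id': 1,
     'activated_on': datetime.date(2018, 11, 4),
     'deactivated_on': datetime.date(2019, 1, 10)},
    {'id': 2, 'name': 'Employee #2', 'customer_id': 1,
     'activated_on': datetime.date(2018, 12, 4),
     'deactivated_on': None},
]

daily_rate = 4 / 31  # $4 per month, 31 days in January 2019
user_days = sum(active_days_in_month(p, 2019, 1) for p in users2)
print(user_days)                         # 40
print(f'{user_days * daily_rate:.2f}')   # 5.16
```

This gives the same monthly total as the day-by-day loop, but it does not produce the per-day breakdown, so it only fits when the final sum is all that is needed.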
