I have a DataFrame df with a single column named names:
names
phil/andy
allen
john/william/chris
john
I want to turn it into a kind of "dictionary" (as a pandas DataFrame) in which each name has a unique random number:
name value
phil 1
andy 2
allen 3
john 4
william 5
chris 6
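How can I do this? The DataFrame above is only an example, so I need a function that does the same thing for very large DataFrames.
Answer from a uj5u.com user:
Here you go.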
import numpy as np
import pandas as pd
# Original pd.DataFrame
d = {'phil': [1],
     'phil/andy': [2],
     'allen': [3],
     'john/william/chris': [4],
     'john': [5]
     }
df = pd.DataFrame(data=d)
# Append all names to a list
names = []
for col in df.columns:
    names = names + col.split("/")
# Remove duplicated names from the list
names = [i for n, i in enumerate(names) if i not in names[:n]]
# Create DF
df = pd.DataFrame(
    # Random numbers
    np.random.choice(
        len(names),         # Length
        size=len(names),    # Shape
        replace=False       # Unique random numbers
    ),
    # Index names
    index=names,
    # Column names
    columns=['Rand value']
)
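If you want a dictionary instead of a pd.DataFrame, you can also apply d = df.T.to_dict() at the end. If you want the numbers 0, 1, 2, 3, ..., n instead of random numbers, you can replace np.random.choice() with range().
Answer from a uj5u.com user: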
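One detail worth checking: df.T.to_dict() returns a nested dictionary (each name maps to a small dict keyed by 'Rand value'), which may or may not be what you want. The following is a minimal sketch, not part of the original answer, comparing it with a flat name-to-number mapping and showing the sequential variant. The names list is re-created here only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Assumed input: the de-duplicated list built in the answer above
names = ["phil", "andy", "allen", "john", "william", "chris"]

df = pd.DataFrame(
    np.random.choice(len(names), size=len(names), replace=False),
    index=names,
    columns=["Rand value"],
)

# Nested form: {'phil': {'Rand value': ...}, 'andy': {...}, ...}
nested = df.T.to_dict()

# Flat form: {'phil': ..., 'andy': ..., ...}
flat = df["Rand value"].to_dict()

# Sequential numbers instead of random ones
seq = pd.DataFrame(list(range(len(names))), index=names, columns=["Rand value"])
```

If a plain name-to-number lookup is the goal, the flat form is probably the more convenient of the two.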
# assuming you have a df like this
df = pd.DataFrame({'names': ['phil/andy', 'allen', 'john/william/chris', 'john']})
# split the names and explode to create a single column
# reset_index twice to get unique values for each row
# drop duplicate names to get unique value for each name
(df['names'].str.split('/')
    .explode()
    .reset_index(drop=True)
    .reset_index()
    .drop_duplicates('names')
    .rename({'index': 'value'}, axis=1))
   value    names
0      0     phil
1      1     andy
2      2    allen
3      3     john
4      4  william
5      5    chris
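Note that the values in this output follow the order of first appearance rather than being random. Since the question asks for a reusable function, here is a minimal sketch that wraps the same split/explode idea and assigns each name a unique random integer. The function name build_name_table and its parameters (column, sep, seed) are hypothetical, chosen for illustration and not taken from the answer.

```python
import numpy as np
import pandas as pd


def build_name_table(df, column="names", sep="/", seed=None):
    """Return a DataFrame with one row per distinct name and a unique random value."""
    # Split each cell, flatten to one name per row, keep the first occurrence only
    unique_names = (
        df[column]
        .str.split(sep)
        .explode()
        .drop_duplicates()
        .reset_index(drop=True)
    )
    # A random permutation of 1..n guarantees that no value repeats
    rng = np.random.default_rng(seed)
    values = rng.permutation(len(unique_names)) + 1
    return pd.DataFrame({"name": unique_names, "value": values})


df = pd.DataFrame({"names": ["phil/andy", "allen", "john/william/chris", "john"]})
table = build_name_table(df, seed=0)
print(table)
```

Using a permutation instead of independent random draws is what keeps the values unique. All steps are vectorized pandas/NumPy operations, so the approach should carry over to large frames, though that has not been benchmarked here.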
Answer from a uj5u.com user:
You can use numpy to generate random integers for these names, and then convert the result to a dictionary with .to_dict():
import numpy as np
import pandas as pd
names_lst = ["phil", "andy", "allen", "john", "william", "chris", "john"]
df = pd.DataFrame(names_lst, columns=["name"])
df["value"] = np.random.randint(1, 6, df.shape[0])
print(df.set_index('name')["value"].to_dict())
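Be aware that np.random.randint can return the same number more than once, and that names_lst contains "john" twice, so the resulting dictionary keeps only the last value for that key. A minimal sketch of one way to get unique values, assuming an arbitrary range of 1 to 99 that is at least as large as the number of distinct names:

```python
import numpy as np

names_lst = ["phil", "andy", "allen", "john", "william", "chris", "john"]

# Drop repeated names while preserving first-seen order
unique_names = list(dict.fromkeys(names_lst))

# Sample without replacement so no value is used twice.
# The 1..99 range is an assumption; it must hold at least len(unique_names) values.
rng = np.random.default_rng()
values = rng.choice(np.arange(1, 100), size=len(unique_names), replace=False)

mapping = dict(zip(unique_names, values.tolist()))
print(mapping)
```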
Tags: Python python-3.x dataframe function
