How do I sort ascending or descending based on the value of another column in pandas?

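I have a pandas DataFrame with a set of values (prices). Within each initiator_id group I need to sort the prices in ascending order if type == sell, and in descending order if type == buy. Then I add an id within each group. Right now I do: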

 df['bidnum'] = df.groupby(['initiator_id', 'type']).cumcount()

What is an efficient way to sort ascending within each 'initiator_id', 'type == sell' group and descending within each 'initiator_id', 'type == buy' group?

This is what the original dataset looks like now:

initiator_id    price   type    bidnum
1       170.81  sell    0
2       170.81  sell    0
2       169.19  buy     0
3       170.81  sell    0
3       169.19  buy     0
3       70.81   sell    1
4       170.81  sell    0
4       169.19  buy     0
4       70.81   sell    1
4       69.19   buy     1

I need something like this:

initiator_id, price, type
1, 100,sell
1, 99, sell
1, 98, sell
1, 110, buy
1, 120, buy
1, 125, buy

That is, within each initiator_id group the sell subgroup is sorted in descending order and the buy subgroup in ascending order.

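uj5u.com user replied:

If you can assume that your "price" column will always contain non-negative values, we can "cheat": give the prices of either the buy or the sell operations a negative sign, sort, and then take the absolute value to get the original prices back.

If the type is "buy", the price stays positive (2 * 1 - 1 = 1). If the type is "sell", the price becomes negative (2 * 0 - 1 = -1).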
```
df["price"] = df["price"] * (2 * (df["type"] == "buy").astype(int) - 1)
```
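As a quick check of the arithmetic above, here is a small sketch that builds the multiplier on a made-up `type` column and compares it with an explicit mapping. The sample values and the `sign_map` name are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample column, for illustration only.
types = pd.Series(["sell", "buy", "sell", "buy"], name="type")

# The expression from the answer:
# "buy"  -> True  -> 1 -> 2 * 1 - 1 =  1
# "sell" -> False -> 0 -> 2 * 0 - 1 = -1
sign_arith = 2 * (types == "buy").astype(int) - 1

# An equivalent explicit mapping (assumes the column holds only "buy" or "sell").
sign_map = types.map({"buy": 1, "sell": -1})

print(sign_arith.tolist())  # [-1, 1, -1, 1]
print(sign_map.tolist())    # [-1, 1, -1, 1]
```

The mapping form is arguably easier to read, but any value other than "buy" or "sell" would turn into NaN there, whereas the arithmetic form treats every non-"buy" value as -1.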
Now sort the values as usual. I've included the "initiator_id" and "type" columns to match your expected output:
```
df = df.sort_values(["initiator_id", "type", "price"])
```
Finally, take the absolute value of the "price" column to recover the original values:
```
df["price"] = df["price"].abs()
```

Expected output of this operation on your sample input:

   initiator_id   price  type  bidnum
0             1  170.81  sell       0
2             2  169.19   buy       0
1             2  170.81  sell       0
4             3  169.19   buy       0
3             3  170.81  sell       0
5             3   70.81  sell       1
9             4   69.19   buy       1
7             4  169.19   buy       0
6             4  170.81  sell       0
8             4   70.81  sell       1
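For reference, the three steps can also be sketched as one pipeline that sorts on a temporary signed column instead of overwriting "price", and then recomputes the per-group counter from the question. The `_sort_key` column name is a hypothetical helper, and the DataFrame is rebuilt here from the sample data so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question (without the old bidnum column).
df = pd.DataFrame({
    "initiator_id": [1, 2, 2, 3, 3, 3, 4, 4, 4, 4],
    "price": [170.81, 170.81, 169.19, 170.81, 169.19,
              70.81, 170.81, 169.19, 70.81, 69.19],
    "type": ["sell", "sell", "buy", "sell", "buy",
             "sell", "sell", "buy", "sell", "buy"],
})

# +1 for buys, -1 for sells (assumes "type" holds only these two values).
sign = df["type"].map({"buy": 1, "sell": -1})

out = (
    df.assign(_sort_key=df["price"] * sign)   # hypothetical helper column
      .sort_values(["initiator_id", "type", "_sort_key"])
      .drop(columns="_sort_key")
      .reset_index(drop=True)
)

# Number the rows within each (initiator_id, type) group in the new order.
out["bidnum"] = out.groupby(["initiator_id", "type"]).cumcount()
print(out)
```

Because "price" itself is never modified, this variant needs no abs() step and does not rely on the prices being non-negative.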

uj5u.com user replied:

One solution:

import pandas as pd

# DataFrame.append was removed in pandas 2.0, so collect the sorted groups and concatenate once.
sorted_groups = []
for (initiator_id, side), group in df.groupby(['initiator_id', 'type']):
    # Sort "buy" groups ascending and "sell" groups descending.
    sorted_groups.append(group.sort_values('price', ascending=(side == 'buy')))

final_df = pd.concat(sorted_groups, ignore_index=True)

Output:

   initiator_id   price  type
0             1  170.81  sell
1             2  169.19   buy
2             2  170.81  sell
3             3  169.19   buy
4             3  170.81  sell
5             3   70.81  sell
6             4   69.19   buy
7             4  169.19   buy
8             4  170.81  sell
9             4   70.81  sell
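The same result can probably be reached without a Python-level loop over the groups. The following is only a sketch of one way to do it, not the answerer's method: sort the buy and sell rows separately, concatenate them, and then apply a stable sort on "initiator_id" so the per-type price order is preserved. The DataFrame is rebuilt from the question's sample data so the snippet is self-contained.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "initiator_id": [1, 2, 2, 3, 3, 3, 4, 4, 4, 4],
    "price": [170.81, 170.81, 169.19, 170.81, 169.19,
              70.81, 170.81, 169.19, 70.81, 69.19],
    "type": ["sell", "sell", "buy", "sell", "buy",
             "sell", "sell", "buy", "sell", "buy"],
})

# Sort each side in its own direction.
buy = df[df["type"] == "buy"].sort_values("price", ascending=True)
sell = df[df["type"] == "sell"].sort_values("price", ascending=False)

# A stable sort on a single column keeps the existing relative order of
# rows with the same initiator_id: buys before sells, prices as sorted above.
final_df = (
    pd.concat([buy, sell])
      .sort_values("initiator_id", kind="stable")
      .reset_index(drop=True)
)
print(final_df)
```

On the sample data this gives the same row order as the output shown above.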

uj5u.com user replied:

Everyone else has given solutions that use pandas. Here I propose a solution without pandas.

Input CSV:

initiator_id,price,type,bidnum
1,170.81,sell,0
2,170.81,sell,0
2,169.19,buy,0
3,170.81,sell,0
3,169.19,buy,0
3,70.81,sell,1
4,170.81,sell,0
4,169.19,buy,0
4,70.81,sell,1
4,69.19,buy,1

Output CSV:

initiator_id,price,type,bidnum
1,170.81,sell,0
2,170.81,sell,0
2,169.19,buy,0
3,170.81,sell,0
3,70.81,sell,1
3,169.19,buy,0
4,170.81,sell,0
4,70.81,sell,1
4,69.19,buy,1
4,169.19,buy,0

Code:

from collections import OrderedDict
import numpy

"""
the reason why this code uses exec is so that the ordering of columns can be arbitrary
"""

def remove_duplicates(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]

def returnLastIndex(temp2):
    global mydict
    temp3 = mydict['initiator_id'][temp2]
    while True:
        temp2 = temp2 + 1
        try:
            if mydict['initiator_id'][temp2] != temp3:
                return temp2-1
        except:
            return temp2-1

def returnFirstIndex(temp2):
    global mydict
    temp3 = mydict['initiator_id'][temp2]
    while temp2 >= 1:
        temp2 = temp2 - 1
        if mydict['initiator_id'][temp2] != temp3:
            return temp2 + 1
    return 0


with open("input.csv") as file:
    lines = file.readlines()

new_lines = []
new_headers = []
for x in range(len(lines)):  # loop to remove headers and newlines
    if x == 0:
        for y in lines[x].strip().split(","):
            new_headers.append(y)
    else:
        new_lines.append(lines[x].strip())

mydict = OrderedDict()
for x in new_headers:
    exec("mydict['" + x + "'] = []")

for x in range(len(new_headers)):
    for y in new_lines:
        if new_headers[x] == "initiator_id":
            exec("mydict['" + new_headers[x] + "'].append(int('" + y.split(",")[x] + "'))")
        elif new_headers[x] == "price":
            exec("mydict['" + new_headers[x] + "'].append(float('" + y.split(",")[x] + "'))")
        else:
            exec("mydict['" + new_headers[x] + "'].append('" + y.split(",")[x] + "')")

for x in new_headers:
    exec("mydict['" + x + "'] = numpy.array(mydict['" + x + "'])")


temp1 = mydict['initiator_id'].argsort()

for x in (new_headers):
    exec("mydict['" + x + "'] = mydict['" + x + "'][temp1]")

splice_list_first = []

for x in range(len(mydict['initiator_id'])):
    splice_list_first.append(returnFirstIndex(x))

splice_list_last = []

for x in range(len(mydict['initiator_id'])):
    splice_list_last.append(returnLastIndex(x))

splice_list_first = remove_duplicates(splice_list_first)
splice_list_last = remove_duplicates(splice_list_last)

master_string = ",".join(new_headers) + "\n"

for x in range(len(splice_list_first)):
    temp4 = OrderedDict()
    for y in new_headers:
        exec("temp4['" + y + "'] = mydict['" + y + "'][" + str(splice_list_first[x]) + ":" + str(splice_list_last[x] + 1) + "]")
    sell_index = []
    buy_index = []
    for z in range(len(temp4['type'])):
        if temp4['type'][z] == "sell":
            sell_index.append(z)
        if temp4['type'][z] == "buy":
            buy_index.append(z)
    temp5 = OrderedDict()
    for a in range(len(sell_index)):
        for b in new_headers:
            try:
                exec("temp5['" + b + "']")
            except:
                exec("temp5['" + b + "'] = []")
            exec("temp5['" + b + "'].append(temp4['" + b + "'][" + str(sell_index[a]) + ":" + str(sell_index[a] + 1) + "][0])")
    try:
        for c in new_headers:
            exec("temp5['" + c + "'] = numpy.array(temp5['" + c + "'])")
        temp7 = temp5['price'].argsort()[::-1]
        for d in (new_headers):
            exec("temp5['" + d + "'] = temp5['" + d + "'][temp7]")
        for e in range(len(temp5['initiator_id'])):
            for f in new_headers:
                master_string = master_string + str(temp5[f][e]) + ","
            master_string = master_string[:-1] + "\n"
    except Exception as g:
        pass


    temp6 = OrderedDict()
    for a in range(len(buy_index)):
        for b in new_headers:
            try:
                exec("temp6['" + b + "']")
            except:
                exec("temp6['" + b + "'] = []")
            exec("temp6['" + b + "'].append(temp4['" + b + "'][" + str(buy_index[a]) + ":" + str(buy_index[a] + 1) + "][0])")
    try:
        for c in new_headers:
            exec("temp6['" + c + "'] = numpy.array(temp6['" + c + "'])")
        temp7 = temp6['price'].argsort()
        for d in (new_headers):
            exec("temp6['" + d + "'] = temp6['" + d + "'][temp7]")
        for e in range(len(temp6['initiator_id'])):
            for f in new_headers:
                master_string = master_string + str(temp6[f][e]) + ","
            master_string = master_string[:-1] + "\n"
    except Exception as g:
        pass


print(master_string)
f = open("output.csv", "w")
f.write(master_string)
f.close()
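
For comparison, the same no-pandas idea can be sketched much more briefly with only the standard library. This is not the answerer's code: it swaps the exec and numpy approach for `csv.DictReader` plus `sorted` with a key function. The `sort_key` helper is a hypothetical name, and the CSV is inlined through `io.StringIO` so the snippet runs on its own instead of reading input.csv.

```python
import csv
import io

# The sample input from the answer, inlined so the sketch is self-contained.
INPUT_CSV = """initiator_id,price,type,bidnum
1,170.81,sell,0
2,170.81,sell,0
2,169.19,buy,0
3,170.81,sell,0
3,169.19,buy,0
3,70.81,sell,1
4,170.81,sell,0
4,169.19,buy,0
4,70.81,sell,1
4,69.19,buy,1
"""


def sort_key(row):
    """Order by initiator_id, then sells (price descending), then buys (price ascending)."""
    price = float(row["price"])
    if row["type"] == "sell":
        return (int(row["initiator_id"]), 0, -price)
    # Assumption: anything that is not "sell" is treated as "buy".
    return (int(row["initiator_id"]), 1, price)


reader = csv.DictReader(io.StringIO(INPUT_CSV))
rows = sorted(reader, key=sort_key)

# Write the rows back out using the header order found in the input.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

output_csv = buffer.getvalue()
print(output_csv)
```

Since `DictReader` looks columns up by name, the column order in the file can still be arbitrary, and the values are written back as the original strings.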

Please credit the source when reposting. Link to this article: https://www.uj5u.com/caozuo/428706.html
