How to use pandas DataFrame to_dict with float32 without extra floating-point decimals

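I want to use dtype='float32' (it could also be a numpy dtype => np.float32) instead of dtype='float64' to reduce the memory usage of my pandas dataframe, because I have to handle huge pandas dataframes.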

At one point, I want to extract a Python list with .to_dict(orient='records') in order to get a dictionary for each row.

In this case, I get extra decimal places, which is probably caused by something like the issue described here:

Is floating point math broken?

How can I convert the data, change the types, etc. in order to get the same result as I get with float64 (see the example snippets)?

import pandas as pd

_data = {'col1': [1.45123, 1.64123], 'col2': [0.1, 0.2]}

_test = pd.DataFrame(_data).astype(dtype='float64')

print(f"{_test=}")
print(f"{_test.round(1)=}")
print(f"{_test.to_dict(orient='records')=}")
print(f"{_test.round(1).to_dict(orient='records')=}")

float64 output:


_test=      col1  col2
0  1.45123   0.1
1  1.64123   0.2
_test.round(1)=   col1  col2
0   1.5   0.1
1   1.6   0.2
_test.to_dict(orient='records')=[{'col1': 1.45123, 'col2': 0.1}, {'col1': 1.64123, 'col2': 0.2}]
_test.round(1).to_dict(orient='records')=[{'col1': 1.5, 'col2': 0.1}, {'col1': 1.6, 'col2': 0.2}]

import pandas as pd

_data = {'col1': [1.45123, 1.64123], 'col2': [0.1, 0.2]}

_test = pd.DataFrame(_data).astype(dtype='float32')

print(f"{_test=}")
print(f"{_test.round(1)=}")
print(f"{_test.to_dict(orient='records')=}")
print(f"{_test.round(1).to_dict(orient='records')=}")

float32 output:

_test=      col1  col2
0  1.45123   0.1
1  1.64123   0.2
_test.round(1)=   col1  col2
0   1.5   0.1
1   1.6   0.2
_test.to_dict(orient='records')=[{'col1': 1.4512300491333008, 'col2': 0.10000000149011612}, {'col1': 1.6412299871444702, 'col2': 0.20000000298023224}]
_test.round(1).to_dict(orient='records')=[{'col1': 1.5, 'col2': 0.10000000149011612}, {'col1': 1.600000023841858, 'col2': 0.20000000298023224}]

Answer:

Floating-point representation has some inherent limitations, and this is one of them.

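Calling to_dict() switches from the numpy representation to Python's native float representation, which means a conversion takes place. Python floats are 64-bit, so the float32 value is widened and printed with enough digits to identify it as a 64-bit number. The small error that was introduced when 0.1 was stored as float32 then becomes visible as extra decimals.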
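The conversion described above can be reproduced on a single value, without pandas. This is only a small illustrative sketch; the variable names `x32` and `as_python_float` are made up for the example.

```python
import numpy as np

x32 = np.float32(0.1)          # nearest 32-bit float to 0.1
as_python_float = float(x32)   # widened to Python's 64-bit float; the value itself is unchanged

print(str(x32))                # 0.1
print(repr(as_python_float))   # 0.10000000149011612
print(as_python_float == 0.1)  # False: 0.1 stored as a 64-bit float is a different number
print(np.float32(as_python_float) == x32)  # True: widening did not lose anything
```

The second printed value matches the 'col2' entry in the float32 output of the question, which suggests the extra digits come from displaying a float32 value at 64-bit precision rather than from to_dict() itself.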

For a conversion that keeps the values as they are displayed, you have to convert your numbers to strings with astype() before calling to_dict():

import pandas as pd

_data = {'col1': [1.45123, 1.64123], 'col2': [0.1, 0.2]}
_test = pd.DataFrame(_data).astype(dtype='float32')
_test.round(1).astype('str').to_dict(orient='records')

_test.round(1).astype('str').to_dict(orient='records')=[{'col1': '1.5', 'col2': '0.1'}, {'col1': '1.6', 'col2': '0.2'}]

Another possibility is to use a decimal format.
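The answer does not say which decimal format it has in mind. Assuming it means Python's standard `decimal.Decimal` type, a minimal sketch could build the Decimal values from the string records shown earlier. The names `text_records` and `decimal_records` are hypothetical.

```python
from decimal import Decimal

import pandas as pd

_data = {'col1': [1.45123, 1.64123], 'col2': [0.1, 0.2]}
_test = pd.DataFrame(_data).astype(dtype='float32')

# Go through strings first: Decimal('0.1') is exact, whereas building a
# Decimal directly from the widened float would carry the extra digits along.
text_records = _test.round(1).astype('str').to_dict(orient='records')
decimal_records = [
    {column: Decimal(text) for column, text in record.items()}
    for record in text_records
]
print(decimal_records)
```

Decimal values compare and print exactly as written, but they are Python objects, so this only makes sense for the exported records and not as a storage type for a large dataframe.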
