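I have a DataFrame that contains datetime values and arrays, and possibly other data types in the future. I want to write it to PostgreSQL with to_sql so that the datetime column becomes timestamp without time zone and the array columns become bytea, but I don't know what to pass for the dtype argument.
Is there a way to build dtype dynamically from the data types of the DataFrame's columns?
What the table looks like: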
CREATE TABLE IF NOT EXISTS data2_mcw_tmp (
    time timestamp without time zone,
    u bytea,
    v bytea,
    w bytea,
    spd bytea,
    dir bytea,
    temp bytea
);
My code so far (after help from user rftr):
dtypedict = {}
Data2_mcw_conv = Data2_mcw.copy()
for row in Data2_mcw_conv.index:
    for col in Data2_mcw_conv.columns:
        value = Data2_mcw_conv[col][row]
        try:
            if type(Data2_mcw_conv[col].iloc[0]).__module__ == np.__name__:
                dtypedict.update({col:BYTEA})
                value = Data2_mcw_conv[col].loc[row]
                print('before: ')
                print (value.flags)
                print('---------------------')
                value = value.copy(order='C')
                print('after: ')
                print (value.flags)
                print('=====================')
                value = pickle.dumps(value)
        except:
            if isinstance(Data2_mcw_conv[col].iloc[0], datetime.date):
                dtypedict.update({col:TIMESTAMP})
        Data2_mcw_conv[col][row] = value
Data2_mcw_conv.to_sql(name='data2_mcw_tmp',con=conn,
                      if_exists = 'replace',
                      dtype=dtypedict)
However, I get this error:
Traceback (most recent call last):
File "C:\Users\myname\Desktop\database\pickletopdb2.py", line 145, in <module>
postgres_conv()
File "C:\Users\myname\Desktop\database\pickletopdb2.py", line 124, in postgres_conv
Data2_mcw_conv.to_sql(name='data2_mcw_tmp',con=conn,
File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 2778, in to_sql
sql.to_sql(
File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 590, in to_sql
pandas_sql.to_sql(
File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 1397, in to_sql
table.insert(chunksize, method=method)
File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 831, in insert
exec_insert(conn, keys, chunk_iter)
File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 748, in _execute_insert
conn.execute(self.table.insert(), data)
File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1286, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "C:\Python38\lib\site-packages\sqlalchemy\sql\elements.py", line 325, in _execute_on_connection
return connection._execute_clauseelement(
File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1478, in _execute_clauseelement
ret = self._execute_context(
File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1842, in _execute_context
self._handle_dbapi_exception(
File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 2027, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "C:\Python38\lib\site-packages\sqlalchemy\util\compat.py", line 207, in raise_
raise exception
File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1779, in _execute_context
self.dialect.do_executemany(
File "C:\Python38\lib\site-packages\sqlalchemy\dialects\postgresql\psycopg2.py", line 951, in do_executemany
context._psycopg2_fetched_rows = xtras.execute_values(
File "C:\Python38\lib\site-packages\psycopg2\extras.py", line 1267, in execute_values
parts.append(cur.mogrify(template, args))
ValueError: ndarray is not C-contiguous
Output of value.flags before/after value = value.copy(order='C'):
before:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
---------------------
after:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
=====================
Why does this error occur, and how can I fix it?
A uj5u.com user replied:
Follow the second code snippet of this answer, and replace its dtypes with their PostgreSQL equivalents from here. So in your case, for example:
from sqlalchemy.dialects.postgresql import BYTEA, TIMESTAMP

def sqlcol(dfparam):
    # ...
    if "datetime" in str(j):
        dtypedict.update({i: TIMESTAMP})
    if "object" in str(j):  # Depending on what your other columns' datatypes are
        dtypedict.update({i: BYTEA})
    # ...
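For reference, here is one way the elided parts of the snippet could be filled in. This is only a sketch: the loop over `dfparam.columns` and `dfparam.dtypes`, the returned dictionary, and the small sample frame are assumptions made for illustration, not the exact code of the answer being referred to.

```python
import pandas as pd
from sqlalchemy.dialects.postgresql import BYTEA, TIMESTAMP


def sqlcol(dfparam):
    """Map each column name to a PostgreSQL type based on its pandas dtype."""
    dtypedict = {}
    # Assumed loop: i is the column name, j is the column's dtype.
    for i, j in zip(dfparam.columns, dfparam.dtypes):
        if "datetime" in str(j):
            dtypedict[i] = TIMESTAMP
        elif "object" in str(j):
            # Only valid if every object column really holds bytes.
            dtypedict[i] = BYTEA
    return dtypedict


# Hypothetical sample data: one datetime column, one column of bytes.
df = pd.DataFrame({
    "time": pd.to_datetime(["2021-01-01 00:00:00", "2021-01-01 00:10:00"]),
    "u": [b"\x00\x01", b"\x02\x03"],
})
dtypes = sqlcol(df)
print(dtypes)
```

The resulting dictionary could then be passed as `dtype=sqlcol(df)` in the `to_sql` call. Columns whose dtype matches neither branch are left out, so pandas would fall back to its default type for them.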
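Notes:
- According to the docs, TIMESTAMP is without time zone by default.
- Your bytea columns are usually represented with the object dtype in pandas. Keep this in mind if you add more data in the future.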
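On the error itself: the message "ndarray is not C-contiguous" is raised when a raw NumPy array is handed to the driver as binary data, which suggests that at least one cell still held an ndarray rather than pickled bytes when `to_sql` ran. One possible cause (an assumption, not confirmed in this thread) is that the chained assignment `Data2_mcw_conv[col][row] = value` did not write the converted value back into the frame. A minimal sketch that converts whole columns instead is shown below; the helper names `to_bytes` and `serialize_arrays` and the sample data are hypothetical.

```python
import pickle

import numpy as np
import pandas as pd


def to_bytes(value):
    """Pickle NumPy arrays into bytes; leave every other value untouched."""
    if isinstance(value, np.ndarray):
        # ascontiguousarray copies only when the array is not C-contiguous.
        return pickle.dumps(np.ascontiguousarray(value))
    return value


def serialize_arrays(df):
    """Return a copy of df in which every ndarray cell is replaced by bytes."""
    out = df.copy()
    for col in out.columns:
        if out[col].dtype == object:
            # Replace the whole column at once instead of cell by cell.
            out[col] = out[col].map(to_bytes)
    return out


# Hypothetical sample: an F-contiguous array stored in an object column.
arr_f = np.asfortranarray(np.arange(6.0).reshape(2, 3))
cells = np.empty(2, dtype=object)
cells[0] = arr_f
cells[1] = arr_f.copy()
df = pd.DataFrame({
    "time": pd.to_datetime(["2021-01-01 00:00:00", "2021-01-01 00:10:00"]),
    "u": cells,
})

conv = serialize_arrays(df)
restored = pickle.loads(conv["u"].iloc[0])
print(type(conv["u"].iloc[0]), restored.flags["C_CONTIGUOUS"])
```

Assigning a full column avoids relying on a chained per-cell assignment, and the original frame keeps its arrays. When reading the table back, each bytea value would have to be passed through `pickle.loads` to recover the array.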
