我在 PostgreSQL 資料庫中有數千行重復資料。要找出重復的行,我正在使用以下代碼:
SELECT "Date" FROM stockdata
group by "Date"
having count("Date")>1
這再次產生了數千行日期列,其中包含超過 1 個條目。如何洗掉帶有日期的行,以便僅保留 1 個重復項。
PS 輸入資料時我不能使用主鍵。
更新
根據評論。沒有主鍵。日期也是唯一的,因此不能有 2 個或更多。df 看起來像這樣:
Date High Low Open Close Volume Adj Close
0 2017-04-03 893.489990 885.419983 888.000000 891.510010 3422300 891.510010
1 2017-04-04 908.539978 890.280029 891.500000 906.830017 4984700 906.830017
2 2017-04-05 923.719971 905.619995 910.820007 909.280029 7508400 909.280029
3 2017-04-06 917.190002 894.489990 913.799988 898.280029 6344100 898.280029
4 2017-04-07 900.090027 889.309998 899.650024 894.880005 3710900 894.880005
... ... ... ... ... ... ... ...
12595 2022-03-28 1097.880005 1053.599976 1065.099976 1091.839966 34168700 1091.839966
12596 2022-03-29 1114.770020 1073.109985 1107.989990 1099.569946 24538300 1099.569946
12597 2022-03-30 1113.949951 1084.000000 1091.170044 1093.989990 19955000 1093.989990
12598 2022-03-31 1103.140015 1076.640015 1094.569946 1077.599976 16265600 1077.599976
12599 2022-04-01 1094.750000 1066.640015 1081.150024 1076.352783 11449987 1076.352783
12600 rows × 7 columns
- 資料在一些地方重復了幾次。
- 但是,具有相同日期的行具有相同的資料。
- This data is not a stock data (i am using it as a troubleshoot example) but from yokogawa datalogger. https://www.yokogawa.com/in/solutions/products-platforms/data-acquisition/data-logger/#Overview There are redundancies in the system and the earlier integrator had just dumped all the data on 1 database and thus if redundant logger comes online, the database has multiple entries. I need to remove it so we can actually use the data. I don't have access to their software.
Further Update:
Using this code as suggested in the comments:
delete from stockdata s
using
(SELECT "Date" , max(ctid) as max_ctid from stockdata group by "Date") t
where s.ctid<>t.max_ctid
and s."Date"=t."Date";
It was able to do the job but going forward, is this dangerous solution for production?
uj5u.com熱心網友回復:
這應該可以解決問題:
DELETE FROM
stockdata a
USING stockdata b
WHERE
a.id < b.id
AND a.Date = b.Date;
但要小心,這將立即洗掉所有重復項。沒有辦法恢復它們。
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/454603.html
標籤:postgresql
下一篇:TypeORM的型別不匹配
