在這個例子中,我有大量的 ID(1.6m ),在每條規則中都有一個 0/??1 值。
示例表:
| ID | Rule 1 | Rule 2 | Rule 3 | Rule 4 |.....
| :-----: | :----: | :----: | :----: | :----: |
| 12721 | 1 | 1 | 0 | 1 |
| 12722 | 1 | 1 | 1 | 1 |
| 12723 | 0 | 1 | 0 | 0 |
| 12724 | 0 | 1 | 0 | 0 |
| 12725 | 0 | 0 | 0 | 0 |
| 12726 | 1 | 1 | 0 | 1 |
| 12727 | 0 | 1 | 1 | 1 |
.......
我正在嘗試創建一個“關系矩陣”(我不知道它的正確名稱),即兩列共享相同值的頻率。
期望的結果:
| ID | Rule 1 | Rule 2 | Rule 3 | Rule 4 |.....
| :-----: | :----: | :----: | :----: | :----: |
| Rule 1 | - | 3 | 1 | 3 |
| Rule 2 | 3 | - | 2 | 4 |
| Rule 3 | 1 | 2 | - | 2 |
| Rule 4 | 3 | 4 | 2 | - |
.....
我知道我需要使用某種 PIVOT 函式,但我什至不知道我需要開始的確切方法。
我已經有了;
SELECT COUNT(*), SUM(Rule_1), SUM(Rule_2), SUM(Rule_3), SUM(Rule_4)...
FROM TABLE
對于我正在處理的作品的另一個要求
uj5u.com熱心網友回復:
UNPIVOT 的序列,在條件上將 UNPIVOT 資料與自身連接,通過在 2 列上分組包含規則的 COUNT,然后 PIVOT 得到最終結果:(我在 ORACLE 中測驗,但它是 ANSI SQL,所以它應該像在 SQLServer 中,我剛剛洗掉了“從雙重”)
WITH data(id, r1, r2, r3, r4) AS (
SELECT 12721, 1, 1, 0, 1 UNION ALL
SELECT 12722, 1, 1, 1, 1 UNION ALL
SELECT 12723, 0, 1, 0, 0 UNION ALL
SELECT 12724, 0, 1, 0, 0 UNION ALL
SELECT 12725, 0, 0, 0, 0 UNION ALL
SELECT 12726, 1, 1, 0, 1 UNION ALL
SELECT 12727, 0, 1, 1, 1
),
udata AS (
SELECT * FROM data
UNPIVOT (
val for col in (r1, r2, r3, r4)
)
),
sdata AS (
SELECT s1.id, s1.col as col1, s2.col as col2, s1.val
FROM udata s1
JOIN udata s2 ON s1.id = s2.id AND s1.val = s2.val AND s1.col <> s2.col AND s1.val = 1
),
cdata AS (
SELECT col1, col2, COUNT(*) AS c
FROM sdata
GROUP BY col1, col2
)
SELECT * FROM cdata
PIVOT (
MAX(c) FOR col2 IN ('R1' as r1, 'R2' as r2, 'R3' as r3, 'R4' as r4)
)
ORDER by col1
;
R1 3 1 3
R2 3 2 4
R3 1 2 2
R4 3 4 2
轉載請註明出處,本文鏈接:https://www.uj5u.com/qianduan/530324.html
標籤:sql亚马逊红移
