我有兩張桌子
showroom
============================================
model_id car_name is_available
--------------------------------------------
0 rav4 false
1 tacoma false
2 corolla false
3 tundra false
和
warehouse
========================
model_id car_name
------------------------
0 rav4
1 tacoma
主鍵是model_id. 我想列更新showroom.is_available為true,如果model_id出現在warehouse。所以根據上面的例子輸出將是
showroom
============================================
model_id car_name is_available
--------------------------------------------
0 rav4 true
1 tacoma true
2 corolla false
3 tundra false
我知道我可以
UPDATE showroom
SET is_available = true
WHERE model_id IN (
SELECT model_id
FROM warehouse
);
但是兩個表都有超過 10 億行,即使我使用的是 Spark SQL,查詢也很慢
uj5u.com熱心網友回復:
SELECT
a.model_id
,a.car_name
,CASE WHEN b.model_id IS NOT NULL THEN TRUE ELSE FALSE END is_available
FROM
showroom a
LEFT OUTER JOIN warehouse b ON (a.model_id = b.model_id)
uj5u.com熱心網友回復:
在大表上,連接比 IN 子句快
看看你能不能#t減少這個數字
對于陳列室,您可以在 (model_id,is_available) 上使用組合索引
當您還可以減少可能的候選人數量時
UPDATE showroom s
INNER JOIN warehouse w ON s.model_id = w.model_id
SET s.is_available = true
WHERE s.is_available = FALSE
轉載請註明出處,本文鏈接:https://www.uj5u.com/qukuanlian/339041.html
