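Hi, I have a table with about 1 billion rows. With ORDER BY on an indexed column, fetching 30 records with LIMIT takes about 3 seconds, whereas without ORDER BY it takes 195 ms. I would like to speed this up.
Can anyone help me with this?
Here is a simplified version of the query (some columns and a few joins removed).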
SELECT DISTINCT `auctions_opportunity`.`id`,
`auctions_opportunity`.`employer_id`,
`auctions_opportunity`.`salary`,
`auctions_opportunity`.`is_active`,
`auctions_opportunity`.`is_interested`,
`auctions_opportunity`.`interview_status`,
`auctions_opportunity`.`previous_interview_status`,
`auctions_opportunity`.`creation_source`,
`auctions_opportunity`.`created_at`,
`auctions_opportunity`.`last_modified`,
`auctions_opportunity`.`last_instant_alert_email_at`,
`auctions_opportunity`.`last_daily_alert_email_at`,
`auctions_opportunity`.`last_periodic_alert_email_at`,
`auctions_opportunity`.`reviewed_at`,
`auctions_opportunity`.`last_modified_by_id`,
`auctions_opportunity`.`candidate_id`,
`auctions_opportunity`.`job_id`,
`auctions_opportunity`.`interview_request_notes`,
`auctions_opportunity`.`application_email_at`,
`auctions_opportunity`.`batch_application_email_at`,
`auctions_opportunity`.`is_location_match`,
`auctions_opportunity`.`is_strong_match`,
`auctions_opportunity`.`score`,
`auctions_opportunity`.`es_score`,
`auctions_opportunity`.`message`,
`auctions_opportunity`.`es_maybe`,
`candidates_candidate`.`id`,
`candidates_candidate`.`user_id`,
`candidates_candidate`.`phone`,
`candidates_candidate`.`is_new`,
`candidates_candidate`.`last_seen`,
`candidates_candidate`.`last_agreed_to_terms_at`,
`candidates_candidate`.`email_backend_status`,
`candidates_candidate`.`email_suppressed_at`,
`candidates_candidate`.`email_verified_at`,
`candidates_candidate`.`number_verified_at`,
`candidates_candidate`.`last_emailed_at`,
`candidates_candidate`.`last_updated`,
`candidates_candidate`.`internal_note`,
`candidates_candidate`.`main_skills`,
`candidates_candidate`.`main_skills_nopunc`,
`candidates_candidate`.`total_experience`,
`candidates_candidate`.`current_company`,
`candidates_candidate`.`current_company_nopunc`,
`candidates_candidate`.`current_designation`,
`candidates_candidate`.`onboarding_completed_at`,
`candidates_candidate`.`previously_onboarded`,
`candidates_candidate`.`talent_advocate_id`,
`candidates_candidate`.`job_function_skills`,
`candidates_candidate`.`availability`,
`candidates_candidate`.`deactivated_at`,
`candidates_candidate`.`deactivated_by_id`,
`candidates_candidate`.`deactivation_source`,
`candidates_candidate`.`is_private`,
`candidates_candidate`.`previous_companies`,
`candidates_candidate`.`companies_interned_at`,
`candidates_candidate`.`gender`,
`candidates_candidate`.`skype_id`,
`candidates_candidate`.`alternate_phone`,
`candidates_candidate`.`shadow_linkedin`,
`candidates_candidate`.`last_job_post_email`,
`candidates_candidate`.`last_job_alert_email`,
`candidates_candidate`.`last_joining_email_sent`,
`candidates_candidate`.`last_hired_email_sent`,
`candidates_candidate`.`last_reonboarding_email`,
`candidates_candidate`.`last_indexed_at`,
`candidates_candidate`.`resume_viewed_notification_type`,
`candidates_candidate`.`last_resume_viewed_email_at`,
`candidates_candidate`.`seen_go_premium_modal_at`,
`candidates_candidate`.`seen_active_check_modal_at`,
`candidates_candidate`.`alerts_limit_reached_at`,
`candidates_candidate`.`calculation_done_at`,
`candidates_candidate`.`calculation_attempted_at`,
`candidates_candidate`.`bio`,
`candidates_candidate`.`last_seen_activity_at`,
`candidates_candidate`.`job_unsubscribed_at`,
`candidates_candidate`.`monthly_alerts_unsubscribed_at`,
`candidates_candidate`.`resume_views_unsubscribed_at`,
`candidates_candidate`.`update_preferences_unsubscribed_at`,
`candidates_candidate`.`onboarding_reminder_unsubscribed_at`,
`candidates_candidate`.`is_hireable`,
`candidates_candidate`.`recruiter_message_push_unsubscribed_at`,
`candidates_candidate`.`resume_views_email_unsubscribed_at`,
`candidates_candidate`.`resume_views_push_unsubscribed_at`,
`candidates_candidate`.`profile_reminder_email_unsubscribed_at`,
`candidates_candidate`.`profile_reminder_push_unsubscribed_at`,
`candidates_candidate`.`newsletter_email_unsubscribed_at`,
`candidates_candidate`.`newsletter_push_unsubscribed_at`,
`candidates_candidate`.`product_updates_email_unsubscribed_at`,
`candidates_candidate`.`product_updates_push_unsubscribed_at`,
`candidates_candidate`.`push_notifications_shown_at`,
`candidates_candidate`.`push_notifications_enabled`,
`candidates_candidate`.`push_notifications_verified_at`,
`candidates_candidate`.`whatsapp_number`,
`candidates_candidate`.`whatsapp_enabled`,
`candidates_candidate`.`whatsapp_verified_at`,
`candidates_candidate`.`last_whatsapp_sent_at`,
`auth_user`.`id`,
`auth_user`.`password`,
`auth_user`.`last_login`,
`auth_user`.`is_superuser`,
`auth_user`.`username`,
`auth_user`.`first_name`,
`auth_user`.`last_name`,
`auth_user`.`email`,
`auth_user`.`is_staff`,
`auth_user`.`is_active`,
`auth_user`.`date_joined`
FROM `auctions_opportunity`
INNER JOIN `employers_employer` ON (`auctions_opportunity`.`employer_id` = `employers_employer`.`id`)
INNER JOIN `jobs_job` ON (`auctions_opportunity`.`job_id` = `jobs_job`.`id`)
INNER JOIN `profiles_profilestatus` ON (`employers_employer`.`status_id` = `profiles_profilestatus`.`id`)
INNER JOIN `candidates_candidate` ON (`auctions_opportunity`.`candidate_id` = `candidates_candidate`.`id`)
INNER JOIN `auth_user` ON (`candidates_candidate`.`user_id` = `auth_user`.`id`)
INNER JOIN `candidates_resume` ON (`candidates_candidate`.`id` = `candidates_resume`.`candidate_id`)
LEFT OUTER JOIN `candidates_jobsearchpreferences` ON (`candidates_candidate`.`id` = `candidates_jobsearchpreferences`.`candidate_id`)
LEFT OUTER JOIN `candidates_customaction` ON (`auctions_opportunity`.`id` = `candidates_customaction`.`opportunity_id`)
LEFT OUTER JOIN `candidates_emailaction` ON (`auctions_opportunity`.`id` = `candidates_emailaction`.`opportunity_id`)
LEFT OUTER JOIN `candidates_saveaction` ON (`auctions_opportunity`.`id` = `candidates_saveaction`.`opportunity_id`)
LEFT OUTER JOIN `candidates_hideaction` ON (`auctions_opportunity`.`id` = `candidates_hideaction`.`opportunity_id`)
LEFT OUTER JOIN `candidates_hireaction` ON (`auctions_opportunity`.`id` = `candidates_hireaction`.`opportunity_id`)
WHERE (`auctions_opportunity`.`employer_id` = 4
AND NOT (`auctions_opportunity`.`job_id` IS NULL)
AND `profiles_profilestatus`.`name` = Approved
AND NOT (`auth_user`.`email` = deactivated@blobinfotech.com)
AND NOT (`candidates_candidate`.`availability` = 3)
AND NOT (`auctions_opportunity`.`job_id` IS NULL)
AND NOT (`candidates_resume`.`id` IS NULL)
AND (`jobs_job`.`is_active` = TRUE
AND `auctions_opportunity`.`is_active` = TRUE)
AND ((((`candidates_candidate`.`is_hireable` = TRUE
AND `candidates_candidate`.`last_seen` >= 2020-12-18 09:19:42.873898
AND `candidates_jobsearchpreferences`.`status` = 0)
OR (`candidates_candidate`.`is_hireable` = TRUE
AND (NOT (`candidates_jobsearchpreferences`.`status` = 0)
OR (`candidates_candidate`.`last_seen` < 2020-12-18 09:19:42.873989
AND `candidates_jobsearchpreferences`.`status` = 0))))
AND `candidates_candidate`.`is_private` = FALSE)
OR `auctions_opportunity`.`interview_status` = 1
OR NOT (`auctions_opportunity`.`id` IN
(SELECT U0.`id` AS Col1
FROM `auctions_opportunity` U0
LEFT OUTER JOIN `candidates_emailaction` U1 ON (U0.`id` = U1.`opportunity_id`)
WHERE (U1.`reply_email_at` IS NULL
AND U0.`id` = (`auctions_opportunity`.`id`)))))
AND NOT (`auctions_opportunity`.`candidate_id` IN
(SELECT U2.`candidate_id` AS Col1
FROM `candidates_blockedemployerintermediate` U2
WHERE U2.`employer_id` = 4))
AND `auctions_opportunity`.`job_id` IN (40729)
AND NOT (((`auctions_opportunity`.`creation_source` = 8
AND `auctions_opportunity`.`creation_source` IS NOT NULL)
OR (`auctions_opportunity`.`interview_status` = 1
AND `auctions_opportunity`.`is_active` = FALSE)))
AND ((((`candidates_candidate`.`is_hireable` = TRUE
AND (NOT (`candidates_jobsearchpreferences`.`status` = 0
AND `candidates_jobsearchpreferences`.`status` IS NOT NULL)
OR (`candidates_candidate`.`last_seen` < 2020-12-18 09:19:42.863410
AND `candidates_jobsearchpreferences`.`status` = 0)))
OR (`candidates_candidate`.`is_hireable` = TRUE
AND `candidates_jobsearchpreferences`.`status` = 2))
AND `candidates_candidate`.`is_private` = FALSE
AND `auctions_opportunity`.`interview_status` = 0
AND (`auctions_opportunity`.`is_location_match` = TRUE
OR ((`candidates_jobsearchpreferences`.`current_location` LIKE %Delhi%
OR `candidates_jobsearchpreferences`.`current_location` LIKE %Noida%
OR `candidates_jobsearchpreferences`.`current_location` LIKE %Gurgaon%
OR `candidates_jobsearchpreferences`.`current_location` LIKE %Faridabad%
OR `candidates_jobsearchpreferences`.`current_location` LIKE %Greater Noida%)
AND (`jobs_job`.`locations` LIKE %Delhi%
OR `jobs_job`.`locations` LIKE %Noida%
OR `jobs_job`.`locations` LIKE %Gurgaon%
OR `jobs_job`.`locations` LIKE %Faridabad%
OR `jobs_job`.`locations` LIKE %Greater Noida%)
AND `jobs_job`.`accept_outstation` = FALSE)))
OR (((`candidates_candidate`.`is_hireable` = TRUE
AND (NOT (`candidates_jobsearchpreferences`.`status` = 0
AND `candidates_jobsearchpreferences`.`status` IS NOT NULL)
OR (`candidates_candidate`.`last_seen` < 2020-12-18 09:19:42.863410
AND `candidates_jobsearchpreferences`.`status` = 0)))
OR (`candidates_candidate`.`is_hireable` = TRUE
AND `candidates_jobsearchpreferences`.`status` = 2))
AND `auctions_opportunity`.`reviewed_at` < 2020-12-18 09:19:42.863329
AND `auctions_opportunity`.`interview_status` = 1))
AND `auctions_opportunity`.`employer_id` = 4
AND `candidates_customaction`.`id` IS NULL
AND `candidates_emailaction`.`id` IS NULL
AND `candidates_saveaction`.`id` IS NULL
AND (`candidates_hideaction`.`id` IS NULL
OR `candidates_hideaction`.`is_deleted` = TRUE)
AND `candidates_hireaction`.`id` IS NULL
AND `auctions_opportunity`.`employer_id` = 4)
ORDER BY `auctions_opportunity`.`reviewed_at` DESC
LIMIT 30
Answer from a uj5u.com user:
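The general answer is simple. If you take a phone book and are asked to name 30 phone numbers where the person's street name contains an A (LIMIT 30 without ORDER BY), how long will that take you? A few minutes, probably. Now you are asked to name the 30 lowest phone numbers where the person's street name contains an A (LIMIT 30 with ORDER BY). You first have to find all people with such street names, sort their numbers, and then take the first thirty. How long will that take in comparison?
So there is a big difference between the run times with and without ORDER BY. This is expected.
In your case, the DBMS may pick only rows with is_location_match = TRUE and thereby avoid the expensive LIKE location searches, for example (if that is enough to find 30 rows). It cannot do this when it has to consider all matches in order to find the top 30 rows specified by ORDER BY.
Back to the phone book: if the phone book has only 50 entries, the task is not hard. If it has a million entries, searching it takes a long time. So working on small data sets must be our goal. You are joining the table candidates_resume without using any of its columns. (The condition AND NOT candidates_resume.id IS NULL is redundant and meaningless.) You also join all candidates_hideaction rows where is_deleted = TRUE. Suppose each candidate has 3 resumes and 5 deleted hide actions. Then you blow up the data set by a factor of 15 (3 x 5), only to collapse it again later with DISTINCT. DISTINCT on a large data set is a very expensive operation. It is also a typical indicator of a poorly written query, as in your case: unnecessary joins blow up your intermediate results, and instead of fixing the joins you apply DISTINCT at the end.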
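The 3 x 5 blow-up described above can be reproduced with a few throwaway rows. This is only a sketch: the derived tables c, r and h are hypothetical stand-ins for one candidate, three resumes and five hide actions, and a single candidate_id key is used for all of them to keep the example short (in the question, hide actions are keyed by opportunity_id).

```sql
-- Joining both child tables multiplies the rows: 3 resumes x 5 hide actions.
SELECT COUNT(*) AS joined_rows
FROM (SELECT 1 AS candidate_id) AS c
JOIN (SELECT 1 AS candidate_id UNION ALL SELECT 1 UNION ALL SELECT 1) AS r
  ON r.candidate_id = c.candidate_id
JOIN (SELECT 1 AS candidate_id UNION ALL SELECT 1 UNION ALL SELECT 1
      UNION ALL SELECT 1 UNION ALL SELECT 1) AS h
  ON h.candidate_id = c.candidate_id;
-- joined_rows = 15

-- Testing for existence keeps one row per candidate, so no DISTINCT is needed.
SELECT COUNT(*) AS rows_with_exists
FROM (SELECT 1 AS candidate_id) AS c
WHERE EXISTS (SELECT 1
              FROM (SELECT 1 AS candidate_id UNION ALL SELECT 1 UNION ALL SELECT 1) AS r
              WHERE r.candidate_id = c.candidate_id);
-- rows_with_exists = 1
```

In the real query the same idea would mean replacing INNER JOIN candidates_resume with an EXISTS test on candidates_resume.candidate_id.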
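You are also using a lot of anti-joins (e.g. LEFT OUTER JOIN candidates_customaction ... WHERE candidates_customaction.id IS NULL). This confused me at first. I wondered why you were cross-joining all those candidate actions until I found the IS NULL conditions. I do not consider these anti-joins readable. Now suppose there are ten actions of each of the four anti-joined action types per candidate. If the DBMS performs the joins, it creates 10,000 (10 x 10 x 10 x 10) rows for a single candidate, all of which must then be dismissed. I would use NOT EXISTS clauses instead, for readability and to possibly avoid needlessly large intermediate results. I would also use NOT EXISTS to check for non-deleted hide actions.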
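A minimal sketch of that rewrite, using the table and column names from the question and leaving out all the other joins and conditions. The short aliases are my own, and the last clause assumes the intended rule is "no hide action that is still active". That is not strictly the same as the original (id IS NULL OR is_deleted = TRUE) test, which also lets through an opportunity that has both a deleted and a non-deleted hide action, so check which behaviour you actually want.

```sql
-- Sketch only: anti-joins from the question expressed as NOT EXISTS.
SELECT o.id, o.candidate_id, o.reviewed_at
FROM auctions_opportunity AS o
WHERE o.employer_id = 4
  AND o.job_id = 40729
  AND NOT EXISTS (SELECT 1 FROM candidates_customaction AS ca
                  WHERE ca.opportunity_id = o.id)
  AND NOT EXISTS (SELECT 1 FROM candidates_emailaction AS ea
                  WHERE ea.opportunity_id = o.id)
  AND NOT EXISTS (SELECT 1 FROM candidates_saveaction AS sa
                  WHERE sa.opportunity_id = o.id)
  AND NOT EXISTS (SELECT 1 FROM candidates_hireaction AS ha
                  WHERE ha.opportunity_id = o.id)
  -- Assumption: only hide actions that are not deleted should exclude a row.
  AND NOT EXISTS (SELECT 1 FROM candidates_hideaction AS hi
                  WHERE hi.opportunity_id = o.id
                    AND hi.is_deleted = FALSE)
ORDER BY o.reviewed_at DESC
LIMIT 30;
```

Because none of these subqueries adds rows, each opportunity appears at most once in this sketch and DISTINCT is not required for this part of the query.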
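That said, my advice is to change the query so that you can drop DISTINCT. This can save the DBMS a lot of work and may show in the final execution speed.
There are some small things you can do to make the query more readable. You don't have to state three times that you want employer_id = 4 :-) Then, status = 0 AND status IS NOT NULL is merely status = 0, of course. And NOT (status = 0 AND status IS NOT NULL) OR (last_seen < xxx AND status = 0) is hence just status <> 0 OR last_seen < xxx, at least as long as status is not NULL (for a NULL status the original expression is true, whereas status <> 0 is not). This won't speed up the query, but it makes it more readable and maintainable, and maybe you even wanted to treat a NULL status specially and made a mistake that becomes more obvious while cleaning up the conditions.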
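Whether the simplified condition is safe depends on how NULL is handled, and status comes from a LEFT JOINed table in the question, so NULL can occur. The following self-contained check (the derived table s is just three sample values, not part of the original schema) compares the two forms side by side:

```sql
-- Compare the original predicate with the simplified one for status 0, 2 and NULL.
SELECT s.status,
       NOT (s.status = 0 AND s.status IS NOT NULL) AS original_form,
       (s.status <> 0)                             AS simplified_form
FROM (SELECT 0 AS status UNION ALL SELECT 2 UNION ALL SELECT NULL) AS s;
-- status 0    -> original_form 0, simplified_form 0
-- status 2    -> original_form 1, simplified_form 1
-- status NULL -> original_form 1, simplified_form NULL
```

The two forms agree for every non-NULL status. For a NULL status the original keeps the row, while status <> 0 OR last_seen < xxx keeps it only when last_seen < xxx is true, so the rewrite is only a pure simplification if NULL statuses cannot occur or should be filtered out.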
Answer from a uj5u.com user:
While ORDER BY may seem to be the culprit, it is quite possible that something else is causing the poor performance.
OR in WHERE and ON should be avoided where practical.
NOT x IN ( SELECT ... ) should be turned into either NOT EXISTS ( SELECT 1 ... ) or LEFT JOIN ( SELECT y ... ) ... WHERE y IS NULL.
Some possible indexes to help:
auctions_opportunity: INDEX(employer_id, interview_status, job_id)
candidates_saveaction: INDEX(opportunity_id, id)
profiles_profilestatus: INDEX(name)
candidates_jobsearchpreferences: INDEX(status)
U1: INDEX(reply_email_at)
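As a sketch, the index suggestions above could be created with statements like the following. The index names are made up for illustration, and U1 is taken to be candidates_emailaction because that is the table the alias refers to in the question's subquery.

```sql
-- Hypothetical index names; column lists are the ones suggested above.
ALTER TABLE auctions_opportunity
  ADD INDEX idx_opp_employer_status_job (employer_id, interview_status, job_id);
ALTER TABLE candidates_saveaction
  ADD INDEX idx_saveaction_opp (opportunity_id, id);
ALTER TABLE profiles_profilestatus
  ADD INDEX idx_profilestatus_name (name);
ALTER TABLE candidates_jobsearchpreferences
  ADD INDEX idx_jobsearchprefs_status (status);
-- U1 in the question is an alias for candidates_emailaction.
ALTER TABLE candidates_emailaction
  ADD INDEX idx_emailaction_reply (reply_email_at);
```

Building an index on a table with around a billion rows takes time and space, so it is worth running EXPLAIN on the query before and after to confirm that the optimizer actually picks the new index.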
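Based on the full version of the query:
- The 6 LEFT JOINs that are part of the filtering may be part of the performance problem. Can the columns being tested be moved into the main table?
- There are about two dozen ORs. Each one makes it less likely that an index can be used for whatever is being tested. For most of them, I have no constructive suggestion.
- status = 0 implies status IS NOT NULL. I suspect some of the complex boolean expressions can be simplified.
- There is a column that can be NULL or 0, but distinguishing these two "values" leads to an unnecessarily complex query. Pick either NULL or 0, then map both to the chosen value when inserting data.
- There are redundant tests such as AND NOT (auctions_opportunity.job_id IS NULL). Sure, the optimizer may filter out the extra work, but let's not trust it.
- Can the flags is_hireable, is_private and availability be combined in some way that makes the query simpler while still providing the details?
- AND NOT (auth_user.email = deactivated@blobinfotech.com) is a syntax error (missing quotes). Are there others?
- Instead of several LIKE '%...%' tests, a single RLIKE '...|...' would probably be faster.
- Do you need candidates_resume? It does not seem to be used.
- The use of U1 seems strange. The LEFT JOIN is ON opportunity_id, yet the IS NULL test is on a different column (reply_email_at). This may not work the way you expect.
Some of the following indexes may help:
auctions_opportunity: INDEX(employer_id, job_id, reviewed_at)
jobs_job: INDEX(is_active, locations, accept_outstation, id)
profiles_profilestatus: INDEX(name, id)
candidates_jobsearchpreferences: INDEX(candidate_id, status, current_location)
U1 (candidates_emailaction): INDEX(opportunity_id, reply_email_at)
U2 (candidates_blockedemployerintermediate): INDEX(employer_id, candidate_id)
Change
AND NOT (`auctions_opportunity`.`candidate_id` IN
(SELECT U2.`candidate_id` AS Col1
FROM `candidates_blockedemployerintermediate` U2
WHERE U2.`employer_id` = 4))
to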
AND NOT EXISTS( SELECT 1
FROM `candidates_blockedemployerintermediate` U2
WHERE `auctions_opportunity`.`candidate_id` = U2.`candidate_id`
AND U2.`employer_id` = 4 )
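The earlier suggestion to replace several LIKE '%...%' tests with one RLIKE can be tried out on a few sample strings. This is a sketch with a made-up derived table t. The pattern omits 'Greater Noida' because any string containing it already matches 'Noida'. Whether RLIKE is really faster should be measured, since neither form can use an ordinary index when the match may start anywhere in the string.

```sql
-- Compare the five LIKE tests from the question with a single RLIKE alternation.
SELECT t.loc,
       (t.loc LIKE '%Delhi%' OR t.loc LIKE '%Noida%' OR t.loc LIKE '%Gurgaon%'
        OR t.loc LIKE '%Faridabad%' OR t.loc LIKE '%Greater Noida%') AS like_match,
       (t.loc RLIKE 'Delhi|Noida|Gurgaon|Faridabad')                  AS rlike_match
FROM (SELECT 'New Delhi' AS loc
      UNION ALL SELECT 'Greater Noida'
      UNION ALL SELECT 'Mumbai') AS t;
-- 'New Delhi'     -> like_match 1, rlike_match 1
-- 'Greater Noida' -> like_match 1, rlike_match 1
-- 'Mumbai'        -> like_match 0, rlike_match 0
```

In the question's query the same pattern would apply to both candidates_jobsearchpreferences.current_location and jobs_job.locations.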
