Find the most frequent value per group in a table column

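I need to find the most frequent value of object_of_search for each ethnicity. How can I do this? Subqueries in the SELECT clause and correlated subqueries are not allowed. Something like: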

mode() WITHIN GROUP (ORDER BY stopAndSearches.object_of_search) AS "Most frequent object of search"

But this does not aggregate, and it returns many rows for each ethnicity and object_of_search:

 officer_defined_ethnicity | Sas for ethnicity |   Arrest rate    | Most frequent object of search
---------------------------+-------------------+------------------+--------------------------------
 ethnicity2                |                 3 | 66.6666666666667 | Stolen goods
 ethnicity3                |                 2 |              100 | Fireworks
 ethnicity1                |                 5 |               60 | Firearms
 ethnicity3                |                 2 |              100 | Firearms
 ethnicity1                |                 5 |               60 | Cat
 ethnicity1                |                 5 |               60 | Dog
 ethnicity2                |                 3 | 66.6666666666667 | Firearms
 ethnicity1                |                 5 |               60 | Psychoactive substances
 ethnicity1                |                 5 |               60 | Fireworks

It should look like this:

 officer_defined_ethnicity | Sas for ethnicity |   Arrest rate    | Most frequent object of search
---------------------------+-------------------+------------------+--------------------------------
 ethnicity2                |                 3 | 66.6666666666667 | Stolen goods
 ethnicity3                |                 2 |              100 | Fireworks
 ethnicity1                |                 5 |               60 | Firearms

Fiddle with the table.
Query:

SELECT DISTINCT
    stopAndSearches.officer_defined_ethnicity,
    count(stopAndSearches.sas_id) OVER(PARTITION BY stopAndSearches.officer_defined_ethnicity) AS "Sas for ethnicity",
    sum(case when stopAndSearches.outcome = 'Arrest' then 1 else 0 end)
       OVER (PARTITION BY stopAndSearches.officer_defined_ethnicity)::float /
       count(stopAndSearches.sas_id) OVER(PARTITION BY stopAndSearches.officer_defined_ethnicity)::float * 100 AS "Arrest rate",
    mode() WITHIN GROUP (ORDER BY stopAndSearches.object_of_search) AS "Most frequent object of search"
FROM stopAndSearches
GROUP BY stopAndSearches.sas_id, stopAndSearches.officer_defined_ethnicity;

Table:

CREATE TABLE IF NOT EXISTS stopAndSearches(
    "sas_id" bigserial PRIMARY KEY,
    "officer_defined_ethnicity" VARCHAR(255),
    "object_of_search" VARCHAR(255),
    "outcome" VARCHAR(255)
);

Answer from a uj5u.com user:

Update: Fiddle

This should address the specific question of "which object per ethnicity".

Note that this does not handle ties in the count. That was not part of the question/request.

Adjust your SQL to include this logic to provide that detail:

WITH cte AS (
        SELECT officer_defined_ethnicity
             , object_of_search
             , COUNT(*) AS n
             , ROW_NUMBER() OVER (PARTITION BY officer_defined_ethnicity ORDER BY COUNT(*) DESC) AS rn
          FROM stopAndSearches
         GROUP BY officer_defined_ethnicity, object_of_search
     )
SELECT * FROM cte
 WHERE rn = 1
;

Result:

officer_defined_ethnicity	object_of_search	n	rn
ethnicity1	Cat	1	1
ethnicity2	Stolen goods	2	1
ethnicity3	Fireworks	1	1
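
Since the note above says ties are not handled, here is a minimal sketch of one way to return every tied object instead of an arbitrary one. It assumes the table and column names from the question's DDL; the alias rnk is made up for illustration. Swapping ROW_NUMBER() for RANK() gives all rows that share the top count the same rank of 1:

```sql
-- Sketch only: same grouping as the CTE above, but RANK() assigns 1 to
-- every object_of_search that shares the highest count for an ethnicity,
-- so ties produce several rows per ethnicity instead of one arbitrary row.
WITH cte AS (
        SELECT officer_defined_ethnicity
             , object_of_search
             , COUNT(*) AS n
             , RANK() OVER (PARTITION BY officer_defined_ethnicity ORDER BY COUNT(*) DESC) AS rnk  -- rnk: hypothetical alias
          FROM stopAndSearches
         GROUP BY officer_defined_ethnicity, object_of_search
     )
SELECT officer_defined_ethnicity, object_of_search, n
  FROM cte
 WHERE rnk = 1
 ORDER BY officer_defined_ethnicity, object_of_search
;
```

If exactly one row per ethnicity is required, keeping ROW_NUMBER() and adding a second sort key such as object_of_search to its ORDER BY would make the pick deterministic instead.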

Answer from a uj5u.com user:

SELECT DISTINCT ON (1)
       officer_defined_ethnicity, object_of_search, count(*) AS ct
FROM   stop_and_searches
GROUP  BY 1, 2
ORDER  BY 1, 3 DESC, 2;

Or, more explicitly:

SELECT DISTINCT ON (officer_defined_ethnicity)
       officer_defined_ethnicity, object_of_search, count(*) AS ct
FROM   stop_and_searches
GROUP  BY officer_defined_ethnicity, object_of_search
ORDER  BY officer_defined_ethnicity, ct DESC, object_of_search;

 officer_defined_ethnicity | object_of_search | ct
---------------------------+------------------+----
 ethnicity1                | Cat              | 1
 ethnicity2                | Stolen goods     | 2
 ethnicity3                | Firearms         | 1

db<>fiddle here

Since DISTINCT ON is applied after GROUP BY, we only need a single query level.

In summary: get the count per (officer_defined_ethnicity, object_of_search) with GROUP BY.
Then pick the row with the highest count per officer_defined_ethnicity with DISTINCT ON.

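I added object_of_search as the third ORDER BY item as a tiebreaker, to produce a deterministic result:
in case of a tie, the first object_of_search in alphabetical sort order is picked.
Adapt to your needs.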
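
If the other two columns from the question's desired output are also needed, one possible sketch is to group by officer_defined_ethnicity alone and use plain aggregates instead of window functions. This is an assumption about what the asker wants, not part of the original answer. It uses the table name from the question's DDL and relies on the aggregate FILTER clause and mode(), both available in PostgreSQL 9.4 and later. Unlike the DISTINCT ON query, mode() picks arbitrarily among equally frequent values, so results can differ on ties:

```sql
-- Sketch only: one row per ethnicity with no subquery in the SELECT list
-- and no correlated subquery. Grouping by the ethnicity alone (not sas_id)
-- is what lets mode() aggregate over all rows of that ethnicity.
SELECT officer_defined_ethnicity,
       count(*) AS "Sas for ethnicity",
       -- share of rows with outcome = 'Arrest', as a percentage
       100.0 * count(*) FILTER (WHERE outcome = 'Arrest') / count(*) AS "Arrest rate",
       -- most frequent value; ties are resolved arbitrarily by mode()
       mode() WITHIN GROUP (ORDER BY object_of_search) AS "Most frequent object of search"
FROM   stopAndSearches
GROUP  BY officer_defined_ethnicity;
```

Where the tiebreak has to be deterministic, the DISTINCT ON query above remains the safer choice for the last column.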

See:

Select first row in each GROUP BY group?
Best way to get result count before LIMIT was applied

Simpler and typically faster than a subquery with row_number():

Select first row in each GROUP BY group? - Benchmarks

Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/334434.html

Tags: sql PostgreSQL greatest-n-per-group