我想將長文本分成多行;還有其他與此類似的問題,但沒有一個對我有用。我有的
ID | Message
----------------------------------
1 | Very looooooooooooooooong text
2 | Short text
我想做的是每隔 n 個字符將該字串分割一次結果如果n = 15:
Id | Message
------------------------------------------
1 | Very looooooooo
1 | oooooooong text
2 | Short text
如果在 n 個字符后的第一個空格處進行拆分,那就更好了。
我嘗試使用 string_split 和 substring,但找不到任何有效的方法。我想使用類似的東西:
SELECT index, element FROM table, CAST(message AS SUPER) AS element AT index;
但它沒有考慮長度,我不喜歡將 varchar 變數轉換為超級變數。
uj5u.com熱心網友回復:
您可以使用generate_series()來完成此操作:
select m.*, gs.posn, substring(m.message, gs.posn, 15) as split_message
from messages m
cross join lateral generate_series(1, length(message), 15) gs(posn);
在長度之后分割空格有點棘手。我們必須將它們拆分message成單詞,然后弄清楚如何將它們分成組,然后重新聚合。
我無法弄清楚如何在沒有遞回的情況下拆分空間。我希望您不介意它將所有空格視為單詞邊界:
with recursive by_words as (
select m.*, s.n, s.word, length(s.word) as word_len,
max(s.n) over (partition by m.id) as num_words
from messages m
cross join lateral regexp_split_to_table(m.message, '\s ')
with ordinality as s(word, n)
), rejoin as (
select id, n, array[word] as words, word_len as cum_word_len,
word_len >= 15 as keep
from by_words
where n = 1
union all
select p.id, c.n,
case
when p.cum_word_len >= 15 then array[c.word]
else p.words||c.word
end as words,
case
when p.cum_word_len >= 15 then c.word_len
else p.cum_word_len c.word_len 1
end as cum_word_len,
(p.cum_word_len c.word_len 1 >= 15)
or (c.n = c.num_words) as keep
from rejoin p
join by_words c on (c.id, c.n) = (p.id, p.n 1)
)
select id,
row_number() over (partition by id
order by n) as segnum,
array_to_string(words, ' ') as split_message
from rejoin
where keep
order by 1, 2
;
db<>在這里擺弄
編輯添加:
您能否告訴我以下內容是否適用于 Redshift?
with gs as (
select generate_series as posn
from generate_series(1, 150000, 15)
)
select *, substring(m.message, gs.posn, 15) as split_message
from messages m
join gs
on gs.posn <= greatest(1, length(m.message))
order by m.id, gs.posn
;
uj5u.com熱心網友回復:
感謝@Mike Organek 的回答和他的幫助,我找到了一個也適用于 Redshift 的解決方案。
Mike 對 Redshift 的回答中的問題與 Redshiftgenerate_series沒有得到很好的支持有關,所以這里有一個解決方法。
with row as (
select t.*, row_number() over () as x
from table t -- big enough table
limit 100
),
result as
(
select (x-1)*15 1 as posn from row --change 15 to a number to split the long text with
)
select * into gs
from result
然后邁克的回答:
select *, substring(m.feedback from gs.posn for 15) as split_message
from messages m
join gs
on gs.posn <= greatest(1, length(m.message))
order by m.id, gs.posn
轉載請註明出處,本文鏈接:https://www.uj5u.com/net/492357.html
標籤:sql PostgreSQL
