I have a DataFrame containing start and end timestamps that mark the ranges of audio clips. It can be generated like this:
import pandas as pd
df = pd.DataFrame(
{'start':
{0: pd.Timestamp('1900-01-01 00:00:14.373000'), 1: pd.Timestamp('1900-01-01 00:00:16.342000'),2: pd.Timestamp('1900-01-01 00:00:18.743000'), 3: pd.Timestamp('1900-01-01 00:00:21.383000'), 4: pd.Timestamp('1900-01-01 00:00:22.812000')},
'end':
{0: pd.Timestamp('1900-01-01 00:00:16.342000'), 1: pd.Timestamp('1900-01-01 00:00:18.543000'), 2: pd.Timestamp('1900-01-01 00:00:20.712000'), 3: pd.Timestamp('1900-01-01 00:00:22.482000'), 4: pd.Timestamp('1900-01-01 00:00:24.653000')}})
                    start                     end
0 1900-01-01 00:00:14.373 1900-01-01 00:00:16.342
1 1900-01-01 00:00:16.342 1900-01-01 00:00:18.543
2 1900-01-01 00:00:18.743 1900-01-01 00:00:20.712
3 1900-01-01 00:00:21.383 1900-01-01 00:00:22.482
4 1900-01-01 00:00:22.812 1900-01-01 00:00:24.653
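I want to generate a DataFrame that is also filled in with the start and end timestamps that are missing, meaning the ranges for which no entry exists. So it would look like this: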
pd.DataFrame(
{'start':
{0: pd.Timestamp('1900-01-01 00:00:00.000000'), 1: pd.Timestamp('1900-01-01 00:00:14.373000'), 2: pd.Timestamp('1900-01-01 00:00:16.342000'), 3: pd.Timestamp('1900-01-01 00:00:18.543000'), 4: pd.Timestamp('1900-01-01 00:00:20.712000'), 5: pd.Timestamp('1900-01-01 00:00:21.383000'), 6: pd.Timestamp('1900-01-01 00:00:22.482000'), 7: pd.Timestamp('1900-01-01 00:00:22.812000')},
'end':
{0: pd.Timestamp('1900-01-01 00:00:14.373000'), 1: pd.Timestamp('1900-01-01 00:00:16.342000'), 2: pd.Timestamp('1900-01-01 00:00:18.543000'), 3: pd.Timestamp('1900-01-01 00:00:20.712000'), 4: pd.Timestamp('1900-01-01 00:00:21.383000'), 5: pd.Timestamp('1900-01-01 00:00:22.482000'), 6: pd.Timestamp('1900-01-01 00:00:22.812000'), 7: pd.Timestamp('1900-01-01 00:00:24.653000')}})
                    start                     end
0 1900-01-01 00:00:00.000 1900-01-01 00:00:14.373
1 1900-01-01 00:00:14.373 1900-01-01 00:00:16.342
2 1900-01-01 00:00:16.342 1900-01-01 00:00:18.543
3 1900-01-01 00:00:18.543 1900-01-01 00:00:20.712
4 1900-01-01 00:00:20.712 1900-01-01 00:00:21.383
5 1900-01-01 00:00:21.383 1900-01-01 00:00:22.482
6 1900-01-01 00:00:22.482 1900-01-01 00:00:22.812
7 1900-01-01 00:00:22.812 1900-01-01 00:00:24.653
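I couldn't come up with any workable solution other than iterating over the individual rows. What is the best way to do this?
A uj5u.com user replied:
IIUC, you can collect all the unique timestamps and build a new DataFrame from the shifted values: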
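import numpy as np
# Collect every distinct timestamp from both columns, in row order
vals = df[['start', 'end']].stack().unique()
# Prepend 0, which as datetime64 is the Unix epoch (1970-01-01), not 1900-01-01
vals2 = np.concatenate([np.array([0], dtype=vals.dtype), vals])
df2 = pd.DataFrame(zip(vals2, vals), columns=['start', 'end'])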
Output:
                    start                     end
0 1970-01-01 00:00:00.000 1900-01-01 00:00:14.373
1 1900-01-01 00:00:14.373 1900-01-01 00:00:16.342
2 1900-01-01 00:00:16.342 1900-01-01 00:00:18.543
3 1900-01-01 00:00:18.543 1900-01-01 00:00:18.743
4 1900-01-01 00:00:18.743 1900-01-01 00:00:20.712
5 1900-01-01 00:00:20.712 1900-01-01 00:00:21.383
6 1900-01-01 00:00:21.383 1900-01-01 00:00:22.482
7 1900-01-01 00:00:22.482 1900-01-01 00:00:22.812
8 1900-01-01 00:00:22.812 1900-01-01 00:00:24.653
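Note that the first row above starts at 1970-01-01 rather than 1900-01-01, because the prepended 0 is interpreted as the Unix epoch. Below is a minimal sketch of one possible variant, not the answerer's code. It assumes the timeline should begin at midnight of the earliest timestamp's date, and that the clips do not overlap. The function name `fill_ranges`, the `origin` parameter, and the `is_gap` column are hypothetical names introduced only for illustration.

```python
import numpy as np
import pandas as pd


def fill_ranges(df, origin=None):
    """Return contiguous start/end ranges covering the clips and the gaps between them."""
    # Sorted, de-duplicated boundaries taken from both columns.
    bounds = pd.DatetimeIndex(np.unique(df[['start', 'end']].to_numpy().ravel()))
    # Assumption: the timeline begins at midnight of the earliest timestamp's date.
    if origin is None:
        origin = bounds[0].normalize()
    # Only prepend the origin if it lies before the first boundary.
    if origin < bounds[0]:
        bounds = bounds.insert(0, origin)
    # Pair each boundary with the one that follows it.
    out = pd.DataFrame({'start': bounds[:-1], 'end': bounds[1:]})
    # Hypothetical helper column: a range is a gap if it does not begin at a clip start
    # (only meaningful when the clips do not overlap).
    out['is_gap'] = ~out['start'].isin(df['start'])
    return out


# Sample data from the question.
df = pd.DataFrame({
    'start': pd.to_datetime([
        '1900-01-01 00:00:14.373', '1900-01-01 00:00:16.342',
        '1900-01-01 00:00:18.743', '1900-01-01 00:00:21.383',
        '1900-01-01 00:00:22.812',
    ]),
    'end': pd.to_datetime([
        '1900-01-01 00:00:16.342', '1900-01-01 00:00:18.543',
        '1900-01-01 00:00:20.712', '1900-01-01 00:00:22.482',
        '1900-01-01 00:00:24.653',
    ]),
})

result = fill_ranges(df)
print(result)
```

For the sample data this should give the same nine ranges as above, except that the first one starts at 1900-01-01 00:00:00. Filtering with `result[result['is_gap']]` would then keep only the ranges where no clip exists.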
