I am working on an ML model that needs RSSI values from BLE devices. To collect them I wrote a Mac app that stores a dictionary of type <K: Date, V: Int> in a text file. See the sample below.
string = '[2021-10-17 06:52:00 +0000: -47, 2021-10-17 06:52:04 +0000: -50, 2021-10-17 06:52:03 +0000: -50, 2021-10-17 06:52:02 +0000: -47, 2021-10-17 06:52:08 +0000: -46, 2021-10-17 06:51:57 +0000: -50, 2021-10-17 06:52:09 +0000: -48, 2021-10-17 06:52:05 +0000: -49, 2021-10-17 06:52:01 +0000: -48, 2021-10-17 06:51:58 +0000: -50, 2021-10-17 06:51:59 +0000: -50, 2021-10-17 06:52:06 +0000: -47, 2021-10-17 06:52:07 +0000: -48]'
Here, the negative numbers in the sample are the RSSI values. For example, the first two entries are:
| Date | RSSI |
|---|---|
| 2021-10-17 06:52:00 +0000 | -47 |
| 2021-10-17 06:52:04 +0000 | -50 |
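To run any calculations, I need this data in Python with types equivalent to <Date, Int>. How can I convert the string above into a Pandas DataFrame? I hope this is enough information. Thanks in advance.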
A helpful uj5u.com user replied:
You can use re.findall with a small regular expression.
As a DataFrame:
string = '[2021-10-17 06:52:00 +0000: -47, 2021-10-17 06:52:04 +0000: -50, 2021-10-17 06:52:03 +0000: -50, 2021-10-17 06:52:02 +0000: -47, 2021-10-17 06:52:08 +0000: -46, 2021-10-17 06:51:57 +0000: -50, 2021-10-17 06:52:09 +0000: -48, 2021-10-17 06:52:05 +0000: -49, 2021-10-17 06:52:01 +0000: -48, 2021-10-17 06:51:58 +0000: -50, 2021-10-17 06:51:59 +0000: -50, 2021-10-17 06:52:06 +0000: -47, 2021-10-17 06:52:07 +0000: -48]'
import re
import pandas as pd
# Capture each "date: value" pair; string[1:-1] drops the surrounding brackets.
df = pd.DataFrame.from_records(re.findall(r'([^,]+): (-?\d+)(?:, )?', string[1:-1]),
                               columns=['Date', 'RSSI'])
# pandas 2.x rejects the unit-less 'datetime64' dtype in astype, so convert explicitly.
df['Date'] = pd.to_datetime(df['Date'])
df['RSSI'] = df['RSSI'].astype(int)
Output:
                         Date  RSSI
0   2021-10-17 06:52:00+00:00   -47
1   2021-10-17 06:52:04+00:00   -50
2   2021-10-17 06:52:03+00:00   -50
3   2021-10-17 06:52:02+00:00   -47
4   2021-10-17 06:52:08+00:00   -46
5   2021-10-17 06:51:57+00:00   -50
6   2021-10-17 06:52:09+00:00   -48
7   2021-10-17 06:52:05+00:00   -49
8   2021-10-17 06:52:01+00:00   -48
9   2021-10-17 06:51:58+00:00   -50
10  2021-10-17 06:51:59+00:00   -50
11  2021-10-17 06:52:06+00:00   -47
12  2021-10-17 06:52:07+00:00   -48
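Because the dump comes from an unordered dictionary, the rows are not in chronological order. The following is a minimal sketch of sorting them and computing a simple statistic. It uses a shortened three-entry sample and assumes every timestamp carries a `+0000` UTC offset; the names `sample` and `mean_rssi` are illustrative only.

```python
import re
import pandas as pd

# Shortened sample in the same format as the question's string (assumed +0000 offset).
sample = ('[2021-10-17 06:52:00 +0000: -47, '
          '2021-10-17 06:51:57 +0000: -50, '
          '2021-10-17 06:52:04 +0000: -50]')

# Extract (date, value) pairs, then convert both columns to proper types.
pairs = re.findall(r'([^,]+): (-?\d+)(?:, )?', sample[1:-1])
df = pd.DataFrame.from_records(pairs, columns=['Date', 'RSSI'])
df['Date'] = pd.to_datetime(df['Date'])
df['RSSI'] = df['RSSI'].astype(int)

# Sort chronologically and index by timestamp for time-series work.
df = df.sort_values('Date').set_index('Date')

# Example calculation: mean RSSI over the whole sample.
mean_rssi = df['RSSI'].mean()
```

With a time index in place, calls such as `df['RSSI'].rolling(...)` or `df.resample(...)` become available for smoothing before feeding the values to a model.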
As a dictionary:
import re
dict(re.findall(r'([^,]+): (-?\d+)(?:, )?', string[1:-1]))
Output:
{'2021-10-17 06:52:00 +0000': '-47',
 '2021-10-17 06:52:04 +0000': '-50',
 '2021-10-17 06:52:03 +0000': '-50',
 '2021-10-17 06:52:02 +0000': '-47',
 '2021-10-17 06:52:08 +0000': '-46',
 '2021-10-17 06:51:57 +0000': '-50',
 '2021-10-17 06:52:09 +0000': '-48',
 '2021-10-17 06:52:05 +0000': '-49',
 '2021-10-17 06:52:01 +0000': '-48',
 '2021-10-17 06:51:58 +0000': '-50',
 '2021-10-17 06:51:59 +0000': '-50',
 '2021-10-17 06:52:06 +0000': '-47',
 '2021-10-17 06:52:07 +0000': '-48'}
As a dictionary with the correct types:
import re
import pandas as pd
{pd.to_datetime(k): int(v)
 for k, v in re.findall(r'([^,]+): (-?\d+)(?:, )?', string[1:-1])}
Output:
{Timestamp('2021-10-17 06:52:00+0000', tz='UTC'): -47,
 Timestamp('2021-10-17 06:52:04+0000', tz='UTC'): -50,
 Timestamp('2021-10-17 06:52:03+0000', tz='UTC'): -50,
 Timestamp('2021-10-17 06:52:02+0000', tz='UTC'): -47,
 Timestamp('2021-10-17 06:52:08+0000', tz='UTC'): -46,
 Timestamp('2021-10-17 06:51:57+0000', tz='UTC'): -50,
 Timestamp('2021-10-17 06:52:09+0000', tz='UTC'): -48,
 Timestamp('2021-10-17 06:52:05+0000', tz='UTC'): -49,
 Timestamp('2021-10-17 06:52:01+0000', tz='UTC'): -48,
 Timestamp('2021-10-17 06:51:58+0000', tz='UTC'): -50,
 Timestamp('2021-10-17 06:51:59+0000', tz='UTC'): -50,
 Timestamp('2021-10-17 06:52:06+0000', tz='UTC'): -47,
 Timestamp('2021-10-17 06:52:07+0000', tz='UTC'): -48}
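If pandas is not needed for the dictionary form, the same result can probably be reached with the standard library alone. The sketch below is one way to do it, assuming every entry looks like `YYYY-MM-DD HH:MM:SS +0000: <int>` and entries are separated by `, `. The function name `parse_rssi_dump` is hypothetical, not part of any library.

```python
from datetime import datetime, timezone

def parse_rssi_dump(text):
    """Parse '[<date> +0000: <rssi>, ...]' into a {datetime: int} dict."""
    body = text.strip()[1:-1]  # drop the surrounding brackets
    if not body:
        return {}
    result = {}
    for item in body.split(', '):
        # rpartition splits on the last ': ', which separates date from value;
        # the colons inside the time are not followed by a space.
        stamp, _, rssi = item.rpartition(': ')
        result[datetime.strptime(stamp, '%Y-%m-%d %H:%M:%S %z')] = int(rssi)
    return result

readings = parse_rssi_dump(
    '[2021-10-17 06:52:00 +0000: -47, 2021-10-17 06:52:04 +0000: -50]'
)
```

The keys are timezone-aware `datetime` objects, so they compare and sort correctly even if a later dump uses a different UTC offset.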
A helpful uj5u.com user replied:
You can do it like this (splitting on ', ' rather than ',' keeps leading spaces out of the keys):
In [98]: l = string[1: -1].split(', ')
In [140]: d = {i.split(': ')[0]: i.split(': ')[1] for i in l}
In [131]: df = pd.DataFrame(d.items(), columns=['Date', 'RSSI'])
In [132]: df
Out[132]:
                         Date  RSSI
0   2021-10-17 06:52:00 +0000   -47
1   2021-10-17 06:52:04 +0000   -50
2   2021-10-17 06:52:03 +0000   -50
3   2021-10-17 06:52:02 +0000   -47
4   2021-10-17 06:52:08 +0000   -46
5   2021-10-17 06:51:57 +0000   -50
6   2021-10-17 06:52:09 +0000   -48
7   2021-10-17 06:52:05 +0000   -49
8   2021-10-17 06:52:01 +0000   -48
9   2021-10-17 06:51:58 +0000   -50
10  2021-10-17 06:51:59 +0000   -50
11  2021-10-17 06:52:06 +0000   -47
12  2021-10-17 06:52:07 +0000   -48
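Note that the split-based approach leaves both columns as strings, so calculations on them would not behave numerically. A minimal sketch of the extra conversion step follows, using a shortened two-entry sample and assuming the timestamps carry a `+0000` offset; `pd.to_datetime` and `pd.to_numeric` are standard pandas functions.

```python
import pandas as pd

# Shortened sample in the same format as the question's string (assumed +0000 offset).
string = '[2021-10-17 06:52:00 +0000: -47, 2021-10-17 06:52:04 +0000: -50]'

# Split into "date: value" items, then into key/value pairs.
items = string[1:-1].split(', ')
d = dict(i.split(': ') for i in items)

df = pd.DataFrame(list(d.items()), columns=['Date', 'RSSI'])

# Convert the string columns to timezone-aware datetimes and integers.
df['Date'] = pd.to_datetime(df['Date'])
df['RSSI'] = pd.to_numeric(df['RSSI'])
```

After this step `df.dtypes` should report a datetime column and an integer column, which is what the question asks for.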
