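Patients often bypass their nearest hospital and have surgery at a different one (for many reasons). I have 500,000 patients treated at 24 hospitals in the UK.

I want to know, for each hospital, the proportion of its patients for whom it was not the nearest option. So if a hospital in London has 100 patients and 20 of them should have gone to Cambridge, its proportion is 20%. In the example below, patient 2's nearest hospital could well be ulon1, ulat1. The u stands for neurosurgical unit (= hospital).

I have the latitude and longitude of both patients and hospitals. For confidentiality reasons I cannot show the actual patient data.

Basically my dataframe looks like this: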

import pandas as pd

d = {'patient_ID': [0, 1, 2, 3, 5], 'patient_lon': ['plon1', 'plon2', 'plon3', 'plon4', 'plon5'], 'patient_lat': ['plat1', 'plat2', 'plat3', 'plat4', 'plat5'],
     'unit_lon': ['ulon1', 'ulon2', 'ulon3', 'ulon4', 'ulon5'], 'unit_lat': ['ulat1', 'ulat2', 'ulat3', 'ulat4', 'ulat5']}
pd.DataFrame(data=d)

| patient_ID | patient_lon | patient_lat | unit_lon | unit_lat |
|------------|-------------|-------------|----------|----------|
| 0          | plon1       | plat1       | ulon1    | ulat1    |
| 1          | plon2       | plat2       | ulon2    | ulat2    |
| 2          | plon3       | plat3       | ulon3    | ulat3    |
| 3          | plon4       | plat4       | ulon4    | ulat4    |
| 5          | plon5       | plat5       | ulon5    | ulat5    |

I use the haversine formula to calculate the distance from each patient to the hospital they attended.
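
For reference, here is a minimal sketch of the haversine calculation using only the standard library. The function name `haversine_km`, the Earth-radius constant, and the two example coordinates are assumptions for illustration and are not taken from the patient data.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius in kilometres


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical coordinates: roughly central London and central Cambridge
london = (51.5074, -0.1278)
cambridge = (52.2053, 0.1218)
distance = haversine_km(*london, *cambridge)
print(round(distance, 1))
```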

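How can I use it to calculate the distance to all 24 hospitals and take the minimum as the "local" hospital? (They all offer the neurosurgical procedure I am interested in.) I then want to compare that, in a new dataframe column, with the hospital the patient actually visited.

By the way, I'm a surgeon, so I'm a novice here.

Answer:

In my answer below I have reformatted the data slightly so that lat/long pairs are stored as tuples in some columns. I hope that's OK; if not, reply and we can rework the answer.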

1. Simulate some possible patient locations

from haversine import haversine
import pandas as pd
import numpy as np
import random

# number of simulated patient data
NUM_PATIENTS = 10

# a grid for sampling some patient locations from
COORD_LL = (51.578099, -0.232274)
COORD_UR = (52.797460, 1.556070)
GRID_LAT = np.linspace(COORD_LL[0], COORD_UR[0], num=NUM_PATIENTS)
GRID_LONG = np.linspace(COORD_LL[1], COORD_UR[1], num=NUM_PATIENTS)
GRID_LAT = np.around(GRID_LAT, decimals=4)
GRID_LONG = np.around(GRID_LONG, decimals=4)

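2. Hospital locations

Next we store the hospital names and their lat/long coordinates. In your example above these would be your 24 UK hospitals; again, I have made some up here.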

# names and locations of hospitals
HOSPITALS = dict(
    ADDBR=(52.1779, 0.1464),
    BURY=(52.2412, 0.6939),
    PBOROUGH=(52.5548, -0.2613),
    NWICH=(52.6091, 1.2609),
    LONDON=(51.5553, -0.0993),
)

3. Assemble the dataframe

Now we use the data above to create some lists and a dataframe.

# Simulate patient data: generate lists + dataframe
patient_latlongs = tuple(zip(GRID_LAT, GRID_LONG))
patient_id = [i for i in range(len(patient_latlongs))]
unit_visited = [random.choice(list(HOSPITALS.keys())) for x in range(len(patient_latlongs))]
unit_visited_latlong = [HOSPITALS.get(x) for x in unit_visited]

df = pd.DataFrame.from_dict(
    {
        "patient_ID": patient_id,
        "patient_latlong": patient_latlongs,
        "unit_visited": unit_visited,
        "unit_visited_latlong": unit_visited_latlong,
    }
)

Output:

   patient_ID     patient_latlong unit_visited unit_visited_latlong
0           0  (51.5781, -0.2323)       LONDON   (51.5553, -0.0993)
1           1  (51.7136, -0.0336)     PBOROUGH   (52.5548, -0.2613)
2           2   (51.8491, 0.1651)        ADDBR    (52.1779, 0.1464)
3           3   (51.9846, 0.3638)       LONDON   (51.5553, -0.0993)
4           4     (52.12, 0.5625)         BURY    (52.2412, 0.6939)
5           5   (52.2555, 0.7613)     PBOROUGH   (52.5548, -0.2613)
6           6      (52.391, 0.96)       LONDON   (51.5553, -0.0993)
7           7   (52.5265, 1.1587)       LONDON   (51.5553, -0.0993)
8           8    (52.662, 1.3574)       LONDON   (51.5553, -0.0993)
9           9   (52.7975, 1.5561)         BURY    (52.2412, 0.6939)

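4. Find the nearest hospital

We write a function that finds the nearest hospital. It is probably somewhat tailored to our example. As mentioned above, haversine is a very handy library. The function returns the key of the nearest hospital, which we can then look up in our hospitals dictionary.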

def find_nearest_hospital(latlong: tuple, hospitals: dict) -> str:
    """
    Calculate nearest hospital and return name of it. Assumes a dict
    storing hospital names as keys + latlong as tuples.

    latlong: tuple
        input latlong tuple

    hospitals: dict
        key / value pairs storing hospital names as keys and locations in latlong tuple values

    returns:
        name of closest hospital
    """
    distances = {}
    for hospital, location in hospitals.items():
        distances.update({hospital: haversine(latlong, location)})

    return min(distances, key=distances.get)
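
The last line of the function relies on `min` with `key=distances.get`, which returns the dictionary key whose value is smallest. A small sketch of that idiom on its own, with made-up distances (the numbers below are hypothetical, not computed from the coordinates above):

```python
# Hypothetical distances in km from one patient to three hospitals
distances = {"ADDBR": 29.1, "BURY": 16.2, "LONDON": 78.5}

# min() iterates over the keys and compares them by their dictionary value
nearest = min(distances, key=distances.get)

# sorted() with the same key gives the full ranking, nearest first
ranked = sorted(distances, key=distances.get)

print(nearest)
print(ranked)
```

If two hospitals were exactly equidistant, `min` would return whichever of them appears first in the dictionary.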

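5. Calculate distances

Assign new columns in the dataframe that hold the hospital nearest to each patient. Transform is a little faster than apply; ideally we would probably use a vectorized numpy function, but this may well be fast enough for your use case. If not, write back and we can take a look.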

df = df.assign(
    # name of the nearest hospital for each patient
    closest_unit=df["patient_latlong"].transform(lambda x: find_nearest_hospital(x, HOSPITALS)),
    # look up that hospital's latitude and longitude by name
    closest_unit_lat=lambda x: x["closest_unit"].map(
        {k: v[0] for k, v in HOSPITALS.items()},
    ),
    closest_unit_long=lambda x: x["closest_unit"].map(
        {k: v[1] for k, v in HOSPITALS.items()},
    ),
    # True where the patient attended their nearest hospital
    visited_closest=lambda x: (x["closest_unit"] == x["unit_visited"]),
)

Output:

   patient_ID     patient_latlong unit_visited unit_visited_latlong  \
0           0  (51.5781, -0.2323)     PBOROUGH   (52.5548, -0.2613)   
1           1  (51.7136, -0.0336)         BURY    (52.2412, 0.6939)   
2           2   (51.8491, 0.1651)        ADDBR    (52.1779, 0.1464)   
3           3   (51.9846, 0.3638)        NWICH    (52.6091, 1.2609)   
4           4     (52.12, 0.5625)        ADDBR    (52.1779, 0.1464)   
5           5   (52.2555, 0.7613)        ADDBR    (52.1779, 0.1464)   
6           6      (52.391, 0.96)        NWICH    (52.6091, 1.2609)   
7           7   (52.5265, 1.1587)       LONDON   (51.5553, -0.0993)   
8           8    (52.662, 1.3574)       LONDON   (51.5553, -0.0993)   
9           9   (52.7975, 1.5561)     PBOROUGH   (52.5548, -0.2613)   

  closest_unit  closest_unit_lat  closest_unit_long  visited_closest  
0       LONDON           51.5553            -0.0993            False  
1       LONDON           51.5553            -0.0993            False  
2        ADDBR           52.1779             0.1464             True  
3        ADDBR           52.1779             0.1464            False  
4         BURY           52.2412             0.6939            False  
5         BURY           52.2412             0.6939            False  
6         BURY           52.2412             0.6939            False  
7        NWICH           52.6091             1.2609            False  
8        NWICH           52.6091             1.2609            False  
9        NWICH           52.6091             1.2609            False
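
With a `visited_closest` column in place, the per-hospital proportion asked about in the question could be computed with a groupby. This is a sketch under assumptions: the small dataframe below is made up to keep the example self-contained, and the name `pct_not_nearest` is hypothetical. On the real data you would use the `df` built above instead.

```python
import pandas as pd

# Hypothetical mini-dataset mirroring the unit_visited / visited_closest columns
df = pd.DataFrame(
    {
        "unit_visited": ["LONDON", "LONDON", "LONDON", "LONDON", "ADDBR", "ADDBR", "BURY"],
        "visited_closest": [True, False, True, True, True, False, False],
    }
)

# For each hospital visited: share of its patients for whom it was NOT the nearest unit
bypass_pct = (
    (~df["visited_closest"])        # True where the patient bypassed their nearest unit
    .groupby(df["unit_visited"])    # group by the hospital actually attended
    .mean()                         # the mean of booleans is the proportion of True
    .mul(100)
    .round(1)
    .rename("pct_not_nearest")
)
print(bypass_pct)
```

In this made-up data LONDON has four patients, one of whom had a nearer hospital elsewhere, so its value is 25.0.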

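If the row-by-row version turns out to be too slow for 500,000 patients, the numpy vectorized approach mentioned in step 5 might look like the sketch below. It swaps the haversine library for a hand-written haversine formula with numpy broadcasting; the function name `nearest_unit_index` and the Earth-radius constant are assumptions, and only three of the simulated patient locations are used.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius in kilometres


def nearest_unit_index(patient_latlong, unit_latlong):
    """Return (index of nearest unit, distance in km) for every patient.

    patient_latlong: array-like of shape (n_patients, 2), decimal degrees (lat, long)
    unit_latlong:    array-like of shape (n_units, 2), decimal degrees (lat, long)
    """
    p = np.radians(np.asarray(patient_latlong, dtype=float))[:, None, :]  # (n, 1, 2)
    u = np.radians(np.asarray(unit_latlong, dtype=float))[None, :, :]     # (1, m, 2)
    dlat = u[..., 0] - p[..., 0]
    dlon = u[..., 1] - p[..., 1]
    a = np.sin(dlat / 2) ** 2 + np.cos(p[..., 0]) * np.cos(u[..., 0]) * np.sin(dlon / 2) ** 2
    dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))  # (n, m) distance matrix
    idx = dist.argmin(axis=1)                           # nearest unit per patient
    return idx, dist[np.arange(len(idx)), idx]


# Hospital dictionary from step 2
HOSPITALS = dict(
    ADDBR=(52.1779, 0.1464),
    BURY=(52.2412, 0.6939),
    PBOROUGH=(52.5548, -0.2613),
    NWICH=(52.6091, 1.2609),
    LONDON=(51.5553, -0.0993),
)
names = np.array(list(HOSPITALS))
coords = np.array(list(HOSPITALS.values()))

# Three of the simulated patient locations (patients 0, 4 and 9)
patients = np.array([(51.5781, -0.2323), (52.12, 0.5625), (52.7975, 1.5561)])

idx, km = nearest_unit_index(patients, coords)
closest = names[idx]
print(list(closest))
```

For 500,000 patients and 24 hospitals each intermediate float64 array is roughly 96 MB, so processing the patients in chunks may be worth considering if memory is tight.
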
On a related/unrelated note, I am writing this from a neuro ward.
