I'm trying to extract some values from a Covid dataset. I wrote the code below, which works the way I want, but I have a question that follows it:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def main():
    pd.set_option('display.max_rows', None)
    df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
    df = df[df["Country/Region"] == "Italy"]
    df = df.drop(columns=["Province/State", "Lat", "Long", "Country/Region"])
    df = df.columns.to_frame().T.append(df, ignore_index=True)
    df.columns = range(len(df.columns))
    df = df.T
    df = df.rename(columns={0: 'date', 1: 'nuovi_casi'})
    df['nuovi_casi'] = df['nuovi_casi'].diff(periods=1).fillna(1)
    df = df[(df['date'] > '11/26/21') & (df['date'] <= '12/8/21')]
    print(df)
    dati_giornalieri = list(df.nuovi_casi)
    sommatoriaitalia = (sum(dati_giornalieri) / 1390000000) * 100
    print(sommatoriaitalia)
    print(dati_giornalieri)
Now I want to add this piece of code to ask the user for the start date and the end date:
def main():
    start_date = str(input("Enter starting date in format mm/dd/yy"))
    end_date = str(input("Enter ending date in format mm/dd/yy"))
    pd.set_option('display.max_rows', None)
    df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
    df = df[df["Country/Region"] == "Italy"]
    df = df.drop(columns=["Province/State", "Lat", "Long", "Country/Region"])
    df = df.columns.to_frame().T.append(df, ignore_index=True)
    df.columns = range(len(df.columns))
    df = df.T
    df = df.rename(columns={0: 'date', 1: 'nuovi_casi'})
    df['nuovi_casi'] = df['nuovi_casi'].diff(periods=1).fillna(1)
    df = df[(df['date'] > start_date) & (df['date'] <= end_date)]
But the line df = df[(df['date'] > start_date) & (df['date'] <= end_date)] raises an error, because it cannot compare dates with strings. I also tried parsing the input with datetime:
start_date = datetime.strptime(input('Enter Start date in the format m/d/y'), '%m/%d/%y')
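But I got essentially the same result. The problem remains: for some reason it seems to consider only one day per month, or something like that, and either way the result is not what I expect.
How can I fix this and select the days between the two dates? Thank you.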
Answer from a uj5u.com user:
Convert the values to datetime before comparing:
start_date = pd.to_datetime(start_date, format="%m/%d/%y")
end_date = pd.to_datetime(end_date, format="%m/%d/%y")
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%y")
df = df[df["date"].between(start_date, end_date, inclusive="right")]
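To see why the conversion matters, here is a minimal sketch on a small hand-made frame rather than the real CSV. The sample rows and the helper name filter_by_date are hypothetical and used only for illustration. It assumes the 'date' column holds m/d/yy strings, as the column headers of the dataset do. Strings compare character by character, so a February date can sort after a November one, which is probably why the filter gave unexpected rows.

```python
import pandas as pd

# Hypothetical stand-in for the transposed Italy frame: dates are
# plain strings in m/d/yy form, values are made up for illustration.
df = pd.DataFrame({
    "date": ["11/26/21", "11/27/21", "12/1/21", "12/8/21", "12/9/21", "2/1/21"],
    "nuovi_casi": [10, 20, 30, 40, 50, 60],
})


def filter_by_date(frame, start, end):
    """Return rows with start < date <= end, comparing real datetimes."""
    out = frame.copy()
    out["date"] = pd.to_datetime(out["date"], format="%m/%d/%y")
    start = pd.to_datetime(start, format="%m/%d/%y")
    end = pd.to_datetime(end, format="%m/%d/%y")
    # inclusive="right" excludes the start date and keeps the end date
    return out[out["date"].between(start, end, inclusive="right")]


# String comparison is lexicographic: '2' sorts after '1', so a date in
# February 2021 looks "later" than one in November 2021.
string_compare = "2/1/21" > "11/26/21"
print(string_compare)

result = filter_by_date(df, "11/26/21", "12/8/21")
print(result)
```

In the real script the two strings would come from input(), and the same three to_datetime calls would run before the filter. Note that inclusive="right" needs pandas 1.3 or newer.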
