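I have a column of tickers, and from it I built a comma-separated string of symbols, which I placed in a new column named v1 in the same dataframe DF. I also copied the comma-separated string into a new dataframe DF1. In both cases I want the string to appear only in the first row, not in every row. Is there a way, in both dataframes, to have the comma-separated string appear only in the first row instead of being repeated in all rows? If possible, could someone explain? Thanks.
Code for the comma-separated string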
v1 = df['Ticker'].tolist()
v1 = ",".join(map(str,v1))
df['v1'] = v1
df1 = df[['v1']]
print(df)
print(df1)
Current DF output
No. Ticker ... AH Change v1
0 1 AAPL ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
1 2 MSFT ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
2 3 TSLA ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
3 4 FB ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
Current DF1 output
0 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
1 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
2 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
3 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
Desired DF output
No. Ticker ... AH Change v1
0 1 AAPL ... - AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
1 2 MSFT ... -
2 3 TSLA ... -
3 4 FB ... -
Desired DF1 output
0 AAPL,MSFT,TSLA,FB,BRK-B,NVDA,TSM,JPM,V,JNJ,HD,...
Full code
import pandas as pd
import requests
import bs4
import time
import random
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
def testDf(version):
    url = 'https://finviz.com/screener.ashx?v={version}&r={page}&f=sh_outstanding_o1000&c=0,1,2,3,4,5,6,7,71,72&f=ind_stocksonly&o=-marketcap'
    page = 1
    # Fetch the first page to read the page count from the pager links.
    screen = requests.get(url.format(version=version, page=page), headers=headers)
    soup = bs4.BeautifulSoup(screen.text, features='lxml')
    pages = int(soup.find_all('a', {'class': 'screener-pages'})[-1].text)
    data = []
    # Step the r= offset 20 at a time.
    for page in range(1, 1 * pages, 20):
        print(version, page)
        screen = requests.get(url.format(version=version, page=page), headers=headers).text
        tables = pd.read_html(screen)
        tables = tables[-2]
        # Promote the first row to the header, then drop it from the data.
        tables.columns = tables.iloc[0]
        tables = tables[1:]
        data.append(tables)
        # Pause for a random fraction of a second between requests.
        time.sleep(random.random())
    return pd.concat(data).reset_index(drop=True).rename_axis(columns=None)
df = testDf('152').copy()
v1 = df['Ticker'].tolist()
v1 = ",".join(map(str,v1))
df['v1'] = v1
df1 = df[['v1']]
print(df)
print(df1)
Reply from a uj5u.com user:
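# Group rows by the value in v1; rows sharing the same string fall into one group.
grouping = df.groupby('v1')
indices = []
for x in grouping.groups.values():
    # Keep the first row of each group and collect the labels of the rest.
    indices.extend(x[1:])
# Blank out v1 on every collected row.
df.loc[indices, 'v1'] = ''
# One row per distinct v1 string.
df1 = pd.DataFrame(grouping.groups.keys())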
Note: this modifies df in place, so the repeated values cannot be recovered afterwards.
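As for why the string repeats: assigning a single string to a column, as in df['v1'] = v1, broadcasts that one value to every row, and df[['v1']] then copies all of those rows. A possibly simpler alternative, shown here only as a minimal sketch, is to start with an empty column and write the string into the first row alone. The small df below is a hypothetical stand-in for the scraped table, and the variable name tickers is an assumption, not something from the question.

```python
import pandas as pd

# Hypothetical stand-in for the scraped screener table.
df = pd.DataFrame({"No.": [1, 2, 3, 4],
                   "Ticker": ["AAPL", "MSFT", "TSLA", "FB"]})

# Build the comma-separated string once.
tickers = ",".join(df["Ticker"].astype(str))

# Fill the column with empty strings, then set only the first row.
df["v1"] = ""
df.loc[df.index[0], "v1"] = tickers

# A one-row dataframe holds the string exactly once.
df1 = pd.DataFrame({"v1": [tickers]})

print(df)
print(df1)
```

Unlike the groupby approach, this never writes the repeated values in the first place, so there is nothing to blank out afterwards.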
