Re: [Question] pandas: repeated slicing, repeated saving
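# -*- coding: utf-8 -*-
import pandas as pd
from geocodequery import GeocodeQuery

df = pd.read_csv('./birdsIwant3.csv', low_memory=False)

def addrs(location):
    # Geocode one address and return its coordinates as a two-field Series
    gq = GeocodeQuery("zh-tw", "tw")
    gq.get_geocode(location)
    print(location)
    return pd.Series({"lat": gq.get_lat(), "lng": gq.get_lng()})

def test(location):
    # Offline stand-in for addrs() so the loop can be tried without the API
    return pd.Series({"lat": 5, "lng": 10})

df['lat'] = 0.0
df['lng'] = 0.0
query_count = 2
# int() truncates, so a trailing partial batch is not counted
loop_count = int(df.shape[0] / query_count)
for lc in range(2):  # two iterations for this demo; use loop_count to cover the file
    start, stop = lc * query_count, (lc + 1) * query_count
    # .loc slices by label and includes the end label, hence stop - 1
    df.loc[start:stop - 1, ['lat', 'lng']] = df[start:stop]['location'].apply(addrs)  ##the problem##
    df.to_csv('./birdsIwant3_1.csv', index=False)
print(pd.read_csv('./birdsIwant3_1.csv', low_memory=False))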
query_count is the number of queries made in each loop iteration.
loop_count is the total number of iterations needed.
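One thing to watch with these two numbers: truncating the division leaves out a final batch when the row count is not a multiple of query_count. A minimal sketch of computing the batch boundaries with ceiling division instead; chunk_bounds is a hypothetical helper for illustration, not part of the script above:

```python
def chunk_bounds(n_rows, query_count):
    """Return (start, stop) positional bounds that together cover all n_rows."""
    loop_count = -(-n_rows // query_count)  # ceiling division
    return [(lc * query_count, min((lc + 1) * query_count, n_rows))
            for lc in range(loop_count)]

print(chunk_bounds(5, 2))  # [(0, 2), (2, 4), (4, 5)]
```

With 5 rows and 2 queries per iteration this gives three batches, the last one holding the single leftover row.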
I ran two iterations; the printout below is trimmed to the columns next to lat and lng:
   count.317  birdName.317        lat         lng
0        NaN           NaN  24.373316  121.310400
1        NaN           NaN  24.205938  121.010132
2        NaN           NaN  24.373316  121.310400
3        NaN           NaN  24.774906  120.970782
So the values do get saved.
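A detail worth keeping in mind for the loop: it writes through .loc but reads through a plain slice, and the two treat the end bound differently. .loc slices by label and includes the end label, while df[a:b] and .iloc slice by position and exclude it. A small sketch on a made-up four-row frame with the default integer index:

```python
import pandas as pd

# Hypothetical data, only to show the slicing semantics
df = pd.DataFrame({"location": ["a", "b", "c", "d"]})

by_label = df.loc[0:2]    # label-based: rows 0, 1 and 2
by_position = df[0:2]     # positional: rows 0 and 1
by_iloc = df.iloc[0:2]    # positional: rows 0 and 1

print(len(by_label), len(by_position), len(by_iloc))  # 3 2 2
```

When the two sides of an assignment are sliced differently, pandas aligns on the index, so the extra row on the .loc side receives NaN until a later iteration overwrites it.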
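Because the file is large, every write-back takes a while.
If you have enough memory and the API won't get blocked by Google, consider finishing all the queries first and writing the file once.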
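A rough sketch of the query-everything-then-write-once approach. fake_geocode is a hypothetical stand-in for the real geocoding call, and the CSV goes to an in-memory buffer here only so the example runs without touching the disk:

```python
import io

import pandas as pd

def fake_geocode(location):
    # Hypothetical stand-in for the real geocoding function
    return pd.Series({"lat": 5.0, "lng": 10.0})

df = pd.DataFrame({"location": ["a", "b", "c"]})

# One pass over every row, then a single write at the end
coords = df["location"].apply(fake_geocode)
df = df.join(coords)

buffer = io.StringIO()
df.to_csv(buffer, index=False)
print(buffer.getvalue().splitlines()[0])  # location,lat,lng
```

The trade-off is that nothing is saved if the run dies halfway, so this only makes sense when the queries are reliable.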
Or try writing each iteration out to its own file and merging them all at the end.
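A minimal sketch of the one-file-per-iteration idea, under assumptions: fake_geocode stands in for the real geocoding call, and the part_*.csv names and the temporary directory are made up for the example. Each batch is written to a small file, so no iteration has to rewrite the whole dataset, and pd.concat stitches the parts together at the end:

```python
import glob
import os
import tempfile

import pandas as pd

def fake_geocode(location):
    # Hypothetical stand-in for the real geocoding function
    return pd.Series({"lat": 5.0, "lng": 10.0})

df = pd.DataFrame({"location": ["a", "b", "c", "d", "e"]})
query_count = 2
out_dir = tempfile.mkdtemp()

for start in range(0, len(df), query_count):
    chunk = df.iloc[start:start + query_count]
    coords = chunk["location"].apply(fake_geocode)
    # Zero-padded names keep the parts in order when sorted
    part_path = os.path.join(out_dir, "part_%05d.csv" % start)
    chunk.join(coords).to_csv(part_path, index=False)

parts = sorted(glob.glob(os.path.join(out_dir, "part_*.csv")))
merged = pd.concat([pd.read_csv(p) for p in parts], ignore_index=True)
print(merged.shape)  # (5, 3)
```

A side benefit is that an interrupted run can resume from the last part file instead of starting over.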
Not sure whether this solves your problem; if something is still off, let's keep discussing~
--
※ Posted from: 批踢踢實業坊 (ptt.cc), From: 122.116.217.21
※ Article URL: https://www.ptt.cc/bbs/Python/M.1428814721.A.B09.html