Environmental Information Disclosure and Monitoring Project

Last edited: 2016-03-22. Created: 2015-06-13.

[JIMMY H] by 綠色公民行動聯盟 (Green Citizens' Action Alliance)

 

Data fields

https://docs.google.com/spreadsheets/d/1XaU4R6EtzCMZmKN6CdHurvMINdOYQkkbgbLJT5SQgQ8/edit#gid=0

 

[洪申翰] Kaohsiung City Government continuous stack emissions monitoring open data fields:

https://docs.google.com/spreadsheets/d/1My4nux9wZ6_v7I_lL8NZrSwp4ECWXa6UPl-kRGGuPDQ/edit?usp=sharing

    [Tonyq Wang] This format is still to be confirmed; whatever Kaohsiung finally releases takes precedence.

 

November open data workshop notes:

http://beta.hackfoldr.org/1VxkIpLVa4ZK-MIFcTy7DH-Oqson14N9j2QqjlsnGYao

 

Ma Jun (馬軍) NGO talk

https://hackpad.com/2015.09.10--lGAFJzYZmrU

 

 

 

[JIMMY H] Reference materials

 

[CHE L] Data from Chinese environmental groups:

 

[JIMMY H] Basic company information:

 

[洪申翰] Other data on corporate networks:

  • https://g0v.hackpad.com/hxJBK9v7frt
  • GuanXi

 

[卞中佩] Compiled data on corporate pollution:

  • http://hackfoldr.org/gxv/rInrPBeYd5I

 

Market Observation Post System (公開資訊觀測站):

http://mops.twse.com.tw/mops/web/index

 

Ministry of Economic Affairs factory data:

http://gcis.nat.gov.tw/Fidbweb/index.jsp

[CHE L] Finjon Kiang: https://wdc.gov.tw/syi.idb/

 

[JIMMY H] Penalty records:

  • http://prtr.epa.gov.tw/FacilityInfo/Data?search=False
    • [KIANG] Pulled down a copy: https://github.com/kiang/prtr.epa.gov.tw
  • [JIMMY H] http://prtr.epa.gov.tw/Penalty/Statistics (offline)
  • 日月光 (ASE) sample:

 

Pollution status and real-time monitoring:

 

[JIMMY H] Tools / Code:

http://docs.casperjs.org/en/latest/quickstart.html#now-let-s-scrape-google

 

[MRSEAN@G] Git repo:

https://github.com/swilsonian/pollutionscraper

 

 

 

Questions:

  • How do we pass the search keywords to our script? As a text file with one keyword per line?
  • How do we prevent duplicate data? For example, one keyword in the keywords text file is "日月光" and another is "新日月光".
    • This will cause duplicate rows in our CSV, since a search for the shorter keyword presumably also returns the factories that match the longer one
    • Each factory has an Entity ID, so we could keep track of the Entity IDs we have already seen in a hash table
    • Before opening a factory page, check whether that factory's Entity ID is already in the table; if it is, don't open the page again
  • Besides the one table, what other data needs to be retrieved from each factory page?
  • Should we add a CSV row for a factory that doesn't have any table data?
  • How many CSV files do we want to create? One per keyword, or one per search?
  • How do the CSV(s) we create get uploaded to the final storage repo?
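
A rough sketch of how the keyword file and the Entity ID check from the questions above could fit together. This is only an illustration in Python, not code from the pollutionscraper repo: `load_keywords`, `collect_factories`, `write_csv`, `fake_search` and the IDs `E001`/`E002` are all made up here, and the real site search would replace `fake_search`. It also assumes a single CSV for the whole run, which is one of the open questions.

```python
import csv
import io


def load_keywords(lines):
    """Return unique, non-empty keywords, one per input line, in file order."""
    seen = set()
    keywords = []
    for line in lines:
        keyword = line.strip()
        if keyword and keyword not in seen:
            seen.add(keyword)
            keywords.append(keyword)
    return keywords


def collect_factories(keywords, search):
    """Run every keyword search, keeping each Entity ID only once.

    `search` is a hypothetical callable: keyword -> iterable of
    (entity_id, factory_name) pairs. The real scraper would supply it.
    """
    seen_ids = set()  # hash table of Entity IDs already handled
    factories = []
    for keyword in keywords:
        for entity_id, name in search(keyword):
            if entity_id in seen_ids:
                continue  # already found via an earlier keyword, skip the page
            seen_ids.add(entity_id)
            factories.append(
                {"entity_id": entity_id, "name": name, "keyword": keyword}
            )
    return factories


def write_csv(factories, out):
    """Write a header plus one row per unique factory to file-like `out`."""
    writer = csv.DictWriter(out, fieldnames=["entity_id", "name", "keyword"])
    writer.writeheader()
    writer.writerows(factories)


# Stand-in for the real site search: made-up IDs, substring match on the name.
FAKE_SITE = {"E001": "日月光", "E002": "新日月光"}


def fake_search(keyword):
    return [(eid, name) for eid, name in FAKE_SITE.items() if keyword in name]


keywords = load_keywords(["日月光\n", "\n", "新日月光\n"])
factories = collect_factories(keywords, fake_search)
buffer = io.StringIO()
write_csv(factories, buffer)
```

With a real keywords file, the list of lines would come from something like `open("keywords.txt", encoding="utf-8")` (the file name is an assumption). Note that the second factory is recorded under the first keyword that found it, so if we end up wanting one CSV per keyword, the skip rule would need rethinking.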