lí kám ū thiann-tio̍h in ê giân

Edit history

Time Author Revision
2014-03-31 15:30 Kirby Wu r1687
(35 lines unchanged)
(minimalist fetch and json from irc included in this project, will use 319 Event Data Collection when available)
- Transcripts typicaly include:
+ Transcripts typically include:
*Timestamp
*Place
(51 lines unchanged)
2014-03-29 19:34 – 19:35 Bropheus Huang r1663 – r1686
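(65 lines unchanged)
I miss a lot of interesting remarks.
My brain simply cannot process the text and video livestreams. (Do you have this problem too?)
+ *Watching a non-Chinese livestream may be just as impossible to process XD
Most of the material I find is chronological.
Is there any plan right now to build something that lets people look back at what has been said
(21 lines unchanged)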
2014-03-28 05:03 – 05:03 A-Tsioh r1580 – r1662
(51 lines unchanged)
*elasticsearch for the index
*a-tsioh: just curious, how do you plan to host the server? or do we have a hosted elasticsearch server ready? (cuz one server should be able to serve multiple projects, if the load is not heavy)
+ *got a server in Europe, if the link is not fast enough, we can see how to host it in Taiwan, installing elasticsearch is extremely simple.
*Livescript on the clientside, we may need some d3js (just because I like it)
*select a UI toolkit (no preference for now)
(34 lines unchanged)
2014-03-27 07:30 – 07:30 Simon Pai r1548 – r1579
(29 lines unchanged)
*e.g., "KMT的奧步啊" ("the KMT's dirty tricks") and "深藍是超想當中國人的" ("the deep blues really want to be Chinese") have similar standpoints
*sounds hard if we don't define the topics on which the standpoints are given beforehand
+ *FYI, Gene Hong (黑貘) did a related project: http://gene.speaking.tw/2014/03/tvbs-tvbs-10-1129.html
*The Data
(55 lines unchanged)
2014-03-27 06:19 – 06:31 A-Tsioh r1341 – r1547
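(4 lines unchanged)
*Purpose
+ Let people browse what everyone has said around the Legislative Yuan (including on PTT, IRC, and the web)
Provide a webpage to search and browse public speeches made around the Legislative Yuan (立法院)
+
*User story
+ Users may not have been able to keep listening to or watching the text broadcast at the time, and may want to see the global picture
+ Also, a few months from now, they may want to recall what is happening now
+ From one post, they can see the previous and next posts as well as similar ones (on the same topic)
+
A typical user may not have followed the whole event, or may have missed part of it. They want a clearer picture of the thoughts shared on the different stages around the Legislative Yuan.
*What is needed
*understand and select what source of data may be relevant and shall be indexed
+ *Select the important data sources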
*Structure the data so that the meta-data can be used (timestamps, location, kind of intervention/speaker)
*Create a semantic space to find similar messages
(5 lines unchanged)
*is this possible? (can you explain more ?give an example)
*e.g., "KMT的奧步啊" ("the KMT's dirty tricks") and "深藍是超想當中國人的" ("the deep blues really want to be Chinese") have similar standpoints
+ *sounds hard if we don't define the topics on which the standpoints are given beforehand
*The Data
See 319 Event Data Collection
- (minimalist fetch and json from irc included)
+ (minimalist fetch and json from irc included in this project, will use 319 Event Data Collection when available)
+
Transcripts typicaly include:
*Timestamp
(50 lines unchanged)
2014-03-26 04:40 – 04:40 A-Tsioh r1336 – r1340
(4 lines unchanged)
*Purpose
- Provide a webpage to search and browse public speeches made around the
+ Provide a webpage to search and browse public speeches made around the Legislative Yuan (立法院)
+
*User story
(69 lines unchanged)
2014-03-25 15:36 – 16:08 Simon Pai r1168 – r1335
(33 lines unchanged)
*dealing with videos
- Just a question, if we precise location and time of the begining of each recording, does it make sense to link it to the text feeds ?
+ Just a question, if we precise location and time of the beginning of each recording, does it make sense to link it to the text feeds ?
(1 line unchanged)
*Django on the serverside ( I need some python libs)
*elasticsearch for the index
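+ *a-tsioh: just curious, how do you plan to host the server? or do we have a hosted elasticsearch server ready? (cuz one server should be able to serve multiple projects, if the load is not heavy)
*Livescript on the clientside, we may need some d3js (just because I like it)
*select a UI toolkit (no preference for now)
(17 lines unchanged)
*Kirby Wu I have crawlers for logbot and BBS, so both the live text broadcast and messages from netizens can be covered.. (for logbot, it is actually better to get a database dump or use an API endpoint)
*Video needs transcripts or at least automatic translation.. For Facebook, we could create a page, ask everyone to forward relevant messages to it, and then back it up automatically with a program... other data sources would have to be handled case by case?
+ *The PTT Gossiping board has many links to other material, so at least the widely circulated items will not be missed
*Pierre Magistry I was thinking about starting with the live transcription archive (and 文播記錄, the text broadcast records) that are already timestamped, this may help to align Mandarin and English and maybe even video if possible
(12 lines unchanged)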
2014-03-25 04:55 – 04:55 Simon Pai r1158 – r1167
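(72 lines unchanged)
*So I think it would of course be great if logbot can handle it; for now we should probably rely mainly on HumanAPI. Of course, if logbot's translation quality is very high, that is another matter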
*
+
+ *FYI: http://share.inside.com.tw/posts/4292
2014-03-25 03:14 (unknown) r1157
(74 lines unchanged)
2014-03-25 03:13 – 03:14 A-Tsioh r1136 – r1156
lí kám ū thiann-tio̍h in ê giân
+ Github page:
+ https://github.com/a-tsioh/kam-u-thiann-tioh
*Purpose
(51 lines unchanged)
Maybe one already exists and I just have not found it.
If not, I am willing to take charge of it.
-
- Like · · Share
- *
- *Seen by 19
- *Matthieu Fontaine and 2 others like this.
- *
*
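*Kirby Wu I have crawlers for logbot and BBS, so both the live text broadcast and messages from netizens can be covered.. (for logbot, it is actually better to get a database dump or use an API endpoint)
*Video needs transcripts or at least automatic translation.. For Facebook, we could create a page, ask everyone to forward relevant messages to it, and then back it up automatically with a program... other data sources would have to be handled case by case?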
- *3 hrs · Like
- *
- *
+
*Pierre Magistry I was thinking about starting with the live transcription archive (and 文播記錄, the text broadcast records) that are already timestamped, this may help to align Mandarin and English and maybe even video if possible
- *3 hrs · Like
- *
- *
- *Pierre Magistry one usecase would also be to allow people that were not following to catch up (including foreign press)
- *3 hrs · Like · 1
- *
- *
+ Pierre Magistry one usecase would also be to allow people that were not following to catch up (including foreign press)
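*Pierre Magistry I have to deal with some unrelated things first. I will be back later.
*Anyone interested, please raise your hand. The project will need people who understand the situation and the data better than I do (算.txt租吧) as well as a UI designer. I will personally look into how to use elasticsearch for search and classification, and I can also pre-process the data.
*If you are near NTU (臺大), we can meet tonight; otherwise online works too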
- *2 hrs · Like
- *
- *
- *Pierre Magistry thinking of calling this project
- *" kám ū thiann-tioh in ê giân ?"
- *45 mins · Like
- *
- *
- *Kirby Wu i can help, but maybe over internet... let's just create a hackpad to formalize the idea and features?
- *15 mins · Like
- *
- *
- *Pierre Magistry ok
- *14 mins · Like
- *
*
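*Ymow Wu I finished one the day before yesterday; it more or less fits under the architecture you described, Kirby. I originally wanted people signing up to speak on stage to register through this system and be connected directly to the text broadcast system, but that will probably be put on hold for now; first we will use a news-browser model to widen the reach. As for automatic translation, I originally planned to send the text broadcast to Google Translate and feed the result straight into the Android app, but
(1 line unchanged)
*Google Translate is not workable; its output is not accurate enough. Since the broadcast has to be live in Chinese, English, and Japanese, those of us translators who can help will back you all the way
*So I think it would of course be great if logbot can handle it; for now we should probably rely mainly on HumanAPI. Of course, if logbot's translation quality is very high, that is another matter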
- *2 mins · Like
+ *
2014-03-24 17:43 – 17:47 Kirby Wu r1096 – r1135
(17 lines unchanged)
*analyze semantic to recognize stand point of specific message
*is this possible? (can you explain more ?give an example)
+ *e.g., "KMT的奧步啊" ("the KMT's dirty tricks") and "深藍是超想當中國人的" ("the deep blues really want to be Chinese") have similar standpoints
*The Data
(79 lines unchanged)
2014-03-24 17:03 – 17:04 Simon Pai r1092 – r1095
(15 lines unchanged)
*a timeline for quickly understanding what have been said
*a map for quickly identifying where the message is delivered
- *analyze semantic to recognize stand point of sepcific message
+ *analyze semantic to recognize stand point of specific message
*is this possible? (can you explain more ?give an example)
(80 lines unchanged)
2014-03-24 17:03 – 17:03 A-Tsioh r1066 – r1091
(19 lines unchanged)
*The Data
+ See 319 Event Data Collection
+ (minimalist fetch and json from irc included)
Transcripts typicaly include:
*Timestamp
(75 lines unchanged)
2014-03-24 14:25 – 14:38 A-Tsioh r1045 – r1065
(16 lines unchanged)
*a map for quickly identifying where the message is delivered
*analyze semantic to recognize stand point of sepcific message
- *is this possible?
+ *is this possible? (can you explain more ?give an example)
*The Data
(77 lines unchanged)
2014-03-24 13:54 – 14:07 Kirby Wu r817 – r1044
(12 lines unchanged)
*Create a semantic space to find similar messages
*Define a list of relevant keyword that should be spotted by the system to relate messages.
-
+ *word cloud or sth like that for what people are saying now
+ *a timeline for quickly understanding what have been said
+ *a map for quickly identifying where the message is delivered
+ *analyze semantic to recognize stand point of sepcific message
+ *is this possible?
*The Data
(77 lines unchanged)
2014-03-24 12:22 – 13:02 A-Tsioh r123 – r816
(2 lines unchanged)
*Purpose
-
+ Provide a webpage to search and browse public speeches made around the
*User story
-
+ A typical user may not have followed the whole event, or may have missed part of it. They want a clearer picture of the thoughts shared on the different stages around the Legislative Yuan.
*What is needed
+ *understand and select what source of data may be relevant and shall be indexed
+ *Structure the data so that the meta-data can be used (timestamps, location, kind of intervention/speaker)
+ *Create a semantic space to find similar messages
+ *Define a list of relevant keyword that should be spotted by the system to relate messages.
+
+
+ *The Data
+ Transcripts typicaly include:
+ *Timestamp
+ *Place
+ *person
+ *text
+ *may have an English translation
+ From the text, we may extract some keywords (like names of politic figures, event, places...)
+
+ *dealing with videos
+ Just a question, if we precise location and time of the begining of each recording, does it make sense to link it to the text feeds ?
*Technologies used
+ *Django on the serverside ( I need some python libs)
+ *elasticsearch for the index
+ *Livescript on the clientside, we may need some d3js (just because I like it)
+ *select a UI toolkit (no preference for now)
----
(58 lines unchanged)
2014-03-24 10:11 – 10:14 A-Tsioh r16 – r122
lí kám ū thiann-tio̍h in ê giân
+
+
+ *Purpose
+
+
+ *User story
+
+
+ *What is needed
+
+
+ *Technologies used
+
+ ----
+ Putting the FB messages here for now
+
+ _________
Hello everyone!
(53 lines unchanged)
2014-03-24 10:11 (unknown) r15
(56 lines unchanged)
2014-03-24 10:07 – 10:11 A-Tsioh r1 – r14
- Untitled
+ lí kám ū thiann-tio̍h in ê giân
- This pad text is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents!
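+ Hello everyone!
+ I am a foreigner with limited Mandarin.
+ With the events happening now, there is already too much material to handle.
+ I miss a lot of interesting remarks.
+ My brain simply cannot process the text and video livestreams. (Do you have this problem too?)
+ Most of the material I find is chronological.
+ Is there any plan right now to build something that lets people look back at what has been said
+ just like a search engine over the transcript/videos
+ Maybe one already exists and I just have not found it.
+ If not, I am willing to take charge of it.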
+
+ Like · · Share
+ *
+ *Seen by 19
+ *Matthieu Fontaine and 2 others like this.
+ *
+ *
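+ *Kirby Wu I have crawlers for logbot and BBS, so both the live text broadcast and messages from netizens can be covered.. (for logbot, it is actually better to get a database dump or use an API endpoint)
+ *Video needs transcripts or at least automatic translation.. For Facebook, we could create a page, ask everyone to forward relevant messages to it, and then back it up automatically with a program... other data sources would have to be handled case by case?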
+ *3 hrs · Like
+ *
+ *
+ *Pierre Magistry I was thinking about starting with the live transcription archive (and 文播記錄, the text broadcast records) that are already timestamped, this may help to align Mandarin and English and maybe even video if possible
+ *3 hrs · Like
+ *
+ *
+ *Pierre Magistry one usecase would also be to allow people that were not following to catch up (including foreign press)
+ *3 hrs · Like · 1
+ *
+ *
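+ *Pierre Magistry I have to deal with some unrelated things first. I will be back later.
+ *Anyone interested, please raise your hand. The project will need people who understand the situation and the data better than I do (算.txt租吧) as well as a UI designer. I will personally look into how to use elasticsearch for search and classification, and I can also pre-process the data.
+ *If you are near NTU (臺大), we can meet tonight; otherwise online works too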
+ *2 hrs · Like
+ *
+ *
+ *Pierre Magistry thinking of calling this project
+ *" kám ū thiann-tioh in ê giân ?"
+ *45 mins · Like
+ *
+ *
+ *Kirby Wu i can help, but maybe over internet... let's just create a hackpad to formalize the idea and features?
+ *15 mins · Like
+ *
+ *
+ *Pierre Magistry ok
+ *14 mins · Like
+ *
+ *
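+ *Ymow Wu I finished one the day before yesterday; it more or less fits under the architecture you described, Kirby. I originally wanted people signing up to speak on stage to register through this system and be connected directly to the text broadcast system, but that will probably be put on hold for now; first we will use a news-browser model to widen the reach. As for automatic translation, I originally planned to send the text broadcast to Google Translate and feed the result straight into the Android app, but
+ *@Sunny Chien said:
+ *Google Translate is not workable; its output is not accurate enough. Since the broadcast has to be live in Chinese, English, and Japanese, those of us translators who can help will back you all the way
+ *So I think it would of course be great if logbot can handle it; for now we should probably rely mainly on HumanAPI. Of course, if logbot's translation quality is very high, that is another matter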
+ *2 mins · Like
2014-03-24 10:06 (unknown) r0
+ Untitled
+ This pad text is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents!