Difference between revisions of "A.I. Natural Language Processing 1.0"

From Hackerspace.gr

Revision as of 15:49, 1 June 2013

[Image: Natural language processing.jpg]
Starts: Sat 01 Jun 2013 15:00
Ends: Sat 01 Jun 2013 16:00
Organizer: Everyone interested
Event Owner: User:Skmp

Let the skynet begin



Ευθύμης Πετρόπουλος-Τράκας: "First meeting on Natural Language Processing. I'll be waiting for all of you there, with plenty of enthusiasm, material to study and, of course, ideas!"

(coordination and note taking by skmp)


http://doodle.com/zrag5ak9atvzd8yd

  • definitions
    • nlp - natural language processing
    • nlg - natural language generation
  • People
    • theodim
    • skmp
    • Nikos Katzouris
    • Orestis Roditis
    • Timos Petropoulos
      • Interests ~ Cryptography (as applicable to nlp ?), Apply newer algos to greek
    • Panagiotis Katsivelis
      • Interests ~ nlg
    • Vasilis Salapatas
  • referenced projects
    • nusami
    • Stanford parser
    • asoe
      • pos-tagger (nlp.aub.gr)
      • Named entity recogniser (NER)
    • eclipse
      • pydev extension
    • simple nlg
    • nltk
    • wordnet
    • ispell
    • alchemy (sentiment analysis)
    • Openbox
    • iperdiavgeia
    • alibaba
    • Wiktionary
    • europarl
    • Ethniko Typografeio (National Printing House; et.gr ?)


  • Topics
    • Introduction of each person
    • Plan is for a Study group (once per one or two weeks)
    • We'll pick a subject and work on it
      • decipherment of Linear A ~
    • Statistical vs structured algorithms
    • language discussion (it always must happen :p)
      • Stanford parser & related libraries (java-based)
    • mailing list tag decided to be [nlp]
    • language chat again
      • perhaps we can decide on a common base ?
    • nlg
      • using a dictionary
      • There's nothing for greek right now
    • nltk like, but updated w/ support for greek
    • hosting probably on github
    • back to project talk
    • ispell / spell checking
    • Sentiment analysis
      • alchemy
    • nothing in Greece really
      • Not really open/open source
    • Turney algorithm
      • tries to match between bad/good
      • Implement in Greek ?
    • Dimokritos, Openbox
    • iperdiavgeia -> search diavgeia api
    • suicide prediction via Facebook
    • Sentiment analysis
    • topic detection (more general than sentiment analysis)
    • Aggregation of open knowledge
    • it's hard to get data
    • Scale of required corpus
      • 2M "entries" is a good number
    • alibaba
    • Panagiotis's thesis
      • language model, generating advertisements from a description, using nlg
      • 47k corpus is way too small
    • scraping
      • using browser
    • dictionaries - parsing existing ones from the 'net
      • Existing ones kind of suck
    • Open data in greek we can use
    • europarl
      • very large corpus
      • can be used to train
    • Ethniko Typografeio (National Printing House)
      • Another open source of information
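
The Turney algorithm noted in the topics scores a word by how much more strongly it co-occurs with a "good" seed word than with a "bad" one, using pointwise mutual information. Below is a minimal sketch of that idea for Greek. It assumes a small pre-tokenised corpus in place of the search-engine hit counts used in the original English work. The seed words, the toy sentences, the window size, the smoothing constant and the function names are all illustrative assumptions, not something agreed at the meeting.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each sentence is already tokenised.
TOY_CORPUS = [
    ["το", "φαγητό", "ήταν", "εξαιρετικό", "και", "νόστιμο"],
    ["η", "εξυπηρέτηση", "ήταν", "κακή", "και", "αργή"],
    ["νόστιμο", "και", "εξαιρετικό", "γλυκό"],
    ["αργή", "και", "κακή", "παράδοση"],
]


def cooccurrence_counts(sentences, window=5):
    """Count single words and unordered word pairs that appear
    within `window` tokens of each other in the same sentence."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[frozenset((left, right))] += 1
    return word_counts, pair_counts


def semantic_orientation(word, pos_seed, neg_seed,
                         word_counts, pair_counts, smoothing=0.01):
    """PMI(word, pos_seed) - PMI(word, neg_seed), in bits.

    Positive values lean towards the "good" seed, negative values
    towards the "bad" seed. `smoothing` avoids division by zero.
    """
    near_pos = pair_counts[frozenset((word, pos_seed))] + smoothing
    near_neg = pair_counts[frozenset((word, neg_seed))] + smoothing
    hits_pos = word_counts[pos_seed] + smoothing
    hits_neg = word_counts[neg_seed] + smoothing
    return math.log2((near_pos * hits_neg) / (near_neg * hits_pos))


word_counts, pair_counts = cooccurrence_counts(TOY_CORPUS)
for candidate in ("νόστιμο", "αργή"):
    score = semantic_orientation(candidate, "εξαιρετικό", "κακή",
                                 word_counts, pair_counts)
    print(candidate, round(score, 2))
```

With a corpus this small the scores only show the sign of the association. A usable Greek version would need a corpus closer to the scale discussed above (e.g. europarl) and carefully chosen seed words.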


  • project ideas
    • decipherment of Linear A ~
    • basic dictionary-based nlg for greek (panagiotis)
    • nltk like, but updated w/ support for greek (who ?)
    • Dictionary/language graph for greek ? (theodim)
    • spell checker improvement (theodim)
    • Turney-based sentiment analysis for Greek
    • topic detection (theodim)
    • Corpus -- there's nothing in greek -- perhaps crawl for it ? (panagiotis)
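
As a possible starting point for the spell checker idea, here is a rough sketch of a frequency-based corrector that proposes known words one edit away from the input. This is a generic edit-distance approach, not how ispell works, and nothing here was specified at the meeting: the alphabet string, the `edits1` and `suggest` names and the toy word frequencies are assumptions for illustration. In practice the frequencies would come from a crawled Greek corpus.

```python
from collections import Counter

# Assumed alphabet: lowercase Greek letters, final sigma and accented vowels.
GREEK_LETTERS = "αβγδεζηθικλμνξοπρστυφχψωςάέήίόύώϊϋΐΰ"


def edits1(word, alphabet=GREEK_LETTERS):
    """Return every string one delete, transpose, replace or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + ch + right[1:]
                for left, right in splits if right for ch in alphabet]
    inserts = [left + ch + right for left, right in splits for ch in alphabet]
    return set(deletes + transposes + replaces + inserts)


def suggest(word, word_freq):
    """Return `word` if it is known, else the most frequent known word
    one edit away, else `word` unchanged."""
    if word in word_freq:
        return word
    candidates = [w for w in edits1(word) if w in word_freq]
    if not candidates:
        return word
    return max(candidates, key=lambda w: word_freq[w])


# Hypothetical frequencies standing in for counts from a real corpus.
WORD_FREQ = Counter({"γλώσσα": 5, "κλώσα": 1, "λέξη": 3})

print(suggest("γλώσα", WORD_FREQ))
```

Greek accents are treated as separate letters here, so a missing or misplaced accent counts as one edit.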