NOSDAC Workshop

The Workshop on Automatic Processing of Non-standard Data Sources in Corpus-based Research (NOSDAC) was held at the University of Cologne on August 31st, 2012.


Program

The workshop featured nine presentations discussing challenges (e.g. spelling variation, encoding, representation and annotation) in handling historical, internet and SMS data for linguistic research. Each abstract received a 40-minute slot: 30 minutes for the presentation and 10 minutes for discussion.

The program of the 2012 NOSDAC Workshop, excluding the coffee and lunch breaks, was as follows:

09:20-10:00
Tagging of Blog Comments: Frequent Errors, Adjustments and Improved Accuracy
Bianka Trevisan and Melanie Neunerdt (RWTH University Aachen, Germany)

10:00-10:40
Identifying Patterns in Internet Language: Preliminary Results for Portuguese
Marcos Zampieri, Jürgen Hermes and Stephan Schwiebert (University of Cologne, Germany)

11:00-11:40
Annotation of Corpora via Lexica with Variant Relations
Armin Hoenen, Rüdiger Gleim, Alexander Mehler (University of Frankfurt, Germany)

11:40-12:20
Non-standard Data in Swiss Text Messages
Simone Ueberwasser (University of Zurich, Switzerland)

14:00-14:40
Processing and Representing Computer-mediated Discourse: An Open Issue in Corpus Linguistics
Michael Beißwenger (TU Dortmund University, Germany) and Lothar Lemnitzer (Berlin-Brandenburg Academy of Sciences and Humanities, Germany)

14:40-15:20
Wikipedia-based Corpora for Analyzing Revisions, Discussions and Text Quality in Collaborative Writing
Johannes Daxenberger, Oliver Ferschke and Iryna Gurevych (TU Darmstadt, Germany)

15:20-16:00
Code Alternation (Arabic - French) in Tunisian Newsgroups and Blogs
Sascha Diwersy (University of Cologne, Germany) and Fabrice Isaac (University of Paris 13, France)

16:20-17:00
The Digital Romansh Chrestomathy - A Collaborative Digitisation Project
Claes Neuefeind (University of Cologne, Germany)

17:00-17:40
CoLaMer - A Corpus of Merowingian Latin
Rembert Eufe (University of Regensburg, Germany)

After the workshop, all authors were invited to submit a paper for publication in a special volume of the ZSM Studien Series. Submitted papers were peer-reviewed by a committee of experts and selected according to scientific merit. This volume was sponsored by the Center for Multilingualism (ZSM) of the University of Cologne.