Abstract
Many aspects of linguistic research, whatever their aims and objectives, rely on cross-language analysis for their results. In particular, any research into generic attributes, universals, or inter-language comparisons requires samples of languages in a readily accessible format, which are clean and of adequate size for statistical analysis. Implicit in such understanding and detection of 'universal' attributes of language is the need to study and analyse a representative set of the human language chorus. So, as an ongoing process during recent years, many raw text samples, in electronic format, have been collected to create a suitably diverse repository. Predominantly, the texts obtained are freely available on a variety of sites over the Internet and cover all of the major language groups. These comprise Austro-Asiatic, Amerindian, Sino-Tibetan, Indo-European (Indo-Iranian, Hellenic, Celtic, Italic, Germanic and Slavic), Austronesian, Altaic, Uralic, Niger-Congo and independents, and currently total over fifty language scripts.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the workshop on the amazing utility of parallel and comparable corpora |
| Subtitle of host publication | Fourth International Conference on Language Resources and Evaluation (LREC), 2004 |
| Editors | Nicoletta Calzolari |
| Place of Publication | Paris |
| Publisher | European Language Resources Association |
| Pages | 50-53 |
| Number of pages | 4 |
| Publication status | Published - 25 May 2004 |
| Event | 4th International Conference on Language Resources and Evaluation - Centro Cultural de Belem, Lisbon, Portugal. Duration: 24 May 2004 → 28 May 2004. http://www.lrec-conf.org/lrec2004/index.php |
Conference
| Conference | 4th International Conference on Language Resources and Evaluation |
|---|---|
| Country/Territory | Portugal |
| City | Lisbon |
| Period | 24/05/04 → 28/05/04 |
| Internet address | http://www.lrec-conf.org/lrec2004/index.php |