Increasing our ignorance of language: identifying language structure in an unknown 'signal'

John Elliott, Eric Atwell, Bill Whyte

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper describes algorithms and software developed to characterise and detect generic intelligent language-like features in an input signal, using natural language learning techniques to look for characteristic statistical "language-signatures" in test corpora. As a first step towards such species-independent language detection, we present a suite of programs to analyse digital representations of a range of data, and use the results to infer whether or not there are language-like structures which distinguish this data from other sources, such as music, images, and white noise. Outside our own immediate NLP sphere, generic communication techniques are of particular interest in the astronautical community, where two sessions are dedicated to SETI at their annual international conference, with topics ranging from detecting ET technology to the ethics and logistics of message construction (Elliott and Atwell, 1999; Ollongren, 2000; Vakoch, 2000).
Original language: English
Title of host publication: Proceedings of CoNLL-2000 and LLL-2000
Editors: Claire Cardie, Walter Daelemans, Claire Nédellec, Erik Tjong Kim Sang
Place of Publication: New Brunswick, NJ
Publisher: Association for Computational Linguistics
Pages: 25-30
Number of pages: 5
Publication status: Published - 13 Sept 2000
Event: Fourth Conference on Computational Language Learning (CoNLL-2000) - Lisbon, Portugal
Duration: 13 Sept 2000 to 14 Sept 2000

Conference

Conference: Fourth Conference on Computational Language Learning (CoNLL-2000)
Country/Territory: Portugal
City: Lisbon
Period: 13/09/00 to 14/09/00
