This is a collection of the papers presented at the Corpus Linguistics 2005 conference, which was held in Birmingham on July 14-17, 2005. Some of the papers are available either as Word documents or as PDF files.
The proceedings have been divided into 11 subcategories:
Contrastive Corpus Linguistics
Language Learning & Error Analysis through Corpora
Language Processing & Corpus Tool
Phraseology & Patterns in language
Rachel Aires, Diana Santos & Sandra Aluisio: "Yes, user!": compiling a corpus according to what the user wants
See "Yes, user!".doc
Latifa Al-Sulaiti and Eric Atwell: Extending the Corpus of Contemporary Arabic
See Extending the Corpus of Contemporary Arabic.doc
Wendy Anderson & Dave Beavan: Internet delivery of time-synchronised multimedia: the SCOTS Projects
See Traditional transcriptions.doc
Caroline Barrière & Akakpo Agbago: Corpus Construction for Terminology
See Terminology.doc
Sara Piccioni: The Lorca corpus at the crossroads of philology and corpus linguistics
See The Lorca corpus at the crossroads of philology and corpus linguistics.doc
Gong Wengao: English in computer-mediated environments: a neglected dimension in large English corpus compilation
See English in computer-mediated environments.pdf
Hilary Nesi, Sheena Gardner, Richard Forsyth, Dawn Hindle, Paul Wickens, Signe Ebeling, Maria Leedham, Paul Thompson and Alois Heuboeck: Towards the compilation of a corpus of assessed student writing
See Towards the compilation of a corpus.doc
Gisle Andersen: Assessing algorithms for automatic extraction of anglicisms in Norwegian texts
See Assessing algorithms.doc
József Andor: A Lexical Semantic-Pragmatic Analysis of the Meaning Potentials of Amplifying Prefixes in English and Hungarian: A Corpus-based Case Study of Near Synonymy
See A Corpus-based Case Study of Near Synonymy.doc
Sandrelli Annalisa & Bendazzoli Claudio: Lexical patterns in simultaneous interpreting: a preliminary investigation of EPIC (European Parliament Interpreting Corpus)
See Lexical patterns in simultaneous interpreting: a preliminary investigation of EPIC.doc
Marianna Apidianaki: Translation prediction using word co-occurrence graphs
See Translation prediction using word cooccurrence graphs
Tatjana Balažic Bulc: Connectors in students' academic writing in two closely related languages
See Connectors in students' academic writing in two closely related languages.doc
Silvia Bernardini & Marco Baroni: Spotting translationese: A corpus-driven approach using support vector machines
See Spotting translationese.doc
Gabriela Castelo Branco Ribeiro & Maria Carmelita Padua Dias: Two corpus-based studies about the translation of adjectives in English and Brazilian Portuguese
Wallace Chen: Patterns of Connectors in the English-Chinese Parallel Corpus of Popular Science Texts
Debbie Elliott: Using corpora to automatically detect untranslated and "outrageous" words in machine translation output
Ana Frankenberg-Garcia: A corpus-based study of loan words in original and translated texts
See A corpus-based study of loan words.doc
Randall L. Jones: Analysis of lexical correspondence in an English-German parallel corpus
Zhenglin Jin & Caroline Barriere: Exploring sentence variations in bilingual corpora
See Exploring sentence variations with bilingual corpora.doc
Tony McEnery and Richard Xiao: Passive constructions in English and Chinese: A contrastive and translation study
See Passive constructions in English and Chinese.doc
Stella Neumann and Silvia Hansen-Schirra: The CroCo Project: Crosslinguistic corpora for the investigation of explicitation in translations
See The CroCo Project.pdf
Pablo Romero Fresco: The translation of phraseology in a parallel (English-Spanish) audiovisual corpus.
See The translation of phraseology in a parallel.doc
Doaa A. Samy: Named Entities: Structure and Translation. A Study Based on a Parallel Corpus (Arabic-Spanish-English)
See Named Entities.doc
Tamás Váradi: Taking stock of the Bilingual Lexicon
See Taking Stock of the Bilingual Lexicon.doc
Nadine Aldinger: Corpus-driven genitive disambiguation
See Corpus-driven genitive disambiguation.doc
Minhee Bang: Representation of foreign countries in two US newspapers: premodifications of keywords, countries, country, nations and nation
See Representation of foreign countries in two US newspapers.doc
Michael Barlow: Input grammars and output grammars: Investigating the language of individual speakers
Christian Chiarcos & Olga Krasavina: Rhetorical Distance Revisited: A pilot study
See Rhetorical Distance Revisited.doc
Huaqing Hong: SCORE: A Multimodal Corpus Database of Education Discourse in Singapore Schools
See Scope.pdf
Henk Louw: Really Too Very Much: Adverbial Intensifiers in Black South African English
See REALLY TOO VERY MUCH.doc
Ling Yin & Richard Power: Investigation of the structure of topic expressions: a corpus-based approach
See Investigation of the Structure of Topic Expressions.doc
Massimo Poesio & Ron Artstein: Annotating (anaphoric) ambiguity
See Annotating (Anaphoric) Ambiguity.pdf
Monika A. Bednarek: "He's nice but Tim" -- contrastive evaluation in the British press
See 'He's nice but Tim': contrast in British newspaper discourse.doc
Sara Radighieri: Arts in the news: Evaluative language use in the 'arts review'
See Arts in the news.doc
Solveig Granath & Michael Wherrity: Prepositions with that-clause complements in tagged corpora, with a special focus on in that
See Prepositions with that-clause complements in tagged corpora.doc
Vladimir Petkevic & Frantisek Cermak: Linguistically motivated tagging as the base for a corpus-based grammar
See Linguistically Motivated Tagging as a Base for a Corpus-Based Grammar.doc
Simone Sarmento: Distribution of Modal Verbs in an Aviation Corpus
See Distribution of Modal Verbs in an Aviation Corpus.doc
Chris Shei: Analysing Chinese Sentence-final Particles Using Academia Sinica Balanced Corpus of Modern Chinese
See Analysing Chinese Sentence.doc
Seo-in Shin: Automatic Pattern Extraction for Korean Sentence Parsing
See Automatic Pattern Extraction for Korean Sentence Parsing.doc
Mariko Abe and Yukio Tono: Variations in L2 spoken and written English: investigating patterns of grammatical errors across proficiency levels
See Variations in L2 spoken and written English.doc
María Belén Díez Bedmar: Struggling with English at University level: error patterns and problematic areas of first-year students' interlanguage
See Bedmar Uni English.doc
Xiaotian Guo: Modal Auxiliaries in Phraseology: A Contrastive Study of learner English and NS English
See A Contrastive Study of Learner English and NS English.doc
Anke Lüdeling, Peter Adolphs, Emil Kroymann & Maik Walter: Multi-level error annotation in learner corpora
See Multi-level error annotation in learner corpora.doc
Zhang Yang: College English Course Corpus
Sabine Bartsch, Elke Teich, Monica Holtz & Richard Eckart: Corpus-based register profiling of texts from mechanical engineering
See Corpus-based register profiling of texts.pdf
Anja Belz: Corpus-driven Generation of weather Forecasts
See Corpus-driven Generation of weather Forecasts.pdf
Pernilla Danielsson & Andrew Sayers: Enhancing Concordance Method: Introducing the CHAB
Stefan Evert & Manuela Schonenberger: Separating the sheep from the goats: Clarifying corpus content using XML
See Separating the sheep from the goats.doc
David Hardcastle: Using the distributional hypothesis to derive co-occurrence scores from the British National Corpus
See Using the distributional hypothesis.doc
Laura Löfberg, Scott Piao, Asko Nykanen, Krista Varantola, Paul Rayson and Jukka-Pekka Juntunen: A semantic tagger for the Finnish language
See A semantic tagger for the Finnish language.doc
Yuji Matsumoto, Masayuki Asahara, Kou Kawabe, Yurika Takashi, Yukio Tono, Akira Ohtani and Toshio Morita: ChaKi: An Annotated Corpora Management and Search System
See ChaKi.doc
Débora Oliveira, Diana Santos, Luis Sarmento & Belinda Maia: Corpus analysis for indexing: when corpus-based terminology makes a difference
See Corpus analysis for indexing.doc
Shih-Ping Wang: Integrating corpora and word-focused tasks into a linguistics project for word growth
See Integrating corpora and word-focused tasks into a linguistics project.doc
Maria Zimina: Bi-text topography and quantitative approaches of parallel text processing
See Bi-text Topography and Quantitative Approaches.doc
Eros Zanchetta and Marco Baroni: Morph-it! A free corpus-based morphological resource for the Italian language
See Morph-it!.doc
Antti Arppe: The role of morphological features in distinguishing semantically similar words
See The role of morphological features in distinguishing semantically similar words.doc
Jørg Asmussen: Automatic determination of new words within domain-specific vocabularies using document classification and frequency profiling
See Automatic detection of new domain-specific words.
Marco Baroni & Stefan Evert: Testing the extrapolation quality of word frequency models
See Testing the extrapolation quality.pdf
Dr Paul Doyle: Replicating Corpus-Based Linguistics: Investigating Lexical Networks in Text
See Replication and Corpus Linguistics.pdf
Cvetana Krstev & Dusko Vitas: Corpus and Lexicon - Mutual Incompleteness
See Corpus and Lexicon.doc
Jennifer Pedler: Using semantic associations for the detection of real-word spelling errors
See Using semantic associations for the detection of real-word spelling errors.doc
Scott S.L. Piao, Dawn Archer, Olga Mudraya, Paul Rayson, Roger Garside, Tony McEnery, Andrew Wilson: A Large Semantic Lexicon for Corpus Annotation
See A Large Semantic Lexicon for Corpus Annotation.doc
Elisabete Marques Ranchhod: Using Corpora to Increase Portuguese MWE Dictionaries. Tagging MWE in a Portuguese Corpus.
See Using Corpora to Increase Portuguese MWE Dictionaries.pdf
Sofie Van Gijsel, Dirk Speelman & Dirk Geeraerts: A Variationist, Corpus Linguistic Analysis of Lexical Richness
See Lexical Richness.doc
Frantisek Cermak & Michal Křen: Large Corpora, Lexical Frequencies and Coverage of Texts
See Large Corpora, Lexical Frequencies and Coverage of Texts.doc
Christopher Gledhill & Pierre Frath: A Reference-based Theory of Phraseological Units: the Evidence of Fossils.
See A Reference-based Theory of Phraseological Units.doc
Eva Hajičová, Jiri Havelka & Katerina Vesela: Corpus Evidence of Contextual Boundness and Focus
See Corpus Evidence of Contextual Boundness and Focus.doc
Csaba Oravecz, Karoly Varasdi & Viktor Nagy: Lexical idiosyncrasy in MWE extraction
See Lexical idiosyncrasy in MWE extraction.doc
Bertus van Rooy: Expressions of modality in Black South African English
See Expressions of modality in Black South African English.doc
Petra Storjohann: Corpus-driven vs. corpus-based approach to the study of relational patterns
See Corpus-driven vs. corpus-based approach.doc
Christiane Wanzeck: The Determination of Phraseological Units in Historical Corpora: An Analysis System for Early New High German
See The Determination of Phraseological Units in Historical Corpora.doc
Abdulrahman Almuhareb & Massimo Poesio: Finding Attributes in the Web
See Finding Attributes in the Web Using a Parser.pdf
Ilias Koutsis, George Kouklakis, George Mikros & George Markopoulos: MINOTAVROS: A tool for the semi-automated creation of large corpora from the Web.
See Minotavros.doc
Alexander Mehler & Rudiger Gleim: Polymorphism in Generic Web Units - A Corpus Linguistic Study
See Alexander_Mehler_and_Ruediger_Gleim_Corpus_Linguistics_2005.pdf
Antoinette Renouf: The WebCorp Search Engine: a holistic approach to web text search
See The WebCorp Search Engine.doc
Jesús Tomás, Francisco Casacuberta & Jaime Lloret: WebMining: Non-supervised system to obtain parallel corpus from the Web
See WebMining.pdf
Motoko Ueyama & Marco Baroni: Automated construction and evaluation of a Japanese web-based reference corpus
See Automated Construction and Evaluation of Japanese Web-based Reference Corpora.doc
Adriano Allora: A Tentative Typology of Net-mediated Communication
See A Tentative Typology of Net-mediated Communication.pdf
Knut Hofland & Annette Myre Jorgensen: COLA: A Spanish spoken corpus of youth language
See COLA.doc
Kikuo Maekawa: Quantitative Analysis of Word-form Variation Using a Spontaneous Speech Corpus
See Quantitative Analysis of Word-form Variation.doc
Antonio Moreno-Sandoval & Ana Gonzales-Ledesma: Pragmatic analysis of man-machine interactions in a spontaneous speech corpus
See Pragmatic analysis of man-machine interactions.doc