Corpus linguistics (CL), with its decidedly empirical approach to language research, has greatly enriched previous paradigms to the point of becoming an obligatory methodological reference in the current landscape of linguistic studies.
We are interested in highlighting two strands: one dealing with developments in corpus linguistics and the other with its applications. Like other empirical research within linguistics, CL research straddles the humanities and the social sciences, with computational linguistics as its foundation. From the humanities it takes its primary interest in the study of language in its multiple aspects; from the social sciences, a large part of its quantitative methodology (mathematics, statistics, etc.); and from computer science, the development of increasingly sophisticated analytical tools. The methodologies used in CL, far from being static, continue to evolve and incorporate important developments, whether through increasingly sophisticated software packages for ad hoc corpus research, dedicated portals, or tools focused on a variety of research tasks.
Research on CL developments involves qualitative analysis methods, textual annotation and quantitative analysis. In addition, some recent developments in computer science, such as sentiment analysis (also called opinion mining), have turned to the analysis of large amounts of web data (big data).
In terms of applications, corpus linguistics has virtually no limits; its great strength is the investigation of large datasets that the analyst cannot handle effectively through manual analysis. CL is now applied to every area of linguistic research, whether the genres studied are digital or non-digital. Non-digital genres must first be digitised, since CL operates only on digitised texts. However, although CL is the fundamental methodology for many researchers, it does not dispense with qualitative or manual analysis, and in practice it works in synergy with other approaches. It is very difficult today to conceive of a dictionary or a grammar without corpus research. Beyond lexicography and phraseology, which have grown hand in hand with the corpus, we find applications in all types of linguistic analysis, whether pragmatic or discursive, and more recently in stylistic analysis. There are also applications to second language acquisition and teaching and to research into specialised languages, as well as the invaluable contribution of CL to translation studies, given that the corpus is a fundamental tool for translators. Genuine networks of researchers now work on specific aspects of these fields.
Returning to our starting point, we focus our research on evaluating the strength of proposals based on techniques developed within corpus linguistics, across several research fronts:
- Design and compilation of representative linguistic corpora of fiction and non-fiction genres.
- Computer developments in corpus linguistics.
- Developments in automatic, corpus-based statistical quantification.
- Development and analysis of monomodal or multimodal texts from digital and non-digital genres using corpus techniques.
- Applications to lexicography, phraseology, translation studies, and language learning and teaching.
- Lexicographical, phraseological, grammatical and translatological analysis
Evaluation of corpus linguistics in the compilation of monolingual and multilingual dictionaries, including phraseological aspects, in the writing of grammars based on actual language use, and in translation.
- Corpus-assisted language and linguistics teaching and learning
Research into learners' interlanguage, contrastive studies through corpus analysis, and the use of corpus techniques in the classroom teaching of languages and linguistics (data-driven learning).
- Developments in the design of monomodal and multimodal corpora with and without annotation
Analysis of the principles of corpus compilation, whether the corpora are synchronic (current) or diachronic, monolingual or multilingual, and whether they examine the language as a whole or only part of it. Analysis of the role of different types of annotation: grammatical, semantic, discursive and multimodal.
- Analysis of written, oral and digital discourses and genres
Research that prioritises the use of corpus tools across a variety of genres and discourses in order to evaluate the effectiveness of those tools.
- Developments in opinion mining (sentiment analysis)
Analysis of the effectiveness of tools designed by computational linguists and computer engineers for examining digital genres, such as computer-mediated communication. This includes the evaluation or subjectivity that users express in commercial and non-commercial contexts.
Partners
- Luisa Chierichetti - Università degli Studi di Bergamo (Italy).
- Giovanni Garofalo - Università degli Studi di Bergamo (Italy).
- Carolina Amador Moreno
- Cristina María Tello Barbé
Work team
- Monika Bednarek - University of Sydney (Sydney, Australia).
- Helen Caple - University of New South Wales (Sydney, Australia).
- Maria Luisa Carrió Pastor - UPV-Valencia
- Annabel Kay Ruiz - USA-FUS
- Antonio Moreno Ortiz - UMA-Málaga
- Roberta Piazza
- Eva María Gómez Jiménez - UGR-Granada
Blasco Ibáñez Campus
Av. Blasco Ibáñez, 32
46010 València (Valencia)