Corpus Linguistics: developments and applications - CORPLING

Reference of the Group:

GIUV2018-425

 
Description of research activity:
Corpus linguistics (CL), with its decidedly empirical approach to language research, has greatly enriched earlier paradigms and has become an essential methodological reference in current linguistic studies. We focus on two strands: developments in corpus linguistics and its applications. Like other empirical research within linguistics, CL research straddles the humanities and the social sciences and is underpinned by computational linguistics. From the humanities it takes its primary interest in the study of language in its multiple aspects; from the social sciences, much of its quantitative methodology (mathematics, statistics, etc.); and from computer science, the development of increasingly sophisticated analytical tools.
In this respect, the methodologies used in CL, far from being static, continue to evolve and incorporate important developments, whether through increasingly sophisticated software packages for ad hoc corpus research, dedicated portals, or tools aimed at a variety of research tasks. Research on CL developments is related to qualitative analysis methods, textual annotation and quantitative analysis. In addition, some recent developments in computer science, such as sentiment analysis or opinion mining, have turned to the analysis of large amounts of web data (big data).
In terms of applications, the scope of corpus linguistics is practically unlimited; its great strength lies in investigating large bodies of data that an analyst cannot handle effectively by hand. CL is now applied in every area of linguistic research, to digital and non-digital genres alike. Non-digital genres must first be digitised, since CL necessarily operates on digitised texts. However, although CL is the fundamental methodology for many researchers, it does not dispense with qualitative or manual analysis, and in its scientific output it works in synergy with other approaches. It is very difficult today to conceive of a dictionary or a grammar without corpus research. Beyond lexicography and phraseology, which have grown hand in hand with the corpus, there are applications in all types of linguistic analysis, whether pragmatic or discursive, including, more recently, stylistic analysis.
Other applications include second language acquisition and teaching and research into specialised languages, as well as the invaluable contribution of CL to translation studies, given that the corpus is a fundamental tool for translators. There are real networks of researchers working on specific aspects. Returning to our starting point, however, we focus our research on evaluating the strength of proposals based on corpus linguistics techniques across different research fronts.
 
Web:
 
Scientific-technical goals:
  • Design and compilation of linguistic corpora representative of fiction and non-fiction genres.
  • Software developments in corpus linguistics.
  • Automated developments in corpus-based statistical quantification.
  • Development and analysis of monomodal or multimodal texts from digital and non-digital genres using corpus techniques.
  • Lexicographical, phraseological and translation applications, and applications to language learning and teaching.
 
Research lines:
  • Developments in the design of monomodal and multimodal corpora with and without annotation. Analysis of the fundamentals of compiling corpora, whether synchronic (current) or diachronic, monolingual or multilingual, for the examination of the language as a whole or part of it. Analysis of the role of different types of annotation: grammatical, semantic, discursive, multimodal.
  • Analysis of written, oral and digital discourses and genres. Research that prioritises the use of corpus tools across a variety of genres and discourses in order to evaluate the effectiveness of those tools.
  • Lexicographical, phraseological, grammatical and translatological analysis. Evaluation of corpus linguistics in the compilation of monolingual and multilingual dictionaries, including phraseological aspects, in the writing of grammars based on actual language use, and in translation.
  • Corpus-assisted language and linguistics teaching and learning. Research into learners' interlanguage, contrastive studies through corpus analysis, and the use of corpus techniques in the language and linguistics classroom (data-driven learning).
  • Developments in opinion mining (sentiment analysis). Analysis of the effectiveness of tools designed by computational linguists and computer engineers for examining digital genres, such as computer-mediated communication. This includes the evaluation or subjectivity expressed by users in commercial and non-commercial contexts.
 
Group members:
Name | Nature of participation | Entity | Description
CARMEN GREGORI SIGNES | Director | Universitat de València
Research team
MIGUEL FUSTER MARQUEZ | Member | Universitat de València
INMACULADA GARNES TARAZONA | Member | Universitat de València
MILAGROS DEL SAZ RUBIO | Collaborator | Universitat Politècnica de València | tenured university professor
JOSE SANTAEMILIA RUIZ | Collaborator | Universitat de València
SERGIO MARUENDA BATALLER | Collaborator | Universitat de València
MARIA LLUISA GEA VALOR | Collaborator | Universitat de València
ANA BELEN CABREJAS PEÑUELAS | Collaborator | Universitat de València
MARIA ALCANTUD DIAZ | Collaborator | Universitat de València
MARICEL ESTEBAN FONOLLOSA | Collaborator | Universitat de València
LAURA MIÑANO MAÑERO | Collaborator | Universitat de València
PAULA RODRIGUEZ ABRUÑEIRAS | Collaborator | Universidad de Santiago de Compostela | pre-tenured lecturer
VERONICA FALQUET APARISI | Collaborator | Universitat de València
VERONICA FALQUET APARISI | Collaborator | Universitat de València - Estudi General | UVEG PhD student
LUISA CHIERICHETTI | Collaborator | Università degli Studi di Bergamo | tenured university professor
GIOVANNI GAROFALO | Collaborator | Università degli Studi di Bergamo | full university professor
MONIKA BEDNAREK | Collaborator | University of Sydney (Australia) | full university professor
HELENE CAPLE | Collaborator | University of New South Wales (Sydney, Australia) | professor
JULIA VALEIRAS JURADO | Collaborator | Universitat Jaume I | professor
MARIA EIVOR JORDA MATHIASEN | Collaborator | Universitat de València
LUCIA BELLES CALVERA | Collaborator | Universitat de València
Laura Mercé Moreno Serrano | Collaborator | Universitat de València - Estudi General | UVEG PhD student
CRISTINA MARIA TELLO BARBE | Collaborator | Universitat de València
CRISTINA MARIA TELLO BARBE | Collaborator | Universitat de València - Estudi General | UVEG PhD student
Carolina Amador Moreno | Collaborator | APPLIED BIOLOGICAL MATERIALS INC. | full university professor
Carmen Maíz Arévalo | Collaborator | Universidad Complutense de Madrid | tenured university professor
CARMEN PÉREZ SABATER | Collaborator | Universitat Politècnica de València | tenured university professor
MARÍA GARCÍA GÁMEZ | Collaborator | UNIVERSIDAD DE MALAGA | trainee research staff
MARÍA TERESA TABOADA GÓMEZ | Collaborator | Simon Fraser University | full university professor
STEFANIA M. MACI | Collaborator | Università degli Studi di Bergamo | full university professor
LAURA HIDALGO DOWNING | Collaborator | Universidad Autónoma de Madrid | full university professor
OLGA CRUZ MOYA | Collaborator | Universidad Pablo de Olavide (Sevilla) | tenured university professor
CARLA FERNÁNDEZ MELENDRES | Collaborator | UNIVERSIDAD DE MALAGA | PhD student from other entities
OLENA HALYNSKA | Collaborator | Universidad Lingüística Estatal de Kyiv, Ukraine | researcher
JAVIER FERNÁNDEZ CRUZ | Collaborator | UNIVERSIDAD DE MALAGA | tenured university professor
PAOLO ROSSO | Collaborator | Universitat Politècnica de València | full university professor
EWA PIECHURSKA-KUCIEL | Collaborator | University of Opole | full university professor
ROSA MUÑOZ LUNA | Collaborator | UNIVERSIDAD DE MALAGA | tenured university professor
GLORIA ÁLVAREZ BENITO | Collaborator | Universidad de Sevilla | tenured university professor
JOSE GABRIEL DE AMORES CARREDANO | Collaborator | Universidad de Sevilla | tenured university professor
MARTA INÉS TORDESILLAS COLADO | Collaborator | Universidad Autónoma de Madrid | tenured university professor
 
CNAE:
  • -
 
Associated structure:
  • Inter-univ. Institute for Applied Modern Languages (IULMA)
 
Keywords:
  • corpus compilation, monomodal, multimodal, diachrony, synchrony, annotation
  • corpus tools, genres, discourses, effectiveness
  • analysis, lexicography, phraseology, grammar, corpus linguistics, translation
  • data-driven learning, interlanguage, L2, language teaching, contrastive linguistics
  • corpus software, digital genres, commercial use, political use