CorTexT: the technological platform of IFRIS

The CorTexT initiative of IFRIS is a collaborative project that brings research dynamics and instrumentation dynamics together in a single platform.

Context and Issues

The growing number of resources made available on the internet offers a privileged field of study for the analysis of textual data. Scientific and patent databases, which track science and innovation, provide detailed information for analyzing scientific production.

Faced with this mass of more or less structured data, scholarly work in the humanities and social sciences on research and innovation must now tackle the analysis of large, heterogeneous corpora in order to both characterize and measure the phenomena under study.

Answers provided by the CorTexT platform

These two aspects call for research methods and tools developed in the various scientific and technical fields concerned with the mechanisms involved: natural language processing, information retrieval, knowledge engineering, sociology of networks, scientometrics, controversy analysis and semiotics.

To support the work of analysis and interpretation on the issues studied at the French Institute for "Research, Innovation and Society" (IFRIS), the institute is developing a digital platform called "CorTexT" for processing large text corpora for research, expertise and community learning.

CorTexT is a project supported by INRA SenS, a research unit of IFRIS.


The objective is to provide IFRIS research teams with tools, processing scripts, procedures and methods that help researchers process, characterize, quantify and analyze textual data.

To this end, the CorTexT team provides skills and tools in support of two complementary approaches:

Numerical analysis of data. The tools available will take the form of indicators for positioning and characterizing individuals and groups, in line with current thinking on bibliometrics. In this context, the main data sources are structured databases of scientific production (articles, citations, patents ...).

Distributional and relational analysis. Starting from textual data available on the internet, which is often heterogeneous, we aim to show the relationships between the different concepts or actors that describe a particular space (a theme, region, debate, controversy, discipline ...). A classic example is the analysis of reports on public debates (blogs, newspapers, ...) to make the relationships between actors and arguments in controversies more explicit.
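
As an illustration only, relational analysis of this kind is often approached by counting how frequently two terms appear in the same document and treating the counts as weighted links in a network. The sketch below is not CorTexT code: the function name cooccurrence_edges, the sample documents and the small vocabulary of actors and arguments are all hypothetical, and a real corpus would need proper term extraction rather than simple word matching.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_edges(documents, vocabulary):
    """Count how many documents mention each pair of vocabulary terms.

    Hypothetical helper for illustration; not part of the CorTexT platform.
    """
    vocab = {term.lower() for term in vocabulary}
    edges = Counter()
    for text in documents:
        # Naive tokenization: lowercase alphabetic words only.
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        present = sorted(tokens & vocab)
        # Each unordered pair of terms found in the same document counts once.
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


# Made-up sample texts standing in for blog posts or newspaper articles.
documents = [
    "The ministry defended the safety of the new crop.",
    "Farmers questioned the safety claims made by the ministry.",
    "Farmers raised concerns about biodiversity.",
]
vocabulary = ["ministry", "farmers", "safety", "biodiversity"]

edges = cooccurrence_edges(documents, vocabulary)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

In this toy corpus the pair "ministry" and "safety" co-occurs in two documents and would therefore be drawn as the strongest link; the resulting weighted pairs could be passed to any network analysis or visualization tool.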




Members of the CorTexT staff:


IFRIS, Université Paris-Est

5 boulevard Descartes

77420 Champs Sur Marne