Scientific background and objectives of the French Lexicon project 

    For more than a century, researchers in psycholinguistics have tried to understand the cognitive mechanisms and processing units involved in adult reading. Until now, most studies have been devoted to monosyllabic words only. This is a paradox, since monosyllabic words represent less than 20% of the lexicon. It is therefore urgent to extend the field of research to polysyllabic words, and this is the main objective of the present project.

    This project has two main goals. First, we will collect reaction times in two tasks, lexical decision and naming, for about 40,000 French words by testing 1,230 French-speaking subjects (Program 1). Second, this mega corpus will allow us to address several important unresolved theoretical issues in silent reading and reading aloud. Because the corpus is large, we will be able to conduct multiple regression analyses on continuous variables that have traditionally been treated as categorical in factorial designs. In particular, we will study the role of word length in lexical decision and naming, with length measured in number of letters, number of phonemes and number of syllables (Program 2). We will also study in detail the feedforward and feedback consistency effects in both tasks, a hotly debated question (Program 3). Finally, we will re-examine another important issue in reading, namely the influence of orthographic and phonological neighborhood (Program 4). Since the same stimuli will be used in both tasks, we will systematically compare the results across tasks. We will also compare our French data with the English data from the English Lexicon Project (the only mega corpus currently available for a large set of words, ≈ 40,081).

   To carry out this large-scale project, we have put together a young, dynamic and interdisciplinary team whose members have already worked and published together, and whose expertise covers psycholinguistics, statistics and mathematical modeling.

 

Description of the project and methodology

    Our starting point will be Lexique, the French lexical database we have developed (see www.lexique.org; New, Pallier, Ferrand, & Matos, 2001; New, Pallier, Brysbaert, & Ferrand, 2004). We will select 40,000 monosyllabic and polysyllabic words, of variable lengths and frequencies, from the 130,000 distinct lexical entries available in Lexique. We will also include inflected forms (such as feminine forms, plurals, and verbal forms). A total of 1,230 subjects will be tested in the classical chronometric tasks used in psycholinguistics: 750 in the lexical decision task and 480 in the naming task. Given the scope of the project, subjects will be recruited from two universities, Université René Descartes (Paris) and Université Blaise Pascal (Clermont-Ferrand). The collected reaction times will be submitted to multiple regression analyses in order to study the influence of the tested variables. In particular, we will examine the following variables: length (in number of letters, phonemes and syllables), phonological onset (for the naming task), number of orthographic and phonological neighbors, lexical frequency, and feedforward and feedback consistency. These variables were chosen for their theoretical importance for models of visual word recognition and word naming. Finally, this corpus of reaction times for 40,000 French words will offer researchers a valuable tool to evaluate and constrain the development of models of silent reading and reading aloud. It will also be useful to researchers in other fields of cognitive psychology, such as memory, visual perception and neuropsychology, by helping them match their stimuli on a number of variables, such as reaction times, lexical frequency, orthographic neighborhood, etc.
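    To make the planned analysis concrete, the following minimal sketch fits a multiple regression of item reaction times on three continuous predictors (number of letters, log frequency, number of orthographic neighbors), instead of splitting items into factorial categories. It is only an illustration under stated assumptions: the data are entirely synthetic, the coefficient values in TRUE_COEFS are arbitrary, and the function names fit_ols and solve_linear_system are hypothetical helpers, not part of the project's actual analysis pipeline.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Bring the largest remaining entry of this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, response):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic items only: these values are assumptions made for illustration.
random.seed(0)
TRUE_COEFS = [600.0, 8.0, -30.0, -2.0]  # intercept, letters, log freq, neighbors
items, rts = [], []
for _ in range(500):
    n_letters = random.randint(2, 14)
    log_freq = random.uniform(0.0, 5.0)
    n_neighbors = random.randint(0, 15)
    noise = random.gauss(0.0, 20.0)
    rt = (TRUE_COEFS[0] + TRUE_COEFS[1] * n_letters
          + TRUE_COEFS[2] * log_freq + TRUE_COEFS[3] * n_neighbors + noise)
    items.append((n_letters, log_freq, n_neighbors))
    rts.append(rt)

# Estimated intercept and slopes, in the same order as TRUE_COEFS.
coefs = fit_ols(items, rts)
print([round(c, 2) for c in coefs])
```

    Each slope estimates the change in reaction time (in ms) per unit of the corresponding predictor with the other predictors held constant, which is the kind of question a factorial design with a few length or frequency categories cannot answer as directly.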


Expected results

 The collected reaction times and the analyses we will conduct on them will allow us to (1) understand more precisely the functional architecture of the different levels of processing involved in reading, (2) specify the nature of the representations on which these processes operate, and (3) study the type of coding (orthographic, phonological, semantic) used by these different levels of processing. These results will be crucial for models of reading. In particular, this corpus will allow us to study in detail the processing of polysyllabic words, a field largely neglected until now. Overall, this work will lead to a better understanding of the factors at play in reading.