LGLex syntactic lexicon version 3.3 - 2011/03/31
http://infolingu.univ-mlv.fr/
License: LGPL-LR

The LGLex lexicon is a syntactic lexicon of French verbs, nouns playing the 
predicative role, frozen expressions and adverbs, generated by the LGExtract 
tool (Constant & Tolone, 2010; Tolone, 2011) from the Lexicon-Grammar tables 
under the LGPL-LR license. It is available in both text and XML format.

Note:
Some noun tables also have a morphologically derived verb: 
  f1a,f1b,f1c,f1d,f1r,f2a,f2b,f2c,f21,f3,f4,f5,f9,ad,dr1,drc,es,fr1,fs1,is1
Others have a morphologically derived adjective:
  an01,an02,an03,an04,an05,an06,ansy,es
In addition, the entries of the adverb table peco are also predicative 
adjectives, while those of the adverb tables ppco and pvco are also frozen 
expressions.
These additional entries are included in the complete lexicon and are also 
extracted separately in the directory extra-lexicon/.

_______________________________________________________________________________
Description of the content of the LGLex lexicon in XML format (element <syn-lexicon>):

Each entry is delimited by the tag <entry> with an attribute id, the entry 
identifier, generated by concatenating its grammatical category, the class 
(or table) it comes from and the index of the entry in the table. 
For instance, in <entry id="V_33_24" status="completed">, the identifier is 
V_33_24, which corresponds to the 24th entry (i.e., the entry whose property 
<ID> has the value 24) in the verb class 33. The status can be "completed" 
for a fully coded entry, "to complete" for an entry that has at least one 
uncoded property, or "to encode" for an entry that has at least half of its 
properties uncoded.
Note: A property is called uncoded here when it is present in the table and 
coded ~. This does not take into account the code O in the table of classes, 
which also means that the property must be encoded, but is not listed in the 
table.
Then, information is gathered in four different tags:
- <lexical-info>: lemma and lexical information;
- <arguments>: arguments and their nature;
- <all-constructions>: accepted constructions;
- <example>: an illustrative example.
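
The identifier scheme and the status attribute described above can be 
sketched in Python as follows. This is only an illustration: the function 
name parse_entry_id and the two-entry sample document are made up for the 
sketch, and it assumes that neither the category nor the index contains an 
underscore.

```python
import xml.etree.ElementTree as ET

# Illustrative document: only the id and status attributes are taken
# from the entries shown in this README; the entries are left empty.
SAMPLE = """<syn-lexicon>
  <entry id="V_33_24" status="completed" />
  <entry id="N_fnan_18" status="completed" />
</syn-lexicon>"""


def parse_entry_id(entry_id):
    """Split an identifier such as 'V_33_24' into (category, table, index).

    Assumption: the category is the part before the first underscore and
    the index is the part after the last one; the rest is the table name.
    """
    category, rest = entry_id.split("_", 1)
    table, index = rest.rsplit("_", 1)
    return category, table, int(index)


root = ET.fromstring(SAMPLE)
for entry in root.iter("entry"):
    category, table, index = parse_entry_id(entry.get("id"))
    print(category, table, index, entry.get("status"))
```

With the sample above, the loop prints "V 33 24 completed" and then 
"N fnan 18 completed".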


1) <lexical-info> contains lexical information corresponding to the entry and 
has an attribute cat indicating its category ("verb", "noun" for predicative 
noun, "adj" for predicative adjective, "expr" for frozen expression and 
"adverb"):

For verbs ONLY:
- <lexical-value> includes <lemma>, the value of which is the lemma of the 
entry, that can be completed with preverbal pronouns like "se", "y", "en", 
"le", "la", "les" and/or the negation "ne ... pas" when they are obligatory 
in this entry (tags <ppvse>, <ppvy>, <ppven>, <ppvle>, <ppvla>, <ppvles>, 
<neg> with the attribute value "true");
- <aux> indicates the possible auxiliaries accepted by this entry. It contains
a list including the tags <avoir> or <être> with value "true";
- <traduction> contains, for some entries, the English translation of the lemma;
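
As a rough sketch of how the verbal fields above might be read, the Python 
snippet below collects the lemma, the obligatory preverbal pronouns, the 
negation flag and the auxiliaries of a <lexical-info> element. The fragment 
is hypothetical (it was written for this sketch, not copied from the 
lexicon), and the helper names is_true and describe_verb are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a verb whose entry requires the pronoun "en".
SAMPLE = """<lexical-info cat="verb">
  <lexical-value>
    <lemma value="vouloir" />
    <ppven value="true" />
  </lexical-value>
  <aux>
    <avoir value="true" />
  </aux>
</lexical-info>"""

# Tag name -> surface form of the preverbal pronoun it stands for.
PPV_TAGS = {"ppvse": "se", "ppvy": "y", "ppven": "en",
            "ppvle": "le", "ppvla": "la", "ppvles": "les"}


def is_true(parent, tag):
    """Return True when parent has a child <tag value="true" />."""
    child = parent.find(tag)
    return child is not None and child.get("value") == "true"


def describe_verb(lexical_info):
    """Summarise the verbal part of a <lexical-info> element."""
    lexical_value = lexical_info.find("lexical-value")
    aux = lexical_info.find("aux")
    return {
        "lemma": lexical_value.find("lemma").get("value"),
        "pronouns": [form for tag, form in PPV_TAGS.items()
                     if is_true(lexical_value, tag)],
        "negated": is_true(lexical_value, "neg"),
        "auxiliaries": [] if aux is None else [
            child.tag for child in aux if child.get("value") == "true"],
    }


info = describe_verb(ET.fromstring(SAMPLE))
print(info)
```

For this fragment, info holds the lemma "vouloir", the pronoun list ["en"], 
negated set to False and the auxiliary list ["avoir"].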

For nouns, adjectives, frozen expressions and adverbs:
- <lexical-value> is composed of element <complete>, the value of which is the
 whole entry (it can be multiword) and the elements <noun1>, <adj1>, <det2>, 
<noun2> (for nouns), containing the values of its different components. The 
morphologically derived adjective (resp. verb) might also be indicated in tag 
<adjassoc> (resp. <verbassoc>).
Complete list of elements for all categories:
<adj>, <adj1>, <adv>, <adv1>, <c>, <c0>, <c02>, <c1>, <c12>, <c2>, <c3>, <cc>, 
<cv>, <conj>, <conj2>, <conjcoord>, <conjsub>, <det>, <det0>, <det02>, <det1>, 
<det12>, <det2>, <det3>, <detc>, <detv>, <ilya>, <modif>, <MPA>, <noun1>, 
<noun2>, <nv>, <poss1>, <prep>, <prep0>, <prep1>, <prep2>, <prep3>, 
<prepdetv>, <prepc>, <prepv>, <verb>, <verb2>, <comme>, <ce>, <il>, <ca> and 
also <adjassoc>, <advassoc>, <nounassoc>, <ppvassos>, <ppvseassoc>, 
<ppvenassoc>, <ppvyassoc>, <ppvnegassoc>, <verbassoc>;

For nouns and frozen expressions:
- <Vsup> describes the support verbs associated with the deverbal noun (or 
the frozen expression made of an adjective or an adverb and a support verb). 
It contains <cat>, which always has the value "verb", and a list of <value>, 
representing the possible lexical values of the support verb in the basic 
construction of the entry;

For nouns ONLY:
- <Vconv> is built in the same manner as <Vsup> and refers to a second 
list of lexical values for the converse support verb Vconv, as it can appear 
in some converse constructions (e.g. "N1 Vconv Det N à N0");
- <det-modif-list> includes a list of <determiner-modifier>, which indicate 
the distribution of determiners with possible modifiers for the noun. 
<determiner-modifier> has a tag <det>, which indicates the possible 
determiners separated by '+' (the value <E> stands for the absence of 
determiner). Tag <modifier> indicates whether the noun accepts a modifier 
with the defined determiners. This can be completed with <value-modif>, which 
indicates the possible modifiers separated by '+';
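
The determiner distribution can be unpacked along the following lines. This 
Python sketch uses the <det-modif-list> of the entry N_fnan_18 ("bise") 
shown later in this file; the function name determiner_distribution is an 
assumption, and since that entry uses the tag <modif>, the sketch tries 
both <modif> and <modifier>.

```python
import xml.etree.ElementTree as ET

# Copied from the nominal entry N_fnan_18 given later in this README.
SAMPLE = """<det-modif-list>
  <determiner-modifier>
    <det value="un+une" />
    <modif value="false" />
  </determiner-modifier>
  <determiner-modifier>
    <det value="un+une" />
    <modif value="true" />
  </determiner-modifier>
  <determiner-modifier>
    <det value="la" />
    <modif value="false" />
  </determiner-modifier>
</det-modif-list>"""


def determiner_distribution(det_modif_list):
    """Return one (determiners, accepts_modifier) pair per item."""
    pairs = []
    for item in det_modif_list.findall("determiner-modifier"):
        # Alternative determiners are separated by '+'.
        determiners = item.find("det").get("value").split("+")
        # Assumption: the flag is stored in <modif> or in <modifier>.
        flag = item.find("modif")
        if flag is None:
            flag = item.find("modifier")
        accepts = flag is not None and flag.get("value") == "true"
        pairs.append((determiners, accepts))
    return pairs


for determiners, accepts in determiner_distribution(ET.fromstring(SAMPLE)):
    print(determiners, accepts)
```

The explicit "is None" test matters here: an ElementTree element without 
children is falsy, so "find(...) or find(...)" would not behave as intended.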

For verbs and nouns:
- <prepositions> contains a sequence of <preposition> with an attribute id, 
corresponding to the number of the argument it introduces in the elementary 
construction (0 for subject, 1 for the first argument, 2 for the second 
one, ...). In <preposition>, <prep> represents the different lexical values 
of the preposition.
For instance, the deverbal noun "allergie" (allergy) in table an01 uses 
support verb avoir (have) and enters the elementary sentence "N0 Vsup Det 
N Prép N1". The preposition associated with argument 1 is specified as follows:

<prepositions>
      <preposition id="1">
        <prep value="à" />
      </preposition>
</prepositions>

In the first complement ("Prép N1"), the preposition has the value "à": e.g., 
Léa a une certaine allergie à la poussière (Lea is allergic to dust);

- <locatifs> defines the locative preposition distributions. It contains a 
sequence of <locatif> with an attribute id, referring to the number of the 
associated argument. Element <loc> also has a list of <prep>, representing 
the possible lexical values of the locative prepositions.
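
A minimal sketch of how the <prepositions> element might be turned into a 
lookup table, using the "allergie" fragment given above. The function name 
prepositions_by_argument is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

# The <prepositions> fragment of the "allergie" example above.
SAMPLE = """<prepositions>
  <preposition id="1">
    <prep value="à" />
  </preposition>
</prepositions>"""


def prepositions_by_argument(prepositions):
    """Map each argument number to the lexical values of its preposition."""
    result = {}
    for preposition in prepositions.findall("preposition"):
        argument = int(preposition.get("id"))
        result[argument] = [prep.get("value")
                            for prep in preposition.findall("prep")]
    return result


by_argument = prepositions_by_argument(ET.fromstring(SAMPLE))
```

For this fragment, by_argument maps argument 1 to the single value "à". An 
empty <prepositions /> element, as in the two complete entries at the end 
of this file, yields an empty dictionary.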


2) <arguments> describes the distribution of the different arguments (subject 
and complements) of the entry. It includes a set of <constituent> having an 
attribute pos, that indicates the number of the associated argument in the 
elementary sentence that the entry enters. A constituent is a list of 
<component>, each of them having the following elements:
- <cat> which has an attribute specifying its syntactic nature: "NP" for 
noun phrase, "inf" for infinitive (V-inf W), "comp" for a complementizer 
phrase (Qu P), "leFaitComp" for the noun phrase le fait que P, "siPOuSiP" 
for the specific complementizer phrase si P ou si P and "adj" for an adjective;
- Various semantic features <hum> (human), <nothum> (non human), <plobl> 
(obligatory plural), <npr> (proper noun), <abst> (abstract) and <conc> 
(concrete) with the value "true" when they are verified;

For verbs ONLY:
- Other semantic features are possible: <source>, <destination>, <beneficiaire>
(beneficiary), <detrimentaire> (detrimental), <apparition> (appearance), 
<disparition> (disappearance), <mesure> (measure), <prix> (price);

For nouns ONLY:
- <coll> (collective noun), <plur> (plural);

There also exist several other optional features:
- <mood> with an attribute indicating the mood of the complementizer phrase 
("ind" for indicative and "subj" for subjunctive);
- <contr> with an attribute indicating the number of the argument that controls 
the infinitive. In the "allergie" example above, the distribution of the 
argument at position 1 (i.e., the first complement "Prép N1") is described as 
follows:

  <arguments>
     ...
    <constituent pos="1">
      <component>
        <cat value="inf" />
        <contr value="0" />
      </component>
      <component>
        <cat value="ceComp" />
        <mood value="ind" />
      </component>
      <component>
        <cat value="ceComp" />
        <mood value="subj" />
      </component>
      <component>
        <cat value="NP" />
        <nothum value="true" />
      </component>
    </constituent>
    ...
  </arguments>

The complement "Prép N1" can be:
- an infinitive controlled by argument 0, i.e., the subject N0: e.g., Léa a 
une allergie à travailler (Lea is allergic to working);
- a complementizer phrase in the indicative: e.g., Léa a une allergie à ce 
que nous voyageons (Lea is allergic to the fact that we travel);
- a complementizer phrase in the subjunctive: e.g., Léa a une allergie à ce 
qu'il fasse beau (Lea is allergic to the weather being nice);
- a noun phrase (see previous example).
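
The reading of <arguments> given above can be sketched in Python as follows. 
The fragment is the "allergie" example without its "..." placeholders; the 
function name read_arguments and the shape of the returned dictionaries are 
assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<arguments>
  <constituent pos="1">
    <component>
      <cat value="inf" />
      <contr value="0" />
    </component>
    <component>
      <cat value="ceComp" />
      <mood value="ind" />
    </component>
    <component>
      <cat value="ceComp" />
      <mood value="subj" />
    </component>
    <component>
      <cat value="NP" />
      <nothum value="true" />
    </component>
  </constituent>
</arguments>"""

# Elements whose value is something other than the flag "true".
VALUED = ("cat", "mood", "contr")


def read_arguments(arguments):
    """Map each argument position to the list of its possible components."""
    result = {}
    for constituent in arguments.findall("constituent"):
        components = []
        for component in constituent.findall("component"):
            description = {"features": []}
            for child in component:
                if child.tag in VALUED:
                    description[child.tag] = child.get("value")
                elif child.get("value") == "true":
                    # Semantic feature such as <hum> or <nothum>.
                    description["features"].append(child.tag)
            components.append(description)
        result[int(constituent.get("pos"))] = components
    return result


args = read_arguments(ET.fromstring(SAMPLE))
for component in args[1]:
    print(component)
```

Children without a value attribute (for instance <origin> in verbal entries) 
are simply skipped by this sketch.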

For verbs ONLY:
- <origin> contains the list <orig>, which indicates the complete names of 
the columns in the table that have been used to define the distribution;
- <introd-prep> contains a sequence of <prep>, providing the prepositions 
introducing the argument and indicating their lexical values;
- <introd-loc> contains a sequence of <loc>, providing the locative 
prepositions that introduce the argument and indicating their lexical values.


3) <all-constructions> lists the different constructions that are accepted by 
the entry:
- <absolute-constructions> includes a list of absolute <construction>, whose 
values are the titles of the columns that entirely specify the accepted 
construction with all its constituents.
For instance, in the construction "N0 V N1 Prép N2", N0 represents the 
subject, V indicates the verbal entry, N1 is the first complement and 
Prép N2 is the second one introduced by the preposition Prép.
The construction title is preceded by the string "o::" if the column 
associated with the construction has been coded '+' in the table ("o"), 
or by "true::" if it is a constant '+' in the table of classes ("true"). In 
the latter case, the property is verified by all the entries of the table. 
Such properties are the definitional properties of each class, including the 
base construction;
- <relative-constructions> contains the sequence of all relative 
<construction>, whose values are the titles of the columns that name the 
transformations applied to the base construction 
(e.g., "[passif par]");

For nouns ONLY:
- <reductionsGN> describes reductions of the base sentence construction 
into a construction of another syntactic category, in the present 
case a noun phrase. Reductions are described with a list of <construction>, 
whose values are the titles of the columns specifying the reduction 
construction (e.g., "le N entre N0 et N1");
- <verbales> includes a list of verbal <construction>, whose values are 
the titles of the columns specifying the constructions that are accepted 
by the corresponding verbal entry (<verbassoc>);

For adverbs ONLY:
- <structureAdv> contains the base structure of the multiword adverb 
(e.g., "Prép Det Adj C") and also its variants, as a list of <construction> 
(e.g., "Prép Det C"). This structure stands for "Adv" in the absolute and 
relative constructions (e.g., the absolute construction "N0 V Adv W" can 
be written "N0 V Prép Det Adj C W"). For simple adverbs, the structure is 
not indicated because it is represented directly by "Adv" in the constructions.
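
The "o::" and "true::" prefixes of construction values can be separated 
from the column title with a few lines of Python. This is a sketch under 
assumptions: the function name split_construction is invented, and the 
value "o::N0 V" is a made-up input, whereas the other two values appear in 
the sample entries of this file.

```python
def split_construction(value):
    """Split a construction value into (origin, title).

    origin is "o" or "true" when the value starts with "o::" or "true::",
    and None when there is no such prefix (as for the relative
    constructions in the sample entries, e.g. "[extrap]").
    """
    for origin in ("o", "true"):
        prefix = origin + "::"
        if value.startswith(prefix):
            return origin, value[len(prefix):]
    return None, value


print(split_construction("true::N0 Vsup Det N"))
print(split_construction("o::N0 V"))  # hypothetical value
print(split_construction("[extrap]"))
```

A caller can then treat the titles with origin "true" as the definitional 
properties of the class, since these hold for every entry of the table.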


4) <example> illustrates the entry (solely for verbs and nouns):
The value of <example> is an example sentence containing the entry.
For verbs, all entries contain an example, whereas for nouns 
only a selection of entries contains one.
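
Since nominal entries may carry an empty example (as in the entry N_fnan_18 
at the end of this file), a reader of the XML format will probably want to 
filter empty values out. A minimal Python sketch, in which the function 
name example_sentences is an assumption and the two fragments are reduced 
to their <example> part:

```python
import xml.etree.ElementTree as ET

# <example> parts of the two complete entries given later in this README.
VERB = """<entry id="V_33_24" status="completed">
  <example>
    <example value="Max a candidaté à un poste" />
  </example>
</entry>"""

NOUN = """<entry id="N_fnan_18" status="completed">
  <example>
    <example value="" />
  </example>
</entry>"""


def example_sentences(entry):
    """Return the non-empty example sentences of an <entry> element."""
    outer = entry.find("example")
    if outer is None:
        return []
    # The sentences are stored in inner <example> elements.
    return [inner.get("value") for inner in outer.findall("example")
            if inner.get("value")]


verb_examples = example_sentences(ET.fromstring(VERB))
noun_examples = example_sentences(ET.fromstring(NOUN))
```

Here verb_examples contains the single sentence of the verbal entry, and 
noun_examples is an empty list.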

_______________________________________________________________________________
An entry of the LGLex lexicon in text format includes the same information and is represented as follows:

ID=category_tableNumber_entryNumber;status=...
lexical-info=[...]
args=(...)
all-constructions=[absolute=(...),
                   relative=(...)]
example=[...]
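
The layout above suggests a simple way to split a text-format entry into 
its top-level fields, sketched below in Python. The sample is the verbal 
entry V_33_24 given later in this file. The function name split_text_entry 
is invented, and the sketch assumes that every top-level field starts at 
column 0 and that indented lines continue the previous field, as in the 
examples of this README.

```python
# The verbal entry V_33_24 in text format, as given later in this README.
SAMPLE = '''ID=V_33_24;status=completed
lexical-info=[cat="verb",verb=[lemma="candidater"],
              aux-list=(avoir="true"),
              prepositions=(),
              locatifs=()]
args=(const=[pos="0",dist=(comp=[cat="NP",hum="true",origin=(orig="N0 =: Nhum"),introd-prep=(),introd-loc=()])],
      const=[pos="1",dist=(comp=[cat="NP",nothum="true",origin=(orig="N1 =: N-hum"),introd-prep=(),introd-loc=()])])
all-constructions=[absolute=(construction="true::N0 V à N1"),
                   relative=(construction="[extrap]",construction="Ppv =: y")]
example=[example="Max a candidaté à un poste"]'''


def split_text_entry(text):
    """Return a dict mapping each top-level field name to its raw value."""
    lines = text.strip("\n").split("\n")
    # Header line of the form ID=...;status=...
    fields = dict(part.split("=", 1) for part in lines[0].split(";"))
    current = None
    for line in lines[1:]:
        if line[:1].isspace() and current is not None:
            # Indented line: continuation of the previous field.
            fields[current] += line.strip()
        elif line.strip():
            current, value = line.split("=", 1)
            fields[current] = value
    return fields


fields = split_text_entry(SAMPLE)
print(list(fields))
```

The values are left as raw strings; parsing the nested [...] and (...) 
groups inside each field would need a dedicated parser, which this sketch 
does not attempt.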

_______________________________________________________________________________
A verbal example from the LGLex lexicon in XML format (verb "candidater" (to apply) in table 33):

<entry id="V_33_24" status="completed">
  <lexical-info cat="verb">
    <lexical-value>
      <lemma value="candidater" />
    </lexical-value>
    <aux>
      <avoir value="true" />
    </aux>
    <locatifs />
    <prepositions />
  </lexical-info>
  <arguments>
    <constituent pos="0">
      <component>
        <cat value="NP" />
        <hum value="true" />
        <origin>
          <orig value="N0 =: Nhum" />
        </origin>
        <introd-prep />
        <introd-loc />
      </component>
    </constituent>
    <constituent pos="1">
      <component>
        <cat value="NP" />
        <nothum value="true" />
        <origin>
          <orig value="N1 =: N-hum" />
        </origin>
        <introd-prep />
        <introd-loc />
      </component>
    </constituent>
  </arguments>
  <all-constructions>
    <absolute-constructions>
      <construction value="true::N0 V à N1" />
    </absolute-constructions>
    <relative-constructions>
      <construction value="[extrap]" />
      <construction value="Ppv =: y" />
    </relative-constructions>
  </all-constructions>
  <example>
    <example value="Max a candidaté à un poste" />
  </example>
</entry>

____________________________
Same example in text format:

ID=V_33_24;status=completed
lexical-info=[cat="verb",verb=[lemma="candidater"],
              aux-list=(avoir="true"),
              prepositions=(),
              locatifs=()]
args=(const=[pos="0",dist=(comp=[cat="NP",hum="true",origin=(orig="N0 =: Nhum"),introd-prep=(),introd-loc=()])],
      const=[pos="1",dist=(comp=[cat="NP",nothum="true",origin=(orig="N1 =: N-hum"),introd-prep=(),introd-loc=()])])
all-constructions=[absolute=(construction="true::N0 V à N1"),
                   relative=(construction="[extrap]",construction="Ppv =: y")]
example=[example="Max a candidaté à un poste"]

_______________________________________________________________________________
A nominal example from the LGLex lexicon in XML format (noun "bise" in table fnan, with support verb "faire": "faire la bise" (to kiss on the cheeks)):

<entry id="N_fnan_18" status="completed">
  <lexical-info cat="noun">
    <lexical-value>
      <complete value="bise" />
      <noun1 value="bise" />
    </lexical-value>
    <Vsup>
      <cat value="verb" />
      <value value="faire" />
    </Vsup>
    <det-modif-list>
      <determiner-modifier>
        <det value="un+une" />
        <modif value="false" />
      </determiner-modifier>
      <determiner-modifier>
        <det value="un+une" />
        <modif value="true" />
      </determiner-modifier>
      <determiner-modifier>
        <det value="la" />
        <modif value="false" />
      </determiner-modifier>
    </det-modif-list>
    <prepositions />
  </lexical-info>
  <arguments>
    <constituent pos="0">
      <component>
        <hum value="true" />
        <cat value="NP" />
      </component>
    </constituent>
    <constituent pos="1">
      <component>
        <hum value="true" />
        <cat value="NP" />
      </component>
    </constituent>
  </arguments>
  <all-constructions>
    <absolute-constructions>
      <construction value="true::N0 Vsup Det N à N1" />
      <construction value="true::N0 Vsup Det N" />
    </absolute-constructions>
    <relative-constructions />
    <reductions />
  </all-constructions>
  <example>
    <example value="" />
  </example>
</entry>

____________________________
Same example in text format:

ID=N_fnan_18;status=completed
lexical-info=[cat="noun",noun=[notperm=[complete="bise"],noun1="bise"]],
              Vsup=[cat="verb",list=(value="faire")],
              detN=[list-det-modif=(det-modif=[det="un+une",modif="false"],det-modif=[det="un+une",modif="true"],det-modif=[det="la",modif="false"]),
              prepositions=()]
args=(const=[pos="0",dist=(comp=[hum="true",cat="NP"])],
      const=[pos="1",dist=(comp=[hum="true",cat="NP"])])
all-constructions=[absolute=(construction="true::N0 Vsup Det N à N1",construction="true::N0 Vsup Det N"),
                   relative=(),
                   verbales=(),
                   reductionsGN=()]
example=[example=]

___________
References:

Constant, Matthieu & Tolone, Elsa (2010). A generic tool to generate a lexicon 
for NLP from Lexicon-Grammar tables. Lingue d'Europa e del Mediterraneo, 
Grammatica comparata, vol. 1, pp. 79--93. Edited by Michele De Gioia. Aracne.

Tolone, Elsa (2011). Analyse syntaxique à l'aide des tables du Lexique-Grammaire 
du français. Thèse de doctorat, LIGM, Université Paris-Est. 326 pp.
