USER GUIDE

* OUTILEX DIRECTORY

The Outilex directory includes:
- README.txt (to get started)
- install-outilex (to compile and install C++ programs)
- outilexUI.jar (to run the user interface)
- clean-outilex (to clean compiled programs)
- directory 'bin' (C++ compiled programs)
- directory 'data' (some linguistic data provided with the platform)
- directory 'docs' (documentation)
- directory 'lingdef' (linguistic definitions of the set of tags used in dictionaries and graphs)


This directory also contains a log file (outilex.log) that records all commands launched from the interface. This can help users become familiar with the command syntax of the different programs.


* RUNNING OUTILEX PLATFORM
The Outilex platform User Interface (UI) can be launched by typing:
java -jar outilexUI.jar

* UI GENERAL DESCRIPTION

The UI is composed of:
- a menu (on top); a tool bar is coming soon,
- a process/resource panel (on the left): to create a personal processing chain from the available linguistic resources,
- a content panel (on the right): to display linguistic resources and processing results.

Important note:
Many functions can be run from popup menus (right-click with the mouse).
Double-clicking a resource in the left panel displays it in the right panel.

* GETTING STARTED

-------------------------------
- a project-oriented platform
-------------------------------

The Outilex platform is organized around projects. Each project is composed of a set of resources (texts, dictionaries and grammars). The 'Project' menu lets the user create, open and save projects.
A project is associated with one language, and this language selects a linguistic definition file. Presently, the language is forced to "French".
For example, if 'french' is the project language, the set of linguistic tags used during processing is defined in the file 'lingdef/french/lingdef.xml'.
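
As an illustration of the rule above, here is a minimal Python sketch that builds the path of the linguistic definition file from a project language. The function name 'lingdef_path' is hypothetical, and lowercasing the language name is an assumption based only on the 'French' / 'french' example; the platform itself may resolve the file differently.

```python
from pathlib import PurePosixPath


def lingdef_path(language, root="lingdef"):
    """Return the expected linguistic definition file for a project language.

    Assumption: the directory name is the lowercased language name,
    as in 'lingdef/french/lingdef.xml'.
    """
    return PurePosixPath(root) / language.lower() / "lingdef.xml"


print(lingdef_path("French"))  # lingdef/french/lingdef.xml
```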

-------------------------------------
- creating your own processing chain:
-------------------------------------

Processing a text consists of the following steps:
+ segmenting the text into tokens and sentences (check box 'segmentation')
+ applying dictionaries to the segmented text to obtain a text automaton representing the possible analyses of each sentence (check box 'apply dictionaries')
+ applying a cascade of grammars, in the form of graphs, to the text automaton, resulting in a new text automaton (check box 'apply graphs')
+ applying a grammar to the text automaton to obtain a concordance or a modified text (e.g. an annotated text) (check box 'locate pattern').

---------------
- segmentation
---------------
The text segmentation process creates a directory associated with the text ('<text>.dir') and writes the segmented text, '<text>.segmentation', into this directory.
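
The naming scheme can be sketched in Python as follows. This is only an illustration: the function name 'segmentation_paths' and the sample file name are hypothetical, and the sketch assumes that '<text>' stands for the full file name of the text and that the directory is created next to the text file.

```python
from pathlib import PurePosixPath


def segmentation_paths(text_file):
    """Return (text directory, segmented text) for a given text file.

    Assumptions: '<text>' is the full file name of the text, and
    '<text>.dir' is created in the same directory as the text.
    """
    text = PurePosixPath(text_file)
    text_dir = text.with_name(text.name + ".dir")
    segmented = text_dir / (text.name + ".segmentation")
    return text_dir, segmented


# 'mytext.txt' is a made-up file name used only for the example.
text_dir, segmented = segmentation_paths("mytext.txt")
print(text_dir)
print(segmented)
```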


--------------------
- apply dictionaries
--------------------
You must insert and select the dictionaries you need (click the 'more' button; click the 'less' button to close).
The process generates a text automaton, '<text>.fsa', in the text directory.
A copy of it is also made (file '<text>-0.fsa').

--------------------
- apply graphs
--------------------
You need to define a list of graphs that will be applied in cascade to '<text>.fsa'. Each iteration j generates a new text automaton '<text>-j.fsa'. The final automaton is '<text>-final.fsa', and a copy of it is made in the file '<text>.fsa'.

'<text>.fsa' is actually the current text automaton to be processed.
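
To make the file naming concrete, the following Python sketch lists the automaton files one would expect in the text directory after 'apply dictionaries' and a cascade of graphs. The function name 'automaton_files' is hypothetical, and numbering the iterations from 1 is an assumption (it follows from '<text>-0.fsa' being the copy made after dictionary application).

```python
from typing import List


def automaton_files(text: str, n_graphs: int) -> List[str]:
    """List the automaton files expected after a cascade of n_graphs graphs.

    Assumption: iteration j runs from 1 to n_graphs.
    """
    names = [f"{text}-0.fsa"]  # copy made by 'apply dictionaries'
    # one automaton per cascade iteration
    names += [f"{text}-{j}.fsa" for j in range(1, n_graphs + 1)]
    names.append(f"{text}-final.fsa")  # final automaton of the cascade
    names.append(f"{text}.fsa")  # current automaton (copy of the final one)
    return names


# 'mytext' is a made-up text name used only for the example.
for name in automaton_files("mytext", 2):
    print(name)
```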

Important note: graphs should have outputs, as in the graph 'test.xgrf'.

-------------------
- locate pattern
-------------------
You need to select a graph to be applied and the type of result you want.

-------------------
- examples of data
-------------------
The data set included in the platform is composed of:
- 3 AFP texts (UTF-8)

- 2 dictionaries (the public French DELA and a very small sample of the French DELA)
FORMATS: dic.xml.gz (Outilex XML format, UTF-8), dic (Unitex format, UTF-16 LE) and idx (compressed format, required to apply a dictionary)

- 2 small grammars:
+ test.xgrf: a very modest chunker grammar, to be applied to the text automaton with the 'apply graphs' process
(it creates new transitions in the text automaton, labeled with chunks)
+ chunk.xgrf: a graph identifying sequences tagged CHUNK, to be applied with the 'locate pattern' process
FORMAT: xgrf (Outilex format, XML, UTF-8)
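
The two text-based dictionary formats listed above use different encodings, which matters when inspecting them outside the platform. The Python sketch below shows one way to read them using only the standard library. The function names are hypothetical, and nothing here parses the dictionary content itself: it only decodes gzip-compressed UTF-8 and UTF-16 LE text. Stripping a leading byte order mark from the Unitex file is an assumption about how such files are usually saved.

```python
import gzip


def read_outilex_xml_dictionary(path):
    """Return the content of a gzip-compressed UTF-8 XML dictionary (dic.xml.gz)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


def read_unitex_dictionary(path):
    """Return the lines of a UTF-16 LE dictionary (dic)."""
    with open(path, "r", encoding="utf-16-le") as handle:
        text = handle.read()
    # Assumption: the file may start with a byte order mark; drop it if present.
    return text.lstrip("\ufeff").splitlines()
```

Each function takes the path of a dictionary file in the matching format. The idx files are a compressed binary format and are not covered by this sketch.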


