Proofing Tool GUI
V3.0 - ??.???.2015
© 2013-2015 Marco A.G.Pinto and Community Contributors.
Freely distributable and modifiable under the
Apache License v2.0.


Index
1-Introduction
2-Copyright & DISCLAIMER
3-Contacts
4-Thanks
5-How it works
  5.1a-Using UTF-8
  5.1b-EOL Windows VS Linux
  5.1c-Packing the files into Extensions
  5.1d-Shortcut keys
  5.2-Dictionary
      5.2.1-Creating a Dictionary
      5.2.2-Editing a Dictionary
      5.2.3-How Suffixes/Prefixes work
      5.2.4-What is position and rule
  5.3-Thesaurus
      5.3.1-Creating a Thesaurus
      5.3.2-Editing a Thesaurus
  5.4-Hyphenation
      5.4.1-Creating a Hyphenation
      5.4.2-Editing a Hyphenation
6-History



1-Introduction
This program was originally developed to easily edit the synonyms of OpenOffice and LibreOffice.

Later, once it was able to edit dictionaries, I wanted to make it compatible with Firefox and Thunderbird.

I had this idea because I asked the people in charge of the pt_PT project, from Minho University in Portugal, what I should do to suggest synonyms, since only suggested words for the Portuguese speller were shown.

I was told that they didn't know how to add synonyms yet, since the guy in charge of that project had left it a long time ago (2006).

This is where my idea came from: develop something easy to use, since I had tried some official tools for the task and didn't understand anything about them, not even how to use them.

My tool is so intuitive that even a 6-year-old kid can use it.

My goal was that in the future someone would use it in Thunderbird and fix the en_GB speller, since it was full of typos and missing words. Since no one volunteered, I offered to take on this task myself.


On 25.Aug.2013 I released a "forked" V2.00. In January 2014 my version was officially implemented in Apache OpenOffice and the same happened with Mozilla in May 2014. So far, 14,000+ words have been added since I took on the project.



2-Copyright & DISCLAIMER
This program is copyrighted to Marco A.G.Pinto and Community Contributors.

It is freely distributable and modifiable under the Apache License v2.0.


3-Contacts
(coder)

S.Mail: Marco A.G.Pinto
Apartado 3083
2746-501 Queluz
(Portugal)
E.Mail: marcoagpinto@mail.telepac.pt


4-Thanks
Some special thanks go to:
Groups/Organisations:
 - Apache Community
 - LanguageTool Community
 - Mozilla Community
 - PureBasic Community

Persons:
 - Alberto Simões (Minho University)
 - Alexandro Colorado (Apache OpenOffice)
 - Andrea Pescetti (Apache OpenOffice)
 - Andrew Ferguson (PureBasic)
 - António Manuel Dias (former pt_PT maintainer)
 - Ashley Scott (PureBasic)
 - Bernd Krüger-Knauber (PureBasic)
 - Chris Saxon (PureBasic)
 - Daniel Naber (LanguageTool)
 - Filiep Spyckerelle (European Parliament)
 - Frédéric Laboureur (PureBasic)
 - Gervase Markham (Mozilla)
 - Guy Waterval (Apache OpenOffice)
 - Heinz Urban (PureBasic)
 - Ian Neal (Mozilla)
 - Jonathan Kew (Mozilla)
 - José Almeida (Minho University)
 - Kevin Scannell (Mozilla)
 - Martin Srebotnjak (LanguageTool)
 - Matthias Mailänder (LanguageTool)
 - Pedro Marques (IADE - Creative University)
 - Peter Chamberlin (Mozilla)
 - Ricardo Palomares Martínez (Apache OpenOffice)
 - Shantanu Oak (LibreOffice)
 - srod (PureBasic)
 - Stuart Swales (Apache OpenOffice)
 - Thomas Schulz (PureBasic)


5-How it works
5.1a-Using UTF-8
This tool was made to work with UTF-8 encoding.

A good trick to convert the old encoding formats to UTF-8 is to use, for example, the Notepad++ editor for Windows.

Simply open the files with it and change the encoding to UTF-8 using the menu: Encoding -> Convert to UTF-8 without BOM, so that accented characters display correctly.

Then use the Save As option, select "Normal text file (*.txt)", and it is done.

Please don't forget to edit the header of each file by hand, replacing the name of the old encoding with the new one.

The headers with the character encoding are inside the files. See, for example, Version 2.4 (01/09/2007) of the Italian files:
- The Dictionary (.DIC + .AFF):
The .DIC has no keyword.

The .AFF has the following keyword:
SET ISO8859-15 -> Replace with SET UTF-8

- The Thesaurus (.DAT):
It has in the first line:
ISO8859-15 -> Replace with UTF-8
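
For those who prefer a script over an editor, the conversion and the header change described above can be sketched in Python. This is only a minimal sketch, not part of the tool: the function names convert_aff and convert_dat are made up for this example, and it assumes the .AFF declares its encoding on a line starting with SET and the .DAT on its first line, as in the Italian files mentioned above.

```python
import re


def convert_aff(raw: bytes, old_encoding: str = "iso8859-15") -> bytes:
    """Decode an .AFF file from the old encoding and return UTF-8 bytes."""
    text = raw.decode(old_encoding)
    # Replace the first "SET <encoding>" header with "SET UTF-8".
    text = re.sub(r"^SET[ \t]+\S+", "SET UTF-8", text, count=1, flags=re.MULTILINE)
    return text.encode("utf-8")


def convert_dat(raw: bytes, old_encoding: str = "iso8859-15") -> bytes:
    """Decode a .DAT file and replace its first line (the encoding) with UTF-8."""
    text = raw.decode(old_encoding)
    first, newline, rest = text.partition("\n")
    # Keep a Windows-style "\r" if the original first line had one.
    carriage_return = "\r" if first.endswith("\r") else ""
    return ("UTF-8" + carriage_return + newline + rest).encode("utf-8")
```

The functions work on bytes, so reading the old file and writing the new one (in binary mode) is left to the caller. Always keep a backup of the original files.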


5.1b-EOL Windows VS Linux
I ran some tests saving files in Windows and in Linux, and the files saved in Windows are bigger than those saved in Linux.

I believe this happens because the end-of-line characters differ: Windows uses CR+LF (two bytes) while Linux uses LF (one byte).

I opened both files to compare them, and both have the same number of lines with the same words.

I believe this means that both work, unless someone finds otherwise.
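
The size difference can be illustrated with a small Python sketch. It is not taken from the tool; the sample words and the helper name save_as are made up for this example. It builds the same word list with both kinds of line endings and compares the results.

```python
def save_as(words, end_of_line):
    """Return the bytes a file would contain with the given end-of-line."""
    return "".join(word + end_of_line for word in words).encode("utf-8")


words = ["party/S", "boy/S", "church/S"]

linux_bytes = save_as(words, "\n")      # LF: one byte per line ending
windows_bytes = save_as(words, "\r\n")  # CR+LF: two bytes per line ending

# The Windows version is bigger by exactly one byte per line.
size_difference = len(windows_bytes) - len(linux_bytes)

# Yet both versions contain the same lines with the same words.
same_lines = windows_bytes.splitlines() == linux_bytes.splitlines()
```

Here size_difference equals the number of lines, and same_lines is True.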



5.1c-Packing the files into Extensions
To create extensions you will have to use another package, which I am not yet familiar with.

You should use the SORT button before you consider your Dictionary/Thesaurus/Hyphenation ready to be packed into an extension.

Making extensions for Mozilla seems easier than making them for OpenOffice/LibreOffice, whose extensions are more complex because they can hold multiple languages in one archive.


5.1d-Shortcut keys
TAB SWITCH RIGHT - CTRL+TAB
TAB SWITCH LEFT - SHIFT+CTRL+TAB
OPEN - CTRL+O
SAVE - CTRL+S
SAVE AS - SHIFT+CTRL+S
FIND - CTRL+F
ADD - CTRL+A
DELETE - DEL
EXIT A WINDOW & ABORT OPEN/SAVE/SAVE AS - <ESC>
QUIT - CTRL+Q



5.2-Dictionary
5.2.1-Creating a Dictionary
If you have a Dictionary in memory, use PURGE to delete all entries.

To create a Dictionary from zero you just have to press the button ADD to add words.

Use EDIT or double-click to change information regarding the words.

Use DELETE or <DEL> to remove entries.

The Dictionary consists of two UTF-8 files with the extensions .DIC and .AFF.

Even though the tool reads the .AFF file, I still haven't read the documentation about how it works. This means that creating a Dictionary from scratch will require some previous knowledge.

Remember to SAVE/SAVE AS now and then to play safe.


5.2.2-Editing a Dictionary
First download the extension of the language you intend to use from the official pages.

You should have an .OXT or .XPI file, which you rename to .ZIP in order to extract its contents to your hard disk.

Press OPEN and select the .DIC file of the Dictionary; my tool will also open the associated .AFF file.

Now just ADD/EDIT/DELETE the current entries.

Remember to SAVE/SAVE AS now and then to play safe.
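
As an alternative to renaming and extracting by hand, the extraction step can be sketched in Python. This is only a sketch under the assumption that the extension is an ordinary ZIP archive (which is why renaming it to .ZIP works); the function name extract_proofing_files is made up for this example and is not part of the tool.

```python
import zipfile
from pathlib import Path


def extract_proofing_files(archive_path, target_dir):
    """Extract only the .DIC, .AFF and .DAT files from an .OXT/.XPI archive.

    Returns the names of the extracted files as stored in the archive.
    """
    wanted = (".dic", ".aff", ".dat")
    extracted = []
    # The archive is opened as a ZIP file regardless of its extension.
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(wanted):
                archive.extract(name, target_dir)
                extracted.append(name)
    return extracted
```

After extraction, press OPEN in the tool and select the .DIC file from the target folder as usual.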


5.2.3-How Suffixes/Prefixes work
A short explanation of how suffixes/prefixes work, based on the e-mail written by Ricardo Palomares Martínez:

While editing dictionaries, you can add one or more identifiers after a word, following a "/". For example, the en_GB .AFF uses the identifier "S" to create the plural:
party/S

This will look in the .AFF file and find:
SFX S Y 9
SFX S y ies [^aeiou]y
SFX S 0 s [aeiou]y
SFX S 0 es [sxz]
SFX S 0 es [cs]h
SFX S 0 s [^cs]h
SFX S 0 s [ae]u
SFX S 0 x [ae]u
SFX S 0 s [^ae]u
SFX S 0 s [^hsuxyz]

SFX S Y 9
SFX -> It is a suffix (PFX would mean a prefix).
S   -> The suffix identifier.
Y   -> Y for YES. It means the rule can be cross-used with other prefixes and suffixes.
       If N the rule can't be applied together with other affixes the word might have.
9   -> The number of lines related to this rule.

SFX S y ies [^aeiou]y
SFX       -> It is a suffix (PFX would mean a prefix).
S         -> It is the suffix/prefix identifier.
y         -> For a suffix, the letter(s) to be removed from the end of the word.
             For a prefix, from the beginning of the word.
             A 0 means that nothing is removed.
ies       -> For a suffix, the letter(s) to be added at the end of the word.
             For a prefix, at the beginning of the word.

[^aeiou]y -> Condition in regexp notation. Here, the rule is applied to words ending with
             a "y" whose next-to-last letter is NOT a, e, i, o or u.
             Yes, the ^ means that the letters mustn't match.

So, party/S would produce: parties

And boy/S would produce: boys, triggering the following rule, whose 0 says that no letters are removed, only added. It applies to words ending with a "y". Since there is no ^, the second letter from the right must be a, e, i, o or u.
SFX S 0 s [aeiou]y
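
The way the "S" rules are applied can be sketched in Python. This is a simplified illustration, not the code of Hunspell or of this tool: the function names are made up, only suffixes are handled, and the condition is treated as an ordinary regular expression anchored at the end of the word, which is enough for the simple bracket conditions shown above.

```python
import re

AFF_LINES = """SFX S Y 9
SFX S y ies [^aeiou]y
SFX S 0 s [aeiou]y
SFX S 0 es [sxz]
SFX S 0 es [cs]h
SFX S 0 s [^cs]h
SFX S 0 s [ae]u
SFX S 0 x [ae]u
SFX S 0 s [^ae]u
SFX S 0 s [^hsuxyz]""".splitlines()


def parse_suffix_rules(lines, identifier):
    """Collect (remove, add, condition) for every rule line of one identifier."""
    rules = []
    for line in lines:
        parts = line.split()
        # Rule lines have 5 fields; the 4-field header (SFX S Y 9) is skipped.
        if len(parts) >= 5 and parts[0] == "SFX" and parts[1] == identifier:
            remove = "" if parts[2] == "0" else parts[2]
            add = "" if parts[3] == "0" else parts[3]
            rules.append((remove, add, re.compile(parts[4] + "$")))
    return rules


def apply_suffix(word, rules):
    """Return every derived word produced by the rules whose condition matches."""
    derived = []
    for remove, add, condition in rules:
        if condition.search(word) and word.endswith(remove):
            stem = word[: len(word) - len(remove)]
            derived.append(stem + add)
    return derived


RULES_S = parse_suffix_rules(AFF_LINES, "S")
```

With these rules, apply_suffix("party", RULES_S) gives ["parties"] and apply_suffix("boy", RULES_S) gives ["boys"], matching the explanation above.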

Also notice that if a word has capitalised letters in the .DIC, Hunspell in the host application will only accept it with exactly the same capitalisation (it flags a typo if different).


5.2.4-What is position and rule
The derived words listicongadget has two fields: "Position" and "Rule".

"Position" is the character position of the first line (header) of each rule used. For example:
SFX S Y 9
(It is a Suffix with identifier "S", "Yes" and "9" rules in it)

Then, inside the dictionary editor, you have a column with the rule number after the header. Double-clicking on a listicongadget line jumps to the header; from there you just have to scroll a few lines down to the rule number.

Please notice that the editor gadget in the add/edit word window holds a "clean" version of the .AFF with repeated spaces removed, so that finding the codes is faster (fewer characters to process).
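
The idea of a "clean" .AFF and of a header position can be sketched in Python. This is not the tool's own code (the tool is compiled with PureBasic): the function names are made up for this example, and counting the position from 0 is an assumption, since the manual does not say where the tool starts counting.

```python
import re


def clean_aff(aff_text):
    """Collapse runs of spaces/tabs into one space and trim each line."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in aff_text.splitlines())
    return "\n".join(lines)


def header_position(clean_text, kind, identifier):
    """Return the character position of a header such as 'SFX S Y 9', or -1."""
    pattern = rf"^{re.escape(kind)} {re.escape(identifier)} [YN] \d+$"
    match = re.search(pattern, clean_text, flags=re.MULTILINE)
    return match.start() if match else -1


sample = "SET UTF-8\nSFX  S   Y   9\nSFX S   y  ies  [^aeiou]y\n"
cleaned = clean_aff(sample)
position = header_position(cleaned, "SFX", "S")
```

Because the repeated spaces are gone, the header can be matched with a single space between its fields, and there are fewer characters to scan.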


5.3-Thesaurus
5.3.1-Creating a Thesaurus
If you have a Thesaurus in memory, use PURGE to delete all entries.

To create a Thesaurus from zero you just have to press the button ADD to add synonyms.

Use EDIT or double-click to change information regarding the synonyms.

Use DELETE or <DEL> to remove entries.

The Thesaurus is a single UTF-8 file with the extension .DAT.

Remember to SAVE/SAVE AS now and then to play safe.


5.3.2-Editing a Thesaurus
First download the extension of the language you intend to use, from the official pages.

You should have an .OXT file, which you rename to .ZIP in order to extract its contents to your hard disk.

Press OPEN and select the .DAT file of the Thesaurus.

Now just ADD/EDIT/DELETE the current entries.

Remember to SAVE/SAVE AS now and then to play safe.



5.4-Hyphenation
5.4.1-Creating a Hyphenation
Not working yet.


5.4.2-Editing a Hyphenation
Not working yet.


6-History

V3.0 - ??.???.2015
Compiled with PureBasic 5.30.

o The manual has been improved a lot.