
Proofing Tool GUI
V3.0 - ??.???.2016
© 2013-2016 Marco A.G.Pinto and Community Contributors.
Freely distributable and modifiable under the
Apache License
v2.0.
OUTDATED/UNFINISHED MANUAL - REQUIRES A FULL REVISION
WHEN I HAVE THE TIME
Index
1-Introduction
2-Copyright & DISCLAIMER
3-Contacts
4-Thanks
5-How it works
5.1a-Using UTF-8
5.1b-EOL Windows VS Linux
5.1c-Packing the files into Extensions
5.1d-Shortcut keys
5.2-Dictionary
5.2.1-Creating a Dictionary
5.2.2-Editing a Dictionary
5.2.3-How Suffixes/Prefixes work
5.2.4-What is position and rule
5.2.5-Menus
5.3-Thesaurus
5.3.1-Creating a Thesaurus
5.3.2-Editing a Thesaurus
5.3.3-Menus
5.4-Hyphenation
5.4.1-Creating a Hyphenation
5.4.2-Editing a Hyphenation
5.4.3-Menus
5.5-Autocorrect
5.5.1-Creating an Autocorrect
5.5.2-Editing an Autocorrect
5.5.3-Menus
6-History
1-Introduction
This program was originally developed to make it easy to edit the synonyms of
OpenOffice and LibreOffice.
Later, once it was also possible to edit dictionaries, I wanted to make it
compatible with Firefox and Thunderbird.
I had this idea because I asked the people in charge of the pt_PT project, from
Minho University in Portugal, what I should do to suggest synonyms, since only
suggested words for the Portuguese speller were shown.
I was told that they didn't know how to add synonyms yet, since the guy in charge
of that project had left it a long time ago (2006).
This is where my idea came from: develop something easy to use, since I had tried
some official tools for the task and I didn't understand anything about them, not
even how to use them.
My tool is so intuitive that even a 6-year-old kid can
use it.
My goal was that in the future someone would use it to fix the en_GB speller used
in Thunderbird, since it was full of typos and missing words. Since no one
volunteered, I took on the task myself.
On 25.Aug.2013 I released a "forked" V2.00. In January 2014 my version was
officially adopted by Apache OpenOffice and the same happened with Mozilla
in May 2014. So far, 14'000+ words have been added since I embraced the project.
2-Copyright & DISCLAIMER
This program is copyrighted to Marco A.G.Pinto and
Community Contributors.
It is freely distributable and modifiable under the
Apache License
v2.0.
3-Contacts
(coder)
S.Mail: Marco A.G.Pinto
        Apartado 3083
        2746-501 Queluz
        (Portugal)
E.Mail: marcoagpinto@mail.telepac.pt
4-Thanks
Some special thanks go to:
Groups/Organisations:
- Apache Community
- LanguageTool Community
- LibreOffice Community
- Mozilla Community
- PureBasic Community
Persons:
- Alberto Simões (Minho University)
- Alexandro Colorado (Apache OpenOffice)
- Andrea Pescetti (Apache OpenOffice)
- Andreas Mantke (LibreOffice)
- Andrew Ferguson (PureBasic)
- António Manuel Dias (former pt_PT maintainer)
- Ashley Scott (PureBasic)
- Bernd Krüger-Knauber (PureBasic)
- Chris Saxon (PureBasic)
- Daniel Naber (LanguageTool)
- Filiep Spyckerelle (European Parliament)
- Frédéric Laboureur (PureBasic)
- Gervase Markham (Mozilla)
- Guy Waterval (Apache OpenOffice)
- Heinz Urban (PureBasic)
- Ian Neal (Mozilla)
- Jonathan Kew (Mozilla)
- José Almeida (Minho University)
- Kevin Scannell (Mozilla)
- Martin Srebotnjak (LanguageTool)
- Matthias Mailänder (LanguageTool)
- Pedro Marques (IADE - Creative University)
- Peter Chamberlin (Mozilla)
- Ricardo Palomares Martínez (Apache OpenOffice)
- Shantanu Oak (LibreOffice)
- srod (PureBasic)
- Stuart Swales (Apache OpenOffice)
- Thomas Schulz (PureBasic)
5-How it works
5.1a-Using UTF-8
This tool was made to work with UTF-8 encoding.
A good trick to convert files from older encodings to UTF-8 is to use an editor
such as Notepad++ for Windows.
Simply open the files with it and change the encoding using the menu:
Encoding -> Convert to UTF-8 without BOM, so that accented characters appear
correctly.
Then, use the Save As option, select "Normal text file (*.txt)" and it is done.
Please don't forget to edit by hand the header of each file, replacing the name
of the old encoding with the new one.
The headers with the character encoding are inside the files. See, for example,
Version 2.4 (01/09/2007) of the Italian files:
- The Dictionary (.DIC + .AFF):
The .DIC has no keyword.
The .AFF has the following keyword:
SET ISO8859-15 -> Replace with
SET UTF-8
- The Thesaurus (.DAT):
It has in the first line:
ISO8859-15 -> Replace with
UTF-8
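The manual steps above can also be scripted. The following Python sketch is only an illustration and is not part of PTG: the function name convert_to_utf8 and its parameters are made up for this example, and it assumes the old encoding is ISO8859-15 as in the Italian files. It decodes the bytes, replaces the header keyword and encodes the result as UTF-8 without a BOM.

```python
def convert_to_utf8(raw, old_encoding="iso8859-15", old_label="ISO8859-15"):
    """Return the file contents re-encoded as UTF-8 with the header updated.

    Hypothetical helper: 'raw' is the whole .AFF or .DAT file as bytes.
    """
    text = raw.decode(old_encoding)
    lines = text.split("\n")
    for i, line in enumerate(lines):
        body = line.rstrip("\r")      # keep a Windows "\r" if there is one
        eol = line[len(body):]
        if body.split() == ["SET", old_label]:
            # .AFF files: "SET ISO8859-15" -> "SET UTF-8"
            lines[i] = "SET UTF-8" + eol
            break
        if i == 0 and body.strip() == old_label:
            # .DAT files: the first line holds only the encoding name
            lines[i] = "UTF-8" + eol
            break
    # Python's "utf-8" codec does not write a BOM
    return "\n".join(lines).encode("utf-8")


# Example with in-memory data instead of real files
old_aff = "SET ISO8859-15\nTRY aeiò\n".encode("iso8859-15")
new_aff = convert_to_utf8(old_aff)
```

To convert a real file, read it with open(path, "rb") and write the returned bytes with open(path, "wb"), keeping a backup of the original first.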
5.1b-EOL Windows VS Linux
I have done some tests saving in Windows and Linux, and the files saved in
Windows are bigger than the ones saved in Linux.
I believe this happens because the End of Line characters are different:
Windows ends each line with two characters (CR+LF) while Linux uses only one (LF).
I have compared both files and they have the same number of lines with the
same words.
I believe this means that they both work, unless someone finds otherwise.
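The size difference can be checked with a few lines of Python. This is a small sketch using a made-up word list, not output from PTG: it builds the same content with Linux and Windows line endings and compares the results.

```python
# Hypothetical sample entries, only for illustration
words = ["party/S", "boy/S", "apple"]

linux_bytes = "".join(w + "\n" for w in words).encode("utf-8")
windows_bytes = "".join(w + "\r\n" for w in words).encode("utf-8")

# The Windows version has one extra byte (CR) per line
size_difference = len(windows_bytes) - len(linux_bytes)

# splitlines() treats "\n" and "\r\n" alike, so the entries are identical
same_content = windows_bytes.splitlines() == linux_bytes.splitlines()

print(size_difference, same_content)
```

The extra bytes equal the number of lines, which matches the observation that both files hold the same words.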
5.1c-Packing the files into Extensions
To create extensions you will have to use another package, which I am not
familiar with yet.
You should use the SORT button before you consider your
Dictionary/Thesaurus/Hyphenation ready to be packed into an extension.
Making extensions for Mozilla seems easier than making them for
OpenOffice/LibreOffice, which are more complex because they can have multiple
languages in one archive.
5.1d-Shortcut keys
TAB SWITCH RIGHT - CTR+TAB
TAB SWITCH LEFT - SHIFT+CTR+TAB
OPEN - CTR+O
SAVE - CTR+S
SAVE AS - SHIFT+CTR+S
FIND - CTR+F
ADD - CTR+A
GOTO - CTR+G
DELETE - DEL
EXIT A WINDOW & ABORT OPEN/SAVE/SAVE AS - <ESC>
QUIT - CTR+Q
5.2-Dictionary
5.2.1-Creating a Dictionary
If you have a Dictionary in memory, use PURGE to delete all entries.
To create a Dictionary from scratch you just have to press the ADD button to add
words.
Use EDIT or double-click to change information regarding the words.
Use DELETE or <DEL> to remove entries.
A Dictionary consists of two UTF-8 files with the extensions .DIC and .AFF.
Even though the tool reads the .AFF file, I still haven't read the
documentation about how it works. This means that creating a Dictionary from
scratch will require some previous knowledge.
Now and then, remember to SAVE/SAVE AS to play safe.
5.2.2-Editing a Dictionary
First download the extension for the language you intend to use from the
official pages.
You should have an .OXT or .XPI file, which you rename to .ZIP in order to
extract its contents to your hard disk.
Press OPEN and select the .DIC file of the Dictionary; my tool will also open
the associated .AFF file.
Now just ADD/EDIT/DELETE the current entries.
Now and then, remember to SAVE/SAVE AS to play safe.
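Since an .OXT or .XPI file is a ZIP archive, the extraction step can also be done without renaming it. The Python sketch below is only an illustration: extract_extension is a made-up helper name and the file names in the usage comment are placeholders. It extracts the archive with the standard zipfile module and lists the dictionary files found inside.

```python
import zipfile


def extract_extension(archive_path, target_dir):
    """Extract an .OXT/.XPI archive and return its .DIC/.AFF member names.

    Hypothetical helper; 'archive_path' may be a path or a file-like object.
    """
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith((".dic", ".aff"))
        )


# Usage (placeholder names):
#   extract_extension("my-dictionary.oxt", "extracted")
```

The returned names show where the .DIC file to select with OPEN ended up inside the target folder.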
5.2.3-How Suffixes/Prefixes work
A small explanation of how suffixes/prefixes work, based on an e-mail
written by Ricardo Palomares Martínez:
While editing dictionaries, you can add one or more identifiers after a word,
separated from it by a "/". For example, the en_GB .AFF uses the
identifier "S" to create plurals:
party/S
The identifier "S" is then looked up in the .AFF file, which contains:
SFX S Y 9
SFX S y ies [^aeiou]y
SFX S 0 s [aeiou]y
SFX S 0 es [sxz]
SFX S 0 es [cs]h
SFX S 0 s [^cs]h
SFX S 0 s [ae]u
SFX S 0 x [ae]u
SFX S 0 s [^ae]u
SFX S 0 s [^hsuxyz]
SFX S Y 9
SFX -> It is a suffix (PFX would mean a prefix).
S -> The suffix identifier.
Y -> Y for YES. It means the rule can be cross-used with other prefixes and
suffixes. If N, the rule can't be applied together with other affixes the word
might have.
9 -> The number of lines related to this rule.
SFX S y ies [^aeiou]y
SFX -> It is a suffix (PFX would mean a prefix).
S -> It is the suffix/prefix identifier.
y -> For a suffix, it is the letter(s) to be removed from the end of the word.
For a prefix, from the beginning of the word.
ies -> For a suffix, it is the letter(s) to be added at the end of the word.
For a prefix, at the beginning of the word.
[^aeiou]y -> Condition in regexp notation. Here, the rule is applied to words
ending with a "y" whose second-to-last letter is NOT a, e, i, o or u.
Yes, the ^ means that the letters mustn't match.
So, party/S would produce:
parties
And boy/S would produce:
boys
This is triggered by the following rule, in which the 0 says that no letters
are removed, just added. It applies to words ending with a "y"; since there is
no ^, the second-to-last letter must be a, e, i, o or u.
SFX S 0 s [aeiou]y
Also notice that if words have capitalised letters, the Hunspell in the software
being used will only accept them with capitalised letters exactly like in the
.DIC (it reports a typo if different).
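The way the nine "S" rules are selected can be sketched in Python. This is a simplified illustration and not how Hunspell or PTG is implemented: the names RULES_S and apply_suffix_rules are made up, only suffixes are handled, and the condition is checked with Python's re module, which is an approximation of Hunspell's own condition syntax.

```python
import re

# The nine "S" rules quoted above, as (strip, add, condition)
RULES_S = [
    ("y", "ies", "[^aeiou]y"),
    ("0", "s", "[aeiou]y"),
    ("0", "es", "[sxz]"),
    ("0", "es", "[cs]h"),
    ("0", "s", "[^cs]h"),
    ("0", "s", "[ae]u"),
    ("0", "x", "[ae]u"),
    ("0", "s", "[^ae]u"),
    ("0", "s", "[^hsuxyz]"),
]


def apply_suffix_rules(word, rules):
    """Return every form produced by the rules whose condition matches."""
    forms = []
    for strip, add, condition in rules:
        # The condition must match the end of the unmodified word
        if not re.search("(?:" + condition + ")$", word):
            continue
        stem = word
        if strip != "0":              # "0" means nothing is removed
            if not word.endswith(strip):
                continue
            stem = word[:-len(strip)]
        forms.append(stem + add)
    return forms


print(apply_suffix_rules("party", RULES_S))   # ['parties']
print(apply_suffix_rules("boy", RULES_S))     # ['boys']
```

A word can match more than one line: with these rules "bureau" matches both [ae]u lines and yields "bureaus" and "bureaux".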
5.2.4-What is position and rule
The derived words ListIconGadget has the fields "Position" and "Rule".
"Position" is the character position of the first line (header) of each rule
used. For example:
SFX S Y 9
(It is a Suffix with identifier "S", "Yes" and "9" rules in it)
"Rule" is a column in the dictionary editor with the rule number counted from
the header. Double-clicking a ListIconGadget line will jump to the header; then
you just have to scroll a few lines down to the rule number.
Please notice that the editor gadget in the add/edit word window has a
"clean" version of the .AFF with repeated spaces removed, so that finding the
codes is faster (fewer characters to process).
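The idea of a "clean" .AFF and a header position can be sketched as follows. This Python example is an assumption about the general approach, not PTG's actual code: clean_aff and header_position are made-up names, and the position returned here is counted from 0, which may differ from what PTG displays.

```python
import re


def clean_aff(text):
    """Collapse runs of spaces/tabs to one space on every line."""
    return "\n".join(
        re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()
    )


def header_position(clean_text, kind, flag):
    """Return the character offset of the header 'kind flag Y|N count', or -1."""
    pattern = rf"^{kind} {re.escape(flag)} [YN] \d+$"
    match = re.search(pattern, clean_text, flags=re.MULTILINE)
    return match.start() if match else -1


# Hypothetical .AFF fragment with irregular spacing
aff = "SET UTF-8\nSFX  S   Y 2\nSFX S  y  ies [^aeiou]y\nSFX S 0 s [aeiou]y\n"
clean = clean_aff(aff)
position = header_position(clean, "SFX", "S")
```

From that position, rule number N is simply the Nth line after the header.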
5.3-Thesaurus
5.3.1-Creating a Thesaurus
If you have a Thesaurus in memory, use PURGE to delete all entries.
To create a Thesaurus from scratch you just have to press the ADD button to add
synonyms.
Use EDIT or double-click to change information regarding the synonyms.
Use DELETE or <DEL> to remove entries.
A Thesaurus is a single UTF-8 file with the extension .DAT.
Now and then, remember to SAVE/SAVE AS to play safe.
5.3.2-Editing a Thesaurus
First download the extension for the language you intend to use from the
official pages.
You should have an .OXT file, which you rename to .ZIP in order to extract its
contents to your hard disk.
Press OPEN and select the .DAT file of the Thesaurus.
Now just ADD/EDIT/DELETE the current entries.
Now and then, remember to SAVE/SAVE AS to play safe.
In build 82 (14.Aug.2015) I improved the Thesaurus part. It is now possible to
use DEL to delete synonyms, and I added a "Thesaurus Tools" menu. Its most
important option is "Combine", which combines all meanings but only works with
simple lines:
x|2
a
b
would generate:
a|2
x
b
and:
b|2
a
x
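The "Combine" example above can be reproduced with a short Python sketch. It only mirrors the example shown and is not PTG's implementation: the names combine and format_entry are made up. Each synonym becomes a new entry in which the original word takes that synonym's place.

```python
def combine(headword, synonyms):
    """Build one new entry per synonym, swapping it with the headword."""
    combined = {}
    for index, synonym in enumerate(synonyms):
        lines = list(synonyms)
        lines[index] = headword       # the headword replaces the synonym
        combined[synonym] = lines
    return combined


def format_entry(headword, lines):
    """Format an entry as 'word|count' followed by one line per meaning."""
    return "\n".join([f"{headword}|{len(lines)}"] + lines)


for word, lines in combine("x", ["a", "b"]).items():
    print(format_entry(word, lines))
```

For the entry x|2 with the lines a and b, this prints the a|2 and b|2 entries listed above.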
In build 76 (7.Jul.2015) I sped up the decoding of the .AFF (first-time decode)
and PTG now creates .idx files for the Thesaurus.
Please notice that I had been planning the .idx for around a year, but on that
day I received an e-mail asking about it and decided to give it a try. I coded
it in around 10 minutes but tested it only with the pt_PT Thesaurus. If you
find any issues with the .idx please let me know.
5.3.3-Menus
5.3.x-Unduplicate simple meanings
What is the definition of a "duplicate" meaning?
It means, for example:
apple|3
one
two
one
The repeated "one" would be removed, and the entry would become:
apple|2
one
two
It checks line by line and not column by column:
apple|1
-|one|two|one
This wouldn't change the meanings.
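The line-by-line check can be sketched in Python. This is an illustration of the behaviour described above, not PTG's code: unduplicate is a made-up name, and an entry is represented as a list with the header line first.

```python
def unduplicate(entry_lines):
    """Drop repeated meaning lines and update the count in the header."""
    headword = entry_lines[0].split("|")[0]
    seen = set()
    kept = []
    for line in entry_lines[1:]:
        if line not in seen:          # whole lines are compared, not columns
            seen.add(line)
            kept.append(line)
    return [f"{headword}|{len(kept)}"] + kept


print(unduplicate(["apple|3", "one", "two", "one"]))
print(unduplicate(["apple|1", "-|one|two|one"]))
```

The first call returns the apple|2 entry, while the second entry is returned unchanged because its single line is never repeated.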
5.4-Hyphenation
5.4.1-Creating a Hyphenation
Not working yet.
5.4.2-Editing a Hyphenation
Not working yet.
5.5-Autocorrect
5.5.1-Creating an Autocorrect
5.5.2-Editing an Autocorrect
6-History
V3.0 - ??.???.2015
Compiled with PureBasic 5.XX.
o The manual has been improved a lot.
UNDER WORK!
In builds 83-86 (17.Nov.2015):
- Compiled with PB 5.40;
- The window now has a size of 800x480 instead of 640x480;
- Replaced the "Purge" button with "Erase";
- Added a "Goto" button;
- Cleaned the code a bit;
- Sped up some operations;
- Better UTF-8 warnings;
- In the words editor replaced the ListIconGadget field "Position" with
"Code Position".