This file is outdated.


Here are the proposals for the new API structures. These are about the only
things not yet defined, and their acceptance will allow me to continue
development of the API.

* Block structure
Layouts will be supported from now on, but it's a very delicate issue. First of
all, not everything in life is rectangular, so ideally we should support any
kind of shape; most magazines insert images in their pages which are surrounded
by text following its shape. Supporting this, however, is difficult for several
reasons.
How would the shape be defined? It could be done using a series of points
connected by straight lines (piecewise linear). Using more complex forms, such
as Bézier curves, is not in my opinion a good solution, since our use of
quantization is absolute.
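
To make the piecewise-linear idea concrete, here is a minimal sketch of how
such a shape could be stored and queried. The names struct point, struct shape
and shape_contains are hypothetical and not part of the current API. It assumes
integer pixel coordinates and a closed outline (the last vertex connects back
to the first), and uses the usual even-odd ray-casting test with integer
arithmetic only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: nothing here exists in the current API. */
struct point {
    int x, y;                  /* pixel coordinates */
};

struct shape {
    size_t n;                  /* number of vertices, at least 3 */
    const struct point *v;     /* vertices in order; last joins first */
};

/* Return 1 if pixel (px, py) lies inside the closed outline, 0 otherwise.
 * Even-odd rule: count the edges crossed by a horizontal ray that starts
 * at the pixel and goes right. Only integer arithmetic is used. */
int shape_contains(const struct shape *s, int px, int py)
{
    size_t i, j;
    int inside = 0;

    assert(s != NULL && s->v != NULL && s->n >= 3);
    for (i = 0, j = s->n - 1; i < s->n; j = i++) {
        const struct point *a = &s->v[i];
        const struct point *b = &s->v[j];

        /* Does the edge a-b straddle the horizontal line y = py? */
        if ((a->y > py) != (b->y > py)) {
            /* Compare px with the x of the crossing point; the test is
             * cross-multiplied by dy so that no division is needed. */
            long dy  = (long)b->y - a->y;
            long lhs = ((long)px - a->x) * dy;
            long rhs = ((long)b->x - a->x) * ((long)py - a->y);

            if (dy > 0 ? lhs < rhs : lhs > rhs)
                inside = !inside;
        }
    }
    return inside;
}
```

A blockFinder module could fill in the vertex list once per block, and later
stages could call shape_contains() to decide whether a pixel belongs to it.
Pixels lying exactly on the outline may fall on either side; a real
implementation would have to pick a convention.
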
Another problem is how to classify the block contents. It's easy to check
whether it's an image, but it's much harder to tell whether it's a mathematical
expression or something else.
Running several different blockFinder modules may also prove a problem, since
they may conflict, finding blocks that already exist or that overlap.
This could be solved by marking pixels that are already part of blocks.
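
One way the pixel marking could work is sketched below. Everything in it is an
assumption for illustration: struct blockmask and the blockmask_* functions are
hypothetical names, the mask spends one byte per pixel (a bitmap would be
smaller but could not say which block owns a pixel), and blocks are claimed as
rectangles only to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: one byte per pixel naming the owning block (0 = free). */
struct blockmask {
    int width, height;
    unsigned char *owner;      /* width * height entries, row-major */
};

/* Allocate an all-free mask. Returns 0 on success, -1 if out of memory. */
int blockmask_init(struct blockmask *m, int width, int height)
{
    assert(m != NULL && width > 0 && height > 0);
    m->owner = calloc((size_t)width * (size_t)height, 1);
    if (m->owner == NULL)
        return -1;
    m->width = width;
    m->height = height;
    return 0;
}

void blockmask_free(struct blockmask *m)
{
    free(m->owner);
    m->owner = NULL;
}

/* Return the id of the block owning pixel (x, y), or 0 if it is free. */
unsigned char blockmask_owner(const struct blockmask *m, int x, int y)
{
    assert(x >= 0 && x < m->width && y >= 0 && y < m->height);
    return m->owner[(size_t)y * m->width + x];
}

/* Try to claim the rectangle [x0,x1) x [y0,y1) for block id (1..255).
 * Returns 0 and marks the pixels on success. Returns -1 and changes
 * nothing if any pixel in the rectangle already belongs to a block. */
int blockmask_claim(struct blockmask *m, int x0, int y0, int x1, int y1,
                    unsigned char id)
{
    int x, y;

    assert(m != NULL && m->owner != NULL && id != 0);
    assert(0 <= x0 && x0 < x1 && x1 <= m->width);
    assert(0 <= y0 && y0 < y1 && y1 <= m->height);

    /* First pass: refuse if any pixel is already owned. */
    for (y = y0; y < y1; y++)
        for (x = x0; x < x1; x++)
            if (m->owner[(size_t)y * m->width + x] != 0)
                return -1;

    /* Second pass: mark every pixel as belonging to this block. */
    for (y = y0; y < y1; y++)
        for (x = x0; x < x1; x++)
            m->owner[(size_t)y * m->width + x] = id;
    return 0;
}
```

With this, a second blockFinder module that proposes an overlapping block gets
-1 back and can shrink, split or drop its candidate instead of silently
duplicating an existing block.
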

* Box structure/charFinder module
Once blocks are found, they are processed using charFinder. charFinder is
supposed to take advantage of information about the block, for example to skip
images immediately. It should also mark the characters it frames with
information about the font, such as bold, italic, serif, or whatever else can
be found. Problems:

- how to save this kind of information? It cannot be hardcoded as fields in the
structure, because new modules wouldn't be able to save information that is
important to them but that we haven't thought of. It would also waste memory
unnecessarily.
- what about overlapping characters? This is currently dealt with in a rather
informal way, which can't be carried over to the API as is. We could save
characters individually in their structures (struct box), but that would almost
double the memory used. It's not unreasonable, since images are likely to be
around 2000x3000, taking 6 MB if we use one char per pixel in the image
structure. If we use 3 bytes (24-bit color), however, that will be 18 MB, and
it won't be possible to store two copies in memory, since 36 MB is too heavy.
A solution is to use a nonlinear approach: for each boxed character, save the
pixmap in the struct box, and immediately process it using the charRecognizer
modules. If the pixmap is recognized, then free it; otherwise, keep it. But it's
a bit more difficult to do, and this nonlinearity may prove problematic later
on. Any thoughts?
- any improvements to the struct box?
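
As a starting point for discussion, the sketch below combines two of the ideas
above: an open-ended attribute list instead of hardcoded fields, and a
per-box pixmap that is freed as soon as the character is recognized. It is only
a sketch under assumptions. The layout of struct box shown here is
hypothetical and is not the existing one, and struct box_attr, box_set_attr,
box_get_attr, box_set_result and box_clear are invented names. Attributes are
stored as strings for simplicity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: a key/value note that any module may attach to a box. */
struct box_attr {
    char *key;                 /* e.g. "bold", "italic", "serif" */
    char *value;
    struct box_attr *next;
};

/* Hypothetical layout, for illustration only. */
struct box {
    int x0, y0, x1, y1;        /* frame in the source image */
    unsigned char *pixmap;     /* private copy of the pixels, or NULL */
    int recognized;            /* character code, 0 if not recognized */
    struct box_attr *attrs;    /* list of attributes, NULL if none */
};

static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Set or replace an attribute. Returns 0 on success, -1 if out of memory. */
int box_set_attr(struct box *b, const char *key, const char *value)
{
    struct box_attr *a;

    assert(b != NULL && key != NULL && value != NULL);
    for (a = b->attrs; a != NULL; a = a->next) {
        if (strcmp(a->key, key) == 0) {
            char *v = dup_string(value);
            if (v == NULL)
                return -1;
            free(a->value);
            a->value = v;
            return 0;
        }
    }
    a = malloc(sizeof *a);
    if (a == NULL)
        return -1;
    a->key = dup_string(key);
    a->value = dup_string(value);
    if (a->key == NULL || a->value == NULL) {
        free(a->key);
        free(a->value);
        free(a);
        return -1;
    }
    a->next = b->attrs;        /* push on the front of the list */
    b->attrs = a;
    return 0;
}

/* Return the value stored under key, or NULL if the key is not set. */
const char *box_get_attr(const struct box *b, const char *key)
{
    const struct box_attr *a;

    for (a = b->attrs; a != NULL; a = a->next)
        if (strcmp(a->key, key) == 0)
            return a->value;
    return NULL;
}

/* Store a recognizer result. A non-zero code means the character was
 * recognized, so the pixmap is no longer needed and is freed. */
void box_set_result(struct box *b, int code)
{
    b->recognized = code;
    if (code != 0) {
        free(b->pixmap);
        b->pixmap = NULL;
    }
}

/* Release everything owned by the box (not the box itself). */
void box_clear(struct box *b)
{
    while (b->attrs != NULL) {
        struct box_attr *a = b->attrs;
        b->attrs = a->next;
        free(a->key);
        free(a->value);
        free(a);
    }
    free(b->pixmap);
    b->pixmap = NULL;
}
```

A box with no attributes costs one NULL pointer, so modules that record nothing
waste almost no memory, and a new module can add its own keys without any
change to the structure. Only unrecognized characters keep their pixmap, which
is the point of the nonlinear approach.
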
