HPDF 1.1 : Introducing typesetting

I have finally released HPDF 1.1 with some typesetting features. More details are in this post. It is very experimental but working. I am not happy at all with the API of HPDF, but I have no choice: I need to add features as fast as possible for another project. I'll think about the elegance of the API later, even if that means changing lots of things in HPDF (not really a good development methodology, I agree). In addition to the typesetting features, I fixed lots of problems, optimized the code and changed the image API a little.

1. Typesetting a paragraph

Typesetting is a complex thing and the current implementation is very limited. You'll be able to use it to generate slides but not a book. There is no support for multiple pages: they have to be created manually, and the typesetting code assumes that the whole output is on the same page.

I have focused on the line breaking algorithm and on styles. Here is an example:

This example was created thanks to a ParagraphStyle.

Here is a part of the ParagraphStyle class (the full definition is in the Haddock documentation):

class ParagraphStyle a where
    lineWidth :: a -> PDFFloat -> Int -> PDFFloat
    linePosition :: a -> PDFFloat -> Int -> PDFFloat
    paraChange :: a -> [Letter] -> (a,[Letter])
    paragraphStyle :: a -> Maybe (Rectangle -> Draw b -> Draw ())

Let's see how this interface can be used to create the above example. When a paragraph monad is run, the text is transformed into a sequence of Letters. Here is a part of the Letter type:

data Letter  = Letter BoxDimension !AnyBox !(Maybe AnyStyle) 
             | Glue !PDFFloat !PDFFloat !PDFFloat !(Maybe AnyStyle)
             | AChar !AnyStyle !Char !PDFFloat

In this definition, a Letter is in fact any object that can be displayed. Perhaps I should have named it a generalized letter.

A sequence of letters is processed by the paraChange function. In the style used for the previous example, this function removes the first letter of the paragraph (an AChar) and replaces it with a generalized Letter containing a bigger, colored version of that letter. The size of this new letter is remembered in a new version of the style. That's why paraChange returns a new style in addition to a new sequence of letters.
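To make the idea concrete, here is a minimal sketch of such a transformation. It does not use HPDF's real types: the Letter type below is a simplified stand-in, and DropCapPara, BigLetter, scaleFactor and dropCapChange are hypothetical names chosen only for this illustration.

```haskell
-- Stand-in for HPDF's PDFFloat (assumed here to be a Double).
type PDFFloat = Double

-- Simplified stand-in for the Letter type: no styles, no real boxes.
data Letter
  = AChar Char PDFFloat      -- a character and its width
  | Glue PDFFloat            -- a space, natural width only
  | BigLetter Char PDFFloat  -- hypothetical box: enlarged letter and its drawn width
  deriving (Eq, Show)

-- Hypothetical paragraph style remembering the width of the big letter.
newtype DropCapPara = DropCapPara PDFFloat
  deriving (Eq, Show)

-- Hypothetical enlargement factor for the first letter.
scaleFactor :: PDFFloat
scaleFactor = 3

-- Same shape as paraChange: returns an updated style and updated letters.
-- If the paragraph starts with a character, swap it for a big letter and
-- record its width in the style; otherwise leave everything unchanged.
dropCapChange :: DropCapPara -> [Letter] -> (DropCapPara, [Letter])
dropCapChange _ (AChar c w : rest) =
  let bigW = w * scaleFactor
  in (DropCapPara bigW, BigLetter c bigW : rest)
dropCapChange style letters = (style, letters)
```

The returned style is what carries information from this step to the line breaking step.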

Then, when the line breaking algorithm is called, it uses the lineWidth and linePosition functions to determine the shape of the paragraph. That shape depends on the size of the letter recorded in the previous step.
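As a rough sketch of what such shape functions could look like for a drop cap, assume the second argument is the full width of the text area and the third is a 1-based line number (both are assumptions here, see the Haddock documentation for the exact meaning). DropCapPara, dropCapLines and the function names are hypothetical and are not taken from the HPDF test code.

```haskell
-- Stand-in for HPDF's PDFFloat (assumed here to be a Double).
type PDFFloat = Double

-- Hypothetical paragraph style remembering the width of the big letter.
newtype DropCapPara = DropCapPara PDFFloat

-- Hypothetical number of lines that must make room for the big letter.
dropCapLines :: Int
dropCapLines = 3

-- Same shape as lineWidth: the first lines are shortened by the width
-- of the big letter, the following ones use the full width.
dropCapLineWidth :: DropCapPara -> PDFFloat -> Int -> PDFFloat
dropCapLineWidth (DropCapPara capW) fullW n
  | n <= dropCapLines = fullW - capW
  | otherwise         = fullW

-- Same shape as linePosition: the shortened lines are shifted right
-- by the width of the big letter, the following ones are not shifted.
dropCapLinePosition :: DropCapPara -> PDFFloat -> Int -> PDFFloat
dropCapLinePosition (DropCapPara capW) _ n
  | n <= dropCapLines = capW
  | otherwise         = 0
```

With a recorded letter width of 30 and a text width of 200, the first three lines would be 170 wide and shifted by 30, and later lines would use the full 200.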

There is a final trick: the new bigger letter should not change the interline space. So the Box created by paraChange to contain the letter has null dimensions. Its only function is to display a big letter.

Finally, when the lines are displayed, the style function paragraphStyle is used. For that function, a paragraph is a sequence of lines with the same paragraph style. One argument of that function is the paragraph bounding rectangle. It is used to draw the red border and fill the paragraph background.

So, to display the previous example you finally just need to write:

setStyle BlueStyle
setParaStyle (BluePara 0)
paragraph $ do
    txt $ "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do "
    txt $ "eiusmod tempor incididunt ut labore et dolore magna aliqua. "
    txt $ "Ut enim ad minim veniam, quis nostrud exercitation ullamco "
    txt $ "laboris nisi ut aliquip ex ea commodo consequat. Duis aute "
    txt $ "irure dolor  in reprehenderit in voluptate velit esse cillum "
    txt $ "dolore eu fugiat nulla pariatur. Excepteur sint occaecat "
    txt $ "cupidatat non proident, sunt in cu"

The paragraph style is doing all the work.

2. Sentence and word styles

Here is a new example:

The method is the same. Instead of using a paragraph style, I am using sentence and word styles. The first style used in this example is a sentence style responsible for drawing a red rectangle around words. Note that for a sentence style, the unit of processing is the line. So, if a sentence is broken by the line breaking algorithm, it will be processed as several sentences (think about a URL).

After the red rectangle, the picture contains an example of a word style. At the beginning of the style, a random generator is started and used to style the words. This means that the style is updated from word to word. It is not visible on this screenshot because there was a bug and the update was not occurring. I corrected it before uploading the library to Hackage ... and I hope this quick fix has not introduced other problems.
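The idea of a style that is updated from word to word can be sketched by threading a piece of state through the list of words. This is only an illustration: WordStyle, nextStyle and styleWords are hypothetical names, and the small linear congruential step stands in for whatever random generator the library really uses.

```haskell
import Data.List (mapAccumL)

-- Hypothetical word style: just a pseudo-random seed.
newtype WordStyle = WordStyle Int
  deriving (Eq, Show)

-- One step of a tiny linear congruential generator, used here as a
-- stand-in for a real random generator.
nextStyle :: WordStyle -> WordStyle
nextStyle (WordStyle s) =
  WordStyle ((s * 1103515245 + 12345) `mod` 2147483648)

-- Pair each word with the style that is current when the word is
-- reached, and pass an updated style on to the next word.
styleWords :: WordStyle -> [String] -> (WordStyle, [(String, WordStyle)])
styleWords = mapAccumL step
  where
    step st w = (nextStyle st, (w, st))
```

If the update step were skipped, every word would receive the initial style, which is the kind of symptom described above.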

Finally, the last example uses both a sentence style and a word style. The word style styles the words and the glues in different ways. The sentence style draws a blue rectangle under the text and a blue line over the text.

Note that the styling functions receive a Draw monad value as argument, so they can potentially do much more, such as rotating each word.

3. Paragraph shape

Another example:

This last example uses another paragraph style to fill a circle with text. Note that the display stops as soon as a line falls below the bottom edge of the bounding rectangle.
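A circular shape can be expressed with the same two functions by computing, for each line, the chord of the circle at the height of that line. The sketch below is not the code used for the picture. It assumes that the circle's diameter is the width of the text area, that lines are numbered from 1, and that the style stores the line height. CirclePara and the function names are hypothetical.

```haskell
-- Stand-in for HPDF's PDFFloat (assumed here to be a Double).
type PDFFloat = Double

-- Hypothetical paragraph style: the field is the line height.
newtype CirclePara = CirclePara PDFFloat

-- Half the length of the chord crossed by the middle of line n,
-- for a circle of the given diameter. Zero outside of the circle.
halfChord :: PDFFloat -> PDFFloat -> Int -> PDFFloat
halfChord lineH diameter n
  | d >= r    = 0
  | otherwise = sqrt (r * r - d * d)
  where
    r = diameter / 2
    y = (fromIntegral n - 0.5) * lineH  -- depth of the middle of line n
    d = abs (r - y)                     -- distance to the circle's center

-- Same shape as lineWidth: the full chord.
circleLineWidth :: CirclePara -> PDFFloat -> Int -> PDFFloat
circleLineWidth (CirclePara lineH) w n = 2 * halfChord lineH w n

-- Same shape as linePosition: shift so that the chord is centered.
circleLinePosition :: CirclePara -> PDFFloat -> Int -> PDFFloat
circleLinePosition (CirclePara lineH) w n = w / 2 - halfChord lineH w n
```

Lines near the top and bottom of the circle are short and shifted toward the middle, the line through the center uses the full width, and lines beyond the circle get a width of zero.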

4. Conclusion

It is perhaps too much work for one person in his spare time :-) In a future post, perhaps, I'll try to explain how allegories (an extension of category theory) are relevant to the problem of designing a line breaking algorithm and how I could improve my current algorithm.

You can find the lib on Hackage.

Once the lib is installed, go to the test folder, type make demo and then ./test. It should create the demo.pdf file.

It was tested with GHC only.

(This post was imported from my old blog. The date is the date of the import. The comments were not imported.)
