Why Markdown in Haddock will not happen

Posted on August 30, 2013 by Fūzetsu

Today I’m going to talk about something that might disappoint a fair number of people in the community. Namely, why Markdown syntax in Haddock is actually A Bad Idea™ and why it’s not going to happen.

Note that the following was written about a week ago, when I sat down to actually implement this, so it might not be the most coherent thing ever: it was just meant to serve me as a TODO list as I went along (in fact, this was originally written in org-mode). It is a rather long post so, as I sometimes do, I include a section dedicated to summarising it all. If you are one of the proponents of an alternate syntax for Haddock, I’d appreciate it if you read the reasoning in this post before sending any angry e-mails!

Perhaps the most humorous part of this post is that it itself is written in Markdown and then converted into HTML…

Background

There has been a rather large push for Markdown to be allowed as a language for Haddock comments. In fact, there were many more proposals, such as reST or even WikiCreole. The motivation is that people would not have to learn a new markup syntax in order to write documentation. I will present the relevant parts of Markdown as described in the original Markdown documentation. Note that problems with Markdown have been pointed out before: there is a whole plethora of Markdown flavours out there, mostly stemming from the fact that no formal specification actually exists. The differences between some common implementations can be tried out using babelmark2. This is rather unfortunate because while many people want to see Markdown syntax available, they almost certainly want to see the Markdown syntax and rules they are used to. It might look like 10 people uniformly want Markdown, but it’s actually the case that 2 people want GitHub Markdown, 2 people want Pandoc Markdown, 2 people want original Markdown and so on. This means that the Markdown flavour chosen would have to be rather agnostic of any special features any of these might provide.

Furthermore, it should be noted that these suggestions were made with respect to Haddock as it was 2-5 months ago (Feb-June 2013), which had many quirks and was a bit less expressive in terms of nested markup and such. This means that such proposals could have been motivated by:

Nevertheless, let us pretend that these problems don’t exist and let’s try to consider some Markdown that we would be implementing as part of this.

Reasoning

Original suggestions and their solutions

First I consider the markup listed in the original proposal:

With this the proposal ends. I’ll throw in some more thoughts on adding this separate syntax, however, basing them on the original Markdown documentation for the features we would require, roughly in order of appearance.

Problems with everything else

Conclusion

I have gone over every relevant feature from the original Markdown and discussed the suitability of a Markdown syntax mode in Haddock in terms of these features. As it turns out, there is only one thing that can reasonably be implemented: indented codeblocks. Yes, there are other things, such as automatic escaping of special characters, but they have nothing to do with Markdown itself: they are simply things that would be nice to have in Haddock and require no actual special syntax change. Would I therefore feel justified in introducing a whole new pragma for Haddock, implementing codeblocks with indentation instead of ‘@’s and butchering the rest of the original syntax?

Why is this the case? It seemed like such a good idea to a large number of people when proposals were initially being presented. Even if you didn’t like Markdown, there were plenty of other calls for reST and Wiki syntax. It was going to be great: people wouldn’t have to learn Haddock syntax and could concentrate more on writing code. Why can’t we have things like horizontal rules or inline HTML? I think the first sentence in the Markdown documentation after the introduction explains it pretty well: “Markdown’s syntax is intended for one purpose: to be used as a format for writing for the web.” As it turns out, Haddock is not ‘the web’. It just happens that most people see it in action once it’s nicely rendered into XHTML and up on Hackage. In fact, it also has back-ends for LaTeX and Hoogle! Does it make sense to have inline HTML tags in LaTeX? No. Does it make sense to have horizontal bars in Hoogle? No. Sure, you could argue that these back-ends could just ignore such things, but it makes no sense to bring Markdown, a mid-point between plain text and writing HTML by hand, into Haddock. Haddock already has its own markup structures that the back-ends interface with, only one of which happens to be for the web.

With all this in mind, I do not think that Markdown is something that can be reasonably added to Haddock. We could try to emulate it very poorly, butchering the existing syntax, but the result would help nobody. People who know Haddock would have to look things up and would make mistakes because of the changes, AND people who know Markdown would have to look things up and would make mistakes because it’s not possible to provide an implementation that even remotely resembles what Markdown does when writing something that will be directly converted to HTML later.

Summary

To summarise, Markdown can’t happen simply because it’s a format for editing documents for the Web. It’s just a mid-way point between writing plain, unformatted text and writing HTML by hand. Many others exist and were suggested, such as reST and Wiki Creole. They suffer from the same problems, although often not as heavily.

While most people probably see Haddock-generated documentation on Hackage, Haddock is not what one might call a mid-point between HTML and straight-up text. In fact, the LaTeX and Hoogle back-ends confirm this. Internally, the documentation comments are parsed into a markup structure that’s agnostic of which back-end it’s going to be used for. This is what makes adding back-ends possible. What does this mean for us? It means that there is no clear mapping of Haddock features to Markdown syntax.
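To make the idea of a back-end agnostic markup structure a bit more concrete, here is a minimal Haskell sketch. Everything in it is an assumption made up for illustration: the `Doc` type, its constructors and the two renderers are hypothetical and heavily simplified, and they are not Haddock’s actual data types or API. The point is only that the comment is parsed once into a tree and each back-end decides for itself how to render every node.

```haskell
-- A hypothetical, heavily simplified markup tree. This is NOT Haddock's
-- real data type; the names exist only for this illustration.
data Doc
  = Text String       -- plain text
  | Emphasis Doc      -- emphasised text
  | Monospaced Doc    -- inline code
  | CodeBlock String  -- verbatim block
  | Append Doc Doc    -- concatenation of two pieces of markup
  deriving (Show, Eq)

-- One possible HTML-flavoured back-end.
renderHtml :: Doc -> String
renderHtml (Text s)       = concatMap escapeHtml s
renderHtml (Emphasis d)   = "<em>" ++ renderHtml d ++ "</em>"
renderHtml (Monospaced d) = "<code>" ++ renderHtml d ++ "</code>"
renderHtml (CodeBlock s)  = "<pre>" ++ concatMap escapeHtml s ++ "</pre>"
renderHtml (Append a b)   = renderHtml a ++ renderHtml b

-- One possible LaTeX-flavoured back-end working on the very same tree.
renderLatex :: Doc -> String
renderLatex (Text s)       = concatMap escapeLatex s
renderLatex (Emphasis d)   = "\\emph{" ++ renderLatex d ++ "}"
renderLatex (Monospaced d) = "\\texttt{" ++ renderLatex d ++ "}"
renderLatex (CodeBlock s)  = "\\begin{verbatim}\n" ++ s ++ "\n\\end{verbatim}"
renderLatex (Append a b)   = renderLatex a ++ renderLatex b

-- Escape the characters that are special in HTML text.
escapeHtml :: Char -> String
escapeHtml '<' = "&lt;"
escapeHtml '>' = "&gt;"
escapeHtml '&' = "&amp;"
escapeHtml c   = [c]

-- Escape a handful of LaTeX specials; deliberately incomplete
-- (backslash, '~' and '^' are not handled in this sketch).
escapeLatex :: Char -> String
escapeLatex c
  | c `elem` "#$%&_{}" = ['\\', c]
  | otherwise          = [c]

-- A small document that both back-ends can render.
example :: Doc
example =
  Append (Text "Use ")
         (Append (Monospaced (Text "fmap")) (Text " & friends"))
```

Loading this into GHCi and evaluating `putStrLn (renderHtml example)` and `putStrLn (renderLatex example)` shows the same tree coming out as two different formats. Note that there is no sensible way to add a ‘raw HTML’ node to such a tree: `renderHtml` could emit it verbatim, but `renderLatex` would have nothing reasonable to do with it.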

Here we need to remember the reason why Markdown was requested: it would allow people who know Markdown to write documentation instead of having to learn yet another syntax. This is also why it was more of a markup popularity contest than a case of picking the most suitable tool for the job.

An easy example of why this goal can’t easily be achieved is definition lists: Markdown has no such thing. Worse yet, Haddock’s definition list syntax is used in Markdown for something totally different, something we already have syntax for. More examples are possible and general problems with Markdown also surface: see the links to babelmark2 results in the reasoning sections.

This means that:

* We end up having to redefine existing Haddock syntax when using Markdown

This makes it useless to anyone that already knows the Haddock syntax but prefers Markdown

This makes it horrible for any Markdown users as there exist tens of flavours that solve these differently and when people speak of Markdown, they actually speak of ‘the Markdown flavour that I use the most’. This also means any users have to look up how we solved those problems and what quirks this introduces, defeating the point.

Same as above, it defeats the point if users have to look things up. This is also horrible for the implementation side of things as we have to effectively implement, test and maintain two separate parsers and might miss out on features simply due to the burden of implementation.

An easy example of this is inline HTML tags which Markdown allows. This makes no sense for LaTeX and Hoogle back-ends.

Again, while some of these things could be solved by picking a less popular markup alternative (reST or Creole), in the end we have to deal with the fact that these are not a 1:1 fit for Haddock, effectively defeating the point of being able to just jump in. If they are a 1:1 fit but everyone would have to learn it, they might as well learn Haddock syntax: there aren’t many entities and many of the quirks people complained about have been or are being ironed out.

For these reasons, Markdown will not be implemented.

I will send out the link to this post on the Haskell café mailing list so if you have any comments, please direct them there where everyone can see: this is a project for the community after all.