Java Markdown

Markdown is a simple and convenient format for writing documentation as plain text. It is commonly used by platforms such as GitHub.

In this post, we describe how to parse your markdown content and use it to produce other formats. For this purpose, we will use the Pegdown tool, available at https://github.com/sirthias/pegdown.

Installing Pegdown

We use version 1.2.1 of Pegdown, which is based on Parboiled (https://github.com/sirthias/parboiled). The jar files of these tools are available at https://github.com/sirthias/pegdown/downloads and https://github.com/sirthias/parboiled/downloads respectively. Note that the asm tool is also required.

After installing these tools, we have the following jar files in our classpath:

asm-all-4.1.jar: the asm tool
parboiled-core-1.1.4.jar: the parboiled core jar
parboiled-java-1.1.4.jar: the parboiled jar specific for Java
pegdown-1.2.1.jar: the pegdown jar

Let's now dive into how to handle markdown content.

Parsing markdown with Pegdown

Pegdown provides a processor that parses the markdown content given as input. The following code shows how to parse markdown:

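    import org.parboiled.common.FileUtils;
    import org.parboiled.common.Preconditions;
    import org.pegdown.Extensions;
    import org.pegdown.PegDownProcessor;
    import org.pegdown.ast.RootNode;

    String fileName = (...)
    // Enable all the markdown extensions supported by Pegdown
    PegDownProcessor processor = new PegDownProcessor(Extensions.ALL);
    char[] markdown = FileUtils.readAllChars(fileName);
    Preconditions.checkNotNull(markdown, "The specified file isn't found - " + fileName);

    RootNode rootNode = processor.parseMarkdown(markdown);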

The parseMarkdown method parses the content and returns a RootNode, the root of an object representation of the document. You are now ready to use it to create other content (XML and so on).
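The null check in the parsing code suggests that FileUtils.readAllChars returns null when the file can't be read. If you prefer an exception in that case, the file can also be loaded with the standard library only. Here is a minimal sketch; the MarkdownFiles class is a hypothetical helper (it is not part of Pegdown or Parboiled), and UTF-8 is an assumption about the encoding of your files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Pegdown or Parboiled.
final class MarkdownFiles {
    private MarkdownFiles() {
    }

    // Reads the whole file and decodes it as UTF-8 (assumed encoding).
    // A missing or unreadable file raises an IOException instead of yielding null.
    static char[] readAllChars(String fileName) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(fileName));
        return new String(bytes, StandardCharsets.UTF_8).toCharArray();
    }
}
```

The returned char[] can then be handed to parseMarkdown in the same way as before, and the explicit null check is no longer needed.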

Using markdown content

We want to parse the following markdown-formatted text. We will use it as the basis for the rest of the post.

An introduction sentence. Another introduction sentence.

An introduction sentence.

# First header title

Some content. Some content.

* List item 1: some description
* List item 2: some description

Some content. Some content.

    SomeClass clazz = new SomeClass();
    clazz.test();

Some content.

Given this markdown content as input, the PegDown processor produces the following structure of nodes:

ParaNode
  SuperNode
    TextNode
    SpecialTextNode
    TextNode
    SpecialTextNode
ParaNode
  SuperNode
    TextNode
    SpecialTextNode
HeaderNode
  TextNode
ParaNode
  SuperNode
    TextNode
    SpecialTextNode
    TextNode
    SpecialTextNode
BulletListNode
  ListItemNode
    RootNode
      SuperNode
        TextNode
        SpecialTextNode
        TextNode
  ListItemNode
    RootNode
      SuperNode
        TextNode
        SpecialTextNode
        TextNode
ParaNode
  SuperNode
    TextNode
    SpecialTextNode
    TextNode
    SpecialTextNode
VerbatimNode
ParaNode
  SuperNode
    TextNode
    SpecialTextNode
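A listing of node types like this one can be produced by walking the tree recursively and printing one line per node. The sketch below shows the idea on a stand-in class: SketchNode and TreeDump are hypothetical names used for illustration, not Pegdown classes. With Pegdown itself, the same recursion would presumably rely on getChildren and on node.getClass().getSimpleName() for the label.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in for a Pegdown node (hypothetical): only a type name and children.
final class SketchNode {
    final String name;
    final List<SketchNode> children;

    SketchNode(String name, SketchNode... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }

    List<SketchNode> getChildren() {
        return children;
    }
}

final class TreeDump {
    private TreeDump() {
    }

    // Returns one line per node, indented by two spaces per nesting level.
    static String dump(SketchNode node) {
        StringBuilder out = new StringBuilder();
        dump(node, 0, out);
        return out.toString();
    }

    private static void dump(SketchNode node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.name).append('\n');
        // Recurse into the children one level deeper
        for (SketchNode child : node.getChildren()) {
            dump(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        SketchNode header = new SketchNode("HeaderNode", new SketchNode("TextNode"));
        SketchNode para = new SketchNode("ParaNode",
                new SketchNode("SuperNode", new SketchNode("TextNode"), new SketchNode("SpecialTextNode")));
        System.out.print(dump(new SketchNode("RootNode", header, para)));
    }
}
```

Running main prints the small sample tree with each child indented under its parent, which makes the nesting of SuperNode and RootNode elements easier to see than a flat list.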

Starting from the root node returned when parsing the markdown file, we can iterate over its children using the getChildren method of the Node class.

    Node rootNode = (...)
    List<Node> nodes = rootNode.getChildren();
    StringBuilder content = new StringBuilder();
    for (Node node : nodes) {
        if (node instanceof HeaderNode) {
            HeaderNode headerNode = (HeaderNode) node;
            String text = getTextContent(node);
            (...)
        } else if (node instanceof ParaNode) {
            ParaNode paraNode = (ParaNode) node;
            String text = getTextContent(node);
            (...)
        } else if (node instanceof VerbatimNode) {
            VerbatimNode verbatimNode = (VerbatimNode) node;
            String text = getTextContent(node);
            (...)
        } else if (node instanceof BulletListNode) {
            BulletListNode bulletListNode = (BulletListNode) node;
            // displayNodeChildren is a helper that is not shown in this post
            displayNodeChildren(bulletListNode);
            content.append("<ul>");
            // The children of a bullet list are its list items
            List<Node> listItemNodes = bulletListNode.getChildren();
            for (Node childNode : listItemNodes) {
                if (childNode instanceof ListItemNode) {
                    ListItemNode listItemNode = (ListItemNode) childNode;
                    String text = getTextContent(childNode);
                    (...)
                }
            }
            content.append("</ul>");
        }
    }

The getTextContent methods implement how to get the text from different blocks such as headers, paragraphs and code listings:

    // Dispatches on the node type to extract the text of a block.
    private String getTextContent(Node node) {
        if (node instanceof TextNode) {
            return getTextContent((TextNode) node);
        } else if (node instanceof HeaderNode) {
            HeaderNode headerNode = (HeaderNode) node;
            return getTextContent((TextNode) headerNode.getChildren().get(0));
        } else if (node instanceof ParaNode) {
            ParaNode paraNode = (ParaNode) node;
            Node firstChildNode = paraNode.getChildren().get(0);
            if (firstChildNode instanceof SuperNode) {
                return getTextContent((SuperNode) firstChildNode);
            } else if (firstChildNode instanceof TextNode) {
                return getTextContent((TextNode) firstChildNode);
            }
        } else if (node instanceof ListItemNode) {
            // A list item wraps its content in a nested RootNode
            ListItemNode listItemNode = (ListItemNode) node;
            RootNode rootNode = (RootNode) listItemNode.getChildren().get(0);
            Node firstChildNode = rootNode.getChildren().get(0);
            if (firstChildNode instanceof SuperNode) {
                return getTextContent((SuperNode) firstChildNode);
            } else if (firstChildNode instanceof TextNode) {
                return getTextContent((TextNode) firstChildNode);
            }
        }
        return null;
    }

    // Concatenates the text of the TextNode and SpecialTextNode children.
    private String getTextContent(SuperNode node) {
        List<Node> nodes = node.getChildren();
        StringBuilder content = new StringBuilder();
        for (Node child : nodes) {
            if (child instanceof TextNode) {
                content.append(getTextContent((TextNode) child));
            } else if (child instanceof SpecialTextNode) {
                content.append(getTextContent((SpecialTextNode) child));
            }
        }
        return content.toString();
    }

    private String getTextContent(TextNode node) {
        return node.getText();
    }
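These methods only look at the first child of a paragraph or list item, which is enough for our sample content. For more deeply nested content, one possible alternative is to collect the text of every leaf recursively. The sketch below illustrates this on a stand-in class; DemoNode and TextCollector are hypothetical names, not Pegdown classes, and the approach is a variation on the methods above rather than something Pegdown provides.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for a Pegdown node (hypothetical): leaves carry text, containers carry children.
final class DemoNode {
    final String text; // null for container nodes
    final List<DemoNode> children;

    private DemoNode(String text, List<DemoNode> children) {
        this.text = text;
        this.children = children;
    }

    static DemoNode leaf(String text) {
        return new DemoNode(text, Collections.<DemoNode>emptyList());
    }

    static DemoNode container(DemoNode... children) {
        return new DemoNode(null, Arrays.asList(children));
    }
}

final class TextCollector {
    private TextCollector() {
    }

    // Concatenates the text of all leaves below the node, in document order.
    static String collect(DemoNode node) {
        StringBuilder out = new StringBuilder();
        collect(node, out);
        return out.toString();
    }

    private static void collect(DemoNode node, StringBuilder out) {
        if (node.text != null) {
            out.append(node.text);
        }
        for (DemoNode child : node.children) {
            collect(child, out);
        }
    }
}
```

One design difference is that this version returns an empty string for a node without text, whereas the methods above return null for node types they don't handle.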

Generating new content

Now we can implement the complete transformation of our markdown content to a pseudo HTML format. We wrap headers within an h2 tag, program listings within a code tag and list items within ul and li tags. We leave paragraphs as they are, without any wrapping. The following code describes this approach:

    Node rootNode = (...)
    List<Node> nodes = rootNode.getChildren();
    StringBuilder content = new StringBuilder();
    for (Node node : nodes) {
        if (node instanceof HeaderNode) {
            HeaderNode headerNode = (HeaderNode) node;
            content.append("<h2>");
            String text = getTextContent(node);
            if (text != null) {
                content.append(text);
            }
            content.append("</h2>");
            content.append("\n\n");
        } else if (node instanceof ParaNode) {
            // Paragraphs are copied as they are, without any wrapping tag
            ParaNode paraNode = (ParaNode) node;
            String text = getTextContent(node);
            if (text != null) {
                content.append(text);
            }
            content.append("\n\n");
        } else if (node instanceof VerbatimNode) {
            VerbatimNode verbatimNode = (VerbatimNode) node;
            content.append("<code>");
            String text = getTextContent(node);
            if (text != null) {
                content.append(text);
            }
            content.append("</code>");
            content.append("\n\n");
        } else if (node instanceof BulletListNode) {
            BulletListNode bulletListNode = (BulletListNode) node;
            content.append("<ul>");
            List<Node> listItemNodes = bulletListNode.getChildren();
            for (Node childNode : listItemNodes) {
                if (childNode instanceof ListItemNode) {
                    ListItemNode listItemNode = (ListItemNode) childNode;
                    content.append("<li>");
                    String text = getTextContent(childNode);
                    if (text != null) {
                        content.append(text);
                    }
                    content.append("</li>");
                }
            }
            content.append("</ul>");
            // Separate the list from the next block, as for the other elements
            content.append("\n\n");
        }
    }

Here is the final output:

An introduction sentence. Another introduction sentence.

An introduction sentence.

<h2>First header title</h2>

Some content. Some content.

<ul> <li>List item 1: some description</li> <li>List item 2: some description</li> </ul>

Some content. Some content.

<code> SomeClass clazz = new SomeClass(); clazz.test(); </code>

Some content.
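Note that this output is only pseudo HTML: the text of each block is appended as it is. If the markdown contains characters such as < or &, for instance in a code listing that uses generics, a real HTML output would need them escaped before they are wrapped in tags. Here is a minimal sketch of such a step; HtmlEscaper is a hypothetical helper that is not part of Pegdown, and it only covers the four characters listed in the code.

```java
// Hypothetical helper, not part of Pegdown: escapes the characters that
// have a special meaning in HTML text and attribute values.
final class HtmlEscaper {
    private HtmlEscaper() {
    }

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

In the transformation code, a call such as content.append(HtmlEscaper.escape(text)) could replace content.append(text) wherever text is written inside a tag.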