path: root/src/Text
Age | Commit message | Author | Files | Lines
2018-10-21 | Man reader: moved all lexer functions to the front. | John MacFarlane | 1 | -29/+29
2018-10-21 | Man reader: Moved handling of P, PP, LP to parser phase. | John MacFarlane | 1 | -5/+7
2018-10-21 | Man reader: added type synonym for Arg. | John MacFarlane | 1 | -9/+11
2018-10-21 | Man reader: Moved handling of B, I, BI, IB, etc. to parsing phase. | John MacFarlane | 1 | -32/+36
Ultimately groff lexing should not handle man-specific macros. This approach also gives more correct results for the test case.
2018-10-21 | Man reader: Clean up inline parsing. | John MacFarlane | 1 | -11/+16
2018-10-21 | Man reader: move macro resolution to lexer phase. | John MacFarlane | 1 | -76/+88
We also introduce a new type ManTokens (a sequence of tokens) and remove MComment. This allows lexers to return empty strings of tokens, or multiple tokens (as when macros are resolved). One test still fails. This needs to be fixed by moving handling of .BI, .I, etc. to the parsing phase.
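The idea of a lexer that returns a sequence of tokens, rather than exactly one, can be sketched as follows. This is only an illustration in Python, not the reader's actual Haskell code: the names `Tokens`, `lex_line`, and `lex` are hypothetical, and plain strings stand in for the real token type.

```python
from typing import Dict, List

# Hypothetical stand-in for a token sequence; strings play the role of tokens.
Tokens = List[str]


def lex_line(line: str, macros: Dict[str, Tokens]) -> Tokens:
    """Lex one input line into zero, one, or many tokens."""
    if line.startswith('.\\"'):
        return []                      # a comment contributes no tokens at all
    if line.startswith("."):
        parts = line[1:].split()
        name = parts[0] if parts else ""
        if name in macros:
            return list(macros[name])  # a user-defined macro expands to its stored tokens
    return [line]                      # everything else is a single token


def lex(text: str, macros: Dict[str, Tokens]) -> Tokens:
    """Concatenate the per-line token sequences."""
    tokens: Tokens = []
    for line in text.splitlines():
        tokens.extend(lex_line(line, macros))
    return tokens
```

Because every lexer returns a sequence, a comment no longer needs a dedicated token, and a resolved macro can contribute several tokens at once.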
2018-10-21 | Muse writer: use lightweight markup after </em> tag | Alexander Krotov | 1 | -0/+1
2018-10-20 | Man reader: allow unescaped " in plain arguments. | John MacFarlane | 1 | -1/+6
2018-10-20 | Man reader: support UR/UE, MT/ME for links. | John MacFarlane | 1 | -3/+22
Closes #4989.
2018-10-20 | Man reader: Fixed handling of nested fonts. | John MacFarlane | 1 | -19/+36
Closes #4978.
2018-10-21 | Muse reader: make sure that the whole text is parsed | Alexander Krotov | 1 | -0/+1
2018-10-21 | Muse reader: allow empty headers | Alexander Krotov | 1 | -7/+1
Previously empty headers caused parser to terminate without parsing the rest of the document.
2018-10-20 | Man reader: Fix .B, .I, .BR, etc. | John MacFarlane | 1 | -18/+36
2018-10-20 | Man reader: major restructuring, support macros. | John MacFarlane | 1 | -139/+137
- Improved support for custom macro definitions.
- LinePart type has been added. RoffStr is now one constructor of LinePart (the other being MacroArg).
- MComment has lost its argument.
- MEndMacro has been removed.
- MStr has been removed (we now simply use LinePart).
- Macros now store a list of tokens.
- Each macro argument is a [LinePart], instead of a LinePart.
- .BR now behaves as documented in man (and doesn't create a link).
2018-10-20 | Man reader: some support for custom macros. | John MacFarlane | 1 | -17/+39
2018-10-20 | Man reader: skip macro definitions for now. | John MacFarlane | 1 | -0/+12
2018-10-20 | Man reader: raise parse error if we don't get through whole input. | John MacFarlane | 1 | -1/+1
2018-10-20 | Man reader: support `\*[lq]`, `\*[rq]`. | John MacFarlane | 1 | -2/+2
2018-10-20 | Man reader: support '..' (end macro). | John MacFarlane | 1 | -8/+10
Also give feedback for unknown character codes, and return a replacement character U+FFFD.
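The fallback described here can be sketched as a lookup that records a warning and returns U+FFFD when a name is not known. This is a minimal Python illustration under stated assumptions: the table `KNOWN_CHARS`, the function `resolve_char`, and the warning text are all hypothetical, and the table holds only three of groff's named characters.

```python
from typing import List

# Tiny illustrative table (hypothetical); the real reader knows far more names.
KNOWN_CHARS = {"lq": "\u201c", "rq": "\u201d", "em": "\u2014"}
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def resolve_char(name: str, warnings: List[str]) -> str:
    """Return the character for a groff character name, or U+FFFD if unknown."""
    ch = KNOWN_CHARS.get(name)
    if ch is None:
        # Give feedback instead of failing, then substitute a visible placeholder.
        warnings.append(f"unknown character code: {name}")
        return REPLACEMENT
    return ch
```

Returning a placeholder keeps the parse going, while the collected warnings tell the user which codes were not understood.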
2018-10-20 | Man reader: handle lines with just one period. | John MacFarlane | 1 | -1/+2
2018-10-20 | Man reader: block quotes (using RS..RE). | John MacFarlane | 1 | -0/+4
2018-10-20 | Man reader: parse TP as definition lists. | John MacFarlane | 1 | -25/+36
Closes #4981.
2018-10-20 | Man reader: handle shift in list style. | John MacFarlane | 1 | -27/+33
Closes #4987.
2018-10-20 | Man reader: minor refactoring. | John MacFarlane | 1 | -9/+9
2018-10-20 | Powerpoint: Support raw openxml in pptx writer. | Jesse Rosenthal | 2 | -12/+25
This allows raw openxml blocks and inlines to be used in the pptx writer. A few caveats:
1. It's up to the user to write well-formed openxml. The chances of corruption, especially with such a brittle format as pptx, are pretty high.
2. Because of the tricky way that blocks map onto shapes, if you are using a raw block, it should be the only block on a slide (otherwise other text might end up overlapping it).
3. The pptx ooxml namespace abbreviations are different from the docx ooxml namespaces. Again, it's up to the user to get it right. The unzipped document and the ooxml specification should be consulted.
Closes: #4976
2018-10-19 | Man reader: skip optional .IP before code block. | John MacFarlane | 1 | -0/+5
2018-10-19 | Man reader: improve treatment of .TH. | John MacFarlane | 1 | -12/+8
This should just add to metadata (title, date, section), and not produce a level-1 header. (That might be done in the template, depending on the output format.)
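As a rough illustration of turning a `.TH` line into metadata only, the Python sketch below maps the first three arguments to title, section, and date. The function name `parse_th` is hypothetical, and `shlex.split` is used as a stand-in for roff argument quoting, which it only approximates.

```python
import shlex
from typing import Dict


def parse_th(line: str) -> Dict[str, str]:
    """Map the leading .TH arguments to metadata fields; emit no header."""
    args = shlex.split(line)[1:]           # drop the ".TH" macro name itself
    fields = ["title", "section", "date"]  # argument order as in man(7)
    return dict(zip(fields, args))         # extra arguments are ignored here
```

For example, `parse_th('.TH PANDOC 1 "October 2018" "pandoc 2.3"')` yields a title of `PANDOC`, a section of `1`, and a date of `October 2018`.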
2018-10-19 | Man reader: remove commented-out code. | John MacFarlane | 1 | -23/+0
2018-10-19 | Man reader: Improved header parsing. | John MacFarlane | 1 | -10/+11
- .SH should be level 1, .SS level 2.
- The header title can come on the next line.
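The two rules above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the reader's code: `HEADER_LEVELS` and `parse_headers` are hypothetical names, and input is treated as a simple list of lines.

```python
from typing import List, Tuple

HEADER_LEVELS = {".SH": 1, ".SS": 2}  # section and subsection macros


def parse_headers(lines: List[str]) -> List[Tuple[int, str]]:
    """Collect (level, title) pairs, taking the title from the next line if needed."""
    headers = []
    i = 0
    while i < len(lines):
        macro, _, rest = lines[i].partition(" ")
        if macro in HEADER_LEVELS:
            title = rest.strip()
            if not title and i + 1 < len(lines):
                i += 1
                title = lines[i].strip()  # title supplied on the following line
            headers.append((HEADER_LEVELS[macro], title))
        i += 1
    return headers
```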
2018-10-19 | Man writer: avoid unnecessary `.RS`/`.RE` pair in defn lists. | John MacFarlane | 1 | -1/+3
When the definition is just one paragraph, we don't need the `.RS\n.RE`.
2018-10-19 | Man reader: properly handle multi-block list items. | John MacFarlane | 1 | -6/+6
Closes #4985.
2018-10-19 | Man reader: minor refactoring. | John MacFarlane | 1 | -6/+14
2018-10-19 | Man reader: Nicer looking "skipped content" report. | John MacFarlane | 1 | -1/+3
Just give the macro name, which users will recognize, rather than the internal token.
2018-10-19 | Man reader: got rid of MUnknownMacro and simplified code. | John MacFarlane | 1 | -21/+4
2018-10-19 | Man reader: remove algebraic type for MacroKind. | John MacFarlane | 1 | -22/+8
Instead, just use a String for the literal macro. This makes the code easier to follow and yields better info messages for ignored content. Closes #4980.
2018-10-19 | Use man reader for files with extension dot + digit. | John MacFarlane | 1 | -0/+1
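A check of this kind might look like the Python sketch below. It assumes that "dot + digit" means exactly one digit after the final dot; the function name `is_man_extension` is hypothetical.

```python
import re


def is_man_extension(filename: str) -> bool:
    """True when the name ends in a dot followed by a single digit, e.g. ls.1."""
    return re.search(r"\.[0-9]$", filename) is not None
```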
2018-10-19 | Man reader: minor improvements. | John MacFarlane | 1 | -3/+3
Use `trimInlines` for Para content to avoid leading and trailing spaces. Fix handling of \" in middle of line. Add more tests for escapes.
2018-10-19 | Man reader: generate Space elements correctly. | John MacFarlane | 1 | -4/+4
Closes #4979.
2018-10-18 | Man reader: improve list parsing. | John MacFarlane | 1 | -20/+13
We now handle all kinds of ordered list markers. We also avoid having an extra bullet character in bullet list contents.
2018-10-18 | Man reader: remove final newline in code blocks. | John MacFarlane | 1 | -1/+5
This is consistent with other readers.
2018-10-18 | Man reader: use report instead of logMessage. | John MacFarlane | 1 | -2/+6
2018-10-18 | Man reader: improved parsing of groff escapes. | John MacFarlane | 1 | -80/+116
We now handle all the named escapes, plus combining accents and unicode escapes.
2018-10-18 | GroffChar: fixed interpretation of `\-`. | John MacFarlane | 1 | -1/+1
It is the ascii - sign, not the unicode hyphen.
2018-10-18 | Merge branch 'Yanpas-groff_reader' | John MacFarlane | 2 | -0/+561
2018-10-18 | Remove unneeded import. | John MacFarlane | 1 | -1/+0
2018-10-18 | Groff escaping changes. | John MacFarlane | 4 | -63/+70
- `--ascii` is now turned on automatically for man output, for portability. All man output will be escaped to ASCII.
- In T.P.Writers.Groff, `escapeChar`, `escapeString`, and `escapeCode` now take a boolean parameter that selects ascii-only output. This is used by the Ms writer for `--ascii`, instead of doing an extra pass after writing the document.
- In ms output without `--ascii`, unicode is used whenever possible (e.g. for double quotes).
- A few escapes are changed: e.g. `\[rs]` instead of `\\` for backslash, and `\[ga]` instead of `` \` `` for backtick.
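The effect of threading an ascii-only flag through the escaping functions, rather than running a second pass over the output, can be sketched in Python. The names `escape_char` and `escape_string` here are hypothetical analogues, not the Haskell functions themselves, and the generic `\[uXXXX]` fallback is an assumption of this sketch: the real writer may prefer named escapes where they exist.

```python
# Escapes applied regardless of the flag (hypothetical subset).
SPECIAL = {"\\": "\\[rs]", "`": "\\[ga]", "@": "\\[at]"}


def escape_char(ch: str, ascii_only: bool) -> str:
    """Escape a single character for groff output."""
    if ch in SPECIAL:
        return SPECIAL[ch]
    if ascii_only and ord(ch) > 127:
        return "\\[u%04X]" % ord(ch)  # groff's generic Unicode escape form
    return ch                         # unicode passes through when allowed


def escape_string(s: str, ascii_only: bool) -> str:
    """Escape every character of s in a single pass."""
    return "".join(escape_char(c, ascii_only) for c in s)
```

With the flag set, `escape_string("é", True)` gives `\[u00E9]`; without it, the character is left as is.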
2018-10-18 | Add Text.Pandoc.GroffChar. | John MacFarlane | 2 | -20/+420
This will hold common escaping data for groff characters.
2018-10-17 | man/ms writers: use `\[at]` for escaped `@`. | John MacFarlane | 1 | -1/+1
2018-10-17 | Move common groff functions to Text.Pandoc.Writers.Groff | John MacFarlane | 4 | -151/+155
(unexported module). These are used in both the man and ms writers. Moved groffEscape out of Text.Pandoc.Writers.Shared [cancels earlier API change from adding it, which was after last release]. This fixes strong/code combination on man (should be `\f[CB]` not `\f[BC]`), mentioned in #4973. Updated tests. Closes #4975.
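The strong/code fix comes down to building the font name in a fixed order, with the monospace letter first. A minimal Python sketch, assuming a hypothetical `font_escape` helper and the letters C, B, and I for monospace, bold, and italic:

```python
def font_escape(bold: bool, italic: bool, mono: bool) -> str:
    """Build a font-change escape with C before B before I, or R for plain text."""
    name = ("C" if mono else "") + ("B" if bold else "") + ("I" if italic else "")
    return "\\f[" + (name or "R") + "]"
```

Under these assumptions, bold plus monospace always comes out as `\f[CB]`, never `\f[BC]`, regardless of the order in which the styles were applied.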
2018-10-17 | Man writer: use \f[R] instead of \f[] to reset font | Alexander Krotov | 1 | -2/+2
Fixes #4973