path: root/src/Text/Pandoc/UTF8.hs
Age | Commit message | Author | Files | Lines
2013-02-08 | UTF8: Strip off BOM if present. | John MacFarlane | 1 | -2/+9
Closes #743.
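As an illustration of what stripping a BOM from UTF-8 input involves, here is a minimal sketch. It assumes the input is held as a strict `ByteString`; the names `utf8BOM` and `dropBOM` are hypothetical and not taken from the pandoc source.

```haskell
import qualified Data.ByteString as B

-- | The UTF-8 encoding of U+FEFF, the byte order mark.
utf8BOM :: B.ByteString
utf8BOM = B.pack [0xEF, 0xBB, 0xBF]

-- | Drop a leading UTF-8 BOM if present; otherwise return the input unchanged.
-- (Hypothetical helper, written for illustration only.)
dropBOM :: B.ByteString -> B.ByteString
dropBOM bs
  | utf8BOM `B.isPrefixOf` bs = B.drop (B.length utf8BOM) bs
  | otherwise                 = bs
```

Only a BOM at the very start of the input is removed; the same byte sequence later in the stream is left alone.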
2013-01-06 | UTF8 module: Remove `\r` when reading. | John MacFarlane | 1 | -4/+7
This should prevent problems with extra CRs on windows.
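One simple way to get this effect on already-decoded input is to filter out carriage returns. The sketch below is an assumption about the general approach, not the commit's actual code, and `stripCRs` is a hypothetical name.

```haskell
-- | Remove every carriage return from decoded input, so that "\r\n"
-- line endings become "\n". (Hypothetical helper for illustration.)
stripCRs :: String -> String
stripCRs = filter (/= '\r')
```

Note that this also drops a lone `\r` that is not followed by `\n`.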
2012-09-29 | UTF8: Removed unneeded imports. | John MacFarlane | 1 | -5/+0
2012-09-26 | UTF8: Better error message for invalid UTF8. | John MacFarlane | 1 | -4/+6
Read bytestring and use Text's decodeUtf8 instead of using System.IO's hGetContents. This way you get a message saying "invalid UTF-8 stream" instead of "invalid byte sequence." You are also told which byte caused the problem.
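The approach described above (read bytes, then decode with the text package) can be sketched as follows. This version uses `decodeUtf8'`, which returns the decoding error as a value instead of throwing it; the commit itself mentions `decodeUtf8`, so treat the `Either`-based variant and the name `decodeStrict` as assumptions made for illustration.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')

-- | Decode strict bytes as UTF-8. On invalid input, return the decoder's
-- own error description instead of raising an exception.
-- (Hypothetical helper; not the function defined in Text.Pandoc.UTF8.)
decodeStrict :: B.ByteString -> Either String String
decodeStrict bs =
  case decodeUtf8' bs of
    Left err -> Left (show err)
    Right t  -> Right (T.unpack t)
```

A caller could print the `Left` value directly to report the decoding failure to the user.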
2012-09-25 | Removed need for utf8-string package. | John MacFarlane | 1 | -3/+27
* Depend on text.
* Expose Text.Pandoc.UTF8.
* Text.Pandoc.UTF8 now exports toString, fromString, toStringLazy, fromStringLazy.
* These are used instead of the old utf8-string functions.
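The four exported conversions named above could plausibly be built on the text package along these lines. The function names come from the commit message; the bodies are a sketch under that assumption, not a copy of the module.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- | Decode strict UTF-8 bytes to a String (throws on invalid input).
toString :: B.ByteString -> String
toString = T.unpack . TE.decodeUtf8

-- | Encode a String as strict UTF-8 bytes.
fromString :: String -> B.ByteString
fromString = TE.encodeUtf8 . T.pack

-- | Decode lazy UTF-8 bytes to a String (throws on invalid input).
toStringLazy :: BL.ByteString -> String
toStringLazy = TL.unpack . TLE.decodeUtf8

-- | Encode a String as lazy UTF-8 bytes.
fromStringLazy :: String -> BL.ByteString
fromStringLazy = TLE.encodeUtf8 . TL.pack
```

Going through `Text` in both directions is what lets the text package replace utf8-string here.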
2012-09-25 | UTF8: use universalNewlineMode in reading. | John MacFarlane | 1 | -1/+2
This treats both '\r\n' and '\n' as '\n' on input, no matter what platform we're running on.
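For readers unfamiliar with `universalNewlineMode`, here is a minimal sketch of a reader that sets it on a handle using the standard `System.IO` API. The function name `readFileUtf8` is hypothetical, and the module's real reader may differ.

```haskell
import System.IO

-- | Read a file as UTF-8, translating "\r\n" to "\n" on input regardless
-- of platform. (Hypothetical helper for illustration.)
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8
  -- universalNewlineMode: accept CRLF or LF on input, write native newlines.
  hSetNewlineMode h universalNewlineMode
  hGetContents h  -- lazy; the handle is closed once the contents are consumed
```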
2012-09-23 | Revert "More intelligent handling of text encodings." | John MacFarlane | 1 | -16/+4
This reverts commit 7272735b3d413a644fd9ab01eeae8ae9cd5a925b.
2012-09-23 | More intelligent handling of text encodings. | John MacFarlane | 1 | -4/+16
Previously, UTF-8 was enforced for both input and output. The new system:
* For input, UTF-8 is tried first; if an error is raised, the locale encoding is tried.
* For output, the locale encoding is always used.
2012-09-23 | Removed unneeded CPP conditional. | John MacFarlane | 1 | -44/+0
Removed code that was conditional on base < 4.2, since now we require base >= 4.2.
2012-09-23 | UTF8: Export decodeArg. | John MacFarlane | 1 | -1/+5
2012-09-23 | Export encodePath/decodePath from UTF8. | John MacFarlane | 1 | -0/+1
Removed duplicate code in src/pandoc.hs.
2012-07-26 | Fixed whitespace errors. | John MacFarlane | 1 | -1/+1
2012-06-25 | Test for base 4.4.0 instead of 4.5.0 for argument/filename encoding. | John MacFarlane | 1 | -2/+2
2012-06-24 | Don't encode/decode file paths if base >= 4.5. | John MacFarlane | 1 | -6/+16
Prior to base 4.5 (and perhaps earlier - check), filepaths and command line arguments were treated as unencoded lists of bytes, not unicode strings, so we had to work around that by encoding and decoding them. This commit adds CPP checks for base 4.5 that disable the encoding/decoding. Fixes a bug with multilingual filenames when pandoc was compiled with ghc 7.4. Closes #540.
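A CPP check of this kind might look like the sketch below. It uses the names `encodePath`/`decodePath` that appear elsewhere in this log and the 4.4.0 threshold from the follow-up commit; the bodies of the fallback branch are an assumption (they go through the text package rather than whatever the module used at the time). `MIN_VERSION_base` is a macro supplied by Cabal, and by GHC itself in recent versions.

```haskell
{-# LANGUAGE CPP #-}
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8, encodeUtf8)

encodePath :: FilePath -> FilePath
decodePath :: FilePath -> FilePath
#if MIN_VERSION_base(4,4,0)
-- Newer base: paths are already unicode strings, so do nothing.
encodePath = id
decodePath = id
#else
-- Older base: paths are byte lists, so convert to and from UTF-8 bytes,
-- representing each byte as one Char. (Illustrative fallback only.)
encodePath = BC.unpack . encodeUtf8 . T.pack
decodePath = T.unpack . decodeUtf8 . BC.pack
#endif
```

On any current compiler only the first branch is compiled, so the imports exist solely for the fallback.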
2011-02-11 | UTF8: Encode filenames. | John MacFarlane | 1 | -2/+3
(This is still needed, even with recent base.) Partially resolves Issue #286 (though now there is a new markdown2pdf problem).
2011-01-30 | UTF8: Use #if instead of #ifdef. | John MacFarlane | 1 | -1/+1
2011-01-30 | UTF8 module: Use base 4.2 IO if available. | John MacFarlane | 1 | -1/+44
This gives us proper line endings on windows, and some speed improvements. We fall back to the old functions if base < 4.2. hGetContents is now exported.
2010-09-10 | Encode filenames as UTF8. | John MacFarlane | 1 | -2/+3
Resolves Issue #252 (pandoc doesn't properly handle unicode filenames).
2010-07-21 | Changed to using strict bytestrings in UTF8 module. | John MacFarlane | 1 | -2/+2
This avoids a problem on Windows reading from stdin. Previously we'd get an error from hGetBufNonBlocking.
2010-05-06 | UTF8: Modified readFile and getContents to strip BOM if present. | John MacFarlane | 1 | -2/+9
2010-05-06 | Added Text.Pandoc.UTF8 for portable UTF8 string IO. | John MacFarlane | 1 | -0/+65
2007-11-29 | Moved everything from src into the top-level directory. | fiddlosopher | 1 | -45/+0
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1104 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-11-03 | Reverted back to state as of r1062. The template haskell changes are more trouble than they're worth. | fiddlosopher | 1 | -0/+45
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1064 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-11-03 | Use template haskell to avoid the need for templates: | fiddlosopher | 1 | -45/+0
+ Added library Text.Pandoc.Include, with a template haskell function $(includeStrFrom fname) to include a file as a string constant at compile time.
+ This removes the need for the 'templates' directory or Makefile target. These have been removed.
+ The base source directory has been changed from src to .
+ A new 'data' directory has been added, containing the ASCIIMathML.js script, writer headers, and S5 files.
+ The src/wrappers directory has been moved to 'wrappers'.
+ The Text.Pandoc.ASCIIMathML library is no longer needed, since Text.Pandoc.Writers.HTML can use includeStrFrom to include the ASCIIMathML.js code directly. It has been removed.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1063 788f1e2b-df1e-0410-8736-df70ead52e1b
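To show the idea behind a compile-time include such as `$(includeStrFrom fname)`, here is a minimal Template Haskell sketch. Only the function name comes from the commit message; the body is an assumption written against the current `template-haskell` API, not the code from r1063, which was reverted in the commit listed just above it.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module IncludeSketch (includeStrFrom) where

import Language.Haskell.TH (Exp, Q, runIO, stringE)

-- | Read a file while the splice is being compiled and produce its
-- contents as a string literal expression.
-- (Illustrative sketch; not the original Text.Pandoc.Include code.)
includeStrFrom :: FilePath -> Q Exp
includeStrFrom fname = runIO (readFile fname) >>= stringE
```

Because of GHC's stage restriction, the splice `$(includeStrFrom "some/file")` has to appear in a different module from the one that defines `includeStrFrom`. The path in that usage is a placeholder.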
2007-10-27 | Modified fromUTF8 to strip out the BOM (byte order marker) wherever it is present. | fiddlosopher | 1 | -0/+1
See http://en.wikipedia.org/wiki/Byte_Order_Mark and http://six.pairlist.net/pipermail/markdown-discuss/2007-October/000874.html.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@1054 788f1e2b-df1e-0410-8736-df70ead52e1b
2007-07-13 | Changed encodeUTF8 to toUTF8, decodeUTF8 to fromUTF8, for clarity. | fiddlosopher | 1 | -16/+16
git-svn-id: https://pandoc.googlecode.com/svn/trunk@692 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-12-20 | + Added module data for haddock. | fiddlosopher | 1 | -1/+2
+ Reformatted code consistently.
git-svn-id: https://pandoc.googlecode.com/svn/trunk@252 788f1e2b-df1e-0410-8736-df70ead52e1b
2006-10-17 | initial import | fiddlosopher | 1 | -0/+43
git-svn-id: https://pandoc.googlecode.com/svn/trunk@2 788f1e2b-df1e-0410-8736-df70ead52e1b