parsers/markdown.lua

author: Matthew Wild <mwild1@gmail.com>
date: Sat, 10 Nov 2012 04:02:30 +0000
changeset 18: a96836139ff9
parent 12: 4c759312950b
permissions: -rwxr-xr-x

parsers.markdown: Make module callable, to allow parsing text as a module

#!/usr/bin/env lua

--[[
# markdown.lua -- version 0.32

<http://www.frykholm.se/files/markdown.lua>

**Author:** Niklas Frykholm, <niklas@frykholm.se>
**Date:** 31 May 2008

This is an implementation of the popular text markup language Markdown in pure Lua.
Markdown can convert documents written in a simple and easy to read text format
to well-formatted HTML. For a more thorough description of Markdown and the Markdown
syntax, see <http://daringfireball.net/projects/markdown>.

The original Markdown source is written in Perl and makes heavy use of advanced
regular expression techniques (such as negative look-ahead) which are not available
in Lua's simple regex engine. Therefore this Lua port has been rewritten from the ground
up. It is probably not completely bug free. If you notice any bugs, please report them to
me. A unit test that exposes the error is helpful.

## Usage

    require "markdown"
    markdown(source)

``markdown.lua`` exposes a single global function named ``markdown(s)`` which applies the
Markdown transformation to the specified string.

``markdown.lua`` can also be used directly from the command line:

    lua markdown.lua test.md

This creates a file ``test.html`` with the converted content of ``test.md``. Run

    lua markdown.lua -h

for a description of the command-line options.

``markdown.lua`` uses the same license as Lua, the MIT license.

## License

Copyright &copy; 2008 Niklas Frykholm.

Permission is hereby granted, free of charge, to any person obtaining a copy of this
software and associated documentation files (the "Software"), to deal in the Software
without restriction, including without limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons
to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies
or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

## Version history

- **0.32** -- 31 May 2008
    - Fix for links containing brackets
- **0.31** -- 1 Mar 2008
    - Fix for link definitions followed by spaces
- **0.30** -- 25 Feb 2008
    - Consistent behavior with Markdown when the same link reference is reused
- **0.29** -- 24 Feb 2008
    - Fix for <pre> blocks with spaces in them
- **0.28** -- 18 Feb 2008
    - Fix for link encoding
- **0.27** -- 14 Feb 2008
    - Fix for link database links with ()
- **0.26** -- 06 Feb 2008
    - Fix for nested italic and bold markers
- **0.25** -- 24 Jan 2008
    - Fix for encoding of naked <
- **0.24** -- 21 Jan 2008
    - Fix for link behavior.
- **0.23** -- 10 Jan 2008
    - Fix for a regression bug in longer expressions in italic or bold.
- **0.22** -- 27 Dec 2007
    - Fix for crash when processing blocks with a percent sign in them.
- **0.21** -- 27 Dec 2007
    - Fix for combined strong and emphasis tags
- **0.20** -- 13 Oct 2007
    - Fix for < as well in image titles, now matches Dingus behavior
- **0.19** -- 28 Sep 2007
    - Fix for quotation marks " and ampersands & in link and image titles.
- **0.18** -- 28 Jul 2007
    - Does not crash on unmatched tags (behaves like standard markdown)
- **0.17** -- 12 Apr 2007
    - Fix for links with %20 in them.
- **0.16** -- 12 Apr 2007
    - Do not require arg global to exist.
- **0.15** -- 28 Aug 2006
    - Better handling of links with underscores in them.
- **0.14** -- 22 Aug 2006
    - Bug for *`foo()`*
- **0.13** -- 12 Aug 2006
    - Added -l option for including stylesheet inline in document.
    - Fixed bug in -s flag.
    - Fixed emphasis bug.
- **0.12** -- 15 May 2006
    - Fixed several bugs to comply with MarkdownTest 1.0 <http://six.pairlist.net/pipermail/markdown-discuss/2004-December/000909.html>
- **0.11** -- 12 May 2006
    - Fixed bug for escaping `*` and `_` inside code spans.
    - Added license terms.
    - Changed join() to table.concat().
- **0.10** -- 3 May 2006
    - Initial public release.

// Niklas
]]


-- Set up a table for holding local functions to avoid polluting the global namespace
local M = {}
local MT = {__index = _G, __call = function (M, ...) return M.markdown(...); end }
setmetatable(M, MT)
setfenv(1, M)
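
-- Example (not part of the original file): a minimal sketch of how a __call
-- metamethod makes a module table callable. `demo_parser` and its stub
-- `markdown` function are hypothetical stand-ins, not the real converter.

```lua
-- Hypothetical stand-in for the module table.
local demo_parser = {}

-- Stub converter, used only to show the call forwarding.
function demo_parser.markdown(source)
    return "<p>" .. source .. "</p>"
end

-- __call receives the table itself first, followed by the call arguments.
setmetatable(demo_parser, {
    __call = function (self, ...)
        return self.markdown(...)
    end
})

local direct = demo_parser.markdown("hello")  -- ordinary field access
local called = demo_parser("hello")           -- goes through __call
```

-- Both forms reach the same function, so a caller can treat the value
-- returned by require either as a table or as a function.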

----------------------------------------------------------------------
-- Utility functions
----------------------------------------------------------------------

-- Locks table t from changes, writes an error if someone attempts to change the table.
-- This is useful for detecting variables that have "accidentally" been made global. Something
-- I tend to do all too much.
function lock(t)
    local function lock_new_index(t, k, v)
        error("module has been locked -- " .. k .. " must be declared local", 2)
    end

    local mt = {__newindex = lock_new_index}
    -- Carry over the existing metamethods (such as __index and __call), but
    -- never let an inherited __newindex replace the lock handler.
    local orig_mt = getmetatable(t)
    if orig_mt then
        for k, v in pairs(orig_mt) do
            if k ~= "__newindex" then
                mt[k] = v
            end
        end
    end
    setmetatable(t, mt)
end
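
-- Example (not part of the original file): a minimal stand-alone sketch of
-- the locking idea. `lock_table` and `demo_env` are names made up for this
-- illustration. __newindex fires only for keys that are absent from the
-- table, so existing fields stay writable.

```lua
local function lock_table(t)
    local mt = getmetatable(t) or {}
    mt.__newindex = function (_, k)
        error("table has been locked -- " .. tostring(k) .. " must be declared local", 2)
    end
    return setmetatable(t, mt)
end

local demo_env = lock_table({ existing = 1 })

-- Updating a key that already exists does not trigger __newindex.
demo_env.existing = 2

-- Creating a new key raises an error, which pcall captures here.
local ok, err = pcall(function () demo_env.typo = 3 end)
```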

-- Returns the result of mapping the values in table t through the function f
function map(t, f)
    local out = {}
    for k,v in pairs(t) do out[k] = f(v,k) end
    return out
end

-- The identity function, useful as a placeholder.
function identity(text) return text end

-- Functional style if statement. (NOTE: no short circuit evaluation)
function iff(t, a, b) if t then return a else return b end end
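
-- Example (not part of the original file): a small sketch of what "no short
-- circuit evaluation" means in practice. `iff_sketch`, `expensive` and
-- `calls` are illustrative names. Both arguments are evaluated before the
-- function is entered, unlike the `and`/`or` idiom.

```lua
local function iff_sketch(t, a, b) if t then return a else return b end end

local calls = 0
local function expensive()
    calls = calls + 1
    return "computed"
end

-- expensive() runs even though its result is discarded.
local via_iff = iff_sketch(true, "cheap", expensive())

-- Here expensive() is never called, because `or` short-circuits.
local via_and_or = true and "cheap" or expensive()
```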

-- Splits the text into an array of pieces at every match of sep. sep is a
-- Lua pattern and defaults to a newline, so by default the result is an
-- array of separate lines.
function split(text, sep)
    sep = sep or "\n"
    local lines = {}
    local pos = 1
    while true do
        local b,e = text:find(sep, pos)
        if not b then table.insert(lines, text:sub(pos)) break end
        table.insert(lines, text:sub(pos, b-1))
        pos = e + 1
    end
    return lines
end
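
-- Example (not part of the original file): a stand-alone copy of the loop
-- above under the hypothetical name `split_sketch`, showing two properties
-- that are easy to miss: a trailing separator yields a final empty string,
-- and the separator is interpreted as a Lua pattern.

```lua
local function split_sketch(text, sep)
    sep = sep or "\n"
    local pieces = {}
    local pos = 1
    while true do
        local b, e = text:find(sep, pos)
        if not b then table.insert(pieces, text:sub(pos)) break end
        table.insert(pieces, text:sub(pos, b - 1))
        pos = e + 1
    end
    return pieces
end

-- Default separator: the text after the last newline is an empty string.
local demo_lines = split_sketch("one\ntwo\n")

-- Pattern separator: a comma with optional surrounding whitespace.
local demo_fields = split_sketch("a , b,c", "%s*,%s*")
```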

-- Converts tabs to spaces, padding each tab out to the next tab stop
-- (a multiple of tab_width columns).
function detab(text)
    local tab_width = 4
    local function rep(match)
        local spaces = -match:len()
        while spaces<1 do spaces = spaces + tab_width end
        return match .. string.rep(" ", spaces)
    end
    text = text:gsub("([^\n]-)\t", rep)
    return text
end
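
-- Example (not part of the original file): a stand-alone copy of the routine
-- above under the hypothetical name `detab_sketch`, with a few inputs that
-- show how the padding depends on the column at which the tab occurs.

```lua
local function detab_sketch(text)
    local tab_width = 4
    local function rep(match)
        -- match is the text between the previous tab stop and this tab
        local spaces = -match:len()
        while spaces < 1 do spaces = spaces + tab_width end
        return match .. string.rep(" ", spaces)
    end
    -- Parentheses drop gsub's second return value (the match count).
    return (text:gsub("([^\n]-)\t", rep))
end

local tab_at_col0 = detab_sketch("\tx")      -- tab at column 0: four spaces
local tab_at_col2 = detab_sketch("ab\tx")    -- tab at column 2: two spaces
local tab_at_col4 = detab_sketch("abcd\tx")  -- tab on a tab stop: four spaces
```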

-- Applies string.find for every pattern in the list and returns the result
-- of the match that starts earliest in s (nothing if no pattern matches)
function find_first(s, patterns, index)
    local res = {}
    for _,p in ipairs(patterns) do
        local match = {s:find(p, index)}
        if #match>0 and (#res==0 or match[1] < res[1]) then res = match end
    end
    return unpack(res)
end
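
-- Example (not part of the original file): a stand-alone sketch of the same
-- search under the hypothetical name `find_first_sketch`. `unpack_compat` is
-- an assumption added so the snippet also runs on Lua 5.2 and later, where
-- unpack moved to table.unpack.

```lua
local unpack_compat = table.unpack or unpack

local function find_first_sketch(s, patterns, index)
    local res = {}
    for _, p in ipairs(patterns) do
        local match = { s:find(p, index) }
        -- keep the match whose start position is smallest
        if #match > 0 and (#res == 0 or match[1] < res[1]) then res = match end
    end
    return unpack_compat(res)
end

-- "<%?" matches at position 3, before "<[a-z]" matches at position 12,
-- even though "<[a-z]" comes first in the pattern list.
local first_b, first_e = find_first_sketch("x <?php ?> <b>", { "<[a-z]", "<%?" })

-- No pattern matches: the function returns no values at all.
local none = find_first_sketch("plain text", { "<[a-z]", "<%?" })
```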

-- If a replacement array is specified, the range [start, stop] in the array is replaced
-- with the replacement array and the resulting array is returned (the array is modified
-- in place). Without a replacement array the section of the array between start and stop
-- is returned as a new array.
function splice(array, start, stop, replacement)
    if replacement then
        local n = stop - start + 1
        while n > 0 do
            table.remove(array, start)
            n = n - 1
        end
        for i,v in ipairs(replacement) do
            -- Offset by i so the replacement elements keep their order;
            -- inserting every element at start would reverse them.
            table.insert(array, start + i - 1, v)
        end
        return array
    else
        local res = {}
        for i = start,stop do
            table.insert(res, array[i])
        end
        return res
    end
end
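
-- Example (not part of the original file): a stand-alone variant under the
-- hypothetical name `splice_sketch`, showing both modes. This variant
-- inserts replacement elements at start + i - 1 so that they keep their
-- order; that offset is a choice made for this sketch.

```lua
local function splice_sketch(array, start, stop, replacement)
    if replacement then
        -- remove the range [start, stop] in place
        for _ = start, stop do table.remove(array, start) end
        -- insert the replacement elements in their original order
        for i, v in ipairs(replacement) do
            table.insert(array, start + i - 1, v)
        end
        return array
    end
    -- no replacement: return a copy of the range, leave array untouched
    local res = {}
    for i = start, stop do table.insert(res, array[i]) end
    return res
end

local demo_letters = { "a", "b", "c", "d" }
local demo_middle = splice_sketch(demo_letters, 2, 3)
local before_replace = table.concat(demo_letters, ",")
splice_sketch(demo_letters, 2, 3, { "x", "y", "z" })
local after_replace = table.concat(demo_letters, ",")
```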

-- Outdents the text one step (removes one to four leading spaces from each line).
function outdent(text)
    text = "\n" .. text
    text = text:gsub("\n  ? ? ?", "\n")
    text = text:sub(2)
    return text
end

-- Indents the text one step (four spaces). Only text that follows a newline
-- is indented, so the first line is left as it is.
function indent(text)
    text = text:gsub("\n", "\n    ")
    return text
end
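
-- Example (not part of the original file): a stand-alone sketch of the two
-- helpers, assuming one indentation step is four spaces. `outdent_sketch`
-- and `indent_sketch` are hypothetical names, and the patterns are built
-- with string.rep so that the number of spaces is explicit.

```lua
local STEP = 4  -- assumed width of one indentation step

local function outdent_sketch(text)
    -- one required space followed by up to STEP - 1 optional spaces
    local pattern = "\n " .. string.rep(" ?", STEP - 1)
    text = ("\n" .. text):gsub(pattern, "\n")
    return text:sub(2)
end

local function indent_sketch(text)
    -- parentheses drop gsub's second return value (the match count)
    return (text:gsub("\n", "\n" .. string.rep(" ", STEP)))
end

-- Lines indented deeper than one step keep the remainder of their indent.
local outdented = outdent_sketch("    code\n      nested")

-- The first line is not preceded by a newline, so it is left unindented.
local indented = indent_sketch("a\nb")
```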

-- Does a simple tokenization of html data. Returns the data as a list of tokens.
-- Each token is a table with a type field (which is either "tag" or "text") and
-- a text field (which contains the original token data).
function tokenize_html(html)
    local tokens = {}
    local pos = 1
    while true do
        local start = find_first(html, {"<!%-%-", "<[a-z/!$]", "<%?"}, pos)
        if not start then
            table.insert(tokens, {type="text", text=html:sub(pos)})
            break
        end
        if start ~= pos then table.insert(tokens, {type="text", text = html:sub(pos, start-1)}) end

        local _, stop
        if html:match("^<!%-%-", start) then
            _,stop = html:find("%-%->", start)
        elseif html:match("^<%?", start) then
            -- "?" is escaped so that it is unambiguously a literal character
            _,stop = html:find("%?>", start)
        else
            _,stop = html:find("%b<>", start)
        end
        if not stop then
            -- error("Could not match html tag " .. html:sub(start,start+30))
            -- Unterminated tag: emit the "<" as text and continue after it
            table.insert(tokens, {type="text", text=html:sub(start, start)})
            pos = start + 1
        else
            table.insert(tokens, {type="tag", text=html:sub(start, stop)})
            pos = stop + 1
        end
    end
    return tokens
end

----------------------------------------------------------------------
-- Hash
----------------------------------------------------------------------

-- This is used to "hash" data into alphanumeric strings that are unique
-- in the document. (Note that this is not a cryptographic hash; the hash
-- function is not one-way.) The hash procedure is used to protect parts
-- of the document from further processing.

local HASH = {
    -- Has the hash been initialized?
    inited = false,

    -- The unique string prepended to all hash values. This is to ensure
    -- that hash values do not accidentally coincide with an actual existing
    -- string in the document.
    identifier = "",

    -- Counter that counts up for each new hash instance.
    counter = 0,

    -- Hash table.
    table = {}
}

-- Inits hashing. Creates a HASH.identifier that doesn't occur anywhere
-- in the text.
function init_hash(text)
    HASH.inited = true
    HASH.identifier = ""
    HASH.counter = 0
    HASH.table = {}

    local s = "HASH"
    local counter = 0
    local id
    while true do
        id = s .. counter
        if not text:find(id, 1, true) then break end
        counter = counter + 1
    end
    HASH.identifier = id
end

-- Returns the hashed value for s.
function hash(s)
    assert(HASH.inited)
    if not HASH.table[s] then
        HASH.counter = HASH.counter + 1
        local id = HASH.identifier .. HASH.counter .. "X"
        HASH.table[s] = id
    end
    return HASH.table[s]
end
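
--[==[
The placeholder scheme above can be sketched in isolation. This is a minimal,
self-contained illustration, not the module's own code: `state`, `init` and
`placeholder` are hypothetical local stand-ins for `HASH`, `init_hash` and
`hash`.

```lua
-- Hypothetical stand-ins for HASH, init_hash and hash.
local state = {identifier = "", counter = 0, table = {}}

-- Pick the first "HASH<n>" prefix that does not occur in the text.
local function init(text)
    state.counter, state.table = 0, {}
    local n = 0
    while text:find("HASH" .. n, 1, true) do n = n + 1 end
    state.identifier = "HASH" .. n
end

-- Return a stable placeholder for s, creating one on first use.
local function placeholder(s)
    if not state.table[s] then
        state.counter = state.counter + 1
        state.table[s] = state.identifier .. state.counter .. "X"
    end
    return state.table[s]
end

init("a document that already mentions HASH0")
local a = placeholder("<div>raw</div>")
local b = placeholder("<hr/>")
local a_again = placeholder("<div>raw</div>")
print(a, b, a_again)
```

The same input always maps to the same placeholder. The trailing "X"
presumably keeps one placeholder from being a prefix of another (for example
`HASH11` and `HASH110`), which would matter when placeholders are substituted
back with `gsub`.
]==]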

----------------------------------------------------------------------
-- Protection
----------------------------------------------------------------------

-- The protection module is used to "protect" parts of a document
-- so that they are not modified by subsequent processing steps.
-- Protected parts are saved in a table for later unprotection.

-- Protection data
local PD = {
    -- Saved blocks that have been converted
    blocks = {},

    -- Block level tags that will be protected
    tags = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "blockquote",
        "pre", "table", "dl", "ol", "ul", "script", "noscript", "form", "fieldset",
        "iframe", "math", "ins", "del"}
}

-- Pattern for matching a block tag that begins and ends in the leftmost
-- column and may contain indented subtags, e.g.
-- <div>
--     A nested block.
--     <div>
--         Nested data.
--     </div>
-- </div>
function block_pattern(tag)
    return "\n<" .. tag .. ".-\n</" .. tag .. ">[ \t]*\n"
end

-- Pattern for matching a block tag that begins and ends with a newline
function line_pattern(tag)
    return "\n<" .. tag .. ".-</" .. tag .. ">[ \t]*\n"
end
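
--[==[
The difference between the two patterns is easiest to see on a nested block.
The sketch below copies the two pattern builders as locals so that it runs on
its own; the sample strings are made up for illustration.

```lua
local function block_pattern(tag)
    return "\n<" .. tag .. ".-\n</" .. tag .. ">[ \t]*\n"
end

local function line_pattern(tag)
    return "\n<" .. tag .. ".-</" .. tag .. ">[ \t]*\n"
end

local nested = "\n<div>\n    <div>\n        Nested data.\n    </div>\n</div>\n"
local inline = "\n<p>One line.</p>\n"

-- The block pattern only accepts a closing tag in the leftmost column,
-- so it skips the indented inner </div> and spans the whole block.
local bs, be = nested:find(block_pattern("div"))

-- The line pattern stops at the first closing tag it sees.
local ls, le = nested:find(line_pattern("div"))

-- A one-line element has no closing tag on its own line, so only the
-- line pattern matches it.
local none = inline:find(block_pattern("p"))
local is, ie = inline:find(line_pattern("p"))
print(bs, be, ls, le, none, is, ie)
```

This suggests why block patterns are applied before line patterns: applied to
the nested sample, the line pattern alone would cut the block off at the
inner closing tag.
]==]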

-- Protects the range of characters from start to stop in the text and
-- returns the protected string. The characters at start and stop are kept
-- in place (the patterns used by protect begin and end with a newline),
-- so the hash ends up alone on its own line.
function protect_range(text, start, stop)
    local s = text:sub(start, stop)
    local h = hash(s)
    PD.blocks[h] = s
    text = text:sub(1,start) .. h .. text:sub(stop)
    return text
end

-- Protect every part of the text that matches any of the patterns. The first
-- matching pattern is protected first, etc.
function protect_matches(text, patterns)
    while true do
        local start, stop = find_first(text, patterns)
        if not start then break end
        text = protect_range(text, start, stop)
    end
    return text
end

-- Protects block-level tags in the specified text
function protect(text)
    -- First protect potentially nested block tags
    text = protect_matches(text, map(PD.tags, block_pattern))
    -- Then protect block tags at the line level.
    text = protect_matches(text, map(PD.tags, line_pattern))
    -- Protect <hr> and comment tags
    text = protect_matches(text, {"\n<hr[^>]->[ \t]*\n"})
    text = protect_matches(text, {"\n<!%-%-.-%-%->[ \t]*\n"})
    return text
end

-- Returns a true value (the saved block) if the string s is a hash
-- resulting from protection, nil otherwise
function is_protected(s)
    return PD.blocks[s]
end

-- Unprotects the specified text by expanding all the hashes (nonces)
-- back into the blocks they stand for
function unprotect(text)
    for k,v in pairs(PD.blocks) do
        -- Escape "%" so that gsub treats the replacement literally.
        v = v:gsub("%%", "%%%%")
        text = text:gsub(k, v)
    end
    return text
end
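
--[==[
A protect/unprotect round trip can be sketched as follows. This is a
self-contained illustration under stated assumptions: `blocks` and `counter`
are hypothetical locals standing in for `PD.blocks` and the hash counter, and
the placeholder is built inline instead of calling `hash`.

```lua
local blocks, counter = {}, 0

-- Same slicing as protect_range: the characters at start and stop stay
-- in the text, so the placeholder sits on a line of its own.
local function protect_range(text, start, stop)
    local s = text:sub(start, stop)
    counter = counter + 1
    local h = "HASH0" .. counter .. "X" -- stand-in for hash(s)
    blocks[h] = s
    return text:sub(1, start) .. h .. text:sub(stop)
end

local function unprotect(text)
    for k, v in pairs(blocks) do
        v = v:gsub("%%", "%%%%")
        text = text:gsub(k, v)
    end
    return text
end

local doc = "intro\n<hr class=\"x\">\noutro\n"
local start, stop = doc:find("\n<hr[^>]->[ \t]*\n")
local protected = protect_range(doc, start, stop)
local restored = unprotect(protected)
print(protected)
print(restored)
```

Note that the round trip is not byte-identical: the saved block includes its
surrounding newlines and the text keeps them too, so the restored block ends
up separated from its neighbours by blank lines.
]==]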


----------------------------------------------------------------------
-- Block transform
----------------------------------------------------------------------

-- The block transform functions transform the text on the block level.
-- They work with the text as an array of lines rather than as individual
-- characters.

-- Returns true if the line is a ruler of (char) characters.
-- The line must contain at least three char characters and contain only spaces and
-- char characters.
function is_ruler_of(line, char)
    if not line:match("^[ %" .. char .. "]*$") then return false end
    if not line:match("%" .. char .. ".*%" .. char .. ".*%" .. char) then return false end
    return true
end
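
--[==[
A few sample lines illustrate the two checks. The function is copied here as a
local so that the sketch runs on its own; the sample inputs are made up.

```lua
local function is_ruler_of(line, char)
    -- Only spaces and the ruler character are allowed...
    if not line:match("^[ %" .. char .. "]*$") then return false end
    -- ...and the ruler character must occur at least three times.
    if not line:match("%" .. char .. ".*%" .. char .. ".*%" .. char) then return false end
    return true
end

local samples = {
    {"---", "-"},     -- plain ruler
    {"* * *", "*"},   -- spaces between the characters are fine
    {"--", "-"},      -- too short
    {"-- x --", "-"}, -- contains other characters
}
local results = {}
for i, sample in ipairs(samples) do
    results[i] = is_ruler_of(sample[1], sample[2])
    print(sample[1], results[i])
end
```
]==]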

-- Identifies the block level formatting present in the line
function classify(line)
    local info = {line = line, text = line}

    -- Four leading spaces mark an indented line; sub(5) strips them.
    if line:match("^    ") then
        info.type = "indented"
        info.outdented = line:sub(5)
        return info
    end

    for _,c in ipairs({'*', '-', '_', '='}) do
        if is_ruler_of(line, c) then
            info.type = "ruler"
            info.ruler_char = c
            return info
        end
    end

    if line == "" then
        info.type = "blank"
        return info
    end

    if line:match("^(#+)[ \t]*(.-)[ \t]*#*[ \t]*$") then
        local m1, m2 = line:match("^(#+)[ \t]*(.-)[ \t]*#*[ \t]*$")
        info.type = "header"
        info.level = m1:len()
        info.text = m2
        return info
    end

    if line:match("^ ? ? ?(%d+)%.[ \t]+(.+)") then
        local number, text = line:match("^ ? ? ?(%d+)%.[ \t]+(.+)")
        info.type = "list_item"
        info.list_type = "numeric"
        info.number = 0 + number
        info.text = text
        return info
    end

    if line:match("^ ? ? ?([%*%+%-])[ \t]+(.+)") then
        local bullet, text = line:match("^ ? ? ?([%*%+%-])[ \t]+(.+)")
        info.type = "list_item"
        info.list_type = "bullet"
        info.bullet = bullet
        info.text = text
        return info
    end

    if line:match("^>[ \t]?(.*)") then
        info.type = "blockquote"
        info.text = line:match("^>[ \t]?(.*)")
        return info
    end

    if is_protected(line) then
        info.type = "raw"
        info.html = unprotect(line)
        return info
    end

    info.type = "normal"
    return info
end
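
--[==[
The patterns used by classify can be tried out on their own. The sketch below
copies four of them into hypothetical local names (`header_pat`, `numeric_pat`,
`bullet_pat`, `quote_pat`) and applies them to made-up sample lines.

```lua
local header_pat  = "^(#+)[ \t]*(.-)[ \t]*#*[ \t]*$"
local numeric_pat = "^ ? ? ?(%d+)%.[ \t]+(.+)"
local bullet_pat  = "^ ? ? ?([%*%+%-])[ \t]+(.+)"
local quote_pat   = "^>[ \t]?(.*)"

-- Header: the run of "#" gives the level, trailing "#" are dropped.
local hashes, title = ("## Title ##"):match(header_pat)

-- Numbered item: up to three leading spaces, digits, a dot, then the text.
local number, numbered_text = ("3. third item"):match(numeric_pat)

-- Bullet item: "*", "+" or "-" followed by whitespace.
local bullet, bullet_text = ("  * bullet"):match(bullet_pat)

-- Blockquote: one optional space or tab after ">" is stripped.
local quoted = ("> quoted"):match(quote_pat)

print(#hashes, title, number, numbered_text, bullet, bullet_text, quoted)
```

Because the header pattern does not require whitespace after the "#" run, a
line such as `#hashtag` also classifies as a level 1 header.
]==]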

-- Finds headers consisting of a normal line followed by a ruler and converts them to
-- header entries.
function headers(array)
    local i = 1
    while i <= #array - 1 do
        if array[i].type == "normal" and array[i+1].type == "ruler" and
            (array[i+1].ruler_char == "-" or array[i+1].ruler_char == "=") then
            local info = {line = array[i].line}
            info.text = info.line
            info.type = "header"
            info.level = iff(array[i+1].ruler_char == "=", 1, 2)
            table.remove(array, i+1)
            array[i] = info
        end
        i = i + 1
    end
    return array
end
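
--[==[
The merge of a text line with its underline can be sketched on a hand-built
array of classified lines. The function is copied as a local so the sketch is
self-contained; `iff` is defined elsewhere in the module and is assumed here
to be a plain conditional helper.

```lua
-- Assumed behavior of the module's iff helper.
local function iff(t, a, b) if t then return a else return b end end

local function headers(array)
    local i = 1
    while i <= #array - 1 do
        if array[i].type == "normal" and array[i+1].type == "ruler" and
            (array[i+1].ruler_char == "-" or array[i+1].ruler_char == "=") then
            local info = {line = array[i].line}
            info.text = info.line
            info.type = "header"
            info.level = iff(array[i+1].ruler_char == "=", 1, 2)
            table.remove(array, i+1) -- drop the ruler line
            array[i] = info          -- replace the text line with the header
        end
        i = i + 1
    end
    return array
end

local arr = headers({
    {type = "normal", line = "Title"},
    {type = "ruler", ruler_char = "=", line = "====="},
    {type = "normal", line = "Body"},
})
for _, entry in ipairs(arr) do print(entry.type, entry.line) end
```

A "=" underline yields level 1 and a "-" underline level 2; rulers made of
"*" or "_" are left alone.
]==]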

-- Find list blocks and convert them to protected data blocks
function lists(array, sublist)
    local function process_list(arr)
        local function any_blanks(arr)
            for i = 1, #arr do
                if arr[i].type == "blank" then return true end
            end
            return false
        end

        local function split_list_items(arr)
            local acc = {arr[1]}
            local res = {}
            for i=2,#arr do
                if arr[i].type == "list_item" then
                    table.insert(res, acc)
                    acc = {arr[i]}
                else
                    table.insert(acc, arr[i])
                end
            end
            table.insert(res, acc)
            return res
        end
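
--[==[
The grouping done by split_list_items can be sketched on a hand-built array.
The function is copied as a local so that the sketch runs on its own; the
sample entries are made up.

```lua
local function split_list_items(arr)
    local acc = {arr[1]}
    local res = {}
    for i = 2, #arr do
        if arr[i].type == "list_item" then
            -- A new item starts: close the current group.
            table.insert(res, acc)
            acc = {arr[i]}
        else
            -- Continuation or blank line: it belongs to the current item.
            table.insert(acc, arr[i])
        end
    end
    table.insert(res, acc)
    return res
end

local items = split_list_items({
    {type = "list_item", text = "first"},
    {type = "normal", text = "continuation"},
    {type = "blank", text = ""},
    {type = "list_item", text = "second"},
})
print(#items, #items[1], #items[2])
```

Each group starts with a list_item entry and carries every following line up
to the next item, including trailing blank lines.
]==]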

		local function process_list_item(lines, block)
			while lines[#lines].type == "blank" do
				table.remove(lines)
			end

			local itemtext = lines[1].text
			for i=2,#lines do
				itemtext = itemtext .. "\n" .. outdent(lines[i].line)
			end
			if block then
				itemtext = block_transform(itemtext, true)
				if not itemtext:find("<pre>") then itemtext = indent(itemtext) end
				return " <li>" .. itemtext .. "</li>"
			else
				local lines = split(itemtext)
				lines = map(lines, classify)
				lines = lists(lines, true)
				lines = blocks_to_html(lines, true)
				itemtext = table.concat(lines, "\n")
				if not itemtext:find("<pre>") then itemtext = indent(itemtext) end
				return " <li>" .. itemtext .. "</li>"
			end
		end

		local block_list = any_blanks(arr)
		local items = split_list_items(arr)
		local out = ""
		for _, item in ipairs(items) do
			out = out .. process_list_item(item, block_list) .. "\n"
		end
		if arr[1].list_type == "numeric" then
			return "<ol>\n" .. out .. "</ol>"
		else
			return "<ul>\n" .. out .. "</ul>"
		end
	end

	-- Finds the range of lines composing the first list in the array. A list
	-- starts with (^ list_item) or (blank list_item) and ends with
	-- (blank* $) or (blank normal).
	--
	-- A sublist can start with just (list_item); it does not need a
	-- preceding blank.
	local function find_list(array, sublist)
		local function find_list_start(array, sublist)
			if array[1].type == "list_item" then return 1 end
			if sublist then
				for i = 1,#array do
					if array[i].type == "list_item" then return i end
				end
			else
				for i = 1, #array-1 do
					if array[i].type == "blank" and array[i+1].type == "list_item" then
						return i+1
					end
				end
			end
			return nil
		end
		local function find_list_end(array, start)
			local pos = #array
			for i = start, #array-1 do
				if array[i].type == "blank" and array[i+1].type ~= "list_item"
					and array[i+1].type ~= "indented" and array[i+1].type ~= "blank" then
					pos = i-1
					break
				end
			end
			while pos > start and array[pos].type == "blank" do
				pos = pos - 1
			end
			return pos
		end

		local start = find_list_start(array, sublist)
		if not start then return nil end
		return start, find_list_end(array, start)
	end

	while true do
		local start, stop = find_list(array, sublist)
		if not start then break end
		local text = process_list(splice(array, start, stop))
		local info = {
			line = text,
			type = "raw",
			html = text
		}
		array = splice(array, start, stop, {info})
	end

	-- Convert any remaining list items to normal
	for _,line in ipairs(array) do
		if line.type == "list_item" then line.type = "normal" end
	end

	return array
end
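
-- Illustrative sketch, not part of the original module: the list-range rule
-- documented above find_list, restated on a plain array of type names rather
-- than on classified line tables. The name demo_list_range and the
-- string-array input are assumptions made for this example only.
local function demo_list_range(types, sublist)
	-- Locate the first list item that is allowed to open a list.
	local start
	if types[1] == "list_item" then
		start = 1
	elseif sublist then
		for i = 1, #types do
			if types[i] == "list_item" then start = i break end
		end
	else
		for i = 1, #types - 1 do
			if types[i] == "blank" and types[i+1] == "list_item" then
				start = i + 1
				break
			end
		end
	end
	if not start then return nil end

	-- The list ends before a blank that is followed by anything other than
	-- a list item, an indented line or another blank.
	local stop = #types
	for i = start, #types - 1 do
		local nxt = types[i+1]
		if types[i] == "blank" and nxt ~= "list_item"
			and nxt ~= "indented" and nxt ~= "blank" then
			stop = i - 1
			break
		end
	end
	-- Trailing blanks are not part of the list.
	while stop > start and types[stop] == "blank" do stop = stop - 1 end
	return start, stop
end
-- Usage: demo_list_range({"normal", "blank", "list_item", "normal", "blank", "normal"})
-- returns 3, 4. The item opens after the blank, and the lazy continuation
-- line at index 4 still belongs to the list.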

-- Find and convert blockquote markers.
function blockquotes(lines)
	local function find_blockquote(lines)
		local start
		for i,line in ipairs(lines) do
			if line.type == "blockquote" then
				start = i
				break
			end
		end
		if not start then return nil end

		local stop = #lines
		for i = start+1, #lines do
			if lines[i].type == "blank" or lines[i].type == "blockquote" then
				-- Blank and quoted lines may still belong to the quote; keep scanning.
			elseif lines[i].type == "normal" then
				if lines[i-1].type == "blank" then stop = i-1 break end
			else
				stop = i-1 break
			end
		end
		while lines[stop].type == "blank" do stop = stop - 1 end
		return start, stop
	end

	local function process_blockquote(lines)
		local raw = lines[1].text
		for i = 2,#lines do
			raw = raw .. "\n" .. lines[i].text
		end
		local bt = block_transform(raw)
		if not bt:find("<pre>") then bt = indent(bt) end
		return "<blockquote>\n " .. bt ..
			"\n</blockquote>"
	end

	while true do
		local start, stop = find_blockquote(lines)
		if not start then break end
		local text = process_blockquote(splice(lines, start, stop))
		local info = {
			line = text,
			type = "raw",
			html = text
		}
		lines = splice(lines, start, stop, {info})
	end
	return lines
end

-- Find and convert codeblocks.
function codeblocks(lines)
	local function find_codeblock(lines)
		local start
		for i,line in ipairs(lines) do
			if line.type == "indented" then start = i break end
		end
		if not start then return nil end

		local stop = #lines
		for i = start+1, #lines do
			if lines[i].type ~= "indented" and lines[i].type ~= "blank" then
				stop = i-1
				break
			end
		end
		while lines[stop].type == "blank" do stop = stop - 1 end
		return start, stop
	end

	local function process_codeblock(lines)
		local raw = detab(encode_code(outdent(lines[1].line)))
		for i = 2,#lines do
			raw = raw .. "\n" .. detab(encode_code(outdent(lines[i].line)))
		end
		return "<pre><code>" .. raw .. "\n</code></pre>"
	end

	while true do
		local start, stop = find_codeblock(lines)
		if not start then break end
		local text = process_codeblock(splice(lines, start, stop))
		local info = {
			line = text,
			type = "raw",
			html = text
		}
		lines = splice(lines, start, stop, {info})
	end
	return lines
end
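
-- Illustrative sketch, not part of the original module: the "find a run,
-- collapse it into one raw entry" pattern used by codeblocks(), written as a
-- single pass instead of repeated find/splice rounds. The name
-- demo_collapse_indented is hypothetical, and the outdent, encode_code and
-- detab steps are deliberately left out so the example runs on its own.
local function demo_collapse_indented(lines)
	local out = {}
	local i = 1
	while i <= #lines do
		if lines[i].type == "indented" then
			-- Extend the run over indented and blank lines, but remember
			-- the last indented line so trailing blanks stay outside it.
			local stop = i
			local j = i + 1
			while j <= #lines and (lines[j].type == "indented" or lines[j].type == "blank") do
				if lines[j].type == "indented" then stop = j end
				j = j + 1
			end
			local body = {}
			for k = i, stop do body[#body + 1] = lines[k].line end
			local html = "<pre><code>" .. table.concat(body, "\n") .. "\n</code></pre>"
			out[#out + 1] = { type = "raw", line = html, html = html }
			i = stop + 1
		else
			out[#out + 1] = lines[i]
			i = i + 1
		end
	end
	return out
end
-- Usage: for the types normal, indented, blank, indented, blank, normal the
-- result has four entries; the blank between the two indented lines ends up
-- inside the single raw entry, while the trailing blank stays separate.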

-- Convert lines to html code
function blocks_to_html(lines, no_paragraphs)
	local out = {}
	local i = 1
	while i <= #lines do
		local line = lines[i]
		if line.type == "ruler" then
			table.insert(out, "<hr/>")
		elseif line.type == "raw" then
			table.insert(out, line.html)
		elseif line.type == "normal" then
			local s = line.line

			while i+1 <= #lines and lines[i+1].type == "normal" do
				i = i + 1
				s = s .. "\n" .. lines[i].line
			end

			if no_paragraphs then
				table.insert(out, span_transform(s))
			else
				table.insert(out, "<p>" .. span_transform(s) .. "</p>")
			end
		elseif line.type == "header" then
			local s = "<h" .. line.level .. ">" .. span_transform(line.text) .. "</h" .. line.level .. ">"
			table.insert(out, s)
		else
			table.insert(out, line.line)
		end
		i = i + 1
	end
	return out
end
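
-- Illustrative sketch, not part of the original module: how blocks_to_html
-- merges consecutive "normal" lines into one paragraph. The name
-- demo_paragraphs is hypothetical, and span_transform is replaced by the
-- identity so the example runs on its own.
local function demo_paragraphs(lines, no_paragraphs)
	local out = {}
	local i = 1
	while i <= #lines do
		local line = lines[i]
		if line.type == "normal" then
			local s = line.line
			-- Absorb the following normal lines into the same paragraph.
			while i + 1 <= #lines and lines[i+1].type == "normal" do
				i = i + 1
				s = s .. "\n" .. lines[i].line
			end
			if no_paragraphs then
				table.insert(out, s)
			else
				table.insert(out, "<p>" .. s .. "</p>")
			end
		else
			-- Anything else passes through unchanged in this sketch.
			table.insert(out, line.line)
		end
		i = i + 1
	end
	return out
end
-- Usage: two normal lines "a" and "b", a blank, then a normal line "c" give
-- three output entries: one paragraph holding "a\nb", the blank line, and a
-- paragraph holding "c". With no_paragraphs set, the <p> wrappers are omitted.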

-- Perform all the block level transforms
function block_transform(text, sublist)
	local lines = split(text)
	lines = map(lines, classify)
	lines = headers(lines)
	lines = lists(lines, sublist)
	lines = codeblocks(lines)
	lines = blockquotes(lines)
	lines = blocks_to_html(lines)
	local text = table.concat(lines, "\n")
	return text
end

-- Debug function for printing a line array to see the result
-- of partial transforms.
function print_lines(lines)
	for i, line in ipairs(lines) do
		print(i, line.type, line.text or line.line)
	end
end

----------------------------------------------------------------------
-- Span transform
----------------------------------------------------------------------

-- Functions for transforming the text at the span level.

-- These characters may need to be escaped because they have a special
-- meaning in markdown.
escape_chars = "'\\`*_{}[]()>#+-.!'"
escape_table = {}

function init_escape_table()
	escape_table = {}
	for i = 1,#escape_chars do
		local c = escape_chars:sub(i,i)
		escape_table[c] = hash(c)
	end
end

-- Adds a new escape to the escape table.
function add_escape(text)
	if not escape_table[text] then
		escape_table[text] = hash(text)
	end
	return escape_table[text]
end

-- Escape characters that should not be disturbed by markdown.
function escape_special_chars(text)
	local tokens = tokenize_html(text)

	local out = ""
	for _, token in ipairs(tokens) do
		local t = token.text
		if token.type == "tag" then
			-- In tags, encode * and _ so they don't conflict with their use in markdown.
			t = t:gsub("%*", escape_table["*"])
			t = t:gsub("%_", escape_table["_"])
		else
			t = encode_backslash_escapes(t)
		end
		out = out .. t
	end
	return out
end

-- Encode backslash-escaped characters in the markdown source.
function encode_backslash_escapes(t)
	for i=1,escape_chars:len() do
		local c = escape_chars:sub(i,i)
		t = t:gsub("\\%" .. c, escape_table[c])
	end
	return t
end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
834
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
-- Unescape characters that have been encoded.
function unescape_special_chars(t)
    local tin = t
    for k,v in pairs(escape_table) do
        -- Double any % so it is taken literally in the replacement string.
        k = k:gsub("%%", "%%%%")
        t = t:gsub(v,k)
    end
    -- Repeat until a pass changes nothing.
    if t ~= tin then t = unescape_special_chars(t) end
    return t
end

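-- The escape/unescape pair above can be tried in isolation with the sketch
-- below. It is only an illustration: demo_chars, demo_table, demo_encode and
-- demo_decode are hypothetical stand-ins, and the placeholder format
-- ("ESC<byte>X") is an assumption, not the format used by the real
-- escape_table, which is built elsewhere in this file.

```lua
-- Hypothetical stand-ins for escape_chars and escape_table.
local demo_chars = "*_`\\"
local demo_table = {}
for i = 1, #demo_chars do
    local c = demo_chars:sub(i, i)
    -- Placeholder made of letters and digits only, so it needs no pattern escaping.
    demo_table[c] = "ESC" .. c:byte() .. "X"
end

-- Replace each backslash-escaped character with its placeholder.
local function demo_encode(t)
    for i = 1, #demo_chars do
        local c = demo_chars:sub(i, i)
        -- "\\%" .. c matches a literal backslash followed by the literal character c.
        t = t:gsub("\\%" .. c, demo_table[c])
    end
    return t
end

-- Turn every placeholder back into the character it stands for.
local function demo_decode(t)
    for c, placeholder in pairs(demo_table) do
        -- Double any % so gsub inserts the replacement literally.
        local repl = c:gsub("%%", "%%%%")
        t = t:gsub(placeholder, repl)
    end
    return t
end

print(demo_encode("a \\* b"))               -- the escaped star becomes a placeholder
print(demo_decode(demo_encode("a \\* b")))  -- the placeholder becomes a plain star
```

-- Note that the round trip drops the backslash: an escaped character comes
-- back as the bare character, which is what the final HTML should contain.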
-- Encode/escape certain characters inside Markdown code runs.
-- The point is that in code, these characters are literals,
-- and lose their special Markdown meanings.
function encode_code(s)
    s = s:gsub("%&", "&amp;")
    s = s:gsub("<", "&lt;")
    s = s:gsub(">", "&gt;")
    for k,v in pairs(escape_table) do
        s = s:gsub("%"..k, v)
    end
    return s
end

-- Handle backtick blocks.
function code_spans(s)
    s = s:gsub("\\\\", escape_table["\\"])
    s = s:gsub("\\`", escape_table["`"])

    local pos = 1
    while true do
        local start, stop = s:find("`+", pos)
        if not start then return s end
        local count = stop - start + 1
        -- Find a matching number of backticks
        local estart, estop = s:find(string.rep("`", count), stop+1)
        local brstart = s:find("\n", stop+1)
        if estart and (not brstart or estart < brstart) then
            local code = s:sub(stop+1, estart-1)
            code = code:gsub("^[ \t]+", "")
            code = code:gsub("[ \t]+$", "")
            code = code:gsub(escape_table["\\"], escape_table["\\"] .. escape_table["\\"])
            code = code:gsub(escape_table["`"], escape_table["\\"] .. escape_table["`"])
            code = "<code>" .. encode_code(code) .. "</code>"
            code = add_escape(code)
            s = s:sub(1, start-1) .. code .. s:sub(estop+1)
            pos = start + code:len()
        else
            pos = stop + 1
        end
    end
    return s
end

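-- The core of code_spans is pairing an opening run of backticks with a
-- closing run of the same length. A minimal sketch of just that step follows;
-- find_code_span is a hypothetical helper, and it leaves out the newline
-- check, whitespace trimming and escaping that the real function performs.

```lua
-- Hypothetical helper: returns the code between the first backtick run at or
-- after init and the next run of the same length, plus the span's bounds.
local function find_code_span(s, init)
    local start, stop = s:find("`+", init or 1)
    if not start then return nil end
    local fence = string.rep("`", stop - start + 1)
    -- plain = true: look for the fence literally, without pattern matching.
    local estart, estop = s:find(fence, stop + 1, true)
    if not estart then return nil end
    return s:sub(stop + 1, estart - 1), start, estop
end

print(find_code_span("a `x` b"))
print(find_code_span("``a ` b``"))
```

-- Using a longer fence is what lets a code span contain a literal backtick,
-- as in the second call.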
-- Encode alt text: encodes &, " and <.
function encode_alt(s)
    if not s then return s end
    s = s:gsub('&', '&amp;')
    s = s:gsub('"', '&quot;')
    s = s:gsub('<', '&lt;')
    return s
end

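-- The order of the substitutions in encode_alt matters: & has to be encoded
-- first, otherwise the ampersands introduced by &quot; and &lt; would be
-- encoded a second time. The sketch below repeats the same three steps under
-- a hypothetical name (encode_attr) so it can be run on its own.

```lua
-- Hypothetical standalone copy of the three substitutions, & first.
local function encode_attr(s)
    s = s:gsub("&", "&amp;")
    s = s:gsub('"', "&quot;")
    s = s:gsub("<", "&lt;")
    return s
end

print(encode_attr('a & "b" <c'))
```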
-- Handle image references
function images(text)
    local function reference_link(alt, id)
        local raw_alt = alt:match("%b[]"):sub(2,-2)
        alt = encode_alt(raw_alt)
        id = id:match("%[(.*)%]"):lower()
        -- An empty id is an implicit reference: use the alt text as the id.
        if id == "" then id = raw_alt:lower() end
        link_database[id] = link_database[id] or {}
        if not link_database[id].url then return nil end
        local url = link_database[id].url or id
        url = encode_alt(url)
        local title = encode_alt(link_database[id].title)
        if title then title = " title=\"" .. title .. "\"" else title = "" end
        return add_escape ('<img src="' .. url .. '" alt="' .. alt .. '"' .. title .. "/>")
    end

    local function inline_link(alt, link)
        alt = encode_alt(alt:match("%b[]"):sub(2,-2))
        local url, title = link:match("%(<?(.-)>?[ \t]*['\"](.+)['\"]")
        url = url or link:match("%(<?(.-)>?%)")
        url = encode_alt(url)
        title = encode_alt(title)
        if title then
            return add_escape('<img src="' .. url .. '" alt="' .. alt .. '" title="' .. title .. '"/>')
        else
            return add_escape('<img src="' .. url .. '" alt="' .. alt .. '"/>')
        end
    end

    text = text:gsub("!(%b[])[ \t]*\n?[ \t]*(%b[])", reference_link)
    text = text:gsub("!(%b[])(%b())", inline_link)
    return text
end

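-- Both inline_link helpers split the parenthesised part of a link into a URL
-- and an optional quoted title using the same two patterns. The sketch below
-- isolates that step; parse_inline is a hypothetical name and the URLs are
-- made-up examples.

```lua
-- Hypothetical helper: takes the "(...)" part of an inline link or image.
local function parse_inline(link)
    -- First try "(url 'title')"; the URL may be wrapped in <>.
    local url, title = link:match("%(<?(.-)>?[ \t]*['\"](.+)['\"]")
    -- Fall back to "(url)" when there is no quoted title.
    url = url or link:match("%(<?(.-)>?%)")
    return url, title
end

print(parse_inline('(http://x.org/a.png "A title")'))
print(parse_inline("(<http://x.org/>)"))
```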
-- Handle anchor references
function anchors(text)
    local function reference_link(text, id)
        text = text:match("%b[]"):sub(2,-2)
        id = id:match("%b[]"):sub(2,-2):lower()
        if id == "" then id = text:lower() end
        link_database[id] = link_database[id] or {}
        if not link_database[id].url then return nil end
        local url = link_database[id].url or id
        url = encode_alt(url)
        local title = encode_alt(link_database[id].title)
        if title then title = " title=\"" .. title .. "\"" else title = "" end
        return add_escape("<a href=\"" .. url .. "\"" .. title .. ">") .. text .. add_escape("</a>")
    end

    local function inline_link(text, link)
        text = text:match("%b[]"):sub(2,-2)
        local url, title = link:match("%(<?(.-)>?[ \t]*['\"](.+)['\"]")
        title = encode_alt(title)
        url = url or link:match("%(<?(.-)>?%)") or ""
        url = encode_alt(url)
        if title then
            return add_escape("<a href=\"" .. url .. "\" title=\"" .. title .. "\">") .. text .. "</a>"
        else
            return add_escape("<a href=\"" .. url .. "\">") .. text .. add_escape("</a>")
        end
    end

    text = text:gsub("(%b[])[ \t]*\n?[ \t]*(%b[])", reference_link)
    text = text:gsub("(%b[])(%b())", inline_link)
    return text
end

-- Handle auto links, i.e. <http://www.google.com/>.
function auto_links(text)
    local function link(s)
        return add_escape("<a href=\"" .. s .. "\">") .. s .. "</a>"
    end
    -- Encode chars as a mix of dec and hex entities to (perhaps) fool
    -- spambots.
    local function encode_email_address(s)
        -- Use a deterministic encoding to make unit testing possible.
        -- Code 45% hex, 45% dec, 10% plain.
        local hex = {code = function(c) return "&#x" .. string.format("%x", c:byte()) .. ";" end, count = 1, rate = 0.45}
        local dec = {code = function(c) return "&#" .. c:byte() .. ";" end, count = 0, rate = 0.45}
        local plain = {code = function(c) return c end, count = 0, rate = 0.1}
        local codes = {hex, dec, plain}
        local function swap(t,k1,k2) local temp = t[k2] t[k2] = t[k1] t[k1] = temp end

        local out = ""
        for i = 1,s:len() do
            for _,code in ipairs(codes) do code.count = code.count + code.rate end
            if codes[1].count < codes[2].count then swap(codes,1,2) end
            if codes[2].count < codes[3].count then swap(codes,2,3) end
            if codes[1].count < codes[2].count then swap(codes,1,2) end

            local code = codes[1]
            local c = s:sub(i,i)
            -- Force encoding of "@" to make the email address harder to spot.
            if c == "@" and code == plain then code = codes[2] end
            out = out .. code.code(c)
            code.count = code.count - 1
        end
        return out
    end
    local function mail(s)
        s = unescape_special_chars(s)
        local address = encode_email_address("mailto:" .. s)
        local text = encode_email_address(s)
        return add_escape("<a href=\"" .. address .. "\">") .. text .. "</a>"
    end
    -- links
    text = text:gsub("<(https?:[^'\">%s]+)>", link)
    text = text:gsub("<(ftp:[^'\">%s]+)>", link)

    -- mail
    text = text:gsub("<mailto:([^'\">%s]+)>", mail)
    text = text:gsub("<([-.%w]+%@[-.%w]+)>", mail)
    return text
end

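-- The email obfuscation above gives each encoding a running "credit" that
-- grows by its rate on every character; the encoding with the most credit is
-- used and then charged one unit. The sketch below is a standalone copy of
-- that scheme for experimentation. encode_entities and decode_entities are
-- hypothetical names, the decoder is an addition for checking the result,
-- and the address is a made-up example.

```lua
-- Hypothetical standalone copy of the deterministic entity mixer.
local function encode_entities(s)
    local hex   = {code = function(c) return string.format("&#x%x;", c:byte()) end, count = 1, rate = 0.45}
    local dec   = {code = function(c) return "&#" .. c:byte() .. ";" end, count = 0, rate = 0.45}
    local plain = {code = function(c) return c end, count = 0, rate = 0.1}
    local codes = {hex, dec, plain}
    local function swap(t, a, b) t[a], t[b] = t[b], t[a] end

    local out = {}
    for i = 1, #s do
        for _, code in ipairs(codes) do code.count = code.count + code.rate end
        -- Three compare-and-swap steps order the three entries, largest credit first.
        if codes[1].count < codes[2].count then swap(codes, 1, 2) end
        if codes[2].count < codes[3].count then swap(codes, 2, 3) end
        if codes[1].count < codes[2].count then swap(codes, 1, 2) end

        local code = codes[1]
        local c = s:sub(i, i)
        -- "@" is never emitted as a plain character.
        if c == "@" and code == plain then code = codes[2] end
        out[#out + 1] = code.code(c)
        code.count = code.count - 1
    end
    return table.concat(out)
end

-- Hypothetical decoder, used only to confirm that the encoding is lossless.
local function decode_entities(s)
    s = s:gsub("&#x(%x+);", function(h) return string.char(tonumber(h, 16)) end)
    s = s:gsub("&#(%d+);", function(d) return string.char(tonumber(d)) end)
    return s
end

local encoded = encode_entities("mailto:user@example.com")
print(encoded)
print(decode_entities(encoded))
```

-- Because no random numbers are involved, the same input always produces the
-- same output, which is what makes the function testable.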
-- Encode free standing amps (&) and angles (<)... note that this does not
-- encode free >.
function amps_and_angles(s)
    -- encode amps not part of &..; expression
    local pos = 1
    while true do
        local amp = s:find("&", pos)
        if not amp then break end
        local semi = s:find(";", amp+1)
        local stop = s:find("[ \t\n&]", amp+1)
        if not semi or (stop and stop < semi) or (semi - amp) > 15 then
            s = s:sub(1,amp-1) .. "&amp;" .. s:sub(amp+1)
            pos = amp+1
        else
            pos = amp+1
        end
    end

    -- encode naked <'s
    s = s:gsub("<([^a-zA-Z/?$!])", "&lt;%1")
    s = s:gsub("<$", "&lt;")

    -- Free > characters are left alone, as in the original Markdown source.
    return s
end

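-- The test that decides whether an & already starts an entity can be read on
-- its own: there must be a ; ahead, no whitespace or further & before it, and
-- it must be at most 15 characters away. The sketch below wraps that test in
-- a hypothetical helper, is_entity_start, with the condition inverted so
-- that true means "leave this & alone".

```lua
-- Hypothetical helper: amp is the index of an "&" in s.
local function is_entity_start(s, amp)
    local semi = s:find(";", amp + 1, true)
    local stop = s:find("[ \t\n&]", amp + 1)
    return semi ~= nil and not (stop and stop < semi) and (semi - amp) <= 15
end

print(is_entity_start("&amp; x", 1))     -- an existing entity
print(is_entity_start("AT&T rocks", 3))  -- no semicolon follows
print(is_entity_start("a & b; c", 3))    -- whitespace comes before the semicolon
```

-- The check is purely positional: it does not verify that the text between
-- & and ; is a known entity name.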
-- Handles emphasis markers (* and _) in the text.
function emphasis(text)
    for _, s in ipairs {"%*%*", "%_%_"} do
        text = text:gsub(s .. "([^%s][%*%_]?)" .. s, "<strong>%1</strong>")
        text = text:gsub(s .. "([^%s][^<>]-[^%s][%*%_]?)" .. s, "<strong>%1</strong>")
    end
    for _, s in ipairs {"%*", "%_"} do
        text = text:gsub(s .. "([^%s_])" .. s, "<em>%1</em>")
        text = text:gsub(s .. "(<strong>[^%s_]</strong>)" .. s, "<em>%1</em>")
        text = text:gsub(s .. "([^%s_][^<>_]-[^%s_])" .. s, "<em>%1</em>")
        text = text:gsub(s .. "([^<>_]-<strong>[^<>_]-</strong>[^<>_]-)" .. s, "<em>%1</em>")
    end
    return text
end

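-- emphasis handles the doubled markers before the single ones, so that
-- **word** is not consumed as two nested single-star spans. The sketch below
-- is a deliberately reduced version that only knows the * marker and needs at
-- least two characters between markers; simple_emphasis is a hypothetical
-- name and the patterns are simplified, not the ones used above.

```lua
-- Hypothetical, reduced version: strong first, then em.
local function simple_emphasis(text)
    text = text:gsub("%*%*([^%s][^<>]-[^%s])%*%*", "<strong>%1</strong>")
    text = text:gsub("%*([^%s*][^<>*]-[^%s*])%*", "<em>%1</em>")
    return text
end

print(simple_emphasis("a **bold** and *it* b"))
```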
-- Handles line break markers in the text.
-- Two or more trailing spaces before a newline become a <br/>.
function line_breaks(text)
    return text:gsub("  +\n", " <br/>\n")
end

-- Perform all span level transforms.
function span_transform(text)
    text = code_spans(text)
    text = escape_special_chars(text)
    text = images(text)
    text = anchors(text)
    text = auto_links(text)
    text = amps_and_angles(text)
    text = emphasis(text)
    text = line_breaks(text)
    return text
end

----------------------------------------------------------------------
-- Markdown
----------------------------------------------------------------------

-- Clean up the text by normalizing some possible variations to make further
-- processing easier.
function cleanup(text)
    -- Standardize line endings
    text = text:gsub("\r\n", "\n") -- DOS to UNIX
    text = text:gsub("\r", "\n") -- Mac to UNIX

    -- Convert all tabs to spaces
    text = detab(text)

    -- Strip lines with only spaces and tabs
    while true do
        local subs
        text, subs = text:gsub("\n[ \t]+\n", "\n\n")
        if subs == 0 then break end
    end

    return "\n" .. text .. "\n"
end

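-- cleanup strips whitespace-only lines in a loop rather than with a single
-- gsub because matches cannot overlap: the newline that ends one blank line
-- is consumed by the match and so cannot also start the next one. The sketch
-- below shows the two normalization steps in isolation; normalize_newlines
-- and strip_blank_lines are hypothetical names, and detab is left out.

```lua
-- Hypothetical helper: DOS and old Mac line endings to UNIX.
local function normalize_newlines(text)
    text = text:gsub("\r\n", "\n")
    text = text:gsub("\r", "\n")
    return text
end

-- Hypothetical helper: empty out lines that contain only spaces and tabs.
local function strip_blank_lines(text)
    repeat
        local subs
        text, subs = text:gsub("\n[ \t]+\n", "\n\n")
    until subs == 0
    return text
end

-- One pass leaves the second whitespace-only line untouched.
local once = ("a\n \n\t\nb"):gsub("\n[ \t]+\n", "\n\n")
print(once == "a\n\n\t\nb")
-- Looping until nothing changes clears both.
print(strip_blank_lines("a\n \n\t\nb") == "a\n\n\nb")
```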
-- Strips link definitions from the text and stores the data in a lookup table.
function strip_link_definitions(text)
    local linkdb = {}

    local function link_def(id, url, title)
        id = id:match("%[(.+)%]"):lower()
        linkdb[id] = linkdb[id] or {}
        linkdb[id].url = url or linkdb[id].url
        linkdb[id].title = title or linkdb[id].title
        return ""
    end

    local def_no_title = "\n ? ? ?(%b[]):[ \t]*\n?[ \t]*<?([^%s>]+)>?[ \t]*"
    local def_title1 = def_no_title .. "[ \t]+\n?[ \t]*[\"'(]([^\n]+)[\"')][ \t]*"
    local def_title2 = def_no_title .. "[ \t]*\n[ \t]*[\"'(]([^\n]+)[\"')][ \t]*"
    local def_title3 = def_no_title .. "[ \t]*\n?[ \t]+[\"'(]([^\n]+)[\"')][ \t]*"

    text = text:gsub(def_title1, link_def)
    text = text:gsub(def_title2, link_def)
    text = text:gsub(def_title3, link_def)
    text = text:gsub(def_no_title, link_def)
    return text, linkdb
end
parents:
diff changeset
1118
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1119 link_database = {}
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1120
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1121 -- Main markdown processing function
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1122 function markdown(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1123 init_hash(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1124 init_escape_table()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1125
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1126 text = cleanup(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1127 text = protect(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1128 text, link_database = strip_link_definitions(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1129 text = block_transform(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1130 text = unescape_special_chars(text)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1131 return text
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1132 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1133
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1134 ----------------------------------------------------------------------
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1135 -- End of module
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1136 ----------------------------------------------------------------------
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1137
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1138 setfenv(1, _G)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1139 M.lock(M)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1140
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1141 -- Expose markdown function to the world
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1142 markdown = M.markdown
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1143
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1144 -- Class for parsing command-line options
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1145 local OptionParser = {}
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1146 OptionParser.__index = OptionParser
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1147
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1148 -- Creates a new option parser
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1149 function OptionParser:new()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1150 local o = {short = {}, long = {}}
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1151 setmetatable(o, self)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1152 return o
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1153 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1154
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1155 -- Calls f() whenever a flag with specified short and long name is encountered
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1156 function OptionParser:flag(short, long, f)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1157 local info = {type = "flag", f = f}
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1158 if short then self.short[short] = info end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1159 if long then self.long[long] = info end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1160 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1161
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1162 -- Calls f(param) whenever a parameter flag with specified short and long name is encountered
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1163 function OptionParser:param(short, long, f)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1164 local info = {type = "param", f = f}
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1165 if short then self.short[short] = info end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1166 if long then self.long[long] = info end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1167 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1168
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1169 -- Calls f(v) for each non-flag argument
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1170 function OptionParser:arg(f)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1171 self.arg = f
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1172 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1173
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1174 -- Runs the option parser for the specified set of arguments. Returns true if all arguments
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1175 -- were successfully parsed and false otherwise.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1176 function OptionParser:run(args)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1177 local pos = 1
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1178 while pos <= #args do
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1179 local arg = args[pos]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1180 if arg == "--" then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1181 for i=pos+1,#args do
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1182 if self.arg then self.arg(args[i]) end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1183 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1184 return true
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1185 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1186 if arg:match("^%-%-") then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1187 local info = self.long[arg:sub(3)]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1188 if not info then print("Unknown flag: " .. arg) return false end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1189 if info.type == "flag" then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1190 info.f()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1191 pos = pos + 1
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1192 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1193 local param = args[pos+1]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1194 if not param then print("No parameter for flag: " .. arg) return false end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1195 info.f(param)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1196 pos = pos+2
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1197 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1198 elseif arg:match("^%-") then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1199 for i=2,arg:len() do
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1200 local c = arg:sub(i,i)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1201 local info = self.short[c]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1202 if not info then print("Unknown flag: -" .. c) return false end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1203 if info.type == "flag" then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1204 info.f()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1205 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1206 if i == arg:len() then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1207 local param = args[pos+1]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1208 if not param then print("No parameter for flag: -" .. c) return false end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1209 info.f(param)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1210 pos = pos + 1
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1211 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1212 local param = arg:sub(i+1)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1213 info.f(param)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1214 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1215 break
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1216 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1217 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1218 pos = pos + 1
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1219 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1220 if self.arg then self.arg(arg) end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1221 pos = pos + 1
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1222 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1223 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1224 return true
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1225 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1226
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1227 -- Handles the case when markdown is run from the command line
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1228 local function run_command_line(arg)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1229 -- Generate output for input s given options
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1230 local function run(s, options)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1231 s = markdown(s)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1232 if not options.wrap_header then return s end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1233 local header = ""
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1234 if options.header then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1235 local f = io.open(options.header) or error("Could not open file: " .. options.header)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1236 header = f:read("*a")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1237 f:close()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1238 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1239 header = [[
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1240 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1241 <html>
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1242 <head>
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1243 <meta http-equiv="content-type" content="text/html; charset=CHARSET" />
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1244 <title>TITLE</title>
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1245 <link rel="stylesheet" type="text/css" href="STYLESHEET" />
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1246 </head>
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1247 <body>
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1248 ]]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1249 local title = options.title or s:match("<h1>(.-)</h1>") or s:match("<h2>(.-)</h2>") or
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1250 s:match("<h3>(.-)</h3>") or "Untitled"
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1251 header = header:gsub("TITLE", title)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1252 if options.inline_style then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1253 local style = ""
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1254 local f = io.open(options.stylesheet)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1255 if f then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1256 style = f:read("*a") f:close()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1257 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1258 error("Could not include style sheet " .. options.stylesheet .. ": File not found")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1259 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1260 header = header:gsub('<link rel="stylesheet" type="text/css" href="STYLESHEET" />',
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1261 "<style type=\"text/css\"><!--\n" .. style .. "\n--></style>")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1262 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1263 header = header:gsub("STYLESHEET", options.stylesheet)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1264 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1265 header = header:gsub("CHARSET", options.charset)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1266 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1267 local footer = "</body></html>"
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1268 if options.footer then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1269 local f = io.open(options.footer) or error("Could not open file: " .. options.footer)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1270 footer = f:read("*a")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1271 f:close()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1272 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1273 return header .. s .. footer
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1274 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1275
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1276 -- Generate output path name from input path name given options.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1277 local function outpath(path, options)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1278 if options.append then return path .. ".html" end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1279 local m = path:match("^(.+%.html)[^/\\]+$") if m then return m end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1280 m = path:match("^(.+%.)[^/\\]*$") if m and path ~= m .. "html" then return m .. "html" end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1281 return path .. ".html"
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1282 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1283
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1284 -- Default commandline options
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1285 local options = {
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1286 wrap_header = true,
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1287 header = nil,
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1288 footer = nil,
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1289 charset = "utf-8",
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1290 title = nil,
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1291 stylesheet = "default.css",
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1292 inline_style = false
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1293 }
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1294 local help = [[
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1295 Usage: markdown.lua [OPTION] [FILE]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1296 Runs the markdown text markup to HTML converter on each file specified on the
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1297 command line. If no files are specified, runs on standard input.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1298
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1299 No header:
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1300 -n, --no-wrap Don't wrap the output in <html>... tags.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1301 Custom header:
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1302 -e, --header FILE Use content of FILE for header.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1303 -f, --footer FILE Use content of FILE for footer.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1304 Generated header:
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1305 -c, --charset SET Specifies charset (default utf-8).
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1306 -i, --title TITLE Specifies title (default from first <h1> tag).
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1307 -s, --style STYLE Specifies style sheet file (default default.css).
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1308 -l, --inline-style Include the style sheet file inline in the header.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1309 Generated files:
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1310 -a, --append Append .html extension (instead of replacing).
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1311 Other options:
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1312 -h, --help Print this help text.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1313 -t, --test Run the unit tests.
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1314 ]]
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1315
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1316 local run_stdin = true
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1317 local op = OptionParser:new()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1318 op:flag("n", "no-wrap", function () options.wrap_header = false end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1319 op:param("e", "header", function (x) options.header = x end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1320 op:param("f", "footer", function (x) options.footer = x end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1321 op:param("c", "charset", function (x) options.charset = x end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1322 op:param("i", "title", function(x) options.title = x end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1323 op:param("s", "style", function(x) options.stylesheet = x end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1324 op:flag("l", "inline-style", function() options.inline_style = true end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1325 op:flag("a", "append", function() options.append = true end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1326 op:flag("t", "test", function()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1327 local n = arg[0]:gsub("markdown%.lua$", "markdown-tests.lua")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1328 local f = io.open(n)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1329 if f then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1330 f:close() dofile(n)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1331 else
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1332 error("Cannot find markdown-tests.lua")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1333 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1334 run_stdin = false
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1335 end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1336 op:flag("h", "help", function() print(help) run_stdin = false end)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1337 op:arg(function(path)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1338 local file = io.open(path) or error("Could not open file: " .. path)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1339 local s = file:read("*a")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1340 file:close()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1341 s = run(s, options)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1342 file = io.open(outpath(path, options), "w") or error("Could not open output file: " .. outpath(path, options))
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1343 file:write(s)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1344 file:close()
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1345 run_stdin = false
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1346 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1347 )
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1348
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1349 if not op:run(arg) then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1350 print(help)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1351 run_stdin = false
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1352 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1353
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1354 if run_stdin then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1355 local s = io.read("*a")
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1356 s = run(s, options)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1357 io.write(s)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1358 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1359 end
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1360
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1361 -- If we are being run from the command-line, act accordingly
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1362 if arg and arg[0]:find("markdown%.lua$") then
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1363 run_command_line(arg)
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1364 else
18
a96836139ff9 parsers.markdown: Make module callable, to allow parsing text as a module
Matthew Wild <mwild1@gmail.com>
parents: 12
diff changeset
1365 return M
0
b40ca010c49c Initial commit
Matthew Wild <mwild1@gmail.com>
parents:
diff changeset
1366 end
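
The output-path rules in `outpath` (lines 1277-1282 of the listing) are hard to read off the patterns alone. The sketch below copies that function's body into a standalone snippet, because the file keeps it local to `run_command_line` and it cannot be called from outside. It then runs the copy on a few sample paths. The file names are made up for illustration and are not part of the source.

```lua
-- Standalone copy of outpath from the listing, so it can be run on its own.
local function outpath(path, options)
	-- With --append, always add ".html" to the full input name.
	if options.append then return path .. ".html" end
	-- "name.html.ext" maps back to "name.html".
	local m = path:match("^(.+%.html)[^/\\]+$") if m then return m end
	-- Otherwise replace the last extension, unless that would yield the input itself.
	m = path:match("^(.+%.)[^/\\]*$") if m and path ~= m .. "html" then return m .. "html" end
	-- No usable extension: append ".html".
	return path .. ".html"
end

-- Hypothetical sample inputs, chosen to exercise each branch.
print(outpath("notes.md", {}))                --> notes.html
print(outpath("notes.md", {append = true}))   --> notes.md.html
print(outpath("README", {}))                  --> README.html
print(outpath("index.html.txt", {}))          --> index.html
print(outpath("page.html", {}))               --> page.html.html
print(outpath("docs/v1.2/README", {}))        --> docs/v1.2/README.html
```

The last two cases show the guards in the patterns. An input that already ends in `.html` gets a second `.html` appended, so the input file is not overwritten. A dot inside a directory name is not treated as the start of an extension, because the character classes exclude `/` and `\`.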
