Alexis Kypridemos
<h1> tags? Is each new line truly a paragraph, and how does linking content fit into this?

<h1>, <p>, <a>, and <img>. In other words, it will be possible to include top-level headings, body text, linked text, and images. There will be no support for bulleted or ordered lists, tables, or any other elements for this particular tool.

<script>
and you’re good to go. WYSIWYG editors are powerful and support all kinds of formatting, even applying CSS classes to content for styling.

<h1> and <p> tags.

<textarea>. For the output element and related styling, choices abound. The following is merely one example with some very basic CSS to place the input <textarea> on the left and an output <div> on the right:

onkeyup
event handler on the <textarea> to call a JavaScript function called convert() that does what it says: convert the plain text into HTML. The conversion function should accept one parameter, a string, for the user’s plain text input entered into the <textarea> element:

onkeyup is a better choice than onkeydown in this case, as onkeyup will call the conversion function after the user completes each keystroke, as opposed to before it happens. This way, the output, which is refreshed with each keystroke, always includes the latest typed character. If the conversion is triggered with an onkeydown handler, the output will exclude the most recent character the user typed. This can be frustrating when, for example, the user has finished typing a sentence but cannot yet see the final punctuation mark, say a period (.), in the output until typing another character first. This creates the impression of a typo, glitch, or lag when there is none.

convert()
function has the following responsibilities:

<h1> or <p> HTML tag, whichever is most appropriate.
<a> tags, and replace image file names with <img> elements.

html_encode()
convert_text_to_HTML()
convert_images_and_links_to_HTML()
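To see how these three functions might fit together, here is a minimal sketch of convert(). The helper bodies are hypothetical placeholders that return their input unchanged, and the parameter name input_string is an assumption. The point is only the order of the calls: encoding comes first so that the tags generated by the later steps are not themselves escaped.

```javascript
// Hypothetical placeholders: each step simply returns its input for now.
function html_encode(input) {
  return input;
}

function convert_text_to_HTML(txt) {
  return txt;
}

function convert_images_and_links_to_HTML(string) {
  return string;
}

// Run the three steps in order: encode, wrap lines, then add links and images.
function convert(input_string) {
  const encoded = html_encode(input_string);
  const wrapped = convert_text_to_HTML(encoded);
  return convert_images_and_links_to_HTML(wrapped);
}
```

In the page itself, the returned string would presumably be assigned to the output element, for example through its innerHTML property.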
html_encode() function to HTML encode/sanitize the input. HTML encoding refers to the process of escaping or replacing certain characters in a string input to prevent users from inserting their own HTML into the output. At a minimum, we should replace the following characters:

< with &lt;
> with &gt;
& with &amp;
' with &#39;
" with &quot;
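As a rough illustration of the replacements listed above, the following sketch chains String.prototype.replace() calls with global regular expressions. It is one possible approach that does not rely on the DOM, not necessarily how the html_encode() function in this article is written. The ampersand is replaced first; otherwise, the ampersands introduced by the other replacements would be encoded a second time.

```javascript
// A sketch of HTML encoding with plain string replacement (an assumed approach).
function html_encode(input) {
  return input
    .replace(/&/g, "&amp;")  // must run first to avoid double encoding
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/'/g, "&#39;")
    .replace(/"/g, "&quot;");
}
```

Calling html_encode() on a string that contains markup returns text that a browser will display literally instead of interpreting as HTML.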
htmlspecialchars(), htmlentities(), and strip_tags() functions. That said, it is relatively easy to write our own function that does this, and that is the job of the html_encode() function we defined earlier:

convert_text_to_HTML()
function we defined earlier to wrap each line in its respective HTML tag, which is going to be either <h1> or <p>. To determine which tag to use, we will split the text input on the newline character (\n) so that the text is processed as an array of lines rather than a single string, allowing us to evaluate each line individually.

<h1> tag on it; otherwise, we mark it up in a <p> tag.

convert_images_and_links_to_HTML()
function to encode URLs and images as HTML elements. It’s a good chunk of code, so I’ll drop it in and we’ll immediately start picking it apart together to explain how it all works.

convert_text_to_HTML() function, here we use regular expressions to identify the terms that need to be wrapped and/or replaced with <a> or <img> tags. We do this for a couple of reasons:

convert_text_to_HTML() function handles text that would be transformed to the HTML block-level elements <h1> and <p>, and, if you want, other block-level elements such as <address>. Block-level elements in the HTML output correspond to discrete lines of text in the input, which you can think of as paragraphs, the text entered between presses of the Enter key.

convert_images_and_links_to_HTML()
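function. images (plural), image (singular), and link are not reserved words in the JavaScript language itself, but W3Schools lists them among the names of HTML and Window objects and properties that are best avoided as identifiers. Consequently, imgs, img, and a_tag were used for naming. This also explains why these names are not listed on the relevant MDN page, even though they are on W3Schools.

String.prototype.match()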
function for each of the two regular expressions, then storing the results for each call in an array. From there, we use the nullish coalescing operator (??) on each call so that, if no matches are found, the result will be an empty array. If we do not do this and no matches are found, the result of each match() call will be null and will cause problems downstream.

array_unique() function.

<a> tag and performing the replacement only if the URL doesn’t match an image. We may be able to avoid having to perform this check by using a more intricate regular expression. The example code deliberately uses regular expressions that are perhaps less precise but hopefully easier to understand in an effort to keep things as simple as possible.

<img>
tags that have the src attribute set to the image file name. For example, my_image.png in the input is transformed into <img src='my_image.png'> in the output. We wrap each <img> tag with an <a> tag that links to the image file and opens it in a new tab when clicked.

<figcaption>, <cite>, or similar element. But if, for whatever reason, you are unable to provide explicit attribution, you are at least providing a link to the image source.

alt attribute. The example code I provided does add an alt attribute in the conversion but does not populate it with a value, as there is no easy way to automatically calculate what that value should be. An empty alt attribute can be acceptable if the image is considered “decorative,” i.e., purely supplementary to the surrounding text. But one may argue that there is no such thing as a purely decorative image.

<pre>
tag as the output element instead of a <div>:

<pre> element’s textContent instead of innerHTML:

<textarea> is parsed line-by-line and encoded into HTML that we format and display inside another element.

POST what’s entered into the <form> using a PHP script or the like. That would be a great exercise, and if you do it, please share your work with me in the comments because I’d love to check it out.