This page describes how to set up an HTML file for ingest into the Manifold Reader. While some of the more common questions that come up while working in this format are also considered, this page is not meant to be a primer on HTML generally. That kind of nuance and detail is well beyond the scope of this documentation.
For a deeper dive into HTML, we suggest the MDN Web Docs as a starting point and regular reference.
This walkthrough is intended for users who have credentials to create or edit Projects in the Manifold backend.
Our goal is to make ingesting HTML as low impact on you as possible. In most cases you’ll be able to ingest your HTML file as-is or with only minimal adjustments, as described later on this page in the Standard Adjustments section.
That said, while Manifold is very accommodating of different elements within an HTML file, some things are intentionally ignored by Manifold at the time of ingest.
Specifically, Manifold doesn’t allow scripts or media queries, nor does it support custom fonts, allow anything that will break the system’s responsive design, or accept certain proscribed CSS selectors.
There is some wiggle room—some of these restrictions are softer than others. For more about that, see the Workarounds section below.
How do I edit HTML? What am I actually loading into the system at the time of ingest? Those two questions are the most frequently asked of us about HTML ingests. Let’s consider each in turn here:
You’ll want to avoid word processing applications like Microsoft Word, Google Docs, or WordPerfect. Word processors add hidden content to files that helps them do all the magic for which they are known. But you need to access the raw content of the file itself when editing HTML, and that’s exactly what plain-text editors are for. Among the Manifold Documentation Team, Sublime Text and Visual Studio Code are the ones we use the most.
There are three possible scenarios you can be faced with when loading HTML into Manifold to create a Text: (1) you have a single HTML file with no associated assets, like an external stylesheet or images or audio or video embeds; (2) you have a single HTML file with one or more media assets; and (3) you have multiple HTML files with or without assets. The sections below detail what you’ll need to do to get your material into Manifold based on which circumstance applies.
If you only have a single file with no local (saved on your device) assets to load, you need only select the HTML file in question at the time of Text ingest. Manifold will ingest that one file directly and create a Text from it.
You might be asking yourself why we are qualifying a single file with “no local assets” here. The reason is that you can reference remotely hosted assets from within a single HTML file.
For instance, maybe you have a number of images or videos hosted on your institution’s infrastructure or on a service like Imgur that you want to have appear in the body of your Text. You can use the standard HTML embed codes for those, with the absolute URL values modified to source your remotely hosted content from wherever it already lives online, and Manifold will be able to render that content inline with the rest of the Text. You can do likewise for a stylesheet, linking to a remotely hosted stylesheet to describe how you want your content to render.
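As a minimal sketch of a single file that relies only on remotely hosted assets, the fragment below links a stylesheet and an image by absolute URL. The example.edu addresses and file names are hypothetical placeholders, not real resources; swap in the URLs where your own content lives.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Sample Text</title>
    <!-- Remotely hosted stylesheet, referenced by absolute URL (hypothetical address). -->
    <link rel="stylesheet" href="https://example.edu/assets/text-styles.css">
  </head>
  <body>
    <p>Body text that appears before the image.</p>
    <!-- Remotely hosted image, referenced by absolute URL (hypothetical address). -->
    <img src="https://example.edu/assets/figure-01.jpg" alt="Short description of the image">
  </body>
</html>
```

Because nothing in this file points at a local path, the HTML file can be selected on its own at the time of ingest.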
If you have multimedia sitting on your machine that you want to load with your HTML, then this approach doesn’t apply, and instead you’ll want to read the section that follows immediately below about ingesting a single file with one or more assets.
When you have a number of assets (like a stylesheet or images, audio, or video files) that you need to load into Manifold along with your HTML file—because you want those elements to appear inline with the surrounding body text—it is not immediately clear how to get all of that into the system.
Fortunately the solution is a pretty simple one: you just need to zip (or compress) your HTML with the associated media assets into a ZIP archive with a .zip file extension. At the time of ingest, that ZIP archive is what you will select to ingest into Manifold. The text will appear, styled in accordance with any stylesheet in the archive and with any media assets appearing inline with the body of the text, assuming, of course, that they are all properly referenced from within the HTML file (for more about that, see the Embeds section further down this page).
Zip It Good!
Most likely you have your HTML file saved in a folder with all your other related assets. Perhaps they are all floating in the same folder (directory) or maybe you have some files organized in sub-folders. When it’s time to compress your files into a ZIP archive, you want to individually select your HTML file and related files or folders and then compress that selection. If you instead just select the folder that contains all of your content generally and zip that, you will likely run into problems with the ingest.
Is your situation one where you have multiple HTML files that you want to load into Manifold at once, each file a piece (chapter, essay, etc.) of a single whole? If so, then what you have before you is a “Manifest” ingest, described in detail in the Create a Manifest Ingest walkthrough.
If, however, you have multiple HTML files that are related but not meant to stand as a single, cohesive Text, then one of the two options described in the sections above is your solution. Presently there is no means to bulk load separate Texts into Manifold in one go.
Manifold will accept for ingestion, as is, the HTML you already have access to or that is created by the export process of an app you are using. However, given the vast range of possibilities that entails, it is hard for us to say that the result you see in Manifold after ingestion will be exactly what you hoped for. Oftentimes code will need to be cleaned up or style attributes modified. Barring a close review of your specific content, we can offer a couple of common adjustments that we ourselves use regularly and which are quick and easy to apply:
The content that lives at the top of your HTML document between the <head> tags is primarily meant to be machine readable and provides information about your file (metadata) that different systems can interpret and display. Your file may have a lot of information in this space or almost none at all. The only required element for the HTML to be valid is a title; everything else that is possible is optional.
There are some attributes that Manifold looks for specifically in this section of your file that it will include as part of the Text record in the system. Specifically, Manifold can accept from this space information about the Text’s
- publication date
- rights or license information
- descriptive copy
- creator names
- contributor names
The template below provides a sample code block you can copy and paste into your own file, swapping out values between straight quote marks for information specific to your content.
Included with the template are comments, which appear between comment opening <!-- and closing --> tags. Those comments speak to specific elements and are meant to add context and helpful tips. You can remove them from your file or leave them as is; Manifold will ignore them.
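A minimal sketch of what such a template might look like follows. Only the dc.title name is confirmed on this page; the other dc.* names are assumptions that follow the same Dublin Core pattern, and every value is a placeholder to swap out for your own.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <!-- The title element is the one piece of metadata required for valid HTML. -->
    <title>Sample Text Title</title>
    <!-- dc.title is named on this page; the remaining dc.* names are assumed. -->
    <meta name="dc.title" content="Sample Text Title">
    <!-- Publication date (assumed name). -->
    <meta name="dc.date" content="2024-01-15">
    <!-- Rights or license information (assumed name). -->
    <meta name="dc.rights" content="CC BY 4.0">
    <!-- Descriptive copy (assumed name). -->
    <meta name="dc.description" content="A short description of the Text.">
    <!-- Creator names (assumed name). -->
    <meta name="dc.creator" content="Jane Doe">
    <!-- Contributor names (assumed name). -->
    <meta name="dc.contributor" content="John Roe">
  </head>
  <body>
    <p>Body content goes here.</p>
  </body>
</html>
```

After ingest, it is worth checking the Text record in the backend to confirm which of these values Manifold actually picked up.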
Because of the high degree of variability from one file to another, some even without regard for the HTML spec, Manifold sources the title of an HTML Text from a number of different places, in order of precedence according to the following:
- Meta DC Title Attribute
- HTML Title Tag
- The filename of the HTML document, capitalized and without extension
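The order of precedence above can be sketched with a head section that carries both kinds of title. This assumes the DC title is written as a meta element with name and content attributes; the title values themselves are placeholders.

```html
<head>
  <meta charset="utf-8">
  <!-- Checked first: the Dublin Core title. -->
  <meta name="dc.title" content="Title From the DC Meta Element">
  <!-- Checked second, and only used when no dc.title is present. -->
  <title>Title From the Title Tag</title>
</head>
```

With both present, the list above suggests the dc.title value is the one Manifold would use for the Text.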
Manifold will stop looking for a title as soon as it has found one. So if you don’t have a "dc.title" attribute in the <head>, Manifold will look next for the HTML <title> tag. And if there is no such tag, Manifold will take the name of the HTML file itself as the title of the Text.
A Word about Quotation Marks
If you have a title or a description, say, that contains internal quotation marks, make sure they are not standard straight quotation marks but instead true “curly” marks. Otherwise you may get an error upon ingest. Straight quotation marks are meant to surround the values of different attributes. Internal marks of the same kind will prevent the file from being parsed correctly.
This is correct, with a curly mark indicating the possessive:
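A sketch of the correct form, assuming a dc.description meta element (the name is an assumption and the wording is a placeholder): the straight quotation marks wrap the attribute value, while the possessive and the quoted phrase inside it use curly marks.

```html
<!-- Straight marks delimit the value; the marks inside the value are curly. -->
<meta name="dc.description" content="A reader’s guide to “close reading” in the digital age.">
```

If the inner marks were straight double quotes, the attribute value would end at the first of them and the rest of the line would not parse as intended.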
When you have notes or references in your HTML file that point readers to another section of the document via a link, the form of that link should be a simple hash value.
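A minimal sketch of an in-text note link in that form follows. The note-001 target comes from this page; the id note-ref-001, the class name, and the role attribute are illustrative assumptions.

```html
<p>
  The claim is discussed at length elsewhere.<a class="note-marker" id="note-ref-001" href="#note-001" role="doc-noteref">1</a>
</p>
```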
In the example above, the href attribute value for the note link begins with a hash mark. That value is telling Manifold to move the focus of the reader from that place in the Text to the note with an id value of note-001 (shown in the example below). Oftentimes links of this nature will include the full filename, for example: filename.html#note-001. Manifold doesn’t like that, so if your file has links to content within the same file (otherwise known as cross-references), remove the filename before the hash mark to ensure the link will function as expected.
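The note that such a link points to might look like the sketch below. It assumes the in-text marker carries the hypothetical id note-ref-001; the class name, role attributes, and heading are likewise illustrative.

```html
<section role="doc-endnotes">
  <h2>Notes</h2>
  <ol>
    <!-- The id here matches the href value of the in-text marker. -->
    <li id="note-001" class="note">
      Text of the first note.
      <!-- The href here points back at the id of the in-text marker. -->
      <a href="#note-ref-001" role="doc-backlink">Return to text</a>
    </li>
  </ol>
</section>
```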
Notice how the id of one element becomes the href value of the other, and vice versa, for linked notes. That is how to construct links that point at one another without creating a circular reference. Note also that the class attributes mentioned here are optional and are examples of how you could include note markers in the text and in the endnotes sections as targets for a stylesheet.
For more on the accessibility components of these two examples, see the Accessible Publishing Knowledge Base maintained by DAISY.
For those looking to embed audio/visual elements as well as interactive content directly inline with the body of their text, the following examples provide a framework from which to begin. In all cases the specific src attributes will need to be adjusted. For embed codes coming from third-party sites, the examples here will show you the shape of what you need, so that you can more easily find it on those sites and paste it into the content of your file.
Images rely on <figure> tags, which also allow for an optional caption. This sample demonstrates how to reference an image saved in the images folder in the same directory as your HTML file.
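A minimal sketch of such an image embed, where the file name, alt text, and caption are placeholders:

```html
<figure>
  <!-- Relative path: the images folder sits next to the HTML file. -->
  <img src="images/sample-image.jpg" alt="Short description of what the image shows">
  <!-- The caption is optional and can be removed. -->
  <figcaption>Figure 1. A sample caption.</figcaption>
</figure>
```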
If you want to include audio/visual elements inline with your text, you can use the following examples as templates. This first example shows how to reference an MP3 that is saved within a subfolder named audio within the same directory as your HTML file.
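A minimal sketch of an audio embed, with a placeholder file name:

```html
<!-- The controls attribute asks the browser to show play and pause controls. -->
<audio controls>
  <!-- Relative path: the audio subfolder sits next to the HTML file. -->
  <source src="audio/sample-audio.mp3" type="audio/mpeg">
  Your browser does not support the audio element.
</audio>
```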
The embed code for videos is very much like that for audio. Here is an example of what the code would be for an MP4 video saved within the video subfolder that you created in the same directory as your HTML:
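A minimal sketch of a video embed, with a placeholder file name and an illustrative width:

```html
<!-- The controls attribute asks the browser to show playback controls. -->
<video controls width="640">
  <!-- Relative path: the video subfolder sits next to the HTML file. -->
  <source src="video/sample-video.mp4" type="video/mp4">
  Your browser does not support the video element.
</video>
```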
The MDN Web Docs explain in more detail how the audio, image, and video embed codes work and how you might want to adapt them to better meet your needs. Manifold will respect any of the adjustments mentioned in that documentation.
It is also possible to embed other interactive content that already lives online within your HTML file. The following samples are meant to be suggestive rather than exhaustive.
Note that many of the attributes shown here, like width and height and title, are determined by the content from which they are sourced. Likewise the src values are very specific and will need to be changed from one embed to another.
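As a sketch of the general shape, assuming the third-party site offers an iframe-based embed code, a hosted video embed often looks something like the following. VIDEO_ID is a placeholder, and the width, height, and title values are illustrative; copy the real values from the site hosting your content.

```html
<!-- Shape of a typical iframe embed; every attribute value here is a placeholder. -->
<iframe
  width="560"
  height="315"
  src="https://www.youtube.com/embed/VIDEO_ID"
  title="Descriptive title of the embedded video"
  allowfullscreen>
</iframe>
```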
Manifold was built to support a wide array of users, from publishing professionals to those who are publishing to fulfill the needs of their non-publishing profession. Because of that, Manifold tries to provide some boundaries for content that is being brought into the system at the time of ingest to ensure it presents as prettily as possible for your readers.
More specifically, Manifold ignores certain style selectors (detailed toward the end of the Styles section of the Project page) that may be present in your CSS. If that is the case, you aren’t entirely out of luck. For the following selectors, you can add a stylesheet to your text that targets those selectors and which Manifold will accept.
For example, Manifold will ignore the line-height property you have in your CSS at the time of ingest. But if you add a stylesheet to your Text that sets line-height, Manifold will honor the value you associate with it there.