Calibre ebook from html




















Filtering style information. The Live preview panel. The Table of Contents view. Checking the spelling of words in the book. Inserting special characters.

The code inspector view. Checking external links. Downloading external resources. Arranging files into folders by type. Importing files in other e-book formats as EPUB. Special features in the code editor. Context sensitive help. A video tour of the calibre E-book editor is available here.

When you first open a book with the Edit book tool, you will be presented with a list of files on the left. These are the individual HTML files, stylesheets, images, etc.

Simply double click on a file to start editing it. One useful feature is Checkpoints. Before you embark on some ambitious set of edits, you can create a checkpoint. Checkpoints will also be automatically created for you whenever you run any automated tool like global search and replace.

Checkpoints are useful when changes are spread over multiple files in the book. That is the basic workflow for editing books: open a file, make changes, preview and save. The rest of this manual will discuss the various tools and features present to allow you to perform specific tasks efficiently.

The File browser gives you an overview of the various files inside the book you are editing. The order of text files is the same order that they would be displayed in, if you were reading the book.

All other files are arranged alphabetically. By hovering your mouse over an entry, you can see its size, and also, at the bottom of the screen, the full path to the file inside the book.

Note that files inside e-books are compressed, so the size of the final book is not the sum of the individual file sizes. Many files have special meaning in the book. These will typically have an icon next to their names, indicating the special meaning.

Similarly, the content. You can rename an individual file by right clicking it and selecting Rename. Renaming a file automatically updates all links and references to it throughout the book. So all you have to do is provide the new name and calibre will take care of the rest.

You can also bulk rename many files at once. This is useful if you want the files to have some simple name pattern. Select the files you want bulk renamed by holding down the Shift or Ctrl key and clicking the files. Then right click and select Bulk rename. Enter a prefix and what number you would like the automatic numbering to start at, click OK and you are done.
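The prefix-plus-number pattern described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not calibre's own code: the function name `bulk_rename`, the zero padding and the sample paths are all assumptions made for the example.

```python
from pathlib import PurePosixPath


def bulk_rename(paths, prefix, start=1):
    """Map each old path to prefix + running number, keeping folder and extension.

    Zero padding is an assumption of this sketch, chosen so that the new
    names sort in the same order as their numbers.
    """
    width = len(str(start + len(paths) - 1))
    mapping = {}
    for offset, old in enumerate(paths):
        p = PurePosixPath(old)
        new_name = f"{prefix}{start + offset:0{width}d}{p.suffix}"
        mapping[old] = str(p.with_name(new_name))
    return mapping


# Hypothetical selection of two images, numbered from 9.
renamed = bulk_rename(["images/cover.jpg", "images/fig.png"], "img-", start=9)
print(renamed)
```

In the editor itself the links to the renamed files are updated for you; the sketch only shows how the new names are derived.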

The bulk rename dialog also lets you rename files by the order they appear in the book instead of the order you selected them in. This is useful, for instance, to rename all images by the order they appear.

Finally, you can bulk change the file extension for all selected files. Select multiple files, as above, and right click and choose Change the file extension for the selected files.

It can sometimes be useful to have everything in a single file. Be wary, though: putting a lot of content into a single file will cause performance problems when viewing the book in a typical e-book reader. To merge multiple files together, select them by holding the Ctrl key and clicking on them (make sure you only select files of one type, either all HTML files or all CSS files and so on).

Then right click and select merge. Note that merging files can sometimes cause text styling to change, since the individual files could have used different stylesheets. You can also select text files and then drag and drop the text files onto another text file to merge the dropped text files into the target text file.

You can re-arrange the order in which text (HTML) files are opened when reading the book by simply dragging and dropping them in the File browser. For the technically inclined, this is called re-ordering the book spine. Note that you have to drop the items between other items, not on top of them; this can be a little fiddly until you get used to it. Dropping on top of another file will cause the files to be merged. E-books typically have a cover image.

This image is indicated in the File browser by the icon of a brown book next to the image name. If you want to designate some other image as the cover, you can do so by right clicking on the file and choosing Mark as cover.

In addition, EPUB files have the concept of a titlepage. Be careful that the file you mark contains only the cover information. If it contains other content, such as the first chapter, then that content will be lost if the user ever converts the EPUB file in calibre to another format.

This is because when converting, calibre assumes that the marked title page contains only the cover and no other content. You can delete files by either right clicking on them or by selecting them and pressing the Delete key. Deleting a file removes all references to the file from the OPF file, saving you that chore. You can export a file from inside the book to somewhere else on your computer. This is useful if you want to work on the file in isolation, with specialised tools.

To do this, simply right click on the file and choose Export. Once you are done working on the exported file, you can re-import it into the book, by right clicking on the file again and choosing Replace with file… which will allow you to replace the file in the book with the previously exported file.

You can also copy files between multiple editor instances. Select the files you want to copy in the File browser, then right click and choose Copy selected files to another editor instance. Then, in the other editor instance, right click in the File browser and choose Paste file from other editor instance. You can add a new image, font, stylesheet, etc. This lets you either import a file by clicking the Import resource file button or create a new blank HTML file or stylesheet by simply entering the file name into the box for the new file.

You can easily replace existing files in the book, by right clicking on the file and choosing replace. This will automatically update all links and references, in case the replacement file has a different name than the file being replaced. Edit book has a very powerful search and replace interface that allows you to search and replace text in the current file, across all files and even in a marked region of the current file. You can search using a normal search or using regular expressions.

To learn how to use regular expressions for advanced searching, see All about using regular expressions in calibre. Type the text you want to find into the Find box and its replacement into the Replace box. You can then click the appropriate buttons to find the next match, replace the current match and replace all matches.

Using the drop downs at the bottom of the box, you can have the search operate over the current file, all text files, all style files or all files. You can also choose the search mode to be a normal string search or a regular expression search. Remember, to harness the full power of search and replace, you will need to use regular expressions.

See All about using regular expressions in calibre. To save a search simply right click in the Find box and select Save current search. This will present you with a list of search and replace expressions that you can apply.

You can even select multiple entries in the list by holding down the Ctrl key while clicking so as to run multiple search and replace expressions in a single operation. You can do pretty much any text manipulation you like in function mode.

There is also a dedicated tool for searching for text, ignoring any HTML tags in between. Edit book has various tools to help with common tasks. These are accessed via the Tools menu. The second pass analyzes all hyphenated words throughout the document; a hyphen is removed if the word also exists elsewhere in the document without it. When enabled, calibre will look for common words and patterns that denote italics and italicize them.
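The hyphenation pass mentioned above can be approximated with a short sketch. It assumes the rule is "join the two halves only if the joined word also occurs unhyphenated somewhere in the text"; the function name `dehyphenate` is hypothetical and calibre's real heuristics are more involved than this.

```python
import re


def dehyphenate(text):
    """Remove a hyphen only when the joined word appears elsewhere unhyphenated."""
    # Every run of letters in the text, lower-cased. Hyphenated words contribute
    # their halves separately, so the joined form is only present if it really
    # occurs somewhere without a hyphen.
    plain_words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}

    def join(match):
        joined = match.group(1) + match.group(2)
        return joined if joined.lower() in plain_words else match.group(0)

    return re.sub(r"\b([A-Za-z]+)-([A-Za-z]+)\b", join, text)


sample = "The con-version failed. A conversion of well-known books."
fixed = dehyphenate(sample)
print(fixed)
```

Here "con-version" is joined because "conversion" occurs in the sample, while "well-known" is left alone because "wellknown" never appears.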

Some documents use a convention of defining text indents using non-breaking space entities. These options are useful primarily for conversion of PDF documents or OCR conversions, though they can also be used to fix many document specific problems. As an example, some conversions can leave behind page headers and footers in the text. These options use regular expressions to try and detect headers, footers, or other arbitrary text and remove or replace them.

There is a wizard to help you customize the regular expressions for your document. Successful matches will be highlighted in Yellow. The search works by using a Python regular expression.
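As a rough illustration of such a search, the sketch below strips a repeated page header with Python's standard `re` module. The header text, the pattern and the helper name `apply_search_replace` are made up for the example, and the standard library is used only as a stand-in for whatever calibre runs internally.

```python
import re


def apply_search_replace(text, pattern, replacement=""):
    """Replace every match of pattern; an empty replacement deletes the match."""
    # re.MULTILINE lets ^ match at the start of each line, which is convenient
    # for headers and footers that occupy a line of their own.
    return re.sub(pattern, replacement, text, flags=re.MULTILINE)


sample = (
    "Page 12 | My Book Title\n"
    "It was a dark night.\n"
    "Page 13 | My Book Title\n"
    "The end."
)
cleaned = apply_search_replace(sample, r"^Page \d+ \| My Book Title\n")
print(cleaned)
```

Testing a pattern on a small sample like this before running it over a whole book makes it easier to spot matches that are broader than intended.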

All matched text is simply removed from the document or replaced using the replacement pattern. The replacement pattern is optional, if left blank then text matching the search pattern will be deleted from the document. You can learn more about regular expressions and their syntax at All about using regular expressions in calibre. Structure detection involves calibre trying its best to detect structural elements in the input document, when they are not properly specified.

For example, chapters, page breaks, headers, footers, etc. As you can imagine, this process varies widely from book to book. Fortunately, calibre has very powerful options to control this. With power comes complexity, but once you take the time to learn the complexity, you will find it well worth the effort. This can sometimes be slightly confusing, as by default, calibre will insert page breaks before detected chapters as well as the locations detected by the page breaks option.

The reason for this is that there are often locations where page breaks should be inserted that are not chapter boundaries.

Also, detected chapters can be optionally inserted into the auto generated Table of Contents. XPath can seem a little daunting to use at first; fortunately, there is an XPath tutorial in the User Manual. Use the debug option described in the Introduction to figure out the appropriate settings for your book. There is also a button for an XPath wizard to help with the generation of simple XPath expressions.

This expression is rather complex, because it tries to handle a number of common cases simultaneously. A related option is Chapter mark, which allows you to control what calibre does when it detects a chapter. By default, it will insert a page break before the chapter. You can have it insert a ruled line instead of, or in addition to, the page break.

You can also have it do nothing. One of the great things about calibre is that it allows you to maintain very complete metadata about all of your books, for example, a rating, tags, comments, etc.

This option will create a single page with all this metadata and insert it into the converted e-book, typically just after the cover. Think of it as a way to create your own customised book jacket. Sometimes, the source document you are converting includes the cover as part of the book, instead of as a separate cover. If you also specify a cover in calibre, then the converted book will have two covers. This option will simply remove the first image from the source document, thereby ensuring that the converted book has only one cover, the one specified in calibre.

When the input document has a Table of Contents in its metadata, calibre will just use that. However, a number of older formats either do not support a metadata based Table of Contents, or individual documents do not have one.

In these cases, the options in this section can help you automatically generate a Table of Contents in the converted e-book, based on the actual content in the input document. Using these options can be a little challenging to get exactly right.

This will launch the ToC Editor tool after the conversion. It allows you to create entries in the Table of Contents by simply clicking the place in the book where you want the entry to point. You can also use the ToC Editor by itself, without doing a conversion. Then just select the book you want to edit and click the ToC Editor button.

The first option is Force use of auto-generated Table of Contents. By checking this option you can have calibre override any Table of Contents found in the metadata of the input document with the auto generated one. The default way that the creation of the auto generated Table of Contents works is that, calibre will first try to add any detected chapters to the generated table of contents. You can learn how to customize the detection of chapters in the Structure detection section above.

If you do not want to include detected chapters in the generated table of contents, check the Do not add detected chapters option. If less than the Chapter threshold number of chapters were detected, calibre will then add any hyperlinks it finds in the input document to the Table of Contents.

This often works well: many input documents include a hyperlinked Table of Contents right at the start. The Number of links option can be used to control this behavior.

If set to zero, no links are added. If set to a number greater than zero, at most that number of links is added. However, if there are some additional undesirable entries, you can filter them using the TOC Filter option. This is a regular expression that will match the title of entries in the generated table of contents.

Whenever a match is found, it will be removed. Also read the XPath tutorial to learn how to construct XPath expressions. Next to each option is a button that launches a wizard to help with the creation of basic XPath expressions.
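The order in which these options interact can be summarised in a small sketch. It models the behaviour described above and is not calibre's implementation: the function name `build_toc`, the parameter names and the default values are assumptions, and whether the filter is anchored at the start of a title is also assumed (an unanchored search is used here).

```python
import re


def build_toc(chapters, links, chapter_threshold=6, max_links=50,
              no_chapters=False, toc_filter=None):
    """Combine detected chapters and hyperlinks into a flat list of TOC titles."""
    # Detected chapters come first, unless the user opted out of them.
    entries = [] if no_chapters else list(chapters)
    # Hyperlinks are only considered when too few chapters were detected.
    if len(chapters) < chapter_threshold:
        entries.extend(links[:max_links])  # max_links == 0 adds nothing
    # Entries whose title matches the filter expression are dropped.
    if toc_filter:
        rx = re.compile(toc_filter)
        entries = [title for title in entries if not rx.search(title)]
    return entries


toc = build_toc(
    ["Chapter 1", "Chapter 2"],
    ["Preface", "Next page", "Index"],
    chapter_threshold=3,
    max_links=2,
    toc_filter=r"^Next",
)
print(toc)
```

Two chapters fall below the threshold of three, so the first two links are added, and the filter then removes the "Next page" entry.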

The following simple example illustrates how to use these options. Not all output formats support a multi level Table of Contents. You should first try with EPUB output. If that works, then try your format of choice.

Suppose you want to use an image as your chapter title, but still want calibre to be able to automatically generate a Table of Contents for you from the chapter titles.

To achieve this, place the image inside the heading tag and give the heading a title attribute containing the chapter title. If you have particularly long chapter titles and want shortened versions in the Table of Contents, you can likewise use the title attribute on the heading to supply the shorter text.
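To make the idea concrete, the sketch below holds some sample markup in a Python string and extracts Table of Contents titles from it, preferring a heading's title attribute over its visible text. The markup, the file name chapter2.jpg and the `HeadingTitles` class are illustrative assumptions; the sketch only mimics the behaviour described above and is not how calibre itself reads headings.

```python
from html.parser import HTMLParser


class HeadingTitles(HTMLParser):
    """Collect one TOC title per <h2>, preferring its title attribute."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_heading = False
        self._attr_title = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_heading = True
            self._attr_title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._in_heading:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_heading:
            visible = "".join(self._text).strip()
            self.titles.append(self._attr_title or visible)
            self._in_heading = False


markup = """
<h2>Chapter 1</h2>
<p>chapter 1 text...</p>
<h2 title="Chapter 2"><img src="chapter2.jpg" /></h2>
<p>chapter 2 text...</p>
<h2 title="Chapter 3">A very long chapter title that would crowd the Table of Contents</h2>
"""

parser = HeadingTitles()
parser.feed(markup)
titles = parser.titles
print(titles)
```

The second heading contains only an image and the third has a long visible title, yet both yield a short entry because the title attribute is used when present.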

There are two places where conversion options can be set in calibre. These settings are the defaults for the conversion options. Whenever you try to convert a new book, the settings set here will be used by default. You can also change settings in the conversion dialog for each book conversion.

When you convert a book, calibre remembers the settings you used for that book, so that if you convert it again, the saved settings for the individual book will take precedence over the defaults set in Preferences. You can restore the individual settings to defaults by using the Restore defaults button in the individual book conversion dialog.

You can remove the saved settings for a group of books by selecting all the books and then clicking the Edit metadata button to bring up the bulk metadata edit dialog. Near the bottom of the dialog is an option to remove stored conversion settings. In a Bulk conversion, settings are also taken from the saved conversion settings for each book being converted, if any. This can be turned off by the option in the top left corner of the Bulk conversion dialog. Note that the final settings for each book in a Bulk conversion will be saved and re-used if the book is converted again.

Since the highest priority in Bulk Conversion is given to the settings in the Bulk conversion dialog, these will override any book specific settings. So you should only bulk convert books together that need similar settings. The exceptions are metadata and input format specific settings. Since the Bulk conversion dialog does not have settings for these two categories, they will be taken from book specific settings if any or the defaults.
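The precedence described above amounts to layering three sets of settings, with later layers overriding earlier ones. The following sketch shows that layering with plain dictionaries; the function name `effective_settings` and the option names are hypothetical and chosen only for illustration.

```python
def effective_settings(defaults, book_saved=None, dialog=None):
    """Layer settings: defaults, then per-book saved values, then the dialog."""
    merged = dict(defaults)
    merged.update(book_saved or {})  # saved settings for this book, if any
    merged.update(dialog or {})      # the conversion dialog has the last word
    return merged


defaults = {"base_font_size": 0, "margin_top": 5, "title": None}
book_saved = {"margin_top": 10, "title": "Emma"}
# The Bulk conversion dialog has no metadata settings, so "title" is absent here.
bulk_dialog = {"margin_top": 0}

final = effective_settings(defaults, book_saved, bulk_dialog)
print(final)
```

Because the dialog layer contains no metadata keys, the title falls through from the book specific settings, which mirrors the exception for metadata and input format specific settings.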

You can see the actual settings used during any conversion by clicking the rotating icon in the lower right corner and then double clicking the individual conversion job. This will bring up a conversion log that will contain the actual settings used, near the top. Here you will find tips specific to the conversion of particular formats. Options specific to particular format, whether input or output are available in the conversion dialog under their own section, for example TXT input or EPUB output.

Just add the file to calibre and click convert. There is a demo. Open the output e-book in the calibre E-book viewer and click the Table of Contents button to view the generated Table of Contents.

For older. If you have a newer version of Word available, you can directly save it as. Another alternative is to use the free OpenOffice. Open your. TXT documents have no well defined way to specify formatting like bold, italics, etc, or document structure like paragraphs, headings, sections and so on, but there are a variety of conventions commonly used. By default calibre attempts automatic detection of the correct formatting and markup based on those conventions.

Analyzes the text file and attempts to automatically determine how paragraphs are defined. This option will generally work fine; if you achieve undesirable results, try one of the manual options. Paragraphs end when the next line that starts with an indent is reached.
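The indent rule can be sketched as follows. This is a simplified reading of the rule, assuming an indent means a leading space or tab and that blank lines are ignored; the function name `paragraphs_by_indent` is hypothetical.

```python
def paragraphs_by_indent(text):
    """Split text into paragraphs, starting a new one at each indented line."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no information in this sketch
        starts_indented = line[0] in " \t"
        if starts_indented and current:
            # An indented line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
        current.append(line.strip())
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


sample = (
    "  First line of one\n"
    "continues here.\n"
    "  Second paragraph\n"
    "also wraps."
)
result = paragraphs_by_indent(sample)
print(result)
```

Hard line breaks inside a paragraph are joined with a single space, so each list item corresponds to one re-flowed paragraph.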

Assumes that the document has no formatting, but does use hard line breaks. Punctuation and median line length are used to attempt to re-create paragraphs. Attempts to detect the type of formatting markup being used. If no markup is used then heuristic formatting will be applied. Analyzes the document for common chapter headings, scene breaks, and italicized words and applies the appropriate HTML markup during conversion. Markdown allows for basic formatting to be added to TXT documents, such as bold, italics, section headings, tables, lists, a Table of Contents, etc.

You can learn more about the Markdown syntax at daringfireball. Applies no special formatting to the text; the document is converted to HTML with no other changes. PDF documents are one of the worst formats to convert from. See also github. Amazon does not convert EPUB books when you email them with "convert" in the subject line. You have to send a MOBI file.

It was working in the good old days. Anyhow, according to this: the-digital-reader. It works. I prefer the Calibre solution. The Debian Calibre package comes with the ebook-convert utility. It works fantastically with EPUB. Is there a way to use ebook-convert recursively? It gets the index. I have tried giving it a list of paths, but that doesn't seem to work.
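One possible workaround for the question above is to call ebook-convert once per HTML file from a small script. The sketch below assumes ebook-convert is on the PATH and relies only on its basic form of an input file followed by an output file; the helper names, the folder layout and the output naming scheme are made up for the example.

```python
import subprocess
from pathlib import Path


def conversion_commands(root, out_dir, out_ext=".epub"):
    """Build one ebook-convert command per HTML file found under root."""
    root, out_dir = Path(root), Path(out_dir)
    commands = []
    for src in sorted(root.rglob("*.html")):
        rel = src.relative_to(root).with_suffix(out_ext)
        # Flatten sub-folders into the file name so outputs do not collide.
        dest = out_dir / "_".join(rel.parts)
        commands.append(["ebook-convert", str(src), str(dest)])
    return commands


def convert_all(root, out_dir):
    """Run the commands one after another, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in conversion_commands(root, out_dir):
        subprocess.run(cmd, check=True)
```

Calling `convert_all("site", "books")` with hypothetical folder names would produce one e-book per page. This does not merge the pages into a single book.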

All you do is: open the webpages you want to save in different tabs, open EpubPress, select all the pages you want in your ebook, and download. The content from the pages gets extracted and stitched together into an ebook. Hope that helps! This is exactly what I was looking for. Thank you for creating this! This is a genius project, exactly what I was missing without knowing it. Now I can get rid of the accumulation of tabs I want to read "someday" without cluttering the Kindle with a lot of short articles like before.

Works perfectly and you can even select. No need for kindlegen or Calibre anymore. Actually, it's available for all current operating systems. Here is one way of going about it. I could not get rid of it. The HTML pages must be merged. The result has to be one large HTML file. At the end you just have to close the print program. If you work in Firefox, you can use the PrintEdit add-on to remove some menus, advertising banners and more.

You have to save the HTML file on your computer.


