A Quick Guide to Making Microsoft
I'd first like to thank The Amazing Chris De Herrera for asking me to write this brief spiel about making Microsoft Reader books. And I'd like to thank Chris for hosting my getting-to-be-sizeable collection of Reader conversions.
When thinking about "where to post?" my first and only thought was www.pocketpcfaq.com. I think it's the single best site about Windows CE and Pocket PCs - and the amount of work and love that obviously went into Chris's always-evolving site is breathtaking.
He's a deserved "Microsoft MVP" and a great guy. Thanks again, Chris...
The Big Idea
Despite everything that I say later on, always remember this:
If you'd like, you can stop reading now, go to the ReaderWorks website, download the free "Personal" ReaderWorks software, and have at it. It's not hard, truly.
So far, I've made about 60 Reader books, so how hard can it be? And consider how amazing this is:
On my Jornada, tucked onto my 48 meg CompactFlash card, are many of the greatest works ever written:
And, though not as noteworthy, I've also got the Rules of Golf, some good frequency lists (I love radio), the complete NFL schedule for the upcoming season and, of course, the "good parts" from the Kama Sutra - which Chris De Herrera probably wisely declined to make available on www.pocketpcfaq.com.
And more still. In my pocket. Thanks to Microsoft, amazing computer hardware, OverDrive Systems' ReaderWorks software - and the many people who tirelessly scanned so that we can joyfully read.
Wanna make some books?
The Process of Reader Bookmaking
First, the painful and nasty part. After you've downloaded ReaderWorks, download the documentation. Now read it. Sorry. Now open ReaderWorks and, from the Help menu, choose Help Topics.
Now read the Help Topics. Again, sorry.
This should take less than an hour. When you've finished, you'll know far more than I'll say here - and probably more than I know! It's a short-term memory thing...
Or let me condense the process like this:
The most well-known site for public domain texts is Gutenberg at www.gutenberg.net. You can also go to Yahoo, or my favorite, www.google.com, and search for "etext public domain ebooks" or somesuch. You'll soon find yourself with more URLs and more books than you'll ever have time to convert.
A few good sites are:
The On-Line Books Page at http://digital.library.upenn.edu/books/
Bibliomania at http://www.bibliomania.com/
Books on the English Server at http://eserver.org/books/
The Windows CE Archives of (as they call them) E-Text sites at http://www.pda-archives.com/wince/12.htm
An even better list of sites at http://gort.ucsd.edu/jj/1/book.html
The wonderful Internet Classics Archive at http://classics.mit.edu/
Another very good listing of public domain etexts at http://utenti.tripod.it/libridigitali/publicdomain.htm
A still even better list at http://www.lib.utexas.edu/Libs/PCL/Etext.html
And another at http://dmoz.org/Arts/Literature/Electronic_Text_Archives/ where you'll find many good links, including "The Society for the Appreciation of the Post-Dialogic Novel," where you can ponder their manifesto, which ends with the ringing declaration "That David Foster Wallace's distinction between recursive and referential writing is decidedly valuable, and our belief is that the post-dialogic reflects an essential blend of the two."
And so on. So many sites, so many books, so little time.
Is it Really Public Domain?
As we enter the Media Everywhere Age, the notion of copyright deserves some attention. We've got Napster, which allows anyone to gleefully circumvent - hell, break! - the law. We've got Barnes and Noble and other booksellers wanting to sell you eBooks - but not wanting you to make copies for all your friends. As you read this, people around the world are ripping CDs, ripping DVDs, converting media files from this format to that, and filling up hard drives and web sites with books, music and video.
There's a lot of thievery going on. And that's exactly what it is: Theft. It's illegal and, more important, it's morally wrong. Don't do it.
Sadly, life before the monitor will soon be more complex because of the need for Digital Rights Management. DRM deserves more space than I can give it, but the short story is that software will become more complex - and harder to use - as Microsoft and other companies build new applications, and add new layers in existing apps, to prevent you from illegally reproducing media. Not that you would, of course...
Given that, how do you know if something is truly in "the public domain" and eligible to be reproduced and distributed?
(Note: I am not a lawyer. I am not a lawyer)
First, in general, any work that was published more than 75 years ago is in the public domain. In general, of course. 1850 A.D. or before, not to worry. Homer has no lawyers anymore.
And, of course, if the author - or copyright holder - freely puts the work in the public domain, that's that. Unless, of course, the copyright holder adds: "...but, you have to contact me for permission before reproducing" or other caveats.
There's also a gray area. When I went to find Hesiod's Works and Days -- it's Greek, it's old, it's obviously in the public domain! -- I found it at Berkeley.edu, with this at the bottom:
Copyright © 1995. All rights reserved.
Document maintained on server:
http://sunsite.berkeley.edu/ by the SunSITE
Can they do that? Copyright something written before Christ took a breath? Beats me. (I am not a lawyer.)
So I wrote the SunSITE manager a nice email and got a nice email back, giving me permission to reproduce the text for Microsoft Reader, and away I went.
When I made the Reader eBook, I included, at the top, under a "Copyright and Permissions" heading:
Hesiod, the Homeric Hymns and Homerica
When in doubt, inquire about permission. And don't steal. Most writers and musicians have house payments, same as you. And I don't care how much money Madonna has, and neither should you.
For more about copyright, the U.S. Copyright Office has a good "Frequently Asked Questions" page at http://lcweb.loc.gov/copyright/faq.html
Sometimes It's So Easy
In the first version of this Quick Guide, this is where you'd start to read about Word wrestling.
But that's not always the case. It's possible to find lovely and informative web pages, with excellent formatting, tables of contents, footnotes - all that - which fly into ReaderWorks with no additional effort needed. Really.
Here's an example. The Rand Corporation has an excellent report, "The Cyber-Posture of the National Information Infrastructure," at http://www.rand.org/publications/MR/MR976/mr976.html.
It has a detailed table of contents, footnotes, bulleted lists...all that you'd expect in a thorough research paper. From your favorite browser, you can Save As HTML, toss it into ReaderWorks, and press Start.
Out comes a Reader .lit file that looks great on your Pocket PC. The table of contents is clean, properly nested, and functional. The headings look good. The text looks good. The footnotes even work!
And you did nothing but Save As HTML and run it through ReaderWorks. It can happen.
Well, okay. There was one thing. There's a section titled Acronyms. In the report, a line looks like this:
DARPA Defense Advanced Research Projects Agency
On the Pocket PC, it looks like this:
DARPADefense Advanced Research Projects Agency
Still, not bad. I've done conversions where I .literally (sic) did nothing to the text before turning to ReaderWorks. But the more likely case is still...
The Word Massage
Ok, you've got a text and it's truly public domain. It's probably also a plain text file. If it happens to be a "Palm format" file, with a .prc or .pdb file extension, you need MobiPocket Publisher from www.mobipocket.com. It's free. It will convert those Palm files to HTML files, which ReaderWorks (after a spin through Word) will accept.
Microsoft Word - or another high-powered word processor or layout program - is where you'll format the text. ReaderWorks makes eBooks, but it doesn't format them...much. (Later on, FrontPage gets an informal recommendation from Steve Potash, President of OverDrive Systems, as the "tool of choice" for making Reader eBooks.)
When it comes to formatting, the better you are with Search and Replace, the better. In my experience, making books is 90% search and replace.
Why? Because most public domain texts come with hard breaks at the end of every line. This will create a truly ugly Reader book, unless those line breaks are carefully expunged.
The line breaks will either be paragraph marks ( ^p in Word ) or end-of-line marks ( ^l in Word ). To complicate things, Word often "can't see" end-of-line marks - though Word craftily displays them as paragraph marks, just to get your hopes up. And since it won't replace what it can't see, you've got a problem.
To determine if your task is simple or a bit tedious, Find ^p (the 'p' must be lowercase). Find again, and see if Word happily skips over what looks to be a bona fide paragraph mark. If so, I do this: save the file as a Word doc, close it, open it, and try again. Still not working? Save as an RTF file, close, open, try again. That should do it.
Now, change all ^l to ^p.
Next, select all, make everything Normal style and 12 point type (any smaller and ReaderWorks may make your text smaller than you'd like) and save as a Word doc.
Now the interesting part.
Look at your text to determine if it's what I call "well formed." A well-formed document is the easiest to convert, no matter how long it is. Length isn't what makes for difficulty. Inconsistency in spacing and other formatting weirdness makes for difficulty. Also, tables that are too wide to fit across an eBook display are nightmarish. Plain text tables: work of the devil. I've been known to "give up," in fact. I'm not proud.
Ideally, you want a beautiful chain of paragraphs, separated by a consistent number of paragraph marks and a manageable number of chapter and other headings - because you'll have to make a style for each heading, so that ReaderWorks can create your table of contents.
Still game? Here's the fun part. First, save. Next, we're going to get rid of the paragraph marks between paragraphs. Although this procedure differs from book to book, generally I:
Now, either save so Word can collect itself after that bruising exchange, or Undo some or all of what you just did and try again, in a slightly different manner.
If you don't want spaces between paragraphs, make that final change turn @@@ into a single ^p followed by your choice of spaces. ReaderWorks converts tabs to spaces, but you might want to give it a little heads up here.
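For anyone who would rather do this cleanup outside Word, here is a minimal Python sketch of the same idea. It assumes a plain text file in which paragraphs are separated by blank lines and every other line break is a hard wrap. The function name unwrap_paragraphs is my own, and the @@@ placeholder is just borrowed from the Word procedure; this is an illustration, not part of ReaderWorks or Word.

```python
import re


def unwrap_paragraphs(text: str, placeholder: str = "@@@") -> str:
    """Join hard-wrapped lines into paragraphs, keeping blank-line breaks.

    Assumption: a blank line marks a real paragraph break and any other
    newline is just a hard wrap at the end of a line.
    """
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text; pick another")

    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Step 1: protect real paragraph breaks (one or more blank lines).
    text = re.sub(r"\n(?:[ \t]*\n)+", placeholder, text)

    # Step 2: every remaining newline is a hard wrap; turn it into a space.
    text = re.sub(r"[ \t]*\n[ \t]*", " ", text)

    # Step 3: turn the placeholder back into a paragraph break.
    text = text.replace(placeholder, "\n\n")

    return text.strip()


sample = "It was the best\nof times,\nit was the worst.\n\nSecond paragraph\nhere.\n"
print(unwrap_paragraphs(sample))
```

The result still needs a look by eye. Poetry, tables and indented quotations will be flattened by this, just as they would be by a careless Replace All.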
Using Macros in Word
To make this process easier, make a Macro. Use Word help if you need Macro basics. Simply put, you choose "Record," then do a sequence of steps, then Stop Recording. Not hard. The steps listed above, for example, become a Macro that looks something like this:
Now, I'll admit: This is a very simple Macro. It could do much more. But, for now, remember that any repetitive unchanging sequence is a prime Macro candidate.
After searching and replacing - whether by hand or with power-assist - you may only now need a little futzing here and there - adding or deleting spaces, etc. - before the next step.
Adding Styles
Although ReaderWorks can accept multiple files, and create tables of contents from filenames, let's merely consider making a book from a single file. If that's the case, you need to create headings, which ReaderWorks uses to make a table of contents.
Again, Search and Replace is your friend. If every chapter begins with the word CHAPTER, for example, you can search for CHAPTER (with Match Case checked) and replace it with CHAPTER (and the format of heading whatever..3, say).
Instant table of contents. (Another Macro candidate??)
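The same heading trick can be sketched in Python for a text that has already been cleaned up into blank-line-separated paragraphs. Two assumptions here are mine and not confirmed by the ReaderWorks documentation: that a Word heading style ends up as an ordinary HTML heading element (h3 in this example) when the file is saved as a Web Page, and that such elements are what the table of contents is built from. The function name paragraphs_to_html is hypothetical.

```python
import html


def paragraphs_to_html(text: str, marker: str = "CHAPTER", level: int = 3) -> str:
    """Wrap each paragraph in <p>, or in a heading tag if it starts with marker.

    The comparison is case-sensitive, which mirrors searching in Word
    with Match Case checked.
    """
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    out = []
    for block in blocks:
        escaped = html.escape(block)  # protect &, < and > in the source text
        if block.startswith(marker):
            out.append(f"<h{level}>{escaped}</h{level}>")
        else:
            out.append(f"<p>{escaped}</p>")
    return "\n".join(out)


print(paragraphs_to_html("CHAPTER I\n\nCall me Ishmael.\n\nchapter notes"))
```

Only the first paragraph becomes a heading here; the lowercase "chapter notes" stays an ordinary paragraph, which is the point of matching case.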
More likely, you'll be doing a bit of work "by hand" to properly format the headings, deciding on levels for the headings, and so on. Remember: you're not outside getting skin cancer; it's a good thing.
Make sure that the title of your book is not a heading. ReaderWorks takes the filename and puts it in the table of contents. If you make the title a heading, you'll have two titles atop your table of contents. (Been there...)
Ready for ReaderWorks
Done? If so, save your file as a Web Page, and check that the Page Title in the Save As panel is what you wish for the name of your book. (You can change this in ReaderWorks, but do it now, ok?)
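If you want to double-check the Page Title without reopening Word, a small Python sketch like the one below can read it back from the saved file. It assumes the Page Title is stored in the HTML title element, which is where HTML normally keeps it. The names TitleFinder and page_title are made up for this example.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collect the text inside the first <title> element of an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # Only the first <title> counts.
        if tag == "title" and self.title is None:
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html_text: str):
    """Return the page title with surrounding whitespace removed, or None."""
    finder = TitleFinder()
    finder.feed(html_text)
    finder.close()
    return finder.title.strip() if finder.title is not None else None


page = "<html><head><title> Works and Days </title></head><body><p>text</p></body></html>"
print(page_title(page))
```

To check a real file, read it into a string first and pass that string to page_title.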
You are now ready to meet your maker.
Using ReaderWorks
This part's easy. It's the formatting in Word that's tedious and often difficult.
In ReaderWorks, you
From the Source Files window, click Add to add your HTML-saved file. Make sure the file isn't still open in Word.
Next, set properties by clicking on the Properties button at left. Add as much information as you can here. Your eBook may live a long and strange life and future generations might like to know if this is a cookbook or a feminist tract.
Next, click on Table of Contents and run the easy-to-use TOC Wizard. Heck, run it a few times to see which format you prefer.
Now choose Save As to save your project. That way, if ReaderWorks crashes in the next step (a rare thing), you've still got your table of contents and property information.
Finally, choose Make eBook from the File menu to make your eBook as a Reader .lit file.
Did I say "Finally"? Actually, you next copy your eBook to your Pocket PC. (Make sure that Reader is closed when you do this - on the Jornada, go to Today, click the Task Switcher icon, choose "Close Window," then Microsoft Reader. Or do a soft reset after copying your eBook.)
Open Microsoft Reader, choose your book from the Library, and discover what you forgot to do in Word. Go back to Word, etc. Repeat until pleased with what you've made.
Comments from Steve Potash
Before sending this off to Chris De Herrera, I ran this piece by Steve Potash, founder and President of OverDrive Systems, the makers of ReaderWorks. (By the way: Nice guy. And I've found OverDrive's support to be excellent.)
Steve had a couple of comments. They are:
1. Reader and ReaderWorks supports CSS (most CSS tags but not all) - great for eliminating the embedded tagging and can be applied against an entire title or library.
2. FrontPage is the preferred editor for our eBook folks - pretty easy for layout.
Both good points. Thanks, Steve.
CSS means "Cascading Style Sheets." You can learn all the nitty-gritty about CSS by reading the W3C Recommendation "Cascading Style Sheets, level 1" at http://www.w3.org/TR/REC-CSS1. When you get done with that, you'll realize why ReaderWorks doesn't yet support "all" tags.
A good list of links to more CSS information is at http://webopedia.internet.com/TERM/C/Cascading_Style_Sheets.html
FrontPage? Makes sense. If you're comfortable working in FrontPage, give it a whirl. Report back.
Conclusion
Again, despite these honest remarks, making eBooks with ReaderWorks is a fairly simple and greatly rewarding process. An eBook well-made (ok, even poorly made) is a great gift, to yourself and to anyone who uses what you created.
Feel free to contact me with any corrections to or suggestions for this document. And, since you can't change a Reader .lit file unless you have the original source files, contact me if you'd like the Word files for any of the books I've made. Each can be improved, and I'd love to see that happen.