[Tfug] OT: E-Book position representation

Sat Dec 20 03:18:07 MST 2008

On Thu, Dec 18, 2008 at 09:12:55PM -0800, Bexley Hall wrote:
><grin>  Fear not!  They will regain their voices once you have
>a finished product -- in an intensity directly proportional to
>the immutability of your implementation!  :-(
>
>(always amazing how little people know about what they WANT;
>yet how *much* they know about what they DON'T want!  :<  )

Oh, how very true.  :)

>> In the current Palm OS incarnation, I went with a strict
>> percentage indicator.  Your current position in the text might
>> be 42.67% and the list of bookmarks might contain "Chapter 1"
>> at 0.35%, for example.
>
>Short answer:  use UTF-16 instead of UTF-8 at *some* level in
>your data representation!  :>

I should have been clearer in the first email.  My program = my format.  :)
(With some exceptions).  The current incarnation reads zTXT documents
(Weasel's native format) and PalmDOC (AKA AportisDoc).  Due mostly to Palm
limitations, both formats are one byte = one character, though you can alter
the codepage used for the character set.

Finding position in these documents isn't too hard because of the
byte/character correspondence, though there are issues from data segmentation,
but that's a platform problem, not a format problem.

I had forgotten about UTF-16.  If I simply decree that this is to be the
format, might that solve many of my problems in one fell swoop?  It's been a
while since I read up on Unicode, but, IIRC, UTF-16 has a constant 16-bit
character size, yes?  Unlike UTF-8's variable size.  Hmmm.. would this be
enough?  I mean, I don't need to support *every* language humanity has.  :)

That would get rid of internal location representation issues, efficiency of
location finding and searching, and probably lots of other things too.
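
One caveat worth checking before committing to this: UTF-16 is fixed-width
only for characters in the Basic Multilingual Plane.  Anything beyond it is
stored as a surrogate pair (two 16-bit units).  A minimal Python sketch of the
difference follows; `utf16_units` is a hypothetical helper name, not part of
any reader program.

```python
def utf16_units(text: str) -> int:
    """Return the number of 16-bit code units needed to encode text as UTF-16."""
    # "utf-16-le" is used so that no byte-order mark is prepended.
    return len(text.encode("utf-16-le")) // 2


bmp_char = "\u00e9"         # e-acute, inside the Basic Multilingual Plane
astral_char = "\U0001D54A"  # a mathematical double-struck letter, outside the BMP

print(utf16_units(bmp_char))     # one code unit
print(utf16_units(astral_char))  # two code units (a surrogate pair)
```

If the supported repertoire is restricted to the BMP, treating one 16-bit unit
as one character holds; otherwise character positions still need a walk.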

><frown>  I wonder at the utility of reporting (and accepting as
>input!?) fractional percentages.  As documents get larger, those
>fractions lose precision (so, why are they better than just "XX%"?)

That was a user request.  Originally it was just whole numbers.  It's limited
to just a hundredths place, but that seems to be a good tradeoff.  That offers
10,000 possible "locations" to be specified and in most cases that's fine.
A user can always just write "5" if that's all the accuracy they need, too.
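
A minimal sketch of that mapping, assuming the position is kept as an integer
count of hundredths of a percent (0 to 10000) and the document length is known.
The function names are hypothetical, not Weasel's actual code.

```python
def percent_to_offset(hundredths: int, total_len: int) -> int:
    """Map a position in hundredths of a percent (0..10000) to an offset."""
    if not 0 <= hundredths <= 10000:
        raise ValueError("position must be between 0.00% and 100.00%")
    # Integer arithmetic avoids floating-point rounding surprises.
    return hundredths * total_len // 10000


def offset_to_percent(offset: int, total_len: int) -> str:
    """Format an offset as a percentage with two decimal places."""
    if total_len <= 0:
        raise ValueError("document must not be empty")
    hundredths = offset * 10000 // total_len
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


# In a 500,000-character book each 0.01% step covers 50 characters.
print(percent_to_offset(4267, 500_000))
print(offset_to_percent(213_350, 500_000))
```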

>Since *you* initially reported the position in these terms,
>you can also store the corresponding "character count" at
>which that "percentage" (offset_t) was achieved.
>
>Of course, the first time you see a book, this is a simple
>calculation:  0% bytes *or* characters!
>
>I recognize seeking to a percentage *byte* offset is considerably
>easier than "walking" to that same *character* offset.  But, you
>can do both:  spawn a task that starts counting characters while
>you, meanwhile, "jump" to the "byte offset" and begin formatting
>the page.  When the other task finishes, it can reposition
>the "cursor" more accurately (I realize this could result in
>a large shift, potentially.  But, if you already have the
>"character offset" metric *stored*, the problem goes away).
>
>This allows a user a moment to look at the page and refresh his
>memory as to why this place is (?) significant.  Since this API
>("UPI"??) would be shared with the bookmark feature -- i.e.,
>not just the "last position stored" -- it is possible that the
>user will decide that this is NOT where he wanted to be.  So,
>he returns to the bookmark menu and makes another choice, etc.
>(you then have to decide the most efficient way of redirecting
>that background task's execution in light of this new goal).
>
>[frankly, I think if you also store a "character offset"
>associated with each "position indication", you don't need this
>extra complexity to be able to give the user a character
>oriented position indication.   <shrug> ]

Oooo... that might be a bit more complexity than is needed, I think.  I see
your point about storing extra data to avoid all this extra work on the reader
side of things... but that only works for predetermined/calculated
information, such as bookmarks added when the book text is first converted
into the native format.  What happens when the user searches for a given
word?  Finding the data is quick, but then one must crawl forward for the
proper character position.  Of course, I can see the document having a table
of offsets/character positions so that crawling can proceed from the nearest
known location.  Thankfully, most of that should be moot by using UTF-16.
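
For a variable-width encoding, the table-of-offsets idea could look roughly
like the sketch below.  It assumes UTF-8 data and a checkpoint every N
characters; `build_checkpoints` and `char_position` are hypothetical names,
and the checkpoint spacing is an arbitrary choice.

```python
import bisect


def build_checkpoints(data: bytes, every: int = 1000):
    """Record (char_index, byte_offset) pairs every `every` characters."""
    char_idx, byte_off = [], []
    chars = 0
    for pos, b in enumerate(data):
        if (b & 0xC0) != 0x80:  # not a continuation byte: a character starts here
            if chars % every == 0:
                char_idx.append(chars)
                byte_off.append(pos)
            chars += 1
    return char_idx, byte_off


def char_position(data: bytes, target_byte: int, char_idx, byte_off) -> int:
    """Return the index of the character that starts at target_byte."""
    if not byte_off:
        return 0
    # Nearest checkpoint at or before the target, then crawl forward from it.
    i = bisect.bisect_right(byte_off, target_byte) - 1
    chars = char_idx[i]
    for b in data[byte_off[i]:target_byte]:
        if (b & 0xC0) != 0x80:
            chars += 1
    return chars
```

A search hit at some byte offset then costs at most `every` characters of
crawling instead of a walk from the start of the book.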

>> I had always thought that "page" numbers made no sense in an
>> electronic format unless that format specifically contains pages
>> (PostScript/PDF for example).  But most other reader programs 
>> continue to use that paradigm and users occasionally ask for it
>> or wonder why it is not present.
>
>Agreed.  But, I suspect it gives users an easier way to
>relate to their position than your "more precise" percentage
>indicator.
>
>For example, if I am 123 pages into a 240 page book, I know I
>am "about halfway".  Or, "less than half remains".  Or, "Only
>about 120 pages left", etc.  For those of us who grew up with
>paper, I think this is more in tune with how we *think* of
>reading a book.  "I've only got 6 more pages to go" vs.
>"I've only got 14.57% remaining".  The latter is meaningless
>to me *except* as a pure "relative position indicator".  It
>doesn't reflect the size of the document so I can't use it
>to gauge my investment, commitment, etc.
>
>We have our own concept of what a "page" is.  Even if the
>document redefines it for us!

Hmmm... that's a good point, but you're also right about the concept of a page
changing.  And it's worse than that.  What exactly *is* a page in an
electronic medium?  The book is just one long stream of text, after all.  Is
it how much can fit on the device's display?  But then what to do when the
user alters the line spacing or font choice?  Suddenly the definition of a
page has changed.  This makes page numbers useless for anything other than a
"you are here" message.  You can't let a user use them to move to a particular
location in a book nor use them as bookmark anchors because they can too
easily change.  It would require a lot of processing to move to a given page
because the beginning offsets of pages are not constant.
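
To illustrate how fragile page numbers are, here is a rough sketch that counts
"pages" for the same text under two layouts.  It assumes a fixed-width font
and simple greedy word wrapping via Python's `textwrap`; `page_count` is a
hypothetical helper, and a real renderer would be more involved.

```python
import textwrap


def page_count(text: str, chars_per_line: int, lines_per_page: int) -> int:
    """Count the screens needed to show text at the given layout."""
    lines = textwrap.wrap(text, width=chars_per_line)
    return -(-len(lines) // lines_per_page)  # ceiling division


book = "word " * 2000

# Same text, two display settings: the page total (and so every page
# boundary) differs between them.
print(page_count(book, chars_per_line=40, lines_per_page=20))
print(page_count(book, chars_per_line=30, lines_per_page=15))
```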

Another problem with "pages" is the autoscroll feature.  Personally, I don't
use this, but many users seem to like it a lot.  This is where the text slowly
scrolls up the screen (or line by line, depending on how it's configured).
Where are you when you stop in between pages?  You could reposition to the
start of the current page, but, as you mentioned above, this is something
users are quite vocal about not liking.  :)

>E.g., if I read a technical journal with lots of multicolumn
>fine print, my page size notion is much larger than when
>reading a paperback novel.  Reading that same novel in a
>hard cover edition gives me yet another notion of page size.
>Yet, since I know how many pages there are in *this* particular
>document, I can quickly form a gut feel for my progress through
>the document.  "I read about a page per minute", etc.
>
>[N.B. For some types of documents, this might be a valuable
>metric to compute and convey (in some form) to the user.  E.g.,
>"expected time remaining"  <grin> ]

I can see offering it to users purely as a convenience feature.  You are on
"page" 42 of 528.  I worry, though, that offering this detail will result in
many rejected feature requests for movement in the program via page numbers.

>Such a slider needs to have a "transmission" associated with it;
>the user needs to be able to downshift to get more precise
>control as well as upshift to get coarser, more rapid movement.
>With each movement, if you could (ideally) repaint the screen
>so he/she could "get their (relative) bearings"...
>
>[You could use speed of gesture to determine which "gear"
>you are in]

Yes, that's a good idea.  Designing for a stylus isn't too hard (it's very
mouselike).  But these new touch interfaces will require new ideas and a lot
of fine tuning.  Especially problematic as I have no G1 phone at the moment
and I'll definitely need one so I can "feel" how a progress slider operates.
I've got the dev environment and emulator up and running, but dragging with
the mouse and with a finger are not comparable.
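
As a starting point for experiments, the "gear" idea might be sketched as
below.  Everything here is an assumption for illustration: the speed
thresholds, the step sizes, and the function names are arbitrary and would
need tuning on real hardware.

```python
def gear_step(speed_px_per_s: float) -> float:
    """Percent of the document moved per pixel of drag, chosen by drag speed."""
    if speed_px_per_s < 100:   # slow drag: fine control
        return 0.01
    if speed_px_per_s < 500:   # medium drag
        return 0.1
    return 1.0                 # fast flick: coarse movement


def drag_to_position(position: float, dx_px: float, dt_s: float) -> float:
    """Return the new position (0..100 percent) after a drag of dx_px pixels."""
    speed = abs(dx_px) / dt_s if dt_s > 0 else 0.0
    new_position = position + dx_px * gear_step(speed)
    return min(100.0, max(0.0, new_position))  # clamp to the document
```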

>You always need to have available (to yourself?) some scheme
>that operates in the absence of any "document structure".
>E.g., your position in a "pure text" file. 
>
>I would argue for a character based metric instead of the
>"byte offset" approach.  I (personally) think the added
>cost to the developer is far outweighed by the intuitive
>nature of that metric.  I believe you are also far less
>likely to "surprise" (Principle of Least Surprise) users
>if they *see* 10 characters on the screen, the cursor on
>the 3rd character and the position reported as "30%"
>REGARDLESS OF THE PARTICULAR CHARACTERS PRECEDING AND 
>FOLLOWING THE CURSOR.

Yes, I'm going to have to put a lot of thought into the tradeoff between
accuracy and speed.  This wouldn't have been an issue on Palm devices since
they're too slow.  Newer devices are fast enough that it's possible to
calculate many of these values without the user noticing (if you do it
properly, that is).
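
The quoted example (10 characters on screen, cursor on the 3rd, position
reported as 30%) can be made concrete.  This is a minimal sketch assuming the
text is stored as UTF-8; the function names are hypothetical.

```python
def char_percent(text: str, cursor_char: int) -> float:
    """Position as a percentage of characters before the cursor."""
    if not text:
        return 0.0
    return 100 * cursor_char / len(text)


def byte_percent(text: str, cursor_char: int) -> float:
    """Position as a percentage of UTF-8 bytes before the cursor."""
    data = text.encode("utf-8")
    if not data:
        return 0.0
    before = text[:cursor_char].encode("utf-8")
    return 100 * len(before) / len(data)


# Three math symbols (3 bytes each in UTF-8) followed by seven ASCII letters.
sample = "\u2211\u222b\u2202abcdefg"

print(char_percent(sample, 3))  # 30.0
print(byte_percent(sample, 3))  # 56.25
```

The byte-based figure drifts with whatever characters happen to precede the
cursor, which is exactly the surprise described in the quoted text.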

>[I'm sure you've thought about this.  Just consider that
>you can't control the material that is being *read*.  Are
>you willing to penalize a user who happens to read lots
>of technical documentation with fancy mathematical symbols
>(which don't fit neatly in single bytes) just for the
>sake of implementation ease?  <shrug>]

The hell I can't control it!  Oh, wait... I can't...  :(

>Given some "core" position representation, I think you should
>also take advantage of whatever *other* structural information
>is present in the document to convey to the user his/her
>position *relative* to this framework.

This is another tradeoff, though this time it is over how much time I want to
invest in the desktop "conversion" program.  The format will need to support
such abilities, and the input must be parsed to extract this information.

The old setup was much simpler...  nearly all input was plain text books,
often from Project Gutenberg.  The text could be scanned via regex to generate
bookmarks (from Chapter/Section headings, or whatever).  There is a feature
currently that displays the title of the last bookmark you passed.  So, if you
had bookmarks on all major structural elements, the reader would seem to know
your current heading.  It seems like a decent tradeoff.
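
A rough sketch of that approach, assuming plain-text input where headings
start a line with "Chapter" or "Section".  The pattern and function names are
illustrative guesses, not the conversion program's actual rules.

```python
import bisect
import re

# Hypothetical heading pattern: a line that starts with "Chapter" or
# "Section", followed by spaces or tabs and at least one more character.
HEADING_RE = re.compile(r"^(?:Chapter|Section)[ \t]+\S.*$", re.MULTILINE)


def scan_bookmarks(text: str):
    """Return a list of (offset, title) pairs, in document order."""
    return [(m.start(), m.group(0).strip()) for m in HEADING_RE.finditer(text)]


def current_heading(bookmarks, position: int):
    """Return the title of the last bookmark at or before position, if any."""
    offsets = [offset for offset, _ in bookmarks]
    i = bisect.bisect_right(offsets, position) - 1
    return bookmarks[i][1] if i >= 0 else None
```

With bookmarks on every major heading, `current_heading` gives the "you are
in Chapter N" display without the format storing any deeper structure.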

The other issue is how much structure does it make sense to store?  A web
browser, for example, can be called upon to display nearly anything and must
be generic enough to do so.  The *target* here is reading books.  Given that
the vast majority of documents read will be books (and those typically have
simple or minimal internal structure) how much effort should be put into
supporting these more advanced but less used features?

>I.e., if the creator of the document went to the trouble of
>including this structural information, assume there is some
>significance to it and try to use it to give the user a
>framework in which to judge his "true position".

Fortunately, most ebook formats currently in use don't have a whole lot of
structure elements to them.  I aim to support more formats this time than what
the Palm version does, but I'm also limited by the desire not to spend many
hours reverse engineering closed formats.  The primary reason Weasel currently
supports PalmDoc files is because another GPL'd program already did the
hard work of divining the format's dark secrets.

>Since you have a GUI available, you could opt for some
>abstract representation of the document (i.e., one that fits
>on a single "screen") that outlines the structure and shows
>the user's position -- along with those of any additional
>bookmarks? -- using some legend.
>
>This could also serve as a means for letting the user
>position himself in the document -- give him some sort of
>"zoom" control so he can see regions in greater detail
>(this allows the structure of the document to be as
>fine-grained as the author intended without compromising
>the presentation based on the characteristics of the 
>GUI hardware)

Again, is there a need to go beyond bookmarks for this?  They have names,
record position, and are anchored at various places in the text.  When
displayed in order you can get this overview of the document, too.  It all
depends on how well the ebook was constructed.  I've seen a great many PalmDoc
files (and others) that just plain didn't bother to add bookmarks.  Then it
becomes much more necessary to rely on the reader program's facilities for
moving around the document: jumping to percentages/pages/etc., or searching
for some string and jumping to that location.

Thanks again for the feedback!

-- 
--John Gruenenfelder    Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for PalmOS  --  http://weaselreader.org
"This is the most fun I've had without being drenched in the blood
of my enemies!"
        --Sam of Sam & Max