[bksvol-discuss] Re: Automatic Stripper problem

  • From: "Jake" <jabrown@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 26 Aug 2004 12:51:46 -0500

Cindy,
Yes, that's correct. I've noticed that it strips the word "Chapter", and
sometimes the number, because it takes them for a header: the word appears
at the top of many pages throughout the book.

Jake
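
The behavior described above can be illustrated with a minimal sketch. It
assumes (this is a guess, not Bookshare's actual code, which nobody on the
list has seen) that a stripper treats the first non-blank line of a page as
a header when the same text, with digits masked out, tops a large share of
pages. The function name, the 30% threshold, and the sample pages are all
hypothetical.

```python
import re
from collections import Counter


def strip_repeated_first_lines(pages, threshold=0.3):
    """Drop each page's first non-blank line if its digit-masked form
    tops at least `threshold` of all pages (hypothetical heuristic)."""

    def key(line):
        # Mask digits so "Chapter 1" and "Chapter 2" count as the same line.
        return re.sub(r"\d+", "#", line.strip().lower())

    def first_index(lines):
        # Index of the first non-blank line, or None for an empty page.
        for i, line in enumerate(lines):
            if line.strip():
                return i
        return None

    # Count how often each masked first line opens a page.
    counts = Counter()
    for page in pages:
        i = first_index(page)
        if i is not None:
            counts[key(page[i])] += 1

    cleaned = []
    for page in pages:
        i = first_index(page)
        if i is not None and counts[key(page[i])] / len(pages) >= threshold:
            page = page[:i] + page[i + 1:]  # remove the suspected header
        cleaned.append(page)
    return cleaned


# Chapter pages start with two blank lines, as some volunteers have tried.
pages = [
    ["", "", "Chapter 1", "It was a dark night."],
    ["The Lost Dog 2", "The dog ran off."],
    ["", "", "Chapter 2", "Morning came."],
    ["The Lost Dog 4", "They searched the park."],
    ["", "", "Chapter 3", "The dog came home."],
]
cleaned = strip_repeated_first_lines(pages)
```

Under this assumed heuristic the "Chapter N" lines are removed along with
the running title, and the blank lines placed above them make no difference,
because only the first line that contains text is examined.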
----- Original Message ----- 
From: "Cindy" <popularplace@xxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Thursday, August 26, 2004 12:18 PM
Subject: [bksvol-discuss] Re: Automatic Stripper problem


> Jesse,
>
> The impression I get from the recent posts is that the
> stripper seems to be stripping chapter titles (be they
> with numbers, e.g. Chapter 1, Chapter One, or actual
> names of chapters, e.g. The Lost Dog), not just
> chapter and title headings and page numbers -- even
> when the chapter title is a couple of lines
> down.
>
> Is this a mistaken impression on my part? Or did
> something happen to corrupt the stripper program when
> bookshare re-tooled recently -- or perhaps before
> that?
>
> Cindy
>
>
> --- Jesse Fahnestock <Jesse.F@xxxxxxxxxxxx> wrote:
>
> > Hi all, as I see there is some new conversation
> > about normalizing headers and footers, I thought I
> > would repost the guidelines for doing so. Please
> > remember that you are not required to do this. But
> > this is how to do it right if you want to take it
> > on!
> >
> > ---Begin instructions for headers and footers--
> >
> > Volunteers can assist this tool by "normalizing"
> > headers, footers, and
> > page numbers in submitted files where they do not
> > appear consistent.
> > Normalizing such headers/footers helps, but it
> > needs to be a
> > complete job, as normalizing just a few headers
> > could skew the
> > probability of properly recognizing them throughout
> > the book. If you
> > wish to undertake this task, please be sure to:
> >
> > 1) Check line position of text (the first paragraph
> > on a given page
> > should be the header, the last should be the footer)
> > 2) Check that page numbers have a space on
> > either side,
> > separating them from the header/footer text. If the
> > page number is
> > the first character in a header it does not need a
> > space before it; or if
> > it is the last character in a footer it does not
> > need a space after it.
> > 3) Only change text in the header or footer in order
> > to make it look
> > like all other headers/footers
> > 4) Perform 1-3 on every page.
> >
> > Remember that the automated tool is designed to be
> > effective on most
> > scanned books, so you should undertake this
> > "normalization"
> > process only if you are sure that the headers and
> > footers in the book
> > you are validating are inconsistent and if you are
> > able to normalize all
> > of them throughout the book.
> >
> > --end instructions--
> >
> > jesse.
> >
> > ________________________
> >
> > Jesse Fahnestock
> > Collection Development Coordinator, Bookshare.org
> > www.bookshare.org
> >
> > A Project of The Benetech Initiative - Technology
> > Serving Humanity
> > 480 S. California Ave., Suite 201
> > Palo Alto, CA 94306-1609  USA
> > (650)475-5440 x133
> > (650) 475-1066 FAX
> > jesse@xxxxxxxxxxxx
> > www.benetech.org
> >
> > -----Original Message-----
> > From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> > [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of Jake
> > Sent: 26 August 2004 15:12
> > To: bksvol-discuss@xxxxxxxxxxxxx
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> >
> > My guess is that this is part of the issue. Many of the books I scan
> > or plan to scan have page numbers at the top of the page, except on
> > pages where a new chapter begins; in that case the page numbers are
> > at the bottom of the page. So my guess is that, since the word
> > "Chapter" is found first and on several pages, the program thinks it
> > is a heading and therefore throws it out the window.
> > I'm sure that if the program Bookshare is using was written by them,
> > they can add code to stop it treating the word "Chapter" as a
> > heading; if not, I'd seriously recommend finding a new program that
> > does what we want, not what we don't.
> >
> > Jake
> > ----- Original Message ----- 
> > From: "Kyrath. (AKA Rob)" <kyrath@xxxxxxx>
> > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > Sent: Thursday, August 26, 2004 7:48 AM
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> >
> > > Given the aggressive nature of the stripper, what I now intend to do is
> > > put in the actual page number 2 lines above the chapter heading, assuming
> > > that page numbers are on top.  In theory, this should prevent the stripper
> > > from getting her greedy little hands on the chapter headings.  *grin*
> > > However, I wonder how the stripper treats headings in books that have page
> > > numbers at the bottom of the page?
> > > -- Rob
> > >
> > > ----- Original Message ----- 
> > > From: "Jake" <jabrown@xxxxxxxxx>
> > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > Sent: Wednesday, August 25, 2004 11:02 PM
> > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > >
> > >
> > > > Yes, I recently discovered that the auto stripper pretty much
> > > > destroyed my first accepted submission.
> > > > I understand the reason for getting rid of the headers, but when it
> > > > gets rid of critical information like "Chapter zzz" or something, it
> > > > is sometimes hard to realize that you are in fact in a new chapter
> > > > (I have also noticed this with books I've downloaded).
> > > >
> > > > While going back and fixing the messed-up titles would be a long and
> > > > tiresome, not to mention cumbersome, process, I believe that we need
> > > > to get this problem resolved so that all new submissions are of
> > > > better quality.
> > > >
> > > > So, would it be a good idea for me to strip the headers in books
> > > > before I submit them now?
> > > >
> > > > Thanks,
> > > > Jake Brownell
> > > > ----- Original Message ----- 
> > > > From: <socly@xxxxxxxxx>
> > > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > > Sent: Wednesday, August 25, 2004 9:23 PM
> > > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > >
> > > >
> > > > > I, too, strip my headers before submitting or uploading -- but what
> > > > > you say about the chapter headings worries me. Is this a new problem?
> > > > > I've been putting the page number on the first line, then skipping a
> > > > > couple of lines before the chapter heading, be it "Chapter" and a
> > > > > number or an actual title.  I hope they haven't been stripped.  And
> > > > > everyone wants page numbers (if you can't read the book at one
> > > > > sitting, even when you're reading to children, how do you know where
> > > > > you left off? Or what if they want you to go back to a particular
> > > > > page?).  I do hope what you found, Dilsia, was an aberration. Maybe
> > > > > Jesse can clear it up for us (and publish another list of books being
> > > > > worked on or awaiting approval).
> > > > >
> > > > > Cindy
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Pam Quinn <quinns@xxxxxxxxxxxxx>
> > > > > Date: Wed, 25 Aug 2004 21:11:49 -0500
> > > > > To: bksvol-discuss@xxxxxxxxxxxxx
> > > > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > > >
> > > > > > I agree. I manually strip my own headers now before submitting, and
> > > > > > even if everybody didn't do this, I'd rather see the headers left in
> > > > > > than to lose information that the automatic stripper takes out. They
> > > > > > just don't work the way that they should. Oh boy; here we go,
> > > > > > talking about strippers again.
> > > > > >
> > > > > > Pam
> > > > > >
> > > > > >
> > > > > > On Wed, 25 Aug 2004 19:55:51 -0400, you wrote:
> > > > > >
> > > > > > >Hi List:
> > > > > > >
> > > > > > >One of my books was accepted today. I downloaded the book to find
> > > > > > >out if the chapter headings were stripped. I had skipped a couple of
> > > > > > >blank lines before each chapter number. Sure enough, all chapter
> > > > > > >headings are gone, as well as other important headings. Apparently
> > > > > > >the trick of skipping a couple of lines before each chapter heading
> > > > > > >is not working any more, if it ever did. Does the automatic stripper
> > > > > > >always have to be applied? Personally I always strip headers of
> > > > > > >books that I submit or validate. In another book that I validated,
> > > > > > >all the page numbers were stripped. The page numbers are important
> > > > > > >for this particular book because it's a choose-your-own-adventure,
> > > > > > >which tells you to turn to certain pages at different points in the
> > > > > > >story. I find it very annoying that even the chapter headings are
> > > > > > >stripped. I can understand the titles being stripped.  In my humble
> > > > > > >opinion, I'd rather have the page numbers left in. It gives me an
> > > > > > >idea how far I am into the book. But at least the chapter headings
> > > > > > >should definitely be preserved. Any suggestions on how to preserve
> > > > > > >the chapter headings?
> > > > > > >
> > > > > > >*****
> > > > > > >Grace
> > > > > > >
> > > > > > >MSN: gcpires@xxxxxxxxxxx
> > > > > >
> > > > > >
> > > > > -- 
> > > > >
> > _______________________________________________
> > > > > Find what you are looking for with the Lycos
> > Yellow Pages
> > > > >
> > > >
> > >
> >
>
http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.asp?SRC=lycos10
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> >
> >
>
>
>
>
> __________________________________
> Do you Yahoo!?
> New and Improved Yahoo! Mail - Send 10MB messages!
> http://promotions.yahoo.com/new_mail
>
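
Rule 2 in Jesse's instructions quoted above (a page number should be
separated from the header or footer text by a space) is the kind of check
that could be done mechanically. The following is only a minimal sketch: the
function name is hypothetical, it assumes the page number sits at the very
start or very end of the line, and it is not part of any Bookshare tool.

```python
import re


def space_page_number(line):
    """Separate a leading or trailing page number from adjacent text."""
    line = line.strip()
    # Leading number glued to text: "12The Lost Dog" -> "12 The Lost Dog"
    line = re.sub(r"^(\d+)(?=[^\d\s])", r"\1 ", line)
    # Trailing number glued to text: "The Lost Dog12" -> "The Lost Dog 12"
    line = re.sub(r"(?<=[^\d\s])(\d+)$", r" \1", line)
    return line


fixed_header = space_page_number("12The Lost Dog")
fixed_footer = space_page_number("The Lost Dog12")
```

A line that is already spaced correctly, or that is just a bare page number,
comes back unchanged. A title that legitimately ends in a number attached to
punctuation would be altered too, so a check like this is better used to flag
lines for a human to look at than to rewrite them blindly.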

