[bksvol-discuss] Re: Automatic Stripper problem

  • From: "Jake" <jabrown@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Thu, 26 Aug 2004 12:49:45 -0500

Interesting-sounding workaround. If it works, let me know.

Jake
----- Original Message ----- 
From: "Sarah Van Oosterwijck" <curiousentity@xxxxxxxxxxxxx>
To: <bksvol-discuss@xxxxxxxxxxxxx>
Sent: Thursday, August 26, 2004 11:50 AM
Subject: [bksvol-discuss] Re: Automatic Stripper problem


> Well, that works for getting rid of them, but the point is that it does
> nothing for the preservation of things that are not headers, like the
> chapter numbers and titles.  I don't understand why removing them is more
> important than keeping the book intact, and I completely agree with Jake
> that the stripper should be programmed to recognize the word chapter and
> leave it and anything after it alone.  It would also be best if it left
> roman numerals and all other numbers alone.  Apart from the page number,
> they rarely occur in headers, so that would not limit the stripping
> ability of the program/script.
>
> I have been normalizing my headers instead of removing them, and also
> copying headers to the pages with chapter titles or numbers on them.  Of
> course I add the correct page number to the header.  I am hoping that they
> get eaten instead of the chapter titles, but nothing I have submitted has
> been approved since I started this procedure, so I am not sure if it
> works.
> I have to admit some impatience to find out. :-)
> This problem is not an occasional one; it happens with almost every book,
> as far as I can tell.
>
> Sarah Van Oosterwijck
> http://home.earthlink.net/~netentity
>
> ----- Original Message -----
> From: "Jesse Fahnestock" <Jesse.F@xxxxxxxxxxxx>
> To: <bksvol-discuss@xxxxxxxxxxxxx>
> Sent: Thursday, August 26, 2004 10:24 AM
> Subject: [bksvol-discuss] Re: Automatic Stripper problem
>
>
> > Hi all, as I see there is some new conversation about normalizing
> > headers and footers, I thought I would repost the guidelines for doing
> > so. Please remember that you are not required to do this. But this is
> > how to do it right if you want to take it on!
> >
> > ---Begin instructions for headers and footers--
> >
> > Volunteers can assist this tool by "normalizing" headers, footers, and
> > page numbers in submitted files where they do not appear consistent.
> > Normalizing such headers/footers helps, but it needs to be a
> > complete job, as normalizing just a few headers could skew the
> > probability of properly recognizing them throughout the book. If you
> > wish to undertake this task, please be sure to:
> >
> > 1) Check line position of text (the first paragraph on a given page
> > should be the header, the last should be the footer)
> > 2) Check that page numbers have a space on either side,
> > separating them from the header/footer text. If the page number is
> > the first character in a header, it does not need a space before it;
> > if it is the last character in a footer, it does not need a space
> > after it.
> > 3) Only change text in the header or footer in order to make it look
> > like all other headers/footers
> > 4) Perform 1-3 on every page.
> >
> > Remember that the automated tool is designed to be effective on most
> > scanned books so that you should undertake this "normalization"
> > process only if you are sure that the headers and footers in the book
> > you are validating are inconsistent and if you are able to normalize all
> > of them throughout the book.
> >
> > --end instructions--
> >
> > jesse.
> >
> > ________________________
> >
> > Jesse Fahnestock
> > Collection Development Coordinator, Bookshare.org
> > www.bookshare.org
> >
> > A Project of The Benetech Initiative - Technology Serving Humanity
> > 480 S. California Ave., Suite 201
> > Palo Alto, CA 94306-1609  USA
> > (650)475-5440 x133
> > (650) 475-1066 FAX
> > jesse@xxxxxxxxxxxx
> > www.benetech.org
> >
> > -----Original Message-----
> > From: bksvol-discuss-bounce@xxxxxxxxxxxxx
> > [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of Jake
> > Sent: 26 August 2004 15:12
> > To: bksvol-discuss@xxxxxxxxxxxxx
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> >
> > My guess is that is part of the issue. Many of the books I scan/plan to
> > scan have page numbers at the top of the page except on pages where a
> > new chapter begins; in that case the page numbers are located at the
> > bottom of the page. So my guess is that since the word Chapter is found
> > first, and on several pages, the program thinks it is a heading and
> > therefore throws it out the window.
> > If the program Bookshare is using was written by them, I'm sure they
> > could add code to skip the word Chapter as a heading, but if not then
> > I'd seriously recommend finding a new program that does what we want,
> > not what we don't.
> >
> > Jake
> > ----- Original Message -----
> > From: "Kyrath. (AKA Rob)" <kyrath@xxxxxxx>
> > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > Sent: Thursday, August 26, 2004 7:48 AM
> > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> >
> >
> > > Given the aggressive nature of the stripper, what I now intend to do
> > > is put in the actual page number 2 lines above the chapter heading,
> > > assuming that page numbers are on top.  In theory, this should prevent
> > > the stripper from getting her greedy little hands on the chapter
> > > headings.  *grin*
> > > However, I wonder how the stripper treats headings in books that have
> > > page numbers at the bottom of the page?
> > > -- Rob
> > >
> > > ----- Original Message -----
> > > From: "Jake" <jabrown@xxxxxxxxx>
> > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > Sent: Wednesday, August 25, 2004 11:02 PM
> > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > >
> > >
> > > > Yes, I recently discovered that the auto stripper pretty much
> > > > destroyed my first accepted submission.
> > > > I understand the reason for getting rid of the headers, but when it
> > > > gets rid of critical information like Chapter zzz or something,
> > > > sometimes it is hard to realize that you are in fact in a new chapter
> > > > (I have also noticed this with books I've downloaded).
> > > >
> > > > While going back and fixing the messed-up titles would be a long and
> > > > tiresome, not to mention cumbersome, process, I believe that we need
> > > > to get this problem resolved so that all new submissions are of a
> > > > better quality.
> > > >
> > > > So, would it be a good idea for me to strip the headers in books
> > > > before I submit them now?
> > > >
> > > > Thanks,
> > > > Jake Brownell
> > > > ----- Original Message -----
> > > > From: <socly@xxxxxxxxx>
> > > > To: <bksvol-discuss@xxxxxxxxxxxxx>
> > > > Sent: Wednesday, August 25, 2004 9:23 PM
> > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > >
> > > >
> > > > > I, too, strip my headers before submitting or uploading -- but what
> > > > > you say about the chapter headings worries me. Is this a new
> > > > > problem? I've been putting the page number on the first line, then
> > > > > skipping a couple of lines before the Chapter heading, be it
> > > > > Chapter and a number or an actual title.  I hope they haven't been
> > > > > stripped.  And everyone wants page numbers (if you can't read the
> > > > > book at one sitting, even when you're reading to children, how do
> > > > > you know where you left off?  Or what if they want you to go back
> > > > > to a particular page?).  I do hope what you found, Dilsia, was an
> > > > > aberration. Maybe Jesse can clear it up for us (and publish
> > > > > another list of books being worked on or awaiting approval.)
> > > > >
> > > > > Cindy
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Pam Quinn <quinns@xxxxxxxxxxxxx>
> > > > > Date: Wed, 25 Aug 2004 21:11:49 -0500
> > > > > To: bksvol-discuss@xxxxxxxxxxxxx
> > > > > Subject: [bksvol-discuss] Re: Automatic Stripper problem
> > > > >
> > > > > > I agree. I manually strip my own headers now before submitting,
> > > > > > and even if everybody didn't do this, I'd rather see the headers
> > > > > > left in than to lose information that the automatic stripper
> > > > > > takes out. They just don't work the way that they should. Oh boy;
> > > > > > here we go, talking about strippers again.
> > > > > >
> > > > > > Pam
> > > > > >
> > > > > >
> > > > > > On Wed, 25 Aug 2004 19:55:51 -0400, you wrote:
> > > > > >
> > > > > > >Hi List:
> > > > > > >
> > > > > > >One of my books was accepted today. I downloaded the book to find
> > > > > > >out if the chapter headings were stripped. I had skipped a couple
> > > > > > >of blank lines before each chapter number. Sure enough, all
> > > > > > >chapter headings are gone, as well as other important headings.
> > > > > > >Apparently the trick of skipping a couple of lines before each
> > > > > > >chapter heading is not working any more, if it ever did. Does the
> > > > > > >automatic stripper always have to be applied? Personally, I always
> > > > > > >strip headers of books that I submit or validate. In another book
> > > > > > >that I validated, all the numbers were stripped. The page numbers
> > > > > > >are important for this particular book because it's a
> > > > > > >choose-your-own-adventure, which tells you to turn to certain
> > > > > > >pages at different points in the story. I find it very annoying
> > > > > > >that even the chapter headings are stripped. I can understand the
> > > > > > >titles being stripped.  In my humble opinion, I would rather have
> > > > > > >the page numbers left in. It gives me an idea how far I am into
> > > > > > >the book. But at least the chapter headings should definitely be
> > > > > > >preserved. Any suggestions on how to preserve the chapter
> > > > > > >headings?
> > > > > > >
> > > > > > >*****
> > > > > > >Grace
> > > > > > >
> > > > > > >MSN: gcpires@xxxxxxxxxxx
> > > > > >
> > > > > >
> > > > > --
> > > > > _______________________________________________
> > > > > Find what you are looking for with the Lycos Yellow Pages
> > > > >
> > > >
> > >
> >
>
http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.asp
> ?SRC=lycos10
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> >
>
>
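
The rule proposed in the thread (leave a line alone if it starts with the word "chapter" or consists only of a roman numeral) could be sketched roughly as follows. This is only an illustration, not Bookshare's actual stripper, whose code is not shown anywhere in this thread; the names is_chapter_heading and strip_first_line are hypothetical, and the sketch assumes each page is a list of text lines whose first non-blank line is the candidate header.

```python
import re

# Hypothetical rule, NOT the real stripper: a line is protected if it
# begins with the word "chapter" (any case) ...
CHAPTER_RE = re.compile(r"^\s*chapter\b", re.IGNORECASE)

# ... or if the whole line is a single roman numeral, optionally
# followed by a period. The lookahead rejects empty lines.
ROMAN_RE = re.compile(
    r"^\s*(?=[IVXLCDM])M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\.?\s*$",
    re.IGNORECASE,
)


def is_chapter_heading(line):
    """Return True if the line looks like a chapter heading to keep."""
    return bool(CHAPTER_RE.match(line) or ROMAN_RE.match(line))


def strip_first_line(page_lines):
    """Remove the first non-blank line of a page unless it is protected.

    Assumes the first non-blank line is the candidate header.
    Returns a new list; the input list is not modified.
    """
    for i, line in enumerate(page_lines):
        if not line.strip():
            continue  # skip leading blank lines
        if is_chapter_heading(line):
            return list(page_lines)  # keep chapter headings intact
        return page_lines[:i] + page_lines[i + 1:]
    return list(page_lines)  # page was empty or all blank
```

With this sketch, a page whose first non-blank line is "Chapter 3" comes back unchanged, while a page starting with a running header such as "The Great Gatsby 45" loses only that line. Bare arabic numbers are deliberately not protected here, since they are usually the page number itself.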

