[bksvol-discuss] Scanning Chat Summary Available

  • From: "Linda Adams" <ladams@xxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Mon, 20 Aug 2007 15:37:27 -0600

Hi, everyone.  Attached are the Word and text versions of the chat session on 
getting good scans.  

Linda Adams
June 12, 2007

GETTING GOOD SCANS

** Best Scanning Results for Blind People Wanting to Scan

Use either OpenBook scanning software from Freedom Scientific 
(www.freedomscientific.com) or Kurzweil 1000.  Most people prefer Kurzweil over 
OpenBook because it gives more flexibility in settings and editing features.  
Kurzweil has been upgraded more often than OpenBook.  Both programs are 
designed especially for blind users.  

** Types of Books, Their Good Points, and Their Problems

1.  Try to get hard cover books or the nicer trade paperbacks.  Try not to get 
the ones that say Mass Market.  The hard cover and trade paperback books have 
nicer print.  Paperbacks are easier in terms of the spines.  

2.  If you get the Mass Market books, they work if you do a bit of preparation 
work with your scanning program of choice.  Increase the resolution or dots per 
inch.  Try 400 dots per inch instead of the default setting of 300 dots per 
inch.  Experiment with different reader engines if you have multiple choices.  
There are different options in Kurzweil and OpenBook.  Mass Market books tend 
to yellow and deteriorate more quickly because they are not as high quality, 
but you can usually still get a good scan out of them.  Grayscale gives you a 
better scan with Mass Market paperbacks or when the print is of poor quality.  
The 400 dots per inch setting is good for books with faded ink.  

3.  Photocopied Material:  Usually photocopied material is of poor quality, and 
if you need this material for your job or school, you will need a sighted 
person to look it over to check its accuracy.  

** Scanning Steps

1.  Get a dry cloth and lightly dust off the scanning surface.  Use a cloth 
made of cloth-diaper material, which has no lint, or a muslin towel that 
doesn't shed.  You could also use a microfilament cloth for cleaning the 
scanner.  These are sold at computer stores.  A lint-free cloth will remove ink 
from the glass.  Preferably do not use any cleaning solution, and try not to 
use even water, because the glass in your scanner is in layers, not solid, and 
water or solution can get stuck between the layers.  If you determine that the 
glass is dirty enough to need more than a dry cloth, use plain water only.  

2.  Open the book and fan the pages out.  Run your fingers along the spine on 
the inside of the book to limber it up, especially for paperbacks, so that you 
can lay the book flat.  This is especially important if you are going to do 
two-page scanning because you have to be able to get all of the text close to 
the spine scanned and recognized.  If you own the book, don't be afraid to work 
the spine over until it lets you flatten the book completely.  Bend the spine 
back all the way.  Of course if the book is borrowed, you have to be more 
gentle.  

3.  For Mass Market paperbacks, brush the pages off with your fingers to remove 
the ink dust so that it will not coat the scanning surface.  This will give you 
a better scan.  

4.  Press the book down flat on the scanning surface.  If it has some give or 
bounces back up away from the glass when you release it, hold the spine down as 
you scan to make sure that there is no space between the book and the scanning 
surface.  

5.  Open to the center of the book, and if you have Kurzweil, use the optimize 
setting for the scan for two or three pages of the book to see if any setting 
adjustments need to be made.  Kurzweil does a fairly good job picking the 
optimal settings to scan a particular book unless the print quality is 
exceptionally bad.  Once optimization is complete, go through maybe five or six 
scans to see if those settings are worth keeping.  A single page can mislead 
you because it may be particularly bad or may contain a table or a photograph.  
Rely on results from five or six different pages to be sure that your scans 
are consistent.  

6.  Tinker with the settings such as brightness and resolution until you get 
what you feel is the best scan that you are going to get.  

7.  For Kurzweil users, Grayscale is the best thing to try when optimization 
did not produce the quality that you wanted.  Grayscale will make your scans 
slower, though.  Grayscale gives the best page representation as opposed to 
dynamic thresholding or automatic contrast.  

8.  There is no magic setting that works for everything.  You can start in 
Kurzweil with the default settings and see if they work.  If that doesn't work, 
optimize.  If that doesn't work, change some of the settings yourself.  

9.  In Kurzweil, if the statistics say 95 percent confidence level or less, 
rescan the page to try for a better scan.  Otherwise, you will have to struggle 
with many errors on the page.  

10.  Do ranked spelling or spell check.  
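
The escalation in items 8 and 9 can be sketched as a small decision rule.  This 
is only an illustration in Python, not anything built into Kurzweil.  The 
function names, and the idea of typing in the confidence figures by hand, are 
assumptions.  Only the 95 percent cutoff and the order of the steps come from 
the notes above.  

```python
# Hypothetical helpers: none of these names exist in Kurzweil or OpenBook.
RESCAN_THRESHOLD = 95.0  # percent; the rule of thumb from item 9

# Order of things to try, taken from item 8.
STEPS = ["default settings", "optimize", "adjust settings manually"]


def needs_rescan(confidence_percent: float) -> bool:
    """Return True when the reported confidence is 95 percent or less."""
    return confidence_percent <= RESCAN_THRESHOLD


def next_step(attempts_so_far: int) -> str:
    """Return what to try next, staying on the last step once it is reached."""
    return STEPS[min(attempts_so_far, len(STEPS) - 1)]


def pages_to_rescan(confidences: dict[int, float]) -> list[int]:
    """Given {page number: confidence}, list the pages worth rescanning."""
    return sorted(p for p, c in confidences.items() if needs_rescan(c))


if __name__ == "__main__":
    # Made-up confidence figures for three pages.
    print(pages_to_rescan({12: 98.5, 13: 91.0, 14: 95.0}))
    print(next_step(0), "->", next_step(1), "->", next_step(2))
```

Note that a page at exactly 95 percent is treated as needing a rescan, since 
the rule says "95 percent confidence level or less."  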

** Scanner Settings:  OpenBook Paperbacks
  
1.  Turn off the despeckle feature.  
2.  Turn off the Language Analyst feature.  
3.  Turn off light text on black background.  

** OpenBook Settings

If quality decreases with successive scans, hold the pages down firmly, 
especially at the spine.  Your hands and fingers tend to get tired, you tend to 
lose your grip on the pages, and sometimes the pages move.  Once you have 
optimized your settings, results should stay the same unless the print quality 
changes.  You don't need to optimize every two pages.  

You will get the best scans for the widest variety of material if you use the 
default setting in OpenBook, which is Scan for Accuracy in the Windows menu.  
If you are using the classic menus, it is the Automatic Contrast setting.  This 
setting is equivalent to Kurzweil's dynamic thresholding.  For some material, 
you will get better results if you use Custom Scan, so experiment with that 
setting.  Most of the time, though, the results with Custom Scan will be worse 
or no better, so use the default settings and experiment only if they don't 
work.  Another reason to use the default settings is that if you use Custom 
Scan and the print or contrast changes within the book, the text may no longer 
match the custom settings that you chose.  Automatic Contrast compensates 
automatically for such differences in the text, and it will either eliminate 
errors or cut them way down.  

If you scan two pages at once, use the two-page scanning setting so that each 
page will be recognized as a separate page.  This keeps material from one page 
from being mixed with material from another page.  

If lumps start to form as you are scanning and accumulating pages, press down 
and in the direction of the lump, either on the left or the right.  You may 
also need to pick up the book and smooth out the pages so that no creases start 
to form.  Lumps and creases will progressively decrease the quality of your 
scans.  



** Kurzweil:  Trade Paperbacks 
1.  Turn off speckle removal.  
2.  Many times, especially in a book series, you can optimize a paperback and 
keep those settings for other paperbacks.  
 

** Kurzweil:  Hard Cover Books 

1.  You can use the speckle removal more for hard cover books, or use it for 
thick pages.  Speckles go behind the text as decorations.  These make books and 
papers more distinctive and attractive.  If you have the setting on, it 
confuses the OCR engine many times.  Sometimes when it removes speckles, it can 
also remove text that you want to be left in place, so try to use this setting 
sparingly.  

2.  Whenever you get a new hard cover book, you will probably have to optimize 
the scanning; no two hard cover books seem to take the same settings.  

** Kurzweil Settings

In Kurzweil versions before Version 11, Kurzweil's default threshold setting is 
static thresholding.  In Version 11, the default setting is dynamic 
thresholding.  

** Settings for Mass Market Paperbacks

1.  Use Grayscale instead of the default Dynamic threshold setting.  

2.  In terms of brightness, use settings between 60 and 70.  The higher in that 
category, the better.  

3.  For most paperbacks, you can scan two pages at a time, so set the scanner 
to recognize two pages per scan.  

4.  For paperbacks, most of the time, you will not need to turn on the column 
setting.  It is helpful to have columnization off for books that don't have any 
columns so that you don't get into trouble with tables not being recognized.  

5.  You can set your front margin for 4.7 in two-page scanning to make the 
scanning go faster.  

6.  You can sometimes remove the lid from the scanner and then just lay it down 
on the scanner over something thick.  

7.  Optimize the beginning of the book, where the title page and copyright 
information are, then change your optimization for the rest of the book as these 
settings will usually be different.  To keep from having to do two 
optimizations, however, in most cases, you can choose Grayscale for title and 
copyright pages, and they will come out fine.  For Bookshare, the preliminary 
pages are not as important; just be sure that the copyright information is 
correct.  
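
As a memory aid, the starting points above can be written down as a small 
settings table.  A minimal Python sketch follows.  The key names and the 
profile structure are made up for illustration and do not correspond to any 
Kurzweil or OpenBook file format.  The values themselves are the ones suggested 
in these notes.  

```python
# Illustrative only: key names are hypothetical, not a Kurzweil or OpenBook format.
STARTING_PROFILES = {
    "mass_market": {
        "threshold_mode": "grayscale",  # instead of the dynamic threshold default
        "brightness_range": (60, 70),   # higher within this range is better
        "pages_per_scan": 2,
        "columns": False,               # off for books that have no columns
        "resolution_dpi": 400,          # instead of the 300 default
    },
    "trade_paperback": {
        "speckle_removal": False,
    },
}


def starting_profile(book_type):
    """Return a copy of the suggested starting settings for a book type."""
    return dict(STARTING_PROFILES[book_type])


def brightness_in_suggested_range(value, book_type="mass_market"):
    """Check a brightness value against the suggested range for that book type."""
    low, high = STARTING_PROFILES[book_type]["brightness_range"]
    return low <= value <= high


if __name__ == "__main__":
    # Print the mass market starting points one per line.
    for name, value in starting_profile("mass_market").items():
        print(f"{name}: {value}")
```

These are starting points only.  As the notes say, no single setting works for 
everything, so expect to adjust them for each book.  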

** FineReader, OmniPage, and Premiere Systems

For people who cannot afford OpenBook or Kurzweil, the plain commercial version 
of FineReader works with JAWS.  You can use FineReader 7 and 8 with JAWS, and 
you can use OmniPage 9, 10, and 12.  There is a learning curve, and you get a 
little spoiled by the blind-friendly features of both OpenBook and Kurzweil, 
but if money is an issue, FineReader is certainly an excellent choice.  It is 
the engine that most of us use in Kurzweil and OpenBook.  The FineReader people 
have done a lot to make their software very flexible and versatile.  You can 
have it scan and put your books directly into Microsoft Word.  It speaks well.  

The latest version of OmniPage is 15, and the latest version of FineReader is 
Version 8.  FineReader's interface is a little simpler to navigate with a 
screenreader, whether you use JAWS, Window-Eyes, System Access, NVDA, or 
Thunder.  OmniPage also has a decent interface.  There are a few tweaks and 
options that you need to remember.  Pratik Patel will tell you about these if 
you are interested.  

When it comes to settings, they are fairly similar to what you might expect in 
Kurzweil.  Only a few of Kurzweil's settings are proprietary.  Most of them 
rely on what the OCR (optical character recognition) engines provide.  The 
settings may be called something slightly different, for example, Automatic 
Contrast and Dynamic Thresholding, but they mean very much the same thing.  
Both products have very good documentation that will teach you a lot about 
their settings.  

Premiere Text Programming Cloner is very accessible and affordable, from $49 to 
$149.  It is a very simple program, but its results are so inferior to those of 
the other programs that it is not recommended for scanning books for Bookshare.  
They have upgraded with a few different pieces of scanning software that could 
be tested for quality.  They have demos, so try them out yourself to see what 
you think.  Premiere Assistive Technology's products are mainly designed for 
those who are learning disabled.  There are a few products from this company 
designed for blind people that are accessible to screenreaders, but their main 
focus is on learning disabilities.  There is heavy focus on tracking, lighting, 
and thesaurus features, so you may find that a lot of scanning features do not 
necessarily work as well for blind and visually impaired people.  

** Optic Book 3600

It is a book-edge scanner.  It is fantastic for books with pages so large that 
you cannot place two pages at a time on your scanner.  It is good for books 
that you are getting from a library when you don't want to break the binding.  
It is the fastest scanner seen in the $200 to $250 price range.  

The software is not really accessible.  Unfortunately you have to install the 
Optic Book software in order to operate the scanner, but Kurzweil will bypass 
that software to access the Optic Book.  There are optimal settings that you 
can use in Kurzweil with the Optic Book, such as using the WIA drivers rather 
than Twain, that will make your experience even better.  

The first time you set it up, you will need a little sighted assistance if you 
want to set the lamp to go off after 15 minutes.  The usual setting is for the 
lamp to go off after one minute.  You will definitely want that setting changed 
because one big downside to this scanner is that once the lamp goes off, it 
takes about 30 seconds to warm up again.  That is very important.  With the 
15-minute setting, the lamp will go off after 15 minutes of inactivity.  You 
get a good scan within about five to six seconds.  

Get the latest drivers from their site because they install more easily.  The 
latest ones are dated February of 2007.  You want the 3600 scanner.  You don't 
want the Plus.  The other scanners have software that we can't use and don't 
need.  Plug the scanner in first.  XP will come up with the New Hardware 
Wizard, and I pointed it at the driver folder, and away it went.  The Twain 
settings will work, but use the WIA settings because you can't pause the 
scanner with Twain.  It gives better results through Kurzweil than Epson.  This 
scanner was made specifically for scanning books, for text rather than 
graphics.  

Larry Lumpkin has a set of instructions that he got from Nick Dotson at 
Kurzweil.  Go into Windows Explorer, find Kurzweil Educational Systems, then 
find Diacs, and then there is a program in there called ScanConf.  You choose 
the scanner.  It has settings for document feeder, duplex, and holding the 
Twain source between scans (set that to never).  The scanners are listed in 
either their Twain version or their WIA version, and you can set the settings 
for the WIA version.  Within Kurzweil itself, go to the Scanner Settings and 
choose Optic Book 3600 WIA.  

The scanner itself does not come with a document feeder.  It should work with 
any version of Kurzweil back to at least version 8.  Even version 7 works.  
Economically, you should upgrade to the latest version of Kurzweil and keep 
using your old scanner before you try to purchase this scanner.  The only 
drawback to this scanner is that it takes about a minute to warm up when you 
turn it on, and it doesn't scan pictures or other things.  Grayscale is 
noticeably slower with the Optic Book.  If you use Grayscale, don't keep images 
in your files: do not enable the Keep Images in Recognized File setting.  

** Getting Your Books Approved and Not Rejected

1.  Periodically check for scanning clarity.  Every 15 or 20 pages, look at the 
last page of your file and see if the settings are still producing accurate 
results.  Is the material still clear?  When you are finished scanning the 
book, page down through all the pages and make sure that you have all of the 
pages in the book.  

2.  Prevention of Two Pages Sticking Together:  Sometimes it is easy for two 
pages to stick together and you skip one by accident.  Pay attention to the 
thickness of each page as you are scanning a specific book.  If pages stick 
together, they will likely feel thicker or a little more rigid, or they may 
feel different in some other way.  As you turn the page, rub your finger and 
thumb of one hand on opposite sides of the page as you hold it so that if two 
pages are stuck together, you will increase the chance that they will separate.  
Read the top of each page to see if the scanner picked up the number on that 
page.  If the scanner is reading your book's page numbers consistently, you 
will know whether you skipped a page that way.  Some books put page numbers at 
the bottom of the page, so check both the top and bottom of every page.  If you 
are using OpenBook, use the Scan and Read setting rather than scanning ahead so 
that you can keep track of page numbers.  If no page numbers in your book are 
announced, you can scan ahead because you won't hear page numbers anyway.  In 
Kurzweil, there is an Operator Page setting to tell Kurzweil what page you are 
on, so when you find the page in your book that begins content, you can tell 
Kurzweil that that is page one.  Use the setting that says Keep Blank Pages.  
This is especially important for books that have blank pages between chapters.  
This also helps you to keep track of page numbers.  This helps Bookshare as 
well with page numbers.  

3.  Check for the reason that a page has garbled text.  When a page is badly 
garbled, validators have to track down the submitter to get the original page 
that goes in the book.  If the validators or Bookshare staff cannot locate the 
submitter, Bookshare has to reject the book.  If a page has gibberish, rescan 
it to see whether the scan improves or whether the garbled material is a 
picture.  Once you get a good scan from many pages of your book, save 
those settings in Kurzweil for that book.  
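
The page-number check in item 2 amounts to looking for gaps in the sequence of 
numbers you hear.  Here is a minimal Python sketch of that idea.  It assumes 
that you have noted down the page numbers the scanner read, that every page in 
the book carries a number, and that blank pages were kept.  The function name 
is made up for this example.  

```python
def find_missing_pages(detected):
    """Return page numbers absent between the lowest and highest number heard."""
    if not detected:
        return []
    seen = set(detected)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


if __name__ == "__main__":
    # Page numbers noted while scanning; 4, 7, and 8 were never announced.
    heard = [1, 2, 3, 5, 6, 9]
    print(find_missing_pages(heard))  # prints [4, 7, 8]
```

A gap does not always mean a skipped page.  The scanner may simply have failed 
to read the number, so go back to the book and check before rescanning.  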

** Learn by Experimenting

Everyone's opinion is different.  Try the suggestions here and see what works 
best for you.  By being willing to experiment, you will develop a system of 
your own that is efficient and accurate and that saves you many hours of work 
and frustration in your scanning.  

