Hi Mary. Oh, I have done a small amount of scanning before. I have also
validated books, some of which have had no work done on cleaning them. I
would mostly agree with you that the junk characters are scanning errors
and not in the original. My point though is that this is not always the
case. Sometimes publishers do make mistakes, so you don't know that the
"perfect" printed copy is really perfect.
As for the old research archives, you are correct that we could pay for
access to each article just like anyone else. I was simply thinking that it
would be nice. I think you mentioned using Kurzweil previously. If so,
you can easily run recognition on images. Either print the image to the
virtual printer or open the image file directly. This works very well for .tif
files. I think other OCR packages have this feature also.
As for your comment on book scanning, no, I do not scan books. It takes
more effort than I care for. I can validate instead. Also, my Kurzweil
seems to constantly crash so I doubt if it would make it through an entire
book if it had to.
Most libraries do not have any OTR books, and if they do, they are most
likely not current. A bunch of them have been released in the past couple
of years but even for the public they are expensive. We are talking about
$35 per book. I really cannot justify that and I doubt if my library has
them. No, I am not going to look because I couldn't scan them anyway.