[bksvol-discuss] Re: Bookshare.org and PDF

  • From: "Tracy Carcione" <carcione@xxxxxxxxxx>
  • To: bksvol-discuss@xxxxxxxxxxxxx
  • Date: Mon, 26 Mar 2007 09:45:37 -0400 (EDT)

When you say it's easily converted to Daisy, do you also mean BRF?  I do
not want braille to be the poor forgotten stepchild it often seems to be.
Tracy

> Lisa brought to my attention some of the discussion about PDF happening on
> one of the Bookshare.org lists.  I just gave a presentation with Adobe
> at CSUN, and have posted the presentation below.
>
>
>
> The short story is that we should be able to turn tagged PDF into DAISY
> very easily, and that untagged PDF is still a big problem.  Some of
> the books we're getting from publishers are coming in as tagged PDF, so
> this is an attractive approach for us.
>
>
>
> Jim
>
>
>
>
> Accessible PDF to DAISY/NIMAS Conversion
>
>
> Jim Fruchterman, Benetech
>
>
> Andres Gonzalez, Mike Wirth, Adobe Systems Inc.
>
>
> March, 2007
>
>
>
>
>
> *            Bookshare and the PDF/XML Need
>
>
> *            Adobe's Acrobat Accessibility Work
>
>
> *            Technology Demonstration
>
>
>
>
>
> I. Bookshare and the PDF/XML Need
>
>
> Bookshare.org
>           A library of digital text
>
>
> *   31,000 accessible XML Books
>
>
> *   95% come from volunteer scanning
>
>
> *     Repository is growing at a rate of 400-500 books a month
>
>
> *   Increasing numbers coming directly from publishers and authors
>
>
> *     Better quality than scanned
>
>
> *     International permissions
>
>
> *     Format conversion challenge
>
>
>
>
>
> DAISY format
>           Digital Audio-based Information System
>
>
> *    The DAISY XML standard is our core format
>
>
> *    Books are read on a computer using synthetic speech
>
>
> *    NISO/DAISY 3.0 XML specification enables text-based navigation
> by units such as page numbers and paragraphs
>
>
> *    Think of it as a web page (HTML) plus a couple of extra tags (page
> numbers, chapters)
>
>
> *    NIMAS (the new K-12 accessible textbook standard in the U.S.) is
> based on DAISY
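
To make the "web page plus a couple of extra tags" description concrete, here is a minimal Python sketch. The fragment is a hand-written, simplified approximation of DAISY 3 (DTBook) markup, not a validated file: it leaves out namespaces and required metadata, and the sample text and the function name are made up for illustration. It shows how heading and `pagenum` elements give a reading tool something to navigate by.

```python
import xml.etree.ElementTree as ET

# Simplified DTBook-style fragment (assumption: namespaces, DOCTYPE and
# required head metadata are left out to keep the sketch short).
SAMPLE = """
<dtbook>
  <book>
    <bodymatter>
      <level1>
        <h1>Chapter One</h1>
        <pagenum id="p1">1</pagenum>
        <p>First paragraph of the chapter.</p>
        <pagenum id="p2">2</pagenum>
        <p>Second paragraph, on the next page.</p>
      </level1>
      <level1>
        <h1>Chapter Two</h1>
        <pagenum id="p3">3</pagenum>
        <p>Another paragraph.</p>
      </level1>
    </bodymatter>
  </book>
</dtbook>
"""


def navigation_points(xml_text):
    """Return (kind, label) pairs for headings and page numbers, in document order."""
    root = ET.fromstring(xml_text)
    points = []
    for elem in root.iter():
        if elem.tag == "h1":
            points.append(("chapter", elem.text.strip()))
        elif elem.tag == "pagenum":
            points.append(("page", elem.text.strip()))
    return points


if __name__ == "__main__":
    for kind, label in navigation_points(SAMPLE):
        print(kind, label)
```

A player could use such a list to jump straight to a chapter or a print page, which plain HTML paragraphs alone would not allow.
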
>
>
>
>
>
> Need to convert PDF to XML
>           Transforming visual to accessible
>
>
> *   Most publishers are able to create PDFs of their books
>
>
> *   Goal is a smooth transformation from accessible PDF to DAISY
>
>
> *    Turning PDF books and documents into highly accessible DAISY
>
>
>
>
>
> PDF Structure and Accessibility
>          Tagged PDF
>
>
> *    Became part of the PDF specification with Acrobat 5, motivated by:
>
>
> *     eBooks
>
>
> *     Accessibility
>
>
> *    Adds document structure and logical order to PDF:
>
>
> *     Pages, paragraphs, tables
>
>
> *     Reading order
>
>
> *    Can be semantically rich
>
>
> *    Preserves PDF visual fidelity and portability
>
>
> Creating Tagged PDF
>             Tagging PDF
>
>
> *   MakeAccessible
>
>
> *     Automatically adds tags to an existing untagged PDF
>
>
> *   TouchUp
>
>
> *     Allows authors to add and correct tagging
>
>
> *   Accessibility checker
>
>
> *     Checks for common tagging problems and suggests how to fix them
>
>
> *   PDFMaker
>
>
> *     Creates tagged PDF from other authoring applications
>
>
>
>
>
> Tagged PDF to DAISY Conversion
>             PDDOM
>
>
> *   XML DOM-like representation of PDF
>
>
> *   Provides programmatic access to the tag structure of the PDF file
>
>
> *   Cornerstone for Acrobat's
>
>
> *     Assistive technologies support
>
>
> *     PDF conversion to XML
>
>
> Tagged PDF to DAISY Conversion
>             Acrobat SaveAsXML relies on PDDOM
>
>
>
>
> *   Scriptable XML parsing engine to produce different types of
> XML-based outputs
>
>
> *   Had to be extended to produce DAISY
>
>
> *     To include page numbers
>
>
> *      Layout and formatting information
>
>
> *     XML post-processing of SaveAsXML output produces DAISY
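
The post-processing step above can be pictured as a tag-renaming pass. The sketch below is only an illustration in Python: the input is a made-up XML export that borrows tagged PDF's standard structure names (`Sect`, `H1`, `P`), while the `PageBreak` element and the `TAG_MAP` table are hypothetical. None of this reflects the actual SaveAsXML output or Adobe's scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: element names follow tagged PDF structure types,
# but the real SaveAsXML output format is not shown in the presentation.
EXPORT = """
<Document>
  <Sect>
    <H1>Chapter One</H1>
    <PageBreak number="1"/>
    <P>First paragraph.</P>
  </Sect>
</Document>
"""

# Assumed mapping from PDF structure names to DTBook-style element names.
TAG_MAP = {"Document": "bodymatter", "Sect": "level1", "H1": "h1", "P": "p"}


def to_daisy(elem):
    """Recursively copy elem, renaming tags and turning page breaks into pagenum."""
    if elem.tag == "PageBreak":
        out = ET.Element("pagenum")
        out.text = elem.get("number")
        out.tail = elem.tail
        return out
    # Tags without a mapping are copied under their original name.
    out = ET.Element(TAG_MAP.get(elem.tag, elem.tag))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_daisy(child))
    return out


def convert(xml_text):
    """Parse the export and return the root of the converted tree."""
    return to_daisy(ET.fromstring(xml_text))


if __name__ == "__main__":
    print(ET.tostring(convert(EXPORT), encoding="unicode"))
```

Keeping the mapping in a plain table is a design convenience for the sketch: structure names the table does not know about pass through unchanged, so gaps in the mapping are easy to spot in the output.
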
>
>
>
>
>
> Demonstration of Current Prototype Technology
>                 Demo of a real eBook from a major publisher
>
>
> *    Starting point: A tagged PDF novel
>
>
> *     Taliesin by Stephen Lawhead
>
>
> *    Using Acrobat, Save As XML
>
>
> *     We have modified it to save DAISY tags
>
>
> *     Converts PDF tags into equivalent XML
>
>
> *    Next, need to add key metadata
>
>
> *     For DAISY, four main fields
>
>
> *    Finally, create the DAISY ebook
>
>
> *    Show it in gh Player
>
>
> Conclusion
>
>
>
> *     PDF-to-XML technology works well on most tagged PDF documents
>
>
> *     Needs to be wrapped into a user-friendly package
>
>
> *     Web-based service for schools and qualified users
>
>
> *     Tool for Acrobat users
>
>
> *     Need to complete work and testing
>
>
> *     Complex books (textbooks) will still take human intervention to
> create fully compliant NIMAS
>
>
> *     Existence of this capability should serve to drive increased
> creation of accessible PDF
>
>
> Find out More
>
>
> Adobe Accessibility: www.adobe.com/accessibility <http://www.adobe.com/accessibility>
>
> Bookshare.org: www.bookshare.org <http://www.bookshare.org/>
>
>
> Mike Wirth: mwirth@xxxxxxxxx <mailto:mwirth@xxxxxxxxx>
>
> Andres Gonzalez: andgonza@xxxxxxxxx <mailto:andgonza@xxxxxxxxx>
>
> Lisa Friendly: 650-644-3420, Lisa.f@xxxxxxxxxxxx <mailto:Lisa.f@xxxxxxxxxxxx>
>
> Jim Fruchterman: jim@xxxxxxxxxxxx <mailto:jim@xxxxxxxxxxxx>
>
>
>
>
>

