
Debenu Quick PDF Library - PDF SDK Community Forum Homepage

How to get internal structure of the PDF file?

saravanan6 (Beginner, India), posted 10 Jan 12 at 5:36AM
Hi All,

    I would like to know whether there is any tool for getting the internal structure of a PDF file in an XML-based form, similar to the Open XML representation used by MS Office 2007.

Please enlighten me on this.


Thanks & Regards,
P.SARAVANAN

Ingo (Moderator Group), posted 10 Jan 12 at 7:46AM
Hi!
 
You can build such a tool with QuickPDF, but QP doesn't offer
ready-made functionality for this purpose.
You could try PDFCosEdit: its demo mode already lets you
browse through the PDF objects.
Please keep in mind that PDF is a presentation format, so the objects
are simply stored one after another and are not organized into a clean hierarchy.
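
To illustrate the point that the objects just follow one another in the file, here is a minimal Python sketch (plain Python, not QuickPDF code) that scans raw PDF bytes for "N G obj" headers. The function name list_indirect_objects and the SAMPLE bytes are made up for this example. It assumes a classic, uncompressed file layout: objects packed into object streams will be missed, and binary stream data could produce false hits.

```python
import re

# Matches the header of an indirect object: "<number> <generation> obj".
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")

def list_indirect_objects(data: bytes):
    """Return (object number, generation, byte offset) tuples in file order."""
    return [(int(m.group(1)), int(m.group(2)), m.start())
            for m in OBJ_HEADER.finditer(data)]

# Hypothetical, hand-written fragment of a PDF body (not a complete file).
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
)

for num, gen, offset in list_indirect_objects(SAMPLE):
    print(f"object {num} {gen} at byte {offset}")
```

The result is a flat list in file order. Any hierarchy only appears once the references between the objects ("2 0 R" and so on) are followed.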
 
Cheers and welcome here,
Ingo
 
edvoigt (Senior Member, Berlin, Germany), posted 10 Jan 12 at 9:08AM
Hi,

The structures inside a PDF (dictionaries, arrays, objects) are for the most part not XML. Only some parts (XMP, XFA) are XML. Investigating a PDF is rather complicated, because the structure is only roughly tree-like. To save resources, the same object can be referenced from several different places in the PDF. So the obvious first idea, representing the PDF structure as a tree, does not reflect reality. In truth it is a graph, and therefore it cannot be expressed as nested XML without transforming the structure (duplicating or inlining the shared parts).
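
The tree-versus-graph problem can be sketched in a few lines of Python. The object table below is invented for illustration (it is not read from a real PDF, and nothing here is QuickPDF API): one font object is referenced from two pages, and each page points back to its parent. One possible way to avoid duplicating shared objects is to emit every object once in a flat list and write references as separate "ref" elements, much like PDF itself writes "N 0 R".

```python
import xml.etree.ElementTree as ET

# Hypothetical object table: object number -> dictionary.
# ("ref", n) stands for an indirect reference to object n.
OBJECTS = {
    1: {"Type": "Catalog", "Pages": ("ref", 2)},
    2: {"Type": "Pages", "Kids": [("ref", 3), ("ref", 4)]},
    3: {"Type": "Page", "Parent": ("ref", 2), "Font": ("ref", 5)},
    4: {"Type": "Page", "Parent": ("ref", 2), "Font": ("ref", 5)},
    5: {"Type": "Font", "BaseFont": "Helvetica"},
}

def add_value(parent, value):
    """Append an XML element for one value; references are never expanded."""
    if isinstance(value, tuple) and value[0] == "ref":
        ET.SubElement(parent, "ref", id=str(value[1]))
    elif isinstance(value, list):
        array = ET.SubElement(parent, "array")
        for item in value:
            add_value(array, item)
    elif isinstance(value, dict):
        d = ET.SubElement(parent, "dict")
        for key, item in value.items():
            entry = ET.SubElement(d, "entry", key=key)
            add_value(entry, item)
    else:
        ET.SubElement(parent, "value").text = str(value)

def to_xml(objects):
    """Emit every object exactly once as a flat list under <pdf>."""
    root = ET.Element("pdf")
    for number, body in objects.items():
        obj = ET.SubElement(root, "obj", id=str(number))
        add_value(obj, body)
    return root

print(ET.tostring(to_xml(OBJECTS), encoding="unicode"))
```

Because of the Parent/Kids cycle, naively nesting each referenced object inside its referrer would never terminate. The flat list sidesteps that, at the price of an XML document that is still a graph in disguise.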

So I think there is no such tool as you want, unless you build it yourself. It would be possible with QuickPDF (GetPageContent..., GetObject...), but it is more than a half-hour job, depending on how deep you want to look inside.

To see the internals, there is a free dialog-based tool, the Enfocus Browser.
Quoted from http://www.enfocus.com/product.php?id=4530:
To the knowledgeable user, the Enfocus Browser offers functionality to get information on all types of objects (Dictionaries, Arrays, Streams, ...) in the PDF. Starting with the "Info" and "Root" dictionaries, the application resolves all indirect object references while you dig deeper into the data structure. If enabled, relevant parts of the data from a PDF file can even be altered, offering very low-level editing capabilities.

Another program that lets you look inside a PDF (though with a different goal) can be found here: http://blog.zeltser.com/post/3235995383/pdf-stream-dumper-malicious-file-analysis . It is made for looking deeper into the file, but it is dialog-based too.

Hope this helps,

Werner


Edited by edvoigt - 10 Jan 12 at 9:09AM