GetContentStreamToVariant - strange
mLipok
Senior Member Joined: 23 Apr 14 Location: Poland, Zabrze Status: Offline Points: 453
Posted: 17 May 14 at 10:29AM
I have a document from my bank, an order confirmation. When I retrieve the text using ExtractFilePageText, I receive 1701 bytes of text. But when I use:

BinaryToString($oQP.GetContentStreamToVariant())

I receive only a short piece:

0x710A3130203020302031302030202D312E3532373620636D0A302E312030203020302E312030203020636D0A2F517569636B504446584F663538326338643320446F0A510A

Below is the string representation of this binary data:

q
10 0 0 10 0 -1.5276 cm
0.1 0 0 0.1 0 0 cm
/QuickPDFXOf582c8d3 Do
Q

Remarks: When I call ContentStreamCount, the result is 1.

QUESTIONS: What am I doing wrong? Why does GetContentStreamToVariant not return all of the PDF content?
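For readers who want to check the dump independently of AutoIt, here is a minimal Python sketch that turns the hex string from the post back into text. The helper name decode_content_stream is made up for illustration, and decoding as Latin-1 is an assumption that is safe here because the dump contains only ASCII bytes.

```python
# Hex dump copied verbatim from the post above, including the "0x" prefix.
hex_dump = (
    "0x710A3130203020302031302030202D312E3532373620636D0A"
    "302E312030203020302E312030203020636D0A"
    "2F517569636B504446584F663538326338643320446F0A510A"
)


def decode_content_stream(dump: str) -> str:
    """Convert an AutoIt-style hex dump ("0x...") into text.

    Hypothetical helper, not part of any library.
    """
    digits = dump[2:] if dump.lower().startswith("0x") else dump
    # Latin-1 maps every byte to one character, so decoding cannot fail.
    return bytes.fromhex(digits).decode("latin-1")


stream_text = decode_content_stream(hex_dump)
print(stream_text)
```

The decoded text is the same five-line content stream quoted in the post, which confirms that BinaryToString is not truncating anything.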
Ingo
Moderator Group Joined: 29 Oct 05 Status: Offline Points: 3524
Hi,
you have to select the relevant content stream first.
Cheers,
Ingo
mLipok
Senior Member Joined: 23 Apr 14 Location: Poland, Zabrze Status: Offline Points: 453
If I understand you correctly, you mean something like this:

$oQP.LoadFromFile($sFileName, '')
Local $sOutput = ''
; Walk every content stream part of the selected page
For $i = 1 To $oQP.ContentStreamCount()
    $oQP.SelectContentStream($i)
    ; &= appends to $sOutput, so earlier parts are kept
    $sOutput &= BinaryToString($oQP.GetContentStreamToVariant()) & @CRLF
Next
ClipPut($sOutput)

Result:

q
10 0 0 10 0 -1.5276 cm
0.1 0 0 0.1 0 0 cm
/QuickPDFXOf582c8d3 Do
Q

Have I understood it correctly, or am I still doing something wrong? I can send this single PDF file by email (it contains only one page of my private data).
Ingo
Moderator Group Joined: 29 Oct 05 Status: Offline Points: 3524
What about reading the description of SelectContentStream first?
There you can read "...This function selects one of the selected page's content stream parts....". The selected page's... you've selected no page, I think. And I don't know the dev language you're using, but I fear that Output will always contain only the very last content stream.
Cheers,
Ingo
mLipok
Senior Member Joined: 23 Apr 14 Location: Poland, Zabrze Status: Offline Points: 453
&= is the concatenation assignment operator: http://www.autoitscript.com/autoit3/docs/intro/lang_operators.htm
So, as you can see, it appends the new string without deleting the previous content. But I still think the problem is here: "When I use $oQP.ContentStreamCount() then I get in results = 1". So there is only one content stream on this page, and this file has only one page.
AndrewC
Moderator Group Joined: 08 Dec 10 Location: Geelong, Aust Status: Offline Points: 841
mLipok,
That is a perfectly valid content stream. The PDF content is contained in an XObject. CapturePage and DrawCapturedPage make use of XObjects. XObjects are like macros and can be nested, rotated and scaled before being drawn. This is all documented in the PDF specification. You can also call QP.CombineContentStreams to combine multiple content streams into a single content stream. This will not help you in this case, though.
Andrew.
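To make the "XObjects are like macros" point concrete, the following Python sketch lists the XObjects a content stream invokes with the Do operator. The function name xobject_invocations is hypothetical, and splitting on whitespace is a deliberate simplification: a real content stream parser also has to handle string literals, comments and inline images.

```python
from typing import List


def xobject_invocations(stream_text: str) -> List[str]:
    """Return the names of XObjects drawn with the Do operator.

    Naive whitespace tokenizer, for illustration only.
    """
    tokens = stream_text.split()
    names = []
    for i, token in enumerate(tokens):
        # "Do" takes one operand: the resource name that precedes it.
        if token == "Do" and i > 0 and tokens[i - 1].startswith("/"):
            names.append(tokens[i - 1][1:])
    return names


# The content stream quoted earlier in this thread.
sample = (
    "q\n"
    "10 0 0 10 0 -1.5276 cm\n"
    "0.1 0 0 0.1 0 0 cm\n"
    "/QuickPDFXOf582c8d3 Do\n"
    "Q\n"
)
invoked = xobject_invocations(sample)
print(invoked)
```

For the stream in this thread the only drawing operation is a single Do, so all visible page content, including the text, lives inside that one XObject rather than in the page's own content stream.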
mLipok
Senior Member Joined: 23 Apr 14 Location: Poland, Zabrze Status: Offline Points: 453
It is puzzling, then: why does the ExtractFilePageText() function get the full content of this page?
AndrewC
Moderator Group Joined: 08 Dec 10 Location: Geelong, Aust Status: Offline Points: 841
mLipok,
It is because the text extraction routines recursively parse the XObjects and take care of all the scaling and translation that may occur. Proper text extraction is a complex process. Our text extraction actually uses the page rendering code to extract the text. There is no simple way to extract text from all PDF files, because there are many different ways to draw text in a PDF.
Andrew.
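As a rough illustration of the "scaling and translation" bookkeeping, this Python sketch tracks the current transformation matrix (CTM) through the q, Q and cm operators and reports it at each Do. It is not Debenu's implementation; the names multiply and ctm_at_do are hypothetical, and it assumes the PDF convention that cm pre-multiplies the existing CTM by the new matrix.

```python
from typing import List, Tuple

# A PDF matrix [a b c d e f] stands for the 3x3 matrix
# [[a, b, 0], [c, d, 0], [e, f, 1]] acting on row vectors [x, y, 1].
Matrix = Tuple[float, float, float, float, float, float]
IDENTITY: Matrix = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m: Matrix, n: Matrix) -> Matrix:
    """Return the matrix product m x n in six-number form."""
    return (
        m[0] * n[0] + m[1] * n[2],
        m[0] * n[1] + m[1] * n[3],
        m[2] * n[0] + m[3] * n[2],
        m[2] * n[1] + m[3] * n[3],
        m[4] * n[0] + m[5] * n[2] + n[4],
        m[4] * n[1] + m[5] * n[3] + n[5],
    )


def ctm_at_do(stream_text: str) -> List[Matrix]:
    """Return the CTM in effect at every Do operator (naive tokenizer)."""
    tokens = stream_text.split()
    ctm = IDENTITY
    saved: List[Matrix] = []
    found: List[Matrix] = []
    for i, token in enumerate(tokens):
        if token == "q":            # save graphics state
            saved.append(ctm)
        elif token == "Q" and saved:  # restore graphics state
            ctm = saved.pop()
        elif token == "cm" and i >= 6:
            a, b, c, d, e, f = (float(t) for t in tokens[i - 6:i])
            ctm = multiply((a, b, c, d, e, f), ctm)
        elif token == "Do":
            found.append(ctm)
    return found


sample = "q\n10 0 0 10 0 -1.5276 cm\n0.1 0 0 0.1 0 0 cm\n/QuickPDFXOf582c8d3 Do\nQ\n"
matrices = ctm_at_do(sample)
print(matrices)
```

Under that assumption the two scalings in this thread's stream cancel out and only a vertical offset of -1.5276 remains when the XObject is drawn. A full extractor would then open the XObject's own stream and repeat the process recursively, which this sketch does not attempt.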
mLipok
Senior Member Joined: 23 Apr 14 Location: Poland, Zabrze Status: Offline Points: 453
Thanks for the clarification.
I still have one idea. I have to run some tests first. If they turn up any useful information, I will share it.
Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.