GetPageText truncates the text with Unicode char 65533
Printed From: Debenu Quick PDF Library - PDF SDK Community Forum
Category: For Users of the Library
Forum Name: I need help - I can help
Forum Description: Problems and solutions while programming with the Debenu Quick PDF Library and Debenu PDF Viewer SDK
URL: http://www.quickpdf.org/forum/forum_posts.asp?TID=2007
Topic: GetPageText truncates the text with Unicode char 65533
Posted By: bart_bender
Date Posted: 20 Oct 11 at 2:39PM
Hello, I'm extracting the text from a PDF document with the GetPageText method, and it returns truncated text: http://demos.sdm.es/marketing/images/doc.pdf
The last character is 65533 in Unicode, or 63 in ASCII.
I'm using the DLL version in a VB.NET project.
Any ideas?
Thanks in advance. Best regards
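For readers wondering about the two numbers above: code point 65533 is U+FFFD, the Unicode replacement character, and 63 is the ASCII code of "?", which is what U+FFFD typically turns into when text is forced into ASCII. The short Python sketch below only illustrates that relationship. The helper name `ends_with_replacement` is hypothetical and is not part of Quick PDF Library.

```python
# U+FFFD is the Unicode replacement character; decoders emit it
# when they meet bytes they cannot convert to a real character.
REPLACEMENT_CHAR = "\ufffd"


def ends_with_replacement(text: str) -> bool:
    """Return True when the text ends in U+FFFD (hypothetical helper)."""
    return text.endswith(REPLACEMENT_CHAR)


# The code point reported in the post.
code_point = ord(REPLACEMENT_CHAR)

# Forcing the character into ASCII substitutes "?", whose code is 63.
ascii_code = REPLACEMENT_CHAR.encode("ascii", errors="replace")[0]

print(code_point, ascii_code)
```

A trailing U+FFFD usually means the decoder hit a damaged or incomplete byte sequence, so a check like `ends_with_replacement` can flag suspicious extraction results.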
|
Replies:
Posted By: Ingo
Date Posted: 20 Oct 11 at 10:30PM
Hi Bart!
Where's the problem? I've extracted your form completely and it's all okay. I've removed the address ... I've done it with version 7.26:
FORMA DE PAGO: TRANSFERENCIA REF. DE PAGO : REF.: /001164/756661 NIF:A23453445 9364651 444,24 8 ESTERIL PL.1,5 160 1,870 299,20 0,967 144,48 PUNTO VERDE x 100 1,147 1,84 1,84 349 FRESA RACION PAK 6 4 5,180 20,72 2,608 10,29 PUNTO VERDE x 100 3,421 0,14 0,14 658 CACAOLAT SYS RAC 6 24 5,180 124,32 2,608 61,73 PUNTO VERDE x 100 3,421 0,82 0,82 TOTAL DE PUNTO VERDE *** 2,80 *** 146,32 72,98 04,00% 5,85 07,00% 5,11 447,04 227,74 10,96 230,26 GIRONA CONTRAVALOR CR NACIONAL II KM 7O8,2 EN PTA. P.P. 38312
|
Posted By: bart_bender
Date Posted: 21 Oct 11 at 12:49PM
Hello Ingo,
Thanks for your answer.
I'm using QuickPDFDLL0811.dll. I got this result with the argument 0:
EMISOR: GIRONA CR NACIONAL II KM 7O8,2 17181 AIGUAVIVA TEL.:972478000 PAG.: 1 YOIGO FECHA FACTURA: 02-01-2009 NOMBRE RAZON SOCIAL NUM. FACTURA : ( ) 0075123534 DIRECCION P. SUMINISTRO: 02-01-09 08006 BARCELONA FECHA DE PAGO: 05-04-09 BARCELONA FORMA DE PAGO: TRANSFERENCIA REF. DE PAGO : REF.: /001164/756661 NIF:A23453445
9364651 444,24
8 ESTERIL PL.1,5 160 1,870 299,20 0,967 144,48 PUNTO VERDE x 100 1,147 1,84 1,84 349 FRESA RACION PAK 6 4 5,180
|
Posted By: AndrewC
Date Posted: 24 Oct 11 at 8:03AM
There was a truncation bug in 8.11 that has been fixed in 8.12. The text extracts correctly in 8.12 using GetPageText options 0, 3 and 7. 8.12 is a free upgrade for owners of 8.11. The bug was caused by the recent major Unicode changes made between 7.26 and 8.11.
Interestingly, your PDF is 705mm wide by 998mm high, which seems a bit big for an invoice. This currently affects the output of 8.12 with option 7.
Andrew.
|
Posted By: bart_bender
Date Posted: 24 Oct 11 at 2:19PM
Thanks for your help Andrew
|
Posted By: bart_bender
Date Posted: 24 Oct 11 at 3:07PM
Hello Andrew
I'm testing the new version and I still get truncated text with 8.12, the same as with 8.11.
The size returned by dll.QuickPDFStringResultLength(instanceID) is half the real length of the text.
In version 8.12 it is necessary to change two lines (marked 1 and 2) in the SR method of the Visual Basic class:
Private Function SR(ByVal data As IntPtr) As String
    Dim size As Integer = dll.QuickPDFStringResultLength(instanceID)   ' <-- line 1
    Dim result As Byte() = New Byte(size - 1) {}
    Marshal.Copy(data, result, 0, size)
    Return Encoding.Default.GetString(result)   ' <-- line 2
End Function
The changed lines are:
1 ->> Dim size As Integer = dll.QuickPDFStringResultLength(instanceID) * 2
2 ->> Return Encoding.Unicode.GetString(result)
Best Regards
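The "half the real text" symptom is what you would expect if the length is reported in characters while the buffer holds UTF-16 data at two bytes per character. The Python sketch below only illustrates that arithmetic. It assumes the native buffer is UTF-16LE and that the reported length is a character count, neither of which is confirmed here beyond what the thread describes. The names `copy_and_decode` and `native_buffer` are made up for the illustration.

```python
def copy_and_decode(native_buffer: bytes, byte_count: int) -> str:
    """Copy byte_count bytes out of the buffer and decode them as UTF-16LE."""
    copied = native_buffer[:byte_count]
    return copied.decode("utf-16-le", errors="replace")


text = "FORMA DE PAGO: TRANSFERENCIA"

# Assumption: the library hands back UTF-16LE, two bytes per character here.
native_buffer = text.encode("utf-16-le")

# Assumption: the length is reported in characters, not bytes.
reported_length = len(text)

# Copying only `reported_length` bytes yields half of the text.
truncated = copy_and_decode(native_buffer, reported_length)

# Copying twice as many bytes yields all of it.
complete = copy_and_decode(native_buffer, reported_length * 2)

print(repr(truncated))
print(repr(complete))
```

Doubling the byte count and decoding as UTF-16 corresponds to the two changed lines above.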
|
Posted By: AndrewC
Date Posted: 26 Oct 11 at 3:25AM
Thanks for the fixes. I will test it and then get these changes into the VB code. I was doing all my testing with C# and it is working correctly.
The C# code for your reference is
private string SR(IntPtr data)
{
    int size = dll.QuickPDFStringResultLength(instanceID);
    byte[] result = new byte[size * 2];
    Marshal.Copy(data, result, 0, size * 2);
    return Encoding.Unicode.GetString(result);
}
so in VB it probably should be
Private Function SR(ByVal data As IntPtr) As String
    Dim size As Integer = dll.QuickPDFStringResultLength(instanceID)
    Dim result As Byte() = New Byte(size * 2 - 1) {} ' as per post below
    Marshal.Copy(data, result, 0, size * 2)
    Return Encoding.Unicode.GetString(result)
End Function
Andrew.
|
Posted By: bart_bender
Date Posted: 26 Oct 11 at 10:54AM
Hello,
Change the code line
    Dim result As Byte() = New Byte(size * 2) {}
to
    Dim result As Byte() = New Byte(size * 2 - 1) {}
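The "- 1" matters because a Visual Basic array declaration takes the upper bound, not the element count, so New Byte(size * 2) {} allocates size * 2 + 1 bytes. The extra byte stays zero and gets decoded along with the real data. The Python sketch below shows what a UTF-16 decoder with a "replace" error policy does with such a dangling byte. That .NET's Encoding.Unicode behaves the same way is an assumption worth verifying, and `decode_utf16` is a hypothetical helper name.

```python
def decode_utf16(raw: bytes) -> str:
    """Decode UTF-16LE, substituting U+FFFD for incomplete sequences."""
    return raw.decode("utf-16-le", errors="replace")


payload = "OK".encode("utf-16-le")  # 4 bytes of real data

# Buffer with exactly the right number of bytes.
exact = bytearray(len(payload))
exact[:] = payload

# Buffer one byte too long, mimicking an off-by-one allocation;
# the extra trailing byte is left as zero.
oversized = bytearray(len(payload) + 1)
oversized[: len(payload)] = payload

exact_text = decode_utf16(bytes(exact))
oversized_text = decode_utf16(bytes(oversized))

print(repr(exact_text))
print(repr(oversized_text))
```

In this sketch the oversized buffer decodes to the real text followed by one replacement character, the same code point 65533 mentioned at the start of the topic.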
|
|