extract text |
kavaler (Beginner) Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Posted: 27 Jan 10 at 2:08PM
Hello,
I scanned a document and saved it as doc1.pdf. How can I extract the text from doc1.pdf in Delphi?

JanN (Senior Member) Joined: 29 Oct 05 Location: Germany Status: Offline Points: 116
Hi,
QuickPDF cannot extract text from image-only PDF files. For that you will need a dedicated OCR tool such as OmniPage or ABBYY.

kavaler (Beginner) Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Can you tell me what I should do for this purpose?

JanN (Senior Member) Joined: 29 Oct 05 Location: Germany Status: Offline Points: 116
If I were you, I would buy OmniPage Professional (a version is available online for around $100). It can recognize the text in scanned documents and convert them to searchable PDF files or to text files.

kavaler (Beginner) Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
How can I use it (OmniPage Professional) from Delphi?

JanN (Senior Member) Joined: 29 Oct 05 Location: Germany Status: Offline Points: 116
Google is your friend... ;)
OmniPage Professional is a standalone product. You can configure it to pick up files from a specified folder and write the converted files to another folder. Then you can work with the resulting files in Delphi.
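
The Delphi side of that folder workflow might look roughly like the sketch below. It assumes the OCR tool has been set up to write plain text files into an output folder; the folder path `C:\ocr\out\` is a made-up placeholder, and nothing here talks to OmniPage itself.

```delphi
program ReadOcrOutput;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  // Hypothetical folder that the OCR tool writes its text files to.
  OutputDir = 'C:\ocr\out\';

var
  SR: TSearchRec;
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Enumerate every *.txt file in the output folder.
    if FindFirst(OutputDir + '*.txt', faAnyFile, SR) = 0 then
    try
      repeat
        // Skip directories that happen to match the mask.
        if (SR.Attr and faDirectory) = 0 then
        begin
          Lines.LoadFromFile(OutputDir + SR.Name);
          // Lines.Text now holds the recognized text of this file.
          Writeln(SR.Name, ': ', Lines.Count, ' lines');
        end;
      until FindNext(SR) <> 0;
    finally
      FindClose(SR);
    end;
  finally
    Lines.Free;
  end;
end.
```

The program only reads what is already in the folder; waiting for new files to appear (polling or a change notification) would be a separate step.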

Shotgun Tom (Senior Member) Joined: 14 Aug 09 Location: Phoenix, AZ Status: Offline Points: 53
Be aware that standalone products like OmniPage are not redistributable. If you are doing this only for your own use, they will work fine.
If, however, you are creating a program for others, you'll need to obtain an SDK that may be distributed with your product. The SDKs are quite a bit more expensive: you would be looking at prices between $400 and $3000.
Tom

Ingo (Moderator Group) Joined: 29 Oct 05 Status: Offline Points: 3529
Hi Tom!
There's a better (less expensive) method: kavaler's Delphi solution combined with a copy of OmniPage Pro ;-)
Cheers, Ingo

dsola (Team Player) Joined: 28 Oct 05 Location: Croatia Status: Offline Points: 34
Hi,
Try this: http://jocr.sourceforge.net/index.html
It's free, and the results were satisfactory for me. If the document is scanned well and uses "normal" fonts, this may be enough. If you need a working example, just ask.

registered QuickPDF user

kavaler (Beginner) Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Hello,
Can you tell me how I can use it in Delphi?

dsola (Team Player) Joined: 28 Oct 05 Location: Croatia Status: Offline Points: 34
Hi,
I'll post Delphi code shortly, but for now try this. This is the content of the test.cmd file:

rem begin
djpeg -grayscale -pnm YourPictureName.jpg YourPictureName.pnm
gocr -i YourPictureName.pnm -o YourExtractedText.txt
rem end

test.cmd, djpeg.exe, gocr.exe and YourPictureName.jpg must all be in the same directory. With this you can test whether this method satisfies your needs.

registered QuickPDF user

dsola (Team Player) Joined: 28 Oct 05 Location: Croatia Status: Offline Points: 34
Here is part of the Delphi code:

// ttt.pbm - image for OCR (a JPG can be converted to PBM with TFreeBitmap or with djpeg.exe)
// ttt.txt - result of OCR
// Needs Windows, SysUtils and Forms in the uses clause.
procedure TOIBCaptchaKiller.Do_OCR;
var
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  cmdLine: array[0..512] of Char;
  lpExitCode: DWORD;
begin
  FillChar(StartupInfo, SizeOf(StartupInfo), 0);
  StartupInfo.cb := SizeOf(StartupInfo);
  // wShowWindow is only honoured when STARTF_USESHOWWINDOW is set
  StartupInfo.dwFlags := STARTF_USESHOWWINDOW;
  StartupInfo.wShowWindow := SW_SHOWNORMAL; // or SW_HIDE
  // CreateProcess may modify the command line, so pass a writable buffer
  StrPCopy(cmdLine, 'gocr -i ttt.pbm -o ttt.txt');
  if CreateProcess(nil, cmdLine, nil, nil, False, 0, nil, nil, StartupInfo, ProcessInfo) then
  begin
    GetExitCodeProcess(ProcessInfo.hProcess, lpExitCode);
    while lpExitCode = STILL_ACTIVE do
    begin
      // do not continue until the launched application has exited
      Sleep(100);
      Application.ProcessMessages;
      GetExitCodeProcess(ProcessInfo.hProcess, lpExitCode);
    end;
    CloseHandle(ProcessInfo.hProcess);
    CloseHandle(ProcessInfo.hThread);
  end;
end;
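
Where there is no form whose message loop has to stay responsive (a console tool or a worker thread, say), a variation of the routine above could block on the process handle and then load the result file. This is only a sketch under assumptions: `OcrImageToText` is a hypothetical helper name, gocr.exe is assumed to be in the working directory or on the PATH, and the `-i`/`-o` options are the ones used earlier in this thread.

```delphi
program OcrDemo;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, Classes;

// Hypothetical helper: runs gocr on ImageFile, waits for it to finish,
// and returns the contents of TextFile (empty if no file was produced).
function OcrImageToText(const ImageFile, TextFile: string): string;
var
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  CmdLine: string;
  Lines: TStringList;
begin
  Result := '';
  CmdLine := Format('gocr -i "%s" -o "%s"', [ImageFile, TextFile]);
  UniqueString(CmdLine); // CreateProcess may write to the command-line buffer

  FillChar(StartupInfo, SizeOf(StartupInfo), 0);
  StartupInfo.cb := SizeOf(StartupInfo);
  StartupInfo.dwFlags := STARTF_USESHOWWINDOW;
  StartupInfo.wShowWindow := SW_HIDE;

  if not CreateProcess(nil, PChar(CmdLine), nil, nil, False, 0, nil, nil,
    StartupInfo, ProcessInfo) then
    RaiseLastOSError;
  try
    // Block until gocr has exited instead of polling the exit code.
    WaitForSingleObject(ProcessInfo.hProcess, INFINITE);
  finally
    CloseHandle(ProcessInfo.hProcess);
    CloseHandle(ProcessInfo.hThread);
  end;

  if FileExists(TextFile) then
  begin
    Lines := TStringList.Create;
    try
      Lines.LoadFromFile(TextFile);
      Result := Lines.Text;
    finally
      Lines.Free;
    end;
  end;
end;

begin
  Writeln(OcrImageToText('ttt.pbm', 'ttt.txt'));
end.
```

Blocking with WaitForSingleObject would freeze a GUI if called from the main thread, which is presumably why the posted routine polls and calls Application.ProcessMessages instead.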

registered QuickPDF user

kavaler (Beginner) Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Hello,
I can't understand how to use it in Delphi. My e-mail address is islam261@gmail.com. Please send me an example (or examples) of this.