extract text
kavaler
Beginner
Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Topic: extract text
Posted: 27 Jan 10 at 2:08PM

Hello,
I scanned a document and saved it as doc1.pdf. How can I extract the text from doc1.pdf in Delphi?
JanN
Senior Member
Joined: 29 Oct 05 Location: Germany Status: Offline Points: 116
Posted: 27 Jan 10 at 2:31PM

Hi,
Quick PDF Library cannot extract text from image-only PDF files. For that you will need a dedicated OCR tool such as OmniPage or ABBYY.
kavaler
Beginner
Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Posted: 27 Jan 10 at 4:02PM

Can you tell me what I should do for this purpose?
JanN
Senior Member
Joined: 29 Oct 05 Location: Germany Status: Offline Points: 116
Posted: 27 Jan 10 at 4:13PM

If I were you, I would buy OmniPage Professional (a version is available for around $100 online). It can recognize the text in scanned documents and convert them to searchable PDF files or to text files.
kavaler
Beginner
Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Posted: 27 Jan 10 at 4:21PM

How can I use OmniPage Professional from Delphi?
JanN
Senior Member
Joined: 29 Oct 05 Location: Germany Status: Offline Points: 116
Posted: 27 Jan 10 at 4:33PM

Google is your friend... ;)
OmniPage Professional is a standalone product. You can configure it to pick up files from a specified folder and write the converted files to another folder. You can then work with the resulting files in Delphi.
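
As a rough illustration of the last step, here is a minimal Delphi sketch that reads every text file found in the OCR tool's output folder. The folder path `C:\ocr\out` and the helper name `LoadTextFiles` are hypothetical, and the sketch assumes the tool writes plain `.txt` files and has finished writing them before they are read.

```pascal
program ReadConvertedFiles;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Appends the contents of every .txt file in Folder to Dest and
// returns the number of files read. (Hypothetical helper name.)
function LoadTextFiles(const Folder: string; Dest: TStrings): Integer;
var
  SearchRec: TSearchRec;
  Dir: string;
  OneFile: TStringList;
begin
  Result := 0;
  Dir := IncludeTrailingPathDelimiter(Folder);
  OneFile := TStringList.Create;
  try
    if FindFirst(Dir + '*.txt', faAnyFile, SearchRec) = 0 then
    try
      repeat
        // Skip directories that happen to match the mask.
        if (SearchRec.Attr and faDirectory) = 0 then
        begin
          OneFile.LoadFromFile(Dir + SearchRec.Name);
          Dest.AddStrings(OneFile);
          Inc(Result);
        end;
      until FindNext(SearchRec) <> 0;
    finally
      FindClose(SearchRec);
    end;
  finally
    OneFile.Free;
  end;
end;

var
  AllText: TStringList;
begin
  AllText := TStringList.Create;
  try
    // 'C:\ocr\out' is an assumed path; use the output folder
    // configured in the OCR tool.
    Writeln(LoadTextFiles('C:\ocr\out', AllText), ' file(s) read');
    Writeln(AllText.Text);
  finally
    AllText.Free;
  end;
end.
```

Loading a file can fail while the OCR tool still has it open, so a real program would probably retry or wait before reading.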
Shotgun Tom
Senior Member
Joined: 14 Aug 09 Location: Phoenix, AZ Status: Offline Points: 53
Posted: 27 Jan 10 at 5:42PM

Be aware that standalone products like OmniPage are not redistributable. If you are doing this only for your own use, they will work fine.
If, however, you are creating a program for others, you'll need to obtain an SDK that may be distributed with your product. The SDKs are quite a bit more expensive: you would be looking at prices that range between $400 and $3000.
Tom
Ingo
Moderator Group
Joined: 29 Oct 05 Status: Offline Points: 3530
Posted: 27 Jan 10 at 9:04PM

Hi Tom!
There's a better (less expensive) method... The Delphi solution from kavaler plus an additional OmniPage Pro ;-)
Cheers, Ingo
dsola
Team Player
Joined: 28 Oct 05 Location: Croatia Status: Offline Points: 34
Posted: 05 Feb 10 at 12:50PM

Hi,
Try this: http://jocr.sourceforge.net/index.html
It's free, and the results were satisfactory for me. If the document is scanned well and uses "normal" fonts, this may be enough. If you need a working example, just ask.
registered QuickPDF user
kavaler
Beginner
Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Posted: 06 Feb 10 at 9:48AM

Hello,
Can you tell me how I can use it in Delphi?
dsola
Team Player
Joined: 28 Oct 05 Location: Croatia Status: Offline Points: 34
Posted: 08 Feb 10 at 8:23AM

Hi,
I'll post Delphi code shortly, but for now try this. This is the content of the test.cmd file:

    rem begin
    djpeg -grayscale -pnm YourPictureName.jpg YourPictureName.pnm
    gocr -i YourPictureName.pnm -o YourExtractedText.txt
    rem end

test.cmd, djpeg.exe, gocr.exe and YourPictureName.jpg are together in the same directory. With this you can test whether this method satisfies your needs.
|
dsola
Team Player
Joined: 28 Oct 05 Location: Croatia Status: Offline Points: 34
Posted: 09 Feb 10 at 7:13AM

Here is part of the Delphi code:

    // ttt.pbm - image for OCR (a JPG can be converted to PBM with TFreeBitmap or with djpeg.exe)
    // ttt.txt - result of OCR
    procedure TOIBCaptchaKiller.Do_OCR;
    var
      StartupInfo: TStartupInfo;
      ProcessInfo: TProcessInformation;
      cmdLine: array[0..512] of Char;
      lpExitCode: DWORD;
    begin
      FillChar(StartupInfo, SizeOf(StartupInfo), 0);
      StartupInfo.cb := SizeOf(StartupInfo);
      // wShowWindow is only honored when STARTF_USESHOWWINDOW is set
      StartupInfo.dwFlags := STARTF_USESHOWWINDOW;
      StartupInfo.wShowWindow := SW_SHOWNORMAL; // or SW_HIDE
      // CreateProcess may modify the command line, so pass a writable buffer
      StrPCopy(cmdLine, 'gocr -i ttt.pbm -o ttt.txt');
      if CreateProcess(nil, cmdLine, nil, nil, False, 0, nil, nil,
        StartupInfo, ProcessInfo) then
      begin
        GetExitCodeProcess(ProcessInfo.hProcess, lpExitCode);
        while lpExitCode = STILL_ACTIVE do
        begin
          // stay here until the OCR process has exited
          Sleep(100);
          Application.ProcessMessages;
          GetExitCodeProcess(ProcessInfo.hProcess, lpExitCode);
        end;
        CloseHandle(ProcessInfo.hProcess);
        CloseHandle(ProcessInfo.hThread);
      end;
    end;
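
For anyone unsure how to call this, the following is a minimal, self-contained console sketch of the same idea. It is not the author's code: the helper name `RunAndWait` is hypothetical, it blocks with `WaitForSingleObject` instead of polling, and it assumes that djpeg.exe and gocr.exe sit next to the program, accept the arguments exactly as in the test.cmd shown earlier in this thread, and return exit code 0 on success.

```pascal
program OcrSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, Classes;

// Starts CommandLine, waits until the process exits, and returns True
// when it could be started and finished with exit code 0.
// (Treating 0 as "success" is an assumption about djpeg and gocr.)
function RunAndWait(const CommandLine: string): Boolean;
var
  StartupInfo: TStartupInfo;
  ProcessInfo: TProcessInformation;
  Cmd: string;
  ExitCode: DWORD;
begin
  Result := False;
  FillChar(StartupInfo, SizeOf(StartupInfo), 0);
  StartupInfo.cb := SizeOf(StartupInfo);
  StartupInfo.dwFlags := STARTF_USESHOWWINDOW;
  StartupInfo.wShowWindow := SW_HIDE;

  // CreateProcess may write to the command-line buffer,
  // so hand it a private, writable copy of the string.
  Cmd := CommandLine;
  UniqueString(Cmd);

  if not CreateProcess(nil, PChar(Cmd), nil, nil, False, 0, nil, nil,
    StartupInfo, ProcessInfo) then
    Exit;
  try
    WaitForSingleObject(ProcessInfo.hProcess, INFINITE);
    if GetExitCodeProcess(ProcessInfo.hProcess, ExitCode) then
      Result := ExitCode = 0;
  finally
    CloseHandle(ProcessInfo.hProcess);
    CloseHandle(ProcessInfo.hThread);
  end;
end;

var
  Lines: TStringList;
begin
  // Same two steps as test.cmd: JPG -> PNM, then PNM -> text.
  if RunAndWait('djpeg -grayscale -pnm YourPictureName.jpg YourPictureName.pnm') and
     RunAndWait('gocr -i YourPictureName.pnm -o YourExtractedText.txt') then
  begin
    Lines := TStringList.Create;
    try
      Lines.LoadFromFile('YourExtractedText.txt');
      Writeln(Lines.Text);
    finally
      Lines.Free;
    end;
  end
  else
    Writeln('OCR failed; check that djpeg.exe and gocr.exe can be found.');
end.
```

In a GUI application the blocking wait would freeze the form, which is presumably why the code above polls and calls Application.ProcessMessages instead.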
kavaler
Beginner
Joined: 19 Jan 10 Location: Baku Status: Offline Points: 13
Posted: 11 Feb 10 at 10:12AM

Hello,
I can't understand how to use this in Delphi. My e-mail address is islam261@gmail.com; please send me some examples.

Copyright © 2017 Debenu. Debenu Quick PDF Library is a PDF SDK. All rights reserved.