Extract web / email links

ZarkoGajic (Beginner, joined 18 Mar 09, Croatia)	Posted: 18 May 10 at 2:01PM
	Hi, what would be the easiest way to extract web links like "http://" or "www.site.com" and email links like "mailto:mail@domain.com" or "mail@domain.com" from an existing PDF document? GetAnnotStrProperty(111) would retrieve the annotation link value. I am looking for a way to extract those "web-like" links that a PDF reader would present as web links, asking to open a web browser or start the default email client.
	-zarko gajic

Ingo (Moderator Group, joined 29 Oct 05)	Posted: 18 May 10 at 3:31PM
	Hi Zarko! You need this page: http://www.quickpdflibrary.com/help/quickpdf/AnnotationsAndHotspotLinks.php Additionally, there was an older newsletter from Rowan or Karl explaining how to separate these links. I think you should look on the official support pages. You'll find these things there. Cheers, Ingo

ZarkoGajic	Posted: 18 May 10 at 3:35PM
	Ingo, thanks. I'm aware of the annotation-related functions. I'm looking for the best way to extract the text and look for "web-like" patterns :)
	-zarko gajic

dsola (Team Player, joined 28 Oct 05, Croatia)	Posted: 27 May 10 at 3:57PM
	Hi, brute force? Use GetPageText or the equivalent direct access method and then search the text. If all links have the same font or colour, the search would be simple. Greetings from Nova, Davor
	registered QuickPDF user
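
For readers who want to try the brute-force approach, here is a minimal sketch of the search step in Python. It assumes the page text has already been extracted into a string (for example with GetPageText); the function name `find_links` and both patterns are hypothetical and deliberately loose, so expect to tune them for your documents.

```python
import re

# Hypothetical, deliberately loose patterns; they are not a full URL or
# email grammar and will need tuning for real documents.
URL_RE = re.compile(r'(?:https?://|www\.)[^\s<>"]+', re.IGNORECASE)
EMAIL_RE = re.compile(
    r"(?:mailto:)?[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+",
    re.IGNORECASE,
)

def find_links(page_text):
    """Return (urls, emails) found in plain text extracted from a page."""
    # Trim punctuation that usually belongs to the sentence, not the link.
    urls = [m.group(0).rstrip(".,;:!?)") for m in URL_RE.finditer(page_text)]
    emails = [m.group(0) for m in EMAIL_RE.finditer(page_text)]
    return urls, emails

if __name__ == "__main__":
    sample = ("Visit http://example.com/a, or www.site.com. "
              "Write to mailto:mail@domain.com or info@domain.org.")
    print(find_links(sample))
```

This only finds links whose address is visible in the page text. A link that is split across lines by the text extraction, or one that exists only as an annotation behind other wording, will not be found this way.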

ZarkoGajic	Posted: 27 May 10 at 4:09PM
	Davor, thanks ... that's how it was done :)
	-zarko gajic

Ingo	Posted: 27 May 10 at 6:10PM
	Hi! I don't think that GetPageText will work in every case. With GetPageText you'll get text like "please click on this link to enter the website", but you won't get the real link behind it. You can also do this yourself. I decrypted the file and searched the raw file content for strings like "http" and so on. Cheers, Ingo
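
A rough sketch of this raw-content search in Python, under several assumptions: the file is already decrypted, the link annotations use URI actions written as literal strings (`/URI (...)`), and those dictionaries are stored uncompressed. Links inside compressed object streams, hex-encoded strings, or strings with nested parentheses will be missed. The function name `find_uri_actions` is hypothetical.

```python
import re

# Matches an uncompressed URI action entry such as: /URI (http://example.com)
# Backslash escapes inside the string are tolerated but not decoded.
URI_RE = re.compile(rb"/URI\s*\(((?:\\.|[^\\()])*)\)")

def find_uri_actions(pdf_bytes):
    """Return the targets of /URI entries found in raw, decrypted PDF bytes."""
    return [m.group(1).decode("latin-1") for m in URI_RE.finditer(pdf_bytes)]

if __name__ == "__main__":
    sample = (b"<< /Type /Annot /Subtype /Link "
              b"/A << /S /URI /URI (http://www.site.com/page) >> >>\n"
              b"<< /A << /S /URI /URI (mailto:mail@domain.com) >> >>")
    print(find_uri_actions(sample))
```

Unlike the text search, this returns the actual target behind a link even when the visible text is something like "click here", which is the case Ingo describes.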
