Problems Merging Forms

bogey (Senior Member; joined 30 Nov 05; United States)
Posted: 30 Jan 06 at 2:38PM
I need to get a better understanding of how PDF files are merged. We have been having terrible problems with form kits generated by QuickPDF since we started adding lots of fields.

We have had a process in place for years that combines multiple PDFs into a single downloadable PDF. It worked very well using in-memory merging. Since then we have added fillable fields to many of the PDF files, some with a couple hundred fields. The process kept getting slower as file sizes increased. I recently rewrote the process to merge using the MergeFileListFast() method, but it is still not as fast as we expected.

The biggest problem is that if different forms have fields with the same name, then once they are merged, the second occurrence of the field is no longer fillable. These merged files actually crash Acrobat 5.

I need a better understanding of exactly what is done during a merge. Is it a simple concatenation of the files? Or are they really merged into a single PDF "container"? Is there a process that can better combine the files so fields work as though they are in a single document? I can provide sample PDF files if someone has a way to analyze them and tell me what is damaged by the process.

swb1 (Debenu Quick PDF Library Expert; joined 05 Dec 05; United States)
Send me a couple of files and a quick example of the code you use to perform the merge. I will watch the process in the debugger. I may not be able to tell how the damage is occurring, but I should be able to tell you which part of the process takes the most time.

Steve

bogey
A package is on the way...

bogey
Here is an update: when multiple PDFs that share a field name are merged, the field on pages following the first occurrence is disabled and causes Acrobat 5 to crash.

We have discovered that if we open the merged PDF in Acrobat, remove security, and Extract All the pages, the subsequent form fields become active and fillable. Extract Pages seems to "reform" the PDF file. The status bar indicates "Fixing Up Form Field" during the process. Does anyone know what this is fixing, and is there an equivalent QP process? I have tried CheckObjects() and ExtractPages(), but they do not seem to have the desired effect.

swb1
Ken,

I have found that if I use MergeFileList rather than MergeFileListFast, the resulting file works in Acrobat Reader v5.1, though the processing takes about twice as long.

Here is a WAG (Wild Ass Guess) about what I think may be happening. It looks as though MergeFileList steps through a number of older, well-tested routines that copy PDF objects and their formats to new PDF objects, assemble them, and save them to a new stream, ultimately saving that new stream to a new PDF file. I believe that the newer function, MergeFileListFast, simply stacks up the existing PDF objects with no concern for the properties of those objects. I suspect this means that MergeFileListFast will put objects with redundant properties in the output stream that would not typically be allowed in a single PDF stream. Based on your posts and some others I have read, it seems that redundant field objects are of greater concern than, say, font objects or text objects.

Once again, just a guess. There is a lot of code here that I don’t yet understand. However, stepping through it, I can see why MergeFileList takes longer, and I can also see how it could be more thorough and better tested. Thus, if functional reliability is important, MergeFileList is the better call. If you can guarantee that the documents you are merging will not have redundant field objects, you can probably get away with MergeFileListFast.

Perhaps someday I will understand this stuff well enough to fix MergeFileListFast (assuming that it’s actually broken). However, unless there is a lot of $ in it, that day is a long way off.

Regards,
Steve

bogey
Here is what has been discovered with lots of testing. There are basically four ways to merge PDF files, and they yield different results: MergeDocument() in memory, MergeFiles(), MergeFileList(), and MergeFileListFast().

MergeDocument() in memory is fast and appends redundant field objects correctly. MergeFiles() and MergeFileList() are very slow and have mixed results with redundant fields. MergeFileListFast() is fast, but does not handle redundant fields well at all.

After any of these calls, if you issue a SetNeedsAppearance() call, it will trigger Acrobat to "fix up form fields" when the PDF is loaded, allowing redundant field objects to resolve properly. However, this is not a solution if you want the program to populate field data before delivery, because the "fix up fields" step runs only after the file has been delivered to the viewer.

The solution I have come up with is the MergeDocument() method, which is fairly fast but very CPU intensive. It leaves the PDF in a state in which you can properly pump data into it. Right now this is working, but I will be closely monitoring CPU usage under heavy load.

Thanks to all who contributed, especially the contributor who mentioned the SetNeedsAppearance() method in another post.
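
Since the trouble comes from field names shared across source forms, a cheap pre-flight check is to list which names occur in more than one form before merging. The following is only a minimal sketch in Python: it works on plain lists of field names, the function name and the sample file and field names are made up for illustration, and reading the names out of each PDF is assumed to be done elsewhere. It does not call Quick PDF Library.

```python
from collections import defaultdict


def find_field_collisions(documents):
    """Return {field_name: [doc_name, ...]} for names used in more than one document.

    `documents` maps a document name to the list of form field names it
    contains. How those names are extracted from a real PDF is an
    assumption left outside this sketch.
    """
    seen = defaultdict(list)
    for doc_name, field_names in documents.items():
        # A name repeated inside one document is counted once for that document.
        for name in set(field_names):
            seen[name].append(doc_name)
    return {name: docs for name, docs in seen.items() if len(docs) > 1}


# Hypothetical form kit: file names and field names are illustrative only.
kit = {
    "application.pdf": ["FirstName", "LastName", "SSN"],
    "disclosure.pdf": ["FirstName", "LastName", "Date"],
    "cover.pdf": ["Title"],
}

collisions = find_field_collisions(kit)
for name in sorted(collisions):
    print(name, "appears in", ", ".join(collisions[name]))
```

A report like this could be used to decide which kits are safe for the fast merge and which need the slower path, or to flag forms whose fields should be renamed at design time.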

swb1
Ken,

Good work. Useful information.

In software development there are two things I rarely worry about:

1.) Disk space. By the time you are close to filling up your 200GB drive, the 2TB drives will be $98.99 plus tax at CompUSA.
2.) CPU usage. Same reason as the drives.

It’s probably a bad practice; however, I don’t spend a lot of time optimizing code when hardware will have solved most of my problems by the time I’m ready to release.

bogey
The problem with high CPU utilization is that the MergePDF process runs on our production content server. All content is fed live from that server, and high utilization can really slow down delivery of form kits. Then users refresh and start the process over again, which only compounds the problem. If it takes 10 seconds to build the kit, that is too long. But on the other hand, any CPU cycle not used is forever lost. Same goes for bandwidth.
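
One way to keep refreshes from compounding the load is to let a repeated request for the same kit wait for the build that is already running instead of starting a second one. The following is a minimal sketch of that idea in Python, not something the production server is known to do: the `SingleFlight` class, the kit key, and the stand-in build function are all hypothetical, and the real merge call would go where the lambda is.

```python
import threading
from concurrent.futures import Future


class SingleFlight:
    """Share one in-progress build among all callers asking for the same key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> Future for the build currently running

    def run(self, key, build):
        with self._lock:
            future = self._in_flight.get(key)
            owner = future is None
            if owner:
                future = Future()
                self._in_flight[key] = future
        if owner:
            # Only the first caller does the work; later callers wait below.
            try:
                future.set_result(build())
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        return future.result()


# Hypothetical usage: the lambda stands in for the real kit-building code.
flights = SingleFlight()
pdf = flights.run("kit-42", lambda: b"%PDF-1.4 merged kit")
print(len(pdf))
```

Nothing is cached once a build finishes, so a later request builds a fresh kit; the sketch only removes duplicate work while a build is in progress.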

shelby (Beginner; joined 13 Jan 06; United States)
Since we seem to be in philosophical mode, I will add:
An interim page load, with a pretty graphic and an explanation of the slight delay, should keep the user busy and reduce the apparent wait, or at least the frustration. (Also, I always hide the submit button after it is clicked, to control the population of users that just need some slow-down juice.)
It is called the Law of Diminishing Returns. Five more hours of programming and debugging takes the job from 10 seconds to 5 seconds. Good for the internal "dang, I am good."
I know, you hard-core CPU burners out there won't agree, but a little user interface goes a long way, even with people that can "pop jiffy-pop pop-corn on their cpu".
To Err is to try. To fail is to err and not try again