Tuesday, November 28, 2006

Merge PDF files with iText

The previous post described how to use iText to create a simple PDF. This post describes how to use iText to merge multiple PDF documents. The sample code contains a method that takes a list of PDF files as input, along with an OutputStream to which the merged PDF is written. The example also implements pagination: while merging, it writes a "Page X of Y" label on each page of the merged file. The following is a brief description of the steps involved in merging PDF files using iText.
  1. Create a PdfReader for each input PDF file
    PdfReader pdfReader = new PdfReader(pdf);
  2. Create a document object to represent the PDF.
    Document document = new Document();
  3. Create a PdfWriter for the target OutputStream, then open the document
    PdfWriter writer = PdfWriter.getInstance(document, outputStream);
    document.open();
  4. Select a font with which the page numbers will be written
    BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
  5. Create a PdfContentByte object to hold the data of the merged PDF
    PdfContentByte cb = writer.getDirectContent();
  6. Add individual pages from the source to the target.
    document.newPage();
    pageOfCurrentReaderPDF++;
    currentPageNumber++;
    PdfImportedPage page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
    cb.addTemplate(page, 0, 0);
  7. Add page text at the bottom of the page
    cb.beginText();
    cb.setFontAndSize(bf, 9);
    cb.showTextAligned(PdfContentByte.ALIGN_CENTER, "Page " + currentPageNumber + " of " + totalPages, 520, 5, 0);
    cb.endText();
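The counters used in steps 6 and 7 are easy to get wrong, so here is a minimal sketch of just that bookkeeping, with no iText involved. The class name PageLabelSketch and its labels method are hypothetical and exist only for illustration. The variable names mirror the ones used in this post: pageOfCurrentReaderPDF restarts for every input file, while currentPageNumber keeps running across the whole merged output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText: models only the page counters.
final class PageLabelSketch {

    /** Returns one "Page X of N" label per page, in merge order. */
    static List<String> labels(int[] pagesPerDocument) {
        // First pass: total page count across all inputs (totalPages in the post).
        int totalPages = 0;
        for (int count : pagesPerDocument) {
            totalPages += count;
        }

        List<String> labels = new ArrayList<String>();
        int currentPageNumber = 0; // runs across the whole merged file
        for (int count : pagesPerDocument) {
            int pageOfCurrentReaderPDF = 0; // restarts for every input file
            while (pageOfCurrentReaderPDF < count) {
                pageOfCurrentReaderPDF++; // 1-based page within the current input
                currentPageNumber++;      // 1-based page within the output
                labels.add("Page " + currentPageNumber + " of " + totalPages);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Two inputs with 2 and 3 pages.
        for (String label : labels(new int[] {2, 3})) {
            System.out.println(label);
        }
    }
}
```

In the real merge, the page within the current input is what gets passed to getImportedPage, and the label is what gets passed to showTextAligned.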
The following is the complete source code for merging PDF files.
// Imports assume iText 2.x, where the classes live under com.lowagie.text.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.lowagie.text.Document;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;

public class MergePDF {

    public static void main(String[] args) {
        try {
            List<InputStream> pdfs = new ArrayList<InputStream>();
            pdfs.add(new FileInputStream("c:\\pdf\\2.pdf"));
            pdfs.add(new FileInputStream("c:\\pdf\\3.pdf"));
            OutputStream output = new FileOutputStream("c:\\pdf\\merge.pdf");
            MergePDF.concatPDFs(pdfs, output, true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void concatPDFs(List<InputStream> streamOfPDFFiles, OutputStream outputStream, boolean paginate) {

        Document document = new Document();
        try {
            List<InputStream> pdfs = streamOfPDFFiles;
            List<PdfReader> readers = new ArrayList<PdfReader>();
            int totalPages = 0;
            Iterator<InputStream> iteratorPDFs = pdfs.iterator();

            // Create a reader for each PDF and count the total number of pages.
            while (iteratorPDFs.hasNext()) {
                InputStream pdf = iteratorPDFs.next();
                PdfReader pdfReader = new PdfReader(pdf);
                readers.add(pdfReader);
                totalPages += pdfReader.getNumberOfPages();
            }

            // Create a writer for the output stream.
            PdfWriter writer = PdfWriter.getInstance(document, outputStream);

            document.open();
            BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            // The direct content holds the page data of the merged PDF.
            PdfContentByte cb = writer.getDirectContent();

            PdfImportedPage page;
            int currentPageNumber = 0;      // page number within the merged output
            int pageOfCurrentReaderPDF = 0; // page number within the current source PDF
            Iterator<PdfReader> iteratorPDFReader = readers.iterator();

            // Loop through the PDF files and add them to the output.
            while (iteratorPDFReader.hasNext()) {
                PdfReader pdfReader = iteratorPDFReader.next();

                // Create a new page in the target for each source page.
                while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
                    document.newPage();
                    pageOfCurrentReaderPDF++;
                    currentPageNumber++;
                    page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
                    cb.addTemplate(page, 0, 0);

                    // Code for pagination.
                    if (paginate) {
                        cb.beginText();
                        cb.setFontAndSize(bf, 9);
                        cb.showTextAligned(PdfContentByte.ALIGN_CENTER, "Page " + currentPageNumber + " of " + totalPages, 520, 5, 0);
                        cb.endText();
                    }
                }
                // Reset the per-file counter before moving to the next source PDF.
                pageOfCurrentReaderPDF = 0;
            }
            outputStream.flush();
            document.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Close the document and the stream even if the merge failed.
            if (document.isOpen()) {
                document.close();
            }
            try {
                if (outputStream != null) {
                    outputStream.close();
                }
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }
    }
}
MergePDF.java
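One thing the main method above does not do is close the FileInputStreams it opens, which can keep the source files locked (on Windows, for example, they cannot be deleted while open). The following is a minimal sketch of one way to handle that with plain java.io. The InputStreams class and its openAll and closeAll methods are hypothetical helpers, not part of iText. Whether PdfReader closes the stream it is given may depend on the iText version, so the sketch closes the streams explicitly; closing a FileInputStream twice is harmless.

```java
import java.io.Closeable;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText.
final class InputStreams {

    /** Opens every path; if one fails, closes the streams already opened. */
    static List<InputStream> openAll(List<String> paths) throws IOException {
        List<InputStream> opened = new ArrayList<InputStream>();
        try {
            for (String path : paths) {
                opened.add(new FileInputStream(path));
            }
            return opened;
        } catch (IOException e) {
            closeAll(opened); // do not leak the streams opened so far
            throw e;
        }
    }

    /** Closes every stream, ignoring failures so that all of them get a chance to close. */
    static void closeAll(List<? extends Closeable> streams) {
        for (Closeable stream : streams) {
            try {
                stream.close();
            } catch (IOException ignored) {
                // Nothing useful can be done here; keep closing the rest.
            }
        }
    }
}
```

With these helpers, main could call InputStreams.openAll(paths), pass the result to concatPDFs inside a try block, and call InputStreams.closeAll(pdfs) in the matching finally block.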

46 comments:

  1. Hey, thanks for the write-up, I found it very useful. The only problem is that I need the bookmarks copied as well, which PdfWriter doesn't seem to support, whereas PdfCopy does. With PdfCopy as my writer, however, I can't seem to use this technique to insert page numbers, as it complains about the document having no pages. Do you know of any workaround whereby I can both copy bookmarks and insert page numbers?

  2. This seems to resize the merged PDFs to be slightly smaller than the originals. Any idea why?

  3. When you say smaller, is it the file size or the area of text? If it is the latter, then you should check if you are setting additional properties for the new document such as margin etc.

  4. Hi, when I merge 2 PDFs, the merged output displays the data at a bigger size and there is a lot of space at the top before the content. Let me know how we can merge 2 or more PDFs so that the font, size and layout stay the same as in the PDFs we merged.

  5. Hi, I'm not able to compile your sample code. Can you please tell me what it requires?

  6. I found your code snippet really useful and added as a reference at
    http://www.installationwiki.org/JasperReports

  7. Hi,

    If there are any issues with the alignment or size of the text while merging PDFs using iText, it has to do with the PageSize.

    While creating the Document, specify the page size instead of using the default constructor. I also had a similar problem, and it worked when I used:

    Document document = new Document(PageSize.A2);

    Check below link:

    http://www.informit.com/articles/article.aspx?p=420686&seqNum=3&rl=1

  8. Hi,

    I have an AcroForm and an XML form into which I've previously injected data (with the Adobe API). When I read that file with PdfReader from the iText API, all the injected data disappeared. The merge works well, but I cannot see my data unless I use a stamper to fill my form. Is there a workaround for this?

  9. Thanks, this concept helped me merge a pdf letter with header information in my application.

  10. Hi,
    I went through your code, which helps a lot.
    Can you help me? My problem is saving PDF files as blobs in a DB.

    While creating a new PDF, I want to add those PDF files along with the other content.

    I can read the content, but to display it I am using two FileOutputStreams, which I know is not allowed with iText.

    Is it possible to add blob content while creating a new PDF file?

  11. Wonderful post. Helped a lot.

    Thanks,
    Vikrant.

  12. Absolutely superb article. Thank you very much for your help.

    One pointer for anybody that might have one issue that I had. My PDF's were originally in landscape format but itext converted them to portrait making for some odd looking pdf's!!

    In order to fix this add this to your Document constructor:

    Document document = new Document(PageSize.A4.rotate());

    This will rotate the pages from portrait to landscape.

  13. My imports are all messed up. Which imports did you use?

  14. Great post. Only drawback is that it doesn't do any compression. I'm converting letters from html to pdf. When I convert a html file containing 100 letters into one pdf the file size is about 600K. When I create the PDF's one at a time and merge them it balloons to 9Meg.

  15. Hi, I am using the above code, but instead of a FileInputStream I am passing bytes converted from another output stream to the concatPDFs method. I am getting a blank PDF as output. Do you have any idea about that?

  16. Very useful & time saving snippet of code. thanks

  17. Hi, the code was very useful, but I have one doubt. My PDF file is larger than A4 size, so it is rotated 90 degrees to the right. I tried specifying the size of the page but it didn't work. How can I resolve the issue?

  18. Great code! Thanks, it helped a lot. Only one problem, my first pdf was compressed into A4 size (it was designed to be landscape, multiple-pages, but forced to fit a single A4 page with IGNORE-PAGINATION in IReport), and the other pdf file was naturally A4 portrait.

    When I merged them, the first pdf was cut, but the second pdf (on second page) was cool. I tried changing the document size to A3, both pages were displayed right, but it's just not the rightfully requested page size (has to be A4). How should I handle this?

    Thanks very much!

  19. Great article! I really appreciate your help.

  20. The article was really helpful, thank you so much :)

  21. Yes - that's really a helper. Thanks! I will keep reading.

  23. Very nice and very helpful!
    The only problem I have is that my PDF pages are a mix of horizontal and vertical pages. The horizontal pages are cut off if I don't use PageSize.A4.rotate(), and if I do, the vertical ones don't look good either.
    So I need a way to detect whether a page is rotated, so that I can add it to the new document with the right rotation.
    Does anyone know how?
    Thanks a lot!

  24. The example is great, although it does not work if the PDFs contain different page sizes. If you don't need pagination, the code below is simpler:

    PdfReader reader1 = new PdfReader("1PDF.pdf");
    PdfReader reader2 = new PdfReader("2PDF.pdf");
    PdfCopyFields copy = new PdfCopyFields(new FileOutputStream("concatenatedPDF.pdf"));
    copy.addDocument(reader1);
    copy.addDocument(reader2);
    copy.close();

    Thanks,

  25. Hi, I want to merge 3 PDF files but I don't have them on c:. I have 3 URLs.
    Is it possible to merge the 3 PDF files using only the URLs?
    Note: the PDF files have different fonts in the paragraphs and I would like to keep the same fonts in the resulting output PDF file.
    Thanks.

  26. This comment has been removed by the author.

  27. hi Abhi.

    Very useful code. Thanks for this post, it's simply great. :)

    I have one doubt.

    I want to merge more than 2 PDF files, but I don't have them on any drive; I have URLs for them.
    Is it possible to merge those PDF files using only the URLs?
    Note: the PDF files have different fonts in the paragraphs and I would like to keep the same fonts in the resulting output PDF file.

    I think my question is very similar to the previous one, so does anybody have a solution for it?

    Thanks
    Kiran

  28. hi Abhi.

    Very useful code. Thanks for this post, it's simply great. :)

    i have one doubt

    symbol : method add(java.io.FileInputStream)
    location: interface java.util.List com.lowagie.text.pdf.codec.Base64.InputStream
    pdfs.add(myinputStream);
    1 error
    BUILD FAILED (total time: 0 seconds)

  29. The code was very useful. A small suggestion: after each PdfReader is created, the file input stream should be closed so that further actions can be performed on the individual files, for example deleting them.

    // Create Readers for the pdfs.
    while (iteratorPDFs.hasNext()) {
        InputStream pdf = iteratorPDFs.next();
        PdfReader pdfReader = new PdfReader(pdf);
        readers.add(pdfReader);
        totalPages += pdfReader.getNumberOfPages();
        pdf.close();
    }

  30. Hi, this code is useful.
    Unfortunately I need to do this WITHOUT Java 5.
    Can you help me?
    Thanks.

  31. Thanks for sharing. It has helped a great deal. Do you have any ideas regarding merging PDFs with signatures?

  32. Hi..Great Help..It helped me merge two pdfs..

    I have a problem at hand.

    In the code you have shared, you have hard-coded the source folder path, but I need to obtain the folder path dynamically, using a session or something like that.

    How can I achieve this?could you pls help me on this.

    Regards
    Kp

  33. Hi..Great Help..It helped me merge two pdfs..

    I have a problem at hand.

    In the code you have shared, you have hard-coded the source folder path, but I need to obtain the folder path dynamically, using a session or something like that.

    Replies
    1. Use a properties file: put the path name in that file and get the path by loading the properties.

  34. The code works fine. The only problem is that the top margin is larger than that of the original PDF. Is there any way to make it the same?

  35. This comment has been removed by the author.

  36. In order to take account of PDF files with both portrait and landscape pages the following code should be used :

    document.setPageSize(pdfReader.getPageSizeWithRotation(pageOfCurrentReaderPDF));
    document.newPage();

  37. I am merging the pdf files which are password protected. Now, when I am using your code this gives me an exception saying:
    "Exception in concatPDFs() Bad user password"
    Please suggest.

  38. Hi, the code below will also do the same operation, with all the fields editable. And since it just copies from one PDF to the output PDF (our resulting PDF), the sizing problem and other such problems won't occur.


    // Imports assume iText 2.x, where the classes live under com.lowagie.text.
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    import com.lowagie.text.DocumentException;
    import com.lowagie.text.pdf.PdfCopyFields;
    import com.lowagie.text.pdf.PdfReader;

    public class MyMergePDF2 {

        public static void main(String[] args) {
            try {
                // Create the readers for the PDF files
                PdfReader reader1 = new PdfReader("D://A.pdf");
                PdfReader reader2 = new PdfReader("D://B.pdf");

                List<PdfReader> pdfReaderList = new ArrayList<PdfReader>();

                // Add the readers to the list
                pdfReaderList.add(reader1);
                pdfReaderList.add(reader2);

                // Create the PdfCopyFields.
                // This object takes each reader and copies the selected pages into its output.
                // So just by adding the documents we create the merged PDF.
                PdfCopyFields copy = new PdfCopyFields(new FileOutputStream(
                        "D://Output.pdf"));

                copy.open();

                Iterator<PdfReader> iter = pdfReaderList.iterator();
                while (iter.hasNext()) {
                    String pageNOs = "";
                    PdfReader pdfReader = iter.next();
                    int noOfPages = pdfReader.getNumberOfPages();
                    if (noOfPages > 0) {
                        pageNOs = getNumderOfPages(noOfPages);
                    }
                    copy.addDocument(pdfReader, pageNOs);
                }
                copy.close();
            } catch (DocumentException e) {
                e.printStackTrace();
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        /**
         * Builds a comma-separated list of page numbers, e.g. "1,2,3".
         * @param noOfPages number of pages in the document
         * @return the page numbers 1..noOfPages joined with commas
         */
        private static String getNumderOfPages(int noOfPages) {
            StringBuilder pageNOs = new StringBuilder();
            // Page numbers start at 1, so list 1..noOfPages rather than 0..noOfPages-1.
            for (int i = 1; i <= noOfPages; i++) {
                if (i > 1) {
                    pageNOs.append(",");
                }
                pageNOs.append(i);
            }
            return pageNOs.toString();
        }
    }

  40. Hi, I have n PDFs in one folder and m PDFs in another folder, and m and n are the same number. How can I merge the m files with the n files? As seen in the posted example, only two PDFs can be merged. Is there any logic to merge m and n files? Please let me know.
