Apache POI Java PDF

Java Apache POI Excel save as PDF

How can I convert or save an Excel file as PDF? I’m using the Java Play framework to generate some Excel files, and now the requirement has changed to PDF. I don’t want to recode everything. Is there a way to convert to PDF? The Excel files I generate come from a template: I read the Excel template file, write the changes, and save the result as a new Excel file. That way, the template is unchanged. It contains borders, images, and other formatting.

5 Answers

You need the following Java libraries and their associated JAR files for the program to work: POI v3.8 and iText v5.3.4.

Try this Example to convert XLS to PDF

The complete Java code that accepts Excel spreadsheet data as an input and transforms that to a PDF table data is provided below:

    import java.io.FileInputStream;
    import java.io.*;
    import org.apache.poi.hssf.usermodel.HSSFWorkbook;
    import org.apache.poi.hssf.usermodel.HSSFSheet;
    import org.apache.poi.ss.usermodel.*;
    import java.util.Iterator;
    import com.itextpdf.text.*;
    import com.itextpdf.text.pdf.*;

    public class excel2pdf {
        public static void main(String[] args) throws Exception {
            FileInputStream input_document = new FileInputStream(new File("C:\\excel_to_pdf.xls"));
            // Read workbook into HSSFWorkbook
            HSSFWorkbook my_xls_workbook = new HSSFWorkbook(input_document);
            // Read worksheet into HSSFSheet
            HSSFSheet my_worksheet = my_xls_workbook.getSheetAt(0);
            // To iterate over the rows
            Iterator<Row> rowIterator = my_worksheet.iterator();
            //We will create output PDF document objects at this point
            Document iText_xls_2_pdf = new Document();
            PdfWriter.getInstance(iText_xls_2_pdf, new FileOutputStream("Excel2PDF_Output.pdf"));
            iText_xls_2_pdf.open();
            //we have two columns in the Excel sheet, so we create a PDF table with two columns
            //Note: There are ways to make this dynamic in nature, if you want to.
            PdfPTable my_table = new PdfPTable(2);
            //We will use the object below to dynamically add new data to the table
            PdfPCell table_cell;
            //Loop through rows.
            while (rowIterator.hasNext()) {
                Row row = rowIterator.next();
                Iterator<Cell> cellIterator = row.cellIterator();
                while (cellIterator.hasNext()) {
                    Cell cell = cellIterator.next(); //Fetch CELL
                    switch (cell.getCellType()) { //Identify CELL type
                        //you need to add more code here based on
                        //your requirement / transformations
                        case Cell.CELL_TYPE_STRING:
                            //Push the data from Excel to PDF Cell
                            table_cell = new PdfPCell(new Phrase(cell.getStringCellValue()));
                            //feel free to move the code below to suit to your needs
                            my_table.addCell(table_cell);
                            break;
                    }
                    //next line
                }
            }
            //Finally add the table to PDF document
            iText_xls_2_pdf.add(my_table);
            iText_xls_2_pdf.close();
            //we created our pdf file..
            input_document.close(); //close xls
        }
    }

I hope this helps.


Facing this issue: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
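That error means the input is an .xlsx file (a ZIP-based OOXML package) being opened with HSSFWorkbook, which only reads the older OLE2 .xls format. Within POI, XSSFWorkbook reads .xlsx, and WorkbookFactory.create(...) picks the right implementation automatically. As a library-free illustration of how the two containers differ, here is a minimal sketch that inspects the leading magic bytes. The class and method names (SpreadsheetFormatSniffer, detect) are hypothetical and not part of POI.

```java
// Hypothetical helper, not part of Apache POI: guesses the container format
// of a spreadsheet file from its first bytes.
final class SpreadsheetFormatSniffer {

    enum Format { OLE2, ZIP_PACKAGE, UNKNOWN }

    // OLE2 compound document header (.xls, read by HSSF)
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header (.xlsx is a ZIP package, read by XSSF)
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // header: the first bytes of the file (8 bytes are enough)
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            // Any ZIP matches here; an .xlsx is only the most likely candidate
            return Format.ZIP_PACKAGE;
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] zipHeader = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00, 0x06, 0x00};
        System.out.println(detect(zipHeader));
    }
}
```

In the example above, a ZIP_PACKAGE result would mean replacing HSSFWorkbook and HSSFSheet with their XSSF counterparts, or coding against the Workbook and Sheet interfaces.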

Add on to assylias’s answer

The code from assylias above was very helpful to me in solving this problem. The answer from santhosh could be great if you don’t care whether the resulting PDF looks exactly like Excel’s own PDF export. However, if you are, say, filling out an Excel template using Apache POI and then trying to export it while preserving its look, without writing a ton of iText code just to get close to that look, then the VBS option is quite nice.

I’ll share a Java version of the Kotlin code assylias has above in case that helps anyone. All credit to assylias for the general form of the solution.

In Java:

    try {
        //create a temporary file and grab the path for it
        Path tempScript = Files.createTempFile("script", ".vbs");

        //read all the lines of the .vbs script into memory as a list
        //here we pull from the resources of a Gradle build, where the vbs script is stored
        System.out.println("Path for vbs script is: '" + Main.class.getResource("xl2pdf.vbs").toString().substring(6) + "'");
        List<String> script = Files.readAllLines(Paths.get(Main.class.getResource("xl2pdf.vbs").toString().substring(6)));

        // append test.xlsm for file name. savePath was passed to this function
        String templateFile = savePath + "\\test.xlsm";
        templateFile = templateFile.replace("\\", "\\\\");
        String pdfFile = savePath + "\\test.pdf";
        pdfFile = pdfFile.replace("\\", "\\\\");
        System.out.println("templateFile is: " + templateFile);
        System.out.println("pdfFile is: " + pdfFile);

        //replace the placeholders in the vbs script with the chosen file paths
        for (int i = 0; i < script.size(); i++) {
            script.set(i, script.get(i).replaceAll("XL_FILE", templateFile));
            script.set(i, script.get(i).replaceAll("PDF_FILE", pdfFile));
            System.out.println("Line " + i + " is: " + script.get(i));
        }

        //write the modified code to the temporary script
        Files.write(tempScript, script);

        //create a processBuilder for starting an operating system process
        ProcessBuilder pb = new ProcessBuilder("wscript", tempScript.toString());

        //start the process on the operating system
        Process process = pb.start();

        //tell the process how long to wait for timeout
        //(timeout is a long and minutes a TimeUnit, both defined elsewhere)
        Boolean success = process.waitFor(timeout, minutes);
        if (!success) {
            System.out.println("Error: Could not print PDF within " + timeout + minutes);
        } else {
            System.out.println("Process to run visual basic script for pdf conversion succeeded.");
        }
    } catch (Exception e) {
        //the handler body is cut off in the source; printing the stack trace is an assumption
        e.printStackTrace();
    }

The xl2pdf.vbs script:

    Option Explicit
    Dim objExcel, strExcelPath, objSheet

    strExcelPath = "XL_FILE"

    Set objExcel = CreateObject("Excel.Application")
    objExcel.WorkBooks.Open strExcelPath
    Set objSheet = objExcel.ActiveWorkbook.Worksheets(1)
    objSheet.ExportAsFixedFormat 0, "PDF_FILE",0, 1, 0, , , 0
    objExcel.ActiveWorkbook.Close
    objExcel.Application.Quit
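The Java code above doubles every backslash in the paths before calling replaceAll, because replaceAll treats backslashes (and dollar signs) in the replacement string as escape syntax. A minimal sketch of two ways to avoid the manual doubling follows: String.replace substitutes literally, and Matcher.quoteReplacement escapes a replacement for use with replaceAll. The class name VbsTemplateFiller and its methods are hypothetical. Only the XL_FILE and PDF_FILE placeholders come from the script above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;

// Hypothetical helper illustrating only the placeholder substitution step.
final class VbsTemplateFiller {

    // Replaces the XL_FILE and PDF_FILE placeholders in every script line.
    static List<String> fill(List<String> scriptLines, String excelPath, String pdfPath) {
        List<String> result = new ArrayList<>(scriptLines.size());
        for (String line : scriptLines) {
            // String.replace is literal, so Windows paths need no extra escaping
            result.add(line.replace("XL_FILE", excelPath).replace("PDF_FILE", pdfPath));
        }
        return result;
    }

    // Same idea with replaceAll: quoteReplacement escapes '\' and '$' in the path
    static String fillWithRegex(String line, String placeholder, String path) {
        return line.replaceAll(placeholder, Matcher.quoteReplacement(path));
    }

    public static void main(String[] args) {
        List<String> script = Arrays.asList(
                "strExcelPath = \"XL_FILE\"",
                "objSheet.ExportAsFixedFormat 0, \"PDF_FILE\",0, 1, 0, , , 0");
        for (String line : fill(script, "C:\\out\\test.xlsm", "C:\\out\\test.pdf")) {
            System.out.println(line);
        }
    }
}
```

Either variant keeps the paths readable in log output, since they are never stored in their doubled form.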


Trying to make simple PDF document with Apache poi

I see the internet is riddled with people complaining about Apache’s PDF products, but I cannot find my particular use case here. I am trying to do a simple Hello World with Apache POI. Right now my code is as follows:

    public ByteArrayOutputStream export() throws IOException {

        //Blank Document
        XWPFDocument document = new XWPFDocument();

        //Write the Document in file system
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        //create table
        XWPFTable table = document.createTable();
        XWPFStyles styles = document.createStyles();
        styles.setSpellingLanguage("English");

        //create first row
        XWPFTableRow tableRowOne = table.getRow(0);
        tableRowOne.getCell(0).setText("col one, row one");
        tableRowOne.addNewTableCell().setText("col two, row one");
        tableRowOne.addNewTableCell().setText("col three, row one");

        //create second row
        XWPFTableRow tableRowTwo = table.createRow();
        tableRowTwo.getCell(0).setText("col one, row two");
        tableRowTwo.getCell(1).setText("col two, row two");
        tableRowTwo.getCell(2).setText("col three, row two");

        //create third row
        XWPFTableRow tableRowThree = table.createRow();
        tableRowThree.getCell(0).setText("col one, row three");
        tableRowThree.getCell(1).setText("col two, row three");
        tableRowThree.getCell(2).setText("col three, row three");

        PdfOptions options = PdfOptions.create();
        PdfConverter.getInstance().convert(document, out, options);
        out.close();
        return out;
    }

It is called from this controller method:

        public ResponseEntity convertToPDFPost(@ApiParam(value = "DTOs passed from the FE", required = true) @Valid @RequestBody ExportEnvelopeDTO exportDtos) {
            if (exportDtos.getProdExportDTOs() != null) {
                try {
                    FileOutputStream out = new FileOutputStream("/Users/kornhaus/Desktop/test.pdf");
                    out.write(exporter.export().toByteArray());
                    out.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
                return new ResponseEntity(responseFile, responseHeaders, HttpStatus.OK);
            }
            return new ResponseEntity(HttpStatus.INTERNAL_SERVER_ERROR);
        }
    }

It fails with:

    org.apache.poi.xwpf.converter.core.XWPFConverterException: java.io.IOException: Unable to parse xml bean

I have no clue what’s causing this, or where to even look for this kind of documentation. I have been coding for a decade plus and have never had such difficulty with what should be a simple Java library. Any help would be great.

Not clear why you are downvoting my answer. It is tested and works. Of course it is only a draft to show the principle; it is not ready-to-use code. And to reduce memory usage, the created Word document could be saved as a temporary file instead of holding its bytes in memory.

1 Answer

The main problem with this is that PdfOptions and PdfConverter are not part of the Apache POI project. They are developed by opensagres, and the first versions were badly named org.apache.poi.xwpf.converter.pdf.PdfOptions and org.apache.poi.xwpf.converter.pdf.PdfConverter. Those old classes have not been updated since 2014 and need version 3.9 of Apache POI.

But the same developers provide fr.opensagres.poi.xwpf.converter.pdf, which is much more current and works with Apache POI 3.17, the latest stable release at the time of writing. So we should use this.

But since even those newer PdfOptions and PdfConverter are not part of the Apache POI project, Apache POI does not test them with its releases. As a result, the default *.docx documents created by Apache POI lack some content that PdfConverter needs.

  1. There must be a styles document, even if it is empty.
  2. There must be section properties for the page having at least the page size set.
  3. Tables must have a table grid set.

To fulfill these requirements we must add some extra code to our program. Unfortunately this then needs the full jar of all of the schemas, ooxml-schemas-1.3.jar, as mentioned in FAQ-N10025.

And because we need to change the underlying low-level objects, the document must be written so that the underlying objects are committed. Otherwise the XWPFDocument that we hand over to the PdfConverter will be incomplete.

    import java.io.*;
    import java.math.BigInteger;

    //needed jars: fr.opensagres.poi.xwpf.converter.core-2.0.1.jar,
    //             fr.opensagres.poi.xwpf.converter.pdf-2.0.1.jar,
    //             fr.opensagres.xdocreport.itext.extension-2.0.1.jar,
    //             itext-2.1.7.jar
    import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;
    import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;

    //needed jars: apache poi and its dependencies
    //             and additionally: ooxml-schemas-1.3.jar
    import org.apache.poi.xwpf.usermodel.*;
    import org.apache.poi.util.Units;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

    public class XWPFToPDFConverterSampleMin {

        public static void main(String[] args) throws Exception {

            XWPFDocument document = new XWPFDocument();

            // there must be a styles document, even if it is empty
            XWPFStyles styles = document.createStyles();

            // there must be section properties for the page having at least the page size set
            CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
            CTPageSz pageSz = sectPr.addNewPgSz();
            pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
            pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

            // filling the body
            XWPFParagraph paragraph = document.createParagraph();

            //create table
            XWPFTable table = document.createTable();

            //create first row
            XWPFTableRow tableRowOne = table.getRow(0);
            tableRowOne.getCell(0).setText("col one, row one");
            tableRowOne.addNewTableCell().setText("col two, row one");
            tableRowOne.addNewTableCell().setText("col three, row one");

            //create CTTblGrid for this table with widths of the 3 columns.
            //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
            //values are in unit twentieths of a point (1/1440 of an inch)
            //first column = 2 inches width
            table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            //other columns (2 in this case) also each 2 inches width
            for (int col = 1; col < 3; col++) {
                table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            }

            //create second row
            XWPFTableRow tableRowTwo = table.createRow();
            tableRowTwo.getCell(0).setText("col one, row two");
            tableRowTwo.getCell(1).setText("col two, row two");
            tableRowTwo.getCell(2).setText("col three, row two");

            //create third row
            XWPFTableRow tableRowThree = table.createRow();
            tableRowThree.getCell(0).setText("col one, row three");
            tableRowThree.getCell(1).setText("col two, row three");
            tableRowThree.getCell(2).setText("col three, row three");

            paragraph = document.createParagraph();

            //trying picture
            XWPFRun run = paragraph.createRun();
            run.setText("The picture in line: ");
            InputStream in = new FileInputStream("samplePict.jpeg");
            run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
            in.close();
            run.setText(" text after the picture.");

            paragraph = document.createParagraph();

            //document must be written so underlying objects will be committed
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.write(out);
            document.close();

            document = new XWPFDocument(new ByteArrayInputStream(out.toByteArray()));
            PdfOptions options = PdfOptions.create();
            PdfConverter converter = (PdfConverter)PdfConverter.getInstance();
            converter.convert(document, new FileOutputStream("XWPFToPDFConverterSampleMin.pdf"), options);

            document.close();
        }
    }
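The page size and table grid values in the code above are given in twips (twentieths of a point, 1/1440 of an inch). As a small sketch of that arithmetic, the helper below converts inches and millimeters to twips. The class name Twips and its methods are hypothetical, not part of Apache POI or the opensagres converter.

```java
// Hypothetical helper: unit conversions for values such as those passed to
// CTPageSz.setW/setH and the table grid column widths.
final class Twips {

    static final int PER_POINT = 20;   // 1 pt = 20 twips
    static final int PER_INCH = 1440;  // 1 in = 72 pt = 1440 twips

    static long fromInches(double inches) {
        return Math.round(inches * PER_INCH);
    }

    static long fromMillimeters(double millimeters) {
        // 25.4 mm per inch
        return Math.round(millimeters / 25.4 * PER_INCH);
    }

    static double toPoints(long twips) {
        return twips / (double) PER_POINT;
    }

    public static void main(String[] args) {
        // US Letter, the 8.5" x 11" page used in the code above
        System.out.println(fromInches(8.5) + " x " + fromInches(11));
        // A4, 210 mm x 297 mm
        System.out.println(fromMillimeters(210) + " x " + fromMillimeters(297));
    }
}
```

The resulting values would be wrapped in BigInteger.valueOf(...) before being handed to setW or setH, as in the code above.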

Using XDocReport

Another way would be to use the newest version of opensagres/xdocreport, as described in "Converter only with ConverterRegistry":

    import java.io.*;
    import java.math.BigInteger;

    //needed jars: xdocreport-2.0.1.jar,
    //             odfdom-java-0.8.7.jar,
    //             itext-2.1.7.jar
    import fr.opensagres.xdocreport.converter.Options;
    import fr.opensagres.xdocreport.converter.IConverter;
    import fr.opensagres.xdocreport.converter.ConverterRegistry;
    import fr.opensagres.xdocreport.converter.ConverterTypeTo;
    import fr.opensagres.xdocreport.core.document.DocumentKind;

    //needed jars: apache poi and its dependencies
    //             and additionally: ooxml-schemas-1.3.jar
    import org.apache.poi.xwpf.usermodel.*;
    import org.apache.poi.util.Units;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

    public class XWPFToPDFXDocReport {

        public static void main(String[] args) throws Exception {

            XWPFDocument document = new XWPFDocument();

            // there must be a styles document, even if it is empty
            XWPFStyles styles = document.createStyles();

            // there must be section properties for the page having at least the page size set
            CTSectPr sectPr = document.getDocument().getBody().addNewSectPr();
            CTPageSz pageSz = sectPr.addNewPgSz();
            pageSz.setW(BigInteger.valueOf(12240)); //12240 Twips = 12240/20 = 612 pt = 612/72 = 8.5"
            pageSz.setH(BigInteger.valueOf(15840)); //15840 Twips = 15840/20 = 792 pt = 792/72 = 11"

            // filling the body
            XWPFParagraph paragraph = document.createParagraph();

            //create table
            XWPFTable table = document.createTable();

            //create first row
            XWPFTableRow tableRowOne = table.getRow(0);
            tableRowOne.getCell(0).setText("col one, row one");
            tableRowOne.addNewTableCell().setText("col two, row one");
            tableRowOne.addNewTableCell().setText("col three, row one");

            //create CTTblGrid for this table with widths of the 3 columns.
            //necessary for Libreoffice/Openoffice and PdfConverter to accept the column widths.
            //values are in unit twentieths of a point (1/1440 of an inch)
            //first column = 2 inches width
            table.getCTTbl().addNewTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            //other columns (2 in this case) also each 2 inches width
            for (int col = 1; col < 3; col++) {
                table.getCTTbl().getTblGrid().addNewGridCol().setW(BigInteger.valueOf(2*1440));
            }

            //create second row
            XWPFTableRow tableRowTwo = table.createRow();
            tableRowTwo.getCell(0).setText("col one, row two");
            tableRowTwo.getCell(1).setText("col two, row two");
            tableRowTwo.getCell(2).setText("col three, row two");

            //create third row
            XWPFTableRow tableRowThree = table.createRow();
            tableRowThree.getCell(0).setText("col one, row three");
            tableRowThree.getCell(1).setText("col two, row three");
            tableRowThree.getCell(2).setText("col three, row three");

            paragraph = document.createParagraph();

            //trying picture
            XWPFRun run = paragraph.createRun();
            run.setText("The picture in line: ");
            InputStream in = new FileInputStream("samplePict.jpeg");
            run.addPicture(in, Document.PICTURE_TYPE_JPEG, "samplePict.jpeg", Units.toEMU(100), Units.toEMU(30));
            in.close();
            run.setText(" text after the picture.");

            paragraph = document.createParagraph();

            //document must be written so underlying objects will be committed
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.write(out);
            document.close();

            // 1) Create options DOCX 2 PDF to select the right converter from the registry
            Options options = Options.getFrom(DocumentKind.DOCX).to(ConverterTypeTo.PDF);

            // 2) Get the converter from the registry
            IConverter converter = ConverterRegistry.getRegistry().getConverter(options);

            // 3) Convert DOCX 2 PDF
            InputStream docxin = new ByteArrayInputStream(out.toByteArray());
            OutputStream pdfout = new FileOutputStream(new File("XWPFToPDFXDocReport.pdf"));
            converter.convert(docxin, pdfout, options);

            docxin.close();
            pdfout.close();
        }
    }

October 2018: This code works using Apache POI 3.17. It cannot work using Apache POI 4.0.0 due to changes in Apache POI that have not yet been taken into account in fr.opensagres.poi.xwpf.converter or in fr.opensagres.xdocreport.converter.

February 2019: Works for me now using the newest Apache POI version 4.0.1 and the newest version 2.0.2 of fr.opensagres.poi.xwpf.converter.pdf and its related artifacts.

June 2021: Works using Apache POI version 4.1.2 and the newest version 2.0.2 of fr.opensagres.poi.xwpf.converter.pdf and its related artifacts. Cannot work using Apache POI version 5.0.0 because XDocReport needs ooxml-schemas, which Apache POI 5 no longer supports.

April 2022: Works using Apache POI version 5.2.2 and the newest version 2.0.3 of fr.opensagres.poi.xwpf.converter.pdf and its related artifacts.


How to convert Excel to PDF using apache POI or PDFBox

I have created an Excel file using Apache POI. Now I want to convert it into PDF using Apache POI itself or PDFBox. As per the requirements, I can’t use iText or any API other than Apache POI or PDFBox to convert Excel to PDF.

My suggestion is that you do a bit more research before giving up and posting a question here. There are a ton of resources out there already which probably cover your problem.

iText is a library to generate PDFs. PDFBox is a library to generate PDFs. Have you tried taking an example that uses iText and replacing the iText code in it with equivalent PDFBox code yet? If not, why not? That is an obvious approach given your requirements and findings.

The Apache POI project does not provide any PDF export. So your main task will be to read the table data out of Excel using Apache POI and write it into a PDF using PDFBox. So your main question will be: How to create a table using Apache PDFBox?
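Since PDFBox draws text and lines at explicit coordinates rather than offering a table object, most of the work in building a table is layout arithmetic. The sketch below covers only that part, under stated assumptions: equal-width columns, a fixed row height, and a new page when the rows no longer fit. It uses no PDFBox or POI calls. The class TableLayout and everything in it are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical layout helper: computes where each cell of a table would be
// drawn, in PDF points (1/72 inch). No PDF library is involved here.
final class TableLayout {

    // Page index plus lower-left corner of one cell, in points.
    static final class CellPosition {
        final int page;
        final float x;
        final float y;

        CellPosition(int page, float x, float y) {
            this.page = page;
            this.x = x;
            this.y = y;
        }
    }

    // Returns positions in row-major order: row 0 col 0, row 0 col 1, ...
    static List<CellPosition> layout(int rows, int cols, float pageWidth,
                                     float pageHeight, float margin, float rowHeight) {
        int rowsPerPage = (int) ((pageHeight - 2 * margin) / rowHeight);
        if (cols <= 0 || rowsPerPage <= 0) {
            throw new IllegalArgumentException("page too small or no columns");
        }
        float colWidth = (pageWidth - 2 * margin) / cols;
        List<CellPosition> positions = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            int page = r / rowsPerPage;
            int rowOnPage = r % rowsPerPage;
            // The PDF origin is bottom-left, so rows run downward from the top margin
            float y = pageHeight - margin - (rowOnPage + 1) * rowHeight;
            for (int c = 0; c < cols; c++) {
                positions.add(new CellPosition(page, margin + c * colWidth, y));
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // US Letter page (612 x 792 pt), half-inch margins, 20 pt rows
        for (CellPosition p : layout(3, 2, 612f, 792f, 36f, 20f)) {
            System.out.println("page " + p.page + " at (" + p.x + ", " + p.y + ")");
        }
    }
}
```

In a real converter, the row and column counts would come from the sheet read with Apache POI, and each position would be used to place the cell text and borders with PDFBox.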

«Creating Dynamic table is totally difficult» — maybe that’s why you’re paid as a developer, instead of paying for software that can do it? Developing software is more than just clicking on stuff and copy & paste. From the different products on top of PDFBox, choose the one that comes closest to what you want and improve it. Historically, innovation has often been done by people who were frustrated by an existing product and became active instead of just complaining.
