- Working with an Excel spreadsheet from Java
- Apache POI — the Java API for Microsoft Documents
- 4 March 2022 — CVE-2022-26336 — A carefully crafted TNEF file can cause an out of memory exception in Apache POI poi-scratchpad versions prior to 5.2.0
- 10+16+18 December 2021- Log4j vulnerabilities CVE-2021-44228, CVE-2021-45046 and CVE-2021-45105
- 13 January 2021 — CVE-2021-23926 — XML External Entity (XXE) Processing in Apache XMLBeans versions prior to 3.0.0
- 20 October 2019 — CVE-2019-12415 — XML External Entity (XXE) Processing in Apache POI versions prior to 4.1.1
- 26 March 2019 — XMLBeans 3.1.0 available
- 11 January 2019 — Initial support for JDK 11
- Mission Statement
- Why should I use Apache POI?
- Components
- Contributing
- Jxls 2.13.0 is released!
- Features
Working with an Excel spreadsheet from Java
I ran into a practical problem: process the data in a spreadsheet and produce a new spreadsheet from it. The options I considered were:
- A macro: the only problem is VBA, which I have no time at all to learn, and I do not like its syntax either
- A C# application: this looks fine, but the machine that runs the application immediately has to meet many additional requirements:
- .NET Framework
- Microsoft Office installed
- the primary interop assembly (PIA) for the Office application installed
- Java together with the Apache POI library: this is the approach I want to cover in more detail
At the time of writing, the project site listed the following releases:
- POI 3.5 beta 5, and Office Open XML Support (2009-02-19): work on support for the Office 2007 format is in progress
- POI 3.2-FINAL Released (2008-10-19): the latest stable release
I will describe working with version 3.2.
The main class for working with an Excel spreadsheet is HSSFWorkbook from the org.apache.poi.hssf.usermodel package; it represents an Excel workbook.
The following code can be used to read a workbook from a file:
import java.io.FileInputStream;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public static HSSFWorkbook readWorkbook(String filename) {
    // try-with-resources closes the stream even if parsing fails
    try (FileInputStream in = new FileInputStream(filename)) {
        POIFSFileSystem fs = new POIFSFileSystem(in);
        return new HSSFWorkbook(fs);
    } catch (Exception e) {
        return null;
    }
}
The method returns an HSSFWorkbook object on success and null otherwise.
The following method can be used to save changes:
import java.io.FileOutputStream;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public static void writeWorkbook(HSSFWorkbook wb, String fileName) {
    // try-with-resources closes the file even if the write fails
    try (FileOutputStream fileOut = new FileOutputStream(fileName)) {
        wb.write(fileOut);
    } catch (Exception e) {
        // Handle the error
    }
}
The method writes the workbook wb to the file fileName.
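To make the read and write steps concrete, here is a minimal in-memory round trip. It is only a sketch: it assumes a recent POI release (4.x or 5.x) on the classpath rather than version 3.2, and the class name `WorkbookRoundTrip`, its helper methods, and the sheet name "Data" are invented for the illustration. Byte arrays stand in for files so that nothing touches the disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Hypothetical helper class, not part of POI.
public class WorkbookRoundTrip {

    // Builds a workbook with one text cell (A1) and serializes it as .xls bytes.
    public static byte[] buildWorkbook(String text) {
        try (HSSFWorkbook wb = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            wb.createSheet("Data").createRow(0).createCell(0).setCellValue(text);
            wb.write(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses the bytes back into a workbook and returns the text stored in A1.
    public static String readFirstCell(byte[] xls) {
        try (HSSFWorkbook wb = new HSSFWorkbook(new ByteArrayInputStream(xls))) {
            return wb.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFirstCell(buildWorkbook("hello")));
    }
}
```

Unlike the methods above, this sketch rethrows I/O failures as unchecked exceptions instead of returning null, so a caller cannot silently continue with a missing workbook.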
A sheet can be obtained in several ways:
- By name
HSSFSheet sheet = wb.getSheet("Sheet 3");
- By index (numbering starts at 0)
HSSFSheet sheet = wb.getSheetAt(0);
- By creating a new sheet (the name argument is optional)
HSSFSheet sheet = wb.createSheet("sheet name");
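The sheet lookups listed above can be tried together in one small program. This is a sketch assuming a recent POI release (4.x or 5.x); the class name `SheetLookup` and the sheet names are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Hypothetical demo class, not part of POI.
public class SheetLookup {

    // Creates two sheets and reports how each lookup style resolves them.
    public static String describe() {
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            wb.createSheet("Input");                 // sheet with an explicit name
            HSSFSheet unnamed = wb.createSheet();    // POI assigns a default name

            HSSFSheet byName = wb.getSheet("Input"); // lookup by name
            HSSFSheet byIndex = wb.getSheetAt(1);    // lookup by zero-based index
            HSSFSheet missing = wb.getSheet("No such sheet"); // unknown name yields null

            return byName.getSheetName()
                    + "," + (byIndex == unnamed)
                    + "," + (missing == null)
                    + "," + wb.getNumberOfSheets();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Note that `getSheet` returns null for an unknown name, so the result should be checked before it is used.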
A row can be obtained in two ways:
- By index (indexing starts at 0)
HSSFRow row = sheet.getRow(index);
- Through an iterator
Iterator rowIter = sheet.rowIterator();
while (rowIter.hasNext()) {
    HSSFRow row = (HSSFRow) rowIter.next();
}
A cell can be obtained in the same two ways:
- By cell index (indexing starts at 0)
HSSFCell cell = row.getCell(0);
- Through an iterator
Iterator cellIter = row.cellIterator();
while (cellIter.hasNext()) {
    HSSFCell cell = (HSSFCell) cellIter.next();
}
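One detail worth seeing in code is how index access and iteration differ on a sparse sheet: the iterators visit only rows and cells that physically exist, while `getRow` and `getCell` return null for gaps. The sketch below assumes a recent POI release (4.x or 5.x), where sheets and rows can be used directly in a for-each loop; the class name `RowCellWalk` and the sample data are invented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

// Hypothetical demo class, not part of POI.
public class RowCellWalk {

    // Builds a sheet with gaps, counts the existing cells, and probes the gaps.
    public static String walk() {
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            HSSFSheet sheet = wb.createSheet("Sparse");
            Row first = sheet.createRow(0);
            first.createCell(0).setCellValue(1.0);
            first.createCell(2).setCellValue(2.0);              // column index 1 is skipped
            sheet.createRow(3).createCell(1).setCellValue(3.0); // rows 1 and 2 are skipped

            int count = 0;
            for (Row row : sheet) {       // visits only rows that exist
                for (Cell cell : row) {   // visits only cells that exist
                    count++;
                }
            }
            boolean missingRowIsNull = sheet.getRow(1) == null;
            boolean missingCellIsNull = first.getCell(1) == null;
            return count + "," + missingRowIsNull + "," + missingCellIsNull;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(walk());
    }
}
```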
A cell value is read and written according to its type:
- Boolean value
boolean b = cell.getBooleanCellValue();
cell.setCellValue(b);
- Date
Date date = cell.getDateCellValue();
cell.setCellValue(date);
- Numeric value
double d = cell.getNumericCellValue();
cell.setCellValue(d);
- String value
String str = cell.getRichStringCellValue().getString();
cell.setCellValue(new HSSFRichTextString(str));
- Formula
String formula = cell.getCellFormula();
cell.setCellFormula(formula);
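Since each getter only suits cells of the matching type, code that reads arbitrary cells usually checks the cell type first. The following sketch shows one way to do that; it assumes POI 4.x or 5.x, where `getCellType()` returns the `CellType` enum (in 3.2 it returned an int constant), and the class name `CellValueReader` and its methods are hypothetical. Dates are stored as numbers, so this simple version reports them as numeric values.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

// Hypothetical helper class, not part of POI.
public class CellValueReader {

    // Converts a cell to text using the getter that matches its type.
    public static String asText(Cell cell) {
        if (cell == null) {
            return ""; // getCell returns null for cells that were never created
        }
        switch (cell.getCellType()) {
            case BOOLEAN: return String.valueOf(cell.getBooleanCellValue());
            case NUMERIC: return String.valueOf(cell.getNumericCellValue());
            case STRING:  return cell.getStringCellValue();
            case FORMULA: return "=" + cell.getCellFormula();
            default:      return ""; // blank and error cells
        }
    }

    // Fills one row with a value of each kind and renders it as text.
    public static String demo() {
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            Row row = wb.createSheet("Types").createRow(0);
            row.createCell(0).setCellValue(true);
            row.createCell(1).setCellValue(2.5);
            row.createCell(2).setCellValue("text");
            row.createCell(3).setCellFormula("B1*2");

            List<String> parts = new ArrayList<>();
            for (int i = 0; i < 4; i++) {
                parts.add(asText(row.getCell(i)));
            }
            return String.join("|", parts);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```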
This is enough to process simple spreadsheets.
The library also offers rich facilities for formatting cells, merging them, freezing panes, and so on.
A detailed description of these features can be found on the project's website.
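As a taste of the formatting, merging, and freezing features mentioned above, here is a short sketch. It assumes a recent POI release (4.x or 5.x); the class name `SheetLayout`, the sheet name, and the title text are invented for the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellStyle;
import org.apache.poi.ss.usermodel.Font;
import org.apache.poi.ss.util.CellRangeAddress;

// Hypothetical demo class, not part of POI.
public class SheetLayout {

    // Styles a title cell, merges it across three columns, and freezes the top row.
    public static String build() {
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            HSSFSheet sheet = wb.createSheet("Report");

            Font bold = wb.createFont();
            bold.setBold(true);
            CellStyle header = wb.createCellStyle();
            header.setFont(bold);

            Cell title = sheet.createRow(0).createCell(0);
            title.setCellValue("Quarterly totals");
            title.setCellStyle(header);

            // Arguments: first row, last row, first column, last column (all zero-based).
            sheet.addMergedRegion(new CellRangeAddress(0, 0, 0, 2));
            // No frozen columns, one frozen row: the title stays visible while scrolling.
            sheet.createFreezePane(0, 1);

            return sheet.getNumMergedRegions() + "," + sheet.getMergedRegion(0).formatAsString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```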
The main value of this approach is that it requires neither Office itself nor the PIA package to be installed.
Apache POI — the Java API for Microsoft Documents
The Apache POI team is pleased to announce the release of 5.2.3. Several dependencies were updated to their latest versions to pick up security fixes and other improvements.
A summary of changes is available in the Release Notes. A full list of changes is available in the change log. People interested should also follow the dev list to track progress.
See the downloads page for more details.
POI requires Java 8 or newer since version 4.0.1.
4 March 2022 — CVE-2022-26336 — A carefully crafted TNEF file can cause an out of memory exception in Apache POI poi-scratchpad versions prior to 5.2.0
Description:
A shortcoming in the HMEF package of poi-scratchpad (Apache POI) allows an attacker to cause an Out of Memory exception. This package is used to read TNEF files (Microsoft Outlook and Microsoft Exchange Server). If an application uses poi-scratchpad to parse TNEF files and the application allows untrusted users to supply them, then a carefully crafted file can cause an Out of Memory exception.
Mitigation:
Affected users are advised to update to poi-scratchpad 5.2.1 or above which fixes this vulnerability. It is recommended that you use the same versions of all POI jars.
10+16+18 December 2021- Log4j vulnerabilities CVE-2021-44228, CVE-2021-45046 and CVE-2021-45105
The Apache POI PMC has evaluated the security vulnerabilities reported for Apache Log4j.
POI 5.1.0 and XMLBeans 5.0.2 only have dependencies on log4j-api 2.14.1. The security vulnerabilities are not in log4j-api — they are in log4j-core.
If any POI or XMLBeans user relies on log4j-core for application logging, we strongly recommend upgrading all log4j dependencies, including log4j-api, to the latest version (currently v2.20.0).
13 January 2021 — CVE-2021-23926 — XML External Entity (XXE) Processing in Apache XMLBeans versions prior to 3.0.0
Description:
When parsing XML files using XMLBeans 2.6.0 or below, the underlying parser created by XMLBeans could be susceptible to XML External Entity (XXE) attacks.
This issue was fixed a few years ago but on review, we decided we should have a CVE to raise awareness of the issue.
Mitigation:
Affected users are advised to update to Apache XMLBeans 3.0.0 or above which fixes this vulnerability. XMLBeans 4.0.0 or above is preferable.
20 October 2019 — CVE-2019-12415 — XML External Entity (XXE) Processing in Apache POI versions prior to 4.1.1
Description:
When using the tool XSSFExportToXml to convert user-provided Microsoft Excel documents, a specially crafted document can allow an attacker to read files from the local filesystem or from internal network resources via XML External Entity (XXE) Processing.
Mitigation:
Apache POI 4.1.0 and before: users who do not use the tool XSSFExportToXml are not affected. Affected users are advised to update to Apache POI 4.1.1 which fixes this vulnerability.
Credit: This issue was discovered by Artem Smotrakov from SAP
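For readers unfamiliar with XXE, the sketch below illustrates the general class of problem and one common defence using only the JDK's built-in XML parser. It is not the fix applied inside POI or XMLBeans: the class name `XxeHardening` and the sample payload are invented for the illustration, and the example only shows that a parser configured to disallow DOCTYPE declarations rejects a document that tries to declare an external entity.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class; unrelated to POI's internal implementation.
public class XxeHardening {

    // A document that declares an external entity pointing at a local file.
    public static final String XXE_SAMPLE =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE r [<!ENTITY x SYSTEM \"file:///etc/hostname\">]>"
            + "<r>&x;</r>";

    // Returns true if a parser with DOCTYPE declarations disabled refuses the input.
    public static boolean rejectsDoctype(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // With this feature on, any DOCTYPE declaration is a fatal parse error.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newDocumentBuilder()
                   .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false; // parsed without complaint
        } catch (SAXException e) {
            return true;  // the parser refused the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsDoctype(XXE_SAMPLE));
        System.out.println(rejectsDoctype("<r>ok</r>"));
    }
}
```

For applications that use POI, the mitigation remains the one stated in the advisory: upgrade to a fixed release.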
26 March 2019 — XMLBeans 3.1.0 available
The Apache POI team is pleased to announce the release of XMLBeans 3.1.0. Featured are a handful of bug fixes.
The Apache POI project has unretired the XMLBeans codebase and is maintaining it as a sub-project, due to its importance in the poi-ooxml codebase.
A summary of changes is available in the Release Notes. People interested should also follow the POI dev list to track progress.
The XMLBeans JIRA project has been reopened; feel free to open issues.
POI 4.1.0 uses XMLBeans 3.1.0.
XMLBeans requires Java 6 or newer since version 3.0.2.
11 January 2019 — Initial support for JDK 11
We did some work to verify that compilation with Java 11 is working and that all unit-tests pass.
Mission Statement
The Apache POI Project’s mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft’s OLE 2 Compound Document format (OLE2). In short, you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java. Apache POI is your Java Excel solution (for Excel 97-2008). We have a complete API for porting other OOXML and OLE2 formats and welcome others to participate.
OLE2 files include most Microsoft Office files such as XLS, DOC, and PPT as well as MFC serialization API based file formats. The project provides APIs for the OLE2 Filesystem (POIFS) and OLE2 Document Properties (HPSF).
Office OpenXML Format is the new standards based XML file format found in Microsoft Office 2007 and 2008. This includes XLSX, DOCX and PPTX. The project provides a low level API to support the Open Packaging Conventions using openxml4j.
For each MS Office application there exists a component module that attempts to provide a common high level Java api to both OLE2 and OOXML document formats. This is most developed for Excel workbooks (SS=HSSF+XSSF). Work is progressing for Word documents (WP=HWPF+XWPF) and PowerPoint presentations (SL=HSLF+XSLF).
The project has some support for Outlook (HSMF). Microsoft opened the specifications to this format in October 2007. We would welcome contributions.
As a general policy we collaborate as much as possible with other projects to provide this functionality. Examples include: Cocoon, for which there are serializers for HSSF; OpenOffice.org, with whom we collaborate in documenting the XLS format; and Tika / Lucene, for which we provide format interpreters. When practical, we donate components directly to those projects for POI-enabling them.
Why should I use Apache POI?
A major use of the Apache POI api is for Text Extraction applications such as web spiders, index builders, and content management systems.
So why should you use POIFS, HSSF or XSSF?
You’d use POIFS if you had a document written in OLE 2 Compound Document Format, probably written using MFC, that you needed to read in Java. Alternatively, you’d use POIFS to write OLE 2 Compound Document Format if you needed to inter-operate with software running on the Windows platform. We are not just bragging when we say that POIFS is the most complete and correct implementation of this file format to date!
You’d use HSSF if you needed to read or write an Excel file using Java (XLS). You’d use XSSF if you need to read or write an OOXML Excel file using Java (XLSX). The combined SS interface allows you to easily read and write all kinds of Excel files (XLS and XLSX) using Java. Additionally, there is a specialized SXSSF implementation which allows writing very large Excel (XLSX) files in a memory-optimized way.
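The point of the combined SS interface is that code written against `org.apache.poi.ss.usermodel` does not need to know which file format sits behind it. The sketch below illustrates this under the assumption of a recent POI release (4.x or 5.x); the class name `FormatNeutralWriter` and its methods are hypothetical. Only the core `poi` jar is needed to run it as written, and an `XSSFWorkbook` or `SXSSFWorkbook` from `poi-ooxml` could be passed to `fillSquares` in the same way.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

// Hypothetical demo class, not part of POI.
public class FormatNeutralWriter {

    // Writes n rows of (i, i*i) using only the format-neutral SS interfaces.
    public static void fillSquares(Workbook wb, int n) {
        Sheet sheet = wb.createSheet("Squares");
        for (int i = 0; i < n; i++) {
            Row row = sheet.createRow(i);
            row.createCell(0).setCellValue(i);
            row.createCell(1).setCellValue(i * i);
        }
    }

    // Fills an XLS workbook and reads back the square stored in the last row.
    public static double lastSquare(int n) {
        try (Workbook wb = new HSSFWorkbook()) {
            fillSquares(wb, n);
            return wb.getSheet("Squares").getRow(n - 1).getCell(1).getNumericCellValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lastSquare(5));
    }
}
```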
Components
The Apache POI Project provides several component modules some of which may not be of interest to you. Use the information on our Components page to determine which jar files to include in your classpath.
Contributing
So you’d like to contribute to the project? Great! We need enthusiastic, hard-working, talented folks to help us on the project, no matter your background. So if you’re motivated, ready, and have the time: Download the source from the Subversion Repository, build the code, join the mailing lists, and we’ll be happy to help you get started on the project!
Please read our Contribution Guidelines. When your contribution is ready submit a patch to our Bug Database.
Jxls 2.13.0 is released!
Jxls is a small Java library to make generation of Excel reports easy. Jxls uses a special markup in Excel templates to define output formatting and data layout.
Excel generation is required in many Java applications that have some kind of reporting functionality.
Java has a few libraries for creating Excel files e.g. Apache POI.
Those libraries are great but quite low-level as they require a developer to write a lot of Java code even to create a simple Excel file.
Usually one has to manually set each cell formatting and data for the spreadsheet. Depending on the complexity of the report layout and data formatting the Java code can become quite complex and difficult to debug and maintain.
In addition, not all Excel features are supported or can be manipulated through the libraries' APIs (e.g. there is limited support for macros, graphs, etc.). The suggested workaround for unsupported features is to create an object manually in an Excel template and fill in the template with data after that. Jxls takes this approach to a higher level.
When working with Jxls, one only needs to define the required report formatting and data layout in an Excel template file and then run the Jxls engine to fill in the template with data. A developer needs to write just a little Java code to trigger the Jxls engine's processing of the template.
Features
- XML and binary Excel format output (depends on underlying low-level Java-to-Excel implementation)
- Java collections output by rows and by columns
- Conditional output
- Expression language in report definition markup
- Multiple sheets output
- Native Excel formulas
- Parameterized formulas
- Grouping support
- Merged cells support
- Area listeners to adjust excel generation
- Excel comments mark-up for command definition
- XML mark-up for command definition
- Custom Command definition
- Streaming for fast output and less memory consumption
- Streaming for selected sheets (SelectSheetsForStreamingPoiTransformer)
- Table support