[Improve][Connector-V2] Change read excel util from POI to EasyExcel #8064
base: dev
@@ -54,7 +55,7 @@ public class ExcelReadStrategyTest {

     @Test
     public void testExcelRead() throws IOException, URISyntaxException {
-        testExcelRead("/excel/test_read_excel.xlsx");
+        // testExcelRead("/excel/test_read_excel.xlsx");
why disable this?
This is the test Excel file used by the commented-out test. The date string that needs to be converted is 2024/1/31, and the cell format is:
{mso-generic-font-family:auto;
mso-font-charset:134;
mso-number-format:"yyyy/m/d"; }
In POI, we can get the correct data type from the cell's format, but in EasyExcel we only get the string. Converting that string to a Date fails because it does not match the defined yyyy/MM/dd format, which makes the test case fail, so I commented out this one test case.
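To make the mismatch concrete, here is a minimal sketch using plain `java.time`, not the PR's actual conversion code. It assumes the reader only receives the raw cell string. The class `CellDateParseSketch` and its helpers are hypothetical names used for illustration only.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical sketch: not part of SeaTunnel, POI, or EasyExcel.
class CellDateParseSketch {
    // Zero-padded pattern: month and day must be exactly two digits.
    static final DateTimeFormatter PADDED = DateTimeFormatter.ofPattern("yyyy/MM/dd");
    // Pattern mirroring the cell number format "yyyy/m/d": one or two digits.
    static final DateTimeFormatter UNPADDED = DateTimeFormatter.ofPattern("yyyy/M/d");

    // Returns true when the raw string parses with the given formatter.
    static boolean parsesWith(String raw, DateTimeFormatter formatter) {
        try {
            LocalDate.parse(raw, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Tries the padded pattern first, then falls back to the unpadded one.
    static LocalDate parseCellDate(String raw) {
        try {
            return LocalDate.parse(raw, PADDED);
        } catch (DateTimeParseException e) {
            return LocalDate.parse(raw, UNPADDED);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesWith("2024/1/31", PADDED));   // padded pattern rejects "1"
        System.out.println(parsesWith("2024/1/31", UNPADDED)); // unpadded pattern accepts it
        System.out.println(parseCellDate("2024/1/31"));
    }
}
```

A fallback chain like this is only one possible way to keep the old behavior; whether it fits the connector's existing date handling is an open question for the PR.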
I think we should find a way to make sure the old behavior does not change, or add an option that lets users choose between POI and EasyExcel.
Okay, I'll find a way to deal with it
@@ -15,8 +15,9 @@
  * limitations under the License.
  */

-package org.apache.seatunnel.connectors.seatunnel.file.writer;
+package org.apache.seatunnel.connectors.seatunnel.file.Reader;
-package org.apache.seatunnel.connectors.seatunnel.file.Reader;
+package org.apache.seatunnel.connectors.seatunnel.file.reader;
https://github.com/apache/seatunnel/runs/33188901598 @dwave Please enable the CI workflow.
…m that accuracy is missing due to time type parsing.
380e82b to eca2c5d
Okay, it's already enabled.
...nnectors-v2/connector-file/connector-file-base/src/test/resources/excel/test_read_excel.conf
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>easyexcel</artifactId>
    <version>${easyexcel.version}</version>
</dependency>
or easyexcel-plus?
I will give it a try
> or easyexcel-plus?

easyexcel-plus only appeared on GitHub last night, and I haven't seen it in the Maven repository yet.
As we all know, easyexcel is no longer maintained. It doesn't seem good to introduce it at this time. We can try other alternatives, such as fastexcel. There are also reports online that it is faster than easyexcel. What do you think? cc @hailin0
I tried using fastexcel, but there is a problem with its support for the Excel 97-2003 .xls format.
Oh, let's add an option to configure the Excel parse engine. It should default to POI and support POI and EasyExcel for now, so we can implement other engines in the future.
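One way such an option could be modeled is sketched below. The names are assumptions and do not come from this PR or from SeaTunnel's API: the enum `ExcelEngine`, the interface `ExcelRowReader`, and the factory `createReader` are all hypothetical, and the readers are stubs that only report which engine was chosen.

```java
import java.util.Locale;

// Hypothetical sketch of a pluggable Excel parse engine, defaulting to POI.
class ExcelEngineSketch {
    // Picks a reader implementation for the requested engine (stub readers only).
    static ExcelRowReader createReader(ExcelEngine engine) {
        switch (engine) {
            case EASY_EXCEL:
                return () -> "easyexcel";
            case POI:
            default:
                return () -> "poi";
        }
    }

    public static void main(String[] args) {
        // An unset option falls back to the default engine.
        System.out.println(createReader(ExcelEngine.fromOption(null)).engineName());
        System.out.println(createReader(ExcelEngine.fromOption("easyexcel")).engineName());
    }
}

// Hypothetical engine list; new engines would be added as new constants.
enum ExcelEngine {
    POI,
    EASY_EXCEL;

    // Resolves a user-supplied option value; null or blank means the POI default.
    static ExcelEngine fromOption(String value) {
        if (value == null || value.trim().isEmpty()) {
            return POI;
        }
        String normalized = value.trim().toUpperCase(Locale.ROOT).replace('-', '_');
        if (normalized.equals("EASYEXCEL")) {
            normalized = "EASY_EXCEL";
        }
        // valueOf throws IllegalArgumentException for unknown engine names.
        return valueOf(normalized);
    }
}

// Hypothetical reader abstraction; a real one would expose row-reading methods.
interface ExcelRowReader {
    String engineName();
}
```

Keeping POI as the default means existing jobs keep their current behavior unless the user opts in to another engine.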
Okay
> Oh, let's add an option to configure the Excel parse engine. It should default to POI and support POI and EasyExcel for now, so we can implement other engines in the future.

Will there be any conflict between POI versions?
.../src/main/java/org/apache/seatunnel/connectors/seatunnel/file/excel/ExcelReaderListener.java
…/main/java/org/apache/seatunnel/connectors/seatunnel/file/excel/ExcelReaderListener.java Co-authored-by: corgy-w
#8040