fix nrows when skip_empty_area=False; increase performance for some c… #37

Merged: dimastbk merged 1 commit into master from fix-nrows on Sep 12, 2023

dimastbk (Owner) commented Sep 8, 2023

  1. Fix the nrows argument of to_python when skip_empty_area is False.
  2. Improve performance when nrows is small and skip_empty_area is False. Below is a benchmark that reads the files from https://github.com/MarkPflug/Benchmarks/tree/main/source/Benchmarks/Data using pandas.read_excel. The first table in each group (ReadExcel) reads the whole file; the second (ReadExcelNRows) reads only the first 10 rows.

Before:

[75.00%] ··· io.excel.ReadExcel.time_read_excel                                                                   1/10 failed
[75.00%] ··· ========== ============ ============ ============ ============ =========
             --                                      ext                             
             ---------- -------------------------------------------------------------
               engine       ods          xlsx         xlsm         xlsb        xls   
             ========== ============ ============ ============ ============ =========
              calamine   1.62±0.01s   1.47±0.01s    1.47±0s      926±5ms     956±3ms 
              default      failed      5.31±0s     5.31±0.01s   3.60±0.01s   1.31±0s 
             ========== ============ ============ ============ ============ =========
             For parameters: 'default', 'ods'
             
             
             asv: benchmark timed out (timeout 60.0s)

[100.00%] ··· io.excel.ReadExcelNRows.time_read_excel                                                    1/10 failed
[100.00%] ··· ========== ========= ============= ============= ============= ==========
              --                                      ext                              
              ---------- --------------------------------------------------------------
                engine      ods         xlsx          xlsm          xlsb        xls    
              ========== ========= ============= ============= ============= ==========
               calamine   921±8ms     761±9ms       763±8ms       221±3ms     232±10ms 
               default     failed   13.8±0.09ms   13.8±0.07ms   62.2±0.08ms   794±2ms  
              ========== ========= ============= ============= ============= ==========
              For parameters: 'default', 'ods'
              
              
              asv: benchmark timed out (timeout 60.0s)

After:

[75.00%] ··· io.excel.ReadExcel.time_read_excel                                                                                                                           1/10 failed
[75.00%] ··· ========== ============ ============ ============ ========== =========
             --                                     ext                            
             ---------- -----------------------------------------------------------
               engine       ods          xlsx         xlsm        xlsb       xls   
             ========== ============ ============ ============ ========== =========
              calamine   1.60±0.01s   1.51±0.01s   1.50±0.01s   954±20ms   930±4ms 
              default      failed     5.31±0.01s   5.35±0.01s   3.63±0s    1.32±0s 
             ========== ============ ============ ============ ========== =========
             For parameters: 'default', 'ods'
             
             
             asv: benchmark timed out (timeout 60.0s)

[100.00%] ··· io.excel.ReadExcelNRows.time_read_excel                                                                                                                      1/10 failed
[100.00%] ··· ========== ========= ============ ============ ============ =========
              --                                    ext                            
              ---------- ----------------------------------------------------------
                engine      ods        xlsx         xlsm         xlsb        xls   
              ========== ========= ============ ============ ============ =========
               calamine   795±2ms    638±1ms      639±2ms     98.4±0.3ms   109±1ms 
               default     failed   14.4±0.1ms   14.3±0.4ms   62.8±0.4ms   810±4ms 
              ========== ========= ============ ============ ============ =========
              For parameters: 'default', 'ods'
              
              
              asv: benchmark timed out (timeout 60.0s)
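
To make the change easier to read off the tables, the following minimal sketch computes the before/after ratio for the calamine engine in the ReadExcelNRows benchmark. The timings are copied from the tables above (mean values only, ignoring the ± spread); the helper name `speedups` is illustrative and not part of asv, pandas, or python-calamine.

```python
# Mean timings in milliseconds for the calamine engine in
# io.excel.ReadExcelNRows, copied from the "Before" and "After" tables.
before_ms = {"ods": 921, "xlsx": 761, "xlsm": 763, "xlsb": 221, "xls": 232}
after_ms = {"ods": 795, "xlsx": 638, "xlsm": 639, "xlsb": 98.4, "xls": 109}


def speedups(before, after):
    """Return the before/after ratio per extension (values > 1 mean faster)."""
    return {ext: before[ext] / after[ext] for ext in before}


ratios = speedups(before_ms, after_ms)
for ext, ratio in ratios.items():
    print(f"{ext}: {before_ms[ext]} ms -> {after_ms[ext]} ms ({ratio:.2f}x)")
```

By this measure the xlsb and xls readers gain the most (a bit over 2x), while ods, xlsx, and xlsm improve by roughly 15 to 20 percent. The full-file ReadExcel timings are essentially unchanged, which is consistent with the optimization targeting small nrows.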

dimastbk merged commit a690b3b into master on Sep 12, 2023
26 checks passed
dimastbk deleted the fix-nrows branch on September 12, 2023 06:16
dimastbk mentioned this pull request on Nov 12, 2023