Releases: StarRocks/starrocks

Release notes 2.1.0

28 Feb 04:45
1864de0

New Features

  • [Preview] StarRocks now supports Iceberg external tables.
  • [Preview] The pipeline engine is now available. It is a new execution engine designed for multicore scheduling. The query parallelism can be adaptively adjusted without the need to set the parallel_fragment_exec_instance_num parameter. This also improves performance in high concurrency scenarios.
  • The CTAS (Create Table As Select) function is supported, making ETL and table creation easier.
  • SQL fingerprint is supported. A SQL fingerprint is generated in audit.log, which makes slow queries easier to locate.
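
To illustrate the idea behind a SQL fingerprint, here is a minimal sketch that maps queries differing only in literal values to the same hash. It is not StarRocks' actual algorithm: the function name `sql_fingerprint`, the normalization rules, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import re


def sql_fingerprint(sql: str) -> str:
    """Hypothetical fingerprint: identical for queries that differ only in literals."""
    text = sql.strip().rstrip(";").lower()
    # Replace single-quoted string literals with a placeholder.
    text = re.sub(r"'(?:[^'\\]|\\.)*'", "?", text)
    # Replace standalone numeric literals (digits inside identifiers are untouched).
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "?", text)
    # Collapse runs of whitespace so formatting differences do not matter.
    text = re.sub(r"\s+", " ", text).strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = sql_fingerprint("SELECT * FROM t WHERE id = 42")
    b = sql_fingerprint("select *  from t where id = 7;")
    print(a == b)
```

Grouping audit log entries by such a value lets all executions of one query shape be counted together, regardless of the parameters used.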

Improvements

  • Compaction is optimized. A flat table can contain up to 10,000 columns.
  • The performance of first-time scan and page cache is optimized. Random I/O is reduced to improve first-time scan performance. The improvement is more noticeable if first-time scan occurs on SATA disks. StarRocks' page cache can store original data, which eliminates the need for bitshuffle encoding and unnecessary decoding. This improves the cache hit rate and query efficiency.
  • Schema change is supported in the primary key model. You can add, delete, and modify bitmap indexes by using ALTER TABLE.
  • [Preview] The size of a string can be up to 1 MB.
  • JSON load performance is optimized. You can load more than 100 MB of JSON data in a single file.
  • Bitmap index performance is optimized.
  • The performance of StarRocks Hive external tables is optimized. Data in the CSV format can be read.
  • DEFAULT CURRENT_TIMESTAMP is supported in the CREATE TABLE statement. #1193
  • StarRocks supports the loading of CSV files with multiple delimiters.
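
As a rough illustration of the last item, the sketch below splits CSV-like text whose column separator is longer than one character. It reads "multiple delimiters" as a multi-character separator, which is an assumption; the helper `split_rows` and its parameters are hypothetical and do not correspond to a StarRocks API.

```python
from typing import List


def split_rows(payload: str, column_sep: str, row_sep: str = "\n") -> List[List[str]]:
    """Split text into rows and columns using (possibly multi-character) separators."""
    # Drop empty rows, such as the one produced by a trailing row separator.
    rows = [row for row in payload.split(row_sep) if row]
    # str.split treats the whole separator string as one delimiter.
    return [row.split(column_sep) for row in rows]


if __name__ == "__main__":
    data = "1||alice||2022-01-01\n2||bob||2022-01-02\n"
    for columns in split_rows(data, column_sep="||"):
        print(columns)
```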

BugFix
The following bugs are fixed:

  • Auto __op mapping does not take effect if jsonpaths is specified in the command used for loading JSON data. #3405
  • BE nodes fail because the source data changes during data loading using Broker Load. #3481
  • Some SQL statements report errors after materialized views are created. #2975
  • The routine load does not work due to quoted jsonpaths. #2488
  • Query concurrency decreases sharply when the number of columns to query exceeds 200.

Behavior Changes

  • The API for disabling a Colocation Group is changed from DELETE /api/colocate/group_stable to POST /api/colocate/group_unstable.
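
For scripts that call the old endpoint, the change can be sketched as follows. Only the HTTP method and path come from the note above; the host name, the port `8030`, and the `db_id` and `group_id` query parameters are assumptions, and authentication is omitted. The sketch builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def mark_group_unstable_request(fe_host: str, http_port: int, db_id: int, group_id: int) -> Request:
    """Build (but do not send) the new-style request for disabling a Colocation Group."""
    # Query parameter names are assumed for illustration.
    query = urlencode({"db_id": db_id, "group_id": group_id})
    url = f"http://{fe_host}:{http_port}/api/colocate/group_unstable?{query}"
    # Before 2.1.0 the equivalent call was DELETE /api/colocate/group_stable.
    return Request(url, method="POST")


if __name__ == "__main__":
    req = mark_group_unstable_request("fe.example.com", 8030, 10005, 10008)
    print(req.get_method(), req.full_url)
```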

Others

  • flink-connector-starrocks is now available for Flink to read StarRocks data in batches. This improves data read efficiency compared to the JDBC connector.

Release notes 2.0.2

02 Mar 04:42
f096d7f

Improvement

  • Memory usage is optimized. Users can specify the label_keep_max_num parameter to control the maximum number of loading jobs to retain within a period of time. This prevents full GC caused by high memory usage of FE during frequent data loading. #2410
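
The effect of capping retained load jobs can be sketched with a bounded history that evicts the oldest entry once the limit is exceeded. This is only a conceptual illustration, not the FE implementation: the class `LoadJobHistory` is hypothetical, and only the parameter name `label_keep_max_num` comes from the note above.

```python
from collections import OrderedDict
from typing import List


class LoadJobHistory:
    """Hypothetical bounded record of finished load jobs, keyed by label."""

    def __init__(self, label_keep_max_num: int) -> None:
        self.max_num = label_keep_max_num
        self._jobs: "OrderedDict[str, str]" = OrderedDict()

    def record(self, label: str, state: str) -> None:
        self._jobs[label] = state
        self._jobs.move_to_end(label)
        # Evict the oldest entries so memory use stays bounded.
        while len(self._jobs) > self.max_num:
            self._jobs.popitem(last=False)

    def labels(self) -> List[str]:
        return list(self._jobs)


if __name__ == "__main__":
    history = LoadJobHistory(label_keep_max_num=2)
    for label in ("load_a", "load_b", "load_c"):
        history.record(label, "FINISHED")
    print(history.labels())
```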

BugFix
The following bugs are fixed:

  • BE nodes fail when the column decoder encounters an exception. #3510
  • Auto __op mapping does not take effect when jsonpaths is specified in the command used for loading JSON data. #3405
  • BE nodes fail because the source data changes during data loading using Broker Load. #3481
  • Some SQL statements report errors after materialized views are created. #3053
  • A query may fail if a SQL clause contains both a predicate that supports global dictionary for low-cardinality optimization and a predicate that does not. #3421

Release notes 2.0.1

22 Jan 02:15
f0de9ec

Release date: Jan 21, 2022

Improvement

  • Add implicit_cast in Hive external table (#2829)
  • Use a read/write lock to avoid excessive CPU cost when collecting metrics (#2901)
  • Optimize some statistics.

Bugfix

  • Fix global dictionary when replicas are inconsistent. (#2700) (#2765)
  • Add exec_mem_limit for Stream Load (#2693)
  • Fix OOM when loading into primary key model. (#2743) (#2777)
  • Fix BE hang when querying an external MySQL table (#2881)

Release notes 2.0.0-GA

04 Jan 13:43
Pre-release

Release date: Jan 4, 2022

New Feature

  • External Table
    • [Experimental Function] Support for Hive external tables on S3
    • DecimalV3 support for external tables #425
  • Complex expressions can be pushed down to the storage layer for computation, which improves performance
  • Primary Key is officially released. It supports Stream Load, Broker Load, and Routine Load, and also provides a tool based on Flink-cdc that synchronizes MySQL data with second-level latency

Improvement

  • Arithmetic operators optimization
    • Optimize the performance of dictionary with low cardinality #791
    • Optimize the scan performance of int for single table #273
    • Optimize the performance of count(distinct int) with high cardinality #139 #250 #544 #570
    • Optimize the execution-level implementation of Group by 2 int / limit / case when / not equal
  • Memory management optimization
    • Refactor the memory statistics and control framework to count memory usage accurately and completely resolve OOM
    • Optimize metadata memory usage
    • Fix the problem that releasing a large amount of memory blocks execution threads for a long time
    • Add process graceful exit mechanism and support memory leak check #1093

Bugfix

  • Fix the problem that fetching a large amount of metadata for a Hive external table times out.
  • Fix the unclear error message for materialized view creation.
  • Fix the implementation of like in the vectorization engine #722
  • Fix the error in parsing the is predicate in alter table #725
  • Fix the problem that the curdate function cannot format the date.

Release notes 1.19.5

20 Dec 12:53
a356769

Improvement

  • Improve shuffle bucket plan (#2184)
  • Add a num_segment threshold when loading multiple big files (#2067)

Bugfix

Release notes 1.19.4

10 Dec 10:11
da84cf9

Improvement

  • Support cast(varchar as bitmap) (#1941)
  • Modify Hive external table scan scheduling strategy (#1394) (#1807)

Bugfix

  • Fix cross join lose predicate in JoinAssociativityRule (#1918)
  • Fix cast to decimal(0,0) bug (#1709) (#1738)
  • Fix replicate join in plan fragment builder (#1727)
  • Fix several planner cost calculations

Release notes 1.19.3

02 Dec 04:16
e65759c

Improvement

  • Upgrade jprotobuf to enhance security (#1506)

Major Bugfix

  • Fix some CBO bad cases and correctness issues.
  • Fix grouping sets with same column bug (#1395) (#1119)
  • Fix some date function issues (#1385) (#1627)
  • Fix streaming aggregation issue (#1584)

Release notes 1.19.2

02 Dec 03:24
b04a782

Improvement

  • Bucket shuffle join supports right join and full outer join (#1209) (#1234)

Major Bugfix

  • Support pushing down predicates through the repeat node (#1410) (#1417)
  • Fix routine load data loss bug when the FE master changes (#1074) (#1272)
  • Fix view creation failure with union (#1083)
  • Fix some Hive external table stability issues (#1408)
  • Fix select group by view error (#1231)

Release notes 1.19.1

02 Nov 10:56
65e87c3

Improvement

  • Optimize the performance of show frontends. #507 #984
  • Add monitoring of slow queries and meta logs. #502 #891
  • Optimize the fetching of Hive external metadata to achieve parallel fetching. #425 #451

BugFix

  • Fix the Thrift protocol compatibility problem so that Hive external tables can connect with Kerberos. #184 #947 #995 #999
  • Fix several bugs in view creation. #972 #987 #1001
  • Fix the problem that FE cannot be upgraded in grayscale. #485 #890

Release notes 1.19.0

03 Nov 04:55
6a8f072

New Feature

  • Implement Global Runtime Filter, which can enable runtime filter for shuffle join.
  • CBO Planner is enabled by default, with improvements to colocated join, bucket shuffle, statistics estimation, and more.
  • [Experimental Function] Primary key model: to better support real-time and frequent updates, StarRocks adds a new table type, the primary key model. The model supports Stream Load, Broker Load, Routine Load, and JSON import, and also provides a tool based on Flink-cdc that synchronizes MySQL data with second-level latency.
  • [Experimental Function] External tables support writes. You can write data to a table in another StarRocks cluster through an external table, which addresses read/write separation requirements and provides better resource isolation.
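
The Global Runtime Filter item above can be pictured with a small sketch: join keys collected from the build side are used to discard probe-side rows before they would be shuffled and joined. This is a conceptual illustration under stated assumptions, not StarRocks' implementation. It uses an exact Python set where a real engine would more likely use a compact structure such as a Bloom filter, and all function names are hypothetical.

```python
from typing import Dict, List, Set

Row = Dict[str, int]


def build_runtime_filter(build_rows: List[Row], key: str) -> Set[int]:
    """Collect the join keys present on the build side."""
    return {row[key] for row in build_rows}


def apply_runtime_filter(probe_rows: List[Row], key: str, runtime_filter: Set[int]) -> List[Row]:
    """Drop probe rows that cannot match, before they would be shuffled."""
    return [row for row in probe_rows if row[key] in runtime_filter]


def hash_join(build_rows: List[Row], probe_rows: List[Row], key: str) -> List[Row]:
    """Plain inner hash join, used here to show the filter does not change results."""
    table: Dict[int, List[Row]] = {}
    for row in build_rows:
        table.setdefault(row[key], []).append(row)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]


if __name__ == "__main__":
    dims = [{"id": 1, "region": 10}, {"id": 2, "region": 20}]
    facts = [{"id": 1, "v": 100}, {"id": 3, "v": 300}, {"id": 2, "v": 200}, {"id": 4, "v": 400}]
    kept = apply_runtime_filter(facts, "id", build_runtime_filter(dims, "id"))
    print(len(facts), "->", len(kept), "probe rows after filtering")
    print(hash_join(dims, kept, "id") == hash_join(dims, facts, "id"))
```

The join result is the same with or without the filter; the benefit is that fewer probe rows need to be moved and hashed.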

Improvement

  • Performance optimization.
    • count distinct int statement
    • group by int statement
    • or statement
  • Optimize disk balance algorithm. Data can be automatically balanced after adding disks to a single machine.
  • Support partial column export.
  • Optimize show processlist to show specific SQL.
  • Support multiple variable settings in SET_VAR.
  • Improve error messages, including those for table_sink, routine load, and creation of materialized views.

Bugfix

  • Fix the issue that the dynamic partition table cannot be created automatically after the data recovery operation is completed. #337
  • Fix the error reported by the row_number function after CBO is enabled.
  • Fix the problem that FE gets stuck due to statistics collection.
  • Fix the problem that set_var takes effect for the session but not for statements.
  • Fix the problem that select count(*) returns abnormal results on Hive partitioned external tables.