Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add table and column information to query errors #14388

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

bziobrowski
Copy link
Contributor

At the moment exceptions thrown during query execution are often missing basic table and column information, making it hard diagnose issues.

For example, if dat is a string column containing dates then issuing following query :

select key , max(dat) as last_seen_date 
from table
group by key
limit 10

might return
NumberFormatException: For input string: "2021-10-01T09:01:00-0800".

This PR adds table/alias and column/expression information to conversion errors, e.g.

Error when processing table.dat: NumberFormatException: For input string: "2021-10-01T09:01:00-0800".

+ " LIMIT 200 "
+ ") "
+ " GROUP BY time_col ";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to add test function to registry but make it work only if a system flag is set, e.g. enableTestFunctions, than to rely on fragile mocks.

* under the License.
*/
package org.apache.pinot.core.common;

Copy link
Contributor Author

@bziobrowski bziobrowski Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one's a just a simple exception that makes it possible to fetch important context information spread throught the stack.

I'd be good to discuss how to integrate the approach with other exceptions and re-organize exceptions thrown by the engine. There are many more places that don't report conversion errors properly (e.g. plenty of parse calls for function arguments that won't report which argument they failed for).

@Jackie-Jiang @gortiz @yashmayya @vrajat

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really nice improvement! IMO we could add similar custom runtime exceptions for different classes of processing errors as a subsequent enhancement. Although in that case it might be nice to name this class something more specific than PinotRuntimeException, WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree we need to have our own family of exceptions, but I think it will need more discussion.

My initial suggestion is to reserve PinotRuntimeException name for a marker exception that extends RuntimeException, then add another for queries (something like QueryException) and then

---
config:
  look: handDrawn
  theme: neutral
---
classDiagram
    RuntimeException
    note for RuntimeException "From JDK"

    PinotRuntimeException
    note for PinotRuntimeException "All new Pinot exceptions"

    QException
    note for QException "All new Pinot exceptions"

    QException <|-- ParsingQException
    QException <|-- OptimizationQException

    ExecutionQException
    note for ExecutionQException "Exceptions thrown by Operators"

    RuntimeException <|-- PinotRuntimeException
    PinotRuntimeException <|-- QException
    QException <|-- ExecutionQException
    ExecutionQException <|-- ColumnExecutionQException
    
    class ExecutionQException {
        +stage
    }

    class ColumnExecutionQException {
        +table
        +column
    }
    note for ColumnExecutionQException "Execution exception dealing with a column"
Loading

@codecov-commenter
Copy link

codecov-commenter commented Nov 5, 2024

Codecov Report

Attention: Patch coverage is 35.88710% with 159 lines in your changes missing coverage. Please review.

Project coverage is 63.76%. Comparing base (59551e4) to head (4d1e4f2).
Report is 1288 commits behind head on master.

Files with missing lines Patch % Lines
.../core/operator/docvalsets/RowBasedBlockValSet.java 16.66% 50 Missing and 5 partials ⚠️
...erator/docvalsets/FilteredRowBasedBlockValSet.java 12.96% 43 Missing and 4 partials ⚠️
.../aggregation/function/TestAggregationFunction.java 12.96% 46 Missing and 1 partial ⚠️
...a/org/apache/pinot/core/common/DataBlockCache.java 80.00% 4 Missing ⚠️
...pache/pinot/core/common/PinotRuntimeException.java 85.00% 1 Missing and 2 partials ⚠️
...ore/operator/docvalsets/ProjectionBlockValSet.java 50.00% 2 Missing ⚠️
...perator/query/NonScanBasedAggregationOperator.java 90.90% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14388      +/-   ##
============================================
+ Coverage     61.75%   63.76%   +2.01%     
- Complexity      207     1555    +1348     
============================================
  Files          2436     2662     +226     
  Lines        133233   146118   +12885     
  Branches      20636    22350    +1714     
============================================
+ Hits          82274    93173   +10899     
- Misses        44911    46058    +1147     
- Partials       6048     6887     +839     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 55.41% <35.88%> (-6.30%) ⬇️
java-21 63.66% <35.88%> (+2.03%) ⬆️
skip-bytebuffers-false 63.73% <35.88%> (+1.98%) ⬆️
skip-bytebuffers-true 63.64% <35.88%> (+35.91%) ⬆️
temurin 63.76% <35.88%> (+2.01%) ⬆️
unittests 63.76% <35.88%> (+2.01%) ⬆️
unittests1 55.49% <35.88%> (+8.60%) ⬆️
unittests2 34.09% <0.00%> (+6.36%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

* under the License.
*/
package org.apache.pinot.core.common;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really nice improvement! IMO we could add similar custom runtime exceptions for different classes of processing errors as a subsequent enhancement. Although in that case it might be nice to name this class something more specific than PinotRuntimeException, WDYT?

Comment on lines +56 to +71
StringBuilder message = new StringBuilder("Error when processing");
if (_tableName != null) {
message.append(" ").append(_tableName);
}
if (_columnName != null) {
if (_tableName != null) {
message.append(".");
} else {
message.append(" ");
}
message.append(_columnName);
}
message.append(": ");
message.append(super.getMessage());

return message.toString();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be clearer to change this to something like Error when processing column "abc" in table "xyz"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to suggestions. We've to keep in mind that table could actually mean alias and column - an expression - e.g. max(x) so it should be generic but still readable :).
I guess we can add more useful bits of context data.

@@ -127,6 +128,9 @@ public float[] getFloatValuesSV() {
try (InvocationScope scope = Tracing.getTracer().createScope(ProjectionBlockValSet.class)) {
recordReadValues(scope, DataType.FLOAT, true);
return _dataBlockCache.getFloatValuesForSVColumn(_column);
} catch (RuntimeException re) {
//add column information, then later enrich it with table name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then later enrich it with table name

I presume this means we don't have access to the table name information in this class currently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think table name was missing when I debugged some cases.
Let me double check .

@@ -135,6 +139,9 @@ public double[] getDoubleValuesSV() {
try (InvocationScope scope = Tracing.getTracer().createScope(ProjectionBlockValSet.class)) {
recordReadValues(scope, DataType.DOUBLE, true);
return _dataBlockCache.getDoubleValuesForSVColumn(_column);
} catch (RuntimeException re) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the error handling only added for float and double here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably because it's really hard to trigger those conversions in both engines :)
You're right, I'll add them and try to test with stub aggregate.

return ((Number) value).doubleValue();
} else {
return Double.parseDouble(value.toString());
private static Double toDouble(Comparable<?> value, DataSource dataSource) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just pass in the column name directly here I suppose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess so but I'm not sure how sensitive the place and tried to minimize potential overhead.

@@ -129,7 +129,11 @@ public int[] getIntValuesForSVColumn(String column) {
intValues = new int[_length];
putValues(FieldSpec.DataType.INT, column, intValues);
}
_dataFetcher.fetchIntValues(column, _docIds, _length, intValues);
try {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nit: Have you considered adding the catch block in

instead since there is a try block already ?

@@ -122,12 +123,13 @@ public void runJob() {
// NOTE: We need to handle Error here, or the execution threads will die without adding the execution
// exception into the query response, and the main thread might wait infinitely (until timeout) or
// throw unexpected exceptions (such as NPE).
PinotRuntimeException pe = PinotRuntimeException.create(t).withTableName(_queryContext.getTableName());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue is that other types of exceptions like NPE may get missed when looking at stack traces. I know that the NPE will still appear in the stack trace etc. However as on-call our senses are trained to certain exception types in the top message. Is there an example of how the error looks like ? Maybe you can simulate by forcibly throwing an NPE ?

values[matchedRowId] = ((Number) _rows.get(rowId)[_colId]).longValue();
} else {
values[matchedRowId] = Long.parseLong((String) _rows.get(rowId)[_colId]);
try {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cant comment at the right line no. In line 136, column names maybe useful in Precondition checks as well. I dont know how often they are triggered though.

Copy link
Contributor

@gortiz gortiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name PinotRuntimeException is too generic and therefore cannot be used to identify an error associated with a column during query execution. We need to think about other areas of the code like ingeston or system management that can also fail with runtime exceptions.

I'm not asking to create the whole exception hierarchy I suggested in https://github.com/apache/pinot/pull/14388/files#r1829241167, but at least we need to rename the exception to something like ColumnExecutionQueryException (or some acronym like ColExecQException).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants