Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CIDR function to PPL (#3036) #3110

Merged
merged 40 commits into from
Dec 3, 2024
Merged
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
bd264a0
Add stub implementation of `cidr`:
currantw Oct 22, 2024
c2ecdf7
Initial implementation of `isAddressInRange` method.
currantw Oct 23, 2024
bfcbbf5
Add `BinaryPredicateOperator` tests for `cidr`. Update `isAddressInRa…
currantw Oct 23, 2024
093696a
Fix a couple documentation typos.
currantw Oct 23, 2024
f0464d3
Add documentation for `cidr` function.
currantw Oct 23, 2024
ce8d190
Add documentation for `cidr` function.
currantw Oct 23, 2024
6d56ac9
Add license header to `IPUtils`.
currantw Oct 23, 2024
96aaa89
Revert `final` and `toString` cleanup.
currantw Oct 23, 2024
04af98d
Update to use `CidrExpression`:
currantw Oct 29, 2024
f2ca434
Refactor to replace `CidrExpression` with `IPFunctions`. Plus prelimi…
currantw Oct 29, 2024
efe26b8
Address review comments - minor code cleanup.
currantw Oct 29, 2024
824c463
Fix failing `IPFunction` unit tests.
currantw Oct 29, 2024
3ffac2e
Reformat new files with Spotless.
currantw Oct 29, 2024
128bb4b
Fix typos in `UnaryPredicateOperator`.
currantw Oct 29, 2024
3f267a7
Rename `cidr` to `cidrmatch` for consistency with Spark (see issue #7…
currantw Oct 29, 2024
d473fc7
Minor documentation to better align with Spark `cidrmatch` behaviour.
currantw Oct 29, 2024
9ba705a
Refactor implementation to use `com.github.seancfoley:ipaddress:5.4.2…
currantw Oct 30, 2024
c95152e
Update OpenSearchIpType to covert to string type.
currantw Oct 31, 2024
9c7f5b6
Run Spotless.
currantw Oct 31, 2024
b7fb7f7
Minor cleanup.
currantw Oct 31, 2024
4b60ae1
Update typos to fix failing unit tests, plus run Spotless.
currantw Oct 31, 2024
08af013
Fix remaining failing unit tests.
currantw Oct 31, 2024
4b05c24
Address minor review comments.
currantw Nov 7, 2024
17decc8
Apply Spotless
currantw Nov 7, 2024
e8b6f4e
Cleanup: remove a few redundant local variables.
currantw Nov 7, 2024
ac02e67
More Spotless
currantw Nov 8, 2024
8c17404
Add missing unit test
currantw Nov 8, 2024
20c9672
Get rid of wildcard imports.
currantw Nov 12, 2024
fb0be94
Fix more wildcard import statements
currantw Nov 12, 2024
3af85af
Add miss test data for doctest
currantw Nov 12, 2024
aa9cb85
Fix some more failing integration tests
currantw Nov 13, 2024
cff2418
Fix failing doc test
currantw Nov 14, 2024
8324c4c
Address minor review comments.
currantw Nov 19, 2024
e758077
Revert changes to cast IP to string.
currantw Nov 21, 2024
677bdc8
Fix failing tests and add TODOs for #3145 (add support for IP address…
currantw Nov 22, 2024
43d05d6
Update `ip.rst` for consistency with Spark
currantw Nov 28, 2024
23219cb
Update `ip.rst` for consistency with Spark
currantw Nov 28, 2024
d3be058
Rename "function" and "operator" utility classes to be plural.
currantw Dec 2, 2024
c216763
Minor documentation updates
currantw Dec 2, 2024
1b608c6
Apply spotless
currantw Dec 2, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions core/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ dependencies {
api group: 'com.google.code.gson', name: 'gson', version: '2.8.9'
api group: 'com.tdunning', name: 't-digest', version: '3.3'
api project(':common')
implementation "com.github.seancfoley:ipaddress:5.4.2"
currantw marked this conversation as resolved.
Show resolved Hide resolved

testImplementation('org.junit.jupiter:junit-jupiter:5.9.3')
testImplementation group: 'org.hamcrest', name: 'hamcrest-library', version: '2.1'
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,8 @@ public class QueryEngineException extends RuntimeException {
public QueryEngineException(String message) {
super(message);
}

public QueryEngineException(String message, Throwable cause) {
super(message, cause);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,12 @@

/** Semantic Check Exception. */
public class SemanticCheckException extends QueryEngineException {

public SemanticCheckException(String message) {
super(message);
}

public SemanticCheckException(String message, Throwable cause) {
super(message, cause);
}
}
4 changes: 4 additions & 0 deletions core/src/main/java/org/opensearch/sql/expression/DSL.java
Original file line number Diff line number Diff line change
Expand Up @@ -563,6 +563,10 @@ public static FunctionExpression regexp(Expression... expressions) {
return compile(FunctionProperties.None, BuiltinFunctionName.REGEXP, expressions);
}

public static FunctionExpression cidrmatch(Expression... expressions) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why you added the match here:

Suggested change
public static FunctionExpression cidrmatch(Expression... expressions) {
public static FunctionExpression cidr(Expression... expressions) {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@acarbonetto good catch 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, as mentioned, I used cidrmatch for consistency with Spark (even though I agree that cidr is a better name...)

return compile(FunctionProperties.None, BuiltinFunctionName.CIDRMATCH, expressions);
}

public static FunctionExpression concat(Expression... expressions) {
return compile(FunctionProperties.None, BuiltinFunctionName.CONCAT, expressions);
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,9 @@ public enum BuiltinFunctionName {
/** Text Functions. */
TOSTRING(FunctionName.of("tostring")),

/** IP Functions. */
CIDRMATCH(FunctionName.of("cidrmatch")),
currantw marked this conversation as resolved.
Show resolved Hide resolved

/** Arithmetic Operators. */
ADD(FunctionName.of("+")),
ADDFUNCTION(FunctionName.of("add")),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
import org.opensearch.sql.expression.aggregation.AggregatorFunction;
import org.opensearch.sql.expression.datetime.DateTimeFunction;
import org.opensearch.sql.expression.datetime.IntervalClause;
import org.opensearch.sql.expression.ip.IPFunction;
import org.opensearch.sql.expression.operator.arthmetic.ArithmeticFunction;
import org.opensearch.sql.expression.operator.arthmetic.MathematicalFunction;
import org.opensearch.sql.expression.operator.convert.TypeCastOperator;
Expand Down Expand Up @@ -81,6 +82,7 @@ public static synchronized BuiltinFunctionRepository getInstance() {
TypeCastOperator.register(instance);
SystemFunctions.register(instance);
OpenSearchFunctions.register(instance);
IPFunction.register(instance);
currantw marked this conversation as resolved.
Show resolved Hide resolved
}
return instance;
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ public Pair<FunctionSignature, FunctionBuilder> resolve(FunctionSignature unreso
&& !FunctionSignature.isVarArgFunction(bestMatchEntry.getValue().getParamTypeList())) {
throw new ExpressionEvaluationException(
String.format(
"%s function expected %s, but get %s",
"%s function expected %s, but got %s",
functionName,
formatFunctions(functionBundle.keySet()),
unresolvedSignature.formatTypes()));
Expand Down
103 changes: 103 additions & 0 deletions core/src/main/java/org/opensearch/sql/expression/ip/IPFunction.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.expression.ip;
currantw marked this conversation as resolved.
Show resolved Hide resolved

import static org.opensearch.sql.data.type.ExprCoreType.BOOLEAN;
import static org.opensearch.sql.data.type.ExprCoreType.STRING;
import static org.opensearch.sql.expression.function.FunctionDSL.define;
import static org.opensearch.sql.expression.function.FunctionDSL.impl;
import static org.opensearch.sql.expression.function.FunctionDSL.nullMissingHandling;

import inet.ipaddr.AddressStringException;
import inet.ipaddr.IPAddressString;
import inet.ipaddr.IPAddressStringParameters;
import lombok.experimental.UtilityClass;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.model.ExprValueUtils;
import org.opensearch.sql.exception.SemanticCheckException;
import org.opensearch.sql.expression.function.BuiltinFunctionName;
import org.opensearch.sql.expression.function.BuiltinFunctionRepository;
import org.opensearch.sql.expression.function.DefaultFunctionResolver;

/** Utility class that defines and registers IP functions. */
@UtilityClass
public class IPFunction {
currantw marked this conversation as resolved.
Show resolved Hide resolved

public void register(BuiltinFunctionRepository repository) {
repository.register(cidrmatch());
}

private DefaultFunctionResolver cidrmatch() {

// TODO #3145: Add support for IP address data type.
return define(
BuiltinFunctionName.CIDRMATCH.getName(),
impl(nullMissingHandling(IPFunction::exprCidrMatch), BOOLEAN, STRING, STRING));
}

/**
* Returns whether the given IP address is within the specified CIDR IP address range. Supports
currantw marked this conversation as resolved.
Show resolved Hide resolved
* both IPv4 and IPv6 addresses.
currantw marked this conversation as resolved.
Show resolved Hide resolved
*
* @param addressExprValue IP address (e.g. "198.51.100.14" or "2001:0db8::ff00:42:8329").
* @param rangeExprValue IP address range in CIDR notation (e.g. "198.51.100.0/24" or
* "2001:0db8::/32")
currantw marked this conversation as resolved.
Show resolved Hide resolved
* @return true if the address is in the range; otherwise false.
* @throws SemanticCheckException if the address or range is not valid, or if they do not use the
* same version (IPv4 or IPv6).
*/
private ExprValue exprCidrMatch(ExprValue addressExprValue, ExprValue rangeExprValue) {

String addressString = addressExprValue.stringValue();
String rangeString = rangeExprValue.stringValue();

final IPAddressStringParameters validationOptions =
new IPAddressStringParameters.Builder()
.allowEmpty(false)
.setEmptyAsLoopback(false)
.allow_inet_aton(false)
.allowSingleSegment(false)
.toParams();

// Get and validate IP address.
IPAddressString address =
new IPAddressString(addressExprValue.stringValue(), validationOptions);

try {
address.validate();
} catch (AddressStringException e) {
String msg =
String.format(
"IP address '%s' is not valid. Error details: %s", addressString, e.getMessage());
throw new SemanticCheckException(msg, e);
}

// Get and validate CIDR IP address range.
IPAddressString range = new IPAddressString(rangeExprValue.stringValue(), validationOptions);

try {
range.validate();
} catch (AddressStringException e) {
String msg =
String.format(
"CIDR IP address range '%s' is not valid. Error details: %s",
rangeString, e.getMessage());
throw new SemanticCheckException(msg, e);
}

// Address and range must use the same IP version (IPv4 or IPv6).
if (address.isIPv4() ^ range.isIPv4()) {
String msg =
String.format(
"IP address '%s' and CIDR IP address range '%s' are not compatible. Both must be"
+ " either IPv4 or IPv6.",
addressString, rangeString);
throw new SemanticCheckException(msg);
}

return ExprValueUtils.booleanValue(range.contains(address));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - but next time lets separate general code refactory from a specific feature plz

Original file line number Diff line number Diff line change
Expand Up @@ -111,8 +111,7 @@ private static DefaultFunctionResolver ifFunction() {
.map(v -> impl((UnaryPredicateOperator::exprIf), v, BOOLEAN, v, v))
.collect(Collectors.toList());

DefaultFunctionResolver functionResolver = FunctionDSL.define(functionName, functionsOne);
return functionResolver;
return FunctionDSL.define(functionName, functionsOne);
currantw marked this conversation as resolved.
Show resolved Hide resolved
}

private static DefaultFunctionResolver ifNull() {
Expand All @@ -128,40 +127,37 @@ private static DefaultFunctionResolver ifNull() {
.map(v -> impl((UnaryPredicateOperator::exprIfNull), v, v, v))
.collect(Collectors.toList());

DefaultFunctionResolver functionResolver = FunctionDSL.define(functionName, functionsOne);
return functionResolver;
return FunctionDSL.define(functionName, functionsOne);
}

private static DefaultFunctionResolver nullIf() {
FunctionName functionName = BuiltinFunctionName.NULLIF.getName();
List<ExprCoreType> typeList = ExprCoreType.coreTypes();

DefaultFunctionResolver functionResolver =
FunctionDSL.define(
functionName,
typeList.stream()
.map(v -> impl((UnaryPredicateOperator::exprNullIf), v, v, v))
.collect(Collectors.toList()));
return functionResolver;
return FunctionDSL.define(
functionName,
typeList.stream()
.map(v -> impl((UnaryPredicateOperator::exprNullIf), v, v, v))
.collect(Collectors.toList()));
}

/**
* v2 if v1 is null.
*
* @param v1 varable 1
* @param v2 varable 2
* @param v1 variable 1
* @param v2 variable 2
* @return v2 if v1 is null
*/
public static ExprValue exprIfNull(ExprValue v1, ExprValue v2) {
return (v1.isNull() || v1.isMissing()) ? v2 : v1;
}

/**
* return null if v1 equls to v2.
* return null if v1 equals to v2.
*
* @param v1 varable 1
* @param v2 varable 2
* @return null if v1 equls to v2
* @param v1 variable 1
* @param v2 variable 2
* @return null if v1 equals to v2
currantw marked this conversation as resolved.
Show resolved Hide resolved
*/
public static ExprValue exprNullIf(ExprValue v1, ExprValue v2) {
return v1.equals(v2) ? LITERAL_NULL : v1;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,7 @@ public void filter_relation_with_invalid_qualifiedName_ExpressionEvaluationExcep
"= function expected {[BYTE,BYTE],[SHORT,SHORT],[INTEGER,INTEGER],[LONG,LONG],"
+ "[FLOAT,FLOAT],[DOUBLE,DOUBLE],[STRING,STRING],[BOOLEAN,BOOLEAN],[DATE,DATE],"
+ "[TIME,TIME],[TIMESTAMP,TIMESTAMP],[INTERVAL,INTERVAL],"
+ "[STRUCT,STRUCT],[ARRAY,ARRAY]}, but get [STRING,INTEGER]",
+ "[STRUCT,STRUCT],[ARRAY,ARRAY]}, but got [STRING,INTEGER]",
exception.getMessage());
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,11 +13,18 @@

package org.opensearch.sql.expression.aggregation;

import static org.junit.jupiter.api.Assertions.*;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.when;
import static org.opensearch.sql.data.model.ExprValueUtils.*;
import static org.opensearch.sql.data.type.ExprCoreType.*;
import static org.opensearch.sql.data.model.ExprValueUtils.integerValue;
import static org.opensearch.sql.data.model.ExprValueUtils.longValue;
import static org.opensearch.sql.data.type.ExprCoreType.DOUBLE;
import static org.opensearch.sql.data.type.ExprCoreType.FLOAT;
import static org.opensearch.sql.data.type.ExprCoreType.INTEGER;
import static org.opensearch.sql.data.type.ExprCoreType.LONG;
import static org.opensearch.sql.data.type.ExprCoreType.STRING;

import java.util.ArrayList;
import java.util.List;
Expand Down Expand Up @@ -195,7 +202,7 @@ public void test_percentile_with_invalid_size() {
"percentile_approx function expected"
+ " {[INTEGER,DOUBLE],[INTEGER,DOUBLE,DOUBLE],[LONG,DOUBLE],[LONG,DOUBLE,DOUBLE],"
+ "[FLOAT,DOUBLE],[FLOAT,DOUBLE,DOUBLE],[DOUBLE,DOUBLE],[DOUBLE,DOUBLE,DOUBLE]},"
+ " but get [DOUBLE,STRING]",
+ " but got [DOUBLE,STRING]",
exception2.getMessage());
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,7 @@ public void adddate_has_second_signature_but_not_date_add() {
() -> date_add(LocalDateTime.of(1961, 4, 12, 9, 7), 100500));
assertEquals(
"date_add function expected {[DATE,INTERVAL],[TIMESTAMP,INTERVAL],"
+ "[TIME,INTERVAL]}, but get [TIMESTAMP,INTEGER]",
+ "[TIME,INTERVAL]}, but got [TIMESTAMP,INTEGER]",
exception.getMessage());
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -139,7 +139,7 @@ public void subdate_has_second_signature_but_not_date_sub() {
ExpressionEvaluationException.class,
() -> date_sub(LocalDateTime.of(1961, 4, 12, 9, 7), 100500));
assertEquals(
"date_sub function expected {[DATE,INTERVAL],[TIMESTAMP,INTERVAL],[TIME,INTERVAL]}, but get"
"date_sub function expected {[DATE,INTERVAL],[TIMESTAMP,INTERVAL],[TIME,INTERVAL]}, but got"
+ " [TIMESTAMP,INTEGER]",
exception.getMessage());
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ void resolve_function_not_match() {
assertThrows(
ExpressionEvaluationException.class, () -> resolver.resolve(functionSignature));
assertEquals(
"add function expected {[INTEGER,INTEGER]}, but get [BOOLEAN,BOOLEAN]",
"add function expected {[INTEGER,INTEGER]}, but got [BOOLEAN,BOOLEAN]",
exception.getMessage());
}

Expand Down
Loading
Loading