+ IQueryable<>.FilterByItems() from https://stackoverflow.com/questions/67666649/lambda-linq-with-contains-criteria-for-multiple-keywords/67666993#67666993 to replace `IQueryable<>.WhereOrContainsValues()` except its last usage in `SaverWithRevision.AddRevisionsWithDuplicateIndex()` since it's passing a visited non-generic expression @ ExtensionMethods.cs

@ c#/crawler
n0099 committed Jul 10, 2024
1 parent 9cefac9 commit dc30b8e
Showing 4 changed files with 53 additions and 14 deletions.
40 changes: 40 additions & 0 deletions c#/crawler/src/ExtensionMethods.cs
@@ -73,4 +73,44 @@ public static IQueryable<TEntity> WhereOrContainsValues<TEntity, TToCompare>(
                    LinqKit.PredicateBuilder.New<TEntity>(),
                    (innerPredicate, expressionFactory) =>
                        innerPredicate.And(expressionFactory(valueToCompare))))));

    /// <see>https://stackoverflow.com/questions/67666649/lambda-linq-with-contains-criteria-for-multiple-keywords/67666993#67666993</see>
    public static IQueryable<T> FilterByItems<T, TItem>(

Check failure on line 78 in c#/crawler/src/ExtensionMethods.cs
GitHub Actions / runs-on (ubuntu-latest, macos-latest, windows-latest) / build (crawler)

        this IQueryable<T> query,
        IEnumerable<TItem> items,
        Expression<Func<T, TItem, bool>> filterPattern,
        bool isOr = true)
    {
        var predicate = items.Aggregate<TItem, Expression?>(null, (current, item) =>
        {
            var itemExpr = Expression.Constant(item);
            var itemCondition = FilterByItemsExpressionReplacer
                .Replace(filterPattern.Body, filterPattern.Parameters[1], itemExpr);
            return current == null
                ? itemCondition
                : Expression.MakeBinary(isOr ? ExpressionType.OrElse : ExpressionType.AndAlso,

Check failure on line 92 in c#/crawler/src/ExtensionMethods.cs
GitHub Actions / runs-on (ubuntu-latest, macos-latest, windows-latest) / build (crawler)
Extract this nested ternary operation into an independent statement. (https://rules.sonarsource.com/csharp/RSPEC-3358)

                    current,
                    itemCondition);
        }) ?? Expression.Constant(false);
        var filterLambda = Expression.Lambda<Func<T, bool>>(predicate, filterPattern.Parameters[0]);

        return query.Where(filterLambda);
    }

    private class FilterByItemsExpressionReplacer(IDictionary<Expression, Expression> replaceMap) : ExpressionVisitor

Check failure on line 101 in c#/crawler/src/ExtensionMethods.cs
GitHub Actions / runs-on (ubuntu-latest, macos-latest, windows-latest) / build (crawler)
Private classes which are not derived in the current assembly should be marked as 'sealed'. (https://rules.sonarsource.com/csharp/RSPEC-3260)

Check warning on line 101 in c#/crawler/src/ExtensionMethods.cs
GitHub Actions / runs-on (ubuntu-latest, macos-latest, windows-latest) / ReSharper
"[CA1852] Type 'FilterByItemsExpressionReplacer' can be sealed because it has no subtypes in its containing assembly and is not externally visible" on c#/crawler/src/ExtensionMethods.cs(101,4105)

    {
        private readonly IDictionary<Expression, Expression> _replaceMap
            = replaceMap ?? throw new ArgumentNullException(nameof(replaceMap));

        [return: NotNullIfNotNull(nameof(exp))]
        public override Expression? Visit(Expression? exp) =>

Check failure on line 107 in c#/crawler/src/ExtensionMethods.cs
GitHub Actions / runs-on (ubuntu-latest, macos-latest, windows-latest) / build (crawler)
Rename parameter 'exp' to 'node' to match the base class declaration. (https://rules.sonarsource.com/csharp/RSPEC-927)

            exp != null && _replaceMap.TryGetValue(exp, out var replacement)
                ? replacement
                : base.Visit(exp);

        public static Expression Replace(Expression expr, Expression toReplace, Expression toExpr) =>

Check failure on line 112 in c#/crawler/src/ExtensionMethods.cs
GitHub Actions / runs-on (ubuntu-latest, macos-latest, windows-latest) / build (crawler)

            new FilterByItemsExpressionReplacer(new Dictionary<Expression, Expression> {{toReplace, toExpr}})
                .Visit(expr);
    }
}
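
The predicate that `FilterByItems()` builds can be tried without a database. What follows is a minimal sketch under stated assumptions, not the code from this commit: `Replier`, `QueryableSketch`, `ParameterReplacer` and the sample data are hypothetical, the source is an in-memory array exposed through `AsQueryable()` rather than an EF Core `DbSet`, and the visitor overrides `VisitParameter()` instead of looking nodes up in a replacement dictionary. It assumes C# 12 (primary constructors) and top-level statements. Each item yields one copy of the pattern body with the second lambda parameter replaced by a constant, and the copies are joined with `OrElse` by default.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical rows standing in for a table such as db.LatestRepliers.
var stored = new[]
{
    new Replier("alice", "Alice"),
    new Replier("bob", "Bob"),
    new Replier("alice", "Someone else"),
}.AsQueryable();

// Hypothetical items to look up; only the first one matches a stored row on both fields.
var wanted = new[] { new Replier("alice", "Alice"), new Replier("carol", "Carol") };

// Builds: row => (row.Name == w0.Name && row.DisplayName == w0.DisplayName)
//             || (row.Name == w1.Name && row.DisplayName == w1.DisplayName)
var matched = stored
    .FilterByItems(wanted, (row, item) =>
        row.Name == item.Name && row.DisplayName == item.DisplayName)
    .ToList();

Console.WriteLine(matched.Count); // prints 1

public sealed record Replier(string Name, string DisplayName);

public static class QueryableSketch
{
    public static IQueryable<T> FilterByItems<T, TItem>(
        this IQueryable<T> query,
        IEnumerable<TItem> items,
        Expression<Func<T, TItem, bool>> filterPattern,
        bool isOr = true)
    {
        Expression? predicate = null;
        foreach (var item in items)
        {
            // Substitute the pattern's second parameter with the concrete item.
            var replacer = new ParameterReplacer(
                filterPattern.Parameters[1],
                Expression.Constant(item, typeof(TItem)));
            var itemCondition = replacer.Visit(filterPattern.Body);

            if (predicate == null)
                predicate = itemCondition;
            else if (isOr)
                predicate = Expression.OrElse(predicate, itemCondition);
            else
                predicate = Expression.AndAlso(predicate, itemCondition);
        }

        // An empty item list matches nothing.
        predicate ??= Expression.Constant(false);

        var filterLambda = Expression.Lambda<Func<T, bool>>(predicate, filterPattern.Parameters[0]);
        return query.Where(filterLambda);
    }

    private sealed class ParameterReplacer(ParameterExpression from, Expression to) : ExpressionVisitor
    {
        protected override Expression VisitParameter(ParameterExpression node) =>
            node == from ? to : base.VisitParameter(node);
    }
}
```

Because both comparisons sit in a single lambda joined by `&&`, a row has to match the same item on every field, which is why the `("alice", "Someone else")` row is excluded. The sketch also replaces the nested conditional with an `if`/`else if` chain and seals the private visitor, which are the changes the Sonar and CA1852 annotations above ask for.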
8 changes: 3 additions & 5 deletions c#/crawler/src/Tieba/Crawl/Saver/ReplyContentImageSaver.cs
@@ -104,11 +104,9 @@ on existingOrNew.UrlFilename equals replyContentImage.ImageInReply.UrlFilename
             select (existingOrNew, replyContentImage))
             .ForEach(t => t.replyContentImage.ImageInReply = t.existingOrNew);
         var existingReplyContentImages = db.ReplyContentImages.AsNoTracking()
-            .WhereOrContainsValues(replyContentImages, [
-                newOrExisting => existing => existing.Pid == newOrExisting.Pid,
-                newOrExisting => existing =>
-                    existing.ImageInReply.UrlFilename == newOrExisting.ImageInReply.UrlFilename
-            ])
+            .FilterByItems(replyContentImages, (existing, newOrExisting) =>
+                existing.Pid == newOrExisting.Pid
+                && existing.ImageInReply.UrlFilename == newOrExisting.ImageInReply.UrlFilename)
             .Include(e => e.ImageInReply)
             .Select(e => new {e.Pid, e.ImageInReply.UrlFilename})
             .ToList();
9 changes: 5 additions & 4 deletions c#/crawler/src/Tieba/Crawl/Saver/ReplySignatureSaver.cs
@@ -40,10 +40,11 @@ public Action Save(CrawlerDbContext db, IEnumerable<ReplyPost> replies)
             r => r.Signature,
             SignatureIdAndValueEqualityComparer.Instance);
 
-        var existingSignatures = db.ReplySignatures.AsTracking().WhereOrContainsValues(signatures, [
-            newOrExisting => existing => existing.SignatureId == newOrExisting.SignatureId,
-            newOrExisting => existing => existing.XxHash3 == newOrExisting.XxHash3
-        ]).ToList();
+        var existingSignatures = db.ReplySignatures.AsTracking().FilterByItems(
+                signatures, (existing, newOrExisting) =>
+                    existing.SignatureId == newOrExisting.SignatureId
+                    && existing.XxHash3 == newOrExisting.XxHash3)
+            .ToList();
         (from existing in existingSignatures
             join newInReply in signatures on existing.SignatureId equals newInReply.SignatureId
             select (existing, newInReply))
10 changes: 5 additions & 5 deletions c#/crawler/src/Tieba/Crawl/Saver/ThreadLatestReplierSaver.cs
@@ -11,11 +11,11 @@ public Action Save(CrawlerDbContext db, IReadOnlyCollection<ThreadPost> threads)
         var uniqueLatestRepliers = threads
             .Where(th => th.LatestReplier != null)
             .Select(UniqueLatestReplier.FromThread).ToList();
-        var existingLatestRepliers = db.LatestRepliers.AsNoTracking().WhereOrContainsValues(uniqueLatestRepliers,
-        [
-            newOrExisting => existing => existing.Name == newOrExisting.Name,
-            newOrExisting => existing => existing.DisplayName == newOrExisting.DisplayName
-        ]).ToList();
+        var existingLatestRepliers = db.LatestRepliers.AsNoTracking().FilterByItems(
+                uniqueLatestRepliers, (latestReplier, uniqueLatestReplier) =>
+                    latestReplier.Name == uniqueLatestReplier.Name
+                    && latestReplier.DisplayName == uniqueLatestReplier.DisplayName)
+            .ToList();
         (from existing in existingLatestRepliers
             join thread in threads
                 on UniqueLatestReplier.FromLatestReplier(existing) equals UniqueLatestReplier.FromThread(thread)