TopicMatcher does not work as expected #226

Altair-Bueno · 2024-04-16T12:14:22Z

Summary

TopicMatcher behaviour is nowhere near the advertised behaviour, and it differs from the Python implementation

Example

from paho.mqtt.matcher import MQTTMatcher

matcher = MQTTMatcher()

matcher["$MAC/+/+/rpc"] = "_/device_type/systemid/_"
matcher["$MAC/+/+/+/rpc"] = "_/device_type/systemid/zoneid/_"
matcher["$MAC/+/rpc"] = "_/device_type/_"
matcher["$MAC/rpc"] = ""

print(list(matcher.iter_match("$MAC/humidifier/1/rpc")))
# ['_/device_type/systemid/_']

use paho_mqtt::topic_matcher::TopicMatcher;

fn main() {
    let mut topic_matcher = TopicMatcher::new();
    topic_matcher.insert("$MAC/+/+/rpc", "_/device_type/systemid/_");
    // Comment all inserts below to see the difference
    topic_matcher.insert("$MAC/+/+/+/rpc", "_/device_type/systemid/zoneid/_");
    topic_matcher.insert("$MAC/+/rpc", "_/device_type/_");
    topic_matcher.insert("$MAC/rpc", "");

    let matches: Vec<_> = topic_matcher.matches("$MAC/humidifier/1/rpc").collect();
    println!("{:?}", matches);
    // Prints []
    // Only works if it is the only insertion
}

MRE

topic-matcher-behaviour.zip

Altair-Bueno · 2024-04-17T07:52:36Z

Looking at the current implementation of TopicMatcher, it seems to me it uses too many resources. There are multiple string clones which is not ideal, at least for my usecase (memory constrained device).

I decided to use a far simpler solution that is both correct and should use less memory. It should be somewhat performant too, at least with a few topics (which is my usecase anyways).

This is the code, for anyone interested (trait bounds should be revisited, as they are too general)

Code

// Author: Altair Bueno <[email protected]>

use std::borrow::Borrow;
use std::collections::BTreeMap;
use std::collections::HashMap;
use std::hash::Hash;

pub fn matches(pattern: &str, topic: &str) -> bool {
    let mut pattern = pattern.split('/');
    let mut topic = topic.split('/');
    loop {
        let pattern_level = pattern.next();
        let topic_level = topic.next();
        match (pattern_level, topic_level) {
            // Exhausted both pattern and topic
            (None, None) => return true,
            // Wildcard on pattern
            (Some("#"), _) => return true,
            // Single level wildcard on pattern
            (Some("+"), Some(_)) => continue,
            // Equal levels
            (Some(pattern), Some(topic)) if pattern == topic => continue,
            // Otherwise, no match
            _ => return false,
        }
    }
}

pub trait MQTTMatches {
    type Key<'this>
    where
        Self: 'this;
    type Value<'this>
    where
        Self: 'this;
    fn matches<'this, 'topic>(
        &'this self,
        topic: &'topic str,
    ) -> impl Iterator<Item = (Self::Key<'this>, Self::Value<'this>)> + 'topic
    where
        'this: 'topic;
}

impl<K, V, S> MQTTMatches for HashMap<K, V, S>
where
    K: Eq + Hash + Borrow<str>,
    S: std::hash::BuildHasher,
{
    type Key<'this> = &'this K
    where
        Self: 'this;
    type Value<'this> = &'this V
    where
        Self: 'this;

    fn matches<'this, 'topic>(
        &'this self,
        topic: &'topic str,
    ) -> impl Iterator<Item = (Self::Key<'this>, Self::Value<'this>)> + 'topic
    where
        'this: 'topic,
    {
        self.iter().filter(move |(pattern, _)| {
            let pattern: &str = (*pattern).borrow();
            matches(pattern, topic)
        })
    }
}

impl<K, V> MQTTMatches for BTreeMap<K, V>
where
    K: Eq + Hash + Borrow<str>,
{
    type Key<'this> = &'this K
    where
        Self: 'this;
    type Value<'this> = &'this V
    where
        Self: 'this;

    fn matches<'this, 'topic>(
        &'this self,
        topic: &'topic str,
    ) -> impl Iterator<Item = (Self::Key<'this>, Self::Value<'this>)> + 'topic
    where
        'this: 'topic,
    {
        self.iter().filter(move |(pattern, _)| {
            let pattern: &str = (*pattern).borrow();
            matches(pattern, topic)
        })
    }
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn assert_it_works_for_our_usecase() {
        let mut matcher = HashMap::<&str, &str>::new();
        matcher.insert("$MAC/+/+/rpc", "_/device_type/systemid/_");
        matcher.insert("$MAC/+/+/+/rpc", "_/device_type/systemid/zoneid/_");
        matcher.insert("$MAC/+/rpc", "_/device_type/_");
        matcher.insert("$MAC/rpc", "");

        let topic = "$MAC/humidifier/1/rpc";
        let matches: Vec<_> = matcher.matches(topic).collect();
        assert_eq!(
            matches,
            vec![(&"$MAC/+/+/rpc", &"_/device_type/systemid/_")]
        );
    }
}

Altair-Bueno · 2024-04-25T09:55:02Z

Improved code with tests and documentation. Also, it now works with anything that can be iterated over.

Code

// Author: Altair Bueno <[email protected]>

/// Checks if a topic with wildcards matches a given topic.
fn matches(pattern: &str, topic: &str) -> bool {
    let mut pattern = pattern.split('/');
    let mut topic = topic.split('/');
    loop {
        let pattern_level = pattern.next();
        let topic_level = topic.next();
        match (pattern_level, topic_level) {
            // Exhausted both pattern and topic
            (None, None) => return true,
            // Wildcard on pattern
            (Some("#"), _) => return true,
            // Single level wildcard on pattern
            (Some("+"), Some(_)) => continue,
            // Equal levels
            (Some(pattern), Some(topic)) if pattern == topic => continue,
            // Otherwise, no match
            _ => return false,
        }
    }
}

/// Extension trait for map types and tuple iterators that allows to filter
/// entries by matching a MQTT topic.
///
/// # Example
///
/// ```
/// use std::collections::HashMap;
/// use std::collections::HashSet;
/// use az_mqtt::util::MQTTMatchesExt;
///
/// let mut matcher = HashMap::<&str, &str>::new();
/// matcher.insert("$MAC/+/+/rpc", "_/device_type/systemid/_");
/// matcher.insert("$MAC/+/+/+/rpc", "_/device_type/systemid/zoneid/_");
/// matcher.insert("$MAC/+/rpc", "_/device_type/_");
/// matcher.insert("$MAC/rpc", "");
///
/// let topic = "$MAC/humidifier/1/rpc";
/// let matches: HashSet<_> = matcher.matches(topic).collect();
/// assert_eq!(
///    matches,
///   HashSet::from([("$MAC/+/+/rpc", "_/device_type/systemid/_")])
/// );
/// ```
pub trait MQTTMatchesExt {
    /// The key type returned by the iterator.
    type Key;
    /// The value type returned by the iterator.
    type Value;

    /// Matches the given topic against the keys of the map and returns an
    /// iterator over the matching entries. Keys of the map are expected to
    /// be MQTT topic patterns and may contain wildcards.
    fn matches<'topic>(
        self,
        topic: &'topic str,
    ) -> impl Iterator<Item = (Self::Key, Self::Value)> + 'topic
    where
        Self: 'topic;
}

impl<K, V, C> MQTTMatchesExt for C
where
    C: IntoIterator<Item = (K, V)>,
    K: AsRef<str>,
{
    type Key = K;
    type Value = V;

    fn matches<'topic>(
        self,
        topic: &'topic str,
    ) -> impl Iterator<Item = (Self::Key, Self::Value)> + 'topic
    where
        Self: 'topic,
    {
        self.into_iter()
            .filter(move |(pattern, _)| matches(pattern.as_ref(), topic))
    }
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn assert_that_no_wildcards_matches() {
        assert!(matches("a/b/c", "a/b/c"));
    }
    #[test]
    fn assert_that_plus_wildcard_matches() {
        assert!(matches("a/+/c", "a/b/c"));
    }
    #[test]
    fn assert_that_leading_plus_wildcard_matches() {
        assert!(matches("+/b/c", "a/b/c"));
    }
    #[test]
    fn assert_that_trailing_plus_wildcard_matches() {
        assert!(matches("a/b/+", "a/b/c"));
    }
    #[test]
    fn assert_that_hash_wildcard_matches_none_level() {
        assert!(matches("a/b/#", "a/b"));
    }
    #[test]
    fn assert_that_hash_wildcard_matches_single_level() {
        assert!(matches("a/b/#", "a/b/c"));
    }
    #[test]
    fn assert_that_hash_wildcard_matches_multiple_levels() {
        assert!(matches("a/b/#", "a/b/c/d"));
    }
}

fpagliughi · 2024-05-09T11:58:20Z

Hey! Thanks for this. Yes, the TopicMatcher was a fairly quick hack which I meant to re-assess when I had the time. But it has been working for my use case, so it got a low priority.

Unfortunately, since this is an Eclipse project, I can't really do anything with you code unless you submit it as a PR with a signed Eclipse (ECA) Agreement. Eclipse is big on the legal stuff.

Altair-Bueno · 2024-05-09T15:01:17Z

I can do that, i already signed the ~~annoying~~ ECA for another project.

Do you want me to remove TopicMatcher and replace it with MQTTMatchesExt? If so, keep in mind it is a breaking change. But to be fair, TopicMatcher is already broken so there is little motivation to keep it. Additionally, want a different name or API?

fpagliughi · 2024-05-09T15:56:41Z

Keep the name TopicMatcher. Submit against the develop branch. I'll use that to roll up a couple of breaking changes, if necessary.

fpagliughi · 2024-05-20T19:41:48Z

The broken TopicMatcher is fixed in v0.12.4. The Rust example above now produces:

[("$MAC/+/+/rpc", "_/device_type/systemid/_")]

which is the key/value pair that matched; the key being the filter.

The new TopcMatcher is optimized quite a bit, and I'm much happier with the implementation. There are string and map allocations as the trie is created, but the iterator only uses a single Vec<> stack to search for matches. That seems pretty normal/reasonable for a tree search, especially one that is normally built/modified once and searched maybe millions of times.

I'll write some performance benchmarks in the near future.

I assume the existing implementation "does not work as expected" is fixed, so will close this issue. If it still seems broken, please feel free to re-open or create a new ticket.

As for new and alternate implementations and a common trait, let's move that discussion to PR #228

Altair-Bueno mentioned this issue May 10, 2024

feat: TopicMatcher as extension trait to iterators (BREAKING) #228

Open

fpagliughi closed this as completed May 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TopicMatcher does not work as expected #226

TopicMatcher does not work as expected #226

Altair-Bueno commented Apr 16, 2024

Altair-Bueno commented Apr 17, 2024

Altair-Bueno commented Apr 25, 2024

fpagliughi commented May 9, 2024

Altair-Bueno commented May 9, 2024

fpagliughi commented May 9, 2024

fpagliughi commented May 20, 2024

TopicMatcher does not work as expected #226

TopicMatcher does not work as expected #226

Comments

Altair-Bueno commented Apr 16, 2024

Summary

Example

MRE

Altair-Bueno commented Apr 17, 2024

Altair-Bueno commented Apr 25, 2024

fpagliughi commented May 9, 2024

Altair-Bueno commented May 9, 2024

fpagliughi commented May 9, 2024

fpagliughi commented May 20, 2024