Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New standard lib operations: union, intersection, subtraction, and unique #698

Merged
merged 17 commits into from
Jun 5, 2024

Conversation

Adam-Vandervorst
Copy link
Contributor

I had to implement a MultiTrie key conversion for grounded types for this to be of any use.
I chose the default hasher, this is not the best choice, but it's asymptotically correct for HE's internal representation.

> !(unique (superpose (a a b a)))
[a, b]

> !(subtraction (superpose (a b b c)) (superpose (b c c d)))
[a, b]

> !(intersection (superpose (a b c c)) (superpose (b c c c d)))
[b, c, c]

> !(union (superpose (a b b c)) (superpose (b c c d)))
[a, b, b, c, b, c, c, d]

fn serialize_f64(&mut self, _v: f64) -> serial::Result { Ok(self.push_str(&*_v.to_string())) }
}

impl Serializer for Vec<u8> { // for speed, but is technically unsafe because not a valid utf-8 string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing unsafe about pushing data onto a vec of bytes. The unsafe thing would be if you then interpreted the bytes as a str / String later on, but I don't see you doing that.

So I think this comment is unnecessarily alarming.

@@ -56,6 +77,14 @@ fn atom_to_trie_key(atom: &Atom) -> TrieKey<SymbolAtom> {
expr.children().iter().for_each(|child| fill_key(child, tokens));
tokens.push(TrieToken::RightPar);
},
// FIXME, see below
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vsbogd to comment. But I think this change will mean that custom grounded matchers will not run if the type successfully serializes and the hashes aren't equal. Which I think is against the design of HE as it currently stands.

It probably a non-issue right now, in practice, since the only types that currently support serialization are numbers and bools, and there is only one implementation of those. But I think this cuts against the intentions of the design.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in general case this fix means when atom is Grounded and serializable then it will be matched by exact match while atom can have custom matching procedure implemented. It is the reason why I wanted introduce this by separating Grounded atoms on atoms with custom match and atoms matched by equality and only last kind of atoms should be matched using Exact key type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will try to implement this and raise the PR to this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change is quite big https://github.com/Adam-Vandervorst/hyperon-experimental/compare/main...vsbogd:hyperon-experimental:grounded-matching?expand=1 so I think I will merge it and will make additional changes as separate PRs.

let arg_error = || ExecError::from("unique expects single executable atom as an argument");
let atom = args.get(0).ok_or_else(arg_error)?;

// TODO: Calling interpreter inside the operation is not too good
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. There are two possible solutions here, but both are beyond the scope of this PR.

1.) Make the interpreter aware of which arguments need to be interpreted / reduced, so the reduction could happen before this execute method is invoked. or
2.) Make this method able to essentially act lazily. Same pattern as async Rust, which returns a future object, and is evaluated by a runtime.

My vote would be for 1, because 2 comes with a lot of overhead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is a third way which is possible with Minimal MeTTa interpreter:

  1. one can return a minimal MeTTa chain which reduces the argument first and then call second part of the operation which manipulates the evaluated value (probably it is what Adam's comment says). It is the reason why interpret_no_error function is inside non_minimal_only_stdlib module.

But for this case specifically it is enough to change the unique op type to the (-> %Undefined% Atom) and argument is evaluated before call is made.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example UniqueOp could have the following implementation:

impl Grounded for UniqueOp {
    fn type_(&self) -> Atom {
        Atom::expr([ARROW_SYMBOL, ATOM_TYPE_UNDEFINED, ATOM_TYPE_ATOM])
    }

    fn execute(&self, args: &[Atom]) -> Result<Vec<Atom>, ExecError> {
        let arg_error = || ExecError::from("unique expects single executable atom as an argument");
        let atom = args.get(0).ok_or_else(arg_error)?;

        let mut expr: ExpressionAtom = atom.clone().try_into()?;
        let mut set = GroundingSpace::new();
        expr.children_mut().retain(|x| {
            let not_contained = set.query(x).is_empty();
            if not_contained { set.add(x.clone()) };
            not_contained
        });
        Ok(expr.into_children())
    }

    fn match_(&self, other: &Atom) -> MatchResultIter {
        match_by_equality(self, other)
    }
}

@@ -56,6 +77,14 @@ fn atom_to_trie_key(atom: &Atom) -> TrieKey<SymbolAtom> {
expr.children().iter().for_each(|child| fill_key(child, tokens));
tokens.push(TrieToken::RightPar);
},
// FIXME, see below
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in general case this fix means when atom is Grounded and serializable then it will be matched by exact match while atom can have custom matching procedure implemented. It is the reason why I wanted introduce this by separating Grounded atoms on atoms with custom match and atoms matched by equality and only last kind of atoms should be matched using Exact key type.

@@ -56,6 +77,14 @@ fn atom_to_trie_key(atom: &Atom) -> TrieKey<SymbolAtom> {
expr.children().iter().for_each(|child| fill_key(child, tokens));
tokens.push(TrieToken::RightPar);
},
// FIXME, see below
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I will try to implement this and raise the PR to this PR.

let arg_error = || ExecError::from("unique expects single executable atom as an argument");
let atom = args.get(0).ok_or_else(arg_error)?;

// TODO: Calling interpreter inside the operation is not too good
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is a third way which is possible with Minimal MeTTa interpreter:

  1. one can return a minimal MeTTa chain which reduces the argument first and then call second part of the operation which manipulates the evaluated value (probably it is what Adam's comment says). It is the reason why interpret_no_error function is inside non_minimal_only_stdlib module.

But for this case specifically it is enough to change the unique op type to the (-> %Undefined% Atom) and argument is evaluated before call is made.

let arg_error = || ExecError::from("unique expects single executable atom as an argument");
let atom = args.get(0).ok_or_else(arg_error)?;

// TODO: Calling interpreter inside the operation is not too good
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example UniqueOp could have the following implementation:

impl Grounded for UniqueOp {
    fn type_(&self) -> Atom {
        Atom::expr([ARROW_SYMBOL, ATOM_TYPE_UNDEFINED, ATOM_TYPE_ATOM])
    }

    fn execute(&self, args: &[Atom]) -> Result<Vec<Atom>, ExecError> {
        let arg_error = || ExecError::from("unique expects single executable atom as an argument");
        let atom = args.get(0).ok_or_else(arg_error)?;

        let mut expr: ExpressionAtom = atom.clone().try_into()?;
        let mut set = GroundingSpace::new();
        expr.children_mut().retain(|x| {
            let not_contained = set.query(x).is_empty();
            if not_contained { set.add(x.clone()) };
            not_contained
        });
        Ok(expr.into_children())
    }

    fn match_(&self, other: &Atom) -> MatchResultIter {
        match_by_equality(self, other)
    }
}

@Adam-Vandervorst
Copy link
Contributor Author

@vsbogd thanks for your changes!

For example UniqueOp could have the following implementation:

I believe we should have a similar implementation between unique, collapse, union, etc., so I propose we merge this with the interpret_no_error and redo the "chain evaluation" for these ops in another PR.

@vsbogd vsbogd merged commit 3dd92c7 into trueagi-io:main Jun 5, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants