New standard lib operations: union, intersection, subtraction, and unique #698

Adam-Vandervorst · 2024-05-24T19:48:55Z

I had to implement a MultiTrie key conversion for grounded types for this to be of any use.
I chose the default hasher. It is not the best choice, but it's asymptotically correct for HE's internal representation.

> !(unique (superpose (a a b a)))
[a, b]

> !(subtraction (superpose (a b b c)) (superpose (b c c d)))
[a, b]

> !(intersection (superpose (a b c c)) (superpose (b c c c d)))
[b, c, c]

> !(union (superpose (a b b c)) (superpose (b c c d)))
[a, b, b, c, b, c, c, d]
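For readers who want the semantics spelled out, here is a minimal sketch of the four operations as multiset operations over plain Rust slices. It is not the stdlib implementation from this PR (which works on atoms and a MultiTrie); the function names and the `counts` helper are assumptions made for illustration, and a `HashMap` of counts stands in for the trie.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

/// Hypothetical helper: how many times each item occurs.
fn counts<T: Eq + Hash + Clone>(items: &[T]) -> HashMap<T, usize> {
    let mut m = HashMap::new();
    for x in items {
        *m.entry(x.clone()).or_insert(0) += 1;
    }
    m
}

/// Keeps the first occurrence of each item, preserving order.
fn unique<T: Eq + Hash + Clone>(items: &[T]) -> Vec<T> {
    let mut seen = HashSet::new();
    items.iter().filter(|x| seen.insert((*x).clone())).cloned().collect()
}

/// Multiset difference: each item on the right cancels one equal item on the left.
fn subtraction<T: Eq + Hash + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let mut remaining = counts(right);
    left.iter()
        .filter(|x| match remaining.get_mut(*x) {
            Some(n) if *n > 0 => { *n -= 1; false }
            _ => true,
        })
        .cloned()
        .collect()
}

/// Multiset intersection: a left item is kept while the right side still has a copy.
fn intersection<T: Eq + Hash + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let mut remaining = counts(right);
    left.iter()
        .filter(|x| match remaining.get_mut(*x) {
            Some(n) if *n > 0 => { *n -= 1; true }
            _ => false,
        })
        .cloned()
        .collect()
}

/// Union as shown above is plain concatenation; duplicates are kept.
fn union<T: Clone>(left: &[T], right: &[T]) -> Vec<T> {
    left.iter().chain(right.iter()).cloned().collect()
}
```

Calling `subtraction(&["a", "b", "b", "c"], &["b", "c", "c", "d"])` in this sketch gives the same `[a, b]` as the MeTTa example above.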

luketpeterson · 2024-06-03T05:37:54Z

lib/src/space/grounding.rs

+    fn serialize_f64(&mut self, _v: f64) -> serial::Result { Ok(self.push_str(&*_v.to_string())) }
+}
+
+impl Serializer for Vec<u8> { // for speed, but is technically unsafe because not a valid utf-8 string


There is nothing unsafe about pushing data onto a vec of bytes. The unsafe thing would be if you then interpreted the bytes as a str / String later on, but I don't see you doing that.

So I think this comment is unnecessarily alarming.
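A small standalone sketch of the distinction, using only `std`. The helper names are hypothetical, and writing raw little-endian bytes is an assumption for illustration rather than what the PR's serializer does. Pushing arbitrary bytes is safe; validity only matters at the point where the bytes are read as text, and the checked conversion simply reports an error there.

```rust
/// Hypothetical helper: appends the raw little-endian bytes of `v` to `buf`.
/// This is entirely safe, whether or not the bytes form valid UTF-8.
fn push_f64_bytes(buf: &mut Vec<u8>, v: f64) {
    buf.extend_from_slice(&v.to_le_bytes());
}

/// Checked conversion: returns None instead of producing an invalid &str.
/// Only `std::str::from_utf8_unchecked` would need `unsafe` here.
fn try_as_text(buf: &[u8]) -> Option<&str> {
    std::str::from_utf8(buf).ok()
}
```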

luketpeterson · 2024-06-03T05:42:11Z

lib/src/space/grounding.rs

@@ -56,6 +77,14 @@ fn atom_to_trie_key(atom: &Atom) -> TrieKey<SymbolAtom> {
                expr.children().iter().for_each(|child| fill_key(child, tokens));
                tokens.push(TrieToken::RightPar);
            },
+            // FIXME, see below


@vsbogd to comment. But I think this change will mean that custom grounded matchers will not run if the type successfully serializes and the hashes aren't equal. Which I think is against the design of HE as it currently stands.

It's probably a non-issue right now, in practice, since the only types that currently support serialization are numbers and bools, and there is only one implementation of those. But I think this cuts against the intentions of the design.

vsbogd · 2024-06-03T15:33:57Z

Yeah, in the general case this fix means that when an atom is Grounded and serializable, it will be matched by exact match even though the atom may have a custom matching procedure implemented. That is why I wanted to introduce this by separating Grounded atoms into atoms with a custom match and atoms matched by equality; only the latter kind should be matched using the Exact key type.

vsbogd · 2024-06-03T16:57:00Z

I think I will try to implement this and raise a PR against this PR.

The change is quite big (https://github.com/Adam-Vandervorst/hyperon-experimental/compare/main...vsbogd:hyperon-experimental:grounded-matching?expand=1), so I think I will merge it and make additional changes as separate PRs.
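The concern in this thread can be reproduced with a toy model. Everything below is hypothetical (the `CaseInsensitive` type, its matcher, and the `HashMap` standing in for the trie index); it only illustrates why an exact key lookup can skip a custom matcher that would otherwise have succeeded.

```rust
use std::collections::HashMap;

/// Toy grounded value whose custom matcher ignores ASCII case (hypothetical).
#[derive(Clone, Debug, PartialEq)]
struct CaseInsensitive(String);

impl CaseInsensitive {
    /// Custom matching procedure: equal up to ASCII case.
    fn matches(&self, other: &CaseInsensitive) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }

    /// Serialized form, used as an exact index key.
    fn exact_key(&self) -> String {
        self.0.clone()
    }
}

/// Lookup through the exact-key index: the custom matcher is never consulted.
fn find_by_exact_key(
    index: &HashMap<String, CaseInsensitive>,
    query: &CaseInsensitive,
) -> Vec<CaseInsensitive> {
    index.get(&query.exact_key()).cloned().into_iter().collect()
}

/// Lookup that runs the custom matcher against every stored value.
fn find_by_custom_match(
    index: &HashMap<String, CaseInsensitive>,
    query: &CaseInsensitive,
) -> Vec<CaseInsensitive> {
    index.values().filter(|v| v.matches(query)).cloned().collect()
}
```

With `Hello` stored and `hello` as the query, the exact-key lookup finds nothing while the custom matcher finds the stored value, which is the behaviour the split into "custom match" and "matched by equality" atoms is meant to avoid.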

luketpeterson · 2024-06-03T05:56:56Z

lib/src/metta/runner/stdlib.rs

+        let arg_error = || ExecError::from("unique expects single executable atom as an argument");
+        let atom = args.get(0).ok_or_else(arg_error)?;
+
+        // TODO: Calling interpreter inside the operation is not too good


Indeed. There are two possible solutions here, but both are beyond the scope of this PR.

1.) Make the interpreter aware of which arguments need to be interpreted / reduced, so the reduction could happen before this execute method is invoked. or
2.) Make this method able to essentially act lazily. Same pattern as async Rust, which returns a future object, and is evaluated by a runtime.

My vote would be for 1, because 2 comes with a lot of overhead.

vsbogd · 2024-06-03T17:01:22Z

Actually there is a third way, which is possible with the Minimal MeTTa interpreter:

one can return a minimal MeTTa chain which reduces the argument first and then calls the second part of the operation, which manipulates the evaluated value (this is probably what Adam's comment says). That is why the interpret_no_error function is inside the non_minimal_only_stdlib module.

But for this case specifically it is enough to change the unique op type to (-> %Undefined% Atom); the argument is then evaluated before the call is made.

vsbogd · 2024-06-03T17:12:22Z

For example UniqueOp could have the following implementation:

    impl Grounded for UniqueOp {
        // The %Undefined% argument type makes the interpreter evaluate the argument before the call.
        fn type_(&self) -> Atom {
            Atom::expr([ARROW_SYMBOL, ATOM_TYPE_UNDEFINED, ATOM_TYPE_ATOM])
        }

        fn execute(&self, args: &[Atom]) -> Result<Vec<Atom>, ExecError> {
            let arg_error = || ExecError::from("unique expects single executable atom as an argument");
            let atom = args.get(0).ok_or_else(arg_error)?;
            let mut expr: ExpressionAtom = atom.clone().try_into()?;

            // Keep only the first occurrence of each child, using a space as the "seen" set.
            let mut set = GroundingSpace::new();
            expr.children_mut().retain(|x| {
                let not_contained = set.query(x).is_empty();
                if not_contained { set.add(x.clone()) };
                not_contained
            });
            Ok(expr.into_children())
        }

        fn match_(&self, other: &Atom) -> MatchResultIter {
            match_by_equality(self, other)
        }
    }
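To make the "argument is evaluated before the call" point concrete, here is a toy model of type-driven argument evaluation. It is not Hyperon's interpreter: `Term`, `ArgType`, `Op`, `reduce`, `call` and `describe` are all hypothetical names, and `Reverse` is just a stand-in for any unreduced expression. The only idea taken from the thread is that an op declaring its argument as `Atom` receives it unevaluated, while `%Undefined%` lets the interpreter reduce it first.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Sym(&'static str),
    /// Stand-in for an unreduced call that the toy interpreter knows how to reduce.
    Reverse(Vec<Term>),
    Tuple(Vec<Term>),
}

/// Declared argument type of an op (toy counterpart of Atom vs %Undefined%).
enum ArgType {
    Atom,
    Undefined,
}

struct Op {
    arg_type: ArgType,
    run: fn(&Term) -> Term,
}

/// Toy reduction step: `Reverse` becomes a reversed `Tuple`, anything else is unchanged.
fn reduce(t: &Term) -> Term {
    match t {
        Term::Reverse(items) => Term::Tuple(items.iter().rev().cloned().collect()),
        other => other.clone(),
    }
}

/// The interpreter decides from the declared type whether to reduce the argument first.
fn call(op: &Op, arg: &Term) -> Term {
    let arg = match op.arg_type {
        ArgType::Undefined => reduce(arg),
        ArgType::Atom => arg.clone(),
    };
    (op.run)(&arg)
}

/// Example op body: reports whether it received an already reduced value.
fn describe(arg: &Term) -> Term {
    match arg {
        Term::Tuple(_) => Term::Sym("reduced"),
        _ => Term::Sym("unreduced"),
    }
}
```

In this model the op body never calls the interpreter itself, which is the property the TODO in the diff is asking for.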

Adam-Vandervorst · 2024-06-03T17:41:25Z

@vsbogd thanks for your changes!

> For example UniqueOp could have the following implementation:

I believe we should have a similar implementation between unique, collapse, union, etc., so I propose we merge this with the interpret_no_error and redo the "chain evaluation" for these ops in another PR.

Adam-Vandervorst added 6 commits May 24, 2024 12:48

Add unique op

c49a9db

Fix unique test

dff94b2

Add union op

96e5f95

Add intersection op

c92e1c6

Add subtraction op

9895cf8

Fix MultiTrie reject on grounded types in intersection and subtraction

603f148

luketpeterson requested review from vsbogd and luketpeterson June 3, 2024 00:06

luketpeterson and others added 3 commits June 3, 2024 13:29

Merge branch 'main' into main

d7dee88

Registering the new ops in minimal interpreter as well as in "Rust-based" interpreter

09c1478

Squishing warnings

a4d5f7f

luketpeterson reviewed Jun 3, 2024

Adam-Vandervorst and others added 6 commits June 3, 2024 12:23

Add string serializer (and consequently hash) support

1b234af

Use equals on grounded types in intersection and subtraction

b4c8f8c

Use native Rust type in string serialization API

65710d1

Move ConvertingSerializer into crate::atom::serial module

2a325f7

Simplify Str::as_str()

39add07

Removing code which is not used

07bf0f7

vsbogd reviewed Jun 3, 2024

Adam-Vandervorst and others added 2 commits June 3, 2024 19:42

Merge pull request #1 from vsbogd/union-intersection

6a268f4

Review comments

Merge branch 'main' into main

37b4a59

vsbogd approved these changes Jun 5, 2024

vsbogd merged commit 3dd92c7 into trueagi-io:main Jun 5, 2024
2 checks passed

vsbogd mentioned this pull request Jun 10, 2024

current status of documentation #694

Merged
