Closed
Description
The following program:
package main

import "regexp"

func main() {
	re := regexp.MustCompile(".")
	println(re.MatchString("\xd1"))
	println(re.MatchString("\xd1\x84"))
	println(re.MatchString("\xd1\xd1"))

	re = regexp.MustCompile("..")
	println(re.MatchString("\xd1"))
	println(re.MatchString("\xd1\x84"))
	println(re.MatchString("\xd1\xd1"))
}
prints:
true
true
true
false
false
true
While the following C++ program:
#include <stdio.h>
#include <re2/re2.h>

int main() {
  RE2 re1(".");
  printf("%d\n", RE2::PartialMatch("\xd1", re1));
  printf("%d\n", RE2::PartialMatch("\xd1\x84", re1));
  printf("%d\n", RE2::PartialMatch("\xd1\xd1", re1));

  RE2 re2(".");
  printf("%d\n", RE2::PartialMatch("\xd1", re2));
  printf("%d\n", RE2::PartialMatch("\xd1\x84", re2));
  printf("%d\n", RE2::PartialMatch("\xd1\xd1", re2));
}
prints:
0
1
0
0
1
0
This raises two questions:
- Why does the behavior differ between regexp and re2? (re2 seems to be more consistent.)
- Why does "\xd1\xd1" match both "." and ".."? I could understand it matching one or the other, but not both: is it one character or two?
go version devel +b0532a9 Mon Jun 8 05:13:15 2015 +0000 linux/amd64
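A possible way to look into the second question is to check how unicode/utf8 decodes the three inputs, and to repeat the match with anchored patterns so that an unanchored "." cannot succeed on just part of the string. This is only a sketch: it assumes regexp walks its input rune by rune the same way unicode/utf8 does, and the helper names runeWidths and matchWhole are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// runeWidths decodes s one rune at a time and returns the byte width of
// each decoded rune. An invalid byte decodes to utf8.RuneError with width 1.
// (Hypothetical helper, not part of any library.)
func runeWidths(s string) []int {
	var widths []int
	for len(s) > 0 {
		_, size := utf8.DecodeRuneInString(s)
		widths = append(widths, size)
		s = s[size:]
	}
	return widths
}

// matchWhole reports whether pattern matches all of s, by wrapping the
// pattern in ^(?:...)$. MatchString on the bare pattern is unanchored and
// only needs to match somewhere inside s.
// (Hypothetical helper, not part of any library.)
func matchWhole(pattern, s string) bool {
	return regexp.MustCompile("^(?:" + pattern + ")$").MatchString(s)
}

func main() {
	for _, s := range []string{"\xd1", "\xd1\x84", "\xd1\xd1"} {
		fmt.Printf("%q valid=%v runes=%d widths=%v whole(.)=%v whole(..)=%v\n",
			s, utf8.ValidString(s), utf8.RuneCountInString(s), runeWidths(s),
			matchWhole(".", s), matchWhole("..", s))
	}
}
```

If that assumption holds, "\xd1\xd1" counts as two runes of one byte each, so the unanchored "." already succeeds on the first byte alone while ".." consumes both. With the anchors in place only one of the two patterns should match each input.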
Activity