-
Notifications
You must be signed in to change notification settings - Fork 334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AIX fixes #700
base: master
Are you sure you want to change the base?
AIX fixes #700
Conversation
When running the install script on AIX, the build pipes most of the output through dist/isum.awk. This AWK script does not display any output on AIX using the default awk located in /usr/bin/awk. On the other platforms, this script displays the current item being compiled on a single line and updates that line as the build progresses. Using GNU AWK, which is available from the IBM AIX Toolbox for Open Source Software, allows the script to run correctly. This makes for a prettier build at the cost of requiring gawk on AIX. During the run of INSTALL, /opt/freeware/bin must also be in the PATH environment variable.
The INSTALL script detects if the fontsrv prerequsites are installed by compiling a test program. However, this does not work on AIX even with freetype installed from the IBM Toolbox for Open Source Software. This does not work for two reasons. First, the IBM- supplied package is installed into /opt/freeware, and therefore we need to include the following directories: /opt/freeware/include /opt/freeware/include/freetype2 Secondly, the INSTALL script uses cc to compile the script, and this does not set up enough PLAN9 environment for the compiler to find the right headers on AIX. Therefore, we use 9c which handles all of the AIX compiler envionment setup, and should also work for the other platforms.
The INSTALL script typically prints the version of the compiler in use during a build, and does so by running "9c -v". Ultimately the -v argument gets passed to the actual compiler. However, the AIX compiler (XL C) does not use -v to display the compiler version, but instead uses the argument -qversion. Therefore, we add -qversion to the 9c invocation and it does not matter if this throws an error in addition to printing the compiler version, as the subsequent grep call will exclude any errors. This should allow the script to print the compiler version on any platform.
The fonsrv program can build on AIX so long as the prerequsites are installed. These include freetype2 and fontconfig, both of which are available from the IBM Toolbox for Open Source Software. However, these get installed into /opt/freeware. Update the freetyperules.sh script to look for the headers and libraries in /opt/freeware on AIX.
When attempting to build rio on AIX, the following compiler error occurs: 9c -DDEBUG -DSHAPE -DDEBUG_EV -DDEBUG event.c event.c:9.10: 1506-296 (S) #include file <X11/extensions/shape.h> not found. event.c:32.59: 1506-046 (S) Syntax error. event.c:32.60: 1506-068 (S) Operation between types "void" and "union _XEvent" is not allowed. event.c:473.25: 1506-277 (S) Syntax error: possible missing ')' or ','? event.c:478.23: 1506-045 (S) Undeclared identifier e. mk: 9c -DDEBUG -DSHAPE ... : exit status=exit(1) mk: for i in ... : exit status=exit(1) This is because IBM does not supply X11/extensions/shape.h in any LPP. Unfortunately, this means that rio is permanently unusable on AIX, without significant modification. The easiest path forward is to add rio to the BUGGERED list in src/cmd/mkfile. However, rio works on most of the other platforms. Therefore, add an extra test and add rio to the BUGGERED list only if we are performing the build on AIX.
Suppress additional compiler and linker warnings on AIX.
On newer versions of AIX (at least 7.3, possibly 7.2), when a program calls post9pservice, the call fails due to a timing problem just after the call to waitfor. Adding this short delay allows the function to succeed and the program works correctly on AIX afterwards. Fixes 9fans#699
@@ -115,8 +118,8 @@ fi | |||
if [ `uname` != Darwin ]; then | |||
# Determine whether fontsrv X11 files are available. | |||
rm -f a.out | |||
cc -o a.out -c -Iinclude -I/usr/include -I/usr/local/include -I/usr/include/freetype2 -I/usr/local/include/freetype2 \ | |||
-I/usr/X11R7/include -I/usr/X11R7/include/freetype2 \ | |||
9c -o a.out -c -Iinclude -I/usr/include -I/usr/local/include -I/usr/include/freetype2 -I/usr/local/include/freetype2 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I don't know about using 9c
here; perhaps just keep this cc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to 9c because 9c sets up more plan9port environment, much of which we need on AIX. With this just using cc, this test fails with the following error messages:
cc -o a.out -c -Iinclude -I/usr/include -I/usr/local/include -I/usr/include/freetype2 -I/usr/local/include/freetype2 -I/usr/X11R7/include -I/usr/X11R7/include/freetype2 -I/opt/freeware/include -I/o>
"include/u.h", line 1.1: 1506-046 (S) Syntax error.
"include/u.h", line 1.4: 1506-166 (S) Definition of function See requires parentheses.
"/usr/include/sys/types.h", line 345.3: 1506-273 (E) Missing type in declaration of sigset_t.
"/usr/include/sys/types.h", line 396.9: 1506-046 (S) Syntax error.
"/usr/include/sys/types.h", line 484.9: 1506-166 (S) Definition of function uint64_t requires parentheses.
"/usr/include/sys/types.h", line 484.18: 1506-276 (S) Syntax error: possible missing '{'?
"/usr/include/unistd.h", line 909.43: 1506-282 (S) The type of the parameters must be specified in a prototype.
"/usr/include/unistd.h", line 915.56: 1506-046 (S) Syntax error.
"/usr/include/unistd.h", line 916.59: 1506-275 (S) Unexpected text off_t encountered.
"/usr/include/unistd.h", line 920.64: 1506-275 (S) Unexpected text size_t encountered.
"/usr/include/unistd.h", line 922.16: 1506-166 (S) Definition of function pid_t requires parentheses.
"/usr/include/unistd.h", line 922.42: 1506-276 (S) Syntax error: possible missing '{'?
"/usr/include/stdlib.h", line 114.3: 1506-273 (E) Missing type in declaration of div_t.
"/usr/include/stdlib.h", line 126.8: 1506-166 (S) Definition of function size_t requires parentheses.
"/usr/include/stdlib.h", line 126.15: 1506-276 (S) Syntax error: possible missing '{'?
"/usr/include/sys/signal.h", line 275.9: 1506-046 (S) Syntax error.
"/usr/include/sys/signal.h", line 309.15: 1506-007 (S) "union sigval" is undefined.
"/usr/include/sys/signal.h", line 347.9: 1506-046 (S) Syntax error.
"/usr/include/sys/vm_types.h", line 34.9: 1506-166 (S) Definition of function ulong_t requires parentheses.
"/usr/include/sys/vm_types.h", line 34.25: 1506-276 (S) Syntax error: possible missing '{'?
"/usr/include/sys/vm_types.h", line 148.9: 1506-045 (S) Undeclared identifier vmhandle_t.
"/usr/include/sys/vm_types.h", line 149.9: 1506-045 (S) Undeclared identifier caddr_t.
"/usr/include/sys/vm_types.h", line 151.3: 1506-273 (E) Missing type in declaration of vmaddr_t.
"/usr/include/sys/vm_types.h", line 164.9: 1506-046 (S) Syntax error.
"/usr/include/sys/vm_types.h", line 258.9: 1506-166 (S) Definition of function vmhandle_t requires parentheses.
"/usr/include/sys/vm_types.h", line 258.25: 1506-276 (S) Syntax error: possible missing '{'?
"/usr/include/sys/m_types.h", line 51.3: 1506-273 (E) Missing type in declaration of label_t.
"/usr/include/sys/m_types.h", line 53.9: 1506-166 (S) Definition of function int32long64_t requires parentheses.
"/usr/include/sys/m_types.h", line 53.25: 1506-276 (S) Syntax error: possible missing '{'?
"/usr/include/sys/mstsave.h", line 96.26: 1506-045 (S) Undeclared identifier __kjmpbuf.
"/usr/include/sys/mstsave.h", line 101.25: 1506-275 (S) Unexpected text __curid encountered.
"/usr/include/sys/mstsave.h", line 101.9: 1506-045 (S) Undeclared identifier pid_t.
"include/libc.h", line 1.57: 1506-209 (S) Character constants must end before the end of a line.
"include/libc.h", line 1.34: 1506-076 (W) Character constant 's /sys/include/libc.h.' has more than 4 characters. No more than rightmost 4 characters are used.
"src/cmd/fontsrv/a.h", line 15.89: 1506-209 (S) Character constants must end before the end of a line.
"src/cmd/fontsrv/a.h", line 15.65: 1506-076 (W) Character constant 's lo rune / SubfontSize' has more than 4 characters. No more than rightmost 4 characters are used.
We'd need to add several additional changes here to keep it as cc. I can investigate that if that is the preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh dear. Ok, I guess that's fine, then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could add a quick uname test and only use 9c on AIX and leave it as-is for the others...
@@ -5,7 +5,9 @@ TARG=`ls *.[cy] *.lx | egrep -v "\.tab\.c$|^x\." | sed 's/\.[cy]//; s/\.lx//'` | |||
<$PLAN9/src/mkmany | |||
|
|||
BUGGERED='CVS|faces|factotum|lp|ip|mailfs|upas|vncv|mnihongo|mpm|index|u9fs|secstore|smugfs|snarfer' | |||
DIRS=lex `ls -l |sed -n 's/^d.* //p' |egrep -v "^($BUGGERED)$"|egrep -v '^lex$'` $FONTSRV | |||
UNAME=`uname` | |||
AIXBUGGERED=`if [ "$UNAME" == "AIX" ]; then echo rio; fi` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not append to BUGGERED
if on AIX?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the effect here, but indirectly. I suppose I could add AIXBUGGERED to BUGGERED if you would prefer to see it done that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think adding to BUGGERED
would be better.
@@ -41,6 +41,7 @@ post9pservice(int fd, char *name, char *mtpt) | |||
} | |||
close(fd); | |||
w = waitfor(pid); | |||
sleep(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that you know this, but this is clearly the wrong fix. I recognize that this probably masks the symptoms of the underlying bug enough that things work on AIX with this in place, but I'm very hesitant to commit a bandaid like this into the repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. This is one of those things that is going to be monstrously difficult to track down. Single-stepping through the code under the debugger adds enough of a delay that it works correctly while debugging it, so figuring out what is going wrong is pretty tough. Furthermore, if I compile the unpatched code on AIX 7.1, it works there but then when I move the binaries over to AIX 7.3, we also hit this issue. I'd welcome suggestions for how to proceed...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, maybe not.
Looking at what's going on, it appears that, without the sleep, waitfor
is returning a bogus, nil, value. Drilling down into waitfor
, we see that it delegates to _wait
and the first thing that does is see if n
is <= 0 and return nil
if it is. The other cases where it returns nil
are if it can't tokenize whatever is in buf
, and if malloc
fails. Let's assume malloc
isn't failing; then if tokenize
failed, we should see the "couldn't parse wait message" error somewhere and we don't, so it stands to reason that n <= 0
is probably true. n
in this case is the return value from awaitfor
.
awaitfor
delegates to _await
, which in turn calls wait3
or wait4
, depending on whether pid4
is -1. There's a comment there that is curiously incomplete; "On Linux, pid==-1 means anyone; on SunOS it's pid==0." That's nice, but there's no special case for SunOS; perhaps the comment is simply a vestige of something that's changed over time. But it's probably irrelevant here; we know that the pid we get here is the PID of a child process created by fork()
in post9pservice
, and the error case there was already dealt with. Therefore, we know that _await
is calling wait4
in this instance. The the conclusion here is that this code is failing because _await
is returning 0 or -1.
How that can that happen? There are three returns from this function, in three separate cases. Two of those are strlen
, which can't return -1, and can only return 0 if passed an empty string. But in this case, I don't see how that can happen: the string in question is the result of a copying a formatted string into a buffer, and we can see statically that that buffer isn't empty. This suggests that _await
is returning -1 since due to the check against the return value of wait3
or wait4
. Since we know that we're calling wait4
here, we can deduce that this function is returning -1 because wait4
is returning 0 or something negative; I'm not a betting man, but if I were, my would be on -1.
I'll bet the, "Restart the system call" message mentioned in #699 is from actually from wait4
; some random Google searches suggest that that's the strerror string for "ERESTART" on AIX. I don't have access to AIX (and haven't for decades); does the man page for wait4
mention anything about ERESTART
?
Something you may try, instead of single-stepping in the debugger, is seeing if you can put a breakpoint on the return -1
on line 99 of await.c
and see if you hit it without the sleep(10)
and without single stepping.
@@ -136,7 +139,7 @@ if [ -f LOCAL.config ]; then | |||
fi | |||
|
|||
echo "* Compiler version:" | |||
9c -v 2>&1 | grep -v 'Configured with:' | grep -i version | sed 's/^/ /' | |||
9c -v -qversion 2>&1 | grep -v 'Configured with:' | grep -i version | sed 's/^/ /' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-qversion
is presumably specific to xlc; with e.g. clang, the output here makes sense, but isn't what I'd normally expect to see:
* Compiler version:
clang: error: unknown argument '-qversion'; did you mean '--version'?
OpenBSD clang version 16.0.6
(Note the error message that sneaks in.)
Perhaps a way around this is add an extra variable in the large case OS statement above that adds this specifically for AIX. Something like:
CCVERSFLAG=''
case `uname` in
# ...
AIX)
# ...
CCVERSFLAG=-qversion
;;
esac
9c -v $CCVERSFLAG 2>&1 | grep ...
These commits address several issues on AIX:
Unfortunately, this introduces a requirement to install gawk on AIX, however doing so is fairly straightforward.