PW32 has its roots in frustration with CygWin and amusement with DJGPP. From DJGPP it takes its runtime library as a base, along with its structure, its packaging conventions, and its debugger. From familiarity with CygWin it takes the desire to be more efficient and sensible, as well as attentive to Win9x chores. But even all of the above is not enough for such a young library to mature. So, where DJGPP does not set a good example and CygWin sets a bad one, Linux is used as the reference Unix/POSIX implementation. This goes so far as hiding its Win32 nature and pretending to be Linux itself. (Why? Because, in our times, GNU software is studded with such pearls as
#ifdef _WIN32 #define getpid() GetCurrentProcessId() #endif). Of course, that is a little insolent (I mean the masquerading as Linux), but let it stay that way for now.
The first thing worth noting about the PW32 filesystem philosophy and implementation is that it supports DJGPP's Filesystem Extensions (FSEXT). This means that the semantics of POSIX-level file-handling functions may be completely redefined on a per-object basis by user-level code. This is a very powerful feature of DJGPP, but with PW32, running in a multitasking environment with dynamic loading and networking, the capabilities are really amazing. Imagine filesystem drivers built as DLLs and loaded by highly configurable means (such as a system-supported per-application config file or the environment). Then any existing program may use /dev/hd??, /dev/random, /dev/fb, FTP, HTTP, AFS, etc., while programs that do not need this array will work efficiently through the underlying system-provided means. As a sillier example of what can be done, those who disagree with the fair solution to the so-called 'binary vs text files problem' (see below) can develop a data-tampering extension that will distort your files (what a horror!).
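The idea of redefining semantics per object can be pictured with a small self-contained model. This is not the real FSEXT interface: the names hook_table, set_hook, hooked_read, default_read and zero_read are all hypothetical, and only read() is modelled. It merely shows a per-descriptor handler taking over while every other descriptor keeps the system-provided path.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FD 16

/* Hypothetical handler type: returns bytes produced, or -1 on error. */
typedef long (*read_hook)(int fd, void *buf, size_t len);

static read_hook hook_table[MAX_FD];   /* NULL means "no extension installed" */

/* Install (or clear, with NULL) the handler for one descriptor. */
int set_hook(int fd, read_hook h)
{
    if (fd < 0 || fd >= MAX_FD)
        return -1;
    hook_table[fd] = h;
    return 0;
}

/* Stand-in for the system-provided read path; here it just reports EOF. */
long default_read(int fd, void *buf, size_t len)
{
    (void)fd; (void)buf; (void)len;
    return 0;
}

/* Example extension: behaves like a /dev/zero device. */
long zero_read(int fd, void *buf, size_t len)
{
    (void)fd;
    memset(buf, 0, len);
    return (long)len;
}

/* Dispatch: descriptors with a hook use it, all others take the default path. */
long hooked_read(int fd, void *buf, size_t len)
{
    if (fd >= 0 && fd < MAX_FD && hook_table[fd] != NULL)
        return hook_table[fd](fd, buf, len);
    return default_read(fd, buf, len);
}
```

After set_hook(5, zero_read), a hooked_read() on descriptor 5 fills the buffer with zeros, while descriptors without a hook pay only the cost of one table lookup.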
DJGPP offers three levels of file access, starting from the lowest level:
One way or another, that is how it is done. But that is only half the story: the most interesting part is that 'text mode' is applied by default to the 3 default POSIX file descriptors and to all files accessed via the ANSI stdio level. The result is overall corruption of data streams processed by applications on those systems. Even such a simple command as 'gzip -c a.tar.gz|tar xv' is unable to run without errors. To remedy the situation, PW32 returns to its parental sources: there is no such notion as 'text/binary' at the POSIX level, and all data is read and written intact. Also, binary is the default mode for stdio access. Text stdio mode means converting "\r\n" to "\n" on reading, with no output conversion. OK, but does Win32 support this approach? Sure. When writing to the console in cooked (default) mode, "\n" is automagically preceded by CR. Reading from the console is not so fine: it returns \r\n as EOL. As of version 0.5.0, a builtin filesystem extension is applied to fd 0 if it is a terminal device at process startup.
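The read-side conversion of text stdio mode can be sketched as below. This is a minimal illustration, not PW32's actual code: the function name crlf_to_lf is hypothetical, and a CR that falls on the very end of one buffer with its LF arriving in the next read is deliberately not handled.

```c
#include <stddef.h>

/* Collapse every "\r\n" pair in buf[0..len) to "\n", in place.
   A lone '\r' is kept as data. Returns the new length. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;            /* drop the CR; the LF is copied next turn */
        buf[out++] = buf[in];
    }
    return out;
}
```

Nothing comparable is needed on output, since no conversion is done there.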
And at last, the three levels of file access with PW32. As you can see, they differ from DJGPP's.
Summing up, PW32 implements the file access technique of the reference implementation (which, as explained, is Linux), plus some non-intrusive support for native conventions. This leads not only to data integrity, but also to undegraded performance.
PW32 implements a simplified version of POSIX file and directory access control. Specifically, the only supported permissions are the owner's; the owner's permissions are propagated to the group's and others'. However, it works even on Win9x (of course, on NT the full POSIX permission system could be implemented; this is not (yet?) done). The following mapping of POSIX permissions to standard Win32 (or, to be precise, FAT) file attributes is used:
perm | attribute |
---|---|
r | not hidden |
w | not readonly |
x | files: archive; directories: not archive |
This mapping is not arbitrary: using the READONLY attribute as the negation of w is obvious. When a file doesn't have r permission, it is impossible to look into it, hence HIDDEN. The SYSTEM attribute is used to mark symlinks, so the only remaining one is ARCHIVE. But its usage for x is dictated not only by inevitability; it suits that need rather well. While the ARCHIVE bit was intended for a purpose of its own, it is not really used, and all typical native files have it set. So PW32 won't have problems recognizing native applications, since all files are 'executable'. For directories, however, the picture is the opposite: they typically have the ARCHIVE bit reset. Hence the distinction in mapping x for directories, so that PW32 apps won't have problems searching in natively created dirs.
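The mapping in the table can be written down as a short function. This is a sketch, not PW32's source: the name attrs_to_perms and the ATTR_* macros are hypothetical, though the bit values are those of the corresponding Win32 FILE_ATTRIBUTE_* constants.

```c
/* FAT attribute bits (same values as the Win32 FILE_ATTRIBUTE_* constants). */
#define ATTR_READONLY  0x01u
#define ATTR_HIDDEN    0x02u
#define ATTR_SYSTEM    0x04u   /* marks symlinks; not a permission bit */
#define ATTR_DIRECTORY 0x10u
#define ATTR_ARCHIVE   0x20u

/* Returns POSIX permission bits (e.g. 0755) for a FAT attribute word. */
unsigned attrs_to_perms(unsigned attrs)
{
    unsigned owner = 0;
    int is_dir  = (attrs & ATTR_DIRECTORY) != 0;
    int archive = (attrs & ATTR_ARCHIVE) != 0;

    if (!(attrs & ATTR_HIDDEN))
        owner |= 4;                       /* r: not hidden */
    if (!(attrs & ATTR_READONLY))
        owner |= 2;                       /* w: not readonly */
    if (is_dir ? !archive : archive)
        owner |= 1;                       /* x: archive (files), not archive (dirs) */

    /* Only owner permissions exist; copy them to group and others. */
    return (owner << 6) | (owner << 3) | owner;
}
```

A typical native file (ARCHIVE set, nothing else) comes out as 0777, and a typical native directory (ARCHIVE clear) comes out as 0777 as well, which is exactly the point of the asymmetry.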
It should be stated that the whole permission system is implemented on top of the Win32 API and may lead to noticeable overhead. For files it's not a big deal, since PW32 gets file attributes anyway to check for a symlink. However, to implement search permissions for directories, it is required to traverse the path and check the attributes of every directory on it. I never thought I would implement a permission system. But I noticed that many testsuites check how apps behave when presented with inaccessible files. So, to make these testcases pass, I decided to implement perms for files; as I said, it doesn't pose much overhead. I tried that on the tar testsuite, just to find out that the testcase in question still fails because it expects directory perms to work too. Then I just did it all. Well, passing regression tests is a nice thing, but introducing overhead for real work is bad. I benchmarked the GetFileAttributes() and CreateFile() Win32 API functions and got a sustained 100,000 and 140,000 Pentium ticks, respectively, for a 50-file directory under original win95/FAT. So, while it's milliseconds even on a P5-100, the overhead is really several-fold. Fortunately, a solution came automagically: there is no need to check permissions for root. So all that is needed is a way to specify the uid to run PW32 under. It is not yet done. Still, to implement directory symlinks, it will be necessary to traverse the path anyway...
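The cost described above comes from one attribute lookup per directory on the path. A sketch of such a walk, with the root shortcut, might look like this. The names path_searchable, dir_check_fn and demo_check are hypothetical, '/' separators are assumed, and the callback stands in for whatever attribute query the real code performs.

```c
#include <stddef.h>
#include <string.h>

#define PATH_BUF 260

/* Hypothetical callback: nonzero if the directory grants search (x) permission. */
typedef int (*dir_check_fn)(const char *dir);

/* Returns 1 if every directory leading to `path` is searchable, else 0.
   uid 0 (root) skips the walk entirely. */
int path_searchable(const char *path, int uid, dir_check_fn check)
{
    char prefix[PATH_BUF];
    size_t i, len = strlen(path);

    if (uid == 0)
        return 1;
    if (len >= sizeof prefix)
        return 0;
    for (i = 1; i < len; i++) {
        if (path[i] != '/')
            continue;
        memcpy(prefix, path, i);
        prefix[i] = '\0';
        if (!check(prefix))      /* one attribute lookup per component */
            return 0;
    }
    return 1;
}

/* Toy check for demonstration: only "/a/locked" is not searchable. */
int demo_check(const char *dir)
{
    return strcmp(dir, "/a/locked") != 0;
}
```

With demo_check, "/a/b/file" is reachable for an ordinary uid, "/a/locked/file" is not, and uid 0 reaches both without a single lookup.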
Note that to preserve file permissions across archiving, you must use the native PW32 tar. However, using Win32 InfoZip (i.e. zip.exe & unzip.exe) with the -S switch (for packing) will preserve r and w permissions; x will be set unconditionally.
You think PW32 is really orthodox and does everything to break "compatibility" with native conventions? No, it does not. One thing a POSIX implementation for Win32 has to deal with is the native convention of suffixing executable names with '.exe', while POSIX has no such stipulation. The most straightforward solution would be to suppress that Win32 idiosyncrasy, but I haven't dared to, for the following reasons:
So, if the suffixes are still there, how are they dealt with? Let's first survey how it is done in other implementations:
Of course, this may potentially lead to problems (scenario: you thought you had your file, 'file', somewhere, and you wanted to delete it. However, it was already gone. By coincidence, 'file.exe' was lying around. Trying to delete 'file' will kill 'file.exe'). '.exe' lookup is a recent addition to PW32 and may be refined in the future.
PW32 implements symlinks compatible with CygWin's. Hard links are aliased to symlinks. Currently, symlinks apply only to the leaves of the filesystem (a misfeature). A hard link implementation for NTFS is welcome.
There's a static counter; each *stat() call returns a successive number. This at least keeps fileutils from complaining about circular dependencies in the filesystem. A better solution is welcome.
Yes, there's an arena reserved in the address space of the process, with brk attached. 128 MB by default. You may define unsigned int __arena_sz in your sources with the value you need.
Code is written but has not been tested even once. (Fortunately, configure finds that mmap() does not really work ;-) )
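For instance, a program wanting a bigger arena would put a definition like the following in one of its source files. The assumption here (not stated above) is that the value is given in bytes.

```c
/* Ask for a 256 MB arena instead of the 128 MB default.
   Assumption: __arena_sz is interpreted as a size in bytes. */
unsigned int __arena_sz = 256u * 1024u * 1024u;
```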
One of the issues with Win32 is its idiosyncratic ';' pathname separator in the PATH environment string. PW32 deals with that bravely: if the environment contains PATH in Win/DOS format (a heuristic is used to determine whether it does), it is converted to POSIX format, so the application will see a decent POSIX environment, as it expects. On exec, the conversion goes in the reverse direction, so if the launched program happens to be a native one, it won't be confused. If the ';'-separated path doesn't contain an entry for '.', one is prepended. This effectively means that the path cannot lack an entry for the current dir. This is a misfeature.
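The '.' rule is simple enough to sketch. The following covers only that step, on the ';'-separated form; the format heuristic and the conversion to POSIX format are not shown, since their details are not given here. The names has_dot_entry and ensure_dot_entry are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the ';'-separated list contains an entry that is exactly ".". */
int has_dot_entry(const char *path)
{
    const char *p = path;

    for (;;) {
        size_t n = strcspn(p, ";");      /* length of the current entry */
        if (n == 1 && p[0] == '.')
            return 1;
        if (p[n] == '\0')
            return 0;
        p += n + 1;                      /* skip the entry and its ';' */
    }
}

/* Copies path to out, prepending "." if no such entry exists.
   Returns 0 on success, -1 if out is too small. */
int ensure_dot_entry(const char *path, char *out, size_t outsz)
{
    int need = !has_dot_entry(path);
    size_t len = strlen(path) + (need ? 2 : 0);   /* room for ".;" */

    if (len + 1 > outsz)
        return -1;
    out[0] = '\0';
    if (need)
        strcpy(out, path[0] ? ".;" : ".");
    strcat(out, path);
    return 0;
}
```

Given C:\WINDOWS;C:\BIN the result is .;C:\WINDOWS;C:\BIN, while a path that already lists '.' anywhere is copied unchanged.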
PW32's processes are first-class citizens in Win32 and vice versa (unlike CygWin, which sees only its own processes). However, PW32 doesn't always use pids as provided by the underlying system. That's because pids on Win9x systems are known to be negative integers, while in the Unix world pids are known to be small (not much bigger than 16 bits) positive numbers. (NT also conforms to this de facto convention.) Some process-related functions rely on that positivity and, in fact, treat negative pids in a special way (ref: waitpid()). What PW32 does on Win9x is simply negate the number returned by the system and use that as the pid. So a simple (even for a human) association with real system pids is maintained. Note, however, that the resulting numbers are bigger than the ones used on Unix. For example, I don't remember having seen a pid with more than 5 decimal digits on Linux, while with PW32 and Win9x values start at around 900000. It's no wonder to get an 8-digit value. (Diving into dirty implementation details: the pid, as returned by Win9x, is a pointer to the internal process information block, which may reside within the range 0x80000000-0xbfffffff, XORed with a fixed value ('obfuscated', in MS terms; ref: an industry anecdote involving an 'Inside' or 'Internals' book by Petzold or another author). That fixed value is something like 0x7fxxxxxx, so in the worst case a pid as returned by PW32 may have 10 decimal digits.)
So, the least unpleasant thing is that ps should be patched to use more width for the pid column. Far more unpleasant is old software which uses shorts to store pids (ref: ash from Slackware 3.2).
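The negation, and what happens to such a pid in 16-bit storage, can be illustrated as follows. The function names are hypothetical, and the native value 0xFFF24DD7 used in the note below is made up for illustration, not taken from a real system.

```c
#include <stdint.h>

/* Win9x pids have the top bit set, so they are negative as int32_t.
   Negate through unsigned arithmetic to stay clear of signed overflow. */
int32_t pid_from_win9x(uint32_t native)
{
    return (int32_t)(0u - native);
}

/* Reverse mapping, back to the value the system knows. */
uint32_t pid_to_win9x(int32_t pid)
{
    return 0u - (uint32_t)pid;
}

/* What a program keeping pids in 16 bits would retain. */
uint16_t pid_in_16_bits(int32_t pid)
{
    return (uint16_t)pid;
}
```

A native value of 0xFFF24DD7 becomes pid 897577, and negating again recovers the native value. Squeezed into 16 bits, the same pid turns into 45609, which no longer identifies the process.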
As of version 0.5.0, getppid() is still a dummy returning 1. However, I know how to get the ppid on both 9x and NT, so in upcoming versions it will be implemented correctly. The other problem is that many processes want to get the ppid to communicate with their parent via signals. There's a problem, however: since under Win32 it's impossible to overlay the current process with a new image, a separate process is started for each exec(). This means there's an extra process in the fork-exec-child chain, and it should forward signals between the real fork parent and the exec'ed child. This is also on the TODO list.
Another issue with processes is their exit codes. The following is true for both Unix and Win/DOS: if the exit code is 0, the program has terminated successfully. But for other codes there's a distinction: native processes terminate with the exit code passed to exit() as-is. Unix, by de facto standard, uses a 16-bit process exit code, the low byte of which contains the number of the signal by which the app was terminated, or zero otherwise, and the high byte of which contains the value passed to exit().
To smooth over this difference, special utilities are provided:
run-w32 cmdline | This will run cmdline (with executable searching on the PATH) and convert exit code to PW32's standards (by shifting left by 8) |
run-pw32 cmdline | This will run cmdline (with executable searching on the PATH) and convert exit code from PW32 to native convention (if no terminating signal, return high byte, else, return 256+signo) |
exec-w32 | This, being renamed to <something>.exe, will try to execute file <something>.w32 in the same directory where <something>.exe resides with the rest of args and convert exit code to PW32's. |
exec-pw32 | Being renamed to <something>.exe, will try to execute file <something>.pw32 in the same directory where <something>.exe resides with the rest of args and convert exit code to native. |
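The two conversions performed by these utilities amount to a few lines of arithmetic. The sketch below follows the rules in the table; the function names are hypothetical, and masking the native code to one byte (so that the result fits in 16 bits) is an assumption.

```c
/* Native exit code -> PW32 status: the value moves into the high byte.
   Assumption: only the low 8 bits of the native code are kept. */
int status_from_native(int code)
{
    return (code & 0xFF) << 8;
}

/* PW32 status -> native exit code. */
int status_to_native(int status)
{
    int signo = status & 0xFF;          /* low byte: terminating signal, or 0 */

    if (signo == 0)
        return (status >> 8) & 0xFF;    /* high byte: value passed to exit() */
    return 256 + signo;
}
```

A native exit code of 3 becomes status 768 and converts back to 3, while a process killed by signal 9 is reported to native callers as 265.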
There exists such a notion as signals. They can be received across processes. They are implemented in a way that is sensible to native processes (besides the ability to SIGKILL any process, GUI processes can be closed gracefully with SIGTERM). However, Win32 exceptions and events are currently not mapped to signals.
Currently not supported
Currently not supported
Currently not supported
Dynamic libraries are implemented via Win32 DLLs. It is well known that the Win32 DLL model has a number of idiosyncrasies which render it, at first sight, largely incompatible with and less functional than the standard *nix shared library model. However, investigation and carefully worked-out techniques allow DLLs to be used in ways very similar to shared libraries.
Due to the strange design of DLLs, they are searched for in the same directories as executables, i.e. on $PATH, while standard Unices have a separate environment variable for that.
Currently not supported
I contributed basic CTYPE locale support to DJGPP; it should just be thrown into PW32.
There's an itimer implementation. Dunno whether it works.
Security? Sorry, if you need security, you should really get a Real OS. However, the following may be said: