- libast: Add |globat()| which works like |glob()| but uses an |int fd| as cwd directory
  Notes:
    - What about the |newlocale()| API in XPG7 ?
- libast: |fdopendir()| for libast
- sfio: |sfopenat()| as counterpart for openat()
- sfio: |sfopen()|/|sfopenat()| should be able to read files within a PAX archive, e.g. /usr/shared/kshlib/sys.pax/com/att/foo.sh where /usr/shared/kshlib/sys.pax is a PAX archive and com/att/foo.sh is a file in there. This must only happen if a special option is given!! Otherwise it becomes inefficient if an |open()| is a |miss|: we walk the whole damn path down just to realise none of the elements is a pax archive. Maybe using /dev/pax/ ? or pax:// ?
- global: libast should check for EINTR at |close()| time
- global: Switch documentation over to DocBook/XML
- libast: |opendir()| should use |O_DIRECTORY|
- libast: |optget()|&co. should support DocBook as input (better via helper script/app to convert DocBook/XML to an internal representation)
- "iconv" cmd:
    - Add hardcoded support for platform-specific "iconv -l" lookup
    - Add optional way to handle cases when a character cannot be represented in the destination encoding. Solaris adds '?' in this case. This should be one option, but there needs to be a way to pass the byte sequence which couldn't be mapped through.
- src/cmd/ksh93 |sh_run()| && |sh_trap()|:
    - |sh_run()| should support a |context| field and a flags field as arguments
    - |sh_run()| should support builtins
    - |sh_run()| should have a flag to disable |xtrace| mode (for cases when a builtin executes some script code as part of its functionality)
    - |sh_trap()| should have a |shp| field and a flags field as arguments
- src/cmd/ksh93:
    - src/cmd/ksh93: cleanup "." ---> |e_dot|
    - src/cmd/ksh93: cleanup ".."
      ---> |e_dotdot|
    - src/cmd/ksh93: globbing should use |shp->pwdfd|
- wc cmd: Add -X to count bytes which are not valid (multi-)byte characters
- libcmd: Add grep builtin
- libcmd: Add xargs builtin
- libcmd: Add tr builtin
- libcmd: Add od builtin
- libcmd: Add pty builtin
- libcmd: Add poll builtin (like poll(2), NOT |sfpoll()|)
- libcmd: Add iconv builtin
- "sort" cmd: |sfopen()| intercept no longer works with Solaris B_DIRECT linker option
- ksh93 tests: ksh93 testsuite should include a test to count the number of syscalls done by common ksh93 applications and complain if a certain limit is hit
- ksh93: Add standard API to find script add-on packages like perl has.
  This is pretty much the major difference between ksh93 and perl: perl has a huge library of add-ons (mostly scripts, with a minor part native code, see http://en.wikipedia.org/wiki/CPAN and http://en.wikipedia.org/wiki/PyPI)
    - This includes:
        - Simple to use
        - Single or multi-file, covering multiple classes/types and functions
        - Nested namespace
        - Easy to use, e.g. add "use " and perl interpreter does the rest.
        - Short, catchy name to activate libraries... like "lib" or "shlib", and usage should be similarly short... like "shlib com/att/networking/*" to load all types/functions/builtins in the com/att/networking/* package.
    - Notes:
        - ksh93 needs to have a _builtin_ path (e.g. multiple locations, e.g. /usr/share/kshlib/ksh93v/...:/usr/share/kshlib/ksh93/...:/usr/share/kshlib/ksh/...) where to search for the libraries
        - How does Python do the same ?
        - How does Java do the same ?
        - Libraries should be organised along DNS architecture, e.g. $KSHLIB/com/att/networking/service.sh would be ATT.com's private networking library
        - All code should have version numbers applied and consumers should have a way to request a specific version _range_
        - IMO we should allow ksh93's library loading system to look into "pax" archives (libast has AFAIK already code for this (Hi gsf!
          :-) )
          WARNING: Care must be taken that this feature does not produce an excessive number of syscalls in case a normal file cannot be found.
- ksh93: Add signal payloads: AFAIK almost all signals have payloads these days
  To quote from on SuSE Linux 12.1:
-- snip --
typedef struct siginfo
{
    int si_signo;   /* Signal number. */
    int si_errno;   /* If non-zero, an errno value associated with this signal, as defined in . */
    int si_code;    /* Signal code. */
    union
    {
        int _pad[__SI_PAD_SIZE];
        /* kill(). */
        struct
        {
            __pid_t si_pid;   /* Sending process ID. */
            __uid_t si_uid;   /* Real user ID of sending process. */
        } _kill;
        /* POSIX.1b timers. */
        struct
        {
            int si_tid;           /* Timer ID. */
            int si_overrun;       /* Overrun count. */
            sigval_t si_sigval;   /* Signal value. */
        } _timer;
        /* POSIX.1b signals. */
        struct
        {
            __pid_t si_pid;       /* Sending process ID. */
            __uid_t si_uid;       /* Real user ID of sending process. */
            sigval_t si_sigval;   /* Signal value. */
        } _rt;
        /* SIGCHLD. */
        struct
        {
            __pid_t si_pid;   /* Which child. */
            __uid_t si_uid;   /* Real user ID of sending process. */
            int si_status;    /* Exit value or signal. */
            __clock_t si_utime;
            __clock_t si_stime;
        } _sigchld;
        /* SIGILL, SIGFPE, SIGSEGV, SIGBUS. */
        struct
        {
            void *si_addr;   /* Faulting insn/memory ref. */
        } _sigfault;
        /* SIGPOLL. */
        struct
        {
            long int si_band;   /* Band event for SIGPOLL. */
            int si_fd;
        } _sigpoll;
    } _sifields;
} siginfo_t;
-- snip --
  What we need for ksh93 signal payload support is AFAIK this:
    .sh.sig.signo = integer, referring to the signal number;
    .sh.sig.signame = name of signal (saves an extra call to kill(1) to look up the name and aids with portability since signal numbers are not (always) portable);
    .sh.sig.error = integer;
    .sh.sig.code = integer;
  Now come the variable names for the individual signals. David: Please keep the si_* prefix of the variable names or replace "si_" with "si."
  to avoid possible namespace clashes between the generic fields above and the signal-specific ones below:
    SIGKILL
        .sh.sig.si_pid = integer; /* Sending process ID. */
        .sh.sig.si_uid = integer; /* Real user ID of sending process. */
    SIGTIMER /* POSIX.1b timers. */
        .sh.sig.si_tid = integer; /* Timer ID. */
        .sh.sig.si_overrun = integer; /* Overrun count. */
        .sh.sig.si_sigval.int = integer /* Signal value (note this is a sigval_t). */
        .sh.sig.si_sigval.ptr = plain long hexadecimal integer, e.g. (size_t)((char*)addr-(char*)0)
    SIGRT /* POSIX.1b realtime signals. */
        .sh.sig.si_pid = integer; /* Sending process ID. */
        .sh.sig.si_uid = integer; /* Real user ID of sending process. */
        .sh.sig.si_sigval.int = integer
        .sh.sig.si_sigval.ptr = plain long hexadecimal integer, e.g. (size_t)((char*)addr-(char*)0)
    SIGCHLD
        .sh.sig.si_pid = integer; /* Which child. */
        .sh.sig.si_uid = integer; /* Real user ID of sending process. */
        .sh.sig.si_status = integer; /* Exit value or signal. */
        .sh.sig.si_utime = [uhm... Glenn... how do we do this best ?];
        .sh.sig.si_stime = [uhm... Glenn... how do we do this best ?];
    SIGILL, SIGFPE, SIGSEGV, SIGBUS
        .sh.sig.si_addr = plain long hexadecimal integer, e.g. (size_t)((char*)addr-(char*)0) /* Faulting insn/memory ref. */
    SIGPOLL
        .sh.sig.si_band = integer; /* Band event for SIGPOLL. */
        .sh.sig.si_fd = integer;
- AST awk: Add an AST awk version.
    - Should be multibyte aware
    - Should support C99 float in printf via %a/%A
    - Should create bytecode internally before execution for _fast_ execution
- global: Remove pre-C89 (=pre-ANSI-C) support
- ksh93: Add demo code.
- builtin cmd: Add option to allow that builtins are used even when they are called with their full path (this should even work for stuff like |sh_run()|)
- libast: Add AST header which either defines a |bool| datatype if the C language level is older than C99 or uses the C99 header .
  I've attached an old email from the OpenSolaris PSARC architecture committee which explains some reasons why having a |bool| is a very good idea
- libast: Access to |restrict| keyword (see http://developers.sun.com/solaris/articles/cc_restrict.html): Please add a |_ast_restrict| cpp symbol in the libast includes which defaults to || (=nothing) if the C language level is older than C99 or |restrict| if the C language level is C99 or higher, e.g. put this in the matching libast header:
-- snip --
/*
 * The following macro defines a value for the ISO C99 restrict
 * keyword so that _AST_RESTRICT resolves to "restrict" if
 * an ISO C99 compiler is used and "" (null string) if any other
 * compiler is used. This allows for the use of single prototype
 * declarations regardless of compiler version.
 */
#if defined(_STDC_C99)
#define _AST_RESTRICT restrict
#else
#define _AST_RESTRICT
#endif
-- snip --
- sfio: Add |sfpollflags()| which works more like poll(2) and can probe whether a fd is ready for { read, write }, got a hangup, received an error, an out-of-band message or the poll request contained invalid values for this fd. It should only _optionally_ return an array of the index numbers of those fds which received an event (e.g. the caller provides the array, |sfpollflags()| writes into this array and terminates it with a -1). It should be possible that the same fd appears multiple times in the same list (for example to poll on different flags or in cases where multiple consumers are pooled together in one |sfpollflags()| call). The API signature should contain an |unsigned int flags| field to allow future extensions. Timing value should be |int64_t| and requires that the |SFPOLLFLAGS_TIMEOUT| flag is set in |flags|. Prototype should end with an ellipsis, e.g. |flags, ...| for future extensions as defined via the |flags| field.
- ksh93: read -p and print -p should optionally take a shell pid (e.g. pid or job number, e.g. $ read -p%4 # etc.) to reference different co-processes (e.g.
  access to multiple co-processes (read --help references a "current co-process" but there is no explanation how the "current" can be changed)).
- libast: Add |stpcpy()|, like |strcpy()| but returns pointer to end of the string - see POSIX |stpcpy()| spec.
- libast: Add |fcntl()| wrapper for
    |fcntl(fildes, F_DUPFD_CLOEXEC, 0)|
    |fcntl(fildes, F_DUP2FD_CLOEXEC, fildes2)|
- libshell builtin "builtin", add support for scoped builtins, e.g. add "-s" so that builtin settings are scoped (e.g. can be function-local) and "-S" to make them function-local-static
- Add AST busybox.
- ksh93: arrays (associative and indexed) should use AVL trees to scale better for large numbers of entries
- ksh93: indexed arrays should support at _least_ 2**32 entries on 64bit platforms, better more than that.
- ksh93: |tty_raw(int fd, int edit);| does not work for multiple fds running on different terminals with different settings.
  Proposed fix: Add new API in libast:
    1. |struct ttystate *newttystate(unsigned int state);|
    2. |ttypushstate(struct ttystate *st, int fd, unsigned int flags)|, which saves the current state
    3. |ttypopstate(struct ttystate *st, int fd)|, which restores a previously saved tty state
    4. |void freettystate(struct ttystate *st);|
- libsum performance patch (prefetch, manual loop refactoring to match real CPU pipelines)
- builtin "unset":
    - unset for types (unset -T typename)
    - unset for namespaces (unset -N namespace)
    - unset with shell regex (-R *pattern*, including ~(E)/~(F)) etc.
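  Sketch for the |stpcpy()| item above: a minimal fallback, assuming libast would only need it on platforms whose libc lacks the POSIX function. The names |ast_stpcpy()| and |join_path()| are hypothetical and exist only for this illustration; they are not existing libast API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical fallback, named ast_stpcpy() only for this sketch:
 * copy src (including the terminating NUL) to dst and return a
 * pointer to the NUL byte written into dst.
 */
char *ast_stpcpy(char *dst, const char *src)
{
    size_t n = strlen(src);

    memcpy(dst, src, n + 1);    /* n bytes plus the terminating NUL */
    return dst + n;
}

/*
 * Join dir and name with a '/' without rescanning the buffer after
 * each copy; returns the length of the result. buf must hold at
 * least strlen(dir) + strlen(name) + 2 bytes.
 */
size_t join_path(char *buf, const char *dir, const char *name)
{
    char *p = buf;

    p = ast_stpcpy(p, dir);
    p = ast_stpcpy(p, "/");
    p = ast_stpcpy(p, name);
    return (size_t)(p - buf);
}
```

  The point of the returned end pointer is that chained copies stay linear in the total length, where repeated |strcat()| calls rescan the buffer each time.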
- nameref (all three required for the ksh shell library project):
    - nameref for functions (nameref -f)
    - nameref for namespaces (nameref -N)
    - nameref for types (nameref -T)
- builtin "kill":
    - kill -T to send signals to a specific thread in the current process
- NFSv4 XATTR support:
    - ls -@ option for NFSv4 XATTR, like Solaris ls -@
    - test -@ option to test for NFSv4 XATTR
- "ls":
    - optimise "ls -U" like Illumos "ls"
    - Add "-@" for NFSv4 XATTR support
- libshell |nv.*()|:
    - Add |nv_openat()| to obtain a handle within a nested tree of compound variables, similar to how |openat()| works for files/directories
- libast: Add XPG7 memory stream API to libast:
    - |fmemopen()| - http://pubs.opengroup.org/onlinepubs/9699919799/functions/fmemopen.html#
    - |open_memstream()| - http://pubs.opengroup.org/onlinepubs/9699919799/functions/open_memstream.html
    - |open_wmemstream()| - http://pubs.opengroup.org/onlinepubs/9699919799/functions/open_memstream.html
- libast: Add |*w*()|-stream functions to libast/sfio/stdio
- sfio: Documentation needs examples... one per function which demonstrates the use of _exactly_ one sfio function
- libast:
    - Add support for |int128_t| and |binary128| (see http://en.wikipedia.org/wiki/Quadruple_precision_floating-point_format) in |sprintf()|&co.
    - Add support for decimal floating-point in |sprintf()|&co.
    - Use |int128_t| for bitfields if useful
- ksh93:
    - Use |int128_t| for "integer" (e.g. typeset -lli)
    - Use |binary128| for "float" if compiler supports it
    - Use |int128_t| for bitfields if useful
- iconv:
    - Add support for "wide" streams (see |fwide()| and http://pubs.opengroup.org/onlinepubs/009695399/functions/fwide.html ... we assume this means a single character is represented by a |wchar_t| on disk) if possible.
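  Sketch for the XPG7 memory stream item above: how a consumer would use |open_memstream()| as specified by POSIX, i.e. the behaviour a libast implementation would have to match. This assumes a libc that already provides the function; |fmt_assignment()| is a hypothetical helper made up for this example.

```c
#define _POSIX_C_SOURCE 200809L    /* for open_memstream() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format "name=value" into a heap buffer via a memory stream.
 * Returns a malloc()ed string which the caller must free(),
 * or NULL on error. Hypothetical helper for this sketch only.
 */
char *fmt_assignment(const char *name, long value)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *fp = open_memstream(&buf, &len);

    if (fp == NULL)
        return NULL;
    if (fprintf(fp, "%s=%ld", name, value) < 0) {
        fclose(fp);
        free(buf);
        return NULL;
    }
    /* buf and len are only guaranteed valid after fflush()/fclose() */
    if (fclose(fp) != 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

  The stream grows the buffer as needed, so the caller never has to guess a size up front as with |sprintf()|.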
- sfio/stdio: Add |fwide()| support
- ksh93: Builtin API should pass a read-only directory fd and a read-only |locale_t| to the current locale (see http://pubs.opengroup.org/onlinepubs/9699919799/functions/newlocale.html)
- grep: Implement -r/-R change for symlinks as GNU grep did in http://git.savannah.gnu.org/cgit/grep.git/commit/?id=c6e3ea61d9f08aa0128a0eb13d31a2fbad376f99 ?
  Olga/gsf said:
> > IMO the behavioral change for grep -r makes sense; -R remains the
> > same. How can I implement this in AST grep? Just set FTS_PHYSICAL?
> > seems ok, as long as its posix compatible
> following cmd arg symlinks only would need FTS_META|FTS_PHYSICAL
- API:
    1. should use |signed int128_t| if available. The |signed| part should make it possible to use it to define historical date/time values accurately
    2. should provide two macros:
-- snip --
/* nanoseconds to milliseconds */
#define TIME_NS2MS(t) ((t)/(1000UL*1000UL))
/* milliseconds to nanoseconds */
#define TIME_MS2NS(t) (((Time_t)(t))*(1000UL*1000UL))
-- snip --
- bug: indexed type arrays have an extra element:
-- snip --
$ ~/bin/ksh -c 'typeset -T x_t=(integer i) ; compound c=(x_t -a foo) ; typeset -p c.foo'
x_t -a c.foo=([0]='(typeset -l -i i=0)')
$ ~/bin/ksh -c 'typeset -T x_t=(integer i) ; compound c=(x_t -A foo) ; typeset -p c.foo'
x_t -A c.foo=()
-- snip --
#EOF.