The Evolution of the awk Language
This book describes the GNU implementation of awk, which follows
the POSIX specification. Many awk users are only familiar
with the original awk implementation in Version 7 Unix.
(This implementation was the basis for awk in Berkeley Unix,
through 4.3-Reno. The 4.4 release of Berkeley Unix uses gawk 2.15.2
for its version of awk.) This chapter briefly describes the
evolution of the awk language, with cross-references to other parts
of the book where you can find more information.
The awk language evolved considerably between the release of
Version 7 Unix (1978) and the new version first made generally available in
System V Release 3.1 (1987). This section summarizes the changes, with
cross-references to further details.
(see section awk Statements Versus Lines).
return statement (see section User-defined Functions).
delete statement (see section The delete Statement).
do-while statement (see section The do-while Statement).
atan2, cos, sin, rand, and srand (see section Numeric Built-in Functions).
gsub, sub, and match (see section Built-in Functions for String Manipulation).
close and system (see section Built-in Functions for Input/Output).
ARGC, ARGV, FNR, RLENGTH, RSTART, and SUBSEP built-in variables (see section Built-in Variables).
awk programs (see section Operator Precedence (How Operators Nest)).
FS (see section Specifying How Fields are Separated), and as the third argument to the split function (see section Built-in Functions for String Manipulation).
awk to recognize `\r', `\b', and `\f', but this is not something you can rely on.)
getline function (see section Explicit Input with getline).
BEGIN and END rules (see section The BEGIN and END Special Patterns).
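
Several of the additions listed above can be seen working together in a short sketch. The shell wrapper below is only an illustration: the names first_number and first_digits and the sample strings are hypothetical and do not come from this book. It combines a user-defined function, the return statement, the match function, and the RSTART and RLENGTH variables that match sets.

```shell
# Hypothetical helper: print the first run of digits found in "$1".
first_number() {
    awk -v s="$1" '
        # User-defined function with an explicit return value.
        function first_digits(str) {
            # match() sets RSTART (start index) and RLENGTH (match length).
            if (match(str, /[0-9]+/))
                return substr(str, RSTART, RLENGTH)
            return ""
        }
        BEGIN { print first_digits(s) }'
}
```

Calling first_number 'abc123def45' prints 123, and a string with no digits produces an empty line. None of this is available in the original Version 7 awk.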
The System V Release 4 version of Unix awk added these features
(some of which originated in gawk):
ENVIRON variable (see section Built-in Variables).
srand built-in function (see section Numeric Built-in Functions).
toupper and tolower built-in string functions for case translation (see section Built-in Functions for String Manipulation).
printf function (see section Format-Control Letters).
("%*.*d") in the argument list of the printf function (see section Format-Control Letters).
/foo/ as expressions, where they are equivalent to using the matching operator, as in `$0 ~ /foo/' (see section Using Regular Expression Constants).
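
The equivalence between a bare regexp constant and an explicit match against $0 can be checked directly. This is a minimal sketch: the sample text and the function names count_bare and count_explicit are made up for illustration.

```shell
# Hypothetical sample input: three lines, two of which contain "foo".
sample='foo bar
baz
food'

# A bare regexp constant used as an expression inside an if statement.
count_bare() {
    printf '%s\n' "$sample" | awk '{ if (/foo/) n++ } END { print n + 0 }'
}

# The explicit form that the bare constant stands for.
count_explicit() {
    printf '%s\n' "$sample" | awk '{ if ($0 ~ /foo/) n++ } END { print n + 0 }'
}
```

Both functions print 2 for this input, since the two conditions test the same thing.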
awk
The POSIX Command Language and Utilities standard for awk
introduced the following changes into the language:
CONVFMT for controlling the conversion of numbers to strings (see section Conversion of Strings and Numbers).
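
The effect of CONVFMT shows up only when a number is converted to a string, for example by concatenating it with the null string. The wrapper below is a sketch for experimenting with that. The function name conv and its two arguments (a format and a value) are hypothetical.

```shell
# Hypothetical helper: convert the number "$2" to a string under CONVFMT "$1".
conv() {
    awk -v fmt="$1" -v val="$2" 'BEGIN {
        CONVFMT = fmt
        x = val + 0     # force a numeric value
        s = x ""        # concatenation converts the number using CONVFMT
        print s         # s is already a string, so OFMT plays no part here
    }'
}
```

With this helper, conv '%.2g' 3.14159 prints 3.1, while conv '%.6g' 3.14159265 prints 3.14159, matching the default format. Integral values are converted as integers whatever CONVFMT says, so conv '%.2g' 12 prints 12.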
The following common extensions are not permitted by the POSIX standard:
\x escape sequences are not recognized (see section Escape Sequences).
func for the keyword function is not recognized (see section Function Definition Syntax).
FS to be a single tab character (see section Specifying How Fields are Separated).
fflush built-in function is not supported (see section Built-in Functions for Input/Output).
awk
Brian Kernighan, one of the original designers of Unix awk,
has made his version available via anonymous ftp
(see section Other Freely Available awk Implementations).
This section describes extensions in his version of awk that are
not in POSIX awk.
fflush built-in function for flushing buffered output (see section Built-in Functions for Input/Output).
Extensions in gawk Not in POSIX awk
The GNU implementation, gawk, adds a number of features.
This section lists them in the order they were added to gawk.
They can all be disabled with either the `--traditional' or the
`--posix' option (see section Command Line Options).
Version 2.10 of gawk introduced these features:
AWKPATH environment variable for specifying a path search for the `-f' command line option (see section Command Line Options).
IGNORECASE variable and its effects (see section Case-sensitivity in Matching).
gawk).
Version 2.13 of gawk introduced these features:
FIELDWIDTHS variable and its effects (see section Reading Fixed-width Data).
systime and strftime built-in functions for obtaining and printing time stamps (see section Functions for Dealing with Time Stamps).
Version 2.14 of gawk introduced these features:
next file statement for skipping to the next data file (see section The nextfile Statement).
Version 2.15 of gawk introduced these features:
ARGIND variable, which tracks the movement of FILENAME through ARGV (see section Built-in Variables).
ERRNO variable, which contains the system error message when getline returns -1, or when close fails (see section Built-in Variables).
gawk).
Version 3.0 of gawk introduced these features:
next file statement became nextfile (see section The nextfile Statement).
awk (see section Major Changes between V7 and SVR3.1).
FS to be a null string, and for the third argument to split to be the null string (see section Making Each Character a Separate Field).
RS to be a regexp (see section How Input is Split into Records).
RT variable (see section How Input is Split into Records).
gensub function for more powerful text manipulation (see section Built-in Functions for String Manipulation).
strftime function acquired a default time format, allowing it to be called with no arguments (see section Functions for Dealing with Time Stamps).
IGNORECASE changed, now applying to string comparison as well as regexp operations (see section Case-sensitivity in Matching).
fflush function from the Bell Labs research version of awk (see section Command Line Options; also see section Built-in Functions for Input/Output).
gawk for Unix).
gawk on an Amiga).