program that would have been better written in another language.
Amazingly Workable Formatter (awf)
Henry Spencer at the University of Toronto wrote a formatter that accepts
a large subset of the `nroff -ms' and `nroff -man' formatting
commands, using awk and sh.
ANSI
The American National Standards Institute. This organization produces
many standards, among them the standards for the C and C++ programming
languages.
Assignment
An awk expression that changes the value of some awk variable or data
object. An object that you can assign to is called an lvalue. The
assigned values are called rvalues.
See section Assignment Expressions.
awk Language
The language in which awk programs are written.
awk Program
An awk program consists of a series of patterns and
actions, collectively known as rules. For each input record
given to the program, the program's rules are all processed in turn.
awk programs may also contain function definitions.
awk Script
Another name for an awk program.
Bash
The GNU version of the standard shell (the Bourne-Again shell).
See "Bourne Shell."
BBS
See "Bulletin Board System."
Boolean Expression
Named after the English mathematician Boole. See "Logical Expression."
Bourne Shell
The standard shell (`/bin/sh') on Unix and Unix-like systems,
originally written by Stephen R. Bourne.
Many shells (Bash, ksh, pdksh, zsh) are
generally upwardly compatible with the Bourne shell.
Built-in Function
The awk language provides built-in functions that perform various
numerical, time stamp related, and string computations. Examples are
sqrt (for the square root of a number) and substr (for a
substring of a string). See section Built-in Functions.
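As a small illustration of the two functions named in this entry (a sketch run from a POSIX shell; the sample values are arbitrary):

```shell
# sqrt returns the square root; substr(s, start, len) counts characters from 1.
awk 'BEGIN { print sqrt(16); print substr("foobar", 4, 3) }'
```

This prints 4 and then bar.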
Built-in Variable
ARGC, ARGIND, ARGV, CONVFMT, ENVIRON, ERRNO, FIELDWIDTHS, FILENAME,
FNR, FS, IGNORECASE, NF, NR, OFMT, OFS, ORS, RLENGTH, RSTART, RS, RT,
and SUBSEP are the variables that have special meaning to awk.
Changing some of them affects awk's running environment.
Several of these variables are specific to gawk.
See section Built-in Variables.
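A brief sketch of reading one built-in variable and changing another, using only variables that are not specific to gawk (the input lines are made up for the example):

```shell
# NR holds the number of the current input record.
# Setting OFS changes the separator that print puts between its arguments.
printf 'a b\nc d\n' | awk 'BEGIN { OFS = "-" } { print NR, $1, $2 }'
```

This prints 1-a-b and then 2-c-d.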
Braces
See "Curly Braces."
Bulletin Board System
A computer system allowing users to log in and read and/or leave messages
for other users of the system, much like leaving paper notes on a bulletin
board.
C
The system programming language that most GNU software is written in. The
awk programming language has C-like syntax, and this book
points out similarities between awk and C when appropriate.
Character Set
The set of numeric codes used by a computer system to represent the
characters (letters, numbers, punctuation, etc.) of a particular country
or place. The most common character set in use today is ASCII (American
Standard Code for Information Interchange). Many European
countries use an extension of ASCII known as ISO-8859-1 (ISO Latin-1).
CHEM
A preprocessor for pic that reads descriptions of molecules
and produces pic input for drawing them. It was written in awk
by Brian Kernighan and Jon Bentley, and is available from
netlib@research.att.com.
Compound Statement
A series of awk statements, enclosed in curly braces. Compound
statements may be nested.
See section Control Statements in Actions.
Concatenation
Concatenating two strings means sticking them together, one after another,
giving a new string. For example, the string `foo' concatenated with
the string `bar' gives the string `foobar'.
See section String Concatenation.
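The `foobar' example can be tried directly; in this sketch the variable name s is arbitrary:

```shell
# awk has no concatenation operator: adjacent expressions are simply joined.
awk 'BEGIN { s = "foo" "bar"; print s }'
```

This prints foobar.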
Conditional Expression
An expression using the `?:' ternary operator, such as
`expr1 ? expr2 : expr3'. The expression
expr1 is evaluated; if the result is true, the value of the whole
expression is the value of expr2, otherwise the value is
expr3. In either case, only one of expr2 and expr3
is evaluated. See section Conditional Expressions.
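A minimal sketch showing that only one branch is evaluated; the variables n and r are chosen just for this example:

```shell
# expr1 is true, so expr2 ("yes") is the result and expr3 (n++) never runs.
awk 'BEGIN { n = 0; r = (1 ? "yes" : n++); print r, n }'
```

This prints `yes 0': n is still 0 because the increment was skipped.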
Comparison Expression
A relation that is either true or false, such as `(a < b)'.
Comparison expressions are used in if, while, do, and for
statements, and in patterns to select which input records to process.
See section Variable Typing and Comparison Expressions.
Curly Braces
The characters `{' and `}'. Curly braces are used in
awk for delimiting actions, compound statements, and function
bodies.
Dark Corner
An area in the language where specifications often were (or still
are) not clear, leading to unexpected or undesirable behavior.
Such areas are marked in this book with "(d.c.)" in the
text, and are indexed under the heading "dark corner."
Data Objects
These are numbers and strings of characters. Numbers are converted into
strings and vice versa, as needed.
See section Conversion of Strings and Numbers.
Double Precision
An internal representation of numbers that can have fractional parts.
Double precision numbers keep track of more digits than do single precision
numbers, but operations on them are more expensive. This is the way
awk stores numeric values. It is the C type double.
Dynamic Regular Expression
A dynamic regular expression is a regular expression written as an
ordinary expression. It could be a string constant, such as
"foo", but it may also be an expression whose value can vary.
See section Using Dynamic Regexps.
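A short sketch of a dynamic regexp whose value comes from a variable; the name re and the sample input are assumptions made for the example:

```shell
# The regexp is the string held in re, set on the command line with -v.
printf 'foo\nbar\nfood\n' | awk -v re='^foo' '$0 ~ re { print }'
```

This prints foo and food. Changing the value passed for re changes which records match, without editing the program.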
Environment
A collection of strings, of the form name=val, that each
program has available to it. Users generally place values into the
environment in order to provide information to various programs. Typical
examples are the environment variables HOME and PATH.
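Within awk, the environment can be read through the built-in array ENVIRON (listed under "Built-in Variable"). A minimal sketch, where DEMO_VAR is a hypothetical variable set only for this one command:

```shell
# Place name=val in the environment of a single awk invocation, then read it.
DEMO_VAR=hello awk 'BEGIN { print ENVIRON["DEMO_VAR"] }'
```

This prints hello.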
Empty String
See "Null String."
Escape Sequences
A special sequence of characters used for describing non-printing
characters, such as `\n' for newline, or `\033' for the ASCII
ESC (escape) character. See section Escape Sequences.
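One way to see that an escape sequence stands for a single character is to measure a string that contains one (a sketch; the string is arbitrary):

```shell
# "\t" is written as two characters in the source but denotes one TAB.
awk 'BEGIN { print length("a\tb") }'
```

This prints 3.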
Field
When awk reads an input record, it splits the record into pieces
separated by whitespace (or by a separator regexp which you can
change by setting the built-in variable FS). Such pieces are
called fields. If the pieces are of fixed length, you can use the built-in
variable FIELDWIDTHS to describe their lengths.
See section Specifying How Fields are Separated,
and also section Reading Fixed-width Data.
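A small sketch of both splitting modes described in this entry, with made-up input; FIELDWIDTHS is left out because it is specific to gawk:

```shell
# Default splitting on whitespace: NF is the number of fields, $2 the second.
echo "alpha beta gamma" | awk '{ print NF, $2 }'

# Splitting on a comma by setting FS before any input is read.
echo "alpha,beta,gamma" | awk 'BEGIN { FS = "," } { print $3 }'
```

The first command prints `3 beta' and the second prints gamma.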
Floating Point Number
Often referred to in mathematical terms as a "rational" number, this is
just a number that can have a fractional part.
See "Double Precision" and "Single Precision."
Format
Format strings are used to control the appearance of output in the
printf statement. Also, data conversions from numbers to strings
are controlled by the format string contained in the built-in variable
CONVFMT. See section Format-Control Letters.
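A sketch of both uses of a format string; the values and the CONVFMT setting are chosen only to make the effect visible:

```shell
awk 'BEGIN {
    # printf: width 5 with 2 decimals, a left-justified string, an integer.
    printf "%5.2f|%-4s|%d\n", 3.14159, "ab", 42

    # Concatenating with "" forces a number-to-string conversion,
    # which is done according to CONVFMT.
    CONVFMT = "%.2g"
    x = 3.14159 ""
    print x
}'
```

The first line of output is ` 3.14|ab  |42' and the second is 3.1.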
Function
A specialized group of statements used to encapsulate general
or program-specific tasks. awk has a number of built-in
functions, and also allows you to define your own.
See section Built-in Functions,
and section User-defined Functions.
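A minimal sketch of a user-defined function; the name twice is hypothetical:

```shell
# Define a function, then call it from a BEGIN rule.
awk 'function twice(n) { return 2 * n }
     BEGIN { print twice(21) }'
```

This prints 42.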
FSF
See "Free Software Foundation."
Free Software Foundation
A non-profit organization dedicated
to the production and distribution of freely distributable software.
It was founded by Richard M. Stallman, the author of the original
Emacs editor. GNU Emacs is the most widely used version of Emacs today.
gawk
The GNU implementation of awk.
General Public License
This document describes the terms under which gawk and its source
code may be distributed. See section GNU GENERAL PUBLIC LICENSE.
GNU
"GNU's not Unix". An on-going project of the Free Software Foundation
to create a complete, freely distributable, POSIX-compliant computing
environment.
GPL
See "General Public License."
Hexadecimal
Base 16 notation, where the digits are 0-9 and A-F, with `A'
representing 10, `B' representing 11, and so on up to `F' for 15.
Hexadecimal numbers are written in C using a leading `0x',
to indicate their base. Thus, 0x12 is 18 (one times 16 plus 2).
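The arithmetic can be checked from awk with the printf conversions `%x' and `%X'. This sketch deliberately converts decimal values to hexadecimal output rather than writing `0x' constants in the program text, since it does not assume that every awk accepts them:

```shell
# 18 is 1*16 + 2, so it prints as 12; 255 is 15*16 + 15, so it prints as FF.
awk 'BEGIN { printf "%x %X\n", 18, 255 }'
```

This prints `12 FF'.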
I/O
Abbreviation for "Input/Output," the act of moving data into and/or
out of a running program.
Input Record
A single chunk of data read in by awk. Usually, an awk input
record consists of one line of text.
See section How Input is Split into Records.
Integer
A whole number, i.e. a number that does not have a fractional part.
Keyword
In the awk language, a keyword is a word that has special
meaning. Keywords are reserved and may not be used as variable names.
gawk's keywords are:
BEGIN, END, if, else, while, do...while, for, for...in, break,
continue, delete, next, nextfile, function, func, and exit.
Logical Expression
An expression using the operators for logic, AND, OR, and NOT, written
`&&', `||', and `!' in awk. Often called Boolean
expressions, after the mathematician who pioneered this kind of
mathematical logic.
Lvalue
An expression that can appear on the left side of an assignment
operator. In most languages, lvalues can be variables or array
elements. In awk, a field designator can also be used as an
lvalue.
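A sketch showing all three kinds of lvalue in one action; the names count and seen are arbitrary:

```shell
echo "a b c" | awk '{
    count = 1        # a variable as lvalue
    seen["k"] = 2    # an array element as lvalue
    $2 = "X"         # a field designator as lvalue; $0 is rebuilt
    print count, seen["k"], $0
}'
```

This prints `1 2 a X c'.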
Null String
A string with no characters in it. It is represented explicitly in
awk programs by placing two double-quote characters next to
each other (""). It can appear in input data by having two successive
occurrences of the field separator appear next to each other.
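A sketch of the input-data case, using a colon as the field separator so that two successive separators leave an empty second field:

```shell
# $2 lies between the two colons, so it is the null string.
echo "a::c" | awk -F: '{ print length($2), ($2 == "") }'
```

This prints `0 1': the field has length zero and compares equal to "".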
Number
A numeric valued data object. The gawk implementation uses double
precision floating point to represent numbers.
Very old awk implementations use single precision floating
point.
Octal
Base-eight notation, where the digits are 0-7.
Octal numbers are written in C using a leading `0',
to indicate their base. Thus, 013 is 11 (one times 8 plus 3).
Pattern
Patterns tell awk which input records are interesting to which
rules.
A pattern is an arbitrary conditional expression against which input is
tested. If the condition is satisfied, the pattern is said to match
the input record. A typical pattern might compare the input record against
a regular expression. See section Pattern Elements.
POSIX
The name for a series of standards being developed by the IEEE
that specify a Portable Operating System interface. The "IX" denotes
the Unix heritage of these standards. The main standard of interest for
awk users is
IEEE Standard for Information Technology, Standard 1003.2-1992,
Portable Operating System Interface (POSIX) Part 2: Shell and Utilities.
Informally, this standard is often referred to as simply "P1003.2."
Private
Variables and/or functions that are meant for use exclusively by library
functions, and not for the main awk program. Special care must be
taken when naming such variables and functions.
See section Naming Library Function Global Variables.
Range (of input lines)
A sequence of consecutive lines from the input file. A pattern
can specify ranges of input lines for awk to process, or it can
specify single lines. See section Pattern Elements.
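A sketch of a range pattern that selects the second through fourth input lines; the five-line input is made up:

```shell
# The range starts at the record where NR == 2 is true and ends,
# inclusively, at the record where NR == 4 is true.
printf '1\n2\n3\n4\n5\n' | awk 'NR == 2, NR == 4'
```

This prints 2, 3, and 4, one per line. With no action given, each selected record is printed.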
Recursion
When a function calls itself, either directly or indirectly.
If this isn't clear, refer to the entry for "recursion."
Redirection
Redirection means performing input from other than the standard input
stream, or output to other than the standard output stream.
You can redirect the output of the print and printf statements
to a file or a system command, using the `>', `>>', and `|'
operators. You can redirect input to the getline statement using
the `<' and `|' operators.
See section Redirecting Output of print and printf,
and section Explicit Input with getline.
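A sketch that writes two lines to a file with `>' and reads them back with getline and `<'. The temporary file comes from mktemp, and the names out, line, and n are chosen for the example:

```shell
tmp=$(mktemp)
awk -v out="$tmp" 'BEGIN {
    print "first" > out
    print "second" > out    # the file is already open, so this line is added
    close(out)              # finish writing before reading the file back
    while ((getline line < out) > 0)
        n++
    print n, line
}'
rm -f "$tmp"
```

This prints `2 second': two lines were read, and line holds the last one.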
Regexp
Short for regular expression. A regexp is a pattern that denotes a
set of strings, possibly an infinite set. For example, the regexp
`R.*xp' matches any string starting with the letter `R'
and ending with the letters `xp'. In awk, regexps are
used in patterns and in conditional expressions. Regexps may contain
escape sequences. See section Regular Expressions.
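A sketch of the `R.*xp' example as an awk pattern. Because an awk regexp pattern matches anywhere within the record, this version adds the anchors `^' and `$' so that the whole record must start with `R' and end with `xp'; the input lines are made up:

```shell
# Only records that begin with R and end with xp are printed.
printf 'Regexp\nRxp\nregexp\nR-xp-more\n' | awk '/^R.*xp$/'
```

This prints Regexp and Rxp.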
Regular Expression
See "regexp."
Regular Expression Constant
A regular expression constant is a regular expression written within
slashes, such as /foo/. This regular expression is chosen
when you write the awk program, and cannot be changed during
its execution. See section How to Use Regular Expressions.
Rule
A segment of an awk program that specifies how to process single
input records. A rule consists of a pattern and an action.
awk reads an input record; then, for each rule, if the input record
satisfies the rule's pattern, awk executes the rule's action.
Otherwise, the rule does nothing for that input record.
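A sketch of a program with two rules, each tested against every input record in turn; the data is made up:

```shell
# Rule 1: pattern $2 > 0, action prints the first field.
# Rule 2: pattern /pear/, action prints a fixed message.
printf 'apple 3\npear 0\nfig 7\n' | awk '
    $2 > 0 { print $1 }
    /pear/ { print "found pear" }'
```

This prints apple, then `found pear', then fig. The record `pear 0' fails the first pattern but satisfies the second.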
Rvalue
A value that can appear on the right side of an assignment operator.
In awk, essentially every expression has a value. These values
are rvalues.
sed
See "Stream Editor."
Short-Circuit
The nature of the awk logical operators `&&' and `||'.
If the value of the entire expression can be deduced from evaluating just
the left-hand side of these operators, the right-hand side will not
be evaluated (see section Boolean Expressions).
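A sketch that uses an increment on the right-hand side to show that it is skipped; the variable n is arbitrary:

```shell
awk 'BEGIN {
    n = 0
    if (0 && n++) print "unreachable"   # left side false: n++ is skipped
    if (1 || n++) print "n is", n       # left side true: n++ is skipped
}'
```

This prints `n is 0'.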
Side Effect
A side effect occurs when an expression has an effect aside from merely
producing a value. Assignment expressions, increment and decrement
expressions and function calls have side effects.
See section Assignment Expressions.
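A sketch of an increment expression used both for its value and for its side effect; the variables i and j are arbitrary:

```shell
# i++ yields the old value of i (5) and, as a side effect, sets i to 6.
awk 'BEGIN { i = 5; j = i++; print i, j }'
```

This prints `6 5'.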
Single Precision
An internal representation of numbers that can have fractional parts.
Single precision numbers keep track of fewer digits than do double precision
numbers, but operations on them are less expensive in terms of CPU time.
This is the type used by some very old versions of awk to store
numeric values. It is the C type float.
Space
The character generated by hitting the space bar on the keyboard.
Special File
A file name interpreted internally by gawk, instead of being handed
directly to the underlying operating system. For example, `/dev/stderr'.
See section Special File Names in gawk.
Stream Editor
A program that reads records from an input stream and processes them one
or more at a time. This is in contrast with batch programs, which may
expect to read their input files in entirety before starting to do
anything, and with interactive programs, which require input from the
user.
String
A datum consisting of a sequence of characters, such as `I am a
string'. Constant strings are written with double-quotes in the
awk language, and may contain escape sequences.
See section Escape Sequences.
Tab
The character generated by hitting the TAB key on the keyboard.
It usually expands to up to eight spaces upon output.
Unix
A computer operating system originally developed in the early 1970's at
AT&T Bell Laboratories. It initially became popular in universities around
the world, and later moved into commercial environments as a software
development system and network server system. There are many commercial
versions of Unix, as well as several work-alike systems whose source code
is freely available (such as Linux, NetBSD, and FreeBSD).
Whitespace
A sequence of space or tab characters occurring inside an input record or a
string.