Home > Oracle, Unix > I sed awk!

I sed awk!


Oracle DBAs most often find themselves working on Unix platforms and therefore some familiarity with Unix scripting is helpful. Utilities like “sed,” “awk,” and “grep” are often used in Unix scripts.

  • Sed is a “stream editor” which filters and transforms the input stream.
  • Awk is a pattern scanning and processing language named after Unix legends Alfred Aho, Peter Weinberger, and Brian Kernighan. From Alfred Aho’s interview in Computerworld: “A file is treated as a sequence of records, and by default each line is a record. Each line is broken up into a sequence of fields, so we can think of the first word in a line as the first field, the second word as the second field, and so on. An AWK program is of a sequence of pattern-action statements. AWK reads the input a line at a time. A line is scanned for each pattern in the program, and for each pattern that matches, the associated action is executed.”
  • Grep prints lines containing the pattern of interest and its name comes from the editing command “g/re/p” where g indicates global scope, re is a “regular expression,” and p is the print command.

When our research colleagues saw the three of us in one or another’s office, they’d walk by the open door and say ‘AWK! AWK!’. So, we called the language AWK because of the good natured ribbing we received from our colleagues.—Alfred Aho

Here is an example which uses all these three utilities in a single multi-line Unix command. You are given the following log file (imp.log) of the Oracle import utility. The log file shows some errors in enabling constraints. You are only interested in foreign key constraints. Also, you want to print them nicely with each error printed on a single line (instead of being broken up into multiple lines as in the log file).

Import: Release 11.2.0.2.0 - Production on XXX XXX XX XX:XX:XX XXXX

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Export file created by EXPORT:V11.02.00 via conventional path

Warning: the objects were exported by XXXXX, not by you

import done in AL32UTF8 character set and AL16UTF16 NCHAR character set
. importing XXXXX's objects into XXXXXX
IMP-00017: following statement failed with ORACLE error 2264:
 "ALTER TABLE "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX" ADD CONSTRAINT "XXXXXXXXXXXXXXX"
 "XXXXXXX" CHECK (XXXXXXXXXXXXXXXXXX IN ('XXXXX','XXXX')) ENABLE NOVALIDATE"
IMP-00003: ORACLE error 2264 encountered
ORA-02264: name already used by an existing constraint
IMP-00017: following statement failed with ORACLE error 942:
 "ALTER TABLE "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX" ADD CONSTRAINT "XXXXXXXXXXXXXXX"
 "XXXXXXX" FOREIGN KEY ("XXXXXXXXXXX") REFERENCES "XXXXXXXXXXXXXXXXX" ("XXXXX"
 "X") DISABLE NOVALIDATE"
IMP-00003: ORACLE error 942 encountered
ORA-00942: table or view does not exist
About to enable constraints...
Import terminated successfully with warnings.

First, let’s extract blocks of lines which begin with a line containing the characters “ADD CONSTRAINT” (anywhere in the line) and ending before the line containing the characters “IMP-00003” (at the beginning of the line). Let’s also remove the space which occurs at the beginning of lines and the double quotation marks. All this can be done by the sed utility.

sed -n -e '                       # invoke the stream editor "sed"; the -n flag suppresses the default printing action of sed

  /ADD CONSTRAINT/,/^IMP-00003/ { # block of commands to execute for each line in a range of lines
                                  # the first line should contain the characters "ADD CONSTRAINT" (anywhere within the line)
                                  # the last line should contain the characters "IMP-00003" at the beginning of the line (as indicated by the ^ character)
    s/"//g                        # remove quote marks from each line
    s/^ //g                       # remove single blank spaces if they occur at the beginning of the line
    p                             # print the line
  }

' imp.log                         # the end of the sed processing

The results are as follows:

ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXX
XXXXXXX CHECK (XXXXXXXXXXXXXXXXXX IN ('XXXXX','XXXX')) ENABLE NOVALIDATE
ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXX
XXXXXXX FOREIGN KEY (XXXXXXXXXXX) REFERENCES XXXXXXXXXXXXXXXXX (XXXXX

Next let’s concatenate lines until each ALTER TABLE command occupies a single line. This can be done by piping the previous results to the awk utility.

sed -n -e '

  /ADD CONSTRAINT/,/^IMP-00003/ {
    s/"//g
    s/^ //g
    p
  }

' imp.log | awk '                 # pipe the previous results to the pattern matching and processing utility "awk"

  /^IMP-00003/ {                  # block of commands to perform when the characters "IMP-00003" are found at the start of a line
    print line                    # print the contents of the line variable if it is non-empty
    line=""                       # reinitialize the line variable
  }                               # the end of the command block

  !/^IMP-00003/ {                 # block of commands to perform when the characters "IMP-00003" are not found at the start of a line
                                  # the ! character indicates negation
    line=line $0                  # concatenate the contents of the line variable and the current line from the input stream
  }                               # the end of the command block

'                                 # the end of the awk processing

The results are as follows:

ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXXXXXXXXX CHECK (XXXXXXXXXXXXXXXXXX IN ('XXXXX','XXXX')) ENABLE NOVALIDATE
ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXXXXXXXXX FOREIGN KEY (XXXXXXXXXXX) REFERENCES XXXXXXXXXXXXXXXXX (XXXXXX) DISABLE NOVALIDATE

The last thing to do is to eliminate lines which do not contain the phrase “FOREIGN KEY”. We could have done this during the awk processing but, just as an exercise, let’s pipe the previous results to the grep utility to perform the filtering. We’re now ready to put everything together.

sed -n -e '                       # invoke the stream editor "sed"

  /ADD CONSTRAINT/,/^IMP-00003/ {
    s/"//g
    s/^ //g
    p
  }

' imp.log | awk '                 # pipe the previous results to the pattern matching and processing utility "awk"

  /^IMP-00003/ {
    print line
    line=""
  }

  !/^IMP-00003/ {
    line=line $0
  }

' | grep 'FOREIGN KEY'            # pipe the previous results to the grep utility; print lines that contain the phrase "FOREIGN KEY"

The results are as follows. Errors not relating to foreign key constraints have been eliminated and each error is printed on a single line (instead of being broken up into multiple lines as in the original log file).

ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXXXXXXXXX FOREIGN KEY (XXXXXXXXXXX) REFERENCES XXXXXXXXXXXXXXXXX (XXXXXX) DISABLE NOVALIDATE

Addendum: As Jamerfort demonstrated in his comments, it is possible to perform this task using sed only because sed offers some rudimentary programming capabilities such as branching (GOTO). My own attempt is shown below. Note that I only deleted the quotation marks at the beginning and end of each line; I realized that the quotation marks within the lines should be retained.

sed -n -e '
  /ADD CONSTRAINT/ {              # This is the block of commands to execute for lines containing the characters "ADD CONSTRAINT".
    s/^ "//g                      # Delete the space and quotation mark at the beginning of the line.
    s/"$//g                       # Delete the quotation mark at the end of the line.
    h                             # Overwrite the contents of the "hold space" with the contents of the "pattern space" (the current line).
    n                             # Read the next line; stop the program if no more lines are found.
    :loop                         # Perform the following commands in a loop.
      /^IMP-00003/ {              # This the the block of commands to execute for lines containing the characters "IMP-00003" at the beginning of the line.
        g                         # Overwrite the contents of the pattern space with the contents of the hold space.
        s/\n//g                   # Remove all newline characters from the pattern space.
        /FOREIGN KEY/p            # Print the pattern space if it contains the characters "FOREIGN KEY".
        b endloop                 # Branch to the end of the loop.
      }                           # This is the end of the command block.
      s/^ "//g                    # Delete the space and quotation mark at the beginning of the line.
      s/"$//g                     # Delete the quotation mark at the end of the line.
      H                           # Append the pattern space (the current line) to the hold space.
      n                           # Read the next line; stop the program if no more lines are found.
      b loop                      # Branch to the beginning of the loop.
    :endloop                      # This is the end of the loop.
  }                               # This is the end of the command block.
' imp.log

Here are the explanations of the very cryptic one-character sed commands used in the above example. A complete list can be obtained using the command “man sed” on a Unix machine.

n next line
h hold this line (discarding whatever else there may be in the hold space)
H hold this line (without discarding whatever else there may be in the hold space)
g get back the contents of the hold space
b branch
s substitute
p print

The results are as follows:

ALTER TABLE "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX" ADD CONSTRAINT "XXXXXXXXXXXXXXXXXXXXXX" FOREIGN KEY ("XXXXXXXXXXX") REFERENCES "XXXXXXXXXXXXXXXXX" ("XXXXXX") DISABLE NOVALIDATE

Conversely, the task can be performed using awk only because awk offers variables, conditional processing, and a rich collection of functions such as length, substr, and index (instr).

awk '

  /ADD CONSTRAINT/ {
    line=""
    flag="Y"
  }

  /^IMP-00003/ {
    if(flag=="Y") {
      if (index(line, "FOREIGN KEY")!=0) print line
      line=""
      flag="N"
    }
  }

  {
    if (flag=="Y") line=line substr(substr($0, 1, length($0)-1), 3)
  }

' imp.log

The results don’t change.

ALTER TABLE "XXXXXXXXXXXXXXXXXXXXXXXXXXXXX" ADD CONSTRAINT "XXXXXXXXXXXXXXXXXXXXXX" FOREIGN KEY ("XXXXXXXXXXX") REFERENCES "XXXXXXXXXXXXXXXXX" ("XXXXXX") DISABLE NOVALIDATE

Sed – An Introduction and Tutorial

Awk – An Introduction and Tutorial

About these ads
Categories: Oracle, Unix
  1. April 20, 2012 at 12:20 pm

    Just for fun, I threw together a solution only using sed. Let me know what you think or if you have questions.

  2. Iggy Fernandez
    April 20, 2012 at 2:06 pm

    I’m always happy to learn something new. Your script produced the following output:

    ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXXXXXXXXX CHECK (XXXXXXXXXXXXXXXXXX IN ('XXXXX','XXXX')) ENABLE NOVALIDATE
    ALTER TABLE XXXXXXXXXXXXXXXXXXXXXXXXXXXXX ADD CONSTRAINT XXXXXXXXXXXXXXXXXXXXXX FOREIGN KEY (XXXXXXXXXXX) REFERENCES XXXXXXXXXXXXXXXXX (XXXXXX) DISABLE NOVALIDATE
    

    The only thing remaining is to restrict the output to lines which contain the characters “FOREIGN KEY” (since I am only interested in foreign key constraints) but all that has to be done is to change the print command in your script to “/FOREIGN KEY/p”.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 711 other followers

%d bloggers like this: