Exercises on bash text-processing commands
- recall command-line redirection and pipelines (STDIN,
STDOUT, STDERR)
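A minimal warm-up sketch of the three streams, using a throwaway file name (`sample.txt`) chosen for illustration:

```shell
printf 'one\ntwo\nthree\n' > sample.txt    # STDOUT redirected into a file
wc -l < sample.txt                         # STDIN read from a file
ls no-such-file 2> errors.txt || true      # STDERR redirected separately
cat sample.txt | tr a-z A-Z                # pipeline: cat's STDOUT feeds tr's STDIN
```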
- using wget, download skakalpes-il2.txt
- view the file using cat and less
- using iconv, convert the file from iso-8859-2 to
utf-8 and store the result in skakalpes-utf8.txt
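Since the downloaded file is not available here, this sketch manufactures a small ISO-8859-2 stand-in from a line of Czech text and then performs the conversion the task asks for:

```shell
# Hypothetical stand-in for skakalpes-il2.txt: UTF-8 text with diacritics,
# first encoded down to ISO-8859-2.
printf 'Skákal pes přes oves\n' > sample-utf8.txt
iconv -f utf-8 -t iso-8859-2 sample-utf8.txt > sample-il2.txt

# The conversion asked for: ISO-8859-2 input, UTF-8 output.
iconv -f iso-8859-2 -t utf-8 sample-il2.txt > skakalpes-utf8.txt
```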
- view the new file
- count the number of lines in the file using wc
- using head and tail, view the first 15 lines,
the last 15 lines, and lines 10-20
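A sketch of the counting and slicing steps on a 30-line numbered stand-in file (`lines.txt`, an assumption for illustration):

```shell
seq 1 30 > lines.txt                 # stand-in file with 30 numbered lines
wc -l lines.txt                      # count lines
head -n 15 lines.txt                 # first 15 lines
tail -n 15 lines.txt                 # last 15 lines
head -n 20 lines.txt | tail -n 11    # lines 10-20: first 20, then the last 11 of those
```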
- using cut, print the first two words on each line
- using grep, print all lines containing a digit
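The cut and grep steps sketched on a small made-up sample (three lines, one containing a digit):

```shell
printf 'alpha beta gamma\none 2 three\nno digits here\n' > text.txt
cut -d' ' -f1,2 text.txt    # first two space-separated fields of each line
grep '[0-9]' text.txt       # lines containing at least one digit
```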
- using sed, substitute spaces and punctuation marks
with the newline character (\n), so that there is at most
one word per line
- using grep, filter out empty lines
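One way to sketch the tokenization and empty-line filtering together; note that `\n` in the sed replacement is a GNU sed extension (BSD sed would need a literal newline):

```shell
printf 'Skakal pes, pres oves.\n' > text.txt
# Replace each run of spaces/punctuation with a newline, then drop empty lines.
sed 's/[[:space:][:punct:]]\{1,\}/\n/g' text.txt | grep -v '^$'
```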
- using sort, sort the words alphabetically
- using wc, count the number of words in the
text
- using sort|uniq, count the number of distinct words
in the text
- using sort|uniq -c|sort -nr, create a frequency
list of words
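The sorting and counting steps above, sketched on a tiny hand-made word list (one word per line, as produced by the earlier sed step):

```shell
printf 'pes\noves\npes\npres\npes\n' > words.txt
sort words.txt                         # words in alphabetical order
wc -l words.txt                        # total number of words
sort words.txt | uniq | wc -l          # number of distinct words
sort words.txt | uniq -c | sort -nr    # frequency list, most frequent first
```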
- create a frequency list of letters
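For letters, `grep -o` is one convenient trick: it prints every match on its own line, and `.` matches a single character. A sketch on a one-word sample:

```shell
printf 'banana\n' > text.txt
grep -o . text.txt | sort | uniq -c | sort -nr   # letter frequency list
```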
- using paste, create the frequency list of word
bigrams (create another file with lines shifted upwards by one,
merge it by paste with the original file and make a frequency
list of the lines)
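A sketch of the paste trick on a four-word list; `tail -n +2` produces the upward-shifted copy, and `sed '$d'` drops the final padded line where paste has no second word:

```shell
printf 'a\nb\na\nb\n' > words.txt       # one word per line
tail -n +2 words.txt > shifted.txt      # same list shifted up by one line
paste words.txt shifted.txt | sed '$d' | sort | uniq -c | sort -nr
```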
- Longer exercise: write a shell script that downloads
the main web page of some news server and finds all word bigrams
in it in which both words are capitalized. Make a frequency list
of the HTML tags used in the document.
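A sketch of the two pipelines, using an inline HTML snippet in place of a downloaded page (for a real run, replace the first printf with something like `wget -q -O page.html <news URL>`):

```shell
printf '<html><body><p>John Smith met Mary Jones.</p></body></html>\n' > page.html

# Capitalized word bigrams: strip tags, tokenize to one word per line,
# pair adjacent words with paste, keep pairs where both start uppercase.
sed 's/<[^>]*>/ /g' page.html | grep -oE '[[:alnum:]]+' > words.txt
tail -n +2 words.txt > shifted.txt
paste words.txt shifted.txt | grep -E '^[A-Z][[:alnum:]]*[[:space:]][A-Z]' \
    | sort | uniq -c | sort -nr

# HTML tag frequency list: extract tag names (opening and closing), count them.
grep -oE '</?[a-zA-Z][a-zA-Z0-9]*' page.html | tr -d '/<' | sort | uniq -c | sort -nr
```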
- reorganize the steps above into a Makefile; name your targets t2-t18
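A hypothetical fragment showing the shape such a Makefile might take (the URL variable and the mapping of step numbers to targets are assumptions, not given in the exercise):

```make
# $(URL) must be supplied, e.g.  make t9 URL=...
t2:
	wget -O skakalpes-il2.txt $(URL)

t6: t2
	iconv -f iso-8859-2 -t utf-8 skakalpes-il2.txt > skakalpes-utf8.txt

t9: t6
	wc -l skakalpes-utf8.txt
```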
- suggest similar new exercises