Occasionally I find I need to perform a search-and-replace over many files in a directory hierarchy. I could open each such file, do a search-and-replace with my editor, then save, but this would be tedious.
On the other hand, writing a full-fledged script to automate the process would be silly, since I know about perl’s -i option.
An example: poor man’s refactoring. I find all .C files in or beneath the current directory, and change toString to to_string in those files:
$ perl -pi -e 's/toString/to_string/g' `find . -name '*.C'`
Basically, the -p option coupled with the -e EXPRESSION bit causes perl to execute the given expression on each line of each file specified on the command line. The expression I use here is a regular expression replacement.
Normally, -p causes perl to print the results to stdout, but combined with -i and no argument, perl effectively edits the files in-place. Let’s hope you use version control, in case you make a mistake. ;-)
An argument can be given to the -i option to make perl back up the original files with the specified extension; for example, -i.bak saves each original as FILE.bak before editing. For more info, try perldoc perlrun.
Doublets is a game described by Lewis Carroll, in which one attempts to transform the start word into the end word by changing one letter at a time, such that each intermediate step results in another valid word.
For example, a doublet between ‘head’ and ‘tail’ is
> head
> heal
> teal
> tell
> tall
> tail
Finding doublets without assistance is satisfying and tests your vocabulary, but I knew I could have a computer find them through brute force. I wrote a Haskell program that, given a word list file, finds the shortest doublet between two words. Naturally, the quality of the results depends on the word list given to the program.
This program builds in-memory graphs whose nodes are labeled with words of the same length. That is, there will be one graph for words of length 0, one graph for words of length 1, one graph for words of length 2…
In each of these graphs, two nodes are connected if the two labels differ by only one letter. So, an edge would connect ‘book’ and ‘look’, but not ‘book’ and ‘beak’. The shortest doublet is found by computing the shortest path between corresponding nodes.
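To make the edge rule concrete, here is a small shell sketch (not taken from the Haskell program) that lists the neighbors of a word in a word list. The function name neighbors and the sample words are invented for this example, and it assumes the word consists of plain letters only. It illustrates the adjacency test, not the shortest-path search.

```shell
# neighbors WORD WORDLIST
# Print every word in WORDLIST that differs from WORD in exactly one position.
# Assumes WORD contains only plain letters (no regex metacharacters).
# The body runs in a subshell "( ... )" so its variables stay local.
neighbors() (
    word=$1 list=$2 pattern=
    i=1
    while [ "$i" -le "${#word}" ]; do
        # Turn the i-th letter into a wildcard: book -> .ook, b.ok, bo.k, boo.
        alt=$(printf '%s\n' "$word" | sed "s/././$i")
        pattern="${pattern:+$pattern|}$alt"
        i=$((i + 1))
    done
    # -x matches whole lines only, which also enforces equal length;
    # the second grep drops WORD itself.
    grep -xE "$pattern" "$list" | grep -vxF "$word"
)

# Hypothetical sample word list, one word per line.
words=$(mktemp)
printf '%s\n' book look beak boot cook books > "$words"

neighbors book "$words"   # prints look, boot and cook, one per line
```

A breadth-first search that repeatedly expands neighbors from the start word would then reach the end word along a shortest doublet, which is the same idea as the shortest-path computation on the graph.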
This program can utilize multiple cores and has a simple REPL using the Haskell bindings to GNU readline. To build and run this program you need the readline bindings and Martin Erwig’s graph library, both available on Hackage.
The source is available as a gzipped tar: doublets.tgz.