Filepp : The generic file preprocessor


What is filepp?

filepp is a generic file preprocessor. It is designed to allow the functionality provided by the C preprocessor to be used with any file type. filepp supports the following keywords, all of which have their usual C preprocessor meanings and usage:
  • #include
  • #define
  • #if
  • #elif
  • #ifdef
  • #ifndef
  • #else
  • #endif
  • #undef
  • #error
  • #warning

However, filepp is much more than a rewrite of the C preprocessor. It features the following enhancements:

  • An extended #if keyword which includes string and regular expression parsing.
  • It works with all character sets including international characters, not just ASCII characters.
  • The prefix to the keyword (normally #) and the line continuation character (normally \) can be set to any character, string or regular expression.
  • Keywords can be added, removed or modified.
  • Macro expansion can work on whole or part words (the C preprocessor's macro expansion only works on whole words).
  • Macros can have multiple arguments.
  • Filepp has a debugging mode to help when things go wrong.
  • Environment variables can be automatically defined as macros.
  • Blank lines originating from include files can be suppressed.
  • Filepp can be customised on the fly using the #pragma keyword which allows any of filepp's internal functions to be called.
  • Modules can be written (in Perl) to modify or extend the behaviour of filepp.
These are just some of the enhancements filepp has over the normal C preprocessor. Its main advantage is the ability to write modules to extend and modify its behaviour. Filepp is written in Perl and allows anyone who knows how to program in Perl to easily write modules. Filepp comes with a set of modules which do the following:
  • for module: Implements the #for keyword. This allows loops to be generated; the behaviour is similar to Perl and C loops.
  • foreach module: Implements the #foreach keyword. This allows loops to be generated for a list of strings; the behaviour is similar to Perl and csh foreach loops.
  • c-comment module: Removes C and C++ style comments from a file.
  • hash-comment module: Removes # style comments (as used in Perl and shell scripts) from a file.
  • function module: Allows macros to be written which directly call Perl functions. This allows macros to give dynamic output.
  • maths module: Implements basic maths functions including add, subtract, multiply, divide, sine, cosine, exponential, random etc.
  • format module: Provides a list of functions for formatting text including a C/Perl style printf function and the Perl substr function.
  • literal module: This module prevents macros appearing in strings being replaced.
  • toupper module: Converts all lowercase letters in a file to uppercase.
  • tolower module: Converts all uppercase letters in a file to lowercase.
  • bigdef module: Enables multi-line macros to be defined without needing to put a line continuation character at the end of each line - makes large macros much more readable.
  • bigfunc module: Same as bigdef, except that any keywords embedded in the macro are evaluated when the macro is replaced rather than when the macro is defined.
  • defplus module: Enables existing macros to be appended to.
  • blc module: Enables automatic line continuation if a closing bracket is on a line below the opening bracket.
  • cmacro module: Makes certain macros more "C" like by putting quotes around their values.
  • cpp module: Makes filepp behave as a basic C preprocessor.
  • regexp module: Implements Perl style regular expression search and replacement, which allows regular expressions to be searched for and replaced with other strings.
  • grab module: Used to grab input before any processing is done on it.

Filepp is written in Perl and should therefore run on any system for which Perl is available.


Why filepp and not plain old cpp?

cpp is designed specifically to generate output for the C compiler. Yes, you can use any file type with it, but the output it creates includes loads of blank lines and lines of the style:
# 1 "file.c"
Obviously these lines are very useful to the C compiler, but of no use in, say, an HTML file. Also, as filepp is written in Perl, it is 8-bit clean and so works on any character set, not just ASCII characters. filepp is also customisable and hopefully more user friendly than cpp.

Where can I get filepp?

The current version of filepp is available here:

How do I use filepp?

Filepp is fully documented in its man page which is available here:

Why would I want to use filepp?

Filepp is written to be as generic as possible and so should work on almost any file type. Here are some example uses:

HTML preprocessor

filepp works well as an HTML preprocessor. Using an HTML preprocessor allows you to easily maintain a consistent look and feel for your web site. All style, colours, fonts, formatting etc. can be written as macros, which not only simplifies the web design process but also means you can get a whole new look and feel for your entire site by simply changing one file.

For an example of HTML preprocessing in action, this webpage along with the header files and Makefile used to generate it are available here:

(Note: if you try to view some of these files in your web browser they will appear partially rendered and you will not be able to read them properly. Look at the page source instead.)

A simple sed replacement

filepp can be used to replace all occurrences of a string in a file with another string in the following way:

  • filepp -k -u -Dfoo=bar input -o output
In this example all occurrences of the string "foo" in the file "input" will be replaced with the string "bar". The output will be written to the file "output". In the above example the option -k turns off all filepp's keywords and -u undefines all filepp's predefined macros.

filepp also allows macros with arguments to be specified on the command line:

  • cat input | filepp -k -u -c -D"foo(ARG1,ARG2)=foo has args: ARG1 and ARG2" > output
So if for example the file "input" contained the line:
  • foo(one, two)
then in the file "output" this would be replaced with:
  • foo has args: one and two
The above example also shows filepp reading from standard input and writing to standard output. The command line option -c causes filepp to read from standard input. filepp writes to standard output by default if no output file is specified.
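
As a rough illustration of what the argument substitution above amounts to, here is a minimal Python sketch. It is not how filepp is implemented (filepp is written in Perl); the name expand_foo and the regular expression are assumptions made for this example, and it only handles simple comma-separated arguments with no nested brackets.

```python
import re

# Hypothetical stand-in for -D"foo(ARG1,ARG2)=foo has args: ARG1 and ARG2".
# Assumption: arguments are simple, comma-separated, with no nested brackets.
FOO_CALL = re.compile(r"\bfoo\(\s*([^,()]*?)\s*,\s*([^,()]*?)\s*\)")


def expand_foo(line):
    """Replace each foo(a, b) in line with 'foo has args: a and b'."""
    return FOO_CALL.sub(
        lambda m: f"foo has args: {m.group(1)} and {m.group(2)}", line
    )


print(expand_foo("foo(one, two)"))  # prints: foo has args: one and two
```

The replacement text itself contains "foo", but it is not followed by an opening bracket and the substitution is not re-scanned, so the sketch does not loop.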

Maintain consistency between software and documentation

(This example was inspired by the book The Pragmatic Programmer by Andrew Hunt and David Thomas.)

Most software allows the user to enter parameters to configure the software's behaviour. These parameters often have a default value. For example, the program xbiff (which informs users when new mail arrives) has an update rate which tells xbiff how often it should check for new mail. The user can specify a value for this option when starting the program. If no value is specified a default is used. The default value will be documented somewhere, normally in a man page and possibly in any other documentation which comes with the program. If the developer changes the default value in the code, they must also remember to change the value wherever it appears in the documentation. filepp can be used to automate these changes.

When setting default values in a C program, it is normal to #define them, as the value may be needed in more than one place. For example, xbiff's update rate may be needed in the piece of code that does the actual updating and in the command line help. As filepp understands the #define keyword, it can also read the include file containing the default value. The include file could look something like this:

/* Include file "defaults.h".
// Contains all default values for program.  Read by cpp and filepp */

/* default update rate (seconds) */
#define UPDATE_RATE 30

This file can be included in all C and C++ files wherever it is needed. It can also be used in any ASCII documentation (man page, HTML, LaTeX, etc.) and pre-processed by filepp, provided care is taken over the handling of the comments. For example, in an HTML file the include file could be embedded in an HTML comment, e.g.:

<!--
#include "defaults.h"
-->

This allows filepp to parse the include file, but as the contents of the include file are hidden in an HTML comment they will not appear in the actual web page produced. However, as filepp has parsed the #define line, it will replace all occurrences of the macro UPDATE_RATE with the definition 30.
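
The effect on the documentation can be sketched in a few lines of Python. This is only an illustration of the idea, not filepp's own code: the HEADER and PAGE strings, and the helper names read_defines and substitute, are invented for the example, and only simple "#define NAME VALUE" lines are understood.

```python
import re

# Hypothetical contents of the shared header and of an HTML page that uses it.
HEADER = """\
/* default update rate (seconds) */
#define UPDATE_RATE 30
"""
PAGE = "<p>By default, mail is checked every UPDATE_RATE seconds.</p>"


def read_defines(header_text):
    """Collect simple '#define NAME VALUE' lines (no arguments, no continuation)."""
    defines = {}
    for line in header_text.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*\S)", line)
        if match:
            defines[match.group(1)] = match.group(2)
    return defines


def substitute(text, defines):
    """Replace whole-word occurrences of each macro name with its value."""
    for name, value in defines.items():
        # A function replacement avoids backslash handling in the value.
        text = re.sub(rf"\b{re.escape(name)}\b", lambda _m, v=value: v, text)
    return text


print(substitute(PAGE, read_defines(HEADER)))
# prints: <p>By default, mail is checked every 30 seconds.</p>
```

Changing the 30 in the header changes the C program and every generated document together, which is the point of keeping the default in one place.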

A different way of hiding the C comments is to use filepp to convert them into the file's native comment style. For example, all LaTeX comments start with the character %. So for the above example, if filepp is run with the following command line:

  • filepp -D"/*=%" -D"//=%" userguide.tex.in -o userguide.tex
it will convert all C and C++ comments of the form "/*" and "//" to LaTeX comments: "%".

A preprocessor for any language

As filepp is written in Perl, it is 8-bit clean. This means it can be used on any character set, not just ASCII characters. For example, in Britain the only really useful character we have that is not part of the ASCII character set is the pound sign '£'. The following works fine with filepp:
  • #define £ pound

An easy front end to your file processing routines

filepp is designed to be a highly customisable generic file pre-processor. Its C pre-processor style functionality should just be considered its default behaviour, as it can be used as a front end for almost any file pre-processing or conversion process. filepp can be customised to do other sorts of pre-processing or file conversion by using filepp modules, which are written in Perl. Therefore, anyone who knows how to program in Perl should find customising filepp very easy. filepp modules allow you to do the following:
  • Add / remove / change the behaviour of keywords.
  • Add / remove / reorder processing routines. A processing routine takes in a line from the input, processes it in some way and returns the processed output. filepp has a chain of these processing routines which process each line of the input in turn. Processing routines can be easily added, removed or modified.
  • Write macros which directly call Perl functions. filepp comes with a module called function.pm which allows macros to be added which call Perl functions. This allows macros to have dynamic values. An example of how this works is the maths module maths.pm which has a macro add(a, b). When this macro is found in the input, a Perl function is called which replaces the macro with the sum of a and b.
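
The chain of processing routines and the function-backed add macro described above can be pictured with a small Python sketch. This is a conceptual model only: real filepp modules are written in Perl against filepp's own interface, which is documented in the man page and not reproduced here. All function names below are invented for the example.

```python
import re


# Hypothetical processing routines: each takes a line and returns a line.
def strip_hash_comment(line):
    """Drop everything from the first '#' onwards (naive: ignores quoting)."""
    return line.split("#", 1)[0].rstrip()


def expand_add(line):
    """Replace add(a, b) with the sum of two integer literals."""
    return re.sub(
        r"\badd\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)",
        lambda m: str(int(m.group(1)) + int(m.group(2))),
        line,
    )


def to_upper(line):
    """Convert the whole line to uppercase."""
    return line.upper()


def run_chain(line, chain):
    """Pass the line through every routine in order."""
    for routine in chain:
        line = routine(line)
    return line


chain = [strip_hash_comment, expand_add, to_upper]
print(run_chain("total = add(2, 3)  # sum", chain))  # prints: TOTAL = 5
```

Because each routine sees the output of the previous one, order matters: in this sketch, running to_upper before expand_add would turn add into ADD, which the case-sensitive pattern no longer matches.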

When did the latest version of filepp appear?

The current filepp ChangeLog is available here:

Older versions of filepp can be found here:

Who wrote filepp?

Filepp was written by Darren Miller (darren at cabaret dot demon dot co dot uk). Many others have contributed patches, bug reports and suggestions; see the README for a list of some of the main contributors. Thanks to everyone who has helped make filepp what it is.

Last updated: Aug 19 2008