Nicer String.Format

I’ll say up front that I’ve rarely seen a string formatting facility that I’ve liked.  One of the first I used was Pascal’s writeln pseudo-function.  writeln looks like a normal function or procedure call, but it lets you do some interesting formatting things:

writeln('Hello, world!'); { simple }
writeln('I am ', age, ' years old.'); { age is an integer }
writeln('I weigh ', weight:3:1, ' pounds.'); { weight is a real }

writeln takes any number of arguments and prints them all out with a final newline at the end.  You can also add formatting information: the expression weight:3:1 says to format weight in a field at least three characters wide with one digit to the right of the decimal point.  For other types you can only use one : modifier, the field width.  It’s a fairly decent approach for what it does, but a few things about it are severely broken.  The first is that you can’t write writeln in Pascal: it’s a language feature, not a library function, which means its variable-argument, colon-modifier style can’t be used in any other context.  As a result, the modularity is also wrong: I can’t format into some other blob to be output by some other means.  If the destination isn’t a file, you can’t use writeln.  It would be nice to be able to writeln to a string or byte array, but then again, Pascal doesn’t have byte strings.  It’s also not extensible, so there’s no standard way to format a record or some other user-defined type.

C does better by using a format string written in a small domain-specific language (i.e., %fmt), where fmt describes how a particular argument should be printed.  It’s nice: you can do an awful lot with what’s there, and the level of modularity is nearly right.  It’s possible to format into other destinations (sprintf writes into a character buffer, for example), and if the variant you need doesn’t exist, you can build it on the va_list versions such as vsprintf.  Unfortunately, printf and its peers are clever hacks, but they’re still hacks, and the result is fragile: nothing in the language ties the format string to the types of the arguments, so code frequently breaks when you change a type or the format string.  This problem will always be present when the formatting instructions live in a string separate from the values they describe, rather than sitting infix next to them.
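As a rough illustration of building your own formatter on top of the va_list variants, here is a minimal sketch written in C++ against the C library.  The name format_to_string and the fixed 256-byte buffer are assumptions made for this example, not anything from the C standard library.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper (an assumption for this sketch): printf-style
// formatting that returns a std::string instead of writing to a file.
std::string format_to_string(const char* fmt, ...) {
    char buffer[256];  // output longer than this is truncated
    va_list args;
    va_start(args, fmt);
    // vsnprintf is the va_list form of snprintf; it never writes more than
    // sizeof buffer bytes, including the terminating null.
    std::vsnprintf(buffer, sizeof buffer, fmt, args);
    va_end(args);
    return std::string(buffer);
}
```

The fragility is easy to see here: a call such as format_to_string("I am %d years old.", 42) works, but if the argument later becomes a double and the format string still says %d, the behavior is undefined and the language itself does nothing to stop it.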

C++ sets the bar higher by defining clever << operations on ostream.  Since ostream::operator<< returns a reference to an ostream, it cascades nicely and reads in a nice infix way.  Since operator<< is overloadable, it’s nicely modular: any type can supply its own formatting.  Control over how things are formatted is done with stream manipulators.  All of this is a nice step up from C, albeit with a much greater weight.
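A small sketch of what that looks like in practice.  The Person type and the describe function are hypothetical, invented only for this example; std::fixed and std::setprecision are standard manipulators.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record type, used only for illustration.
struct Person {
    std::string name;
    double weight;
};

// Overloading operator<< teaches every ostream how to print a Person.
// Returning the stream is what lets calls cascade.
std::ostream& operator<<(std::ostream& out, const Person& p) {
    // std::fixed and std::setprecision(1) are stream manipulators that
    // select fixed-point output with one digit after the decimal point.
    return out << p.name << " weighs " << std::fixed << std::setprecision(1)
               << p.weight << " pounds";
}

// Formats into a string rather than a file by targeting an ostringstream.
std::string describe(const Person& p) {
    std::ostringstream out;
    out << "Report: " << p << '.';
    return out.str();
}
```

One part of the weight mentioned above: manipulators such as std::fixed stay in effect on the stream after the operator returns, so a careful overload would save and restore the stream’s formatting state.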

C# uses {n} to insert the nth argument (counting from zero) into a string in the method String.Format().  This works better than C’s approach since you can easily insert your arguments in any order, any number of times.  For example you can do this:

string output = String.Format("I heard your {1} went into a {0} and {2} everything in the {0} and they had to close the {0}.",
    "restaurant", "dad", "ate");

“restaurant” will get substituted in several times in the correct places.  Awesome, but I still hate it.  So much so that I usually end up using a StringBuilder or the less desirable series of + operations.

Today on Stack Overflow, I saw this question about how to do a String.Format that uses names rather than numeric indices to specify the things to be formatted.

In the accepted answer, there is a link to some work done by Phil Haack that describes the problem well, compares other solutions, then offers his own.  It’s great work and a nice read.
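To make the idea of named placeholders concrete, here is a minimal sketch in C++, one of the languages discussed above.  It is not Phil Haack’s implementation or the Stack Overflow answer; format_named is a hypothetical helper that assumes values arrive as strings in a map, and it handles neither brace escaping nor format specifiers.

```cpp
#include <map>
#include <string>

// Hypothetical helper (an assumption for this sketch): replaces each {name}
// in the template with the value stored under that name. Unknown names and
// unmatched braces are copied through unchanged.
std::string format_named(const std::string& tmpl,
                         const std::map<std::string, std::string>& values) {
    std::string result;
    std::string::size_type pos = 0;
    while (pos < tmpl.size()) {
        const auto open = tmpl.find('{', pos);
        const auto close = (open == std::string::npos)
                               ? std::string::npos
                               : tmpl.find('}', open + 1);
        if (close == std::string::npos) {
            // No complete placeholder left: copy the rest verbatim.
            result.append(tmpl, pos, std::string::npos);
            break;
        }
        // Copy the literal text that precedes the placeholder.
        result.append(tmpl, pos, open - pos);
        const std::string name = tmpl.substr(open + 1, close - open - 1);
        const auto found = values.find(name);
        if (found != values.end()) {
            result += found->second;
        } else {
            // Unknown name: keep the placeholder text as written.
            result.append(tmpl, open, close - open + 1);
        }
        pos = close + 1;
    }
    return result;
}
```

With names, the template reads as "I heard your {who} went into a {place}" and the values are looked up by key, so reordering them cannot silently put the wrong word in the wrong slot.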

Published Monday, January 05, 2009 11:23 AM by Steve Hawley