Thursday, March 6, 2008

qmaptranslator.NET - baby steps with regex

Setting up the project, I had grandiose ideas ... :[ That is usually a nooby way of saying that I didn't realise how much work I would have to put into this project. So I went back to the drawing board and approached the project as a simple assembly to be run from the command line only. It should take one argument (that's right) and read through a file (obviously a .map file) line by line, copying each line over as-is when there is no difference between the original format (mohaa map file) and the target format (cod2 map file).
The idea is to implement a simple read-transform-copy without the time-consuming step of loading objects like vertex/polygon/mesh. But currently I have a hard time running a 'simple' regex. :P
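To make the read-transform-copy idea concrete, here is a minimal sketch in Python (not the .NET assembly itself). The names `transform_line`, `translate` and `main` are hypothetical, and the transform step is only a placeholder that copies every line unchanged:

```python
import sys


def transform_line(line: str) -> str:
    """Hypothetical placeholder for the per-line conversion.

    Lines that are identical in both formats are returned unchanged;
    a real translator would rewrite the lines that differ.
    """
    return line


def translate(source_lines):
    """Yield each line after the transform step, one line at a time."""
    for line in source_lines:
        yield transform_line(line)


def main(argv):
    # The tool takes exactly one argument: the path to the .map file.
    if len(argv) != 2:
        print("usage: qmaptranslator <file.map>", file=sys.stderr)
        return 1
    with open(argv[1], encoding="utf-8", errors="replace") as source:
        for line in translate(source):
            sys.stdout.write(line)
    return 0
```

A script would call `sys.exit(main(sys.argv))` as its entry point. Nothing is loaded into vertex/polygon/mesh objects; only one line is held in memory at a time.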

Solution to the 'Regex knowledge wall' I faced earlier this week:

I've found out a few interesting things about Regex that I didn't know. The first one is that a regex can have named groups. At first I thought: "what the heck is that about?". And guess what?
Simply put, a group of items in a regex string can be given a name, which makes retrieving the matched item easier. Sounds complicated? Here is what I had in mind: reading a Quake .map file (or one from any game based on the same engine) line by line and matching each line against a regex makes it possible to manipulate the string that was read. The .map file format gives the following for a brush:

// brush 0
{
( 104 128 0 ) ( -72 128 0 ) ( -72 -40 0 ) caulk 64 64 0 0 0 0
( -72 -40 8 ) ( -72 128 8 ) ( 104 128 8 ) caulk 64 64 0 0 0 0
( -72 -40 8 ) ( 104 -40 8 ) ( 104 -40 0 ) caulk 64 64 0 0 0 0
( 104 -40 8 ) ( 104 128 8 ) ( 104 128 0 ) caulk 64 64 0 0 0 0
( 104 128 8 ) ( -72 128 8 ) ( -72 128 0 ) caulk 64 64 0 0 0 0
( -72 128 8 ) ( -72 -40 8 ) ( -72 -40 0 ) caulk 64 64 0 0 0 0
}

It reads as follows:

The curly brackets group the lines in between them. On each line, the three items in parentheses are the vertex definitions (to be understood as point(x, y, z)). The last part is well documented on the internet: it is the texture's name, X offset, Y offset, rotation, X scale and Y scale. Knowing that, the regex to match that string is dead easy:

\([0-9 \.\-]+\) \([0-9 \.\-]+\) \([0-9 \.\-]+\) [a-z0-9/_]+ [0-9\.\-]+ [0-9\.\-]+ [0-9\.\-]+ [0-9\.\-]+
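As a quick check, here is a small Python sketch that tries this pattern against lines from the brush above. It assumes every literal parenthesis is written escaped, as `\(` and `\)`; the helper name `is_face_line` is hypothetical:

```python
import re

# The pattern from the post, with literal parentheses escaped as \( and \).
FACE_PATTERN = re.compile(
    r"\([0-9 \.\-]+\) \([0-9 \.\-]+\) \([0-9 \.\-]+\) "
    r"[a-z0-9/_]+ [0-9\.\-]+ [0-9\.\-]+ [0-9\.\-]+ [0-9\.\-]+"
)

sample = "( 104 128 0 ) ( -72 128 0 ) ( -72 -40 0 ) caulk 64 64 0 0 0 0"


def is_face_line(line: str) -> bool:
    """Return True when the line starts with a brush face definition."""
    return FACE_PATTERN.match(line) is not None
```

Here `is_face_line(sample)` is true, while `is_face_line("{")` and `is_face_line("// brush 0")` are false. Note that `match` only requires the pattern to match from the start of the line, so the last two numbers of the sample line are simply left unconsumed.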

Here is the trick with named groups: each item (eight of them in this example) can have a "flag" (or "tag") that can subsequently be called when iterating through the retrieved string. In .NET a named group is written (?<name>pattern). The regex string becomes the following:

(?[\t]*)(?\([0-9 \.\-]+\)) (?\([0-9 \.\-]+\)) (?\([0-9 \.\-]+\)) (?[a-z0-9/_]+) (?[0-9\.\-]+) (?[0-9\.\-]+) (?[0-9\.\-]+) (?[0-9\.\-]+) (?[0-9\.\-]+)
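Here is a hedged sketch of how named groups are read back, written in Python rather than .NET. Python spells a named group `(?P<name>...)` where .NET uses `(?<name>...)`, and .NET reads the value with `match.Groups["name"].Value`. All group names below (`indent`, `v1`, `texture`, `x_offset` and so on) are hypothetical, since the pattern above does not show any; they simply follow the field order described earlier:

```python
import re

# Same structure as the pattern above; every group name is an assumption.
FACE = re.compile(
    r"(?P<indent>[\t]*)"
    r"(?P<v1>\([0-9 \.\-]+\)) (?P<v2>\([0-9 \.\-]+\)) (?P<v3>\([0-9 \.\-]+\)) "
    r"(?P<texture>[a-z0-9/_]+) "
    r"(?P<x_offset>[0-9\.\-]+) (?P<y_offset>[0-9\.\-]+) (?P<rotation>[0-9\.\-]+) "
    r"(?P<x_scale>[0-9\.\-]+) (?P<y_scale>[0-9\.\-]+)"
)

line = "( 104 128 0 ) ( -72 128 0 ) ( -72 -40 0 ) caulk 64 64 0 0 0 0"

match = FACE.match(line)
# groupdict() maps each group name to the text it captured.
fields = match.groupdict()

texture = fields["texture"]   # the texture name, retrieved by name
first_vertex = fields["v1"]   # the first "( x y z )" item, parentheses included
```

Retrieving `fields["texture"]` by name is easier to read than counting group positions, which is the whole point of the trick.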

I actually found this very handy trick while reading Chapter 24 of the O'Reilly book "C# 3.0 in a Nutshell" and experimenting with the fantastic little program called Expresso. Needless to say, all my trouble with regex came from my noobyness as a programmer. But hey, life is a learning experience. :)

Hope this helps.

Tam
