Saturday, September 23, 2006

JSP Parser

I needed to write a program that extracts the text content from JSP files. I Googled for "JSP parser", but most results were either irrelevant or a module inside a servlet/JSP container. So I decided to write my own simple JSP parser. Initially, I tried a naive regular expression:




>([^<>]+)<

This solution is simple and gives reasonable results (an error rate of about 10%). Regular expressions are a powerful and handy tool for string manipulation. They are popular in scripting, yet not many Java programmers know or learn them.
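As a rough illustration, the naive pattern above could be applied in Java like this. This is only a sketch: the class name NaiveTextExtractor, the extract method, and the sample markup are made up for the example and are not taken from the actual program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class NaiveTextExtractor {
    // The naive pattern: a run of non-bracket characters between '>' and '<'.
    private static final Pattern TEXT = Pattern.compile(">([^<>]+)<");

    /** Returns every non-blank run of text found between a '>' and a '<'. */
    static List<String> extract(String markup) {
        List<String> texts = new ArrayList<>();
        Matcher m = TEXT.matcher(markup);
        while (m.find()) {
            String text = m.group(1).trim();
            if (!text.isEmpty()) {
                texts.add(text);
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        // Ordinary markup: the text between the tags is picked up.
        System.out.println(extract("<td>Name</td>"));
        // A scriptlet containing '>' and '<' produces a false positive.
        System.out.println(extract("<% if (a > 0 && b < 1) { %>"));
    }
}
```

The second call shows one kind of error the pattern makes: comparison operators inside a scriptlet look like tag boundaries, so a piece of Java code is reported as text.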


So I decided to refine the pattern into a more complicated one, in order to reduce the error rate.
Then I found this pattern, <([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>, at http://www.regular-expressions.info/examples.html. This was the first time I had come across backreferences (\1 in the pattern). I thought it really solved my problem, as it can scan for matching tags. While this solution works most of the time, it does have a drawback: it cannot handle nested tags (such as <td>...<td> ...</td> ...</td>).
So although this is acceptable, it is not a perfect solution. Tedious human verification is still required.
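Here is a small sketch of the backreference pattern in Java, including the nested-tag case. The class name BackreferenceDemo and the firstContent method are invented for this example, and the CASE_INSENSITIVE and DOTALL flags are my own assumption about how the pattern would be compiled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class BackreferenceDemo {
    // \1 must match whatever group 1 (the tag name) captured,
    // so <b>...</b> matches but <b>...</i> does not.
    private static final Pattern ELEMENT = Pattern.compile(
            "<([A-Z][A-Z0-9]*)\\b[^>]*>(.*?)</\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the content of the first matched element, or null if there is none. */
    static String firstContent(String markup) {
        Matcher m = ELEMENT.matcher(markup);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstContent("<b>bold</b>"));
        // Nested tags: the lazy (.*?) stops at the first </td>,
        // so the outer cell is cut short.
        System.out.println(firstContent("<td>a<td>b</td>c</td>"));
    }
}
```

In the nested case the match ends at the inner </td>, which is exactly the drawback described above: a regular expression like this cannot count how many tags are still open.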
...
...
...

Finally I found HTML Parser (http://htmlparser.sourceforge.net/), which solves my problem elegantly. All I need to do is extend org.htmlparser.visitors.NodeVisitor (my actual implementation is just a modified copy of org.htmlparser.visitors.TextExtractingVisitor).
After trying it, it leaves a final issue to solve...

HTML Parser cannot properly handle a closing tag (</tag) that appears between <SCRIPT> and </SCRIPT>. It assumes that the first </tag marks the end of the <SCRIPT> element. Although this is hardly an issue in normal HTML, in many JSPs, JSP tags such as <logic:present ... can appear anywhere, including between script tags.

Since HTML Parser is released under the GNU Lesser General Public License, it is legal to modify it ;P
After working through its code, I modified org.htmlparser.lexer.Lexer.parseCDATA(boolean) so that it looks for the ending tag that matches the opening tag. It works!
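The idea behind the change can be sketched in plain Java as follows. This is not the actual Lexer code, only an illustration of the two behaviours; the class ScriptEndFinder and both method names are made up for the example.

```java
// Hypothetical illustration, not taken from HTML Parser's source.
final class ScriptEndFinder {
    /** The behaviour described above: the first "</" ends the script body. */
    static int naiveEnd(String page, int bodyStart) {
        return page.indexOf("</", bodyStart);
    }

    /** Only "</" followed by the name of the opening tag ends the body. */
    static int matchingEnd(String page, int bodyStart, String tagName) {
        String closing = "</" + tagName;
        for (int i = page.indexOf("</", bodyStart); i >= 0; i = page.indexOf("</", i + 2)) {
            if (page.regionMatches(true, i, closing, 0, closing.length())) {
                int after = i + closing.length();
                // Make sure the name really ends here (e.g. not "</scripts").
                if (after >= page.length() || page.charAt(after) == '>'
                        || Character.isWhitespace(page.charAt(after))) {
                    return i;
                }
            }
        }
        return -1; // no matching closing tag found
    }

    public static void main(String[] args) {
        String page = "<script>var a = 1;</logic:present>var b = 2;</SCRIPT>";
        int bodyStart = "<script>".length();
        System.out.println(page.substring(bodyStart, naiveEnd(page, bodyStart)));
        System.out.println(page.substring(bodyStart, matchingEnd(page, bodyStart, "script")));
    }
}
```

With the naive rule the script body is cut off at the JSP closing tag, while the matching rule keeps everything up to </SCRIPT>. A real lexer would also have to deal with things like "</script" inside string literals, which this sketch ignores.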


Lesson learnt: good software is easy to understand, easy to modify or fix (without breaking the code), and easy to test. Related behaviours are kept in a single place (i.e. it is cohesive).
HTML Parser is a good example of quality software that undergoes continuous unit testing, enhancement and refactoring.

Sunday, September 17, 2006

shoulder injury again...

After the last rep of a seated dumbbell shoulder press, while I was rotating the dumbbells down to the sides of my hips (the movement was somewhat uncontrolled, maybe due to muscle fatigue), my left shoulder subluxated.

A shoulder subluxation is a temporary, partial dislocation of the shoulder joint. The shoulder is a ball-and-socket joint. The ball of the upper arm bone (humerus) is held in the socket (glenoid) of the shoulder blade (scapula) by a group of ligaments.

I could feel that my left shoulder had gone "in and out of joint" and a sound of bone rubbing was clearly heard, followed by pain and numbness in my shoulder.

Since the shoulders are recruited in almost every upper-body workout and further exercise will definitely worsen my shoulder injury, it's time to focus on lower-body workouts and cardio exercises...

After my shoulder recovers, that is, once
- my injured shoulder has full range of motion without pain
- my injured shoulder has regained normal strength compared to the uninjured shoulder,
my top priority will be to strengthen my rotator cuff (internal & external) muscles, because a strong rotator cuff helps prevent shoulder injuries.

Sunday, September 03, 2006

Grips

Overhand grip, underhand grip - the way to hold the handles
Wide grip, narrow grip, medium grip - the distance between left & right hands
T-bar grip, V-bar grip - handle/bar used to grip

After today's experiments with pull-downs and seated cable rows, I found that different grips stimulate slightly different muscle groups. Certain muscles (such as the back, biceps & triceps) can benefit significantly from varying the grip style (such as in supersets)...