XF 2.1 Help w JavaScript to syntax color a new language

Joe Kuhn

Active member
I've installed the Code addon in my FoxProToCSharp discussion forum to add line numbering and line highlighting for my users. Code coloring is supported in XF via PrismJS, which supports C# but not FoxPro. My goal is to take the PrismJS file for Visual Basic and make it work for FoxPro.

Believe it or not, here's the complete PrismJS JavaScript file for VB:

JavaScript:
Prism.languages['visual-basic'] = {
    'comment': {
        pattern: /(?:['‘’]|REM\b).*/i,
        inside: {
            'keyword': /^REM/i
        }
    },
    'directive': {
        pattern: /#(?:Const|Else|ElseIf|End|ExternalChecksum|ExternalSource|If|Region)(?:[^\S\r\n]_[^\S\r\n]*(?:\r\n?|\n)|.)+/i,
        alias: 'comment',
        greedy: true
    },
    'string': {
        pattern: /["“”](?:["“”]{2}|[^"“”])*["“”]C?/i,
        greedy: true
    },
    'date': {
        pattern: /#[^\S\r\n]*(?:\d+([/-])\d+\1\d+(?:[^\S\r\n]+(?:\d+[^\S\r\n]*(?:AM|PM)|\d+:\d+(?::\d+)?(?:[^\S\r\n]*(?:AM|PM))?))?|(?:\d+[^\S\r\n]*(?:AM|PM)|\d+:\d+(?::\d+)?(?:[^\S\r\n]*(?:AM|PM))?))[^\S\r\n]*#/i,
        alias: 'builtin'
    },
    'number': /(?:(?:\b\d+(?:\.\d+)?|\.\d+)(?:E[+-]?\d+)?|&[HO][\dA-F]+)(?:U?[ILS]|[FRD])?/i,
    'boolean': /\b(?:True|False|Nothing)\b/i,
    'keyword': /\b(?:AddHandler|AddressOf|Alias|And(?:Also)?|As|Boolean|ByRef|Byte|ByVal|Call|Case|Catch|C(?:Bool|Byte|Char|Date|Dbl|Dec|Int|Lng|Obj|SByte|Short|Sng|Str|Type|UInt|ULng|UShort)|Char|Class|Const|Continue|Date|Decimal|Declare|Default|Delegate|Dim|DirectCast|Do|Double|Each|Else(?:If)?|End(?:If)?|Enum|Erase|Error|Event|Exit|Finally|For|Friend|Function|Get(?:Type|XMLNamespace)?|Global|GoSub|GoTo|Handles|If|Implements|Imports|In|Inherits|Integer|Interface|Is|IsNot|Let|Lib|Like|Long|Loop|Me|Mod|Module|Must(?:Inherit|Override)|My(?:Base|Class)|Namespace|Narrowing|New|Next|Not(?:Inheritable|Overridable)?|Object|Of|On|Operator|Option(?:al)?|Or(?:Else)?|Out|Overloads|Overridable|Overrides|ParamArray|Partial|Private|Property|Protected|Public|RaiseEvent|ReadOnly|ReDim|RemoveHandler|Resume|Return|SByte|Select|Set|Shadows|Shared|short|Single|Static|Step|Stop|String|Structure|Sub|SyncLock|Then|Throw|To|Try|TryCast|TypeOf|U(?:Integer|Long|Short)|Using|Variant|Wend|When|While|Widening|With(?:Events)?|WriteOnly|Xor)\b/i,
    'operator': [
        /[+\-*/\\^<=>&#@$%!]/,
        {
            pattern: /([^\S\r\n])_(?=[^\S\r\n]*[\r\n])/,
            lookbehind: true
        }
    ],
    'punctuation': /[{}().,:?]/
};

Prism.languages.vb = Prism.languages['visual-basic'];
I took the above file, renamed it accordingly and changed the header and footer to say foxpro instead of visual-basic. After installing per the Code addon directions, you can see that coloring is happening, but it's not working correctly for FoxPro comments:


First goal: get comments working for Fox.

In VB comments start with 'REM' or the single quote character.

In Fox they start with 'NOTE', an asterisk, or two ampersand characters. More details about comments are included in my link just above, but I'd like to get some coloring working correctly first.

Based on this for VB:

JavaScript:
    'comment': {
        pattern: /(?:['‘’]|REM\b).*/i,
        inside: {
            'keyword': /^REM/i
        }
I'm changing it to this for Fox:

JavaScript:
    'comment': {
        pattern: /(?:['*’]|['&&’]|NOTE\b).*/,
        inside: {
            'keyword': /^['&&']/
        }
But I have some questions:

1. It appears forward slash is some type of delimiter, correct?
2. Why the question mark and colon after the pattern object?
3. What's with the 'i' at the end of some of the lines?
4. And \b. What's that for?

Any other comments you care to make about my first pass are welcome. And thanks in advance.
 

Jeremy P

Well-known member
The patterns themselves are regular expressions. The slash is a delimiter, (?:...) creates a non-capturing group, i is the case-insensitive modifier, and \b is a word boundary. They can be a bit confusing at first glance, but they're not too difficult once you learn the basic constructs.
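To make those four constructs concrete, here is a small sketch that runs the VB comment pattern from the first post in plain JavaScript. The sample strings and variable names are made up for illustration.

```javascript
// The slashes delimit the regex literal; the trailing i makes it case-insensitive.
const vbComment = /(?:['‘’]|REM\b).*/i;

// i flag: lower-case "rem" still starts a comment.
const lowerCaseRem = vbComment.test("rem a lower-case comment");

// \b word boundary: "REM" must end at a word edge, so "REMOVE" is not a comment.
const remInsideWord = vbComment.test("REMOVE x");

// (?:...) groups alternatives without capturing, so group 1 is the text after the marker.
const nonCapturing = "REM hi".match(/(?:REM|NOTE) (.*)/)[1];

// With plain (...), the marker itself becomes group 1.
const capturing = "REM hi".match(/(REM|NOTE) (.*)/)[1];

console.log(lowerCaseRem, remInsideWord, nonCapturing, capturing);
```

Pasting any of these patterns into a regex tester set to the JavaScript flavor should give the same results.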
 

Joe Kuhn

Active member
Case is important, so I took the 'i' out. That regular expression tester is awesome. I wonder what flavor we have in xf.
 

Joe Kuhn

Active member
Code:
^(NOTE ).*
^[*].*
&{2,}.*
Got three regular expressions above working for these strings separately:

NOTE this is a comment
* another comment
&& and a third comment

Now to put the 3 regexes together so they'll work on any one of the three strings... I better stop now...
 

Joe Kuhn

Active member
I tuned these up:
Code:
^(NOTE ).*
^[*].*
^&{2,}.*
With the following tests:
Code:
* There are three types of comments in FoxPro:
*        1. A full line comment starting with an asterisk
NOTE     2. A full line comment starting with 'NOTE'
X = 5 && 3. An inline comment starting at '&&' after a command
&& A double ampersand can be used at the beginning of a line as well.

* All three types of comments can be extended with the FoxPro ;
line extension character, the semicolon.

NOTE - example of a 'NOTE' comment extended ;
    with a semicolon.

X = 5 && Example of an inline comment;
         extended to a new line with a semicolon.

* Comments beginning with an asterisk or NOTE cannot be used inline;
NOTE everything above should be green except for 'X = 5'

&&& This is a comment
** So is this
* the remaining lines are not:
NOTECOUNT = 5
THISNOTECOUNT = 5
X = 5 * 5
X = 5 * 5 * 5
test = "hello"
t = "test"
s = &t
t = "Test with an * asterisk"
In English, a FoxPro comment can begin with an asterisk, the word "NOTE" or 2 ampersands. Only the ampersands can be used in line.

The hat character (^) anchors the match to the start of the text, so only the top line can match unless the multiline (m) flag is set. It may not be needed in PrismJS if it evals one line at a time.
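A quick way to see what the hat character does in plain JavaScript (how PrismJS feeds text to the pattern is not covered here, and the sample text is made up):

```javascript
// Two lines of text; the comment is on the second line.
const twoLines = "X = 5\n* a full line comment";

const hatWithoutM = /^[*].*/;   // ^ matches only at the very start of the text
const hatWithM = /^[*].*/m;     // ^ matches at the start of every line

const foundWithoutM = hatWithoutM.test(twoLines);
const foundWithM = hatWithM.exec(twoLines);

console.log(foundWithoutM);                        // false
console.log(foundWithM ? foundWithM[0] : null);    // * a full line comment
```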
 

Joe Kuhn

Active member
I changed the comment part of the js file from this:
JavaScript:
    'comment': {
        pattern: /(?:['‘’]|REM\b).*/i,
        inside: {
            'keyword': /^REM/i
        }
to this:
JavaScript:
    'comment': {
        pattern: /(NOTE ).*|[*].*|&{2,}.*/
    }
And it's starting to work: from my phone right away, but from my laptop only after I cleared cached images and pages, as expected. I did have to take the hat character out to get it working even better. Now just the following lines are showing as comments from the '*' on when they shouldn't:

X = 5 * 5
X = 5 * 5 * 5

Oh yeah, plus the line extensions via the semicolon aren't working. See results here:

 

Joe Kuhn

Active member
The regex test site Jeremy references is awesome, and I discovered the editor in cPanel (GoDaddy) is syntax sensitive. Omg, very helpful.

And I find this fun, mainly because it's so doable.
 

Joe Kuhn

Active member
This one took care of the X = 5 * 5 lines that should not have been comments. I stole it from an older Fox coloring system:

^\s*\*

Here's what Regex101.com says about it:

^ asserts position at start of a line
\s* matches any whitespace character (equal to [\r\n\t\f\v ])
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\* matches the character * literally (case sensitive)

The whole pattern:

pattern: /(?:NOTE\b.*|^\s*\*|&{2,}).*/gm
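As a rough check outside of PrismJS, the pattern can be run over a few of the test lines from earlier in the thread, one line at a time. The helper name `commentPart` is made up for this sketch.

```javascript
const foxComment = /(?:NOTE\b.*|^\s*\*|&{2,}).*/gm;

// Return the part of a single line that the pattern treats as a comment, or null.
function commentPart(line) {
  const found = line.match(foxComment);
  return found ? found[0] : null;
}

const testLines = [
  "* another comment",
  "NOTE this is a comment",
  "X = 5 && inline comment",
  "X = 5 * 5",
  "NOTECOUNT = 5",
  "s = &t",
];

for (const line of testLines) {
  console.log(JSON.stringify(line), "->", JSON.stringify(commentPart(line)));
}
```

One thing this makes visible: the NOTE branch is not anchored to the start of the line, so the word NOTE later in a line would also start a match.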

And test page:

Realized that line continuation is a higher level problem to solve, so I'll come back to that later. Unless @Jeremy P knows the answer. All FoxPro lines can be continued with the semicolon, not just comments.
 

Joe Kuhn

Active member
Was lucky enough to find a database file with our FoxPro installation that lists all the tokens in the language. They are coded, so I tried to figure out what the codes mean in terms of coloring. They mean nothing consistent, so I put all the tokens in my keyword list except for some small ones at the top that Fox doesn't colorize. Now my coloring js file looks like this:

JavaScript:
Prism.languages['foxpro'] = {
    'comment': /(NOTE\b|^\s*\*|&{2,}).*/gm,
    'string': {
        pattern: /["'](?:["']{2}|[^"'])*["']C?/g,
        greedy: true
    },
    'keyword': /\b(Abs|Accelerate|Accept|Access|Aclass|Acopy|Acos)\b/i
};
// Leftover alias from the VB file: this points the 'vb' language at the FoxPro grammar.
Prism.languages.vb = Prism.languages['foxpro'];
I trimmed up the keyword object in order to be able to show it here. It goes on for 7 pages, but works fine.
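For a list that long, one option is to keep the tokens in a plain array and build the keyword pattern from it. This is only a sketch: `escapeRegex` and `buildKeywordPattern` are hypothetical helpers, and the array holds just the seven tokens shown above.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build /\b(?:word1|word2|...)\b/i from a list of tokens.
function buildKeywordPattern(words) {
  const alternatives = words.map(escapeRegex).join("|");
  return new RegExp("\\b(?:" + alternatives + ")\\b", "i");
}

const sampleWords = ["Abs", "Accelerate", "Accept", "Access", "Aclass", "Acopy", "Acos"];
const keywordPattern = buildKeywordPattern(sampleWords);

console.log(keywordPattern.test("x = ABS(-5)"));   // true
console.log(keywordPattern.test("Absolute = 1"));  // false
```

The resulting RegExp could be used as the 'keyword' value in place of the hand-written literal.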

I had to add the string object and make it greedy because keywords inside strings were being highlighted, which shouldn't happen. I stole the string object straight out of the Visual Basic script file. Greedy effectively gives it priority over other objects: per the Prism docs, a greedy pattern can override earlier matches of other patterns. I had to add the single quote because it's legal in Fox, and I used /g, the global regex option (don't stop after the first match), to make double quotes AND single quotes work.

Next is line continuation.
 

Joe Kuhn

Active member
If you delimit a string with double quotes, you can use single quotes within the string and vice versa, in Fox. I bet regex will handle that...
 

Joe Kuhn

Active member
Yes, this regex will handle double quote or single quote delimited strings as well as embedding one type within another. You cannot mix quote types around a string in Fox. It's one or the other, but embedding is allowed.

/(["](?:["]{2}|[^"])*["])|(['](?:[']{2}|[^'])*['])/g
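A small check of that string pattern in plain JavaScript, using made-up sample lines. `findStrings` is a hypothetical helper that returns every string literal the pattern finds in a line.

```javascript
const foxString = /(["](?:["]{2}|[^"])*["])|(['](?:[']{2}|[^'])*['])/g;

// Return all string literals found in a line (empty array when there are none).
function findStrings(line) {
  return line.match(foxString) || [];
}

console.log(findStrings(`t = "Test with an * asterisk"`));
console.log(findStrings(`msg = "it's fine" + 'say "hi"'`));
console.log(findStrings(`bad = "mixed'`));
```

The second line shows each quote type embedded inside the other, and the third shows that mixed delimiters around one string produce no match.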


Tests to date


Next is line continuation.
 

Joe Kuhn

Active member
PCRE for PHP and ECMAScript for JS. Judging by the above, this would be JS.
Any clue on how to do line continuation with a semicolon? Here's my analysis of the problem:

In FoxPro the semicolon is the line continuation character. Lines like the two below both appear green. All comments are colored green.

* this is a comment ;
continued here

Even though the second line doesn't begin with the asterisk to indicate it's a comment, it's colored green.

The semicolon can also be used like this:

x = 5 + ;
4 + ;
3

Which is equivalent to: x = 5 + 4 + 3

So, if a line ends in a semicolon, the next line is treated as though the \r (carriage return) and \n (line feed) aren't there.
White space is allowed after the semicolon and is ignored.
There can be no characters after the semicolon except for white space. (verified syntactically)
You can have only one semicolon at the end of a line.
The semicolon is used as a delimiter for certain other commands as in SET PATH TO "c:\" ; "c:\csidev"

MyString = "hello there ;
Joe"

The above yields a string 15 characters long. The cr/lf is not a part of the string.

Line continuation may be a part of the interpreter of the language, but syntax coloring honors it as well.
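The continuation rules above can be sketched as a plain text transformation. This only illustrates the rules, it is not how PrismJS or FoxPro actually implement them, and `joinContinuedLines` is a hypothetical helper.

```javascript
// Drop each line-ending ";" together with any blanks after it and the line break.
function joinContinuedLines(source) {
  return source.replace(/;[^\S\r\n]*(?:\r\n?|\n)/g, "");
}

const sum = joinContinuedLines("x = 5 + ;\n4 + ;\n3");
const text = joinContinuedLines('MyString = "hello there ;\r\nJoe"');

// A semicolon that is not at the end of a line is left alone.
const path = joinContinuedLines('SET PATH TO "c:\\" ; "c:\\csidev"');

console.log(sum);   // x = 5 + 4 + 3
console.log(text);  // MyString = "hello there Joe"
console.log(path);  // SET PATH TO "c:\" ; "c:\csidev"
```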
 

Joe Kuhn

Active member
Jeremy P said:
I'm not really familiar with the ins and outs of writing Prism language definitions, but generally speaking you can use multi-line mode and anchoring to accomplish something like that: https://regex101.com/r/9ToY9Z/1
Looking good. I believe I've got the pieces now to make it work with the other two kinds of comments (NOTE, &&) and perhaps regular lines of code, although I only have 3 components so far: comments, strings and keywords. More fun!
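One way the multi-line mode and anchoring idea might extend to all three comment types is sketched below. This pattern is an assumption written for this thread, not the contents of the regex101 link, and it has not been tried inside PrismJS. It anchors '*' and NOTE to the start of a line (since they cannot be used inline), lets '&&' start anywhere, and then accepts any number of lines that end in a semicolon before the final line.

```javascript
// Comment opener, then zero or more lines ending in ";" (plus optional blanks),
// then the rest of the last line.
const foxCommentContinued =
  /(?:^[ \t]*(?:\*|NOTE\b)|&&)(?:.*;[ \t]*(?:\r\n?|\n))*.*/gm;

const sampleSource = [
  "* this is a comment ;",
  "continued here",
  "X = 5 && inline comment ;",
  "   extended to a new line",
  "NOTECOUNT = 5",
  "X = 5 * 5",
].join("\n");

const continuedComments = sampleSource.match(foxCommentContinued) || [];
console.log(continuedComments);
```

On this sample it finds two comments, each spanning two lines, and leaves the last two lines alone.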
 

Jeremy P

Well-known member
Joe Kuhn said:
I looked quite a bit and could not find the pieces you've used. Sometimes their descriptions don't really match what they do and there are so many options to the regex code.
You can mouse over the components for an explanation, or see the explanation box to the side. As far as I can tell the descriptions there are accurate, but I can appreciate they may be confusing. If there are certain pieces you want explained further, let me know which and I can give it a shot. For what it's worth, there are likely other ways to accomplish the same thing.
 