regular expression for email block

tenants

I'm trying to grab this block:

Code:
    <dl class="ctrlUnit">
        <dt><label for="ctrl_{$fieldMap.email}">{xen:phrase email}:</label></dt>
        <dd>
            <input type="email" name="{$fieldMap.email}" value="{$fields.email}" />
        </dd>
    </dl>

using this regex:

Code:
#<dl[^>]*>
    [^<]*<dt[^>]*>[^/dt>]*/dt>
    [^<]*<dd[^>]*>
        [^<]*<input type="email"(.*?)</dl>#siu

I can't spot what I'm doing wrong yet. Any pointers? I don't think it's because I'm using a greedy *, since that would still pick up at least one match, and it's not the layout (I run it as a single line; I've only laid it out like this for clarity). It's been a while since I looked at regex, and it's always been one of those things that takes me a while to get back into :confused:

Once I've got this working, I still need to figure out how I'm going to avoid the core honeypots, but first things first.

This also didn't work:

Code:
#<dl[^>]*>  
    [^<]*<dt[^>]*>[^[/dt>]]*/dt>
    [^<]*<dd[^>]*>
        [^<]*<input type="email"(.*?)</dl>#siu

Neither did this:

Code:
#<dl[^>]*>   
    [^<]*<dt[^>]*>[^(/dt>)]*/dt>
    [^<]*<dd[^>]*>
        [^<]*<input type="email"(.*?)</dl>#siu
 
Okay, well, this one works (it picks up both the honeypot and the normal field, which is fine for now):

Code:
#<dl[^>]*>  
    [^<]*<dt[^>]*>[^<]*<label[^>]*>[^<]*</label>[^<]*</dt[^>]*>
    [^<]*<dd[^>]*>
        [^<]*<input type="email"(.*?)</dl>#siu

Still not sure what's wrong with [^(/dt>)]*

I get the feeling I'm speaking to myself anyway... ho hum
 
Okay... done it.

This gets just the core email field block, without the honeypots:

Code:
#<dl[^>]*>[^<]*<dt[^>]*>[^<]*<label[^>]*>[^<]*</label>[^<]*</dt[^>]*>[^<]*<dd[^>]*>[^<]*<input type="email" name="{\$fieldMap.email}"(.*?)</dl>#siu

.. oh well, useful note to myself
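For anyone who wants to check the difference between the last two patterns, here is a rough sketch in Python rather than PHP (assuming the two engines treat these constructs the same way). The honeypot markup and its `hp_email` / `ctrl_hp` names are made up for illustration; only the second block comes from the template above. I've also escaped the braces and the dot in the `name` attribute, which is slightly stricter than the pattern as posted.

```python
import re

# Hypothetical page: an invented honeypot block followed by the real email block.
page = (
    '<dl class="ctrlUnit">'
    '<dt><label for="ctrl_hp">Email:</label></dt>'
    '<dd><input type="email" name="hp_email" value="" /></dd>'
    '</dl>'
    '<dl class="ctrlUnit">'
    '<dt><label for="ctrl_{$fieldMap.email}">{xen:phrase email}:</label></dt>'
    '<dd><input type="email" name="{$fieldMap.email}" value="{$fields.email}" /></dd>'
    '</dl>'
)

# Shared part of both patterns, up to and including the input's type attribute.
prefix = (
    r'<dl[^>]*>[^<]*<dt[^>]*>[^<]*<label[^>]*>[^<]*</label>[^<]*</dt[^>]*>'
    r'[^<]*<dd[^>]*>[^<]*<input type="email"'
)

# Loose: any email input inside a dl/dt/label/dd block.
loose = re.compile(prefix + r'(.*?)</dl>', re.S | re.I)

# Strict: additionally requires the literal name="{$fieldMap.email}".
strict = re.compile(prefix + r' name="\{\$fieldMap\.email\}"(.*?)</dl>', re.S | re.I)

loose_hits = loose.findall(page)
strict_hits = strict.findall(page)

print(len(loose_hits), len(strict_hits))
```

Pinning the `name` attribute is what filters out the honeypot here, so this only helps as long as the honeypot fields use a different name.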

I remember a time when Old Chris Kenobi used to respond to these types of Dev questions before I had even finished typing them. The xenForce was strong in that one, but he was too easily drawn to the path of the DevSide.
 
Lord Chris ... an unexpected visit :p

I got there in the end. I'm still not sure what is wrong with [^(/dt>)]*. I thought it would (greedily) match everything that is not /dt>, but I got there the long way around.
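For what it's worth, the likely culprit is that a bracket expression is a set of single characters, not a sequence. [^(/dt>)] means "any one character that is not (, /, d, t, > or )", so the run stops at the first d or t inside the label tag (here the t in ctrl_), and the literal /dt> that must follow has nothing to match. Two usual ways to say "everything up to the closing tag" are a lazy .*? or a negative lookahead. A minimal sketch in Python rather than PHP, assuming the two engines treat these constructs the same way; the variable names are made up for illustration:

```python
import re

html = (
    '<dl class="ctrlUnit">'
    '<dt><label for="ctrl_{$fieldMap.email}">{xen:phrase email}:</label></dt>'
    '<dd><input type="email" name="{$fieldMap.email}" value="{$fields.email}" /></dd>'
    '</dl>'
)

# Negated class: excludes the single characters ( / d t > ), not the string "/dt>".
char_class = re.compile(r'<dt[^>]*>[^(/dt>)]*/dt>', re.S | re.I)

# Lazy quantifier: take as little as possible up to the first "</dt>".
lazy = re.compile(r'<dt[^>]*>.*?</dt>', re.S | re.I)

# Negative lookahead: consume a character only if "</dt>" does not start here.
tempered = re.compile(r'<dt[^>]*>(?:(?!</dt>).)*</dt>', re.S | re.I)

class_match = char_class.search(html)
lazy_match = lazy.search(html)
tempered_match = tempered.search(html)

print(class_match)
print(lazy_match.group(0))
print(tempered_match.group(0))
```

The lookahead form is the closest thing to what [^(/dt>)]* was meant to express; the lazy form is shorter and usually enough when the closing tag cannot be nested.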

Yeah, regex is not my favorite thing either
 