[OS X TeX] [OT] Need Perl Regex for...

Herbert Schulz herbs at wideopenwest.com
Sat Sep 20 22:41:15 CEST 2008


On Sep 20, 2008, at 1:17 PM, Alan Munn wrote:

> At 12:11 PM -0500 9/20/08, Herbert Schulz wrote:
>> Howdy,
>>
>> Suppose I have a sentence like
>>
>> Here are some words <fnameA>.<fnameB>.<ext> and more text afterward.
>>
>> where <fnameA> and <fnameB> may have spaces/tabs in them and you  
>> may assume <ext> has no spaces/tabs. Can one of you Perl experts  
>> out there (I know you're there!) give me a Perl regex that would  
>> pick out only the <fnameA>.<fnameB>.<ext> part of the line? Can it  
>> be generalized to include multiple <fname> sections separated by `.'?
>
> Unless the "Here are some words" part is some sort of fixed string  
> that you could identify, I don't think there's any way to  
> distinguish a word that is part of the "Here are some words" part  
> from a word that is part of <fnameA>,  if <fnameA> is allowed to  
> contain spaces.
>
> I.e. if fnameA = My file.ext
>
> how can you tell whether "My" in  "Here are some words My file.ext "  
> belongs to the filename or not?
>
> If spaces are prohibited, then the regex
>
> (?:[\S]*?\.)+[\w]{3}
>
> will pick out sequences of <fnameA>.<fnameB>.ext for arbitrary  
> numbers of <fname> assuming ext is always 3 characters.  But I don't  
> see a way around the spaces problem. (But I'm prepared to be amazed  
> by someone else's answer!)
>
> Alan
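
A minimal sketch of the no-spaces case Alan describes, written with Python's `re` module (these constructs use the same syntax as Perl) and with the non-capturing group spelled `(?:...)`. The helper name `find_dotted_name` is made up for illustration, and the three-character extension is Alan's stated assumption, not a general rule.

```python
import re

# Alan's pattern with a non-capturing group.
# Assumes no whitespace inside any <fname> and a three-character extension.
NO_SPACE_RE = re.compile(r"(?:\S*?\.)+\w{3}")

def find_dotted_name(line):
    """Return the first whitespace-free dotted name in line, or None."""
    m = NO_SPACE_RE.search(line)
    return m.group(0) if m else None

# Works when the name parts contain no spaces.
print(find_dotted_name("Here are some words fnameA.fnameB.ext and more text afterward."))
# prints: fnameA.fnameB.ext

# With a space in the name, only the last word before the dot is picked up.
print(find_dotted_name("Here are some words My file.ext"))
# prints: file.ext
```

The second call shows the ambiguity Alan points out: without a known prefix, the pattern cannot tell that "My" belongs to the file name.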


Howdy,

You can assume that the first part is fixed, so that isn't the real  
problem. Also, the second part, with its leading `.' is optional and  
may repeat: e.g., <fnameA>.<ext>, <fnameA>.<fnameB>.<ext>,  
<fnameA>.<fnameB>.<fnameC>.<ext>, etc. The final part is NOT fixed and  
may not exist in some situations.
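
One possible way to handle the clarified case, sketched in Python's `re` module (the pattern itself should carry over to Perl unchanged). It anchors on the fixed lead-in text and lets a greedy `.+` back up to the last `.` that is immediately followed by non-whitespace. The names `PREFIX` and `extract_filename` are hypothetical, and the sketch assumes the trailing text never contains a `.` directly followed by a non-space character (a sentence-ending period is fine; something like "v1.2" in the trailing text would be misread as part of the name).

```python
import re

# Assumed fixed lead-in text; substitute whatever the real prefix is.
PREFIX = "Here are some words "

# Greedy .+ grabs as much as it can, then backtracks to the last "."
# that is followed by at least one non-whitespace character (the <ext>).
FILE_RE = re.compile(re.escape(PREFIX) + r"(.+\.\S+)")

def extract_filename(line):
    """Return <fnameA>[.<fnameB>...].<ext> after the fixed prefix, or None."""
    m = FILE_RE.match(line)
    return m.group(1) if m else None

print(extract_filename("Here are some words name A.name B.tex and more text afterward."))
# prints: name A.name B.tex

print(extract_filename("Here are some words my file.tex"))
# prints: my file.tex
```

Because the repeated `.<fname>` sections are simply absorbed by `.+`, the same pattern covers one, two, or more name parts, and it also works when nothing follows the extension.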

Good Luck,

Herb Schulz
(herbs at wideopenwest dot com)





