[OS X TeX] [OT] Need Perl Regex for...
Alan Munn
amunn at msu.edu
Sat Sep 20 23:01:35 CEST 2008
At 3:41 PM -0500 9/20/08, Herbert Schulz wrote:
>On Sep 20, 2008, at 1:17 PM, Alan Munn wrote:
>
>>At 12:11 PM -0500 9/20/08, Herbert Schulz wrote:
>>>Howdy,
>>>
>>>Suppose I have a sentence like
>>>
>>>Here are some words <fnameA>.<fnameB>.<ext> and more text afterward.
>>>
>>>where <fnameA> and <fnameB> may have spaces/tabs in them and you
>>>may assume <ext> has no spaces/tabs. Can one of you Perl experts
>>>out there (I know you're there!) give me a Perl regex that would
>>>pick out only the <fnameA>.<fnameB>.<ext> part of the line? Can it
>>>be generalized to include multiple <fname> sections separated by
>>>`.'?
>>
>>Unless the "Here are some words" part is some sort of fixed string
>>that you could identify, I don't think there's any way to
>>distinguish a word that is part of the "Here are some words" part
>>from a word that is part of <fnameA>, if <fnameA> is allowed to
>>contain spaces.
>>
>>I.e. if fnameA = My file.ext
>>
>>how can you tell whether "My" in "Here are some words My file.ext"
>>belongs to the filename or not?
>>
>>If spaces are prohibited, then the regex
>>
>>(?:\S*?\.)+\w{3}
>>
>>will pick out sequences of <fnameA>.<fnameB>.ext for arbitrary
>>numbers of <fname> assuming ext is always 3 characters. But I
>>don't see a way around the spaces problem. (But I'm prepared to be
>>amazed by someone else's answer!)
>>
>>Alan
>
>
>Howdy,
>
>You can assume that the first part is fixed, so that isn't the real
>problem. Also, the second part, with its leading `.', is optional
>and may repeat: e.g., <fnameA>.<ext>, <fnameA>.<fnameB>.<ext>,
><fnameA>.<fnameB>.<fnameC>.<ext>, etc. The final part is NOT fixed
>and may not exist in some situations.
Now it looks like you've changed your original request. But assuming
I understand what you want, the following will do, where DELIM is
whatever the fixed first part is. After evaluating the regex, \1
($1 in Perl code) will contain a string listing filenames of the
sort FnameA.FnameB.ext, with any number of Fname parts separated by
`.', and with each filename separated from the next by a comma, a
comma and a space, or even just a space. Filenames themselves can
have spaces.

DELIM ((?:(?:.*?\.)+\w{3}[, ]*)+)
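
For illustration, here is a minimal sketch of how the pattern might be
used from Perl, written here with non-capturing (?:...) groups. The
helper name extract_filenames and the sample line are made up for the
example; the delimiter is taken from the sentence in the original
question. The trailing-separator cleanup is an addition, since [, ]*
will also swallow the space before any following text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured filename list as one string,
# or nothing if the delimiter plus at least one filename is not found.
sub extract_filenames {
    my ($delim, $line) = @_;

    # \Q...\E quotes any regex metacharacters in the fixed first part.
    if ($line =~ /\Q$delim\E ((?:(?:.*?\.)+\w{3}[, ]*)+)/) {
        my $list = $1;          # copy $1 before running another regex
        $list =~ s/[, ]+$//;    # drop separators picked up by [, ]*
        return $list;
    }
    return;
}

my $delim = 'Here are some words';
my $line  = "$delim My file.part two.tex and more text afterward.";
my $found = extract_filenames($delim, $line);
print "$found\n";    # prints: My file.part two.tex
```

Note that the pattern assumes a three-character extension and cannot
tell a filename from ordinary text: if the words after the filenames
happen to contain a `.' followed by three word characters, the match
will extend up to that point.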
>
>Good Luck,
Sometimes your automatic text produces funny results. I'm not sure
I need the luck here!
Alan