Help with a regular expression???
Kevin O'Gorman
kogorman at gmail.com
Wed Jan 27 14:06:52 UTC 2010
On Wed, Jan 27, 2010 at 5:36 AM, Ray Parrish <crp at cmc.net> wrote:
> Hello,
>
> I am working on some code to take a textually formatted list, and change
> it into an HTML formatted list. I have input that looks like the following
> -
>
> - visual wxWindows frame design,
> - object inspector and explorer,
> - syntax highlighting editor with code completion, call tips and code
>     browsing for Python code,
> - syntax highlighting editor for C, C++, HTML, XML, config files (INI
>     style),
> - documentation generation,
> - an integrated Python debugger,
> - integrated help,
> - a Python Shell,
> - an explorer able to browse, open/edit, inspect and interact with
>     various data sources including files, CVS, Zope, FTP, DAV and SSH,
> - an UML view generator.
>
> What I need to do is replace the occurrences of 4 spaces followed by any
> character, with an underline character, a space character, and the
> original fifth character on the line to indicate line continuation to a
> following routine, which will concatenate the pieces on the continued
> lines onto the previous line segments.
>
> The part I do not know how to do is preserving the fifth character in an
> assignment like the following [the entire data section above will be in
> one variable]
>
> Data=${Data/    [a-z,A-Z]/_ }
>
> As you can likely see, that code line will not work yet, as I do not
> know how to specify that whatever character gets found after the fourth
> space is to be part of the replace term. I'm not even sure that I have
> properly specified the search term to match the single fifth character
> either.
>
It looks like you are trying to use bash patterns. Those are shell globs,
not regular expressions.
Be that as it may, I would suggest a Perl filter, which is a one-liner
suitable for a pipeline:
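perl -p -e 's/    ([^ ])/_ $1/g;'
Translation:
-p: loop over the input line by line, printing each line after the
statement has run
-e: the statement to run in that loop follows
s/.../.../g: do a substitution on every match (g = multiple times on a
line); the search term starts with four literal spaces
( ): remember whatever matches inside
[^ ]: a character class consisting of any one non-space character.
Note that it does match a newline.
$1: the first remembered thing (\1 also works in the replacement, but $1
is the preferred form)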
Hope this helps.
++ kevin
--
Kevin O'Gorman, PhD