Difference between revisions of "Perl regex"

from HTYP, the free directory anyone can edit if they can prove to me that they're not a spambot

Revision as of 18:21, 12 April 2006

This article explains regular expressions in terms understandable to mere mortals, and also how to use them in Perl.

Related Articles

  • regex: manpage documentation

Details

Special characters in regex:

  • . = any character
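  • * = 0 or more of the previous character (or group)
  • ^ = the match must begin at the start of the string, or at the start of a line when the /m modifier is used (except that [^...] means "not these characters")
  • $ = the match must end at the end of the string (or just before a final newline), or at the end of a line when the /m modifier is used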
  • [] = list of characters which can satisfy the match at this position
  • {} = # of repetitions of previous character:
    • {x} -> exactly x repetitions
    • {x,y} -> minimum of x repetitions, maximum of y repetitions
  • | = alternatives
  • + = 1 or more of previous character
  • ? after +, *, or {} indicates non-greedy behavior, i.e. match the fewest characters, not the most
  • a-b = range of characters from a to b, valid only inside []; e.g. "[t-w]" matches any one of t, u, v, w in that position
  • (?=...) = lookahead: a(?=b) matches "a, but only if it's followed by a b"; the a becomes part of the matched sequence, but the b does not
  • (?<=...) = lookbehind: (?<=a)b matches "b, but only if it's preceded by an a"; the b becomes part of the matched sequence, but the a does not
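
As a rough illustration of the special characters above, the following sketch runs a few patterns against made-up sample strings. The variable names and sample text are assumptions chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only to exercise each construct.
my $text  = 'aaa <b>one</b> <b>two</b>';
my $price = 'cost: 42 USD';

# Greedy: .* takes as much as it can, so the match runs to the LAST </b>.
my ($greedy) = $text =~ m{<b>(.*)</b>};

# Non-greedy: .*? takes as little as it can, so it stops at the FIRST </b>.
my ($lazy) = $text =~ m{<b>(.*?)</b>};

# ^ anchors at the start; {2,3} allows two or three repetitions of "a".
my ($run) = $text =~ /^(a{2,3})/;

# A range inside []: the first character that is one of t, u, v, w.
my ($letter) = $text =~ /([t-w])/;

# Lookahead: digits only if followed by " USD"; " USD" is not captured or consumed.
my ($number) = $price =~ /(\d+)(?= USD)/;

# Lookbehind: digits only if preceded by "cost: ".
my ($after) = $price =~ /(?<=cost: )(\d+)/;

print "$_\n" for $greedy, $lazy, $run, $letter, $number, $after;
```

In list context a match returns its captured groups, which is why each result is assigned inside parentheses.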

Operators used to invoke regex:

  • =~ returns TRUE if pattern matches
  • !~ returns TRUE if pattern does not match (i.e. FALSE if it matches)
  • s/pattern/replacement/gi; replaces pattern with replacement
    • g (global) means repeat the pattern search until there are no more matches
    • i (insensitive) means alphabetic matches are checked case-insensitively
  • y/searchlist/replacelist/d: replaces each character found in searchlist with the corresponding character in replacelist
    • d deletes any characters found in searchlist that have no corresponding character in replacelist
  • tr/ is the same as y/
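
The operators above can be tried out with a short script along these lines. This is only a sketch; the sample sentence and variable names are invented for illustration.

```perl
use strict;
use warnings;

my $line = 'The Cat sat on the cat mat';   # hypothetical sample text

# =~ is true when the pattern matches; !~ is true when it does not.
my $has_cat = ($line =~ /cat/) ? 1 : 0;
my $no_dog  = ($line !~ /dog/) ? 1 : 0;

# s///gi replaces every match, ignoring case.
# In scalar context it returns the number of substitutions made.
my $dogs  = $line;
my $count = ($dogs =~ s/cat/dog/gi);

# tr/// (same as y///) maps each character to its counterpart in the second list.
my $upper = $line;
$upper =~ tr/a-z/A-Z/;

# With an empty replacelist, d deletes every character in the searchlist.
my $no_vowels = $line;
$no_vowels =~ tr/aeiou//d;

print "$has_cat $no_dog $count\n$dogs\n$upper\n$no_vowels\n";
```

Each result is built from a copy of $line because s/// and tr/// modify the variable they are bound to.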

Examples

These examples have been tested only briefly.

  • Replace "thingy" with "stuffs" in $string:
    • $string =~ s/thingy/stuffs/;
  • Keep only the part of $string before the final "/" (using "|" as the delimiter instead of "/"):
    • $string =~ s|(.*)/[^/]*|$1|;
  • ...after the final "/":
    • $string =~ s|^.*/([^/]*)$|$1|;
  • ...before the final "-":
    • $string =~ s|(.*)-[^-]*|$1|;
  • ...before the final ".":
    • $string =~ s|(.*)\.[^\.]*|$1|;
  • ...after the final "." (both of these return the full string if no "." is found):
    • $string =~ s|^.+\.(.+$)|$1|;
    • $string =~ s|^.*\.([^\.]*)$|$1|;
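
To see what these substitutions do in practice, the sketch below applies them to copies of a made-up path. The path and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $path = '/home/user/my-file.tar.gz';   # hypothetical sample path

# (my $copy = $path) =~ s|...|...| copies $path, then edits only the copy.
(my $dir         = $path) =~ s|(.*)/[^/]*|$1|;        # before the final "/"
(my $base        = $path) =~ s|^.*/([^/]*)$|$1|;      # after the final "/"
(my $before_dash = $path) =~ s|(.*)-[^-]*|$1|;        # before the final "-"
(my $before_dot  = $path) =~ s|(.*)\.[^\.]*|$1|;      # before the final "."
(my $ext         = $path) =~ s|^.*\.([^\.]*)$|$1|;    # after the final "."

# With no "." present the pattern fails and the string is left unchanged.
(my $unchanged = 'README') =~ s|^.*\.([^\.]*)$|$1|;

print "$_\n" for $dir, $base, $before_dash, $before_dot, $ext, $unchanged;
```

The greedy .* is what makes each pattern find the final delimiter rather than the first: it grabs the whole string, then backs off only as far as needed for the rest of the pattern to match.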