Python re Module Explained

Date: 2018-06-28  Category: Cross-site data

2. The Python re module

2.0 re.flags

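re.I Ignore case.
re.L Make the special sequences \w, \W, \b, \B, \s, \S depend on the current locale.
re.M Multi-line mode: ^ and $ also match at the start and end of each line.
re.S Make . match any character, including a newline (by default . does not match a newline).
re.U Make the special sequences \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character properties database.
re.X Verbose mode: for readability, whitespace in the pattern and anything after a # are ignored.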
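To make the flags above more concrete, here is a small sketch. The sample strings and variable names are made up for illustration; only the flags themselves come from the list above.

```python
import re

text = "Hello\nworld"

# re.I: case-insensitive matching
ignore_case = re.findall(r"hello", text, re.I)

# re.M: ^ also matches at the start of every line
line_starts = re.findall(r"^\w+", text, re.M)

# Without re.S, "." stops at the newline; with re.S it crosses it
dot_default = re.match(r".+", text).group()
dot_all = re.match(r".+", text, re.S).group()

# re.X: whitespace and "#" comments inside the pattern are ignored
verbose = re.compile(r"""
    (\d{4})  # year
    -
    (\d{2})  # month
""", re.X)
year_month = verbose.match("2018-06").groups()

print(ignore_case)        # ['Hello']
print(line_starts)        # ['Hello', 'world']
print(repr(dot_default))  # 'Hello'
print(repr(dot_all))      # 'Hello\nworld'
print(year_month)         # ('2018', '06')
```

Flags can be combined with |, for example re.M | re.I.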

2.1 re.match()

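re.match tries to match a pattern at the start of the string; if the pattern does not match at the start, match() returns None.

re.match(pattern, string, flags=0)

On success, re.match returns a match object; otherwise it returns None.

Use the match object's group(num) or groups() methods to retrieve the matched text.

Match object method	Description
group(num=0)	The string matched by the whole expression. group() also accepts several group numbers at once, in which case it returns a tuple with the values of those groups.
groups()	Returns a tuple containing the strings of all subgroups, from group 1 up to the last group.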
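The difference between group() with one number, group() with several numbers, and groups() is easiest to see side by side. This is a minimal sketch with an invented input string:

```python
import re

m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")

whole = m.group()        # same as m.group(0): the entire match
first = m.group(1)       # a single group number returns a string
both = m.group(1, 2)     # several group numbers return a tuple
all_groups = m.groups()  # every subgroup, starting from group 1

print(whole)       # Isaac Newton
print(first)       # Isaac
print(both)        # ('Isaac', 'Newton')
print(all_groups)  # ('Isaac', 'Newton')
```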

Example

   #!/usr/bin/env python3
   import re

   line = "Cats are smarter than dogs"

   # (.*) is greedy, (.*?) is lazy; re.M | re.I combines two flags
   matchObj = re.match(r'(.*) are (.*?) .*', line, re.M | re.I)

   if matchObj:
       print("matchObj.group() : ", matchObj.group())
       print("matchObj.group(1) : ", matchObj.group(1))
       print("matchObj.group(2) : ", matchObj.group(2))
   else:
       print("No match!!")

The example above produces the following output:

matchObj.group() :  Cats are smarter than dogs
matchObj.group(1) :  Cats
matchObj.group(2) :  smarter

2.2 re.search()

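re.search scans the whole string and returns the first successful match.

re.search(pattern, string, flags=0)

On success, re.search returns a match object; otherwise it returns None.

Use the match object's group(num) or groups() methods to retrieve the matched text.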
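A quick way to see how re.search differs from re.match is to run both on the same string. The string below is just an illustrative value:

```python
import re

text = "www.example.com"

# match() only succeeds if the pattern matches at position 0
at_start = re.match(r"www", text)
not_at_start = re.match(r"com", text)

# search() scans forward and returns the first match anywhere
anywhere = re.search(r"com", text)

print(at_start.span())   # (0, 3)
print(not_at_start)      # None
print(anywhere.span())   # (12, 15)
```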

2.3 re.sub()

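re.sub(pattern, repl, string, count=0, flags=0)  # replaces matches of the pattern in a string

Parameters:

pattern : the regular expression pattern string.
repl : the replacement string; it can also be a function.
string : the original string to search and replace in.
count : the maximum number of replacements to make; the default 0 replaces every match.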
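Before moving on, here is a small sketch of the plain-string form of repl and of the count parameter. The phone-number string is an invented sample:

```python
import re

phone = "2004-959-559 # a phone number"

# Remove the trailing comment
no_comment = re.sub(r"\s*#.*$", "", phone)

# Remove every non-digit character (count defaults to 0: replace all)
digits_only = re.sub(r"\D", "", no_comment)

# count=1 stops after the first replacement
first_dash_only = re.sub(r"-", "", no_comment, count=1)

print(no_comment)       # 2004-959-559
print(digits_only)      # 2004959559
print(first_dash_only)  # 2004959-559
```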

An example where repl is a function:

Example

   #!/usr/bin/env python3
   # -*- coding: UTF-8 -*-

   import re

   # Multiply each matched number by 2
   def double(matched):
       value = int(matched.group('value'))
       return str(value * 2)

   s = 'A23G4HFD567'

   # The named group "value" is what double() reads from the match object
   print(re.sub(r'(?P<value>\d+)', double, s))

Output:

A46G8HFD1134

2.4 re.split()

re.split(pattern, string, maxsplit=0, flags=0)  # splits the string at every match of the pattern and returns a list

maxsplit : the maximum number of splits. maxsplit=1 splits once; the default 0 means no limit.
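A short sketch of re.split with and without maxsplit, using an invented input string. It also shows that wrapping the pattern in a capturing group keeps the separators in the result:

```python
import re

text = "a1b22c333d"

parts = re.split(r"\d+", text)             # split at every run of digits
once = re.split(r"\d+", text, maxsplit=1)  # split only once
with_seps = re.split(r"(\d+)", text)       # capturing group keeps separators

print(parts)      # ['a', 'b', 'c', 'd']
print(once)       # ['a', 'b22c333d']
print(with_seps)  # ['a', '1', 'b', '22', 'c', '333', 'd']
```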

2.5 re.finditer()

re.finditer(pattern, string, flags=0)  # like findall, finds every substring that matches the pattern, but returns the results as an iterator of match objects
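Because finditer yields match objects rather than strings, each result carries its position as well as its text. A minimal sketch with a made-up input:

```python
import re

text = "a1b22c333"

# Each item from finditer is a match object
found = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]

print(found)  # [('1', (1, 2)), ('22', (3, 5)), ('333', (6, 9))]
```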

2.6 re.findall()

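re.findall(pattern, string, flags=0)  # finds every substring that matches the pattern and returns them as a list; returns an empty list if nothing matches

A compiled pattern offers the same operation as pattern.findall(string[, pos[, endpos]]), which additionally accepts a search range.

Parameters:

string : the string to search.
pos : optional; the index at which the search starts, default 0 (pattern.findall only).
endpos : optional; the index at which the search stops, default is the length of the string (pattern.findall only).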
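The following sketch shows the module-level function, the compiled-pattern method with pos and endpos, and what findall returns when the pattern has several groups. All sample strings are invented:

```python
import re

text = "item 123 cost 456"

# Module-level function: searches the whole string
all_numbers = re.findall(r"\d+", text)

# Compiled pattern method: pos and endpos limit the search to text[0:8]
pattern = re.compile(r"\d+")
windowed = pattern.findall(text, 0, 8)

# With more than one group, findall returns a list of tuples
pairs = re.findall(r"(\w+)=(\d+)", "a=1, b=22")

print(all_numbers)  # ['123', '456']
print(windowed)     # ['123']
print(pairs)        # [('a', '1'), ('b', '22')]
```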

2.7 re.compile()

re.compile(pattern, flags=0)  # compiles a regular expression into a Pattern object, for use with methods such as match() and search()

prog = re.compile(pattern)
result = prog.match(string)

is equivalent to

result = re.match(pattern, string)

Compiling is useful when the same pattern is reused several times.
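As a sketch of that reuse, the pattern below is compiled once and then applied to several strings in a loop. The list of lines is invented for the example:

```python
import re

# Compile once, with flags, and reuse the Pattern object
word = re.compile(r"[a-z]+", re.I)

lines = ["Hello 42", "123", "re Module"]

first_words = []
for line in lines:
    m = word.search(line)
    # search() returns None when a line has no match
    first_words.append(m.group() if m else None)

print(first_words)  # ['Hello', None, 're']
```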

2.8 re.fullmatch()

re.fullmatch(pattern, string, flags=0)

If the whole string matches the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
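A small sketch contrasting match and fullmatch, including the zero-length case mentioned above. The inputs are illustrative:

```python
import re

# match() accepts a matching prefix
prefix = re.match(r"\d+", "123abc")

# fullmatch() requires the entire string to match
partial = re.fullmatch(r"\d+", "123abc")
full = re.fullmatch(r"\d+", "123")

# A zero-length match is still a match object, not None
empty = re.fullmatch(r"\d*", "")

print(prefix.group())       # 123
print(partial)              # None
print(full.group())         # 123
print(repr(empty.group()))  # ''
```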

3. Object types

3.1 Pattern objects -- produced by re.compile()

Methods and attributes supported by a pattern:

pattern.search(string[,pos[,endpos]])

pattern.match(string[,pos[,endpos]])

pattern.fullmatch(string[,pos[,endpos]])

pattern.split(string, maxsplit=0)

...

pattern.groups

pattern.pattern

pattern.flags
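A brief sketch of these attributes and of the pos argument, using an invented pattern and string:

```python
import re

pattern = re.compile(r"(\d+)-(\d+)", re.I)

group_count = pattern.groups              # number of capturing groups
source = pattern.pattern                  # the original pattern string
has_ignorecase = bool(pattern.flags & re.I)

# pos=8 makes the search start at index 8, skipping "10-20"
later = pattern.search("id 10-20 and 30-40", 8)

print(group_count)     # 2
print(source)          # (\d+)-(\d+)
print(has_ignorecase)  # True
print(later.group())   # 30-40
print(later.span())    # (13, 18)
```

pattern.flags is tested with & because it may contain more bits than the ones passed to re.compile().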

3.2 Match objects -- produced by match() and similar methods

match.group([groupname1,...])

match.groups()

match.groupdict()

match.start([group])

match.end([group])

match.span([group])

.....
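The match object methods listed above can be tried on a single match. This is a minimal sketch with named groups; the group names year and month and the input string are chosen only for illustration:

```python
import re

m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Date: 2018-06-28")

year = m.group("year")         # groups can be fetched by name or number
groups = m.groups()            # all subgroups as a tuple
named = m.groupdict()          # named groups as a dict
start, end = m.start(), m.end()
month_span = m.span("month")   # (start, end) of one group

print(year)        # 2018
print(groups)      # ('2018', '06')
print(named)       # {'year': '2018', 'month': '06'}
print(start, end)  # 6 13
print(month_span)  # (11, 13)
```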


Unless otherwise stated, all articles on this site are original. When reposting, please credit the source with a hyperlink: SmartCat's Blog

Tags: Python
