Module Pcre
Perl Compatible Regular Expressions for OCaml
Version 7.5.0
Exceptions
type error
=
|
Partial
String only matched the pattern partially
|
BadPartial
Pattern contains items that cannot be used together with partial matching.
|
BadPattern of string * int
BadPattern (msg, pos)
The regular expression is malformed. The reason is in msg, the position of the error in the pattern is in pos.
|
BadUTF8
UTF8 string being matched is invalid
|
BadUTF8Offset
Gets raised when a UTF8 string being matched with offset is invalid.
|
MatchLimit
Maximum allowed number of match attempts with backtracking or recursion is reached during matching. Any function that calls the matching engine may raise it.
|
RecursionLimit
|
WorkspaceSize
Raised by pcre_dfa_exec when the provided workspace array is too small. See the documentation on pcre_dfa_exec for details on workspace array sizing.
|
InternalError of string
InternalError msg
The C library exhibits unknown/undefined behaviour. The reason is in msg.
exception
Error of error
Exception indicating PCRE errors.
exception
Regexp_or of string * error
Regexp_or (pat, error)
Raised by regexp_or for sub-pattern pat if it failed to compile.
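The exceptions above can be caught like any other OCaml exception. The following is a minimal sketch, not part of the library: compile_checked and safe_pmatch are hypothetical helper names, and the code assumes an OCaml version that supports "match ... with exception" (4.02 or later).

```ocaml
(* Compile [pat]; report a malformed pattern as an [Error] string instead of
   letting the exception escape. [compile_checked] is a hypothetical helper. *)
let compile_checked (pat : string) : (Pcre.regexp, string) result =
  match Pcre.regexp pat with
  | rex -> Ok rex
  | exception Pcre.Error (Pcre.BadPattern (msg, pos)) ->
      Error (Printf.sprintf "bad pattern at offset %d: %s" pos msg)

(* Run a match; turn engine errors such as [MatchLimit] into [None].
   [safe_pmatch] is a hypothetical helper. *)
let safe_pmatch (rex : Pcre.regexp) (subj : string) : bool option =
  match Pcre.pmatch ~rex subj with
  | matched -> Some matched
  | exception Pcre.Error _ -> None
```

Returning a result or option keeps the error handling at the call site explicit; whether that is preferable to letting the exception propagate depends on the application.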
Compilation and runtime flags and their conversion functions
and cflag
=[
]
Compilation flags
val cflags : cflag list -> icflag
cflags cflag_list
converts a list of compilation flags to their internal representation.
val cflag_list : icflag -> cflag list
cflag_list cflags
converts internal representation of compilation flags to a list.
type rflag
=[
|
`ANCHORED
Treats pattern as if it were anchored
|
`NOTBOL
Beginning of string is not treated as beginning of line
|
`NOTEOL
End of string is not treated as end of line
|
`NOTEMPTY
Empty strings are not considered to be a valid match
|
`PARTIAL
Turns on partial matching
|
`DFA_RESTART
Causes matching to proceed on the assumption that the subject string continues one that was partially matched previously using the same int-array working set. May only be used with pcre_dfa_exec or unsafe_pcre_dfa_exec, and should always be paired with `PARTIAL.
]
Runtime flags
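As an illustration of how runtime flags change a match, here is a small sketch that passes them through the ?flags argument of pmatch (documented further below). The patterns and subjects are made up for the example.

```ocaml
(* Without flags the engine may find "b" anywhere in the subject. *)
let unanchored = Pcre.pmatch ~pat:"b" "ab"

(* `ANCHORED forces the match to start at the starting position (0 here). *)
let anchored = Pcre.pmatch ~flags:[`ANCHORED] ~pat:"b" "ab"

(* "a*" can match the empty string; `NOTEMPTY rejects such matches. *)
let empty_ok = Pcre.pmatch ~pat:"a*" "bbb"
let empty_rejected = Pcre.pmatch ~flags:[`NOTEMPTY] ~pat:"a*" "bbb"
```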
Information on the PCRE-configuration (build-time options)
Information on patterns
type firstbyte_info
=[
|
`Char of char
Fixed first character
|
`Start_only
Pattern matches at beginning and end of newlines
|
`ANCHORED
Pattern is anchored
]
Information on matching of "first chars" in patterns
type study_stat
=[
|
`Not_studied
Pattern has not yet been studied
|
`Studied
Pattern has been studied successfully
|
`Optimal
Pattern could not be improved by studying
]
Information on the study status of patterns
val size : regexp -> int
size regexp
- returns
memory size of regexp.
val studysize : regexp -> int
studysize regexp
- returns
memory size of study information of regexp.
val capturecount : regexp -> int
capturecount regexp
- returns
number of capturing subpatterns in regexp.
val backrefmax : regexp -> int
backrefmax regexp
- returns
number of highest backreference in regexp.
val namecount : regexp -> int
namecount regexp
- returns
number of named subpatterns in regexp.
val nameentrysize : regexp -> int
nameentrysize regexp
- returns
size of longest name of named subpatterns in regexp + 3.
val names : regexp -> string array
names regexp
- returns
array of names of named substrings in regexp.
val firstbyte : regexp -> firstbyte_info
firstbyte regexp
- returns
firstbyte info on regexp.
val firsttable : regexp -> string option
firsttable regexp
- returns
some 256-bit (32-byte) fixed set table in form of a string for regexp if available, None otherwise.
val lastliteral : regexp -> char option
lastliteral regexp
- returns
some last matching character of regexp if available, None otherwise.
val study_stat : regexp -> study_stat
study_stat regexp
- returns
study status of regexp.
val get_stringnumber : regexp -> string -> int
get_stringnumber rex name
- returns
the index of the named substring name in regular expression rex. This index can then be used with get_substring.
- raises Invalid_argument
if there is no such named substring.
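The pattern information functions can be combined as in the following sketch. The pattern, the subject string and all top-level names are illustrative only; exec and get_substring are documented further below.

```ocaml
(* A pattern with two named capturing groups. *)
let date_rex = Pcre.regexp "(?P<year>\\d{4})-(?P<month>\\d{2})"

let n_groups = Pcre.capturecount date_rex   (* capturing subpatterns *)
let n_named = Pcre.namecount date_rex       (* named subpatterns *)
let group_names = Pcre.names date_rex       (* names of the groups *)

(* Translate a group name into the index expected by get_substring. *)
let year_index = Pcre.get_stringnumber date_rex "year"

let year =
  let subs = Pcre.exec ~rex:date_rex "released 2024-03" in
  Pcre.get_substring subs year_index
```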
val get_match_limit : regexp -> int option
get_match_limit rex
- returns
some match limit of regular expression rex, or None.
val get_match_limit_recursion : regexp -> int option
get_match_limit_recursion rex
- returns
some recursion match limit of regular expression rex, or None.
Compilation of patterns
val maketables : unit -> chtables
Generates new set of char tables for the current locale.
val regexp : ?study:bool -> ?jit_compile:bool -> ?limit:int -> ?limit_recursion:int -> ?iflags:icflag -> ?flags:cflag list -> ?chtables:chtables -> string -> regexp
regexp ?study ?jit_compile ?limit ?limit_recursion ?iflags ?flags ?chtables pattern
compiles pattern with flags when given, with iflags otherwise, and with char tables chtables. If study is true, then the resulting regular expression will be studied. If jit_compile is true, studying will also perform JIT-compilation of the pattern. If limit is specified, this sets a limit to the amount of recursion and backtracking (only lower than the builtin default!). If this limit is exceeded, MatchLimit will be raised during matching.
For detailed documentation on how you can specify PERL-style regular expressions (= patterns), please consult the PCRE documentation ("man pcrepattern") or the PERL manuals (see http://www.perl.com).
- parameter study
default = true
- parameter jit_compile
default = false
- parameter limit
default = no extra limit other than default
- parameter limit_recursion
default = no extra limit_recursion other than default
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter chtables
default = builtin char tables
- returns
the regular expression.
val regexp_or : ?study:bool -> ?jit_compile:bool -> ?limit:int -> ?limit_recursion:int -> ?iflags:icflag -> ?flags:cflag list -> ?chtables:chtables -> string list -> regexp
regexp_or ?study ?jit_compile ?limit ?limit_recursion ?iflags ?flags ?chtables patterns
like regexp, but combines patterns as alternatives (or-patterns) into one regular expression.
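A short sketch of regexp_or, including the Regexp_or exception that identifies the offending alternative. The pattern lists and names are made up for illustration.

```ocaml
(* Combine two alternatives into one regular expression. *)
let pet_rex = Pcre.regexp_or ["cat"; "dog"]

let has_pet = Pcre.pmatch ~rex:pet_rex "hotdog stand"
let no_pet = Pcre.pmatch ~rex:pet_rex "bird cage"

(* A malformed alternative is reported together with its sub-pattern. *)
let bad_alternative =
  match Pcre.regexp_or ["cat"; "("] with
  | _ -> None
  | exception Pcre.Regexp_or (pat, _error) -> Some pat
```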
Subpattern extraction
val get_subject : substrings -> string
get_subject substrings
- returns
the subject string of substrings.
val num_of_subs : substrings -> int
num_of_subs substrings
- returns
number of strings in substrings (whole match inclusive).
val get_substring : substrings -> int -> string
get_substring substrings n
- returns
the n-th substring (0 is whole match) of substrings.
- raises Invalid_argument
if n is not in the range of the number of substrings.
- raises Not_found
if the corresponding subpattern did not capture a substring.
val get_substring_ofs : substrings -> int -> int * int
get_substring_ofs substrings n
- returns
the offset tuple of the n-th substring of substrings (0 is whole match).
- raises Invalid_argument
if n is not in the range of the number of substrings.
- raises Not_found
if the corresponding subpattern did not capture a substring.
val get_substrings : ?full_match:bool -> substrings -> string array
get_substrings ?full_match substrings
- returns
the array of substrings in substrings. It includes the full match at index 0 when full_match is true, the captured substrings only when it is false. If a subpattern did not capture a substring, the empty string is returned in the corresponding position instead.
- parameter full_match
default = true
val get_opt_substrings : ?full_match:bool -> substrings -> string option array
get_opt_substrings ?full_match substrings
- returns
the array of optional substrings in substrings. It includes Some full_match_str at index 0 when full_match is true, Some captured_substrings only when it is false. If a subpattern did not capture a substring, None is returned in the corresponding position instead.
- parameter full_match
default = true
val get_named_substring : regexp -> string -> substrings -> string
get_named_substring rex name substrings
- returns
the named substring name in regular expression rex and substrings.
- raises Invalid_argument
if there is no such named substring.
- raises Not_found
if the corresponding subpattern did not capture a substring.
val get_named_substring_ofs : regexp -> string -> substrings -> int * int
get_named_substring_ofs rex name substrings
- returns
the offset tuple of the named substring name in regular expression rex and substrings.
- raises Invalid_argument
if there is no such named substring.
- raises Not_found
if the corresponding subpattern did not capture a substring.
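The difference between the extraction functions shows up most clearly when a group does not take part in the match. The sketch below uses an illustrative pattern whose first group is optional; exec is documented further below, and all names are made up for the example.

```ocaml
(* Group 1 (digits) is optional and stays unset for the subject "abc". *)
let ex_rex = Pcre.regexp "(\\d+)?([a-z]+)"
let ex_subs = Pcre.exec ~rex:ex_rex "abc"

let ex_count = Pcre.num_of_subs ex_subs           (* whole match + 2 groups *)
let ex_whole = Pcre.get_substring ex_subs 0
let ex_word_ofs = Pcre.get_substring_ofs ex_subs 2

(* get_substring raises Not_found for a group that captured nothing. *)
let ex_digits =
  match Pcre.get_substring ex_subs 1 with
  | s -> Some s
  | exception Not_found -> None

(* The array variants report the unset group as "" or None instead. *)
let ex_all = Pcre.get_substrings ex_subs
let ex_captured = Pcre.get_substrings ~full_match:false ex_subs
let ex_all_opt = Pcre.get_opt_substrings ex_subs
```

Use get_opt_substrings when an unset group must be told apart from a group that captured the empty string.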
Callouts
type callout_data
=
{
callout_number : int;
Callout number
substrings : substrings;
Substrings matched so far
start_match : int;
Subject start offset of current match attempt
current_position : int;
Subject offset of current match pointer
capture_top : int;
Number of the highest captured substring so far
capture_last : int;
Number of the most recently captured substring
pattern_position : int;
Offset of next match item in pattern string
next_item_length : int;
Length of next match item in pattern string
}
type callout
= callout_data -> unit
Type of callout functions
Callouts are referred to in patterns as "(?Cn)" where "n" is a callout_number ranging from 0 to 255. Substrings captured so far are accessible as usual via substrings. You will have to consider capture_top and capture_last to know about the current state of valid substrings.
By raising exception Backtrack within a callout function, the user can force the pattern matching engine to backtrack to other possible solutions. Other exceptions will terminate matching immediately and return control to OCaml.
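A minimal sketch of a callout, assuming the linked PCRE library was built with callout support. The pattern "a(?C1)b", the recorder function and all names are illustrative; pmatch is documented further below.

```ocaml
(* Record the subject offset at which each callout fires. *)
let callout_positions = ref []

let record (cd : Pcre.callout_data) =
  callout_positions := cd.Pcre.current_position :: !callout_positions

(* The callout sits between "a" and "b" in the pattern. *)
let callout_rex = Pcre.regexp "a(?C1)b"
let callout_matched = Pcre.pmatch ~rex:callout_rex ~callout:record "ab"

(* A callout that always raises Backtrack rejects every match attempt
   that reaches it, so the overall match fails. *)
let always_backtrack =
  Pcre.pmatch ~rex:callout_rex ~callout:(fun _ -> raise Pcre.Backtrack) "ab"
```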
Matching of patterns and subpattern extraction
val pcre_exec : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> string -> int array
pcre_exec ?iflags ?flags ?rex ?pat ?pos ?callout subj
- returns
an array of offsets that describe the position of matched subpatterns in the string subj starting at position pos with pattern pat when given, regular expression rex otherwise. The array also contains additional workspace needed by the match engine. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
val pcre_dfa_exec : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> ?workspace:int array -> string -> int array
pcre_dfa_exec ?iflags ?flags ?rex ?pat ?pos ?callout ?workspace subj
invokes the "alternative" DFA matching function.
- returns
an array of offsets that describe the position of matched subpatterns in the string subj starting at position pos with pattern pat when given, regular expression rex otherwise. The array also contains additional workspace needed by the match engine. Uses flags when given, the precompiled iflags otherwise. Requires a sufficiently large workspace array. Callouts are handled by callout.
Note that the returned array of offsets is quite different from those returned by pcre_exec et al. The motivating use case for the DFA match function is to be able to restart a partial match with N additional input segments. Because the match function/workspace does not store segments seen previously, the offsets returned when a match completes will refer only to the matching portion of the last subject string provided. Thus, returned offsets from this function should not be used to support extracting captured submatches. If you need to capture submatches from a series of inputs incrementally matched with this function, you'll need to concatenate those inputs that yield a successful match here and re-run the same pattern against that single subject string.
Aside from an absolute minimum of 20, PCRE does not provide any guidance regarding the size of workspace array needed by any given pattern. Therefore, it is wise to appropriately handle the possible WorkspaceSize error. If raised, you can allocate a new, larger workspace array and begin the DFA matching process again.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
- parameter workspace
default = fresh array of length 20
- raises Not_found
if the pattern match has failed
- raises Error Partial
if the pattern has matched partially; a subsequent exec call with the same pattern and workspace (adding the `DFA_RESTART flag) can be made to either further advance or complete the partial match.
- raises Error WorkspaceSize
if the workspace array is too small to accommodate the DFA state required by the supplied pattern
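The retry strategy for WorkspaceSize described above can be sketched as follows. dfa_offsets and its max_size cap are hypothetical and not part of the library; the doubling policy is just one reasonable choice.

```ocaml
(* Retry pcre_dfa_exec with a doubled workspace whenever PCRE reports that
   the workspace was too small. [max_size] is a hypothetical safety cap;
   once it is reached, the WorkspaceSize error is allowed to escape. *)
let dfa_offsets ?(max_size = 1 lsl 20) (rex : Pcre.regexp) (subj : string) =
  let rec attempt size =
    let workspace = Array.make size 0 in
    match Pcre.pcre_dfa_exec ~rex ~workspace subj with
    | offsets -> offsets
    | exception Pcre.Error Pcre.WorkspaceSize when size < max_size ->
        attempt (2 * size)
  in
  attempt 20

(* The first two entries are the start and end offsets of the match. *)
let dfa_result = dfa_offsets (Pcre.regexp "abc") "xabcx"
```

Not_found and Error Partial are deliberately not handled here, so callers can still distinguish a failed match from a partial one.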
val exec : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> string -> substrings
exec ?iflags ?flags ?rex ?pat ?pos ?callout subj
- returns
substring information on string subj starting at position pos with pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
val exec_all : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> string -> substrings array
exec_all ?iflags ?flags ?rex ?pat ?pos ?callout subj
- returns
an array of substring information of all matching substrings in string subj starting at position pos with pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
val next_match : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> substrings -> substrings
next_match ?iflags ?flags ?rex ?pat ?pos ?callout substrs
- returns
substring information on the match that follows on the last match denoted by substrs, jumping over pos characters (also backwards!), using pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
- raises Invalid_argument
if pos lets matching start outside of the subject string.
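As an illustration, exec and next_match can be combined to walk over all matches one at a time. all_matches is a hypothetical helper; the sketch assumes the pattern never matches the empty string, since an empty match would not advance the position.

```ocaml
(* Collect the text of every match of [rex] in [subj], left to right.
   Assumes [rex] cannot match the empty string. *)
let all_matches (rex : Pcre.regexp) (subj : string) : string list =
  let rec collect acc subs =
    (* Read the current match before asking for the next one. *)
    let acc = Pcre.get_substring subs 0 :: acc in
    match Pcre.next_match ~rex subs with
    | next -> collect acc next
    | exception Not_found -> List.rev acc
  in
  match Pcre.exec ~rex subj with
  | first -> collect [] first
  | exception Not_found -> []

let numbers = all_matches (Pcre.regexp "\\d+") "a1b22c333"
```

Note that rex has to be passed to next_match again; otherwise the default whitespace pattern is used.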
val extract : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?full_match:bool -> ?callout:callout -> string -> string array
extract ?iflags ?flags ?rex ?pat ?pos ?full_match ?callout subj
- returns
the array of substrings that match subj starting at position pos, using pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. It includes the full match at index 0 when full_match is true, the captured substrings only when it is false. Callouts are handled by callout. If a subpattern did not capture a substring, the empty string is returned in the corresponding position instead.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter full_match
default = true
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
val extract_opt : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?full_match:bool -> ?callout:callout -> string -> string option array
extract_opt ?iflags ?flags ?rex ?pat ?pos ?full_match ?callout subj
- returns
the array of optional substrings that match subj starting at position pos, using pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. It includes Some full_match_str at index 0 when full_match is true, Some captured_substrings only when it is false. Callouts are handled by callout. If a subpattern did not capture a substring, None is returned in the corresponding position instead.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter full_match
default = true
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
val extract_all : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?full_match:bool -> ?callout:callout -> string -> string array array
extract_all ?iflags ?flags ?rex ?pat ?pos ?full_match ?callout subj
- returns
an array of arrays of all matching substrings that match subj starting at position pos, using pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. It includes the full match at index 0 of the extracted string arrays when full_match is true, the captured substrings only when it is false. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter full_match
default = true
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
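A short sketch contrasting extract with extract_all. The pattern and subject are illustrative only.

```ocaml
(* One lowercase letter followed by digits, both captured. *)
let kv_rex = Pcre.regexp "([a-z])(\\d+)"
let kv_subject = "a1 b22 c333"

(* First match only: full match at index 0, then the groups. *)
let kv_first = Pcre.extract ~rex:kv_rex kv_subject

(* Same match without the full match at index 0. *)
let kv_first_groups = Pcre.extract ~rex:kv_rex ~full_match:false kv_subject

(* One inner array per match, in order of occurrence. *)
let kv_every = Pcre.extract_all ~rex:kv_rex kv_subject
```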
val extract_all_opt : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?full_match:bool -> ?callout:callout -> string -> string option array array
extract_all_opt ?iflags ?flags ?rex ?pat ?pos ?full_match ?callout subj
- returns
an array of arrays of all optional matching substrings that match subj starting at position pos, using pattern pat when given, regular expression rex otherwise. Uses flags when given, the precompiled iflags otherwise. It includes Some full_match_str at index 0 of the extracted string arrays when full_match is true, Some captured_substrings only when it is false. Callouts are handled by callout. If a subpattern did not capture a substring, None is returned in the corresponding position instead.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter full_match
default = true
- parameter callout
default = ignore callouts
- raises Not_found
if pattern does not match.
val pmatch : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> string -> bool
pmatch ?iflags ?flags ?rex ?pat ?pos ?callout subj
- returns
true if subj is matched by pattern pat when given, regular expression rex otherwise, starting at position pos. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
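A few illustrative calls to pmatch, showing the pat and pos arguments and the default regular expression (which, as listed above, matches whitespace). The names are made up for the example.

```ocaml
(* Whole-string check with an explicit pattern. *)
let is_number = Pcre.pmatch ~pat:"^\\d+$" "12345"
let not_number = Pcre.pmatch ~pat:"^\\d+$" "12a45"

(* Matching starts at offset 1, so the "a" at offset 0 is not seen. *)
let after_offset = Pcre.pmatch ~pat:"a" ~pos:1 "ab"

(* With neither pat nor rex, the default whitespace pattern is used. *)
let has_space = Pcre.pmatch "a b"
let no_space = Pcre.pmatch "ab"
```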
String substitution
val subst : string -> substitution
subst str
converts the string str representing a substitution pattern to the internal representation.
The contents of the substitution string str can be normal text mixed with any of the following (mostly as in PERL):
- $[0-9]+ - a "$" immediately followed by an arbitrary number. "$0" stands for the name of the executable, any other number for the n-th backreference.
- $& - the whole matched pattern
- $` - the text before the match
- $' - the text after the match
- $+ - the last group that matched
- $$ - a single "$"
- $! - delimiter which does not appear in the substitution. Can be used to part "$[0-9]+" from an immediately following other number.
val replace : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?itempl:substitution -> ?templ:string -> ?callout:callout -> string -> string
replace ?iflags ?flags ?rex ?pat ?pos ?itempl ?templ ?callout subj
replaces all substrings of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the substitution string templ when given, itempl otherwise. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter itempl
default = empty string
- parameter templ
default = ignored
- parameter callout
default = ignore callouts
- raises Failure
if there are backreferences to nonexistent subpatterns.
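The substitution syntax accepted by subst can be tried out with replace, as in this sketch. The patterns, templates and names are illustrative only.

```ocaml
(* $1 and $2 refer to the first and second capturing group. *)
let swapped =
  Pcre.replace ~pat:"(\\w+)@(\\w+)" ~templ:"$2 at $1" "bob@example"

(* $& stands for the whole match. *)
let marked = Pcre.replace ~pat:"\\d+" ~templ:"<$&>" "a1b22"

(* $! separates the backreference $1 from the literal digit 0. *)
let separated = Pcre.replace ~pat:"(\\d)" ~templ:"$1$!0" "a1b2"

(* A template converted once with subst can be reused via itempl. *)
let bracket_templ = Pcre.subst "[$1]"
let bracketed = Pcre.replace ~pat:"(\\d)" ~itempl:bracket_templ "a1b2"
```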
val qreplace : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?templ:string -> ?callout:callout -> string -> string
qreplace ?iflags ?flags ?rex ?pat ?pos ?templ ?callout subj
replaces all substrings of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the string templ. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter templ
default = ignored
- parameter callout
default = ignore callouts
val substitute_substrings : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> subst:(substrings -> string) -> string -> string
substitute_substrings ?iflags ?flags ?rex ?pat ?pos ?callout ~subst subj
replaces all substrings of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the result of function subst applied to the substrings of the match. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
val substitute : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> subst:(string -> string) -> string -> string
substitute ?iflags ?flags ?rex ?pat ?pos ?callout ~subst subj
replaces all substrings of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the result of function subst applied to the match. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
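When the replacement has to be computed, substitute and substitute_substrings take a function instead of a template. A small sketch with made-up inputs:

```ocaml
(* substitute passes the matched text to the function. *)
let shouted =
  Pcre.substitute ~pat:"[a-z]+" ~subst:String.uppercase_ascii "ab 12 cd"

(* substitute_substrings passes the substrings value, so capturing groups
   can be inspected with get_substring. *)
let doubled =
  Pcre.substitute_substrings ~pat:"(\\d+)"
    ~subst:(fun subs ->
      string_of_int (2 * int_of_string (Pcre.get_substring subs 1)))
    "3 apples, 10 pears"
```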
val replace_first : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?itempl:substitution -> ?templ:string -> ?callout:callout -> string -> string
replace_first ?iflags ?flags ?rex ?pat ?pos ?itempl ?templ ?callout subj
replaces the first substring of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the substitution string templ when given, itempl otherwise. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter itempl
default = empty string
- parameter templ
default = ignored
- parameter callout
default = ignore callouts
- raises Failure
if there are backreferences to nonexistent subpatterns.
val qreplace_first : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?templ:string -> ?callout:callout -> string -> string
qreplace_first ?iflags ?flags ?rex ?pat ?pos ?templ ?callout subj
replaces the first substring of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the string templ. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter templ
default = ignored
- parameter callout
default = ignore callouts
val substitute_substrings_first : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> subst:(substrings -> string) -> string -> string
substitute_substrings_first ?iflags ?flags ?rex ?pat ?pos ?callout ~subst subj
replaces the first substring of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the result of function subst applied to the substrings of the match. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
val substitute_first : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?callout:callout -> subst:(string -> string) -> string -> string
substitute_first ?iflags ?flags ?rex ?pat ?pos ?callout ~subst subj
replaces the first substring of subj matching pattern pat when given, regular expression rex otherwise, starting at position pos with the result of function subst applied to the match. Uses flags when given, the precompiled iflags otherwise. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter callout
default = ignore callouts
Splitting
val split : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?max:int -> ?callout:callout -> string -> string list
split ?iflags ?flags ?rex ?pat ?pos ?max ?callout subj
splits subj into a list of at most max strings, using as delimiter pattern pat when given, regular expression rex otherwise, starting at position pos. Uses flags when given, the precompiled iflags otherwise. If max is zero, trailing empty fields are stripped. If it is negative, it is treated as arbitrarily large. If neither pat nor rex is specified, leading whitespace will be stripped! Should behave exactly as in PERL. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter max
default = 0
- parameter callout
default = ignore callouts
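The rules for max and for the default delimiter can be illustrated with the following sketch. Inputs and names are made up; the results noted in the comments follow the PERL-like behaviour described above.

```ocaml
(* Default delimiter (whitespace): leading whitespace is stripped and,
   because max defaults to 0, trailing empty fields are dropped. *)
let words = Pcre.split "  alpha beta   gamma "

(* Inner empty fields are kept; trailing ones are stripped. *)
let fields = Pcre.split ~pat:"," "a,b,,c,,"

(* At most two strings: the remainder stays unsplit in the last one. *)
let limited = Pcre.split ~pat:"," ~max:2 "a,b,c"
```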
val asplit : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?max:int -> ?callout:callout -> string -> string array
asplit ?iflags ?flags ?rex ?pat ?pos ?max ?callout subj
same as Pcre.split but
- returns
an array instead of a list.
type split_result
=
|
Text of string
Text part of split string
|
Delim of string
Delimiter part of split string
|
Group of int * string
Subgroup of matched delimiter (subgroup_nr, subgroup_str)
|
NoGroup
Unmatched subgroup
Result of a Pcre.full_split
val full_split : ?iflags:irflag -> ?flags:rflag list -> ?rex:regexp -> ?pat:string -> ?pos:int -> ?max:int -> ?callout:callout -> string -> split_result list
full_split ?iflags ?flags ?rex ?pat ?pos ?max ?callout subj
splits subj into a list of at most max elements of type "split_result", using as delimiter pattern pat when given, regular expression rex otherwise, starting at position pos. Uses flags when given, the precompiled iflags otherwise. If max is zero, trailing empty fields are stripped. If it is negative, it is treated as arbitrarily large. Should behave exactly as in PERL. Callouts are handled by callout.
- parameter iflags
default = no extra flags
- parameter flags
default = ignored
- parameter rex
default = matches whitespace
- parameter pat
default = ignored
- parameter pos
default = 0
- parameter max
default = 0
- parameter callout
default = ignore callouts
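Unlike split, full_split also reports the delimiters and their subgroups. A small sketch with illustrative inputs; rebuild is a hypothetical helper that reassembles the subject from the Text and Delim parts.

```ocaml
(* A delimiter without groups yields only Text and Delim elements. *)
let simple_parts = Pcre.full_split ~pat:"," "a,b"

(* With groups, each delimiter is followed by one element per group:
   Group for a group that matched, NoGroup for one that did not. *)
let grouped_parts = Pcre.full_split ~pat:"(x)|(u)" "abxcd"

(* Concatenate text and delimiters, ignoring the group elements. *)
let rebuild (parts : Pcre.split_result list) : string =
  String.concat ""
    (List.map
       (function
         | Pcre.Text s | Pcre.Delim s -> s
         | Pcre.Group _ | Pcre.NoGroup -> "")
       parts)
```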
Additional convenience functions
val foreach_line : ?ic:Stdlib.in_channel -> (string -> unit) -> unit
foreach_line ?ic f
applies f to each line in input channel ic until the end-of-file is reached.
- parameter ic
default = stdin
val foreach_file : string list -> (string -> Stdlib.in_channel -> unit) -> unit
foreach_file filenames f
opens each file in the list filenames for input and applies f to each filename and the corresponding channel. Channels are closed after each operation (even when exceptions occur - they get reraised afterwards!).
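The two convenience functions compose naturally. The sketch below counts the lines matching a regular expression across a list of files; count_matching_lines is a hypothetical helper name.

```ocaml
(* Count the lines in [filenames] that are matched by [rex]. *)
let count_matching_lines (rex : Pcre.regexp) (filenames : string list) : int =
  let count = ref 0 in
  Pcre.foreach_file filenames (fun _filename ic ->
      (* foreach_file closes [ic] once this function returns. *)
      Pcre.foreach_line ~ic (fun line ->
          if Pcre.pmatch ~rex line then incr count));
  !count
```

For example, count_matching_lines (Pcre.regexp "^error") ["app.log"] would count the lines of a (hypothetical) file app.log that start with "error".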
UNSAFE STUFF - USE WITH CAUTION!
val unsafe_pcre_exec : irflag -> regexp -> pos:int -> subj_start:int -> subj:string -> int array -> callout option -> unit
unsafe_pcre_exec flags rex ~pos ~subj_start ~subj offset_vector callout
You should read the C-source to know what happens. If you do not understand it - don't use this function!
val make_ovector : regexp -> int * int array
make_ovector regexp
calculates the tuple (subgroups2, ovector) which is the number of subgroup offsets and the offset array.
val unsafe_pcre_dfa_exec : irflag -> regexp -> pos:int -> subj_start:int -> subj:string -> int array -> callout option -> workspace:int array -> unit
unsafe_pcre_dfa_exec flags rex ~pos ~subj_start ~subj offset_vector callout ~workspace
You should read the C-source to know what happens. If you do not understand it - don't use this function!