regex(3) | Library Functions Manual | regex(3) |
regcomp, regexec, regerror, regfree - POSIX regex functions
Standard C library (libc, -lc)
#include <regex.h>
int regcomp(regex_t *restrict preg, const char *restrict regex,
            int cflags);
int regexec(const regex_t *restrict preg, const char *restrict string,
            size_t nmatch, regmatch_t pmatch[_Nullable restrict .nmatch],
            int eflags);
size_t regerror(int errcode, const regex_t *_Nullable restrict preg,
                char errbuf[_Nullable restrict .errbuf_size],
                size_t errbuf_size);
void regfree(regex_t *preg);
typedef struct { size_t re_nsub; } regex_t;
typedef struct { regoff_t rm_so; regoff_t rm_eo; } regmatch_t;
typedef /* ... */ regoff_t;
regcomp() is used to compile a regular expression into a form that is suitable for subsequent regexec() searches.
On success, the pattern buffer at *preg is initialized. regex is a null-terminated string. The locale must be the same when running regexec().
After regcomp() succeeds, preg->re_nsub holds the number of subexpressions in regex. Thus, a value of preg->re_nsub + 1 passed as nmatch to regexec() is sufficient to capture all matches.
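The sizing rule above can be sketched as follows. This is a minimal illustration, not part of the interface: compile_with_matches() is a hypothetical helper name, and the choice of REG_EXTENDED (POSIX extended syntax) is an assumption made for the example.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of <regex.h>): compile `pattern` as an
 * extended regular expression and allocate a pmatch array large enough
 * for the entire match plus every parenthesized subexpression.
 * Returns 0 on success and stores the array in *out and its length
 * in *n; the caller must later free(*out) and regfree(preg).
 */
static int compile_with_matches(regex_t *preg, const char *pattern,
                                regmatch_t **out, size_t *n)
{
    int rc = regcomp(preg, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;                  /* preg is not initialized here */

    *n = preg->re_nsub + 1;         /* slot 0 holds the entire match */
    *out = calloc(*n, sizeof(**out));
    if (*out == NULL) {
        regfree(preg);
        return REG_ESPACE;          /* report allocation failure */
    }
    return 0;
}
```

For the pattern "(a+)(b+)", re_nsub is 2, so the helper allocates three regmatch_t elements, and that length can be passed as nmatch to regexec().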
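cflags is the bitwise OR of zero or more of the following: REG_EXTENDED (use POSIX Extended Regular Expression syntax instead of the default Basic syntax), REG_ICASE (do not differentiate case), REG_NOSUB (report only overall success, not match positions), and REG_NEWLINE (match-any-character operators don't match a newline, and ^ and $ also match immediately after and before a newline).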
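regexec() is used to match a null-terminated string against the compiled pattern buffer in *preg, which must have been initialized with regcomp(). eflags is the bitwise OR of zero or more of the following flags: REG_NOTBOL (the match-beginning-of-line operator always fails to match), REG_NOTEOL (the match-end-of-line operator always fails to match), and REG_STARTEND (match [string + pmatch[0].rm_so, string + pmatch[0].rm_eo) instead of the whole string; a BSD extension, not present in POSIX).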
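One use of eflags is scanning a string for successive matches. The sketch below assumes the pattern was compiled without REG_NOSUB and without REG_NEWLINE, so every restart position after the first is in the middle of a line and REG_NOTBOL keeps ^ from matching there. count_matches() is a hypothetical helper name.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: count non-overlapping matches of a compiled
 * pattern in `string`. After the first search, REG_NOTBOL is passed
 * so that ^ does not match at each restart position.
 */
static size_t count_matches(const regex_t *preg, const char *string)
{
    size_t count = 0;
    const char *p = string;
    regmatch_t m;

    while (regexec(preg, p, 1, &m, p == string ? 0 : REG_NOTBOL) == 0) {
        count++;
        if (m.rm_eo == m.rm_so) {
            /* Empty match: step one byte forward to guarantee progress. */
            if (p[m.rm_eo] == '\0')
                break;
            p += m.rm_eo + 1;
        } else {
            p += m.rm_eo;           /* resume just past this match */
        }
    }
    return count;
}
```

With the pattern "^a" and the string "aaa", only the first search can match, so the count is 1; without REG_NOTBOL each restart would match again.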
Unless REG_NOSUB was passed to regcomp(), it is possible to obtain the locations of matches within string: regexec() fills nmatch elements of pmatch with results: pmatch[0] corresponds to the entire match, pmatch[1] to the first subexpression, etc. If there were more matches than nmatch, they are discarded; if fewer, unused elements of pmatch are filled with -1s.
Each returned valid (non--1) match corresponds to the range [string + rm_so, string + rm_eo).
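Because each range is half-open, the length of a match is rm_eo - rm_so. A minimal sketch of copying one match out of the searched string follows; copy_match() is a hypothetical helper, and returning the length as a long is a choice made only for this example.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper: copy the text of match slot `idx` into buf
 * (null-terminated and truncated to fit when bufsize is nonzero).
 * Returns the full length of the match, or -1 if the slot is unused.
 */
static long copy_match(const char *string, const regmatch_t *pmatch,
                       size_t idx, char *buf, size_t bufsize)
{
    if (pmatch[idx].rm_so == -1)
        return -1;                  /* subexpression did not take part */

    size_t len = (size_t) (pmatch[idx].rm_eo - pmatch[idx].rm_so);
    if (bufsize > 0) {
        size_t n = len < bufsize - 1 ? len : bufsize - 1;
        memcpy(buf, string + pmatch[idx].rm_so, n);
        buf[n] = '\0';
    }
    return (long) len;
}
```

The string argument must be the same pointer that was passed to regexec(), since the offsets are relative to it.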
regoff_t is a signed integer type capable of storing the largest value that can be stored in either a ptrdiff_t type or a ssize_t type.
regerror() is used to turn the error codes that can be returned by both regcomp() and regexec() into error message strings.
If preg isn't a null pointer, errcode must be the latest error returned from an operation on preg.
If errbuf_size isn't 0, up to errbuf_size bytes are copied to errbuf; the error string is always null-terminated, and truncated to fit.
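Since regerror() returns the buffer size required to hold the whole message, it can be called twice: once with errbuf_size set to 0 to learn the size, and again to fill a buffer of that size. A minimal sketch, where regerror_dup() is a hypothetical helper name:

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the message for
 * errcode, or NULL if allocation fails. The caller frees the result.
 */
static char *regerror_dup(int errcode, const regex_t *preg)
{
    /* With errbuf_size 0 nothing is copied; only the size is reported. */
    size_t need = regerror(errcode, preg, NULL, 0);

    char *msg = malloc(need);
    if (msg != NULL)
        regerror(errcode, preg, msg, need);  /* fills and terminates msg */
    return msg;
}
```

A caller might use it as: rc = regcomp(&re, pattern, 0); if (rc != 0) { char *msg = regerror_dup(rc, &re); ... free(msg); }.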
regfree() deinitializes the pattern buffer at *preg, freeing any associated memory; *preg must have been initialized via regcomp().
regcomp() returns zero for a successful compilation or an error code for failure.
regexec() returns zero for a successful match or REG_NOMATCH for failure.
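The return values above suggest a three-way check: compilation failure, match, and no match. The sketch below is one way to wrap it; matches() is a hypothetical helper, and REG_EXTENDED | REG_NOSUB is an assumption chosen because only a yes/no answer is needed.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if string matches pattern, 0 if it
 * does not, and -1 if the pattern fails to compile or regexec()
 * returns something other than 0 or REG_NOMATCH.
 */
static int matches(const char *pattern, const char *string)
{
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;                  /* nothing to regfree() on failure */

    /* With REG_NOSUB no positions are reported, so pmatch is unused. */
    int rc = regexec(&re, string, 0, NULL, 0);
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

Compiling on every call is wasteful when the same pattern is reused; in that case compile once and keep the regex_t until it is no longer needed.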
regerror() returns the size of the buffer required to hold the string.
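The errors that can be returned by regcomp() include REG_BADBR, REG_BADPAT, REG_BADRPT, REG_EBRACE, REG_EBRACK, REG_ECOLLATE, REG_ECTYPE, REG_EESCAPE, REG_EPAREN, REG_ERANGE, REG_ESPACE, and REG_ESUBREG. Use regerror() to obtain the message for a given code.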
For an explanation of the terms used in this section, see attributes(7).
Interface | Attribute | Value |
regcomp(), regexec() | Thread safety | MT-Safe locale |
regerror() | Thread safety | MT-Safe env |
regfree() | Thread safety | MT-Safe |
Conforms to POSIX.1-2008.
Also specified in POSIX.1-2001.
Prior to POSIX.1-2008, regoff_t was required to be capable of storing the largest value that can be stored in either an off_t type or a ssize_t type.
re_nsub is only required to be initialized if REG_NOSUB wasn't specified, but all known implementations initialize it regardless.
Both regex_t and regmatch_t may (and do) have more members, in any order. Always reference them by name.
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

#define ARRAY_SIZE(arr) (sizeof((arr)) / sizeof((arr)[0]))

static const char *const str =
        "1) John Driverhacker;\n2) John Doe;\n3) John Foo;\n";
static const char *const re = "John.*o";

int main(void)
{
    const char  *s = str;
    regex_t     regex;
    regmatch_t  pmatch[1];
    regoff_t    off, len;

    if (regcomp(&regex, re, REG_NEWLINE))
        exit(EXIT_FAILURE);

    printf("String = \"%s\"\n", str);
    printf("Matches:\n");

    for (unsigned int i = 0; ; i++) {
        if (regexec(&regex, s, ARRAY_SIZE(pmatch), pmatch, 0))
            break;

        /* Offsets are relative to s; rebase onto str for reporting. */
        off = pmatch[0].rm_so + (s - str);
        len = pmatch[0].rm_eo - pmatch[0].rm_so;
        printf("#%u:\n", i);
        printf("offset = %jd; length = %jd\n", (intmax_t) off,
               (intmax_t) len);
        /* The precision argument of %.*s must be an int. */
        printf("substring = \"%.*s\"\n", (int) len, s + pmatch[0].rm_so);

        s += pmatch[0].rm_eo;   /* continue after this match */
    }

    regfree(&regex);
    exit(EXIT_SUCCESS);
}
grep(1), regex(7)
The glibc manual section, Regular Expressions
2023-07-20 | Linux man-pages 6.05.01 |