MySQL regular expressions (RE) are a powerful tools that can be very useful in SQL string searches. They enable software engineer to build statements that are very concise and to handle complex string operations that otherwise wouldn’t be possible.
{m,n} => A quantifier matching between ‘m’ and ‘n’ times. m & n are integers.
It is important to note that MySQL REGEXP returns 1 or 0 depending on whether it finds the match in the pattern or not. If a match that your looking for in the pattern or the pattern itself is null, then it returns NULL. Other application languages may return an array or list or true/false. Thus use of REGEXP in where clause is more appropriate than in either SELECT or others.
For example:
SELECT 'AdWords' REGEXP 'A*W*';
returns 1
Trying the following on MySQL Sakila database:
SELECT * FROM country WHERE country REGEXP '^U.*' ;
SELECT name FROM (
SELECT substring_index(name, ' ', 1) REGEXP '^M.*O' YN, name
FROM (
SELECT 'Christopher Columbus' name
UNION
SELECT 'Marco Polo' name
UNION
SELECT 'Vasco Da Gama' name
UNION
SELECT 'Francisco Pizarro' name
) x
) y
WHERE YN = 1
[[:<:]]
token[[:>:]]
Searches the token in the string that is neither preceded nor followed by any alpha-numerals or “_” (underscore). For example,
SELECT 'Here is a token in a string' REGEXP '[[::]]'
Returns 1 (true), where as the following ones return 0 (false)
SELECT 'Here is a _token in a string' REGEXP '[[::]]'
SELECT 'Here is a tokens in a string' REGEXP '[[::]]'
[:character_class:]
SELECT column1, length(column1) from table
WHERE column1 REGEXP '[[:space:]]' ;
will return only those column1 and its length if they have any space embedded in them – quite helpful when you want to identify that any space character has crept in when it is not suppose to.
REGEXP’s complex searches come with a caveat of the way the search is done at the low levels – bits and bytes. Both RLIKE and REGEXP work in byte-wise fashion and can result in unexpected results when used on multi-byte strings. For any LATIN character set one will be safe to use them but for UNICODE need to be careful.