FAQ - How to Use Regular Expressions

Regular expressions are string patterns that match text, and they have many different uses. Within the Yottaa platform, Regular Expressions or “RegExp” are used to create fine-tuned rules that include or exclude resources from specific optimizations.

 

Regular Expressions are broken down into Character Classes, Quantifiers, and Meta-characters.


Character Classes – A character set that defines what your pattern should look for

.        Dot, any character (may or may not match line terminators, depending on the matching mode)

\d       A digit: [0-9]

\D      A non-digit: [^0-9]

\s       A whitespace character: [ \t\n\x0B\f\r]

\S       A non-whitespace character: [^\s]

\w      A word character: [a-zA-Z_0-9]

\W     A non-word character: [^\w]
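
The character classes above can be tried out in a short script. This is only a sketch in Python, not Yottaa code: the sample text is made up, and the re.ASCII flag is passed so that \d and \w stay limited to the ASCII ranges listed above (without it, Python also matches Unicode digits and letters).

```python
import re

# Made-up sample text, used only for illustration.
sample = "Order 42 shipped_to: NY"

# re.ASCII keeps \d at [0-9] and \w at [a-zA-Z_0-9], as in the list above.
digits = re.findall(r"\d", sample, flags=re.ASCII)          # single digits
non_space_runs = re.findall(r"\S+", sample, flags=re.ASCII)  # runs of non-whitespace
word_runs = re.findall(r"\w+", sample, flags=re.ASCII)       # runs of word characters

print(digits)          # ['4', '2']
print(non_space_runs)  # ['Order', '42', 'shipped_to:', 'NY']
print(word_runs)       # ['Order', '42', 'shipped_to', 'NY']
```

Note that the colon is kept by \S+ but dropped by \w+, because ":" is a non-whitespace character but not a word character.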

 

Quantifiers – Characters that specify how many times the preceding part of a pattern must match.

            Note: A quantifier binds to the expression immediately to its left

*     Match 0 or more times

+     Match 1 or more times

?     Match 1 or 0 times

{n}   Match exactly n times

{n,}   Match at least n times

{n,m} Match at least n but not more than m times
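
The quantifiers above, and the rule that a quantifier binds to the expression immediately to its left, can be checked with a small Python sketch. The helper name matches is hypothetical, and whole-string matching (re.fullmatch) is used here so that each result depends only on the quantifier.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern matches the whole text."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"ab*", "a"))          # True:  * allows zero "b"
print(matches(r"ab+", "a"))          # False: + needs at least one "b"
print(matches(r"colou?r", "color"))  # True:  ? makes "u" optional
print(matches(r"\d{3}", "1234"))     # False: exactly 3 digits required
print(matches(r"\d{2,}", "1234"))    # True:  at least 2 digits
print(matches(r"\d{2,3}", "1234"))   # False: at most 3 digits allowed

# A quantifier applies only to what is immediately to its left.
print(matches(r"ab*", "abab"))       # False: * applies to "b" only
print(matches(r"(ab)*", "abab"))     # True:  * applies to the group "ab"
```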

 

Meta-Characters – Special characters that perform special operations within regular expressions.

 \         Escape the next meta-character (it becomes a normal/literal character)

^         Match the beginning of the line

.          Match any character (except newline)

$         Match the end of the line (or before newline at the end)

|          Alternation (‘or’ statement)

( )       Grouping

[ ]       Custom character class
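
A brief Python sketch of the meta-characters above. The helper name found and the sample strings are made up for illustration; re.search is used so that the ^ and $ anchors do the positioning themselves.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the pattern occurs anywhere in the text."""
    return re.search(pattern, text) is not None

# Escaping: "." is any character, "\." is a literal dot.
print(found(r"a.c", "abc"))    # True
print(found(r"a\.c", "abc"))   # False
print(found(r"a\.c", "a.c"))   # True

# Anchors: ^ is the beginning of the line, $ is the end.
print(found(r"^cart", "cart/items"))     # True
print(found(r"^cart", "my/cart"))        # False
print(found(r"\.png$", "logo.png"))      # True
print(found(r"\.png$", "logo.png?v=2"))  # False

# Alternation with grouping, and a custom character class.
print(found(r"^(jpg|png)$", "png"))   # True
print(found(r"^[abc]+$", "cab"))      # True
print(found(r"^[abc]+$", "cabd"))     # False
```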

 

Common Examples in Yottaa – Below are a few common examples where Regular Expressions (RegExp) are used within the Yottaa platform to customize optimization.

Removing optimization from affecting bot traffic:

 

The RegExp used in this instance is "(?i).*(image|bot|google|appengine|msn|bing).*"

Dissecting this pattern we can learn how it functions in full:

(?i)                   Ignores case in the search string

.                       Match any character

*                      Match 0 or more times

( )                    Group everything within the parenthesis

|                       Alternation: match the text on either side of the “|”

 

Breaking down the RegExp in a sentence:

Ignoring case, match any string that contains image, bot, google, appengine, msn, or bing, with any characters (0 or more) before and after it.

Result of the RegExp:

If a request URL contains “/” and the client user agent matches the RegExp explained above, the request is excluded from optimization.
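
The bot pattern can be tested outside the platform before it is used in a rule. The following is a rough sketch in Python rather than Yottaa's own implementation: the function name is_bot and the sample user agents are illustrative, and it assumes the pattern must match the whole user agent string (which the leading and trailing .* suggest).

```python
import re

# Pattern copied from the example above.
BOT_PATTERN = re.compile(r"(?i).*(image|bot|google|appengine|msn|bing).*")

def is_bot(user_agent: str) -> bool:
    """Hypothetical helper: True if the whole user agent matches the pattern."""
    return BOT_PATTERN.fullmatch(user_agent) is not None

# Sample user agents, for illustration only.
print(is_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # True
print(is_bot("Mozilla/5.0 (compatible; bingbot/2.0)"))                                      # True
print(is_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"))                   # False
```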

 

Removing optimizations for specific webpages:

The RegExp used in this instance is “.*(account|admin|cart|checkout).*”

Breaking down the RegExp in a sentence:

Match any string that contains account, admin, cart, or checkout, with any characters (0 or more) before and after it. This pattern has no (?i), so the match is case-sensitive.

Result of the RegExp:

If a request URL matches the RegExp explained above, that page is excluded from optimization.
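
A quick way to see which URLs this pattern catches is to run it against a few sample paths. This Python sketch uses made-up paths and a hypothetical helper name, is_excluded, and assumes the pattern is matched against the whole URL.

```python
import re

# Pattern copied from the example above.
PAGE_PATTERN = re.compile(r".*(account|admin|cart|checkout).*")

def is_excluded(url: str) -> bool:
    """Hypothetical helper: True if the whole URL matches the pattern."""
    return PAGE_PATTERN.fullmatch(url) is not None

print(is_excluded("/checkout/payment"))      # True
print(is_excluded("/my-account"))            # True
print(is_excluded("/products/shoes"))        # False
print(is_excluded("/Cart"))                  # False: no (?i), so case matters
print(is_excluded("/products/cartoon-mug"))  # True: "cartoon" contains "cart"
```

The last line shows that the pattern matches the text anywhere in the URL, so it can also exclude pages that merely contain one of the words.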

 

Cache Specific Assets:

The RegExp used in this instance is “.*\.(?i:jpg|gif|png|jpeg|woff|svg|tiff|bmp|mp4|pdf)(\?.+)?$”

Breaking down the RegExp in a sentence:

Match any character 0 or more times, followed by a literal “.” and, ignoring case, one of the extensions jpg, gif, png, jpeg, woff, svg, tiff, bmp, mp4, or pdf, optionally followed by a query string (“?” and 1 or more characters), at the end of the URL.

Result of the RegExp:

If the requested asset has any of the extensions in the RegExp, it will be cached for the amount of time specified in the Quick Tune Rule.
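
The asset pattern can be checked the same way. This is a sketch in Python with made-up URLs and a hypothetical helper name, is_cacheable; it assumes whole-URL matching and a regex engine that supports the scoped (?i:...) group, which Python does from version 3.6.

```python
import re

# Pattern copied from the example above.
ASSET_PATTERN = re.compile(
    r".*\.(?i:jpg|gif|png|jpeg|woff|svg|tiff|bmp|mp4|pdf)(\?.+)?$"
)

def is_cacheable(url: str) -> bool:
    """Hypothetical helper: True if the whole URL matches the pattern."""
    return ASSET_PATTERN.fullmatch(url) is not None

print(is_cacheable("/img/logo.PNG"))      # True:  extension case is ignored
print(is_cacheable("/img/logo.png?v=2"))  # True:  optional query string
print(is_cacheable("/img/logo.png?"))     # False: "?" needs 1 or more characters after it
print(is_cacheable("/fonts/a.woff2"))     # False: woff2 is not in the list
print(is_cacheable("/page.html"))         # False
```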
