BackStreet Browser - A high-speed, multi-threading website download and viewing program.

Help Section


Using Regular Expressions

Regular expressions are a widely used method of specifying patterns of text to search for. Special service characters (also called metacharacters) allow you to specify, for instance, that a particular string you are looking for occurs at the beginning or end of a line, or contains a given number of repetitions of a certain character.

In BackStreet Browser you can use regular expressions when setting up File filters in the Project properties dialog.

Below is a brief description of how regular expressions are composed and used. It does not go into full detail, since regular expressions are not the main subject of this help. If necessary, detailed descriptions are available in many public sources.

Simple matches
If a character is not a service character (see below), it just matches itself. A series of characters matches that series of characters in the target string: if you specify “Hello world” as your search string, exactly the phrase “Hello world” will be searched for in the target string.
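
A literal search can be sketched in Python. The `re` module is used here purely for illustration; BackStreet Browser's own engine may differ in details, and the sample text is made up.

```python
import re

# Made-up sample text for the illustration.
target = "She typed Hello world and pressed Enter."

# A pattern without service characters matches only that exact sequence.
found = re.search("Hello world", target)

# Different letter case, so this literal pattern does not match here.
missing = re.search("Hello World", target)

print(found.group(0) if found else None)
print(missing)
```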

Character classes (“any character from the list” and “any character not from the list”).
In your search string you can specify a character class instead of a single character by enclosing a list of characters in square brackets (“[]”). Any one of the listed characters is then accepted at that position.
You can also put the character “^” immediately after the “[”. In that case any character except the listed ones is accepted at that position.

Examples:
Specify “[ps]ocket” as a search string. The program will find “pocket” and “socket”; “rocket”, “locket”, etc. do not match.
Specify “[^ps]ocket” as a search string. The program will find “rocket”, “locket”, etc.; “pocket” and “socket” do not match.
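
The two class forms can be tried out with a short Python sketch. Python's `re` module stands in for the program's engine here, which is an assumption; the word list is only sample data.

```python
import re

words = ["pocket", "socket", "rocket", "locket"]

# [ps] accepts "p" or "s" in the first position.
listed = [w for w in words if re.fullmatch("[ps]ocket", w)]

# [^ps] accepts any character except "p" and "s" in the first position.
negated = [w for w in words if re.fullmatch("[^ps]ocket", w)]

print(listed)   # ['pocket', 'socket']
print(negated)  # ['rocket', 'locket']
```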

For convenience, you can use the “-” character to specify a range of characters. E.g. “a-z” represents all characters between “a” and “z”, inclusive.
Note. If you want the character “-” itself to be a member of a class, put it at the start or end of the list, or escape it with a backslash. If you want “]” to be a member, place it at the start of the list or precede it with a backslash.

Examples:
[-az] or [az-] matches 'a', 'z' and '-'
[a\-z] matches 'a', 'z' and '-'
[a-z] matches all twenty-six lowercase letters from 'a' to 'z'
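
To see which single characters each class accepts, a small Python check may help. The helper `members` is hypothetical and written only for this illustration, and Python's `re` module is assumed to treat these classes the same way as the program does.

```python
import re

def members(char_class, candidates="a-mz]"):
    """Return the candidate characters accepted by the given class."""
    return [c for c in candidates if re.fullmatch(char_class, c)]

print(members("[-az]"))    # ['a', '-', 'z']
print(members("[az-]"))    # ['a', '-', 'z']
print(members(r"[a\-z]"))  # ['a', '-', 'z']
print(members("[a-z]"))    # ['a', 'm', 'z']
print(members(r"[\]a]"))   # ['a', ']']
```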

Service characters (metacharacters)
These are the special characters that are the essence of regular expressions. There are several basic types of service characters, described below.

Note. Any service character is treated as an ordinary character if it is preceded by a backslash (“\”).
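
The effect of escaping can be illustrated with the “.” service character, which stands for any character. This is a sketch in Python's `re` module, assumed to behave like the program's engine on this point; the file names are invented.

```python
import re

names = ["index.html", "indexAhtml"]  # invented sample names

# Unescaped, "." accepts any character, so both names match.
unescaped = [n for n in names if re.fullmatch("index.html", n)]

# Escaped, "\." accepts only a literal dot.
escaped = [n for n in names if re.fullmatch(r"index\.html", n)]

print(unescaped)  # ['index.html', 'indexAhtml']
print(escaped)    # ['index.html']
```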


Pre-defined classes:
\w - an alphanumeric character (a-z, A-Z, 0-9, "_")
\W - a non-alphanumeric character
\d - a numeric character
\D - a non-numeric character
\s - a whitespace character (space, tab, line break)
\S - a non-whitespace character

You may use \w, \d and \s within custom character classes.

Examples:
r\dcket matches strings like “r1cket”, “r7cket” and so on but not “rocket”, “racket”, etc.
r[\w\s]cket matches strings like “rocket”, “racket”, “r cket”, “r0cket” and so on but not “r@cket”, “r-cket”, etc.
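
The pre-defined classes can be checked with a short Python sketch, assuming Python's `re` module follows the definitions listed above. Because \w covers digits as well as letters and the underscore, strings with a digit in the second position are accepted by the second pattern too.

```python
import re

samples = ["r1cket", "r7cket", "rocket", "r cket", "r@cket"]

# \d accepts a single digit in the second position.
digit_only = [s for s in samples if re.fullmatch(r"r\dcket", s)]

# [\w\s] accepts a letter, digit, underscore or whitespace character.
word_or_space = [s for s in samples if re.fullmatch(r"r[\w\s]cket", s)]

print(digit_only)     # ['r1cket', 'r7cket']
print(word_or_space)  # ['r1cket', 'r7cket', 'rocket', 'r cket']
```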

Line separators:
^ - start of line
$ - end of line
\A - start of text
\Z - end of text
. - any character in line

Examples:
^rocket matches only if the target string starts with 'rocket'
rocket$ matches only if the target string ends with 'rocket'
^rocket$ matches only if the target string is exactly 'rocket'
r.cket matches strings like 'rocket', 'racket', 'r cket' and so on
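
A minimal Python sketch of the anchors and the “.” character follows. It assumes single-line target strings and uses Python's `re` module as a stand-in for the program's engine; the target strings are sample data.

```python
import re

targets = ["rocket", "rocket science", "toy rocket", "racket"]

starts = [t for t in targets if re.search("^rocket", t)]
ends = [t for t in targets if re.search("rocket$", t)]
whole = [t for t in targets if re.search("^rocket$", t)]
any_char = [t for t in targets if re.search("^r.cket$", t)]

print(starts)    # ['rocket', 'rocket science']
print(ends)      # ['rocket', 'toy rocket']
print(whole)     # ['rocket']
print(any_char)  # ['rocket', 'racket']
```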

Iterators:
Any item of a regular expression may be followed by another type of service character: an iterator. By placing an iterator after a character or sub-expression in your search string, you specify how many times that character or sub-expression must occur in the target string.

? - the preceding item occurs zero times or once
* - the preceding item occurs zero or more times
+ - the preceding item occurs one or more times
{n} - the preceding item occurs exactly n times
{n,} - the preceding item occurs at least n times
{n,m} - the preceding item occurs at least n but not more than m times

Examples:
r.?cket matches strings like “rocket”, “racket”, “rcket”, but not “roeocket”
r.*cket matches strings like “rocket”, “roeocket”, “rcket”
r.+cket matches strings like “rocket”, “racket” but not “rcket”
ro{2}cket matches the string “roocket”
ro{2,}cket matches the strings “roocket”, “rooocket”, “roooocket”, etc.
ro{2,3}cket matches the strings “roocket” and “rooocket” but not “roooocket”, etc.
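
The iterator examples can be reproduced with the following Python sketch. The helper `matching` is hypothetical and tests each candidate as a whole string; Python's `re` module is assumed to count repetitions the same way as the program.

```python
import re

def matching(pattern, candidates):
    """Return the candidates that the pattern matches as a whole string."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

dots = ["rcket", "rocket", "roeocket"]
os_run = ["rocket", "roocket", "rooocket", "roooocket"]

print(matching("r.?cket", dots))       # ['rcket', 'rocket']
print(matching("r.*cket", dots))       # ['rcket', 'rocket', 'roeocket']
print(matching("r.+cket", dots))       # ['rocket', 'roeocket']
print(matching("ro{2}cket", os_run))   # ['roocket']
print(matching("ro{2,}cket", os_run))  # ['roocket', 'rooocket', 'roooocket']
print(matching("ro{2,3}cket", os_run)) # ['roocket', 'rooocket']
```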

Alternatives:
You can specify a number of alternatives for a pattern using “|” to separate them. For example, the search string “racket|rocket” will match either “racket” or “rocket” in the target string.
Alternatives are tried from left to right. If one of the alternatives matches, the rest (the ones to the right) are not checked.

Example:
bee(hive|line) matches the strings “beehive” and “beeline”.
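
Grouped alternatives and the left-to-right rule can be sketched in Python. The `re` module is assumed to try alternatives in the same order as the program's engine, and the pattern “rock|rocket” is an extra illustration that does not come from the examples above.

```python
import re

words = ["beehive", "beeline", "beetle"]

# The parentheses limit the alternatives to the part after "bee".
grouped = [w for w in words if re.fullmatch("bee(hive|line)", w)]

# Alternatives are tried left to right: "rock" succeeds first,
# so the longer alternative "rocket" is never checked.
first = re.search("rock|rocket", "rocket").group(0)

print(grouped)  # ['beehive', 'beeline']
print(first)    # rock
```

When one alternative is a prefix of another, placing the longer one first (“rocket|rock”) lets it be tried before the shorter one.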