Edge Language User Manual
Description
This manual documents the Edge language from the user’s perspective.
Some of the features documented in this manual may not yet be implemented in the current Edge language implementation, though the Edge language development team is quickly catching up. When in doubt, please contact OpenResty Inc. directly.
This document is still a draft. Many details are still subject to change.
Some big features have not yet been covered, like non-buffered pattern matching and substitutions in very large request and response body data streams.
Builtin predicate functions and actions for generic TCP/UDP proxies and DNS servers have not yet been covered.
Convention
We use the word edgelang for the Edge language throughout this document for convenience.
Example code with problems will get a question mark, ?, at the beginning of each of its lines. For example:
? true =>
? say($a = 3);
Bits and Pieces
Identifiers
An identifier in edgelang is one or more words connected by dashes. A word is a sequence of alphanumeric characters and underscores. An underscore character cannot appear at the beginning of a word, however. Below are some examples of valid edgelang identifiers:
foo
hisName
uri-prefix
Your_Name1234
a-b_1-c
The Edge language is case-sensitive, so identifiers like foo and Foo mean completely different things.
An identifier cannot be one of the language keywords, except when it is used as a variable name.
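The identifier rule above can be modeled with a regular expression. The following is a minimal sketch in Python, not part of edgelang itself. It assumes that "alphanumeric" means ASCII letters and digits and that a word may begin with a digit, since the text only forbids a leading underscore. The name is_identifier is hypothetical, and the keyword restriction is not modeled.

```python
import re

# One word: alphanumerics and underscores, not starting with an underscore.
# Assumption: a word may start with a digit; the manual only forbids "_".
WORD = r"[A-Za-z0-9][A-Za-z0-9_]*"

# An identifier: one or more words joined by single dashes.
IDENTIFIER = re.compile(rf"{WORD}(?:-{WORD})*")

def is_identifier(text: str) -> bool:
    """Return True if text has the shape of an edgelang identifier."""
    return IDENTIFIER.fullmatch(text) is not None

for name in ["foo", "hisName", "uri-prefix", "Your_Name1234", "a-b_1-c", "_foo"]:
    print(name, is_identifier(name))
```

Under these assumptions, the five examples from the manual are accepted, while _foo is rejected because its only word starts with an underscore.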
Keywords
The Edge language has the following keywords:
ge gt le lt eq ne
contains contains-word suffix prefix
my our macro use INIT as
rx wc qw phase END defer
func action
Variables
Variable names consist of two parts: a leading special character called a sigil and a following identifier. The sigil is used to denote the type of the variable. The following three different sigils are supported:
$ is for scalar variables
@ is for array variables
% is for hash variables
Scalar variables hold simple values like numbers, strings, booleans, and quantities.
Array variables are ordered list containers for simple values. When used in boolean contexts, array variables evaluate to true when they are non-empty, and false otherwise.
Hash variables are unordered containers of key-value pairs. When used in boolean contexts, hash variables evaluate to true when they are non-empty, and false otherwise.
Variables are usually declared by the my keyword.
Each scalar variable, in whatever scope it is declared, can only have one data type throughout its lifetime, and that data type must be specified when the variable is declared. There are 7 data types for scalar variables:
Str
Num
Bool
Time
Size
SizeRate
CountRate
The user must explicitly specify a type for the scalar variable like this:
my Num $count;
my Str $name;
An array variable must explicitly specify the data type of its elements, like this:
my Bool @bool-list;
my Time @time-array;
A hash variable must declare the data types of both the hash key and the hash value. The value type follows the scope keyword, and the key type follows the variable name in curly braces, like this:
my Num %city-weight{Str};
my Time %uri-released{Str};
Variable declarations may take an initial value too, as in:
my Num $count = 0;
my Str @domains = qw/ foo.com bar.blah.org /;
my Num %city-weight{Str} = (ShenZhen: 100, Beijing: 50, Shanghai: 30, Guangzhou: 10);
Request-Scoped Variables
Variables declared by the my keyword only have the scope of a code chunk. Every running phase in a request processing lifetime always has its own scope. To share variables across multiple phases of the same request lifetime, the user can use the our keyword in place of my to declare custom variables. For example:
our Bool $is-mobile;
our Num $my-count;
Rule-Scoped Variables
The user can also introduce custom rule-scoped variables via as expressions.
Special sub-match capturing variables, $1, $2, $3, etc., can also be introduced implicitly by using capturing groups (...) inside regex literals of a rule condition. Like variables introduced by as expressions, these capturing variables also have the scope of the containing rule.
Below is an example:
uri-prefix(rx/ \/ ( [a-z]{2} ) \/ ( [a-z]{2} ) /) =>
say("country: $1, lang: $2");
For request URI /us/en/read.html, this rule will trigger and produce the response body output:
country: us, lang: en
For multi-condition rules, each condition has its own set of $1, $2, etc. For example:
uri(rx{ /([a-z]*) });
uri(rx{ /([0-9]*) })
=>
say("result: $1");
For request GET /foo, we get:
result: foo
and for request GET /123, we get:
result: 123
In both cases, we can get a meaningful value in the $1 special variable.
Function Names and Function Calls
The language extensively uses functions for predicates and actions in rules. Users can also define their own functions if they want to.
Function names are represented by identifiers directly, no sigils involved.
Function calls are denoted by a function name followed by a pair of parentheses enclosing any arguments, as in:
say("hello, ", "world")
Function calls without any arguments can omit the parentheses. For example:
uri()
can be simplified to:
uri
Arguments can be passed either by position or by name. The say() builtin action function, for example, accepts positional arguments as the response body message pieces, as the previous example demonstrates. Named arguments are passed with the argument name and a colon character as the prefix, as in:
redirect(uri: "/foo", code: 302)
This redirect() call takes two named arguments, uri and code, which take the values "/foo" and 302, respectively.
Builtin functions may require certain arguments to be passed by names, and the others passed by positions. Please consult the documentation of specific builtin functions for the actual usage.
Function calls with at least one argument can be rewritten in a “method call” form with the first argument being the invocant. For example:
say("hello")
can be rewritten as:
"hello".say()
or even:
"hello".say
Function calls with more than one argument can also be rewritten in a similar way, for example:
say("hello", "world")
is semantically equivalent to:
"hello".say("world")
Literal Strings
Single-Quoted Strings
Single-quoted strings are literal string values enclosed by single quotes (''), as in
'foo 1234'
The characters $ and @ always keep their literal meaning. Scalar and array variables are never interpolated.
Only the following escaping sequences are supported in single-quoted strings:
\'
\\
Any other appearance of the \ character in a single-quoted string literal will be interpreted as a literal \.
Double-Quoted Strings
Double-quoted strings are literal string values enclosed by double quotes (""), as in
"hello, world!\n"
The following escaping sequences are supported in double-quoted strings:
\a
\b
\f
\v
\n
\r
\t
\\
\0
\$
\@
\%
\'
\"
Scalar and array variables can be interpolated into double-quoted string literals, as in:
"Hello, $name!"
"Names: @names"
If there are ambiguous characters coming after the interpolated variable names, then we can use curly braces to disambiguate, as in
"Hello, ${name}ism"
"Hello, @{names}ya"
Literal $ and @ characters in a double-quoted string must therefore be escaped by \ to avoid unwanted interpolation, as in:
"Hello, \$name!"
Numeric Constants
A numeric constant can be written as one of the following forms:
1527
3.14159
-32
-3.14
78e-3 (with a decimal exponent)
0xBEFF (hexadecimal)
0157 (octal)
Regex Literals
A regex literal specifies a Perl-compatible regular expression value. It is denoted by the keyword rx with a quoting structure. Below are some examples:
rx{ hello, \w+}
rx/ [0-9]+ /
rx( [a-zA-Z]* )
rx[ ^/abc ]
rx" \d+ - \d+ "
rx' ([^a-z][a-z]+) '
rx! ([^a-z][a-z]+) !
The user is free to use curly braces, slashes, parentheses, brackets, double quotes, single quotes, or exclamation marks in regex literals. They are all equivalent, except for which specific quoting characters need to be escaped inside the regex string. For example, in the rx(...) form, use of the slash character (/) inside the regex string does not require any escaping.
Whitespace characters in the regex value are not significant by default, except inside a character class construct (e.g., [a-z]). This is to encourage the user to format the regex string for better readability.
One or more options can be specified on the regex, for example:
rx:i/hello/
dictates a case-insensitive match of the pattern hello. Similarly:
rx:s/hello, world/
makes whitespace characters used in the pattern string become significant.
Multiple options can be specified at the same time by stacking them together, as in
rx:i:s/hello, world/
When no options are specified, we can also omit the rx prefix and just use slashes to indicate a regex literal, as in
/\w+/
/hello world/
In edgelang’s regex literals, the meta character . always matches any character, including the newline character ("\n"), and the special pattern \s always matches any whitespace character, including the newline.
In edgelang’s regex literals, scalar variable interpolation is supported, as in
my Str $foo = "hello";
rx:s/$foo, world/;
Note: Avoid using capturing groups in the value of an interpolated variable, as this will cause unexpected results.
Wildcard Literals
A wildcard literal specifies a string matching pattern using the UNIX-style wildcard syntax. It is denoted by the keyword wc with a quoting structure, for instance:
wc{foo???}
wc/*.foo.com/;
wc(/country/??/);
wc[/country/??/];
wc"[a-z].bar.org"
wc'[a-z].*.gov'
wc![a-z].*.gov!
As with regex literals, wildcard literals can also take flexible quoting characters.
Three wildcard meta patterns are supported: * for matching any sub-string, ? for matching any single character, and [...] for character classes.
One or more options can be specified on the wildcard, for example:
wc:i/hello/
dictates a case-insensitive match of the pattern hello.
Quoted Words
Quoted words provide a convenient way to specify a list of string literal values without typing too many string enclosing quotes.
They are denoted by the keyword qw and a subsequent flexible quoting construct.
For example:
qw/ foo bar baz /
is equivalent to:
"foo", "bar", "baz"
Like regex and wildcard literals, the user can choose from various quoting characters for the quoting construct used in quoted words, for instance:
qw{...}
qw[...]
qw(...)
qw'...'
qw"..."
qw!...!
Quantity Values
Quantity values with units are supported as first-class citizens. A quantity literal is specified by a number and a unit enclosed in square brackets. For example:
32 [kB/s]
is a quantity for “32 kilo-bytes per second”.
The following time units are supported:
s, sec, or second    second
ms                   millisecond
us                   microsecond
ns                   nanosecond
min                  minute (time)
h or hour            hour
d or day             day
month                month
year                 year
The r or req unit is for the number of requests.
The following data size units are supported:
B or Byte    byte
b or bit     bit
Data size units can take one of the following scale prefixes:
k          x 1000
K or Ki    x 1024
m          x 1000 x 1000
M or Mi    x 1024 x 1024
g          x 1000 x 1000 x 1000
G or Gi    x 1024 x 1024 x 1024
t          x 1000 x 1000 x 1000 x 1000
T or Ti    x 1024 x 1024 x 1024 x 1024
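To make the prefix table concrete, here is a small Python sketch (a model, not edgelang code) that expands a prefixed size unit into a number of bits. It assumes 1 byte equals 8 bits and that any listed prefix can be combined with any listed base unit, which the manual does not state explicitly. The names PREFIX, BASE_BITS, and size_in_bits are hypothetical.

```python
# Scale prefixes from the table: lowercase is decimal, uppercase is binary.
PREFIX = {
    "": 1,
    "k": 1000, "K": 1024, "Ki": 1024,
    "m": 1000**2, "M": 1024**2, "Mi": 1024**2,
    "g": 1000**3, "G": 1024**3, "Gi": 1024**3,
    "t": 1000**4, "T": 1024**4, "Ti": 1024**4,
}

# Base units expressed in bits (assumption: 1 byte = 8 bits).
BASE_BITS = {"B": 8, "Byte": 8, "b": 1, "bit": 1}

def size_in_bits(number, unit):
    """Expand e.g. (32, "kB") into a plain number of bits."""
    # Try longer base-unit names first so "bit" is preferred over "b".
    for base in sorted(BASE_BITS, key=len, reverse=True):
        prefix = unit[: -len(base)]
        if unit.endswith(base) and prefix in PREFIX:
            return number * PREFIX[prefix] * BASE_BITS[base]
    raise ValueError(f"unknown size unit: {unit}")

print(size_in_bits(32, "kB"))  # 32 x 1000 bytes, in bits
print(size_in_bits(1, "KB"))   # 1 x 1024 bytes, in bits
```

Note the case sensitivity: under the table above, kB and KB differ by a factor of 1.024.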
Compound units for rates can be formed by a data size unit or a request count unit, a time unit, and a connecting slash character, for example, kB/s and r/s.
Quantity values can be coerced into strings directly. For example, the action:
say(32 [hour])
gives the following response output:
32 [hour]
One can use arbitrary arithmetic expressions before the unit part, for example:
(1.5 + 2) [kB/s]
yields a quantity equivalent to 3.5 [kB/s].
The builtin function convert-unit() can be used to convert a quantity value’s unit to a new unit, as long as the new unit is compatible, without changing the quantity’s physical meaning. For example:
convert-unit(1 [hour], 'sec')
will result in the new quantity value, 3600 [sec], which is logically equivalent.
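The arithmetic behind such a conversion can be sketched in Python as follows. This is only a model of the idea, not the implementation of convert-unit(). The month and year units are left out because the manual does not say how long they are, and the names SECONDS_PER_UNIT and convert_time are hypothetical.

```python
from fractions import Fraction

# Length of each time unit in seconds (exact fractions avoid rounding).
SECONDS_PER_UNIT = {
    "ns": Fraction(1, 10**9),
    "us": Fraction(1, 10**6),
    "ms": Fraction(1, 1000),
    "s": Fraction(1), "sec": Fraction(1), "second": Fraction(1),
    "min": Fraction(60),
    "h": Fraction(3600), "hour": Fraction(3600),
    "d": Fraction(86400), "day": Fraction(86400),
}

def convert_time(number, from_unit, to_unit):
    """Re-express a time quantity in another unit without changing its meaning."""
    seconds = Fraction(number) * SECONDS_PER_UNIT[from_unit]
    return seconds / SECONDS_PER_UNIT[to_unit]

print(convert_time(1, "hour", "sec"))  # 3600
```

Converting a size quantity into a time unit would have no entry to look up in this model, which mirrors the requirement that the new unit be compatible.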
The to-num() builtin function can be used to extract the number part from a quantity value. For instance:
to-num(32 [hour])
will return the number 32.
Booleans
Boolean values are represented by the values of the builtin function calls true() and false(), respectively. All relational expressions evaluate to boolean values as well.
In edgelang, the following values are considered “conditional false”:
- number 0
- string “0”
- the value of false()
- an empty string
- an empty list or array
- an empty hash table
All other values are considered “conditional true”.
The function calls true() and false() are often abbreviated to just true and false.
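The list of "conditional false" values can be summarized by the following Python sketch. It is a model only: Python lists and dicts stand in for edgelang arrays and hashes, and the function name is_conditional_false is hypothetical. Python's own truthiness differs (for example, the string "0" is truthy in Python), so the checks are spelled out explicitly.

```python
def is_conditional_false(value):
    """Model of the manual's list of "conditional false" values."""
    if isinstance(value, bool):          # the value of false()
        return value is False
    if isinstance(value, (int, float)):  # the number 0
        return value == 0
    if isinstance(value, str):           # the empty string and the string "0"
        return value in ("", "0")
    if isinstance(value, (list, tuple, dict)):  # empty list, array, or hash
        return len(value) == 0
    return False                         # everything else is conditional true

for sample in [0, "0", "", [], {}, False, "hello", 1, [0]]:
    print(repr(sample), is_conditional_false(sample))
```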
Netaddr
Netaddr constants support the CIDR format and can be written in one of the following forms:
192.168.1.1
192.168.1.1/32 -- the same as 192.168.1.1
192.168.1.0/24
::ffff:192.1.56.10/96
Netaddr values can be used with relational operators.
Whatever
The special term * represents a whatever literal. Some builtin functions accept whatever literals as their arguments.
Comments
A comment starts with the character # and continues to the end of the current line. For example:
# this is a comment
Block comments are also supported, as in
#`(
This is a block
comment...
)
Note the backtick character and left parenthesis character directly following the # character. Parentheses can still be used inside block comments as long as they are properly paired. Nested parentheses are allowed too:
#`(
3 * (2 - (5 - 3))
)
Operators
The following operators are supported, in the order of their precedence:
Precedence Operators
0 post-circumfix [], {}, <>
1 **
2 unary +/-/~, as
3 * / % x
4 + - ~
5 << >>
6 &
7 | ^
8 unary !, > < <= >= == !=
contains contains-word prefix suffix
!contains !contains-word !prefix !suffix
eq ne lt le gt ge
9 ..
10 ?:
The user may use parentheses, (), to explicitly change the relative precedence or associativity of the operators in a single expression.
Arithmetic Operators
The language supports the following binary arithmetic operators:
** power
* multiplication
/ division
% modulo
+ addition
- subtraction
For example:
2 ** (3 * 2) # evaluates to 64
(7 - 2) * 5 # evaluates to 25
The unary prefix operators + and - are also supported, as in:
+(32 + 1) # evaluates to 33
-(3.15 * 2) # evaluates to -6.3
String Operators
The language supports the following binary string operators:
x repeat a string several times and concatenate the copies together
~ string concatenation
For instance:
"abc" x 3 # evaluates to "abcabcabc"
"hello" ~ "world" # evaluates to "helloworld"
Bit Operators
The following binary bit operators are supported:
<< shift left
>> shift right
& bit AND
| bit OR
^ bit XOR
The unary prefix operator ~ is for the bit NOT operation. Do not confuse it with the binary operator ~ for string concatenation.
Relational Operators
Every relational operator yields a boolean value for the current expression. Expressions using relational operators are called relational expressions.
The following binary operators are for matching netaddr values:
~~ contains
!~~ not contains
For example:
client-addr !~~ 192.168.10.0/24 =>
say("it does not come from the internal network");
first-x-forwarded-addr ~~ any(12.34.56.1/24, 23.45.1.1/16) =>
say("it comes from the blacklist");
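The membership test behind ~~ can be illustrated with Python's standard ipaddress module. This is a sketch of the idea, not edgelang's implementation. It assumes that a literal such as 12.34.56.1/24 denotes the whole 12.34.56.0/24 network, which the manual does not state explicitly, and the function name addr_in is hypothetical.

```python
import ipaddress

def addr_in(addr, *networks):
    """Model of `addr ~~ any(networks...)`: True if addr is in any network."""
    ip = ipaddress.ip_address(addr)
    # strict=False tolerates host bits in the literal, e.g. 12.34.56.1/24
    # (assumption: edgelang treats such a literal as the enclosing network).
    nets = [ipaddress.ip_network(n, strict=False) for n in networks]
    return any(ip in net for net in nets)

print(addr_in("192.168.10.7", "192.168.10.0/24"))                # True
print(addr_in("12.34.56.200", "12.34.56.1/24", "23.45.1.1/16"))  # True
print(addr_in("10.0.0.1", "192.168.10.0/24"))                    # False
```

The !~~ operator would then correspond to the negation of this result.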
The following binary operators compare the two operands numerically:
> greater than
< less than
<= less than or equal to
>= greater than or equal to
== equal to
!= not equal to
The following binary operators are for comparing string values alphabetically:
gt greater than
lt less than
le less than or equal to
ge greater than or equal to
eq equal to
ne not equal to
There are also 4 special binary string operators for pattern matching in a string value:
contains holds when the right hand side operand is "contained" in
the left hand side operand
contains-word holds when the right hand side operand is "contained" as
a word in the left hand side operand
prefix holds when the right hand side operand is a "prefix" of
the left hand side operand
suffix holds when the right hand side operand is a "suffix" of the
left hand side operand
The unary prefix operator ! negates the (boolean) value of the operand.
When the right hand side of the string comparison operators is a pattern like a wildcard or a regex value, then matching anchors are assumed in the pattern. For example:
uri eq rx{ /foo } =>
say("hit");
is equivalent to:
uri contains rx{ \A /foo \z } =>
say("hit");
where the regex pattern \A only matches the beginning of the string while \z only matches the end. The contains operator, on the other hand, assumes no implicit matching anchors.
Similarly, the contains-word operator assumes the surrounding \b regex anchor on both sides of the user regex.
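The implicit anchoring can be modeled with Python's re module. This is an approximation rather than edgelang's regex engine: re.VERBOSE and re.DOTALL are used to imitate the default whitespace and dot behavior described for regex literals (re.VERBOSE additionally treats # as a comment marker, which may differ). The anchoring for prefix and suffix is an assumption inferred from the operator names, and all op_* function names are hypothetical.

```python
import re

# Whitespace is insignificant and "." matches newlines, as for rx literals.
FLAGS = re.VERBOSE | re.DOTALL

def op_eq(subject, pattern):
    # eq: the pattern must match the whole string (implicit \A ... \z)
    return re.fullmatch(pattern, subject, FLAGS) is not None

def op_contains(subject, pattern):
    # contains: no implicit anchors
    return re.search(pattern, subject, FLAGS) is not None

def op_prefix(subject, pattern):
    # prefix: anchored at the beginning only (assumption)
    return re.match(pattern, subject, FLAGS) is not None

def op_suffix(subject, pattern):
    # suffix: anchored at the end only (assumption); \Z is end of string
    return re.search(rf"(?:{pattern})\Z", subject, FLAGS) is not None

def op_contains_word(subject, pattern):
    # contains-word: the pattern is surrounded by \b on both sides
    return re.search(rf"\b(?:{pattern})\b", subject, FLAGS) is not None

print(op_eq("/foo", r" /foo "))            # True
print(op_eq("/foo/bar", r" /foo "))        # False
print(op_contains("/a/foo/b", r" /foo "))  # True
```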
Range Operator
The binary operator .. can be used to form a range expression, as in:
1 .. 5 # equivalent to 1, 2, 3, 4, 5
'a'..'d' # equivalent to 'a', 'b', 'c', 'd'
The value of a range expression is a flattened list of all the individual values in that range.
Ternary Operator
The ternary operator ?: can be used to conditionally choose between two user expressions according to a user condition.
For example:
$a < 3 ? $a + 1 : $a
This expression evaluates to the value of $a + 1 when the expression $a < 3 is true, or evaluates to $a otherwise.
Subscript Operators
The post-circumfix operator [] can be used to subscript an array.
For example:
my Str @names = ('Tom', 'Bob', 'John');
true =>
say(@names[0]), # output Tom
say(@names[1]), # output Bob
say(@names[2]); # output John
Negative indexes are used to access elements from the end of the array; for instance, -1 is for the last element, -2 is for the second to last one, and so on.
Similarly, the post-circumfix operator {} is used to index a hash table, as in:
my Num %scores{Str} = (Tom: 78, Bob: 100, John: 91);
true =>
say(%scores{'Bob'}); # output 100
The post-circumfix operator <> can also be used to index a hash table via literal string keys; for example, %scores<John> is equivalent to %scores{"John"}.
Rules
Edgelang is a rule-based language. Essentially every edgelang program consists of groups of rules.
Basic Rule Layout
An edgelang rule has two basic parts, a condition and a consequent. The condition and the consequent are connected by =>, and the whole rule is terminated by a semicolon character. The basic form of a rule is like this:
<condition> => <consequent>;
The condition part of the rule can take one or more relational expressions, like resp-status == 200. All the relational expressions are connected by the comma character (,), which makes all the relational expressions AND’d together; that is, all the relational expressions must hold true for the whole condition to be true. The conditions should have no side effects, and this property is enforced by the edgelang compiler. For this reason, the order of evaluation of the relational expressions in the same condition does not change the result of the whole condition.
The consequent part usually contains one or more actions. Each action can have side effects like changing some request aspects, performing a 302 redirect, or changing the route the current request will go in the backend. It is also possible to specify a full block of rules as a single action (see the Action Blocks section).
Below is a simple edgelang rule:
uri("/foo") =>
redirect(uri: "/bar", code: 302);
In the condition part, uri("/foo") is a relational expression. The uri() function taking some arguments is a predicate, which means it only returns a true or false value. uri("/foo") does the following: if the current request URI matches the /foo literal string precisely, then return true; otherwise return false. We have one action in the consequent, i.e., the redirect() function call. This action generates a 302 HTTP response initiating an external redirect to the /bar URI on the same host. It is worth noting that the uri() function takes a positional argument while the redirect() function takes 2 named arguments. Each edgelang builtin function determines for itself whether it accepts positional arguments, named arguments, or both.
Edgelang is a free format language, so you can use whitespace characters freely. The indentation used before the consequent part in the example above is not significant but just for aesthetic considerations. It is totally valid to write the whole rule in a single line, for example:
uri("/foo") => redirect(uri: "/bar", code: 302);
When the uri() function takes no argument, it returns the current request URI as a string, for example:
uri() eq "/foo" =>
redirect(uri: "/bar", code: 302);
The relational expression uri() eq "/foo" in this example is equivalent to the uri("/foo") predicate used previously. The eq part is a binary comparison operator that tests whether the two string values on its two sides are exactly the same.
It is worth mentioning that edgelang function calls without any arguments can omit their parentheses, so uri() can be abbreviated to uri, as in:
uri eq "/foo" =>
redirect(uri: "/bar", code: 302);
Multiple Relational Expressions
The user can also specify multiple relational expressions in a single condition, for instance:
uri("/foo"), uri-arg("n") < 1 =>
exit(403);
Here we have one more relational expression in the condition, i.e., uri-arg("n") < 1, which matches when the URI argument, n, takes a value less than the number 1. We use a different action, exit(403), in this example, which sends a “403 Forbidden” response to the client immediately when it is executed. Note the comma between the two relational expressions of the condition: it means AND, and the relational expressions on both sides of the comma must be true at the same time for the whole condition to be true.
The user can specify even more relational expressions in the same condition, as in:
uri("/foo"), uri-arg("n") < 1, user-agent() contains "Chrome" =>
exit(403);
We have a 3rd relational expression, which tests whether the value of the User-Agent request header contains the sub-string Chrome.
Multiple Conditions
Edgelang rules can actually take multiple parallel conditions, connected by the semicolon operator. These conditions are logically OR’d together for the current rule.
For example:
uri("/foo"), uri-arg("n") < 1;
uri("/bar"), uri-arg("n") >= 4
=>
exit(403);
When either of these 2 conditions matches, the rule matches. When both of the conditions match, the rule also matches, of course.
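Put together, a rule's condition part is an OR of ANDs: commas join relational expressions within one condition, and semicolons join alternative conditions. The following Python sketch models just that boolean structure. The request dictionary and its keys uri and n are hypothetical stand-ins for the uri() and uri-arg("n") builtins.

```python
def rule_matches(conditions, request):
    """conditions is a list of conditions; each is a list of predicates.

    Predicates in one condition are AND'd; the conditions are OR'd.
    """
    return any(all(pred(request) for pred in cond) for cond in conditions)

# Model of:  uri("/foo"), uri-arg("n") < 1;  uri("/bar"), uri-arg("n") >= 4
conditions = [
    [lambda r: r["uri"] == "/foo", lambda r: r["n"] < 1],
    [lambda r: r["uri"] == "/bar", lambda r: r["n"] >= 4],
]

print(rule_matches(conditions, {"uri": "/foo", "n": 0}))  # True
print(rule_matches(conditions, {"uri": "/foo", "n": 4}))  # False
print(rule_matches(conditions, {"uri": "/bar", "n": 4}))  # True
```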
Multiple Actions
It is possible to specify multiple actions in the same rule consequent. Consider the following example:
uri("/foo") =>
errlog(level: "warn", "rule matched!"),
say("response body data with an automatic trailing newline"),
say("more body data...");
This example has 3 actions in the consequent. The first action calls the errlog() builtin function and generates an error log message to the server error log file with the error log level warn. The latter 2 actions call the say() function to output response body data for the current request.
Unconditional Rules
Some rules may choose to run their actions unconditionally. An edgelang rule, however, always requires a condition part. To achieve the effect of unconditional rule triggering, the user can use the always-true predicate true() as the sole relational expression in the condition, as in:
true() =>
say("hello world");
In this rule, the action say() always runs.
Because edgelang function calls without any arguments can omit their parentheses, it is preferable to write true instead of true(), as in:
true =>
say("hello world");
Multiple Rules
Multiple rules specified in the same block are executed in series. The rule written first will be executed first.
Consider the following example:
uri("/foo") =>
say("hello");
uri-arg("n") > 3 =>
say("world");
For the request GET /foo?n=4, we will get a 200 HTTP response with the body data:
hello
world
The conditions of multiple rules may get optimized by the edgelang compiler, however, to be matched at the same time, and may be evaluated even before any rules are actually executed. This happens when the edgelang compiler finds it safe in doing so.
Blocks
Blocks are usually formed by a pair of curly braces ({}), which also form a new scope for variables. Each edgelang program has an implicit top-level block.
In the following example, we have two different $a variables since they each belong to a different block (or scope):
my Num $a = 3;
{
my Str $a = "hello";
}
true =>
say("a = $a"); # output `3` instead of `hello`
Rules are also lexical to the containing block, just like variables. Blocks can be used to group closely related rules together as a whole. In such a setting, some early-executed rules may use the special action done() to skip all the subsequent rules in the same block. The following example demonstrates this:
{
uri("/test") =>
print("hello"),
done;
true =>
print("howdy");
}
true =>
say(", outside!");
For request GET /test, the response body would be hello, outside!. Note how the done action in the 1st rule skips the execution of the 2nd rule. On the other hand, request GET /foo would yield the output howdy, outside!, since the 1st rule does not match.
However, the done() action used in the middle of a rule consequent does not skip subsequent actions in the same consequent. It only affects subsequent rules in the same block.
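The control flow of done can be simulated with a short Python model. This is a sketch of the semantics as described above, not the edgelang runtime. The names DONE, run_block, emit, and block are hypothetical, and the model assumes that print() emits text as-is while say() appends a newline.

```python
DONE = object()  # marker standing in for the done action

def run_block(rules, request, out):
    """Run rules in order; a consequent containing DONE ends the block
    once that consequent has finished."""
    for condition, actions in rules:
        if not condition(request):
            continue
        stop = False
        for action in actions:
            if action is DONE:
                stop = True          # remember it, but keep running actions
            else:
                action(request, out)
        if stop:
            break                    # skip the remaining rules of this block

def emit(text):
    return lambda request, out: out.append(text)

def block(rules):
    # A nested block: its done only ends the nested block itself.
    return lambda request, out: run_block(rules, request, out)

inner = [
    (lambda r: r["uri"] == "/test", [emit("hello"), DONE]),
    (lambda r: True, [emit("howdy")]),
]
program = [
    (lambda r: True, [block(inner)]),
    (lambda r: True, [emit(", outside!\n")]),
]

def respond(uri):
    out = []
    run_block(program, {"uri": uri}, out)
    return "".join(out)

print(respond("/test"), end="")  # hello, outside!
print(respond("/foo"), end="")   # howdy, outside!
```

In this model, done placed in the middle of a consequent only sets a flag, so later actions in the same consequent still run, matching the behavior described above.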
Blocks can be nested to an arbitrary depth, as in:
uri-arg("a") => say("outer-most");
{
true => say("2nd level");
{
uri("/foo") => say("3rd level");
}
}
Action Blocks
Blocks can also be used as actions in the rule consequent. Such blocks are called action blocks. This can be used to specify nested rules. For example:
uri-prefix("/foo/") =>
{
uri-arg("a") < 0 =>
say("negative"),
done;
uri-arg("a") == 0 =>
say("zero"),
done;
true =>
say("positive");
};
In this rule, when the condition uri-prefix("/foo/") is matched, the 3 rules inside the action block are then inspected in series. On the other hand, when the outermost condition does not match, the execution flow will never look at the inner rules at all. This is a very convenient way of factoring out common conditions of several rules. It also helps the compiler generate more efficient machine code.
Other kinds of actions can be mixed with such action blocks in the same rule consequent, for instance:
uri-prefix("/foo/") =>
{
uri-arg("a") < 0 => say("negative!");
},
done;
As Expressions
The user can use the as expressions to alias values of expressions into custom variables in rule conditions. These variables can later be referenced in subsequent parts of the condition and/or the consequent part of the rule (i.e., being used in actions).
These variables’ scope is limited to their containing rules.
For example:
uri-prefix("/security01/" as $prefix) =>
rm-uri-prefix($prefix),
set-req-host("sec.foo.com");
Here we alias the value of the expression "/security01/" to our custom scalar variable $prefix, and then we reference this variable in our rm-uri-prefix() action without duplicating the constant string value, thus reducing the risk of introducing typos in the constant string values. (If we make a typo in the variable name, for example, we would get a compiler error complaining about the lack of a variable declaration.) So using as expressions to make variable aliases makes the code not only shorter, but also safer.
We can also use as expressions to alias the values of arbitrary expressions. For instance:
uri-arg("uid") as $uid, looks-like-num($uid), $uid > 0 =>
say("found uid: $uid");
In this rule, we alias the value of the dynamic expression, uri-arg("uid"), to a custom variable $uid, and then reference this value in the later relational expressions of the condition, as well as in the action of the rule.
Assignment Actions
The assignment operator = is used to specify an action that assigns a value to a variable or an expression that can be an lvalue. For example:
my Num $a;
true =>
$a = 3;
Like all other actions, an assignment expression has no value of its own, so it is not allowed to embed an assignment expression in other expressions. For example, the following code will yield a compile-time error:
? my Num $a;
?
? true =>
? say($a = 3);
This is because the assignment $a = 3 returns no value and it can only be used as a standalone action.
The assignment:
$a = $a + 3
can be simplified using the operator +=:
$a += 3
Similarly, *=, /=, %=, x=, and ~= are provided for the binary operators *, /, %, x, and ~, respectively.
Furthermore, the postfix operator ++ can be used to simplify the += 1 case.
For example:
$a++
is equivalent to $a += 1 or $a = $a + 1. Similarly, the postfix operator -- is provided as a shorthand for -= 1.
Like the standard = operator, all these assignment variations do not take any values themselves and can only be used as standalone actions.
Running Phases
OpenResty® processes each client request in different running phases. The following phases are currently supported. Edgelang code runs in the rewrite phase by default; the other phases must be used with defer blocks (see the Defer Blocks section).
rewrite        For request rewrites and redirects
resp-header    When the response header is ready
resp-body      The response body phase
More running phases may be added in the future.
Defer Blocks
Defer blocks are a special kind of code block. Their execution is delayed until the specified phase starts. The currently supported phases are resp-header and resp-body.
Note: If resp-body defer blocks are used, the Content-Length response header will be reset and chunked mode will be used. Defer blocks cannot be nested inside other defer blocks.
Consider the following examples:
true =>
defer resp-header {
errlog(level: 'error', 'defer log in resp-header');
};
true =>
defer resp-body {
errlog(level: 'error', 'defer log in resp-body');
};
Junctions
A junction is a single value that is equivalent to multiple values. The builtin functions any, all, and none are used to construct junctions from lists of values or arrays. Junctions provide a very concise way to express relational constraints between lists of values. For example, to test if any element in array @foo is greater than 3, we can write:
any(@foo) > 3
Or if we want to test if all the elements are greater than 3:
all(@foo) > 3
The user can also specify multiple discrete values directly, for instance:
any(1, 3, 5) <= 1
It is also possible to put junctions on both sides of the relational operator:
any(2, 3) > all(-1, 1)
To test if a value does not appear in a list of values, one can write:
$a eq none('foo', 'bar', 'baz')
Junctions can only be used in relational expressions.
Implicit junctions are automatically created via the any() function when multiple values are used on one side of the relational operator. For example, when an array value appears on one side of the relational operator:
@foo > 3
which is equivalent to:
any(@foo) > 3
Similarly, for function calls like uri-arg() which may return multiple values:
uri-arg("name") eq 'admin'
This is equivalent to:
any(uri-arg("name")) eq 'admin'
Use of negative relational operators with junctions is potentially problematic if interpreted naively. Consider the following example:
$a != any(1, 2, 3)
This really means the following for an English speaker:
!($a == any(1, 2, 3))
To avoid such surprises for English speakers, edgelang automatically does this transformation for the user when the relational operators ne or != are used and any() is used on the right hand side.
Junctions can only be used on the top level of a relational expression. Use of junctions as function call arguments, for example, is not allowed.
Nested junctions are not supported yet.
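The junction semantics described in this section can be approximated by the following Python sketch. It is a model, not edgelang's implementation. It only handles a junction on one side of the operator, because the manual does not state how two junctions combine. The Junction class and compare function are hypothetical names, and Python's operator.eq and operator.ne stand in for both the numeric and the string comparison operators.

```python
import operator

class Junction:
    def __init__(self, kind, values):
        self.kind = kind             # "any", "all", or "none"
        self.values = list(values)

    def collapse(self, results):
        """Combine the per-value comparison results into one boolean."""
        results = list(results)
        if self.kind == "any":
            return any(results)
        if self.kind == "all":
            return all(results)
        return not any(results)      # "none"

def compare(left, op, right):
    """Evaluate `left op right` where at most one side is a Junction."""
    if isinstance(left, Junction):
        return left.collapse(op(v, right) for v in left.values)
    if isinstance(right, Junction):
        if op is operator.ne and right.kind == "any":
            # The manual's rewrite: $a != any(...) means !($a == any(...))
            return not compare(left, operator.eq, right)
        return right.collapse(op(left, v) for v in right.values)
    return op(left, right)

foo = [1, 5, 2]
print(compare(Junction("any", foo), operator.gt, 3))        # True
print(compare(Junction("all", foo), operator.gt, 3))        # False
print(compare(2, operator.ne, Junction("any", [1, 2, 3])))  # False
```

The last line shows why the rewrite matters: a naive reading of 2 != any(1, 2, 3) would hold as soon as 2 differs from 1.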
Virtual Servers
A virtual server is a single domain name or wildcard domain name that represents a separate “host”. It is not uncommon for many virtual servers or domain names to share the same OpenResty® server instance.
Each edgelang program is associated with a single virtual server, and virtual servers are usually separated and isolated from each other. Virtual servers are usually not specified directly inside the edgelang source code, but rather are specified externally when invoking the edgelang compiler. In the context of the OpenResty® Edge platform, a virtual server is represented by the Application concept, and such information is automatically fed into the edgelang compiler when OpenResty® Edge invokes it.
When a wildcard domain name, like *.foo.com, is specified as the virtual server, the user can use the host() builtin predicate function to specify conditions on concrete sub-domains. For example:
# common rules for *.foo.com go here...
host("api.foo.com") => {
# rules for the sub-domain api.foo.com go here...
}, done;
host("blog.foo.com") => {
# rules for the sub-domain blog.foo.com go here...
}, done;
User-Defined Actions
The user can define their own parametric actions by grouping some other actions together. The general syntax for defining custom actions is as follows:
action <name>(<arg>...) =
<action1>,
<action2>,
...
<actionN>;
Parameters must be declared with the type information, just like local variables declared by the my keyword.
For example:
action say-hi(Str $who) =
say("hi, $who!"),
exit(200);
true =>
say-hi("Tom");
An HTTP request will trigger an HTTP 200 response with the following body:
hi, Tom!
Multiple parameters can also be specified.
User-defined actions are a great way to introduce your own vocabulary to the actions that you can use in rule consequents.
Recursive actions can also be defined, as in:
action count-down(Num $n) =
say($n),
$n > 0 ? count-down($n - 1) : say("done");
true => count-down(5);
This will yield the response body output as follows:
5
4
3
2
1
0
done
The maximum depth of recursion is carefully limited by the compiler to avoid infinite recursions.
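To see why the output includes `0` before `done`, the recursion can be traced with a Python model of the `count-down` action. This is only a sketch of the control flow. `count_down` and the `out` list are illustrative stand-ins for the action and the response body.

```python
def count_down(n, out):
    # say($n) always runs first, even when n is 0.
    out.append(str(n))
    if n > 0:
        count_down(n - 1, out)
    else:
        out.append("done")
    return out

lines = count_down(5, [])
print("\n".join(lines))
```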
User-Defined Functions
The user can define their own functions which can be used both in rule conditions and consequents. The general syntax for defining custom functions is as follows:
func <name>(<arg>...) = <expression>
Parameters must be declared with the type information, just like local variables declared by the my keyword.
The value of <expression>
after the =
sign is the value of the whole
function.
Consider the following example:
func x-powered-by () =
resp-header("X-Powered-By");
x-powered-by contains rx:i/\b php \b/ =>
errlog(level: "notice", "found a PHP server: ", x-powered-by);
This example defines its own function x-powered-by
which takes no argument
and is evaluated to the value of the expression resp-header("X-Powered-By")
.
User-defined functions can also take parameters. Consider the following example:
func bit-is-set(Num $num, Num $pos) =
$num & (1 << ($pos - 1));
bit-is-set(3, 1)
=>
say("the 1st bit is set!");
The bit-is-set
function takes a number as the first argument and a position
of the bit in that number to test. It returns true when the specified bit
is set and returns false otherwise.
It is worth mentioning we already have a test-bit builtin
predicate function that does exactly what the bit-is-set()
user function
does in this example.
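The bit arithmetic in `bit-is-set` can be verified with an equivalent Python function. This is a sketch of the same expression, with the result made an explicit boolean. The name `bit_is_set` simply mirrors the edgelang example.

```python
def bit_is_set(num: int, pos: int) -> bool:
    # pos is 1-based: position 1 is the least significant bit.
    return (num & (1 << (pos - 1))) != 0

print(bit_is_set(3, 1))  # 3 is binary 11, so bit 1 is set
print(bit_is_set(3, 3))  # bit 3 of binary 011 is not set
```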
Modules
Edge modules are reusable files of edgelang source that can be shared between various different edgelang programs. Modules usually contain various definitions of user-defined actions and/or user-defined functions.
To load a module, the user edgelang program can use the use
statement,
as follows:
use Foo;
The edgelang compiler will search for a file named Foo.edge in the module search paths. The user can specify the -I PATH option on the edgelang compiler command line to add custom paths to the default module search paths. For example:
edgelang -I /foo/bar -I /baz/blah test.edge
The user can also specify edge modules to preload on the command line with
the -M NAME
option, as in:
edgelang -I /path/to/modules -M Foo -M Bar test.edge
Calling External Code
Edgelang supports calling foreign libraries written in the target language. For example, when the target language is Lua, the user can call into arbitrary Lua modules from within their edgelang program if they have enough permissions.
Calling external code is usually achieved with the foreign-call() builtin function. It takes the following named arguments:
module
The name of the foreign module. In the case of Lua, it’s the name of the Lua module. This argument is optional.
func
The name of the function in that module or in the default namespace of the target language. This argument is required.
The positional arguments (if any) will be passed directly to the specified function in the specified module (if any).
Below is an example of calling the random
function in the standard Lua
module math
:
true =>
say(foreign-call(module: "math", func: "random", 1, 10));
The foreign-call()
call in this example is equivalent to the following
Lua expression:
math.random(1, 10)
To call external C library code, the user can first write a simple Lua wrapper module using LuaJIT’s excellent FFI, and then call into this Lua module as usual.
Importing External Code
Please refer to this document to create global Lua modules.
Builtin Predicate Functions
The Edge language provides the following builtin predicate functions.
cache-status
syntax: cache-status()
Returns the upstream cache status.
For example:
true =>
defer resp-header {
set-resp-header("Cache-Status", cache-status);
};
cache-creation-time
syntax: cache-creation-time()
phase: resp-header
Returns the time since the upstream’s cache was created.
client-addr
syntax: client-addr()
Returns the client address.
client-asn
syntax: client-asn()
syntax: client-asn(asn1, asn2, ...)
Returns true when the client address belongs to one of the autonomous system numbers specified in the arguments; returns false otherwise.
For example:
client-asn("7018", "8023") =>
say("Welcome, our dear guest!");
When no arguments are specified, it returns the current autonomous system number for the client.
For example:
client-asn eq "7018" =>
say("Welcome, our dear guest!");
client-continent
syntax: client-continent()
syntax: client-continent(continent1, continent2, ...)
Returns true when the client address is from one of the continents specified in the arguments; returns false otherwise.
The continent codes are as follows:
AF = Africa
AS = Asia
EU = Europe
NA = North America
SA = South America
OC = Oceania
AN = Antarctica
For example:
client-continent("AS") =>
say("Welcome, our dear guest from Asia Region!");
When no arguments are specified, it returns the continent code for the client:
client-continent eq "AS" =>
say("Welcome, our dear guest from Asia Region!");
client-country
syntax: client-country()
syntax: client-country(country1, country2, ...)
Returns true when the client address is from one of the countries specified in the arguments; returns false otherwise.
You can get all two-letter country codes from Wikipedia.
The following are some typical country codes:
US = United States of America
CA = Canada
CN = China
RU = Russian Federation
JP = Japan
IN = India
FR = France
DE = Germany
For example:
client-country("CN") =>
say("Welcome, our dear guest from China!");
When no arguments are specified, it returns the country code for the client:
client-country eq "CN" =>
say("Welcome, our dear guest from China!");
client-port
syntax: client-port()
Returns the client port.
client-province
syntax: client-province()
syntax: client-province(province1, province2, ...)
Returns true when the client address is from one of the provinces specified in the arguments; returns false otherwise.
For example:
client-province("California") =>
say("Welcome, our dear guest from California!");
When no arguments are specified, it will return the current province name for the client:
client-province eq "California" =>
say("Welcome, our dear guest from California!");
Check all Chinese province codes here.
client-city
syntax: client-city()
syntax: client-city(city1, city2, ...)
Returns true when the client address is from one of the cities specified in the arguments; returns false otherwise.
For example:
client-city("Los Angeles") =>
say("Welcome, our dear guest from Los Angeles!");
When no arguments are specified, it will return the current city name for the client:
client-city eq "Los Angeles" =>
say("Welcome, our dear guest from Los Angeles!");
client-isp
syntax: client-isp()
syntax: client-isp(isp1, isp2, ...)
Returns true when the client's ISP is one of the ISPs specified in the arguments; returns false otherwise.
For example:
client-isp("ChinaTelecom") =>
say("our guest's ISP is ChinaTelecom!");
When no arguments are specified, it will return the current ISP name for the client:
client-isp eq "ChinaTelecom" =>
say("our guest's ISP is ChinaTelecom!");
Check some typical Chinese ISP codes here.
client-org
syntax: client-org()
syntax: client-org(org1, org2, ...)
Returns true when the client address belongs to one of the autonomous system organizations specified in the arguments; returns false otherwise.
For example:
client-org("Korea Telecom", "AT&T Services, Inc.") =>
say("Welcome, our dear guest from ", client-org());
When no arguments are specified, it returns the current autonomous system organization name for the client.
For example:
client-org eq "AT&T Services, Inc." =>
say("Welcome, our dear guest from AT&T!");
client-subnet
syntax: client-subnet()
subsystem: dns
Returns the client subnet carried in the DNS query, or nil when the DNS query contains no client subnet.
It supports netaddr constants, for example:
client-subnet ~~ 127.0.0.1/24 =>
errlog("match");
NOTICE: Only IPv4 subnets in DNS queries are parsed for now.
decode-base64
syntax: decode-base64(digest)
Decodes the input string argument as a base64 digest.
decode-hex
syntax: decode-hex(str)
Decodes the input string argument as a hexadecimal digest.
defined
syntax: defined(val)
Returns true when the argument value is defined; returns false otherwise.
disable-convert-head-method-to-get
syntax: disable-convert-head-method-to-get()
Disables the conversion of the “HEAD” method to “GET” for caching.
encode-base64
syntax: encode-base64(str)
Encodes the input string argument to a base64 digest.
false
syntax: false()
Returns the boolean false value.
first-x-forwarded-addr
syntax: first-x-forwarded-addr()
Returns the first address in the X-Forwarded-For
request header.
host
syntax: host()
syntax: host(pattern...)
When no arguments are specified, returns the value of the host name specified by the request.
When arguments are specified, this function returns true when the request
host name matches any of the patterns specified by the arguments using
the eq
operator. The argument pattern can be one of regexes, literal
strings, or wildcards.
Below is an example:
host("foo.com", wc"*.foo.com") =>
say("hit!");
This is equivalent to the following form:
host eq any("foo.com", wc"*.foo.com") =>
say("hit!");
The former style is recommended since it is simpler.
http-time
syntax: http-time()
syntax: http-time(quantity-val)
Generates an HTTP-time formatted string for response header values like Last-Modified and Expires.
When no argument is specified, the current time is used by default.
When an argument is specified, it must be a quantity-typed value with a time unit.
Below are some examples:
# 1st
true =>
http-time.say;
# 2nd
true =>
say(http-time(now));
# 3rd
true =>
say(http-time(1513068009 [s]));
The return value is a string like Tue, 12 Dec 2017 08:40:09 GMT.
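The relationship between the epoch value in the third example and the resulting string can be reproduced in Python. This sketch uses the standard library's `email.utils.formatdate` to produce the same HTTP date format. The function name `http_time` only mirrors the edgelang builtin and is not its implementation.

```python
from email.utils import formatdate

def http_time(epoch_seconds: float) -> str:
    # usegmt=True yields the "GMT" suffix used in HTTP date headers.
    return formatdate(epoch_seconds, usegmt=True)

print(http_time(1513068009))  # Tue, 12 Dec 2017 08:40:09 GMT
```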
ip-asn
syntax: ip-asn(netaddr)
Returns the autonomous system number for the specified IP address.
Below is an example:
ip-asn("127.0.0.1") eq "7018" =>
say("Welcome, our dear guest!");
ip-continent
syntax: ip-continent(netaddr)
subsystem: dns
Returns the continent name for the specified netaddr, which can be parsed from DNS queries.
Below is an example:
ip-continent(client-subnet) eq 'AP' =>
errlog("match");
ip-country
syntax: ip-country(netaddr)
subsystem: dns
Returns the country name for the specified netaddr, which can be parsed from DNS queries.
Below is an example:
ip-country(client-subnet) eq 'CN' =>
errlog("match");
ip-province
syntax: ip-province(netaddr)
subsystem: dns
Returns the province name for the specified netaddr, which can be parsed from DNS queries.
Below is an example:
ip-province(client-subnet) eq 'Guangdong' =>
errlog("match");
ip-city
syntax: ip-city(netaddr)
subsystem: dns
Returns the city name for the specified netaddr, which can be parsed from DNS queries.
Below is an example:
ip-city(client-subnet) eq 'Zhuhai' =>
errlog("match");
ip-isp
syntax: ip-isp(netaddr)
subsystem: dns
Returns the ISP name for the specified netaddr, which can be parsed from DNS queries.
Below is an example:
ip-isp(client-subnet) eq 'ChinaTelecom' =>
errlog("match");
ip-org
syntax: ip-org(netaddr)
Returns the autonomous system organization name for the specified IP address.
Below is an example:
ip-org("127.0.0.1") eq "AT&T Services, Inc." =>
say("Welcome, our dear guest from AT&T!");
is-empty
syntax: is-empty(value)
Returns true when the argument value is empty (not defined, empty string or true value); returns false otherwise.
inject-csrf-token
syntax: inject-csrf-token()
Note: This feature only applies to form requests. If an AJAX request is used on an HTML page, the CSRF token will not be injected successfully.
This action adds a piece of JavaScript code to the end of the response content when the Content-Type is text/html. This code automatically adds the _edge_csrf_token parameter to the form request parameters on the page so that the token is carried when the form request is initiated. In conjunction with the validate-csrf-token action, CSRF protection can be implemented. This action can only be used in the defer block of resp-body. Because the response body is modified, the Accept-Encoding request header also needs to be removed so that the response body is not encoded (for example, compressed).
The following is an example of implementing CSRF protection:
my Str $csrf-res;
true =>
rm-req-header("Accept-Encoding"),
defer resp-body {
inject-csrf-token();
},
$csrf-res = validate-csrf-token(3600),
{
$csrf-res ne "ok" =>
waflog($csrf-res, action: "block", rule-name: "csrf_protection"),
exit(403);
};
last-x-forwarded-addr
syntax: last-x-forwarded-addr()
Returns the last address in the X-Forwarded-For
request header.
looks-like-int
syntax: looks-like-int(value)
Returns true when the argument value looks like an integer, i.e., when it is a string whose content looks like an integer, an integer value, or a number with a zero decimal part. Returns false otherwise.
The following calls will all yield true:
looks-like-int(32)
looks-like-int(3.00)
looks-like-int("561")
looks-like-int('0')
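The rule above can be approximated by the following Python sketch. It is a model, not the builtin's implementation. In particular, the accepted string format (an optional sign followed by decimal digits, with no surrounding whitespace) and the handling of booleans are assumptions that the manual does not spell out.

```python
import re

# Assumption: an integer-looking string is an optional sign plus decimal digits.
_INT_RE = re.compile(r"[+-]?\d+")

def looks_like_int(value) -> bool:
    if isinstance(value, bool):
        return False  # assumption: booleans are not treated as integers
    if isinstance(value, int):
        return True
    if isinstance(value, float):
        return value.is_integer()  # a number with a zero decimal part
    if isinstance(value, str):
        return _INT_RE.fullmatch(value) is not None
    return False

for sample in (32, 3.00, "561", "0", 3.14, "abc"):
    print(repr(sample), looks_like_int(sample))
```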
looks-like-num
syntax: looks-like-num(value)
Returns true when the argument value looks like a number, i.e., either a string value whose content looks like a number or the value itself is a number.
The following calls will all yield true:
looks-like-num(3.14)
looks-like-num("-532.3")
lower-case
syntax: lower-case(value)
Returns a string with all the characters in the string argument converted to lower-case letters.
md5-hex
syntax: md5-hex(value)
Returns a hexadecimal representation of the MD5 digest of the argument value.
escape-uri
syntax: escape-uri(str)
Escapes str as a URI component.
unescape-uri
syntax: unescape-uri(str)
Unescapes str as an escaped URI component.
str-len
syntax: str-len(str)
Returns the length of str.
modsec-amp
syntax: modsec-amp(var)
This function behaves like the “&” operator in ModSecurity rules:
- When the incoming variable is nil, 0 is returned.
- When the incoming variable is a scalar and is not nil, 1 is returned.
- When the incoming variable is an array and is not nil, the length of the array is returned.
my Num $a;
my Str $b = "hello";
my Str @c = ("hello", "world");
true =>
say(modsec-amp($a)),
say(modsec-amp($b)),
say(modsec-amp(@c)),
done;
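The three cases can be summarized in a short Python model. This is a sketch of the counting rule only. `None` stands in for edgelang's nil and a Python list stands in for an array, which are modeling assumptions.

```python
def modsec_amp(var) -> int:
    if var is None:                     # nil
        return 0
    if isinstance(var, (list, tuple)):  # array: number of elements
        return len(var)
    return 1                            # any non-nil scalar

print(modsec_amp(None))
print(modsec_amp("hello"))
print(modsec_amp(["hello", "world"]))
```

Under this model, and assuming that an uninitialized `$a` is nil, the three calls in the edgelang example above would yield 0, 1, and 2 respectively.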
match-ip-list
syntax: match-ip-list(name: IP_LIST_NAME, IP)
Returns true if the IP address matches one of the entries in the IP list; otherwise, returns false.
IP lists need to be created in advance, either in the application or in the global configuration. When specifying an application IP list, prefix the IP list name with app:. When specifying a global IP list, prefix the IP list name with global:.
Match using the application IP list:
match-ip-list(name: "app:ip-list-1", client-addr) =>
say("matched"),
done;
Match using the global IP list:
match-ip-list(name: "global:ip-list-1", client-addr) =>
say("matched"),
done;
now
syntax: now()
Returns a floating-point number for the elapsed time in seconds (including milliseconds as the decimal part) from the epoch for the current time stamp from the OpenResty cached time (no syscall involved unlike Lua’s date library).
now-secs
syntax: now-secs()
Returns an integer number for the elapsed time in seconds from the epoch for the current time stamp from the OpenResty cached time (no syscall involved unlike Lua’s date library).
post-arg
syntax: post-arg(pattern...)
syntax: post-arg(*)
Returns the values of POST query arguments whose names match any of the pattern
arguments using the eq
operator.
The argument pattern can be one of regexes, literal strings, or wildcards.
multipart/form-data
is not supported.
Below is an example of using the value of the POST query argument limitrate to limit the sending rate of the response body data:
post-arg("limitrate") as $rate, looks-like-int($rate) =>
limit-resp-data-rate($rate [Kb/s]);
When the argument is a whatever value, i.e., *, it returns the values of all POST query arguments in the request. In boolean context, it simply evaluates to true when there are any POST query arguments.
random-pick
syntax: random-pick(value...)
Returns a uniformly random pick of the argument values.
For example,
random-pick("foo", "bar", "baz")
will return either "foo"
, "bar"
, or "baz"
with equal probability.
rand
syntax: rand()
Returns a random number in the range [0, 1]
.
rand-bytes
syntax: rand-bytes(len)
Returns a string that contains len random bytes.
random-hit
syntax: random-hit(ratio)
Returns true randomly according to the probability specified by the ratio argument. The ratio value must be in the range [0, 1], where 0 means never and 1 means 100%, i.e., always. For example, when ratio is 0.2, this function returns true with a probability of 20% and false otherwise.
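The probability semantics can be illustrated with a Python sketch built on `random.random()`. This models the described behavior and is not the builtin's implementation. The name `random_hit` mirrors the edgelang function.

```python
import random

def random_hit(ratio: float) -> bool:
    if not 0 <= ratio <= 1:
        raise ValueError("ratio must be in the range [0, 1]")
    # random.random() is in [0.0, 1.0), so ratio 0 never hits and ratio 1 always hits.
    return random.random() < ratio

trials = 100_000
hits = sum(random_hit(0.2) for _ in range(trials))
print(hits / trials)  # close to 0.2
```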
referer
syntax: referer()
syntax: referer(pattern...)
When it is called without any arguments, it returns the value of the function
call req-header("Referer")
.
When some arguments are specified, these arguments are treated as patterns.
It returns true when the referer value matches any of the patterns using
the eq
operator.
The argument pattern can be one of regexes, literal strings, or wildcards.
For example:
referer(wc{*/search.html}, rx{.*?/find\.html}) =>
say("hit!");
This rule is equivalent to the following form feeding no arguments to the
referer
call:
referer eq any(wc{*/search.html}, rx{.*?/find\.html}) =>
say("hit!");
The former style is recommended since it is simpler.
reg-domain
syntax: reg-domain()
syntax: reg-domain(pattern...)
When no arguments are given, it returns the registered domain name of the server host the client is requesting. For example, a sub-domain name like www.openresty.org is not a registered domain, while openresty.org is.
When some arguments are specified, these arguments are treated as patterns. It returns true when the registered domain value matches any of the patterns using the eq operator.
The argument patterns can be either regexes, literal strings, or wildcards.
For example:
reg-domain("openresty.org", "agentzh.org") =>
say("hit!");
is equivalent to:
reg-domain eq any("openresty.org", "agentzh.org") =>
say("hit!");
The former style is recommended since it is simpler. The former style can be further simplified to the following using the quoted-words syntax:
reg-domain(qw/openresty.org agentzh.org/) =>
say("hit!");
req-charset
syntax: req-charset()
Returns the charset
parameter value (if any) in the request header
Content-Type
.
req-cookie
syntax: req-cookie(pattern...)
syntax: req-cookie(*)
Returns the values of the request cookies whose names match any of the
pattern arguments using the eq
operator.
In boolean context, it simply evaluates to true when there are any matching request cookie names, and evaluates to false otherwise.
The argument pattern can be one of regexes, literal strings, or wildcards.
Below is an example:
req-cookie("mobile_type") =>
say("cookie mobile_type is present!");
req-cookie("mobile_type") > 0 =>
say("cookie mobile_type takes a value greater than 0!");
When the argument is a whatever value, i.e., *, it returns the values of all the cookies brought with the request. In boolean context, it simply evaluates to true when there are any request cookies.
The cookie names can also be patterns like regexes and wildcards. In such cases, cookie names which match any of these patterns will be chosen and their values will be returned.
req-header
syntax: req-header(pattern...)
Returns the values of the request headers whose names match any of the
pattern arguments using the eq
operator.
In boolean context, it simply evaluates to true when there are any matching request header names, and evaluates to false otherwise.
The argument pattern can be one of regexes, literal strings, or wildcards.
Below is an example:
req-header("X-WAP-Profile", "WAP-Profile") =>
say("either header X-WAP-Profile or header WAP-Profile is present!");
The header names can also be patterns like regexes and wildcards. In such cases, header names which match any of these patterns will be chosen and their values will be returned.
duplicate-req-header
syntax: duplicate-req-header()
Returns true
if there are duplicate request headers, otherwise returns false
.
Below is an example:
duplicate-req-header =>
say("duplicate request headers found!");
max-req-header-name-len
syntax: max-req-header-name-len()
Returns the length of the longest name in the request header.
Below is an example:
max-req-header-name-len > 100 =>
say("Found a request header name longer than 100");
max-req-header-value-len
syntax: max-req-header-value-len()
Returns the length of the longest value in the request header.
Below is an example:
max-req-header-value-len > 100 =>
say("Found a request header value longer than 100");
req-id
syntax: req-id()
Returns the value of the request id for the current request. The request id contains information that can be used to uniquely identify a request within an OpenResty Edge installation.
The request id is always a string of 24 characters.
Below is an example:
true =>
add-resp-header("X-Request-Id", req-id);
req-latency
syntax: req-latency()
Returns the latency of the request as a quantity-typed value. For example, 0.01 [s] means 0.01 second.
req-bytes
syntax: req-bytes()
Returns the size of the request in bytes (including the request line, headers, and request body).
resp-bytes
syntax: resp-bytes()
Returns the number of bytes sent to a client.
req-method
syntax: req-method()
syntax: req-method(pattern...)
When no arguments are specified, returns the request method string like GET, POST, and DELETE.
When arguments are specified, these arguments are treated as patterns matched against the current request method string. Returns true when any of the patterns matches; returns false otherwise.
req-uri
syntax: req-uri()
Returns the full original request URI (with arguments).
The following example returns the request URI of the request:
true =>
say(req-uri());
resp-header
syntax: resp-header(pattern...)
Returns the values of the response headers whose names match any of the
pattern arguments using the eq
operator.
In boolean context, it simply evaluates to true when there are any matching response header names, and evaluates to false otherwise.
The argument pattern can be one of regexes, literal strings, or wildcards.
For example:
resp-header("X-WAP-Profile", "WAP-Profile") =>
say("either header X-WAP-Profile or header WAP-Profile is present!");
The header names can also be patterns like regexes and wildcards. In such cases, header names which match any of these patterns will be chosen and their values will be returned.
resp-header-param
syntax: resp-header-param(header-name, param-name)
Returns the value of the specified header parameter in the specified response header.
For instance:
resp-header-param("Cache-Control", "s-maxage") =>
rm-resp-header-param("Cache-Control", "s-maxage");
It removes the s-maxage
parameter from the Cache-Control
response header
when it exists.
resp-body
syntax: resp-body()
Returns the value of the response body. It can only be used in resp-body defer blocks.
For instance:
true =>
defer resp-body {
errlog("body: ", resp-body);
};
resp-mime-type
syntax: resp-mime-type()
syntax: resp-mime-type(pattern...)
When no arguments are specified, returns the MIME-type of the response,
i.e., the value of the Content-Type
response header, excluding any
parameters like charset=utf-8
.
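Stripping the parameters from a Content-Type value, as described above, can be sketched in Python as follows. The function name `mime_type` is illustrative, and whether the builtin also normalizes letter case is not stated in this manual, so the sketch leaves the case untouched.

```python
def mime_type(content_type: str) -> str:
    # Keep only the part before the first ";" and trim surrounding spaces.
    return content_type.split(";", 1)[0].strip()

print(mime_type("text/html; charset=utf-8"))  # text/html
```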
When some arguments are specified, these arguments are treated as patterns.
It returns true when the response MIME-type value matches any of the patterns
using the eq
operator.
The argument pattern can be one of regexes, literal strings, or wildcards.
For example:
resp-mime-type("text/html", wc"*javascript") =>
say("hit!");
This rule is equivalent to the following form feeding no arguments to the
resp-mime-type
call:
resp-mime-type eq any("text/html", wc"*javascript") =>
say("hit!");
The former style is recommended since it is simpler.
resp-status
syntax: resp-status()
syntax: resp-status(code...)
When no arguments are specified, it returns the status code of the current response.
When arguments are specified, these arguments are treated as status codes to be compared with the current response status code. Returns true when any of the specified codes matches; returns false otherwise.
For example:
resp-status(404, 500, 502, 503) =>
say("found a known bad response status code: ", resp-status);
scheme
syntax: scheme()
Returns the protocol scheme of the current request, like http
and https
.
For example:
scheme() eq "http" =>
redirect(scheme: "https", host: host(), uri: req-uri(), code: 307);
server-addr
syntax: server-addr()
phase: rewrite resp-header resp-body
Returns the address of the server which accepted the current request.
NOTICE: It does not work in the ssl-cert phase; the other phases are fine.
Here is an example:
true =>
say("address: ", server-addr);
We may get one of the following answers, depending on the server’s listening address:
# IPv4
address: 127.0.0.1
# IPv6
address: ::1
# Unix domain
address: unix:/tmp/nginx.sock
server-port
syntax: server-port()
Returns the port of the server which accepted the current request.
substr
syntax: substr(str, start[, end])
Returns the sub-string starting from the subscript start
(1-based) and ending at end
argument.
Negative subscripts indicate positions from the end of the string. For example, -1 means the last character, -2 means the second last one, and so on.
When the end
argument is omitted, it means all the characters until the
end of the string.
Below are some examples:
my Str $s = "hello world";
true =>
say(substr($s, 1, 5)), # output: hello
say(substr($s, 7)), # output: world
say(substr($s, -5, -2)), # output: worl
say(substr($s, -5)); # output: world
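The 1-based, inclusive subscript rules can be cross-checked with a Python model. This is a sketch of the semantics shown in the examples above, not the builtin's implementation. How out-of-range subscripts are treated is an assumption (they are clamped here).

```python
from typing import Optional

def substr(s: str, start: int, end: Optional[int] = None) -> str:
    n = len(s)
    if end is None:
        end = n  # omitted end: up to the end of the string
    # Map negative subscripts to 1-based positions counted from the end.
    if start < 0:
        start = n + start + 1
    if end < 0:
        end = n + end + 1
    # Clamp out-of-range values (assumption), then convert the 1-based
    # inclusive range [start, end] to the 0-based half-open slice [start-1, end).
    start = max(start, 1)
    end = max(end, 0)
    return s[start - 1:end]

s = "hello world"
print(substr(s, 1, 5))    # hello
print(substr(s, 7))       # world
print(substr(s, -5, -2))  # worl
print(substr(s, -5))      # world
```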
subst
syntax: subst(subject, regex, replacement)
syntax: subst(subject, regex, replacement, g: BOOL)
By default, substitutes the first match of the Perl compatible regular expression regex in the subject argument string with the string or function argument replacement.
It performs a global substitution when the named argument g is true.
Below are some examples:
my Str $s = "hello world";
true =>
say(subst($s, rx/l/, "g")), # heglo world
say(subst($s, rx/l/, "g", g: true)); # heggo worgd
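The first-match versus global behavior corresponds to the `count` argument of Python's `re.sub`. The sketch below models `subst` on that basis. Python's `re` syntax is close to but not identical to Perl compatible regular expressions, so this is only an approximation of the builtin.

```python
import re

def subst(subject: str, regex: str, replacement, g: bool = False) -> str:
    # count=1 replaces only the first match; count=0 replaces every match.
    # `replacement` may be a string or a function taking the match object.
    return re.sub(regex, replacement, subject, count=0 if g else 1)

s = "hello world"
print(subst(s, r"l", "g"))          # heglo world
print(subst(s, r"l", "g", g=True))  # heggo worgd
```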
system-hostname
syntax: host_name = system-hostname()
Returns the host name of the system, which is the same as the output of the hostname command.
For example:
true =>
say("host name: ", system-hostname);
ssl-client-s-dn
syntax: client_subject_dn = ssl-client-s-dn()
Returns the “subject DN” string of the client certificate, such as:
CN=client.com,OU=dev,O=orinc,L=xm,ST=fj,C=cn
For example:
true =>
say("client subject dn: ", ssl-client-s-dn);
ssl-client-i-dn
syntax: issuer_subject_dn = ssl-client-i-dn()
Returns the “issuer DN” string of the client certificate, such as:
CN=rootca.com,OU=dev,O=orinc,L=xm,ST=fj,C=cn
For example:
true =>
say("issuer subject dn: ", ssl-client-i-dn);
ssl-client-serial
syntax: client_serial = ssl-client-serial()
Returns the “serial number” of the client certificate for an established SSL connection, such as:
045CA7F023CAC0FD592B4D5DE5E7C6AF
For example:
true =>
say("ssl client serial: ", ssl-client-serial);
ssl-client-verify-result
syntax: result = ssl-client-verify-result()
Returns the verification result of client certificates.
The results returned may be the following values:
NONE
, SUCCESS
, FAILED:unable to verify the first certificate
For example:
true =>
say("result: ", ssl-client-verify-result);
to-int
syntax: to-int(value)
syntax: to-int(value, method: METHOD)
Converts the argument value to an integer. Strings will be converted to numbers according to the 10-base representation. Quantity values will get their unit part stripped off.
The fractional part is handled according to the method named argument, which can be ceil, floor, or round.
When the method named argument is omitted, the default is floor.
For example:
true =>
to-int("10.1", method: "ceil");
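The three rounding methods can be compared with a Python model. This sketch covers only strings and plain numbers (quantity values are not modeled), and the tie-breaking rule for `round` (halves away from zero) is an assumption, since the manual does not specify it.

```python
import math

def to_int(value, method: str = "floor") -> int:
    number = float(value)  # strings are parsed as base-10 numbers
    if method == "ceil":
        return math.ceil(number)
    if method == "floor":
        return math.floor(number)
    if method == "round":
        # Assumption: halves round away from zero.
        return int(math.copysign(math.floor(abs(number) + 0.5), number))
    raise ValueError(f"unknown method: {method}")

print(to_int("10.1", method="ceil"))   # 11
print(to_int("10.1"))                  # 10
print(to_int("10.5", method="round"))  # 11
```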
to-num
syntax: to-num(value)
Converts the argument value to a number. Strings will be converted to numbers according to the 10-base representation. Quantity values will get their unit part stripped off. Numbers will just get through.
to-hex
syntax: to-hex(value)
Converts the argument value to a hex string.
true
syntax: true()
Returns the boolean true value.
ua-contains
syntax: ua-contains(pattern...)
This is just a shorthand for the expression
user-agent contains any(pattern1, pattern2, ...)
.
ua-is-mobile
syntax: ua-is-mobile()
Returns true when the client looks like a mobile device; returns false
otherwise. This is achieved by checking the User-Agent
request header
sent by the client.
upper-case
syntax: upper-case(value)
Returns a string with all the characters in the string argument converted to upper-case letters.
upstream-addr
syntax: $addr = upstream-addr()
Returns the upstream address in string format, like 192.168.0.1:8080
.
For example:
true =>
defer resp-header {
errlog(level: "warn", upstream-addr);
};
uri
syntax: uri()
syntax: uri(pattern...)
When no arguments are specified, returns the URI of the request. Note that the URI string does not include any URI arguments.
When some arguments are specified, these arguments are treated as patterns.
It returns true when the URI value matches any of the patterns using the
eq
operator.
The argument pattern can be one of regexes, literal strings, or wildcards.
For example:
# the condition is true for request URIs `/foo/`, `/bar/`, and `/bar/blah`,
# but the condition is false for `/blah/foo/`, `/blah/bar/`, and `/bar`:
uri("/foo/", wc"/bar/*") =>
say("hit!");
This rule is equivalent to the following form feeding no arguments to the
uri
predicate:
uri eq any("/foo/", wc"/bar/*") =>
say("hit!");
The former style is recommended since it is simpler.
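The matching results listed in the comments of the example above can be reproduced with a Python model. This sketch uses `fnmatch.fnmatchcase` as a stand-in for edgelang wildcards. The helpers `wc` and `uri_is` are hypothetical, and the assumption that `*` matches any run of characters (including `/` and the empty string) comes from this model, not from the manual.

```python
from fnmatch import fnmatchcase

def wc(pattern: str):
    # Hypothetical stand-in for wc"...": returns a predicate that matches
    # the whole string against the wildcard pattern.
    return lambda value: fnmatchcase(value, pattern)

def uri_is(uri: str, *patterns) -> bool:
    # Literal strings must equal the whole URI; wildcard predicates are called.
    return any(p(uri) if callable(p) else uri == p for p in patterns)

for uri in ["/foo/", "/bar/", "/bar/blah", "/blah/foo/", "/blah/bar/", "/bar"]:
    print(uri, uri_is(uri, "/foo/", wc("/bar/*")))
```

The first three URIs match and the last three do not, which agrees with the comments in the edgelang example.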
uri-arg
syntax: uri-arg(name...)
Returns the value of the specified URI parameter.
Below is an example of using the value of the URI argument limitrate to limit the sending rate of the response body data:
uri-arg("limitrate") as $rate, looks-like-int($rate) =>
limit-resp-data-rate($rate [Kb/s], after: 1 [MB]);
duplicate-uri-arg
syntax: duplicate-uri-arg()
Returns true
if there are duplicate URI arguments, otherwise returns false
.
Below is an example:
duplicate-uri-arg =>
say("duplicate URI arguments found!");
query-string
syntax: query-string()
Returns the query string, i.e., the arguments part of the request line.
For example, the following edge code outputs foo=bar&a=b when the request line is GET /uri?foo=bar&a=b:
true =>
say(query-string);
sorted-query-string
syntax: sorted-query-string()
Returns the arguments in the request line in sorted order.
For example, the following edge code outputs a=1&b=2&c=3 when the request line is GET /uri?b=2&a=1&c=3:
true =>
say(sorted-query-string);
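The reordering can be modeled in Python as shown below. The exact sort key used by the builtin is not documented here, so this sketch assumes a plain string comparison of each `name=value` item, which reproduces the example above.

```python
def sorted_query_string(query: str) -> str:
    # Assumption: items are ordered by plain string comparison of "name=value".
    if not query:
        return ""
    return "&".join(sorted(query.split("&")))

print(sorted_query_string("b=2&a=1&c=3"))  # a=1&b=2&c=3
```

A sorted query string like this is typically useful as a normalized key, since requests that differ only in argument order map to the same string.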
uri-basename
syntax: uri-basename()
syntax: uri-basename(pattern...)
Without any arguments, it returns the basename of the resource specified
in the request URI. For example, for the URI /en/company/about-us.html
,
it returns about-us
as the base name. And for /static/download/foo.tar.gz
,
it returns foo
.
When some arguments are specified, these arguments are treated as patterns.
It returns true when the URI basename matches any of the patterns using the
eq
operator. For example:
uri-basename("foo", rx/bar\w+/) =>
say("hit!");
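The two basename examples above are consistent with taking the last path segment and cutting it at its first dot. The Python sketch below models that reading; the helper name `uri_basename` is hypothetical and the "first dot" rule is an assumption inferred from the examples only.

```python
def uri_basename(uri_path: str) -> str:
    # Keep only the last path segment.
    last_segment = uri_path.rsplit("/", 1)[-1]
    # Assumption: everything from the first dot onward is dropped.
    return last_segment.split(".", 1)[0]


print(uri_basename("/en/company/about-us.html"))    # about-us
print(uri_basename("/static/download/foo.tar.gz"))  # foo
```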
uri-contains
syntax: uri-contains(pattern...)
This is a shorthand for the expression uri contains any(pattern1, pattern2, ...)
.
uri-prefix
syntax: uri-prefix(pattern...)
In boolean context, this is a shorthand for the expression uri prefix any(pattern1, pattern2, ...)
.
In string context (like in an as expression), it returns the sub-string that actually matches the first pattern that can be matched.
uri-seg
syntax: uri-seg(index...)
syntax: uri-seg(*)
This function treats the URI path string as multiple segments separated
by slashes (/
) and returns the segments of the specified indexes. Segment
indexes start from 1 and increase from left to right along the URI
path.
For example, for request URI /foo/bar/baz
, uri-seg(1)
returns foo
,
uri-seg(2)
returns bar
, and uri-seg(3)
returns baz
. Multiple indexes
can be specified at the same time as well, as in uri-seg(2, 5)
.
When the whatever value, *
, is specified as the sole argument, this function
returns all the URI path segment values.
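The 1-based segment indexing can be illustrated with a small Python model. The function names are hypothetical, and returning `None` for an index beyond the last segment is an assumption, since the manual does not say what happens in that case.

```python
from typing import List, Optional


def uri_seg_all(uri_path: str) -> List[str]:
    # Drop the leading slash, then split on the remaining slashes.
    return uri_path[1:].split("/")


def uri_seg(uri_path: str, *indexes: int) -> List[Optional[str]]:
    segments = uri_seg_all(uri_path)
    # Indexes are 1-based; out-of-range indexes yield None (assumption).
    return [segments[i - 1] if 1 <= i <= len(segments) else None for i in indexes]


print(uri_seg("/foo/bar/baz", 1))     # ['foo']
print(uri_seg("/foo/bar/baz", 2, 3))  # ['bar', 'baz']
print(uri_seg_all("/foo/bar/baz"))    # ['foo', 'bar', 'baz']
```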
uri-suffix
syntax: uri-suffix(pattern...)
In boolean contexts, this is a shorthand for the expression
uri suffix any(pattern1, pattern2, ...)
.
In string context (like in an as expression), it returns the sub-string that actually matches the first pattern that can be matched.
user-agent
syntax: user-agent()
syntax: user-agent(pattern...)
Without any arguments, this function is just a shorthand for
req-header("User-Agent")
.
With some arguments, the call is equivalent to
user-agent eq any(pattern1, pattern2, ...)
, i.e., for checking whether
the user agent string matches
any of the user patterns with the operator eq
.
uuid-v4
syntax: uuid-v4()
Generates a UUID version 4 string value.
userid
syntax: userid()
Generates a user id string value.
Builtin Action Functions
add-req-header
syntax: add-req-header(name, value)
syntax: add-req-header(name1, value1, name2, value2, ...)
syntax: add-req-header(%name-value-pairs)
Adds new request headers, without overriding any existing request headers of the same names.
For example:
true =>
add-req-header("X-Foo", 1234);
If you want to override any existing request headers, please use the set-req-header builtin action instead.
add-resp-header
syntax: add-resp-header(header, value)
syntax: add-resp-header(header1, value1, header2, value2, ...)
syntax: add-resp-header(%name-value-pairs)
phase: rewrite resp-header
Adds new response headers to the current request. Existing headers with the same name are not affected. If you want to override existing same-name headers, please use set-resp-header instead.
Below is an example:
true =>
add-resp-header("X-Powered-By", "OpenResty Edge");
add-uri-arg
syntax: add-uri-arg(name, value)
syntax: add-uri-arg(name1, value1, name2, value2, ...)
syntax: add-uri-arg(%name-value-pairs)
Adds new URI arguments to the current request. Existing URI arguments with the same name are not affected. If you want to override existing same-name URI arguments, please use set-uri-arg instead.
Below is an example:
true =>
add-uri-arg("uid", "1234");
add-uri-prefix
syntax: add-uri-prefix(prefix)
Adds a new prefix string to the current request URI.
Note that the prefix value does not need to end with a slash (/
) because
the existing URI string must already start with a slash anyway.
Consider the following example:
true =>
add-uri-prefix("/en/us");
For request GET /install.html
, this rule will make the URI become
/en/us/install.html
.
apply-std-mime-types
syntax: apply-std-mime-types()
syntax: apply-std-mime-types(force: true)
Sets the standard response header Content-Type
based on the request file
suffix.
By default, this only applies to responses whose response header
Content-Type
field is empty. However, the user can override an existing Content-Type
in the response headers by specifying the named argument force: true
.
append-proxy-header-value
syntax: append-proxy-header-value(header, value)
syntax: append-proxy-header-value(header1, value1, header2, value2, ...)
Appends a value to a header sent to the proxied server.
When the header field is not empty, the header field will be the existing value
with the value
appended to it, separated by a comma.
Otherwise the header field will be the value
.
For example:
true =>
append-proxy-header-value("X-Forwarded-For", client-addr);
# will pass `192.168.1.1,10.10.1.1` as `X-Forwarded-For` to the proxied server,
# when the original `X-Forwarded-For` is `192.168.1.1` and client-addr is `10.10.1.1`.
List of headers which cannot be used with this API:
- Host
- Connection
- Upgrade
- Content-Length
- Transfer-Encoding
- If-Modified-Since
- If-None-Match
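The append rule described above (comma-join when the header already has a value, plain assignment otherwise) can be sketched in Python as follows. The function name and the use of a plain `dict` for headers are illustrative assumptions, not part of the Edge API.

```python
def append_header_value(headers: dict, name: str, value: str) -> None:
    existing = headers.get(name, "")
    # Non-empty header: append with a comma; otherwise just set the value.
    headers[name] = f"{existing},{value}" if existing else value


headers = {"X-Forwarded-For": "192.168.1.1"}
append_header_value(headers, "X-Forwarded-For", "10.10.1.1")
print(headers["X-Forwarded-For"])  # 192.168.1.1,10.10.1.1
```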
block-req
syntax:
block-req(key: KEY, target-rate: RATE, reject-rate: RATE,
block-threshold: COUNT,
observe-interval: COUNT,
block-time: TIME,
log-headers: BOOL,
reject-action: REJECT-ACTION,
status-code: STATUS-CODE,
clearance-time: CLEARANCE-TIME,
page-template-id: PAGE-TEMPLATE-ID)
Limits the request rate around the specified user key.
The named argument key
is optional. When omitted, it is equivalent to
a constant key.
The named argument target-rate
is the maximum rate we want to shape into.
When the incoming rate exceeds reject-rate
, the request handler will
immediately reject the current request with a 503 error page (for HTTP/HTTPS
applications) or drop the packet immediately (for DNS applications).
When the incoming rate is between the target-rate
and the reject-rate
,
this action will wait an appropriate amount of time to match the target-rate
for the current request.
The value of reject-rate
must be no smaller than target-rate
.
Both of the rate values must take a unit like [r/s]
and [r/min]
.
observe-interval
is used to set the size of time window for each observation interval in seconds;
and block-threshold
is used to set the number of consecutive observation intervals.
When the incoming rate exceeds reject-rate
in each successive observation interval,
the request handler will immediately block requests with a 503 error page
for block-time
seconds.
When the log-headers
argument is true, the request headers will be printed in the error log.
The named parameter reject-action
is optional and supports the following actions:
- enable_hcaptcha: Triggers hCaptcha. The parameter HCAPTCHA-CLEARANCE-TIME specifies the duration (in seconds) for which re-verification is not required after successful validation.
- enable_edge_captcha: Activates edge captcha. The parameter EDGE-CAPTCHA-CLEARANCE-TIME determines the period (in seconds) during which re-verification is unnecessary following successful validation.
- error_page: Returns a custom error page. The STATUS-CODE parameter defines the HTTP status code to be returned.
- close_connection: Immediately terminates the connection. This action is equivalent to error_page with a STATUS-CODE of 444.
- redirect_validate: Initiates a redirect validation process.
- js_challenge: Implements a JavaScript challenge.
- page_template: Returns a specific page template, identified by its unique ID.
Below is an example:
true =>
block-req(key: client-addr, target-rate: 10 [r/s], reject-rate: 20 [r/s],
block-threshold: 2, observe-interval: 30, block-time: 60);
The user can initiate multiple block-req
calls for different keys and
rates in a single request handler.
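One way to read the interplay of observe-interval, block-threshold, and block-time is sketched below in Python. This is a simplified model under stated assumptions, not the Edge implementation: the class `BlockTracker` is hypothetical, the rate of an interval is taken as its request count divided by the interval length, and a non-offending interval is assumed to reset the streak.

```python
class BlockTracker:
    """Hypothetical model of the block-req blocking rule."""

    def __init__(self, reject_rate, block_threshold, observe_interval, block_time):
        self.reject_rate = reject_rate            # requests per second
        self.block_threshold = block_threshold    # consecutive intervals needed
        self.observe_interval = observe_interval  # seconds per interval
        self.block_time = block_time              # seconds to keep blocking
        self.streak = 0
        self.blocked_until = 0.0

    def end_interval(self, request_count, now):
        """Record one finished observation interval that ended at time `now`."""
        rate = request_count / self.observe_interval
        # Assumption: an interval below reject-rate resets the streak.
        self.streak = self.streak + 1 if rate > self.reject_rate else 0
        if self.streak >= self.block_threshold:
            self.blocked_until = now + self.block_time
            self.streak = 0

    def is_blocked(self, now):
        return now < self.blocked_until


tracker = BlockTracker(reject_rate=20, block_threshold=2,
                       observe_interval=30, block_time=60)
tracker.end_interval(request_count=900, now=30)  # 30 r/s, first offending interval
print(tracker.is_blocked(30))                    # False
tracker.end_interval(request_count=900, now=60)  # second consecutive offender
print(tracker.is_blocked(60))                    # True
```

The parameter values mirror the block-req example above.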
basic-authenticate
syntax: basic-authenticate(auth-id: AUTH-ID)
Enables HTTP basic authentication.
The named argument auth-id
is a combination of the authentication list type and the authentication list ID.
If this authentication list is configured within the application, the auth-id
argument should be app-auth:<list_id>
.
If this authentication list is configured globally, the auth-id
argument should be global-auth:<list_id>
.
Below is an example:
basic-authenticate(auth-id: "app-auth:1") =>
say("ok");
basic-authenticate(auth-id: "global-auth:1") =>
say("ok");
enable-otel-trace
syntax: enable-otel-trace()
Enables otel tracing.
Below is an example:
random-hit(0.05) =>
enable-otel-trace();
enable-ssl-client-verify
syntax: enable-ssl-client-verify()
Enables verification of client certificates. If verification fails, the current request
is terminated with a 400 status code.
Below is an example:
true =>
enable-ssl-client-verify();
foreign-call
syntax: foreign-call(module: <module>, func: <func>, arg...)
syntax: foreign-call(func: <func>, arg...)
syntax: foreign-call(func: <func>)
Initiates a call into external functions in the target language (like Lua).
The optional named argument module
specifies the foreign module name.
If omitted, defaults to the standard namespace used by the foreign language.
The func
named argument specifies the function name of the foreign call.
This argument is required.
Any positional arguments will be passed into the foreign function call as arguments.
See Calling External Code for more details.
enable-edge-captcha
syntax: enable-edge-captcha(clearance-time: CLEARANCE-TIME)
syntax: enable-edge-captcha(clearance-time: CLEARANCE-TIME, page-template-id: PAGE-TEMPLATE-ID)
Enables Edge captcha for the current request. If the request has already passed verification, it will be allowed through.
The clearance-time
argument specifies the validity period (in seconds) for a successful verification.
The optional page-template-id
argument specifies the ID of a custom captcha page template. This template must be pre-configured in the global page templates.
Below is an example:
Use the default template without specifying the page-template-id parameter.
true =>
enable-edge-captcha(clearance-time: 30),
done;
Specify the page template.
true =>
enable-edge-captcha(clearance-time: 60, page-template-id: 1),
done;
enable-hcaptcha
syntax: enable-hcaptcha(clearance-time: CLEARANCE-TIME)
syntax: enable-hcaptcha(clearance-time: CLEARANCE-TIME, page-template-id: PAGE-TEMPLATE-ID)
Before you can use hCaptcha, you need to configure hCaptcha’s site key and secret key in the global configuration: hCaptcha
Enables hCaptcha verification for the current request. If the request has already passed verification, it will be allowed through.
The clearance-time
argument specifies the validity period (in seconds) for a successful verification.
The optional page-template-id
argument specifies the ID of a custom hCaptcha page template. This template must be pre-configured in the global page templates.
Below is an example:
Use the default template without specifying the page-template-id parameter.
true =>
enable-hcaptcha(clearance-time: 30),
done;
Specify the page template:
true =>
enable-hcaptcha(clearance-time: 60, page-template-id: 1),
done;
enable-proxy-cache
syntax: enable-proxy-cache(key: KEY)
Enables the proxy cache with the user-supplied cache key for the current request. The proxy cache is disabled by default.
enable-global-cache
syntax: enable-global-cache()
Enables the global cache. The global cache is shared among different applications and is disabled by default.
enable-gateway-gzip
syntax: enable-gateway-gzip()
syntax: enable-gateway-gzip(enabled)
phase: resp-header
Dynamically turns on/off gzip compression for the current request.
When no argument is specified, gateway gzip is enabled.
When a boolean argument is specified, gateway gzip is enabled or disabled accordingly.
Below is an example:
uri-prefix("/css/") =>
defer resp-header {
enable-gateway-gzip;
};
enable-proxy-cache-revalidate
syntax: enable-proxy-cache-revalidate()
syntax: enable-proxy-cache-revalidate(enabled)
phase: rewrite
Controls whether the proxy_cache_revalidate
feature is enabled.
When no argument is specified, it enables the proxy_cache_revalidate
feature.
Below is an example:
true =>
enable-proxy-cache-revalidate(true);
enforce-proxy-cache
syntax: enforce-proxy-cache(time)
Similar to set-proxy-cache-default-ttl, but will enforce caching the current response regardless of the response header settings (i.e., ignoring Cache-Control, Set-Cookie, Expires, etc.).
errlog
syntax: errlog(level: LEVEL, msg...)
syntax: errlog(msg...)
Produces an error log message with the specified log level via the named
argument level
. When level
is omitted, defaults to the error
log
level.
The level
can be one of the following:
- error
- warn
- stderr
- emerg
- alert
- crit
- notice
- info
- debug
The message part can be multiple string arguments. This function will concatenate them automatically.
Some examples:
true =>
errlog(level: "alert", "Something bad", " just happened!"),
errlog("The user is not authorized");
exit
syntax: exit(code)
Exits the current request’s processing with the status code code. If no response has been sent yet upon this call, this call will also generate a default error page for the specified status code if it is recognized.
To shut down the connection immediately, use the special exit code 444
.
expires
syntax: expires(time)
syntax: expires(time, force: true)
Adds and modifies the response headers Expires
and Cache-Control
for
the specified expiration time.
By default, this only applies to responses with the status code 200, 201,
204, 206, 301, 302, 303, 304, 307, or 308. But the user can enforce it for
any status code by specifying the named argument, force: true
.
The time
positional argument must be a quantity value
taking a time unit, like [sec]
(for seconds), [min]
(for minutes),
[hour]
(for hours), and [day]
(for days).
Below is an example:
uri-prefix("/css/") =>
expires(1 [day]);
This action does not affect the proxy cache expiration time. See cache-expires also.
limit-req-concurrency
syntax: limit-req-concurrency(key: KEY, target-n: COUNT, reject-n: COUNT, log-headers: BOOL)
Limits the incoming request’s concurrency level at the user-supplied key.
The actual concurrency level after running this action will be guaranteed to be
no more than the target-n
named argument value. When the incoming concurrency
level is between target-n
and reject-n
, the current request will be delayed
by an appropriate amount of time to satisfy the target concurrency level.
When the incoming request concurrency level exceeds the reject-n
value,
the current request will be immediately rejected with a 503 error page
(for HTTP/HTTPS applications) or the packet will be dropped (for DNS applications).
When the log-headers
argument is true, the request headers will be printed in the error log.
Below is an example:
true =>
limit-req-concurrency(key: client-addr, target-n: 100, reject-n: 200);
limit-req-count
syntax: limit-req-count(key: KEY, target-n: NUM, reset-time: SECONDS, log-headers: BOOL)
Limits the number of requests to NUM
within the specified time window of SECONDS
seconds.
The named argument key
is optional. When omitted, it is equivalent to
a constant key.
The named argument target-n
is the maximum request number.
The named argument reset-time
is the time window in seconds.
When the request number exceeds NUM
, the request handler will
immediately reject the current request with a 503 error page (for HTTP/HTTPS
applications) or drop the packet immediately (for DNS applications).
When the log-headers
argument is true, the request headers will be printed in the error log.
Below is an example:
true =>
limit-req-count(key: client-addr, target-n: 10, reset-time: 60);
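A request-count limit of this kind is often implemented as a fixed-window counter per key. The Python sketch below shows that technique only as an illustration; whether Edge uses a fixed or a sliding window is not stated in this manual, and the class `RequestCounter` and its method names are hypothetical.

```python
class RequestCounter:
    """Fixed-window request counter keyed by an arbitrary string."""

    def __init__(self, target_n: int, reset_time: float):
        self.target_n = target_n      # maximum requests per window
        self.reset_time = reset_time  # window length in seconds
        self.windows = {}             # key -> (window_start, count)

    def allow(self, key: str, now: float) -> bool:
        start, count = self.windows.get(key, (now, 0))
        if now - start >= self.reset_time:
            # The window has expired: start a fresh one.
            start, count = now, 0
        count += 1
        self.windows[key] = (start, count)
        return count <= self.target_n


counter = RequestCounter(target_n=10, reset_time=60)
results = [counter.allow("10.10.1.1", now=t) for t in range(12)]
print(results.count(True))                  # 10 requests pass in the first window
print(counter.allow("10.10.1.1", now=60))   # True, a new window has started
```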
limit-req-rate
syntax:
limit-req-rate(key: KEY, target-rate: RATE, reject-rate: RATE,
reject-action: ACTION,
hcaptcha-clearance-time: HCAPTCHA-CLEARANCE-TIME,
edge-captcha-clearance-time: EDGE-CAPTCHA-CLEARANCE-TIME,
redirect-validate-clearance-time: REDIRECT-VALIDATE-CLEARANCE-TIME,
error-page-status-code: STATUS-CODE,
log-headers: BOOL,
page-template-id: PAGE-TEMPLATE-ID)
Limits the request rate around the specified user key.
The named argument key
is optional. When omitted, it is equivalent to
a constant key.
The named argument target-rate
is the maximum rate we want to shape into.
When the incoming rate exceeds reject-rate
, the request handler will
immediately reject the current request with a 503 error page (for HTTP/HTTPS
applications) or drop the packet immediately (for DNS applications).
When the incoming rate is between the target-rate
and the reject-rate
,
this action will wait an appropriate amount of time to match the target-rate
for the current request.
The value of reject-rate
must be no smaller than target-rate
.
Both of the rate values must take a unit like [r/s]
and [r/min]
.
The named parameter reject-action
is optional and supports the following actions:
- enable_hcaptcha: Triggers hCaptcha. The parameter HCAPTCHA-CLEARANCE-TIME specifies the duration (in seconds) for which re-verification is not required after successful validation.
- enable_edge_captcha: Activates edge captcha. The parameter EDGE-CAPTCHA-CLEARANCE-TIME determines the period (in seconds) during which re-verification is unnecessary following successful validation.
- error_page: Returns a custom error page. The STATUS-CODE parameter defines the HTTP status code to be returned.
- close_connection: Immediately terminates the connection. This action is equivalent to error_page with a STATUS-CODE of 444.
- redirect_validate: Initiates a redirect validation process.
- js_challenge: Implements a JavaScript challenge.
- page_template: Returns a specific page template, identified by its unique ID.
When the log-headers
argument is true, the request headers will be printed in the error log.
Below is an example:
true =>
limit-req-rate(key: client-addr,
target-rate: 10 [r/s],
reject-rate: 20 [r/s],
reject-action: "enable_hcaptcha",
hcaptcha-clearance-time: 50);
The user can initiate multiple limit-req-rate
calls for different keys and
rates in a single request handler.
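The three outcomes described for limit-req-rate (pass through, delay, reject) can be summarized as a small decision function. The Python sketch below is a model of the documented thresholds only; the function name is hypothetical, and how Edge measures the incoming rate or computes the delay is outside its scope.

```python
def shape_decision(incoming_rate: float, target_rate: float, reject_rate: float) -> str:
    """Classify a request given rates expressed in the same unit (e.g. r/s)."""
    if reject_rate < target_rate:
        raise ValueError("reject-rate must be no smaller than target-rate")
    if incoming_rate > reject_rate:
        return "reject"  # e.g. 503 page, or the configured reject-action
    if incoming_rate > target_rate:
        return "delay"   # wait so the effective rate matches target-rate
    return "pass"


for rate in (5, 15, 25):
    print(rate, shape_decision(rate, target_rate=10, reject_rate=20))
```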
limit-resp-data-rate
syntax: limit-resp-data-rate(rate)
syntax: limit-resp-data-rate(rate, after: size)
Limits the data rate when sending response (body) data. The positional
argument rate
specifies the rate for the maximum sending speed. It
must be a quantity value taking a rate unit, like [kB/s]
.
The optional named argument, after
, takes a size argument with a size
unit like kB
and mB
.
Below is an example:
true =>
limit-resp-data-rate(100 [kB/s], after: 200 [kB]);
Note that the lower-case k
prefix means a scale of 1000 while the upper-case
K
prefix means 1024. Similarly, the lower-case b
unit means bit, while
upper-case B
means byte, i.e., octet.
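The prefix and unit rules in the note above can be checked with a small Python helper. The function `unit_to_bits` is hypothetical and only covers the `k`/`K` prefixes and `b`/`B` units mentioned in the note.

```python
def unit_to_bits(unit: str) -> int:
    """Convert a two-character unit such as 'kB' or 'Kb' to a number of bits."""
    prefix, base = unit[0], unit[1]
    scale = {"k": 1000, "K": 1024}[prefix]  # lower-case k: 1000, upper-case K: 1024
    bits_per_base = {"b": 1, "B": 8}[base]  # b: bit, B: byte (octet)
    return scale * bits_per_base


for unit in ("kb", "Kb", "kB", "KB"):
    print(unit, unit_to_bits(unit))

# 100 [kB/s] expressed in bits per second:
print(100 * unit_to_bits("kB"))  # 800000
```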
local-time
syntax: local-time()
syntax: local-time(year: YEAR, month: MONTH, day: MDAY, hour: HOUR, min: MINUTE, sec: SECOND)
Returns the number of non-leap seconds from the epoch (usually January 1, 1970 00:00:00 UTC) to the time specified by parameters with the current system time zone.
It’s a quantity typed value.
For example 2019-01-01 00:00:00
in GMT+8 will come to 1546272000 [s]
.
When called without parameters, it returns the current Unix epoch time, which equals the value of now.
If only some parameters are specified, the rest are filled with the following default values:
- YEAR: 0
- MONTH: 1
- MDAY: 1
- HOUR: 0
- MINUTE: 0
- SECOND: 0
Examples:
true =>
local-time().say; # current timestamp (quantity typed value)
# 1546272000 [s]
true =>
local-time(year: 2019, month: 1, day: 1, hour: 0, min: 0, sec: 0).say,
local-time(year: 2019).say;
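The timestamp quoted above for 2019-01-01 00:00:00 in GMT+8 can be cross-checked outside of edgelang. The Python snippet below uses only the standard library and fixes the time zone explicitly instead of relying on the system time zone, which is the one difference from local-time.

```python
from datetime import datetime, timedelta, timezone

# GMT+8 as a fixed-offset time zone.
gmt8 = timezone(timedelta(hours=8))

timestamp = int(datetime(2019, 1, 1, 0, 0, 0, tzinfo=gmt8).timestamp())
print(timestamp)  # 1546272000
```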
local-time-day
syntax: local-time-day()
Returns the day of month in the current timezone.
For example 2019-01-02 03:04:05
in the current timezone will come to 2
.
true =>
local-time-day().say;
local-time-hour
syntax: local-time-hour()
Returns the current hour in the current timezone.
For example 2019-01-02 03:04:05
in the current timezone will come to 3
.
true =>
local-time-hour().say;
local-time-min
syntax: local-time-min()
Returns the current minute in the current timezone.
For example 2019-01-02 03:04:05
in the current timezone will come to 4
.
true =>
local-time-min().say;
local-time-sec
syntax: local-time-sec()
Returns the current second in the current timezone.
For example 2019-01-02 03:04:05
in the current timezone will come to 5
.
true =>
local-time-sec().say;
print
syntax: print(msg...)
Generates custom response body data pieces. When the response header is not sent yet, the response header will be automatically sent before sending out body data, for obvious reasons.
Unlike the say action, this action does not append a newline character to the user messages.
For example:
true =>
print("hello", ", world!"),
print(" oh, yeah");
redirect
syntax: redirect(uri: URI)
syntax: redirect(host: HOST, uri: URI, args: ARGS)
syntax: redirect(scheme: SCHEME, host: HOST, port: PORT, uri: URI, code: CODE)
phase: rewrite
Sends an HTTP redirect response. It takes the following named arguments:
- uri: The URI string, excluding any querystring suffixes or host/scheme prefixes.
- args: The URI querystring or a hash table with the argument key-value pairs. Defaults to none.
- host: The host name to be redirected to. This is optional. Defaults to the current server.
- port: The port to be redirected to. This is optional. Defaults to 80 for http and 443 for https.
- scheme: The protocol scheme, like http and https. Defaults to the current request’s protocol scheme.
- code: The status code to be used. It should be either 301, 302, 303, or 307. Defaults to 302.
For example:
uri("/foo") =>
redirect(uri: "/blah/bah.html");
uri("/foo") =>
redirect(scheme: "https", host: "a.foo.com", port: 443, uri: "/blah/bah.html",
args: "a=1&b=4", code: 301);
replace-resp-filter
syntax: replace-resp-filter(string, replacement)
syntax: replace-resp-filter(string, replacement, g: BOOL)
syntax: replace-resp-filter(regex, replacement)
syntax: replace-resp-filter(regex, replacement, g: BOOL)
This function replaces content in the response body. It can match either a fixed string or a regular expression pattern.
It can only be used in the resp-body defer
blocks.
Parameters:
- string or regex: The string or regular expression to match.
- replacement: Specifies what to replace the matched content with. It can be a constant string or a user-defined function.
  - If it’s a string, it can include subpattern capture variables (like $1, $2, etc.).
- g: Optional boolean parameter, defaults to false.
  - When true, replaces all occurrences of the match.
  - When false, only the first occurrence is replaced.
Examples:
- replace-resp-filter("hello", "hi") replaces the first “hello” with “hi”
- replace-resp-filter("hello", "hi", g: true) replaces all “hello” with “hi”
- replace-resp-filter(/\d+/, "number") replaces the first number with “number”
This function is useful for modifying response content before sending it to the client.
Usage Examples:
host('test1.com') =>
defer resp-body {
replace-resp-filter(/\d+/, "number", g: true);
};
host('test2.com') =>
defer resp-body {
replace-resp-filter(/(\d+)/, "number: $1", g: true);
};
Advanced Usage:
For more complex replacement needs, such as transforming subpattern capture variables, you can use a user-defined function. The following example demonstrates how to convert captured text to lowercase:
func transform(Str $full-match, Str @groups) =
"before: $full-match" ~ ', after: ' ~ lower-case(@groups[0]) ~ ' ' ~ lower-case(@groups[1]) ~ @groups[2];
true =>
defer resp-body {
replace-resp-filter(rx:i/(hello)\s*(world)(!)/, transform);
},
say("HELLO WORLD!");
In this example:
- The transform function takes two parameters:
  - $full-match: The complete matched content
  - @groups: An array of subpattern capture variables
- @groups[0] represents the first capture group (HELLO), @groups[1] the second (WORLD), and so on.
- This example will ultimately return: before: HELLO WORLD!, after: hello world!
By using user-defined functions, you can implement more flexible and powerful content replacement logic.
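For readers more familiar with general-purpose regex libraries, the same replacement semantics can be approximated with Python's `re` module. This is a rough analogue for illustration only, not how Edge processes response bodies: the function `replace_filter` is hypothetical, it works on a complete string rather than a data stream, and it assumes `$1`-style capture variables map onto Python's `\g<1>` syntax.

```python
import re


def replace_filter(body, pattern, replacement, g=False):
    """Approximate replace-resp-filter on an in-memory string."""
    if isinstance(pattern, str):
        # A plain string is matched literally.
        pattern = re.compile(re.escape(pattern))
    count = 0 if g else 1  # re.sub treats count=0 as "replace all"
    if callable(replacement):
        def repl(m):
            # Mirror the (full match, capture groups) calling convention.
            return replacement(m.group(0), list(m.groups()))
    else:
        # Translate $1, $2, ... into Python's \g<1>, \g<2>, ...
        repl = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return pattern.sub(repl, body, count=count)


def transform(full_match, groups):
    return (f"before: {full_match}, after: "
            f"{groups[0].lower()} {groups[1].lower()}{groups[2]}")


print(replace_filter("hello hello", "hello", "hi"))
print(replace_filter("a1 b22", re.compile(r"(\d+)"), "number: $1", g=True))
print(replace_filter("HELLO WORLD!",
                     re.compile(r"(hello)\s*(world)(!)", re.I), transform))
```

The last call reproduces the transform example above and prints before: HELLO WORLD!, after: hello world!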
rewrite-uri-seg
syntax: rewrite-uri-seg(index, replacement)
syntax: rewrite-uri-seg(index1, replacement1, index2, replacement2, ...)
This function treats the URI path string as multiple segments separated
by slashes (/
) and replaces the segments of the specified indexes, with the specified replacement
argument values. Segment indexes start from 1 and increase from left to right along the URI path.
For example, for request URI /foo/bar/baz
, rewrite-uri-seg(1, "qux")
yields a new
URI /qux/bar/baz
, and rewrite-uri-seg(2, "qux")
yields /foo/qux/baz
.
Multiple indexes can be specified at the same time
as well, for example rewrite-uri-seg(2, "qux", 3, "foo")
yields /foo/qux/foo
.
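The segment rewriting shown in these examples can be modeled in Python as below. The function name is hypothetical, and silently ignoring indexes beyond the last segment is an assumption that the manual does not confirm.

```python
def rewrite_uri_seg(uri_path: str, *pairs) -> str:
    """pairs is a flat sequence: index1, replacement1, index2, replacement2, ..."""
    segments = uri_path[1:].split("/")
    for index, replacement in zip(pairs[0::2], pairs[1::2]):
        # Indexes are 1-based; out-of-range indexes are ignored (assumption).
        if 1 <= index <= len(segments):
            segments[index - 1] = replacement
    return "/" + "/".join(segments)


print(rewrite_uri_seg("/foo/bar/baz", 1, "qux"))            # /qux/bar/baz
print(rewrite_uri_seg("/foo/bar/baz", 2, "qux"))            # /foo/qux/baz
print(rewrite_uri_seg("/foo/bar/baz", 2, "qux", 3, "foo"))  # /foo/qux/foo
```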
rm-req-cookie
syntax: rm-req-cookie(name)
syntax: rm-req-cookie(name1, name2, ...)
Removes request cookies whose names match any of the arguments.
Here is an example:
true =>
rm-req-cookie("foo", "bar");
rm-req-header
syntax: rm-req-header(pattern...)
Removes request headers whose names match any of the user patterns specified
by the positional arguments, using the eq
relational operator.
The patterns can be either a literal string, a regex, or a wildcard.
Here is an example:
true =>
rm-req-header("Authorization", rx/X-.*/, wc/Internal-*/);
This rule unconditionally removes any request headers with the name Authorization
,
any names starting with X-
, or any names starting with Internal-
.
rm-resp-cookie
syntax: rm-resp-cookie(name...)
Removes one or more cookies with the specified names from the response header.
Here is an example:
true =>
rm-resp-cookie("_uid", "foo");
This rule unconditionally removes any response cookies with the name _uid
and foo
.
If you want to delete all cookies, use rm-resp-header("Set-Cookie")
.
rm-resp-header
syntax: rm-resp-header(pattern...)
Removes response headers whose names match any of the user patterns specified
by the positional arguments, using the eq
relational operator.
The patterns can be either a literal string, a regex, or a wildcard.
Here is an example:
true =>
rm-resp-header("Set-Cookie", rx/X-.*/, wc/Internal-*/);
This rule unconditionally removes any response headers with the name Set-Cookie
,
any names starting with X-
, or any names starting with Internal-
.
rm-uri-arg
syntax: rm-uri-arg(name...)
Removes the URI arguments by their names.
For example:
true =>
rm-uri-arg("foo");
rm-uri-prefix
syntax: rm-uri-prefix(pattern...)
Removes the URI prefix matching the first one in the user-supplied patterns specified by arguments.
For example:
true =>
rm-uri-prefix("/foo/", rx{/foo\d+/});
For request URI /foo/hello
, this rule will turn the URI into /hello
.
And for request /foo1234/
, this rule will yield the new URI /
.
rm-uri-seg
syntax: rm-uri-seg(index...)
This function treats the URI path string as multiple segments separated
by slashes (/
) and removes the segments of the specified indexes. Segment
indexes start from 1 and increase from left to right along the URI
path.
For example, for request URI /foo/bar/baz
, rm-uri-seg(1)
yields a new
URI /bar/baz
, rm-uri-seg(2)
yields /foo/baz
, and rm-uri-seg(3)
returns /foo/bar/
. Multiple indexes can be specified at the same time
as well, as in rm-uri-seg(2, 5)
.
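The three examples above, including the trailing slash left behind when the last segment is removed, can be reproduced with the following Python model. The function name is hypothetical, and ignoring out-of-range indexes is an assumption.

```python
def rm_uri_seg(uri_path: str, *indexes: int) -> str:
    segments = uri_path[1:].split("/")
    doomed = set(indexes)
    # Keep every segment whose 1-based position was not listed.
    kept = [seg for pos, seg in enumerate(segments, start=1) if pos not in doomed]
    # Removing the final segment leaves a trailing slash, matching the
    # rm-uri-seg(3) example that yields /foo/bar/.
    if len(segments) in doomed:
        kept.append("")
    return "/" + "/".join(kept)


print(rm_uri_seg("/foo/bar/baz", 1))  # /bar/baz
print(rm_uri_seg("/foo/bar/baz", 2))  # /foo/baz
print(rm_uri_seg("/foo/bar/baz", 3))  # /foo/bar/
```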
say
syntax: say(msg...)
Generates custom response body data pieces with a trailing newline character automatically appended. When the response header is not sent yet, the response header will be automatically sent before sending out body data, for obvious reasons.
If you do not want a trailing newline character to be appended, please use the print action instead.
For example:
true =>
say("hello", ", world!"),
say(" oh, yeah");
set-error-page
syntax: set-error-page(resp-body: CONTENT, content-type: CONTENT-TYPE, error_code…)
syntax: set-error-page(refetch-url: URL, content-type: CONTENT-TYPE, error_code…)
syntax: set-error-page(page-template-id: ID, content-type: CONTENT-TYPE, error_code…)
Sets the error page for the specified error status code
with optional content-type
.
The error page can be set in three ways:
- resp-body: raw html content.
- refetch-url: static resource url.
- page-template-id: ID of global page template
Note that these ways are mutually exclusive. Specifying two or more of them at the same time is reported as an error.
List of error status codes supported by this API:
- 403
- 404
- 500
- 501
- 502
- 503
- 504
For example:
true =>
set-error-page(404, resp-body: "<h1>Not Found</h1>");
true =>
set-error-page(500, refetch-url: "http://example.com/error.html");
set-otel-span-name
syntax: set-otel-span-name('name')
phase: rewrite
The default span name is the request URI. Use this API to change the span name.
Below is an example:
true =>
set-otel-span-name('new-span-name');
set-proxy-cache-default-ttl
syntax: set-proxy-cache-default-ttl(time, status: STATUS)
Sets the default expiration time for the proxy cache when the response status is STATUS
.
The default status
is 200
. Only 200
, 301
, and 302
are supported for now.
The time
positional argument must be a quantity value
taking a time unit, like [sec]
(for seconds), [min]
(for minutes),
[hour]
(for hours), and [day]
(for days).
Below is an example:
uri-prefix("/css/") =>
set-proxy-cache-default-ttl(1 [day]);
This action does not affect the current response’s Expires
or Cache-Control
response headers. If you want to override browser cache duration, you can write:
uri-prefix("/css/") =>
set-proxy-cache-default-ttl(1 [day]),
expires(12 [hour]);
It is possible to specify different expiration times for the node cache (via set-proxy-cache-default-ttl) and for the browser (via expires).
See also expires.
set-proxy-cache-use-stale
syntax: set-proxy-cache-use-stale('off')
syntax: set-proxy-cache-use-stale('http_500', 'invalid_header', ...)
phase: rewrite
Modifies the proxy_cache_use_stale
configuration for the current request.
Passing 'off' disables the proxy_cache_use_stale
feature.
Below is an example:
true =>
set-proxy-cache-use-stale('http_500', 'invalid_header');
set-req-cookie
syntax: set-req-cookie(name, value)
syntax: set-req-cookie(name1, value1, name2, value2, ...)
Sets request cookies, overriding any existing cookies of the same names.
Below is an example:
true
=>
set-req-cookie("foo", "foo", "bar", "bar"),
say("foo:" ~ req-cookie("foo")),
say("bar:" ~ req-cookie("bar"));
set-req-header
syntax: set-req-header(name, value)
syntax: set-req-header(name1, value1, name2, value2, ...)
syntax: set-req-header(%name-value-pairs)
Sets request headers, overriding any existing request headers of the same names.
For example:
uri-prefix("/foo/") =>
set-req-header("X-Debug", 1);
If you want to add new request headers without overriding existing ones, please use the add-req-header builtin action instead.
req-body
syntax: req-body()
Returns the request body.
For example:
req-body =>
req-body.say;
set-req-body
syntax: set-req-body(body)
Sets the request body, overriding the current request body.
For example:
uri-prefix("/foo/") =>
set-req-body("foo");
set-proxy-host
syntax: set-proxy-host(host)
Sets the host used for requests to the proxied server. The default proxy host is the current request host.
set-proxy-header
syntax: set-proxy-header(header, value)
syntax: set-proxy-header(header1, value1, header2, value2, ...)
Sets proxy headers to the proxied server. Existing headers with the same name will be removed.
You can use this function to turn a connection between a client and server from HTTP/1.1 into WebSocket. Below is an example:
true =>
set-proxy-header("Upgrade", "WebSocket",
"Connection", "Upgrade");
List of headers which cannot be set using this API:
- Content-Length
- Transfer-Encoding
- If-Modified-Since
- If-None-Match
set-proxy-uri
syntax: set-proxy-uri(uri, [query-string: QUERY-STRING] )
Sets the URI sent to the proxied server. The URI value should not contain any query string or host/port parts. The query-string named argument is optional.
true =>
set-proxy-uri("/foo.html");
true =>
set-proxy-uri("/foo.html", query-string: "foo=bar");
set-req-host
syntax: set-req-host(host)
Sets the request Host
header to the value of the host
positional argument.
This is just a shorthand for set-req-header("Host", host)
.
This will not make the current request re-match new virtual machines. But
rather, it usually just affects the Host
request header forwarded to the upstream
servers.
For example:
host("images.foo.com") =>
set-req-host("images.foo.com.s3.amazonaws.com");
set-resp-cookie
syntax: set-resp-cookie(name, value, domain: DOMAIN, path: PATH, http-only: BOOL, expires: TIME, max-age: TIME)
Sets a new response cookie, overriding any existing cookies of the same name.
set-resp-cookie-samesite
syntax: set-resp-cookie-samesite(value)
syntax: set-resp-cookie-samesite(value, names: NAMES)
Sets the SameSite property of response cookies. The value can be Strict or Lax. All response cookies are modified by default; specific response cookies can be selected via the names parameter.
Below is an example:
true =>
set-resp-cookie-samesite("Lax", names: ("cookie1", "cookie2"));
set-resp-header
syntax: set-resp-header(header, value)
syntax: set-resp-header(header1, value1, header2, value2, ...)
syntax: set-resp-header(%name-value-pairs)
phase: rewrite resp-header
Sets response headers to the current response. Existing headers with the same name will be removed. If you do not want to override existing same-name headers, please use add-resp-header instead.
Below is an example:
true =>
set-resp-header("X-Powered-By", "OpenResty Edge");
set-resp-body
syntax: set-resp-body(value)
phase: resp-body
Sets response body to the current response.
It can only be used in the resp-body defer blocks.
Below is an example:
true =>
defer resp-body {
set-resp-body("hello world");
};
capture-resp-body
syntax: capture-resp-body(size)
phase: rewrite resp-header
Captures the response body into the log variable $response_body. The size argument sets the maximum number of bytes to capture.
If the size of the response body exceeds this maximum, only part of the response body content will be captured.
Takes a size argument with a size unit like kB and mB.
Below is an example:
true =>
capture-resp-body(1 [kB]);
true =>
defer resp-header {
{
resp-status(403) =>
capture-resp-body(8192);
};
};
set-resp-status
syntax: set-resp-status(code)
Sets the current response’s status code to the value of the code
argument.
Do not call this action after the response header is already sent out (like triggered by a print or say call).
Below is an example:
req-header("User-Agent") eq "" =>
set-resp-status(450),
say("custom body message for 450 status"),
exit(450),
done;
set-proxy-retry-condition
syntax: set-proxy-retry-condition(...)
phase: rewrite
Specifies the cases in which a request should be passed to another upstream server. The default is error, timeout.
The supported values are error, timeout, invalid_header, http_500, http_502, http_503, http_504, http_403, http_404, http_429, non_idempotent.
- error: an error occurred while establishing a connection to the server, passing a request to it, or reading a response header;
- timeout: a timeout occurred while establishing a connection to the server, passing a request to it, or reading response headers;
- invalid_header: a server returned an empty or invalid response;
- http_CODE: a server returned the specified HTTP status code;
- non_idempotent: usually requests for non-idempotent methods (POST, LOCK, PATCH) will only be passed to one server, enabling this option explicitly allows such requests to be retried;
true =>
set-proxy-retry-condition('http_404'),
set-proxy-retries(1),
done;
In the example, if the upstream server returns a 404 status code, 1 retry will be performed.
set-proxy-retries
syntax: set-proxy-retries(num)
Sets the number of retries to the upstream. The default is 0, which means the request won’t be passed to another upstream server even if it meets the retry condition.
set-proxy-timeouts
syntax: set-proxy-timeouts(connect: TIMEOUT?, send: TIMEOUT?, read: TIMEOUT?)
syntax: set-proxy-timeouts(connect: TIMEOUT)
syntax: set-proxy-timeouts(send: TIMEOUT)
syntax: set-proxy-timeouts(read: TIMEOUT)
Sets the proxy timeouts. The defaults are: connect timeout 60s, send timeout 60s, read timeout 60s.
TIMEOUT is a quantity value, like 60 [s] or 60 [ms].
set-proxy-recursion-depth
syntax: set-proxy-recursion-depth()
syntax: set-proxy-recursion-depth(DEPTH)
Sets the maximum proxy recursion depth to prevent proxy recursion. The default DEPTH is -1, which means proxy recursion detection is disabled.
Note that when this feature is turned on, an additional request header, OR-Proxy-Recursion-Depth, will be sent to the upstream.
set-uri
syntax: set-uri(uri)
Sets the current request URI to a new value. The URI value should not contain any query string or host/port parts.
This action is mainly useful for changing the request URI to be forwarded to the upstream via the proxy.
set-uri-arg
syntax: set-uri-arg(name, value)
syntax: set-uri-arg(name1, value1, name2, value2, ...)
syntax: set-uri-arg(%name-value-pairs)
Sets URI arguments in the current request. Existing URI arguments with the same name will be removed. If you do not want to override existing same-name URI arguments, please use add-uri-arg instead.
Below is an example:
true =>
set-uri-arg("uid", "1234");
set-upstream-name
syntax: set-upstream-name(upstream-1, weight-1?, upstream-2?, weight2?)
phase: rewrite
Proxies the request to a single upstream or to multiple upstreams with weights.
upstream-1 and upstream-2 are upstream names; the upstream’s info is looked up at runtime.
If there is no such upstream in the application, the info is taken from the global upstreams.
uri('/') =>
set-upstream-name('upstream-1', 1, 'upstream-2', 1);
set-backup-upstream-name
syntax: set-backup-upstream-name(upstream-1, upstream-2?)
phase: rewrite
Sets backup upstreams for the application. Requests will be proxied to the backup upstreams when all the main upstreams fail and the retry conditions are matched.
uri('/') =>
set-upstream-name('upstream-2'),
set-proxy-retry-condition('error'),
set-proxy-retries(1),
set-backup-upstream-name('upstream-1');
set-upstream-addr
syntax: set-upstream-addr(ip: IP, [host: DOMAIN], port: PORT, [scheme: SCHEME])
phase: rewrite
Sets a single upstream by address. The ip and host arguments cannot be used at the same time.
The default value of the scheme argument is http.
uri('/ip') =>
set-upstream-addr(ip: '127.0.0.1', port: 80);
uri('/domain') =>
set-upstream-addr(host: 'localhost', port: 80);
uri('/scheme') =>
set-upstream-addr(host: 'localhost', port: 443, scheme: 'https');
upstream-has-live-nodes
syntax: upstream-has-live-nodes(upstream-name)
phase: rewrite
Checks whether the upstream is healthy.
The following cases will return true:
- Presence of one or more healthy upstream nodes
- Upstream health check is not turned on
The following cases will return false:
- No healthy upstream nodes exist
- No upstream of the specified name exists
Note
The upstream health check for a node is only turned on when the node starts accessing the upstream.
upstream-has-live-nodes('upstream-1') =>
set-upstream-addr(ip: '127.0.0.1', port: 80);
set-upstream-retry-uri
syntax: set-upstream-retry-uri(uri)
phase: rewrite
Sets the URI to retry with when proxying to the upstream server fails. The original URI is retried the specified number of times first, and then the new URI is used for 1 more retry.
The uri argument supports Edgelang Str variables and string constants.
true =>
set-upstream-retry-uri("/hello"),
set-proxy-retry-condition('http_404'),
done;
In the example, if the upstream server returns a 404 status code, it will use /hello
to retry.
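The order of attempts described above can be sketched in Python as follows. This is only an illustration of one reading of the text, not the actual implementation: the function name attempt_uris and its parameters are hypothetical, and it assumes that the first attempt, every ordinary retry, and the final extra retry each count as one upstream request.

```python
def attempt_uris(original_uri, retries, retry_uri=None):
    """Return the URIs that would be tried, in order (hypothetical helper).

    Assumption: one initial attempt plus `retries` ordinary retries use the
    original URI; if a retry URI is configured, it is used for 1 final retry.
    """
    attempts = [original_uri] * (1 + retries)
    if retry_uri is not None:
        attempts.append(retry_uri)
    return attempts


if __name__ == "__main__":
    # With 1 ordinary retry configured and "/hello" as the retry URI.
    print(attempt_uris("/index.html", 1, "/hello"))
```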
set-max-body-size
syntax: set-max-body-size(size)
phase: rewrite
Sets the maximum size of the POST body this request will accept. For requests with a valid Content-Length header, this action checks immediately and terminates request processing with 413 Request Entity Too Large if the length is greater than size. For chunked encoding requests and HTTP/2 requests, the check is performed as the buffers are being processed.
If size is set to 0, this check is disabled.
Takes a size argument with a size unit like kB and mB.
true =>
set-max-body-size( 1 [kB]); # 1000 bytes
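The Content-Length check described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the real implementation: the function name check_content_length is hypothetical, and it only covers the case where a valid Content-Length header is present.

```python
def check_content_length(content_length, max_size):
    """Return 413 if the declared body is too large, else None.

    Hypothetical helper: `max_size` is in bytes, and 0 disables the check,
    mirroring the description of set-max-body-size.
    """
    if max_size == 0:
        return None  # check disabled
    if content_length > max_size:
        return 413  # Request Entity Too Large
    return None


if __name__ == "__main__":
    # 1 [kB] is 1000 bytes, as in the example above.
    print(check_content_length(1500, 1000))
    print(check_content_length(1000, 1000))
```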
set-access-log-off
syntax: set-access-log-off()
phase: rewrite
Dynamically turns off access log for the current request.
true =>
set-access-log-off();
sleep
syntax: sleep(time)
Sleeps for the specified number of seconds without blocking. One can specify time resolution up to 0.001 seconds (i.e., one millisecond).
Below is an example:
true =>
sleep(0.5);
utc-time
syntax: utc-time()
syntax: utc-time(year: YEAR, month: MONTH, day: MDAY, hour: HOUR, min: MINUTE, sec: SECOND)
Returns the number of non-leap seconds from the epoch (usually January 1, 1970 00:00:00 UTC) to the time specified by the parameters, interpreted in the UTC time zone.
The return value is a quantity typed value.
For example, 2019-01-01 00:00:00 in UTC yields 1546300800 [s].
When called without parameters, the current Unix epoch time is returned, which is the same value as now.
When only some parameters are given, the rest are filled with the default values:
- YEAR: 0
- MONTH: 1
- MDAY: 1
- HOUR: 0
- MINUTE: 0
- SECOND: 0
true =>
utc-time().say; # current timestamp (quantity typed value)
# 1546300800 [s]
true =>
utc-time(year: 2019, month: 1, day: 1, hour: 0, min: 0, sec: 0).say,
utc-time(year: 2019).say;
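The value quoted above can be checked independently with a few lines of Python. This sketch only reproduces the epoch arithmetic with the standard library; the function name utc_seconds is hypothetical, and unlike utc-time it requires the year to be given.

```python
import calendar


def utc_seconds(year, month=1, day=1, hour=0, minute=0, sec=0):
    """Seconds since 1970-01-01 00:00:00 UTC for the given UTC time.

    Omitted fields use the same defaults as listed above for
    MONTH, MDAY, HOUR, MINUTE, and SECOND.
    """
    return calendar.timegm((year, month, day, hour, minute, sec))


if __name__ == "__main__":
    print(utc_seconds(2019, 1, 1, 0, 0, 0))  # 1546300800
    print(utc_seconds(2019))                 # same value, defaults filled in
```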
waf-mark-risk
syntax: waf-mark-risk(level: LEVEL, msg: MESSAGE)
Marks the request as evil with the optional level and msg arguments.
The basic workflow of WAF is as follows:
- mark a request as evil with a risk level,
- the risk is summed up based on the remote address,
- the block action (configured in the admin site) is performed when the summed risk reaches the threshold score (configured in the admin site).
The level argument can be:
- definite means the current request is definitely dangerous; the block action is performed anyway (no matter what the threshold level is).
- high means high risk.
- middle means middle risk.
- low means low risk (default).
- debug means a debug rule; the block action is never performed.
The msg argument is used to describe the WAF rule.
It will be shown in the admin site when a request matches this rule.
For example:
uri contains rx:s{root\.exe}
=>
waf-mark-risk(level: 'definite');
uri contains any('nessustest', 'appscan_fingerprint')
=>
waf-mark-risk(msg: 'Request Indicates a Security Scanner Scanned the Site');
waf-config
syntax: waf-config(action: NAME, url: URL)
NOTE: this action is deprecated; using run-waf-rule-sets instead is recommended.
Configures the WAF block action. The action argument is required and can be:
- log, means log only,
- reject, means 403 reject,
- redirect, means 302 redirect to the specified url.
The url argument is required when the action is redirect.
For example:
uri-prefix("/api/")
=>
waf-config(action: "redirect", url: "http://foo.com/bar");
uri-prefix("/static/")
=>
waf-config(action: "reject");
run-waf-rule-sets
syntax: run-waf-rule-sets(action: ACTION, url: URL, key: KEY, threshold: THRESHOLD, observe-time: TIME, clearance-time: TIME, page-template-id: ID, name-1, name-2, ...)
Runs the specified WAF rule sets and performs the action when the total risk score (with the same key) reaches the threshold within the observe-time.
The total risk score is summed per key, and the default value of key is client-addr.
The action can be:
- log, means log only; this is the default,
- reject, means 403 reject,
- redirect, means 302 redirect to the specified url,
- hcaptcha, returns a captcha page using https://www.hcaptcha.com/,
- edge-captcha, returns a captcha page served by Edge itself,
- page-template, returns the page rendered according to the page template; supported variables include ::CLIENT_IP::, ::HOST::, ::HCAPTCHA_BOX::, ::CAPTCHA_BOX::, etc.,
- close-connection, closes the HTTP connection directly,
- redirect-validate, performs a 302 redirect validation. If the validation fails, 403 is returned directly. It can be used to defend against DDoS attacks,
- js-challenge, challenges the request with JavaScript.
The url is required when the action is redirect.
When the action is hcaptcha, edge-captcha, or page-template, clearance-time is required. After verification, all requests with the same key will be allowed to pass within the clearance-time.
For example:
true
=>
run-waf-rule-sets(action: "hcaptcha",
key: client-addr,
threshold: 100,
observe-time: 60 [s],
clearance-time: 300 [s],
"14", "15"
);
The 14 and 15 in the example correspond to OpenResty Edge’s XSS rule set (application_attack_xss) and SQL injection rule set (application_attack_sqli), respectively.
set-ssl-protocols
syntax: set-ssl-protocols("protocol1", "protocol2", ...)
Sets the SSL handshake protocols for HTTPS requests. The available protocols are: SSLv2, SSLv3, TLSv1, TLSv1.1, TLSv1.2, TLSv1.3.
example:
true =>
set-ssl-protocols("TLSv1.1", "TLSv1.2"),
done;
set-ssl-ciphers
syntax: set-ssl-ciphers("cipher1:cipher2:...")
Sets the SSL ciphers for HTTPS requests.
example:
true =>
set-ssl-ciphers("DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA"),
done;
set-uploaded-file-args
syntax: set-uploaded-file-args(max-content-len, max-file-count)
Sets parameters for parsing uploaded files.
max-content-len indicates the length of the cached file content; 0 means the file content is not retained, and the default value is 0.
File content is often used for webshell detection.
max-file-count indicates the number of cached uploaded files; 0 means no limit, and the default value is 1.
example:
true =>
# 102400 bytes = 100 KiB
set-uploaded-file-args(max-content-len: 102400, max-file-count: 10),
done;
uploaded-file-extensions
syntax: uploaded-file-extensions()
Gets the extensions of the uploaded files.
example:
any(uploaded-file-extensions) eq any("txt") =>
say("found txt file");
uploaded-file-contents
syntax: uploaded-file-contents()
Gets the contents of the uploaded files. By default, file content is not cached; it is cached only after a content length has been set through set-uploaded-file-args.
example:
any(uploaded-file-contents) contains any("attack data") =>
say("found attack data");
uploaded-file-names
syntax: uploaded-file-names()
Get the uploaded file names.
example:
true =>
say(uploaded-file-names);
uploaded-file-combined-size
syntax: uploaded-file-combined-size()
Get the sum of uploaded file sizes.
example:
uploaded-file-combined-size > 1024 =>
say("file too large");
uploaded-file-contents-matched
syntax: uploaded-file-contents-matched()
Checks that all uploaded files have matching extensions and content.
Returns true if they all match, otherwise returns false.
example:
uploaded-file-contents-matched =>
say("all file extensions and contents match");
req-args-combined-size
syntax: req-args-combined-size()
Gets the sum of the sizes of the URI and POST parameters.
For example, a=1&b=2 will return 4, which is the combined size of a, 1, b, and 2.
example:
req-args-combined-size > 1024 =>
say("args too large");
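The a=1&b=2 arithmetic above can be reproduced with a short Python sketch. It is only an approximation of the described behaviour: the function name args_combined_size is hypothetical, it handles a single query string rather than URI and POST parameters together, and it assumes sizes are counted on the decoded names and values.

```python
from urllib.parse import parse_qsl


def args_combined_size(query):
    """Sum the lengths of all parameter names and values (hypothetical helper).

    Assumption: the separators '=' and '&' are not counted, which matches
    the a=1&b=2 example yielding 4.
    """
    pairs = parse_qsl(query, keep_blank_values=True)
    return sum(len(name) + len(value) for name, value in pairs)


if __name__ == "__main__":
    print(args_combined_size("a=1&b=2"))  # 4
```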
validate-url-encoding
syntax: validate-url-encoding(data)
Checks whether the data string contains invalid URL encoding. Returns true if the encoding is invalid, and false if it is valid.
example:
validate-url-encoding(req-uri) == false =>
say("valid request url");
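To make the return convention concrete (true means invalid), here is a minimal Python sketch. The exact validation rules used by validate-url-encoding are not stated in this manual; the sketch assumes the common rule that every % must be followed by two hexadecimal digits, and the function name has_invalid_url_encoding is hypothetical.

```python
import re

# A '%' that is NOT followed by exactly two hex digits is a broken escape.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_invalid_url_encoding(data):
    """Return True if `data` contains a malformed percent-escape."""
    return _BAD_ESCAPE.search(data) is not None


if __name__ == "__main__":
    print(has_invalid_url_encoding("/a%20b"))  # False: well-formed escape
    print(has_invalid_url_encoding("/a%zz"))   # True: non-hex digits
```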
validate-byte-range
syntax: validate-byte-range(data, "range1", "range2", ...)
Checks whether the characters in the data string are all within the required ranges (range1, range2, …).
Returns false if they are all within the requested ranges, and true if not.
example:
validate-byte-range(any(uri-arg-names), "1-255") == true =>
say("invalid uri arg names"),
done;
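The return convention (true means at least one byte falls outside the ranges) can be illustrated with a small Python sketch. The function name out_of_byte_range is hypothetical, and the range syntax is an assumption based on the "1-255" example: each range is either "low-high" or a single byte value.

```python
def out_of_byte_range(data, *ranges):
    """Return True if any byte of `data` is outside all given ranges.

    Assumption: each range string is "low-high" (inclusive) or a single
    decimal byte value such as "32".
    """
    allowed = set()
    for spec in ranges:
        low, _, high = spec.partition("-")
        high = high or low  # a single value is a one-element range
        allowed.update(range(int(low), int(high) + 1))
    return any(byte not in allowed for byte in data)


if __name__ == "__main__":
    print(out_of_byte_range(b"abc", "1-255"))     # False: all bytes allowed
    print(out_of_byte_range(b"a\x00b", "1-255"))  # True: NUL byte is outside
```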
validate-csrf-token
syntax: validate-csrf-token()
syntax: validate-csrf-token(ttl: TTL)
This action is used in conjunction with the inject-csrf-token action for CSRF protection. If the request method is one of HEAD, GET, TRACE, or OPTIONS, the action returns ok directly. Otherwise it checks whether the _edge_csrf_token parameter in the URI parameters or POST form parameters is valid. If it is valid, ok is returned; if not, the corresponding error message is returned.
The error message may be:
- missing csrf token: There is no CSRF token parameter.
- invalid csrf token: The CSRF token is illegal and may be a forged parameter.
- expired csrf token: The CSRF token parameter has expired.
The validity period of the CSRF token can be set via the ttl parameter, with a default expiry time of 3600 seconds. If this parameter is 0, the CSRF token never expires.
The following is an example of CSRF protection:
my Str $csrf-res;
true =>
rm-req-header("Accept-Encoding"),
defer resp-body {
inject-csrf-token();
},
$csrf-res = validate-csrf-token(3600),
{
$csrf-res ne "ok" =>
waflog($csrf-res, action: "block", rule-name: "csrf_protection"),
exit(403);
};
waflog
syntax: waflog(msg)
syntax: waflog(msg, action: ACTION, rule-name: RULENAME)
Produces a WAF log, and if needed, you can specify the interception action and rule name to be displayed in the log through the action and rule-name parameters.
Some examples:
true =>
waflog("log"),
waflog("forbidden", action: "block", rule-name: "custom-rule");
http-version
syntax: http-version()
Gets the HTTP version of the request. The possible values are 0.9, 1.0, 1.1, and 2.0.
Some examples:
http-version() eq "1.1" =>
say(http-version());
req-line
syntax: req-line()
Get the HTTP request line.
Some examples:
true =>
say(req-line());
skip-json-values
syntax: skip-json-values(uri-arg-values:true, post-arg-values:true, req-cookie-values:true)
Makes the WAF skip the inspection of URI parameters, POST parameters, or cookie parameters whose values are JSON strings.
Some examples:
true =>
skip-json-values(uri-arg-values: true),
run-waf-rule-sets(action: "block", threshold: 0, "12"),
done;
is-json-string
syntax: is-json-string(str)
Check if a string is in JSON format.
Some examples:
is-json-string('{"k":"v"}') =>
say("is json string"),
done;
set-proxy-ignore-no-cache
syntax: set-proxy-ignore-no-cache(enable)
Sets whether to ignore Cache-Control: no-cache and Cache-Control: no-store.
Some examples:
true =>
enable-proxy-cache(key: uri),
set-proxy-cache-default-ttl(1 [min]),
set-proxy-ignore-no-cache(),
set-upstream('my-upstream');
set-ngx-var
Syntax: set-ngx-var(key, value)
This directive is used to set a key-value pair in ngx.var.
Some examples:
true =>
set-ngx-var("foo", 32);
ngx-var
Syntax: val = ngx-var(key)
This directive is used to get the value of a specified key from ngx.var.
Some examples:
true =>
say(ngx-var("foo"));
set-ctx-var
Syntax: set-ctx-var(key, value)
This directive is used to set a key-value pair in ngx.ctx._edge_ctx.
Some examples:
true =>
set-ctx-var("foo", 32);
ctx-var
Syntax: val = ctx-var(key)
This directive is used to get the value of a specified key from ngx.ctx._edge_ctx.
Some examples:
true =>
say(ctx-var("foo"));
run-slow-ratio-circuit-breaker
Syntax: run-circuit-breaker(key: KEY, window-time: WINDOW_TIME, open-time: OPEN_TIME, hopen-time: HOPEN_TIME, failure-time: FAILURE_TIME, failure-percent: FAILURE_PERCENT, min-reqs-in-window: MIN_REQS_IN_WINDOW, open-action: OPEN_ACTION, resp-status: RESP_STATUS, resp-body: RESP_BODY)
Syntax: run-slow-ratio-circuit-breaker(key: KEY, window-time: WINDOW_TIME, open-time: OPEN_TIME, hopen-time: HOPEN_TIME, failure-time: FAILURE_TIME, failure-percent: FAILURE_PERCENT, min-reqs-in-window: MIN_REQS_IN_WINDOW, open-action: OPEN_ACTION, resp-status: RESP_STATUS, resp-body: RESP_BODY)
Syntax: run-failure-ratio-circuit-breaker(key: KEY, window-time: WINDOW_TIME, open-time: OPEN_TIME, hopen-time: HOPEN_TIME, failure-status: FAILURE_STATUS, failure-percent: FAILURE_PERCENT, min-reqs-in-window: MIN_REQS_IN_WINDOW, open-action: OPEN_ACTION, resp-status: RESP_STATUS, resp-body: RESP_BODY)
Syntax: run-failure-count-circuit-breaker(key: KEY, window-time: WINDOW_TIME, open-time: OPEN_TIME, hopen-time: HOPEN_TIME, failure-status: FAILURE_STATUS, failure-count: FAILURE_COUNT, min-reqs-in-window: MIN_REQS_IN_WINDOW, open-action: OPEN_ACTION, resp-status: RESP_STATUS, resp-body: RESP_BODY)
These directives are used to enable circuit breakers. The currently supported circuit breakers include slow request ratio circuit breakers, error ratio circuit breakers, and error count circuit breakers. The default is to use the “slow request ratio circuit breaker.”
Different circuit breakers are identified by different key values.
- window-time: The sliding window length used to calculate the statistical time range for error or slow response ratios.
- open-time: The duration for which the circuit breaker remains open after being tripped, during which all requests undergo a specific open-action.
- hopen-time: The duration of the half-open state, which is the phase where the circuit breaker attempts to recover by conducting a limited number of test requests.
- failure-time: The time threshold for a request to be considered a slow request.
- failure-status: The list of status codes considered as failed requests, such as 502 and 503.
- failure-percent: The percentage threshold of failed or slow requests that triggers the tripping of the circuit breaker.
- failure-count: The count threshold of failed requests that triggers the tripping of the circuit breaker.
- min-reqs-in-window: The minimum number of requests that must be reached within the sliding window time to calculate the failure percentage and count and consider tripping.
- open-action: The action executed when the circuit breaker is in the open state, currently supporting values such as exit to return a default response and redirect to redirect to an alternative service, etc.
- resp-status: The HTTP status code returned to requests when the circuit breaker is open and open-action is set to exit.
- resp-body: The response body content returned to requests when the circuit breaker is open and open-action is set to exit.
- redirect-url: When open-action is set to redirect, the URL to which the requests are redirected after the circuit breaker is open.
Some examples:
true =>
run-slow-ratio-circuit-breaker(key: "example", window-time: 60, failure-time: 500,
failure-percent: 50, min-reqs-in-window: 2);
my Num @status-codes-1 = (502, 503, 504);
true =>
run-failure-ratio-circuit-breaker(key: "example", window-time: 60, failure-status: @status-codes-1,
failure-percent: 50, min-reqs-in-window: 4);
my Num @status-codes-2 = (502, 503, 504);
true =>
run-failure-count-circuit-breaker(key: "example", window-time: 60, failure-status: @status-codes-2,
failure-count: 2, min-reqs-in-window: 4);
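How failure-percent and min-reqs-in-window interact can be sketched in Python. This is a hedged illustration of the ratio-based tripping decision only, not the actual implementation: the function name should_trip is hypothetical, it ignores the open and half-open states, and it assumes the breaker trips when the ratio is greater than or equal to the threshold (the manual does not say whether the comparison is inclusive).

```python
def should_trip(total, failed, failure_percent, min_reqs_in_window):
    """Decide whether a ratio-based breaker would trip (hypothetical helper).

    total:  requests observed in the current sliding window
    failed: failed (or slow) requests among them
    """
    if total < min_reqs_in_window:
        return False  # too few samples to judge the ratio
    # Integer comparison avoids floating point rounding of the percentage.
    return failed * 100 >= failure_percent * total


if __name__ == "__main__":
    # failure-percent: 50, min-reqs-in-window: 4, as in the example above.
    print(should_trip(total=3, failed=3, failure_percent=50, min_reqs_in_window=4))
    print(should_trip(total=4, failed=2, failure_percent=50, min_reqs_in_window=4))
```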
req-rejected
Syntax: req-rejected()
This directive was first introduced in OpenResty Edge 24.9.1-7. It is used to check if an HTTP request has been marked as rejected by rate-limiting actions such as limit-req-rate, limit-req-count, limit-req-concurrency, or block-req.
Examples:
req-rejected() =>
errlog("rejected"),
exit(503);
In this example, it logs the rejected requests and returns a 503 status code.
req-header-has-underscore
Syntax: req-header-has-underscore()
This directive was first introduced in OpenResty Edge 24.9.1-7. It is used to check if there are any underscores in the keys of HTTP request headers.
Examples:
req-header-has-underscore() =>
exit(400);
In this example, it prohibits requests with underscores in HTTP header keys by returning a 400 status code.
Case Study
case 1. Append ‘charset’ attribute to the Content-Type response header
true =>
defer resp-header {
{
resp-header("Content-type") contains any("html", "javascript", "xml"),
resp-header("Content-type") !contains "charset=utf-8" =>
set-resp-header("Content-type", resp-header("Content-type") ~ "; charset=utf-8");
};
};
case 2. Add the HTTP response header when the header does not exist
true =>
defer resp-header {
{
! resp-header("Access-Control-Allow-Origin") =>
set-resp-header("Access-Control-Allow-Origin", "*");
};
};
case 3. Replace the content in the response body
true =>
defer resp-body {
replace-resp-filter(rx{http://example.com/}, "https://new.example.com/", g: true);
};
Note: If the response is already encoded (for example, compressed), the replacement may not succeed. To avoid encoded responses, adjust the Accept-Encoding request header, for example by removing it with rm-req-header("Accept-Encoding").
case 4. Insert a JavaScript snippet into the response body
true =>
defer resp-body {
replace-resp-filter(rx{<head>}, "<script>the script you want to insert</script><head>");
};
Note: If the response is already encoded (for example, compressed), the replacement may not succeed. To avoid encoded responses, adjust the Accept-Encoding request header, for example by removing it with rm-req-header("Accept-Encoding").
Author
Yichun Zhang <yichun@openresty.com>, OpenResty Inc.
Copyright & License
Copyright (C) 2017-2020 by OpenResty Inc. All rights reserved.
This document is proprietary and contains confidential information. Redistribution of this document without written permission from the copyright holders is prohibited at all times.