Field value extraction functions

Last Updated: 2025-11-14

Introduction

A common use case for the field value extraction functions is shown in the following figure. Once a log has been processed into structured data, it can be used for SQL analysis.

e_regex function

Function definition

Extract values from a field with a regular expression and write the matches to the target fields.

Syntax description

e_regex(field, regex, fields_info=None, mode="fill-auto", pack_json='')

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
field	The field name to be extracted	String	Yes	-	-
regex	Regular expression	String	Yes	-	-
fields_info	The target field names and types for the matches. Required when the regular expression does not define named capture groups.	List<Table>	No	-	-
mode	Field overwriting mode.	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto
pack_json	Pack all matching results of the regular expression into the field specified by pack_json. The default value is empty, indicating no packing.	String	No	-	-

Example

Example 1

Original log:

{"content": "1234abcd5678"}

Processing rules:

e_regex("content", "\d+", [{'target1':'long'}])

Processing results:

{"content": "1234abcd5678", "target1": 1234}

Example 2

Original log:

{"content": "1234abcd"}

Processing rules:

e_regex("content", "(?<target1>\d+)(.*)", [{'target2':'string'}])

Processing results:

{"content": "1234abcd", "target1": "1234", "target2": "abcd"}

Example 3

Original log:

{"content": "1234abcd5678"}

Processing rules:

e_regex("content", "\d+", [{'target1':'long'}, {'target2':'long'}])

Processing results:

{"content": "1234abcd5678", "target1": 1234, "target2": 5678}

Example 4

Original log:

{"content": "1234abcd5678"}

Processing rules:

e_regex("content", "\d+", [{'target1':'long'}, {'target2':'long'}], pack_json='new')

Processing results:

{"content": "1234abcd5678", "new": {"target1": 1234, "target2": 5678}}
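
The behavior in Examples 3 and 4 can be approximated in plain Python. This is only a rough sketch, not the service's implementation: the helper name `regex_extract` and the `CASTS` table are hypothetical, the `mode` parameter is ignored (extracted fields simply overwrite existing ones), and named capture groups are not handled.

```python
import re

# Hypothetical casts for the type names that appear in fields_info.
CASTS = {"long": int, "string": str}

def regex_extract(log, field, regex, fields_info, pack_json=""):
    """Assign successive regex matches to the listed target fields, in order."""
    matches = re.findall(regex, log.get(field, ""))
    extracted = {}
    for spec, value in zip(fields_info, matches):
        for name, type_name in spec.items():
            extracted[name] = CASTS[type_name](value)
    result = dict(log)
    if pack_json:
        # Pack every extracted field under a single key.
        result[pack_json] = extracted
    else:
        result.update(extracted)
    return result

log = {"content": "1234abcd5678"}
fields = [{"target1": "long"}, {"target2": "long"}]
flat = regex_extract(log, "content", r"\d+", fields)
packed = regex_extract(log, "content", r"\d+", fields, pack_json="new")
print(flat)
print(packed)
```

With `pack_json` left empty the matches become top-level fields; with `pack_json="new"` they are nested under `new`, mirroring Example 4.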

e_json function

Function definition

Extract field values from JSON.

Syntax description

e_json(field, depth=100, prefix="", suffix="", fmt="simple", sep=".", mode="fill-auto")

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
field	The field name to be extracted	String	Yes	-	-
depth	The maximum depth to which fields are expanded. A value of 1 expands only the first level.	Int	No	100	1~2000
prefix	The prefix added to the field name when expanding.	String	No	-	-
suffix	The suffix added to the field name when expanding.	String	No	-	-
fmt	Formatting method for field names. simple (default value): use the node name as the field name; full: combine the parent node and the current node as the field name; parent: use the complete path as the field name; root: combine the root node and the current node as the field name.	String	No	simple	simple/full/parent/root
sep	The separator placed between parent and child node names. It needs to be set when fmt is full, parent, or root.	String	No	.	-
mode	Field overwriting mode. The default is fill-auto	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto

Example

Example 1

Original log:

{"content": "{\"a\": \"a1\", \"b\": \"b1\"}"}

Processing rules:

e_json("content")

Processing results:

{"content": "{\"a\": \"a1\", \"b\": \"b1\"}", "a": "a1", "b": "b1"}

Example 2

Original log:

{"content": "{\"a\": \"a1\", \"b\": \"b1\"}"}

Processing rules:

e_json("content", prefix="_", suffix="__")

Processing results:

{"content": "{\"a\": \"a1\", \"b\": \"b1\"}", "_a__": "a1", "_b__": "b1"}
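
A rough Python approximation of the default `fmt="simple"` expansion may help show how `depth`, `prefix`, and `suffix` interact. The helper `json_extract` is hypothetical, and two points are assumptions rather than documented behavior: the field is assumed to hold a JSON object, and nodes below the depth limit are assumed to be kept as JSON text. The `mode` parameter is not modeled.

```python
import json

def json_extract(log, field, depth=100, prefix="", suffix=""):
    """Expand a JSON object so that node names become field names."""
    extracted = {}

    def walk(node, level):
        for key, value in node.items():
            if isinstance(value, dict) and level < depth:
                walk(value, level + 1)
            else:
                # Assumption: anything not expanded further is stored as text.
                text = value if isinstance(value, str) else json.dumps(value)
                extracted[prefix + key + suffix] = text

    walk(json.loads(log[field]), 1)
    result = dict(log)
    result.update(extracted)
    return result

log = {"content": '{"a": "a1", "b": "b1"}'}
print(json_extract(log, "content", prefix="_", suffix="__"))

nested = {"content": '{"a": {"c": "c1"}, "b": "b1"}'}
print(json_extract(nested, "content", depth=1))
print(json_extract(nested, "content", depth=2))
```

With `depth=1` only the first level is expanded, so `a` stays a JSON string; with `depth=2` the inner node `c` becomes its own field.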

e_sep function

Function definition

Extract field values by splitting the field content on a specified separator, which may be longer than one character.

Syntax description

e_sep(src_field, fields_info, sep=" ", quote="", restrict=false, mode="fill-auto")

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
src_field	The field name to be extracted	String	Yes	-	-
fields_info	The target field name after matching.	List<Table>	Yes	-	-
sep	Separator, not limited to a single character.	String	No	Space	-
quote	Quote character, used to wrap the value.	String	No	-	-
restrict	Behavior when the number of extracted values does not match the number of target fields. true: perform no extraction; false: fill as many of the leading target fields as possible.	String	No	false	true/false
mode	Field overwriting mode. The default is fill-auto	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto

Example

Example 1

Original log:

{"content": "a1 b1"}

Processing rules:

e_sep('content', [{'a':'string'}, {'b':'string'}])

Processing results:

{"content": "a1 b1", "a": "a1", "b": "b1"}

Example 2

Original log:

{"content": "a1 b1"}

Processing rules:

e_sep('content', [{'a':'string'}])

Processing results:

{"content": "a1 b1", "a": "a1"}

Example 3

Original log:

{"content": "a1 b1"}

Processing rules:

e_sep('content', [{'a':'string'}, {'b':'string'}, {'c':'string'}])

Processing results:

{"content": "a1 b1", "a": "a1", "b": "b1"}
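
The effect of `restrict` when the value count and the field count differ can be sketched in Python. The helper `sep_extract` is hypothetical and only approximates the description above; `quote` and `mode` are not modeled, and only string-typed target fields are considered.

```python
def sep_extract(log, src_field, fields_info, sep=" ", restrict=False):
    """Split the source field on sep and assign the pieces to the target fields."""
    values = log.get(src_field, "").split(sep)
    names = [name for spec in fields_info for name in spec]
    result = dict(log)
    if restrict and len(values) != len(names):
        # Counts differ and restrict is on: leave the log untouched.
        return result
    # zip stops at the shorter sequence, so only the leading fields are filled.
    result.update(zip(names, values))
    return result

log = {"content": "a1 b1"}
three = [{"a": "string"}, {"b": "string"}, {"c": "string"}]
loose = sep_extract(log, "content", three)
strict = sep_extract(log, "content", three, restrict=True)
print(loose)
print(strict)
```

Here `loose` matches Example 3 (fields `a` and `b` are filled, `c` is not created), while `strict` returns the log unchanged.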

e_csv function

Function definition

Extract field values by splitting the field content on a separator, a comma by default.

Syntax description

e_csv(src_field, fields_info, sep=",", quote="", restrict=false, mode="fill-auto")

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
src_field	The field name to be extracted	String	Yes	-	-
fields_info	The target field name after matching.	List<Table>	Yes	-	-
sep	Separator, not limited to a single character.	String	No	,	-
quote	Quote character, used to wrap the value.	String	No	-	-
restrict	Behavior when the number of extracted values does not match the number of target fields. true: perform no extraction; false: fill as many of the leading target fields as possible.	String	No	false	true/false
mode	Field overwriting mode. The default is fill-auto	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto

Example

Example 1

Original log:

{"content": "a1,b1"}

Processing rules:

e_csv('content', [{'a':'string'}, {'b':'string'}])

Processing results:

{"content": "a1,b1", "a": "a1", "b": "b1"}

Example 2

Original log:

{"content": "a1,b1"}

Processing rules:

e_csv('content', [{'a':'string'}])

Processing results:

{"content": "a1,b1", "a": "a1"}

Example 3

Original log:

{"content": "a1,b1"}

Processing rules:

e_csv('content', [{'a':'string'}, {'b':'string'}, {'c':'string'}])

Processing results:

{"content": "a1,b1", "a": "a1", "b": "b1"}
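
The examples above do not exercise `quote`. As an illustration of what a quote character is generally for, the following Python sketch uses the standard `csv` module so that a separator inside a quoted value is not treated as a field boundary. This is an assumption about the intent of `quote`, not a statement of how the service parses input; the helper `csv_extract` is hypothetical, and Python's `csv` module accepts only a single-character separator.

```python
import csv

def csv_extract(log, src_field, fields_info, sep=",", quote='"'):
    """Split one line with csv.reader, honoring the quote character."""
    row = next(csv.reader([log[src_field]], delimiter=sep, quotechar=quote))
    names = [name for spec in fields_info for name in spec]
    result = dict(log)
    result.update(zip(names, row))
    return result

log = {"content": '"a1,x",b1'}
out = csv_extract(log, "content", [{"a": "string"}, {"b": "string"}])
print(out)
```

The comma inside the quoted value stays part of field `a`, and the surrounding quotes are dropped.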

e_psv function

Function definition

Extract field values by splitting the field content on a separator, a vertical bar (|) by default.

Syntax description

e_psv(src_field, fields_info, sep="|", quote="", restrict=false, mode="fill-auto")

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
src_field	The field name to be extracted	String	Yes	-	-
fields_info	The target field name after matching.	List<Table>	Yes	-	-
sep	Separator, not limited to a single character.	String	No	|	-
quote	Quote character, used to wrap the value.	String	No	-	-
restrict	Behavior when the number of extracted values does not match the number of target fields. true: perform no extraction; false: fill as many of the leading target fields as possible.	String	No	false	true/false
mode	Field overwriting mode. The default is fill-auto	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto

Example

Example 1

Original log:

{"content": "a1|b1"}

Processing rules:

e_psv('content', [{'a':'string'}, {'b':'string'}])

Processing results:

{"content": "a1|b1", "a": "a1", "b": "b1"}

Example 2

Original log:

{"content": "a1|b1"}

Processing rules:

e_psv('content', [{'a':'string'}])

Processing results:

{"content": "a1|b1", "a": "a1"}

Example 3

Original log:

{"content": "a1|b1"}

Processing rules:

e_psv('content', [{'a':'string'}, {'b':'string'}, {'c':'string'}])

Processing results:

{"content": "a1|b1", "a": "a1", "b": "b1"}

e_tsv function

Function definition

Extract field values by splitting the field content on a separator, a tab character by default.

Syntax description

e_tsv(src_field, fields_info, sep="\t", quote="", restrict=false, mode="fill-auto")

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
src_field	The field name to be extracted	String	Yes	-	-
fields_info	The target field name after matching.	List<Table>	Yes	-	-
sep	Separator, not limited to a single character.	String	No	\t	-
quote	Quote character, used to wrap the value.	String	No	-	-
restrict	Behavior when the number of extracted values does not match the number of target fields. true: perform no extraction; false: fill as many of the leading target fields as possible.	String	No	false	true/false
mode	Field overwriting mode. The default is fill-auto	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto

Example

Example 1

Original log:

{"content": "a1\tb1"}

Processing rules:

e_tsv('content', [{'a':'string'}, {'b':'string'}])

Processing results:

{"content": "a1\tb1", "a": "a1", "b": "b1"}

Example 2

Original log:

{"content": "a1\tb1"}

Processing rules:

e_tsv('content', [{'a':'string'}])

Processing results:

{"content": "a1\tb1", "a": "a1"}

Example 3

Original log:

{"content": "a1\tb1"}

Processing rules:

e_tsv('content', [{'a':'string'}, {'b':'string'}, {'c':'string'}])

Processing results:

{"content": "a1\tb1", "a": "a1", "b": "b1"}

e_kv function

Function definition

Extract key-value pairs from a field using a regular expression whose capture groups identify the key and the value.

Syntax description

e_kv(src_field, reg, keyIndex, valueIndex, fields_info=None, mode="fill-auto")

Parameter description

Parameter name	Parameter description	Parameter type	Required or not	Parameter default	Parameter range
src_field	The field name to be extracted	String	Yes	-	-
reg	The regular expression that matches each key-value pair.	String	Yes	-	-
keyIndex	The index of the capture group in the regular expression match that holds the key.	Int	Yes	-	-
valueIndex	The index of the capture group in the regular expression match that holds the value.	Int	Yes	-	-
fields_info	The target field name after matching.	List<Table>	No	-	-
mode	Field overwriting mode. The default is fill-auto	String	No	fill-auto	fill/fill-auto/add/add-auto/overwrite/overwrite-auto

Example

Example 1

Original log:

{"content": "a:a1, b:b1"}

Processing rules:

e_kv('content', '([a-z]+):([a-z0-9]+)', 1, 2, [{'a':'string'}, {'b':'string'}])

Processing results:

{"content": "a:a1, b:b1", "a": "a1", "b": "b1"}
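
How `keyIndex` and `valueIndex` select capture groups can be sketched with Python's `re` module. The helper `kv_extract` is hypothetical, and treating `fields_info` as a filter on which keys are kept is an assumption made for this sketch; type conversion and `mode` are not modeled.

```python
import re

def kv_extract(log, src_field, reg, key_index, value_index, fields_info=None):
    """Scan the source field and turn each regex match into a key/value field."""
    wanted = None
    if fields_info:
        # Assumption: only keys named in fields_info are kept.
        wanted = {name for spec in fields_info for name in spec}
    result = dict(log)
    for match in re.finditer(reg, log.get(src_field, "")):
        key = match.group(key_index)
        value = match.group(value_index)
        if wanted is None or key in wanted:
            result[key] = value
    return result

log = {"content": "a:a1, b:b1"}
out = kv_extract(log, "content", r"([a-z]+):([a-z0-9]+)", 1, 2,
                 [{"a": "string"}, {"b": "string"}])
print(out)
```

Group 1 of each match supplies the field name and group 2 supplies its value, which reproduces the result of Example 1.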
