Compared to Microsoft KQL
Differences between Berserk's KQL and Microsoft's Kusto Query Language
Berserk implements the Kusto Query Language (KQL) as used in Azure Data Explorer, Azure Monitor, and Microsoft Sentinel. For the most part, queries that work in Azure Data Explorer will also work in Berserk. This page lists the places where Berserk diverges — either to better fit its internals and performance model, or to add features specific to observability workloads.
Coming from Microsoft Kusto
If you already write KQL for Azure Data Explorer, most of your queries work unchanged. A handful of habits are worth adjusting:
- Don't declare schema, and don't cast just to read a field. Every field, including nested ones, resolves automatically from the raw record, so query it directly; there's no column list to maintain. Use bracket notation for keys that contain dots: `resource['service.name']`, not `resource.service.name`.
- Filter on bare fields and let Berserk coerce. Write `where status == 500` or `where level == "error"` straight on a dynamic field: Berserk compares by native type and keeps the indexes engaged. Passing a field to a typed function (`avg(value)`, `bin(timestamp, 5m)`, …) coerces it automatically via the `asXXX` family, so you rarely need a manual `tolong()`/`todouble()`.
- Cast only to cross types, and only in a projection. Reach for `to*()` when a value is stored as the wrong type (a number kept as a string), and put it in `extend`/`project`, never in a `where`: a cast inside a filter forces a per-row scan and disables pruning.
- `=~` is idiomatic here. Case-insensitive `=~` is index-friendly in Berserk (unlike ADX, where it's discouraged for performance), so prefer `level =~ "error"` to `tolower(level) == "error"`. `has` and `contains` perform comparably, so pick by meaning, not speed.
- Every query is time-bounded. There is always an effective `timestamp` range (from the time picker, `--since`/`--until`, or an explicit `where timestamp …`), so you never scan all of history by accident.
The sections below detail each of these differences.
Schema and Field Resolution
The biggest difference from Microsoft KQL is how Berserk handles schema.
Microsoft Kusto requires fixed schemas — every column must be defined in advance, and referencing an unknown column is an error.
Berserk stores the full original record in a special $raw column (a dynamic value). Unknown column names are automatically resolved from $raw, so you can query nested fields like resource.service.name without declaring them first. This is called permissive mode and is the default. Strict mode (matching Microsoft behavior) is available but not the default.
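A minimal sketch of permissive mode in practice. The table name `logs` and the value `"checkout"` are hypothetical, and the field names are borrowed from examples elsewhere on this page; none of them is declared in advance.

```kql
// Hypothetical table `logs`; none of these fields has a declared column.
logs
| where resource['service.name'] == "checkout"            // key contains dots, so use brackets
| project timestamp, event_time = attributes.event_time   // nested field, resolved from $raw
```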
String Coercion and the asXXX Family
Because permissive mode resolves fields from $raw as dynamic, Berserk needs a rule for how a dynamic value becomes a typed one. It uses two regimes that behave differently on purpose.
Comparisons and scan predicates are compared by native type — never coerced. A bare where field == "x" works directly on a dynamic field and keeps the indexes engaged (bloom / shard / range). A value that can't match is simply not equal — a numeric field == "5" is false, not coerced. Don't wrap a scan predicate in tostring() / tolong(); that forces per-row evaluation and disables pruning.
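The contrast can be sketched as two filters over a hypothetical `logs` table whose `status` field is stored as a number. Both are valid KQL; only the first leaves the predicate in a form the indexes can use.

```kql
// Preferred: bare comparison by native type; bloom / shard / range indexes stay engaged.
logs
| where status == 500

// Avoid: wrapping the field in a cast forces per-row evaluation and disables pruning.
logs
| where tolong(status) == 500
```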
Typed function arguments are auto-coerced via the asXXX family. When a dynamic field is passed to a function or operator that expects a concrete type, Berserk injects the matching extractor — asstring, aslong, asint, asdouble, asbool, asdatetime, astimespan, or asnumeric — so observability data feeds typed functions with no explicit cast:
summarize avg(value) by bin(timestamp, 5m) // value auto-coerces to numeric (asnumeric)
extend host = toupper(resource.host.name) // asstring extracts the string, then upper-cases
`asT` extracts the value when it is already that type (or a dynamic carrying it), and otherwise yields null. It never converts across types, which is the key difference from `to*()`:
| Input | asT() — extract-or-null | to*() — convert |
|---|---|---|
| a dynamic carrying a T | the value | the value |
| a value of a different type | null | parsed/converted if possible |
| string "42" into a long | aslong → null | tolong → 42 |
Use an explicit to*() only to cross types — a number or datetime stored as a string — and only in project/extend, never in a filter:
extend t = todatetime(attributes.event_time) // event_time is a STRING → parse it (asdatetime would be null)
extend n = tolong(attributes.count_str) // numeric stored as a string → parse it
Compared to Microsoft Kusto: Microsoft KQL has only the converting `to*()` functions and requires an explicit cast to feed a dynamic into a typed context. Berserk adds the non-converting `asXXX` family and applies it automatically for typed arguments, so data that arrives entirely as dynamic works without manual casts, while scan predicates stay bare and index-friendly. If a typed function returns unexpected nulls, the stored value isn't the type you assumed: check `gettype(field)` and add an explicit `to*()` in a projection.
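One way to apply that diagnosis, sketched against a hypothetical `logs` table in which `attributes.count_str` holds a number stored as a string. The first query inspects the stored type; the second converts in a projection before aggregating.

```kql
// Step 1: see which types are actually stored in the field.
logs
| summarize row_count = count() by stored_type = gettype(attributes.count_str)

// Step 2: the value is a string, so convert it in extend (not in where), then aggregate.
logs
| extend n = tolong(attributes.count_str)
| summarize avg(n) by bin(timestamp, 5m)
```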
Time-Bounded Queries
Microsoft Kusto does not require a time filter — queries can scan entire tables.
Berserk is a time-series database, and every query must be bounded by timestamp. When you select a range in the Time Picker or pass --since/--until to the CLI, Berserk inserts a where timestamp between (<START> .. <END>) clause into your query behind the scenes. If your query already includes an explicit time filter (e.g. | where timestamp > ago(1h)), that takes precedence over the time picker or CLI parameters.
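As an illustration, the query below carries its own time filter, so that filter applies instead of the Time Picker or `--since`/`--until` range. The table name `logs` and the `level` field are hypothetical.

```kql
logs
| where timestamp > ago(1h)      // explicit bound; takes precedence over the picker or CLI range
| where level == "error"
| summarize errors = count() by bin(timestamp, 5m)
```

Without the first `where`, the query would be bounded by the injected `between` clause described above.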
Implicit Result Limit
Microsoft Kusto returns up to 500,000 records by default (configurable via set truncationmaxrecords).
Berserk applies an implicit | take 2000 to queries that have no operator limiting result size. This keeps queries fast by default. To retrieve more rows, add an explicit limit — for example | take 10000, | tail 100, or any aggregation like | summarize ... that naturally bounds the output.
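A sketch of the three cases, using a hypothetical `logs` table with a `level` field:

```kql
// No size-limiting operator: the implicit `| take 2000` applies.
logs
| where level == "error"

// Explicit limit: up to 10000 rows are returned.
logs
| where level == "error"
| take 10000

// Aggregation: the output is bounded by the number of groups.
logs
| summarize errors = count() by bin(timestamp, 5m)
```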
Null Strings
Berserk matches Microsoft Kusto here: a string is never null, so isnull("") returns false — use isempty() to test for an empty-or-missing string. In string equality, a null or absent value counts as the empty string (attr == "" matches absent rows, attr != "" does not), again matching Microsoft Kusto. Non-string comparisons are unaffected — null == 0 is false.
The `string` type has no distinct null representation today (an absent string reads as ""); see the string type documentation for details.
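For example, to separate rows that carry a value from rows where it is empty or absent (the table `logs` is hypothetical; `attr` is the placeholder field used above):

```kql
// Rows where attr is empty or absent.
logs
| where isempty(attr)

// Rows where attr has a non-empty value; absent rows compare equal to "" and are excluded.
logs
| where attr != ""
```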
Null Comparisons
The full treatment — including three-valued logic in filters and how dynamic values compare against typed values — lives on Nulls, Dynamics, and Coercion. The summary: Berserk matches Microsoft Kusto's null-comparison semantics for non-string types, which are deliberately uneven:
- Null against a concrete value is two-valued: `int(null) == 4` is `false`, and `int(null) != 4` is `true`, so `where i != 5` keeps rows where `i` is null.
- Null against null is null: `int(null) == int(null)` is the null bool, not `true`.
- Ordering against null is null: `int(null) < 4` is null.
A null predicate result filters the row out, and not() / and / or follow three-valued (Kleene) logic: not(null) is null, null and false is false, null or true is true. The practical consequence: where not(x > 5) drops rows where x is null — the negation of an unknown is still unknown. Test for null explicitly with isnull() / isnotnull(); x == int(null) is not a null check (it yields null, never true).
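A sketch of that consequence, using a hypothetical `logs` table with a nullable numeric field `x`:

```kql
// Drops rows where x is null: x > 5 is null, not(null) is null,
// and a null predicate filters the row out.
logs
| where not(x > 5)

// Keeps the null rows as well: isnull(x) is true for them, and true or null is true.
logs
| where isnull(x) or not(x > 5)
```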
String Search Performance
Microsoft Kusto recommends has over contains because has uses a term index and is significantly faster.
Berserk uses bloom filters and columnar indexing to accelerate all string search operators. While case-sensitive variants (has_cs, contains_cs, ==) are still fastest, the performance gap between has and contains is much smaller than in Microsoft Kusto.
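With speed less of a deciding factor, the choice can follow meaning. In standard KQL, `has` matches whole terms and `contains` matches any substring; the sketch below assumes Berserk follows those semantics and uses a hypothetical `logs` table with a `body` field.

```kql
// Whole-term match: "timeout" must appear as a complete term.
logs
| where body has "timeout"

// Substring match: also matches longer words such as "timeouts".
logs
| where body contains "timeout"
```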
Case-Insensitive Matching of Non-ASCII Text
Berserk's case-insensitive operators (=~, has, contains, startswith, endswith, …) fold ASCII case exactly like Microsoft Kusto.
For non-ASCII text, case folding is best-effort and can diverge from Microsoft Kusto in rare cases — for example characters whose upper- and lower-case forms differ in byte length (İ/i, ß/ss) or whose forms share byte sequences (σ/Σ/ς). Berserk deliberately uses a fast byte-oriented fold rather than full Unicode case mapping: the cost of full Unicode folding on every scanned value is not justified for how rarely these characters carry meaning in log and trace data. This is an intentional divergence, not a bug.
When you need exact matching of non-ASCII text, use the case-sensitive variants (==, has_cs, contains_cs).
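For instance, with a hypothetical `logs` table and hypothetical `user_name` and `body` fields, exact matching of non-ASCII text might look like this:

```kql
logs
| where user_name == "İpek"          // case-sensitive equality, no case folding involved
| where body contains_cs "Straße"    // case-sensitive substring match
```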
Berserk-Specific Functions and Operators
These functions and operators are Berserk extensions that do not exist in Microsoft KQL. This table is generated from the YAML function definitions — add custom: true to a function's YAML to include it here.
| Name | Kind | Description |
|---|---|---|
| annotate | operator | Adds type annotations to dynamic columns, enabling forward-flow type inference |
| current_table | scalar | Returns the table name for the current row. Used internally by the search operator. |
| deriv | aggregate | Computes the derivative (rate of change) for a gauge metric. Unlike rate(), |
| extract_log_template | scalar | Normalizes a string into a structural template by replacing variable tokens (numbers, UUIDs, IPs, hex values, quoted strings) with typed placeholders. Useful for grouping log messages by structure. |
| fieldstats | operator | Analyzes dynamic column values to discover field paths and their statistics, |
| log_template_hash | scalar | Computes a hash of the structural log template, for grouping similar logs without allocating the template string. Equivalent to hashing the output of extract_log_template, but with zero heap allocations. |
| log_template_id | scalar | Returns a stable 16-character hex string identifying the structural log template of the input line. Equivalent to formatting the output of log_template_hash as zero-padded lowercase hex: small enough to store on rows as an indexed attribute, large enough to make per-template groupings collision-free for log volumes encountered in practice. |
| log_template_regex | scalar | Generates a regex pattern that matches log lines with the same structural template. Variable tokens (numbers, UUIDs, IPs, hex, quoted strings) are replaced with regex wildcards while literal text is preserved. The output is designed for use with `matches regex` to leverage bloom filter optimization. |
| make_graph | aggregate | Folds parent-linked rows into a `dynamic` `{nodes, edges}` graph: one node per |
| merge_graphs | aggregate | Unions canonical `{nodes, edges}` graphs (produced by `make_graph` or |
| otel-log-stats | operator | Single-pass OTEL log exploration: discovers attributes and computes |
| otel_histogram_percentile | aggregate | Aggregate that merges OpenTelemetry histogram data points and extracts one |
| otel_histogram_rate | aggregate | Per-second rate of observation count for an OpenTelemetry histogram metric. |
| otel_rate | aggregate | Computes the per-second rate from an OpenTelemetry type=sum metric. |
| rate | aggregate | Computes the per-second rate of change for a counter metric, handling counter |
| trace-find | operator | Finds traces by structural span relationships (ancestor/descendant/sibling) and correlated logs. Evaluates parent-child relationships between spans within each trace and returns matching traces. Optional output clauses (`summarize`, `where`) control what data is extracted from each matching trace. Predicates inside `{ }` blocks use standard KQL where-clause syntax. |
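A rough sketch of how the log-template functions might be used to group messages by structure. The call shape is an assumption: this page does not give signatures, so `extract_log_template` is shown taking the log line as its only argument, and the table `logs` and field `body` are hypothetical.

```kql
// Assumed signature: extract_log_template(<string>) returning the template string.
logs
| summarize hits = count() by template = extract_log_template(body)
| top 20 by hits
```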
Datetime Precision
Microsoft Kusto measures datetime and timespan values in 100-nanosecond ticks.
Berserk supports nanosecond precision internally and provides additional functions for working with Unix timestamps at different precisions: unixtime_seconds_todatetime, unixtime_milliseconds_todatetime, unixtime_microseconds_todatetime, and unixtime_nanoseconds_todatetime.
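A sketch of converting a raw Unix timestamp. It assumes a hypothetical `logs` table whose `attributes.time_unix_nano` field holds nanoseconds since the epoch as a number, and that the function takes that number as its single argument (this page does not spell out the signature).

```kql
logs
| extend event_time = unixtime_nanoseconds_todatetime(attributes.time_unix_nano)
| project timestamp, event_time
```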
Unsupported Features
The following Microsoft KQL features are not yet available in Berserk:
- Control commands: only `.show tables`, `.show databases`, and `.show table <name> schema as json` are supported
- Materialized views
- External tables
- Stored functions (user-defined functions via `.create function`)
- Cross-cluster and cross-database queries
- Workbooks integration
Not Yet Implemented Functions
These are standard Microsoft KQL functions that Berserk recognizes but has not yet implemented. Using them produces a helpful error message. This list is generated from the engine source code.
convert_angle, convert_energy, convert_force, convert_length, convert_mass, convert_speed, convert_temperature, convert_volume, current_cluster_endpoint, current_database, current_principal, current_principal_details, current_principal_is_member_of, cursor_after, extent_id, extent_tags, format_ipv4, format_ipv4_mask, geo_angle, geo_azimuth, geo_closest_point_on_line, geo_closest_point_on_polygon, geo_distance_2points, geo_distance_point_to_line, geo_distance_point_to_polygon, geo_from_wkt, geo_geohash_neighbors, geo_geohash_to_central_point, geo_geohash_to_polygon, geo_h3cell_children, geo_h3cell_level, geo_h3cell_neighbors, geo_h3cell_parent, geo_h3cell_rings, geo_h3cell_to_central_point, geo_h3cell_to_polygon, geo_info_from_ip_address, geo_intersection_2lines, geo_intersection_2polygons, geo_intersection_line_with_polygon, geo_intersects_2lines, geo_intersects_2polygons, geo_intersects_line_with_polygon, geo_line_buffer, geo_line_centroid, geo_line_densify, geo_line_interpolate_point, geo_line_length, geo_line_locate_point, geo_line_simplify, geo_line_to_s2cells, geo_point_buffer, geo_point_in_circle, geo_point_in_polygon, geo_point_to_geohash, geo_point_to_h3cell, geo_point_to_s2cell, geo_polygon_area, geo_polygon_buffer, geo_polygon_centroid, geo_polygon_densify, geo_polygon_perimeter, geo_polygon_simplify, geo_polygon_to_h3cells, geo_polygon_to_s2cells, geo_s2cell_neighbors, geo_s2cell_to_central_point, geo_s2cell_to_polygon, geo_simplify_polygons_array, geo_union_lines_array, geo_union_polygons_array, has_any_ipv4, has_any_ipv4_prefix, has_ipv4, has_ipv4_prefix, hll_merge, ipv4_compare, ipv4_is_in_any_range, ipv4_is_in_range, ipv4_is_match, ipv4_is_private, ipv4_netmask_suffix, ipv4_range_to_cidr_list, ipv6_compare, ipv6_is_in_any_range, ipv6_is_in_range, ipv6_is_match, merge_tdigest, parse_csv, todecimal, toscalar