Berserk Docs

Compared to Microsoft KQL

Differences between Berserk's KQL and Microsoft's Kusto Query Language

Berserk implements the Kusto Query Language (KQL) as used in Azure Data Explorer, Azure Monitor, and Microsoft Sentinel. For the most part, queries that work in Azure Data Explorer will also work in Berserk. This page lists the places where Berserk diverges — either to better fit its internals and performance model, or to add features specific to observability workloads.

Coming from Microsoft Kusto

If you already write KQL for Azure Data Explorer, most of your queries work unchanged. A handful of habits are worth adjusting:

  • Don't declare schema, and don't cast just to read a field. Every field — including nested ones — resolves automatically from the raw record, so query it directly; there's no column list to maintain. Use bracket notation for keys that contain dots: resource['service.name'], not resource.service.name.
  • Filter on bare fields and let Berserk coerce. Write where status == 500 or where level == "error" straight on a dynamic field — Berserk compares by native type and keeps the indexes engaged. Passing a field to a typed function (avg(value), bin(timestamp, 5m), …) coerces it automatically via the asXXX family, so you rarely need a manual tolong() / todouble().
  • Cast only to cross types, and only in a projection. Reach for to*() when a value is stored as the wrong type (a number kept as a string), and put it in extend/project, never in a where — a cast inside a filter forces a per-row scan and disables pruning.
  • =~ is idiomatic here. Case-insensitive =~ is index-friendly in Berserk (unlike ADX, where it's discouraged for performance), so prefer level =~ "error" to tolower(level) == "error". has and contains perform comparably, so pick by meaning, not speed.
  • Every query is time-bounded. There is always an effective timestamp range — from the time picker, --since/--until, or an explicit where timestamp … — so you never scan all of history by accident.

The sections below detail each of these differences.

Schema and Field Resolution

The biggest difference from Microsoft KQL is how Berserk handles schema.

Microsoft Kusto requires fixed schemas — every column must be defined in advance, and referencing an unknown column is an error.

Berserk stores the full original record in a special $raw column (a dynamic value). Unknown column names are automatically resolved from $raw, so you can query nested fields like resource.service.name without declaring them first. This is called permissive mode and is the default. Strict mode (matching Microsoft behavior) is available but not the default.
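As a minimal sketch of permissive-mode field resolution, the query below reads undeclared fields straight from the record. The table name logs and the nested field payload.request.path are hypothetical, chosen only for illustration:

```kql
logs                                              // hypothetical table name
| where timestamp > ago(15m)
| project timestamp,
          service = resource['service.name'],     // the key itself contains a dot, so use brackets
          path    = payload.request.path          // hypothetical nested path, resolved from $raw
```

Neither field needs to be declared beforehand; in strict mode the same references would be expected to fail, as they do in Microsoft Kusto.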

String Coercion and the asXXX Family

Because permissive mode resolves fields from $raw as dynamic, Berserk needs a rule for how a dynamic value becomes a typed one. It uses two regimes that behave differently on purpose.

Comparisons and scan predicates are evaluated by native type and never coerced. A bare where field == "x" works directly on a dynamic field and keeps the indexes engaged (bloom / shard / range). A value of a different type is simply not equal: comparing a numeric field with "5" yields false rather than coercing either side. Don't wrap a scan predicate in tostring() / tolong(); that forces per-row evaluation and disables pruning.
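A short sketch of the filter rule, assuming a hypothetical logs table with a numeric status field:

```kql
logs                                        // hypothetical table name
| where timestamp > ago(1h)
| where status == 500                       // bare predicate: compared by native type, indexes stay engaged
// Avoid: | where tolong(status) == 500     // a cast in the filter forces per-row evaluation
| extend status_text = tostring(status)     // converting in a projection is fine
```

The commented-out line is the pattern to avoid; the conversion moves into extend, where it no longer affects pruning.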

Typed function arguments are auto-coerced via the asXXX family. When a dynamic field is passed to a function or operator that expects a concrete type, Berserk injects the matching extractor — asstring, aslong, asint, asdouble, asbool, asdatetime, astimespan, or asnumeric — so observability data feeds typed functions with no explicit cast:

summarize avg(value) by bin(timestamp, 5m)   // value auto-coerces to numeric (asnumeric)
extend host = toupper(resource.host.name)     // asstring extracts the string, then upper-cases

asT extracts the value when it is already that type (or a dynamic carrying it), and otherwise yields null. It never converts across types — that is the key difference from to*():

| Input | asT() (extract-or-null) | to*() (convert) |
| --- | --- | --- |
| a dynamic carrying a T | the value | the value |
| a value of a different type | null | parsed/converted if possible |
| string "42" into a long | aslong: null | tolong: 42 |

Use an explicit to*() only to cross types — a number or datetime stored as a string — and only in project/extend, never in a filter:

extend t = todatetime(attributes.event_time)  // event_time is a STRING → parse it (asdatetime would be null)
extend n = tolong(attributes.count_str)        // numeric stored as a string → parse it

Compared to Microsoft Kusto: Microsoft KQL has only the converting to*() functions and requires an explicit cast to feed a dynamic into a typed context. Berserk adds the asXXX family, which extracts without converting, and applies it automatically to typed arguments. Data that arrives entirely as dynamic therefore works without manual casts, while scan predicates stay bare and index-friendly. If a typed function returns unexpected nulls, the stored value isn't the type you assumed: check gettype(field) and add an explicit to*() in a projection.
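The troubleshooting step can be sketched as two queries. The table name metrics and the field value are hypothetical, and the second query assumes the first one showed that value is stored as a string:

```kql
// Step 1: inspect which types the field actually holds
metrics                                        // hypothetical table name
| where timestamp > ago(1h)
| summarize rows = count() by stored_type = gettype(value)

// Step 2: the field turned out to be a string, so convert explicitly in a projection
metrics
| where timestamp > ago(1h)
| extend value_num = todouble(value)           // to*() crosses types; the automatic asdouble would yield null
| summarize avg(value_num) by bin(timestamp, 5m)
```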

Time-Bounded Queries

Microsoft Kusto does not require a time filter — queries can scan entire tables.

Berserk is a time-series database, and every query must be bounded by timestamp. When you select a range in the Time Picker or pass --since/--until to the CLI, Berserk inserts a where timestamp between (<START> .. <END>) clause into your query behind the scenes. If your query already includes an explicit time filter (e.g. | where timestamp > ago(1h)), that takes precedence over the time picker or CLI parameters.
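For illustration, a query with its own time range might look like the sketch below (logs is a hypothetical table name). Because the filter is explicit, it takes precedence over the time picker and over --since/--until:

```kql
logs                                                 // hypothetical table name
| where timestamp between (ago(6h) .. ago(1h))       // explicit range, overrides the picker / CLI range
| where level =~ "error"
| take 100
```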

Implicit Result Limit

Microsoft Kusto returns up to 500,000 records by default (configurable via set truncationmaxrecords).

Berserk applies an implicit | take 2000 to queries that have no operator limiting result size. This keeps queries fast by default. To retrieve more rows, add an explicit limit — for example | take 10000, | tail 100, or any aggregation like | summarize ... that naturally bounds the output.
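A minimal sketch of the two ways to bound output yourself, again using a hypothetical logs table:

```kql
// Explicit row limit: no implicit take 2000 is added
logs                                                 // hypothetical table name
| where timestamp > ago(1h)
| take 10000

// Aggregation: the output is bounded by the number of groups
logs
| where timestamp > ago(1h)
| summarize events = count() by bin(timestamp, 5m)
```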

Null Strings

Berserk matches Microsoft Kusto here: a string is never null, so isnull("") returns false — use isempty() to test for an empty-or-missing string. In string equality, a null or absent value counts as the empty string (attr == "" matches absent rows, attr != "" does not), again matching Microsoft Kusto. Non-string comparisons are unaffected — null == 0 is false.

string has no distinct null representation today (an absent string reads as ""); see the string type documentation for details.
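The string rules above can be sketched as follows, assuming a hypothetical logs table with a string field named region that is missing on some rows:

```kql
// Rows where region is empty or absent
logs                                   // hypothetical table name
| where timestamp > ago(1h)
| where isempty(region)                // isnull(region) would not find them: a string is never null

// Rows where region carries a value
logs
| where timestamp > ago(1h)
| where region != ""                   // absent values count as "", so they are excluded
```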

Null Comparisons

The full treatment, including three-valued logic in filters and how dynamic values compare against typed values, lives on the Nulls, Dynamics, and Coercion page. The summary: Berserk matches Microsoft Kusto's null-comparison semantics for non-string types, which are deliberately uneven:

  • Null against a concrete value is two-valued: int(null) == 4 is false, and int(null) != 4 is true — so where i != 5 keeps rows where i is null.
  • Null against null is null: int(null) == int(null) is the null bool, not true.
  • Ordering against null is null: int(null) < 4 is null.

A null predicate result filters the row out, and not() / and / or follow three-valued (Kleene) logic: not(null) is null, null and false is false, null or true is true. The practical consequence: where not(x > 5) drops rows where x is null — the negation of an unknown is still unknown. Test for null explicitly with isnull() / isnotnull(); x == int(null) is not a null check (it yields null, never true).
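A short sketch of how these rules play out in filters, assuming a hypothetical metrics table with a nullable numeric field x:

```kql
// Drops rows where x is null: not(null) is null, and a null predicate filters the row out
metrics                                // hypothetical table name
| where timestamp > ago(1h)
| where not(x > 5)

// Keeps them: isnull(x) is true for those rows, and true or null is true
metrics
| where timestamp > ago(1h)
| where isnull(x) or not(x > 5)
```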

String Search Performance

Microsoft Kusto recommends has over contains because has uses a term index and is significantly faster.

Berserk uses bloom filters and columnar indexing to accelerate all string search operators. While case-sensitive variants (has_cs, contains_cs, ==) are still fastest, the performance gap between has and contains is much smaller than in Microsoft Kusto.

Case-Insensitive Matching of Non-ASCII Text

Berserk's case-insensitive operators (=~, has, contains, startswith, endswith, …) fold ASCII case exactly like Microsoft Kusto.

For non-ASCII text, case folding is best-effort and can diverge from Microsoft Kusto in rare cases — for example characters whose upper- and lower-case forms differ in byte length (İ/i, ß/ss) or whose forms share byte sequences (σ/Σ/ς). Berserk deliberately uses a fast byte-oriented fold rather than full Unicode case mapping: the cost of full Unicode folding on every scanned value is not justified for how rarely these characters carry meaning in log and trace data. This is an intentional divergence, not a bug.

When you need exact matching of non-ASCII text, use the case-sensitive variants (==, has_cs, contains_cs).
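As an illustration, the sketch below searches for a term containing ß with a case-sensitive operator, so no case folding is involved. The table name logs and the field message are hypothetical:

```kql
logs                                   // hypothetical table name
| where timestamp > ago(1h)
| where message has_cs "Straße"        // case-sensitive: matches the term exactly as written
```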

Berserk-Specific Functions and Operators

These functions and operators are Berserk extensions that do not exist in Microsoft KQL. This table is generated from the YAML function definitions — add custom: true to a function's YAML to include it here.

| Name | Kind | Description |
| --- | --- | --- |
| annotate | operator | Adds type annotations to dynamic columns, enabling forward-flow type inference |
| current_table | scalar | Returns the table name for the current row. Used internally by the search operator. |
| deriv | aggregate | Computes the derivative (rate of change) for a gauge metric. Unlike rate(), |
| extract_log_template | scalar | Normalizes a string into a structural template by replacing variable tokens (numbers, UUIDs, IPs, hex values, quoted strings) with typed placeholders. Useful for grouping log messages by structure. |
| fieldstats | operator | Analyzes dynamic column values to discover field paths and their statistics, |
| log_template_hash | scalar | Computes a hash of the structural log template, for grouping similar logs without allocating the template string. Equivalent to hashing the output of extract_log_template, but with zero heap allocations. |
| log_template_id | scalar | Returns a stable 16-character hex string identifying the structural log template of the input line. Equivalent to formatting the output of log_template_hash as zero-padded lowercase hex: small enough to store on rows as an indexed attribute, large enough to make per-template groupings collision-free for log volumes encountered in practice. |
| log_template_regex | scalar | Generates a regex pattern that matches log lines with the same structural template. Variable tokens (numbers, UUIDs, IPs, hex, quoted strings) are replaced with regex wildcards while literal text is preserved. The output is designed for use with `matches regex` to leverage bloom filter optimization. |
| make_graph | aggregate | Folds parent-linked rows into a `dynamic` `{nodes, edges}` graph: one node per |
| merge_graphs | aggregate | Unions canonical `{nodes, edges}` graphs (produced by `make_graph` or |
| otel-log-stats | operator | Single-pass OTEL log exploration: discovers attributes and computes |
| otel_histogram_percentile | aggregate | Aggregate that merges OpenTelemetry histogram data points and extracts one |
| otel_histogram_rate | aggregate | Per-second rate of observation count for an OpenTelemetry histogram metric. |
| otel_rate | aggregate | Computes the per-second rate from an OpenTelemetry type=sum metric. |
| rate | aggregate | Computes the per-second rate of change for a counter metric, handling counter |
| trace-find | operator | Finds traces by structural span relationships (ancestor/descendant/sibling) and correlated logs. Evaluates parent-child relationships between spans within each trace and returns matching traces. Optional output clauses (`summarize`, `where`) control what data is extracted from each matching trace. Predicates inside `{ }` blocks use standard KQL where-clause syntax. |
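As a rough sketch of grouping log messages by structure, the query below counts rows per template. It assumes that extract_log_template accepts a single string argument, which the description above suggests but does not spell out. The table name logs and the field body are hypothetical:

```kql
logs                                                        // hypothetical table name
| where timestamp > ago(1h)
| summarize occurrences = count()
         by template = extract_log_template(body)           // assumed single-argument call; body is a hypothetical field
```

Check each function's reference page for its exact signature before relying on this form.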

Datetime Precision

Microsoft Kusto stores datetime and timespan values as 100-nanosecond ticks.

Berserk supports nanosecond precision internally and provides additional functions for working with Unix timestamps at different precisions: unixtime_seconds_todatetime, unixtime_milliseconds_todatetime, unixtime_microseconds_todatetime, and unixtime_nanoseconds_todatetime.
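A minimal sketch of converting Unix timestamps, assuming each function takes a single numeric argument as the Microsoft KQL functions of the same names do. The table and field names are hypothetical:

```kql
logs                                                                      // hypothetical table name
| where timestamp > ago(1h)
| extend started_at  = unixtime_nanoseconds_todatetime(start_unix_nano)   // hypothetical field: nanoseconds since epoch
| extend finished_at = unixtime_milliseconds_todatetime(end_unix_ms)      // hypothetical field: milliseconds since epoch
| project timestamp, started_at, finished_at
```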

Unsupported Features

The following Microsoft KQL features are not yet available in Berserk:

  • Control commands — only .show tables, .show databases, and .show table <name> schema as json are supported
  • Materialized views
  • External tables
  • Stored functions (user-defined functions via .create function)
  • Cross-cluster and cross-database queries
  • Workbooks integration

Not Yet Implemented Functions

These are standard Microsoft KQL functions that Berserk recognizes but has not yet implemented. Using them produces a helpful error message. This list is generated from the engine source code.

convert_angle, convert_energy, convert_force, convert_length, convert_mass, convert_speed, convert_temperature, convert_volume, current_cluster_endpoint, current_database, current_principal, current_principal_details, current_principal_is_member_of, cursor_after, extent_id, extent_tags, format_ipv4, format_ipv4_mask, geo_angle, geo_azimuth, geo_closest_point_on_line, geo_closest_point_on_polygon, geo_distance_2points, geo_distance_point_to_line, geo_distance_point_to_polygon, geo_from_wkt, geo_geohash_neighbors, geo_geohash_to_central_point, geo_geohash_to_polygon, geo_h3cell_children, geo_h3cell_level, geo_h3cell_neighbors, geo_h3cell_parent, geo_h3cell_rings, geo_h3cell_to_central_point, geo_h3cell_to_polygon, geo_info_from_ip_address, geo_intersection_2lines, geo_intersection_2polygons, geo_intersection_line_with_polygon, geo_intersects_2lines, geo_intersects_2polygons, geo_intersects_line_with_polygon, geo_line_buffer, geo_line_centroid, geo_line_densify, geo_line_interpolate_point, geo_line_length, geo_line_locate_point, geo_line_simplify, geo_line_to_s2cells, geo_point_buffer, geo_point_in_circle, geo_point_in_polygon, geo_point_to_geohash, geo_point_to_h3cell, geo_point_to_s2cell, geo_polygon_area, geo_polygon_buffer, geo_polygon_centroid, geo_polygon_densify, geo_polygon_perimeter, geo_polygon_simplify, geo_polygon_to_h3cells, geo_polygon_to_s2cells, geo_s2cell_neighbors, geo_s2cell_to_central_point, geo_s2cell_to_polygon, geo_simplify_polygons_array, geo_union_lines_array, geo_union_polygons_array, has_any_ipv4, has_any_ipv4_prefix, has_ipv4, has_ipv4_prefix, hll_merge, ipv4_compare, ipv4_is_in_any_range, ipv4_is_in_range, ipv4_is_match, ipv4_is_private, ipv4_netmask_suffix, ipv4_range_to_cidr_list, ipv6_compare, ipv6_is_in_any_range, ipv6_is_in_range, ipv6_is_match, merge_tdigest, parse_csv, todecimal, toscalar
