Data Analysis - SurrealTools/Documentation GitHub Wiki

COUNT Function

The count() function has a couple of different uses...

Sub query count

SELECT * FROM count((SELECT * FROM entries)) is way faster then SELECT count() AS total FROM user GROUP BY ALL;

SELECT * FROM count((SELECT * FROM person WHERE name > "Jack" ORDER BY id)) > 9;

The first is to aggregate data in a GROUP BY clause...

SELECT count() AS total FROM user GROUP BY ALL;
SELECT count() FROM user GROUP BY count;
-- and after 1.0.0-beta.9 you will be able to do
SELECT count() FROM user GROUP ALL;
SELECT count(age > 35) FROM [{ age: 33 }, { age: 45 }, { age: 39 }] GROUP ALL;

The second is to count the total number of array values within a single record...

CREATE person:tobie SET tags = ['Golang', 'Rust', 'JavaScript'];
SELECT count(tags) AS total_tags FROM person;

The third is to count the number of remote graph edge connections...

SELECT count(->knows->person) AS total_friends FROM person;

Aggregate on LIVE queries

You won't ever be able to run aggregate (GROUP BY clauses) on LIVE queries, but you can do it a different/better way... 2. What you will be able to do is create a computed view...

DEFINE TABLE person_by_age AS
  SELECT math::mean(age), country FROM person GROUP BY country
;

and then create a live query which watches that table/view...

LIVE SELECT * FROM person_by_age;

This would then be more efficient, as the live query is only being 'activated' for single record changes, and not potentially large whole tables.

Q and A

**Question: **Do I get it right that I can actually speed ordered queries by defining a table which is already ordered by a field that I need?

DEFINE TABLE pools_by_liquidity AS SELECT * FROM pool ORDER BY liquidityUSD

**Answer: **'foreign tables' don't have orders, or limits. But they are great for speeding up GROUP BY or WHERE filters on a specific table.

Question: I have datalake-like tables (append-only): user, login-log. It's collecting just fine, but also I want to have a new BI-like table, say, user-statistic, where I want to show the user plus the count of login attempts within the last 30 days.

DEFINE TABLE login;

DEFINE TABLE logins_by_user_by_month AS
  SELECT
    count() AS total,
    time::group(time, 'month') AS month,
    user
  FROM
    login
  GROUP BY user, month
;

CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();

SELECT * FROM logins_by_user_by_month WHERE user = user:tobie AND month = "2022-09-00";

Solutions

you need to add the ‘agility’ field to the SELECT expression. Then you can group by the ‘agility’ field.

SELECT math::sum(strength), agility FROM player GROUP BY agility;

INSERT INTO player (agility, strength, scores) VALUES (10, 10, [97, 83, 79]);
INSERT INTO player (agility, strength, scores) VALUES (10, 50, [87, 90, 88]);

Basically, the math::sum() function needs to know if it is an aggregate function or not. It detects whether it is an aggregate function by seeing if there is a GROUP BY clause.

If there is a GROUP BY clause it performs an aggregate across the different records...

SELECT math::sum(strength) FROM player GROUP BY ALL;

[
  {
    "time": "127.166µs",
    "status": "OK",
    "result": [
      {
        "math::sum": 60
      }
    ]
  }
]

If there is no GROUP BY clause, then the function sums up values in a field...

SELECT id, math::sum(scores) AS total_score FROM player;

[
  {
    "time": "111.833µs",
    "status": "OK",
    "result": [
      {
        "id": "player:44gcrkjx7qexkqqi7ohs",
        "total_score": 265
      },
      {
        "id": "player:kp98yuru87qvc6rjm2g1",
        "total_score": 259
      }
    ]
  }
]

However...

SELECT math::sum(strength) FROM player;

Should return

[
  {
    "time": "203.041µs",
    "status": "OK",
    "result": [
      {
        "math::sum": 50
      },
      {
        "math::sum": 10
      }
    ]
  }
]

SELECT math::sum(math::sum(scores)) FROM player GROUP BY ALL;

will return all of the aggregated scores, for all players together. the GROUP BY aggregate applies to the outermost function only.