Data Analysis - SurrealTools/Documentation GitHub Wiki
COUNT Function
The count() function has several different uses...
Sub query count
Counting a subquery with SELECT * FROM count((SELECT * FROM entries)); is much faster than the grouped form SELECT count() AS total FROM entries GROUP BY ALL;
SELECT * FROM count((SELECT * FROM person WHERE name > "Jack" ORDER BY id)) > 9;
- The first is to aggregate data in a GROUP BY clause...
SELECT count() AS total FROM user GROUP BY ALL;
SELECT count() FROM user GROUP BY count;
-- and after 1.0.0-beta.9 you will be able to do
SELECT count() FROM user GROUP ALL;
SELECT count(age > 35) FROM [{ age: 33 }, { age: 45 }, { age: 39 }] GROUP ALL;
- The second is to count the total number of array values within a single record...
CREATE person:tobie SET tags = ['Golang', 'Rust', 'JavaScript'];
SELECT count(tags) AS total_tags FROM person;
- The third is to count the number of remote graph edge connections...
SELECT count(->knows->person) AS total_friends FROM person;
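As a rough mental model of the grouped form, here is a sketch in Python rather than SurrealQL. It assumes that count() with no argument adds one per record and that count(expression) adds one per record where the expression is truthy. The helper names count_all and count_where are hypothetical and are not part of SurrealDB.

```python
def count_all(records):
    """Model of SELECT count() ... GROUP ALL: one per record."""
    return len(records)


def count_where(records, predicate):
    """Model of SELECT count(<expr>) ... GROUP ALL:
    one per record where the expression is truthy."""
    return sum(1 for record in records if predicate(record))


# Same inline data as the count(age > 35) query above.
people = [{"age": 33}, {"age": 45}, {"age": 39}]

total = count_all(people)
over_35 = count_where(people, lambda record: record["age"] > 35)
print(total, over_35)  # 3 2
```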
Aggregate on LIVE queries
You won't ever be able to run aggregates (GROUP BY clauses) on LIVE queries, but you can do it a different, better way. What you will be able to do is create a computed view...
DEFINE TABLE person_by_age AS
SELECT math::mean(age), country FROM person GROUP BY country
;
and then create a live query which watches that table/view...
LIVE SELECT * FROM person_by_age;
This would then be more efficient, as the live query is only being 'activated' for single record changes, and not potentially large whole tables.
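To illustrate why this is cheaper, here is a toy Python model of a view that keeps a running sum and count per country. It is only a sketch of the idea, not how SurrealDB implements views, and the class and method names are hypothetical. Each new person record touches exactly one group, and subscribers (standing in for the live query) are notified about that one row only.

```python
from collections import defaultdict


class MeanAgeByCountry:
    """Toy model of a pre-computed view: mean age per country."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)
        self.listeners = []  # callbacks standing in for LIVE query subscribers

    def on_person_created(self, person):
        # Update only the group that the new record belongs to.
        country = person["country"]
        self._sum[country] += person["age"]
        self._count[country] += 1
        row = {
            "country": country,
            "mean_age": self._sum[country] / self._count[country],
        }
        # Notify subscribers about the single changed view row.
        for notify in self.listeners:
            notify(row)


view = MeanAgeByCountry()
events = []
view.listeners.append(events.append)

view.on_person_created({"country": "UK", "age": 30})
view.on_person_created({"country": "UK", "age": 40})
view.on_person_created({"country": "FR", "age": 50})
```

After the three inserts, events holds three notifications, one per changed group row, and the person table is never rescanned.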
Q and A
**Question:** Do I get it right that I can actually speed up ordered queries by defining a table which is already ordered by the field that I need?
DEFINE TABLE pools_by_liquidity AS SELECT * FROM pool ORDER BY liquidityUSD
**Answer:** 'Foreign tables' don't have orders or limits, but they are great for speeding up GROUP BY or WHERE filters on a specific table.
**Question:** I have data-lake-like (append-only) tables: user and login-log. Data collection works fine, but I also want a new BI-like table, say user-statistic, that shows each user plus the count of login attempts within the last 30 days.
DEFINE TABLE login;
DEFINE TABLE logins_by_user_by_month AS
    SELECT
        count() AS total,
        time::group(time, 'month') AS month,
        user
    FROM
        login
    GROUP BY user, month
;
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
CREATE login SET user = user:tobie, time = time::now();
SELECT * FROM logins_by_user_by_month WHERE user = user:tobie AND month = "2022-09-00";
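The grouping performed by the view can be modelled in Python as follows. This is a sketch under one assumption: that grouping a time by 'month' means truncating it to the start of its calendar month. The month_bucket helper and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def month_bucket(ts):
    """Truncate a timestamp to the first instant of its calendar month."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


# Hypothetical login records, standing in for rows of the login table.
logins = [
    {"user": "user:tobie", "time": datetime(2022, 9, 3, 10, 15, tzinfo=timezone.utc)},
    {"user": "user:tobie", "time": datetime(2022, 9, 28, 23, 59, tzinfo=timezone.utc)},
    {"user": "user:tobie", "time": datetime(2022, 10, 1, 0, 5, tzinfo=timezone.utc)},
    {"user": "user:jaime", "time": datetime(2022, 9, 12, 8, 0, tzinfo=timezone.utc)},
]

# One counter entry per (user, month) pair, like GROUP BY user, month.
totals = Counter((login["user"], month_bucket(login["time"])) for login in logins)

september = datetime(2022, 9, 1, tzinfo=timezone.utc)
print(totals[("user:tobie", september)])  # 2
```

Note that this buckets by calendar month, which approximates a rolling 30-day window but is not identical to it: a login on 28 September and one on 1 October fall into different rows.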
Solutions
You need to add the ‘agility’ field to the SELECT expression. Then you can group by the ‘agility’ field.
SELECT math::sum(strength), agility FROM player GROUP BY agility;
INSERT INTO player (agility, strength, scores) VALUES (10, 10, [97, 83, 79]);
INSERT INTO player (agility, strength, scores) VALUES (10, 50, [87, 90, 88]);
Basically, the math::sum() function needs to know if it is an aggregate function or not. It detects whether it is an aggregate function by seeing if there is a GROUP BY clause.
If there is a GROUP BY clause it performs an aggregate across the different records...
SELECT math::sum(strength) FROM player GROUP BY ALL;
[
  {
    "time": "127.166µs",
    "status": "OK",
    "result": [
      {
        "math::sum": 60
      }
    ]
  }
]
If there is no GROUP BY clause, then the function sums up values in a field...
SELECT id, math::sum(scores) AS total_score FROM player;
[
  {
    "time": "111.833µs",
    "status": "OK",
    "result": [
      {
        "id": "player:44gcrkjx7qexkqqi7ohs",
        "total_score": 265
      },
      {
        "id": "player:kp98yuru87qvc6rjm2g1",
        "total_score": 259
      }
    ]
  }
]
However...
SELECT math::sum(strength) FROM player;
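should return each record's own strength, because without a GROUP BY clause nothing is aggregated across records: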
[
  {
    "time": "203.041µs",
    "status": "OK",
    "result": [
      {
        "math::sum": 50
      },
      {
        "math::sum": 10
      }
    ]
  }
]
SELECT math::sum(math::sum(scores)) FROM player GROUP BY ALL;
will return the sum of all scores for all players together: the inner math::sum() totals each record's scores array, and the GROUP BY aggregate applies to the outermost function only.
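The three behaviours can be checked against a small Python model that uses the two player records inserted above. This is only a sketch of the semantics described here, not SurrealDB code, and the variable names are hypothetical.

```python
# The two player records from the INSERT statements above.
players = [
    {"agility": 10, "strength": 10, "scores": [97, 83, 79]},
    {"agility": 10, "strength": 50, "scores": [87, 90, 88]},
]

# No GROUP BY: math::sum(scores) sums the array inside each record.
per_record_scores = [sum(player["scores"]) for player in players]

# GROUP BY ALL: math::sum(strength) aggregates across all records.
total_strength = sum(player["strength"] for player in players)

# Nested form: the inner sum runs per record, the outer sum is the aggregate.
grand_total = sum(sum(player["scores"]) for player in players)

print(per_record_scores, total_strength, grand_total)  # [259, 265] 60 524
```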