Shapes
Shapes are the core primitive for controlling sync in the ElectricSQL system.
What is a Shape?
Electric syncs little subsets of your Postgres data into local apps and services. Those subsets are defined using Shapes.
Little subsets
Imagine a Postgres database in the cloud with lots of data stored in it. It's often impractical or unwanted to sync all of this data over the network onto a local device.
A shape is a way of defining a subset of that data that you'd like to sync into a local app. Defining shapes allows you to sync just the data you want and just the data that's practical to sync onto the local device.
A client can choose to sync one shape, or lots of shapes. Many clients can sync the same shape. Multiple shapes can overlap.
Defining shapes
Shapes are defined by:
- a table, such as items
- an optional where clause to filter which rows are included in the shape
- an optional columns clause to select which columns are included
A shape contains all of the rows in the table that match the where clause, if provided. If a columns clause is provided, the synced rows will only contain those selected columns.
Limitations
Shapes are currently single table. Shape definitions are immutable.
Table
This is the root table of the shape. All shapes must specify a table and it must match a table in your Postgres database.
The value can be just a tablename like projects, or can be a qualified tablename prefixed by the database schema using a "." delimiter, such as foo.projects. If you don't provide a schema prefix, then the table is assumed to be in the public schema.
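The defaulting rule above can be sketched as a small helper. This is only an illustration: parseTableName and TableRef are hypothetical names, and the sketch assumes unquoted identifiers (a quoted identifier that itself contains a "." would need real parsing).

```typescript
// Hypothetical helper, not part of any Electric client library.
interface TableRef {
  schema: string
  table: string
}

// Splits "schema.table" on the first "." and falls back to the
// public schema when no prefix is given.
// ASSUMPTION: identifiers are unquoted and contain no "." themselves.
function parseTableName(name: string): TableRef {
  const dot = name.indexOf('.')
  if (dot === -1) {
    return { schema: 'public', table: name }
  }
  return { schema: name.slice(0, dot), table: name.slice(dot + 1) }
}
```

With this sketch, parseTableName('projects') and parseTableName('public.projects') refer to the same table.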
Where clause
Shapes can define an optional where clause to filter out which rows from the table are included in the shape. Only rows that match the where clause will be included.
The where clause must be a valid PostgreSQL query expression in SQL syntax, e.g.:
title='Electric'
status IN ('backlog', 'todo')
Where clauses support:
- columns of numerical types, boolean, uuid, text, interval, date and time types (with the exception of timetz), Arrays (but not yet Enums)
- operators that work on those types: arithmetics, comparisons, logical/boolean operators like OR, string operators like LIKE, etc.
You can use AND and OR to group multiple conditions, e.g.:
title='Electric' OR title='SQL'
title='Electric' AND status='todo'
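When you pass a where clause like these over HTTP, it needs to be URL-encoded because it contains spaces, quotes and "=" characters. Here is a minimal sketch using the standard URL API, assuming the clause is sent as a where query parameter next to table. The shapeUrl helper is a hypothetical name used only for illustration.

```typescript
// Hypothetical helper that builds a shape request URL.
// ASSUMPTION: the where clause travels as a "where" query parameter.
function shapeUrl(baseUrl: string, table: string, where?: string): string {
  const url = new URL(baseUrl)
  url.searchParams.set('table', table)
  if (where !== undefined) {
    // searchParams.set handles the encoding of spaces, quotes and "=".
    url.searchParams.set('where', where)
  }
  return url.toString()
}

const exampleUrl = shapeUrl(
  'http://localhost:3000/v1/shape',
  'items',
  "title='Electric' AND status='todo'"
)
```

Letting the URL API do the encoding avoids hand-escaping the SQL expression.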
Where clauses are limited in that they:
- can only refer to columns in the target row
- can't perform joins or refer to other tables
- can't use non-deterministic SQL functions like count() or now()
See known_functions.ex and parser.ex for the source of truth on which types, operators and functions are currently supported. If you need a feature that isn't supported yet, please raise a feature request.
Throughput
Where clause evaluation impacts data throughput. Some where clauses are optimized.
Columns
This is an optional list of columns to select. When specified, only the columns listed are synced. When not specified all columns are synced.
For example:
- columns=id,title,status - only include the id, title and status columns
- columns=id,"Status-Check" - only include id and Status-Check columns, quoting the identifiers where necessary
The specified columns must always include the primary key column(s), and should be formed as a comma separated list of column names — exactly as they are in the database schema. If the identifier was defined as case sensitive and/or with special characters, then you must quote it.
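The two rules above (always include the primary key, quote identifiers where necessary) can be sketched as a helper that builds the columns value. The function names are hypothetical, and the quoting rule is an assumption based on standard Postgres behaviour: names that are not plain lower-case identifiers are wrapped in double quotes, with embedded double quotes doubled. Reserved words would also need quoting, which this sketch does not detect.

```typescript
// ASSUMPTION: standard Postgres identifier quoting applies.
function quoteIdentifier(name: string): string {
  const isPlain = /^[a-z_][a-z0-9_]*$/.test(name)
  return isPlain ? name : `"${name.replace(/"/g, '""')}"`
}

// Hypothetical helper: builds the comma separated columns value and
// rejects lists that leave out a primary key column.
function columnsParam(columns: string[], primaryKey: string[]): string {
  const missing = primaryKey.filter((pk) => !columns.includes(pk))
  if (missing.length > 0) {
    throw new Error(`columns must include primary key column(s): ${missing.join(', ')}`)
  }
  return columns.map(quoteIdentifier).join(',')
}
```

For the second example above, columnsParam(['id', 'Status-Check'], ['id']) produces the same id,"Status-Check" value.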
Subscribing to shapes
Local clients establish shape subscriptions, typically using client libraries. These sync data from the Electric sync engine into the client using the HTTP API.
The sync service maintains shape subscriptions and streams any new data and data changes to the local client. In the client, shapes can be held as objects in memory, for example using a useShape hook, or in a normalised store or database like PGlite.
HTTP
You can sync shapes manually using the GET /v1/shape endpoint. First make an initial sync request to get the current data for the Shape, such as:
curl -i 'http://localhost:3000/v1/shape?table=foo&offset=-1'
Then switch into a live mode to use long-polling to receive real-time updates:
curl -i 'http://localhost:3000/v1/shape?table=foo&live=true&offset=...&handle=...'
These requests both return an array of Shape Log entries. You can process these manually, or use a higher-level client.
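If you do want to process the requests manually, the two curl calls above can be sketched as a single polling step. Treat this as a rough sketch under stated assumptions rather than the client's actual logic: the names ShapePosition, shapeRequestUrl and pollOnce are hypothetical, and the response header names electric-handle, electric-offset and electric-up-to-date are assumptions that you should check against the HTTP API docs.

```typescript
interface ShapePosition {
  offset: string // '-1' requests the initial sync
  handle: string | undefined // unknown until the first response arrives
  live: boolean // true once we switch to long-polling
}

// Builds the request URL for the current position in the shape log.
function shapeRequestUrl(baseUrl: string, table: string, pos: ShapePosition): string {
  const url = new URL(baseUrl)
  url.searchParams.set('table', table)
  url.searchParams.set('offset', pos.offset)
  if (pos.handle !== undefined) url.searchParams.set('handle', pos.handle)
  if (pos.live) url.searchParams.set('live', 'true')
  return url.toString()
}

// Performs one request and returns the log entries plus the position
// to use for the next request.
async function pollOnce(
  baseUrl: string,
  table: string,
  pos: ShapePosition
): Promise<{ entries: unknown[]; next: ShapePosition }> {
  const res = await fetch(shapeRequestUrl(baseUrl, table, pos))
  if (!res.ok) throw new Error(`shape request failed: ${res.status}`)
  // An empty response body is treated as "no new entries".
  const entries = res.status === 204 ? [] : ((await res.json()) as unknown[])
  // ASSUMPTION: these header names; verify them in the HTTP API docs.
  const handle = res.headers.get('electric-handle') ?? pos.handle
  const offset = res.headers.get('electric-offset') ?? pos.offset
  const live = pos.live || res.headers.has('electric-up-to-date')
  return { entries, next: { offset, handle, live } }
}
```

You would start with { offset: '-1', handle: undefined, live: false } and keep calling pollOnce with the returned next position.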
TypeScript
You can use the TypeScript Client to process the Shape Log and materialise it into a Shape object for you.
First install using:
npm i @electric-sql/client
Instantiate a ShapeStream and materialise into a Shape:
import { ShapeStream, Shape } from '@electric-sql/client'

const stream = new ShapeStream({
  url: `http://localhost:3000/v1/shape`,
  params: {
    table: `foo`
  }
})
const shape = new Shape(stream)

// Returns promise that resolves with the latest shape data once it's fully loaded
await shape.rows
You can register a callback to be notified whenever the shape data changes:
shape.subscribe(({ rows }) => {
  // rows is an array of the latest value of each row in a shape.
})
Or you can use framework integrations like the useShape hook to automatically bind materialised shapes to your components.
See the Quickstart and HTTP API docs for more information.
Throughput
Electric evaluates where clauses when processing changes from Postgres and matching them to shape logs. If there are lots of shapes, this means we have to evaluate lots of where clauses. This has an impact on data throughput.
There are two kinds of where clauses:
- optimized where clauses: a subset of clauses that we've optimized the evaluation of
- non-optimized where clauses: all other where clauses
With non-optimized where clauses, throughput is inversely proportional to the number of shapes. If you have 10 shapes, Electric can process 1,400 changes per second. If you have 100 shapes, throughput drops to 140 changes per second.
With optimized where clauses, Electric can evaluate millions of clauses at once and maintain a consistent throughput of ~5,000 row changes per second no matter how many shapes you have. If you have 10 shapes, Electric can process 5,000 changes per second. If you have 1,000 shapes, throughput remains at 5,000 changes per second.
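The figures above can be turned into a back-of-the-envelope estimate. The sketch below simply extrapolates from the numbers quoted in this section (10 shapes at 1,400 changes per second implies a budget of roughly 14,000 clause evaluations per second), so treat it as an illustration rather than a performance guarantee. The function and constant names are hypothetical.

```typescript
// Derived from the figures above: 10 shapes x 1,400 changes/s.
const NON_OPTIMIZED_EVALUATIONS_PER_SECOND = 14000
// The roughly constant rate quoted for optimized where clauses.
const OPTIMIZED_CHANGES_PER_SECOND = 5000

// Rough estimate only: inversely proportional to the shape count for
// non-optimized clauses, constant for optimized ones.
function estimatedChangesPerSecond(shapeCount: number, optimized: boolean): number {
  if (shapeCount < 1) throw new RangeError('shapeCount must be at least 1')
  return optimized
    ? OPTIMIZED_CHANGES_PER_SECOND
    : NON_OPTIMIZED_EVALUATIONS_PER_SECOND / shapeCount
}
```

The estimate is only meaningful in the range the benchmarks cover. For very small shape counts, other limits will apply before the where clause evaluation does.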
For more details see the benchmarks.
Optimized where clauses
We currently optimize the evaluation of the following clauses:
- field = constant - literal equality checks against a constant value. We optimize this by indexing shapes by their constant, allowing a single lookup to retrieve all shapes for that constant instead of evaluating the where clause for each shape. Note that this index is internal to Electric and unrelated to Postgres indexes.
- field = constant AND another_condition - the field = constant part of the where clause is optimized as above, and any shapes that match are iterated through to check the other condition. Provided the first condition is enough to filter out most of the shapes, the write processing will be fast. If, however, field = constant matches a large number of shapes, then the write processing will be slower since each of those shapes will need to be iterated through.
- a_non_optimized_condition AND field = constant - as above. The order of the clauses is not important (Electric will filter by optimized clauses first).
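To make the indexing idea concrete, here is a small conceptual sketch of an equality index with an optional remaining condition. It is not Electric's implementation: the EqualityIndex class, its method names and the string-only row type are all assumptions made for illustration.

```typescript
type Row = Record<string, string>

interface IndexedShape {
  id: string
  field: string // the column in "field = constant"
  constant: string // the constant in "field = constant"
  residual?: (row: Row) => boolean // the "AND another_condition" part, if any
}

class EqualityIndex {
  private buckets = new Map<string, IndexedShape[]>()

  private static key(field: string, value: string): string {
    return JSON.stringify([field, value])
  }

  add(shape: IndexedShape): void {
    const key = EqualityIndex.key(shape.field, shape.constant)
    const bucket = this.buckets.get(key) ?? []
    bucket.push(shape)
    this.buckets.set(key, bucket)
  }

  // One lookup per column of the changed row, instead of one where
  // clause evaluation per shape. Only the shapes found in a bucket are
  // iterated to check their remaining condition.
  match(row: Row): string[] {
    const matched: string[] = []
    for (const [field, value] of Object.entries(row)) {
      const bucket = this.buckets.get(EqualityIndex.key(field, value)) ?? []
      for (const shape of bucket) {
        if (shape.residual === undefined || shape.residual(row)) {
          matched.push(shape.id)
        }
      }
    }
    return matched
  }
}
```

The sketch also shows why a constant shared by many shapes is slower: every shape in that bucket has its remaining condition checked for each change.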
Need additional where clause optimization?
We plan to optimize a much larger subset of Postgres where clauses. If you need a particular clause optimized, please raise an issue on GitHub or let us know on Discord.
Limitations
Single table
Shapes are currently single table only.
In the old version of Electric, Shapes had an include tree that allowed you to sync nested relations. The new Electric has not yet implemented support for include trees.
You can upvote and discuss adding support for include trees here:
Include tree workarounds
There are some practical workarounds you can already use to sync related data, based on subscribing to multiple shapes and joining in the client.
For a one-level deep include tree, such as "sync this project with its issues", you can sync one shape for projects where="id=..." and another for issues where="project_id=...".
For multi-level include trees, such as "sync this project with its issues and their comments", you can denormalise the project_id onto the lower tables so that you can also sync comments where="project_id=1234".
Where necessary, you can use triggers to update these denormalised columns.
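The "joining in the client" step of the one-level workaround can be sketched as follows. It assumes you already have the rows of the two shapes in memory (for example from two Shape objects). The row types and the name and title columns are hypothetical.

```typescript
// Hypothetical row types for the projects and issues shapes.
interface Project {
  id: string
  name: string
}

interface Issue {
  id: string
  project_id: string
  title: string
}

interface ProjectWithIssues extends Project {
  issues: Issue[]
}

// Groups issues by project_id, then attaches each group to its project.
function joinProjectIssues(projects: Project[], issues: Issue[]): ProjectWithIssues[] {
  const byProject = new Map<string, Issue[]>()
  for (const issue of issues) {
    const group = byProject.get(issue.project_id) ?? []
    group.push(issue)
    byProject.set(issue.project_id, group)
  }
  return projects.map((project) => ({
    ...project,
    issues: byProject.get(project.id) ?? [],
  }))
}
```

Because each shape updates independently, you would re-run the join whenever either shape notifies its subscribers.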
Immutable
Shape definitions are currently immutable.
Once a shape subscription has been started, its definition cannot be changed. If you want to change the data in a shape, you need to start a new subscription.
You can upvote and discuss adding support for mutable shapes here:
Dropping tables
When dropping a table from Postgres you need to manually delete all shapes that are defined on that table. This is especially important if you intend to recreate the table afterwards (possibly with a different schema) as the shape will contain stale data from the old table. Therefore, recreating the table only works if you first delete the shape.
Electric does not yet automatically delete shapes when tables are dropped because Postgres does not stream DDL statements (such as DROP TABLE) on the logical replication stream that Electric uses to detect changes. However, we are actively exploring approaches for automated shape deletion in this GitHub issue.