Compare commits

6 Commits

Author SHA1 Message Date
2a01aa8d96 Redefine spec properly 2025-09-30 09:22:10 +02:00
045a9d5e84 Update spec with RFC6902 2025-09-26 10:23:22 +02:00
e5d685b84c Update spec with client side processing 2025-09-26 09:47:09 +02:00
1a7be88105 Refine types a bit more 2025-09-26 09:27:08 +02:00
df397f5f1d Enable serialization of events 2025-09-26 09:23:40 +02:00
db6f63f096 Add testify 2025-09-26 09:23:34 +02:00
4 changed files with 229 additions and 37 deletions

238
Spec.md

@@ -1,36 +1,214 @@
Event log based store
# Event Log Based Store
The data rows of our table are to be recreated from an event log
All interactions with the rows are to happen exclusively via events from/in the log
For performance reasons we are to cache these data rows as well for quick lookup
All data rows are reconstructed exclusively from an event log. All interactions with data rows must occur via events in the log. For performance, data rows are cached for quick lookup.
Events in the log are to take form of:
Events are defined as:
type Event struct {
Seq int
Type "create"|"update"|"delete"
Hash string
ItemID string // uuid-v4
EventID string // uuid-v4
Data map[string]interface{}
Timestamp datetime
Seq int `json:"seq"` // Server-generated sequence number (applied order)
Hash string `json:"hash"` // Server-generated hash, guarantees event was processed
ItemID string `json:"item_id"` // Client-defined item identifier
EventID string `json:"event_id"` // Server-generated event identifier (uuid-v4)
Collection string `json:"collection"` // Client-defined collection/table name
Data string `json:"data"` // JSON array of RFC6902 patches
Timestamp time.Time `json:"timestamp"` // Server-generated timestamp (when processed)
}
Events are divided into 3 types: create, update, and delete events
Create events simply create the object as given in Data
Delete events simply mark an object as deleted (not actually delete!) via its ItemID
Update events are to modify a field of a row and never more than one field
Therefore its data is only the diff in the form of "age = 3"
When creating an event only the Type and ItemID must be provided
Data is optional (delete events have no data)
Hash, EventID and Seq are to be computed server side
When creating an event, only Data, Collection, and ItemID are required from the client. Hash, EventID, Seq, and Timestamp are computed server-side.
Server-side event processing:
- Retrieve the latest event for the collection.
- Assign the next sequence number (incremented from the latest).
- Generate a new EventID (uuid-v4).
- Assign the current timestamp.
- Compute the event hash as a function of the current event's data and the previous event's hash.
- Serialize the event manually (not via json.Marshal or %+v) to ensure field order for hashing.
- Apply the patch to the cached data row.
Event log compaction:
- Every 2 days, merge and compact the event log for each collection.
- All events older than 2 days are resolved, and a new minimal event log is generated that produces the same state.
- Sequence numbers (Seq) are never reset and always increment from the last value.
- Before merging or deleting old events, save the original event log as a timestamped backup file.
Client requirements:
- Must be able to apply patches and fetch objects.
- Must store:
- last_seq: sequence number of the last processed event
- last_hash: hash of the last processed event
- events: local event log of all processed events
- pending_events: locally generated events not yet sent to the server
- On startup, fetch new events from the server since last_seq and apply them.
- When modifying objects, generate events and append to pending_events.
- Periodically or opportunistically send pending_events to the server.
- Persist the event log (events and pending_events) locally.
- If the server merges the event log, the client detects divergence by comparing last_seq and last_hash.
- If sequence matches but hash differs, the server sends the full event log; the client reconstructs its state from this log.
If the server merges the event log and the client has unsent local events:
- Client fetches the merged events from the server.
- Applies merged events to local state.
- Reapplies unsent local events on top of the updated state.
- Resends these events to the server.
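The recovery steps above might look like the following sketch on the client. The `Client` type, the `Rebase` method, and the `reset`/`apply` callbacks are hypothetical names introduced for illustration; the spec does not prescribe how local state is stored or how patches are applied.

```go
package main

import "fmt"

// Event carries only the fields this sketch needs.
type Event struct {
	Seq    int
	Hash   string
	ItemID string
	Data   string
}

// Client holds the bookkeeping listed under "Client requirements".
type Client struct {
	LastSeq  int
	LastHash string
	Events   []Event
	Pending  []Event
}

// Rebase replaces the local log with the server's merged log, rebuilds state
// from it, reapplies unsent local events on top, and returns the events that
// still have to be sent to the server.
func (c *Client) Rebase(serverLog []Event, reset func(), apply func(Event)) []Event {
	reset() // drop local state derived from the old log
	c.Events = append([]Event(nil), serverLog...)
	for _, e := range c.Events {
		apply(e)
	}
	if n := len(c.Events); n > 0 {
		c.LastSeq, c.LastHash = c.Events[n-1].Seq, c.Events[n-1].Hash
	}
	for _, e := range c.Pending {
		apply(e)
	}
	return c.Pending
}

func main() {
	var applied []string
	c := &Client{LastSeq: 7, LastHash: "stale", Pending: []Event{{ItemID: "b", Data: "local"}}}
	merged := []Event{{Seq: 8, Hash: "h8", ItemID: "a", Data: "merged"}}
	resend := c.Rebase(merged,
		func() { applied = nil },
		func(e Event) { applied = append(applied, e.Data) })
	fmt.Println(applied, c.LastSeq, c.LastHash, len(resend))
}
```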
If a client sends events after the event log has been merged:
- The server accepts and applies these events as usual, regardless of the client's log state.
Merging the event log must not alter the resulting data state.
Required endpoints:
GET /api/<collection>/sync?last_seq=<last_seq>&last_hash=<last_hash>
- Returns all events after the specified last_seq and last_hash.
- If the provided seq and hash do not match the server's, returns the entire event log (client is out of sync).
PATCH /api/<collection>/events
- Accepts a JSON array of RFC6902 patch objects.
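The selection rule of the sync endpoint can be sketched as a pure function over an in-memory log. The name `eventsSince` and the `fullLog` flag are assumptions; how the server actually stores and queries events is not specified here.

```go
package main

import "fmt"

// Event carries only the fields the sync check needs.
type Event struct {
	Seq  int
	Hash string
}

// eventsSince returns the events after lastSeq when the server's event at
// lastSeq has the same hash. Otherwise the client is out of sync and the
// entire log is returned with fullLog set to true.
func eventsSince(log []Event, lastSeq int, lastHash string) (events []Event, fullLog bool) {
	for i, e := range log {
		if e.Seq == lastSeq {
			if e.Hash == lastHash {
				return log[i+1:], false
			}
			break // same seq, different hash: diverged
		}
	}
	return log, true
}

func main() {
	log := []Event{{Seq: 1, Hash: "h1"}, {Seq: 2, Hash: "h2"}, {Seq: 3, Hash: "h3"}}

	tail, full := eventsSince(log, 2, "h2")
	fmt.Println(len(tail), full)

	all, full := eventsSince(log, 2, "stale")
	fmt.Println(len(all), full)
}
```

A client that has never synced sends a seq the server does not know, so it falls through to the full-log case without special handling.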
Server processing:
- As new events arrive, process the event log and update the cached state for the collection.
- The current state is available for clients that do not wish to process the event log.
- Only new events need to be applied to the current state; no need to reprocess the entire log each time.
- Track the last event processed for each collection (sequence number and hash).
On startup, the server must:
- Automatically create required collections: one for events and one for items (data state).
- Events must be collection-agnostic and support any collection; at least one example collection is created at startup.
- Ensure required columns exist in collections; if missing, reject PATCH requests with an error.
- Each collection maintains its own sequence number, hash, and event log.
---
## RFC6902
https://datatracker.ietf.org/doc/html/rfc6902
Operation objects MUST have exactly one "op" member, whose value
indicates the operation to perform. Its value MUST be one of "add",
"remove", "replace", "move", "copy", or "test"; other values are
errors. The semantics of each object is defined below.
Additionally, operation objects MUST have exactly one "path" member.
That member's value is a string containing a JSON-Pointer value
[RFC6901] that references a location within the target document (the
"target location") where the operation is performed.
The meanings of other operation object members are defined by
operation (see the subsections below). Members that are not
explicitly defined for the operation in question MUST be ignored
(i.e., the operation will complete as if the undefined member did not
appear in the object).
Note that the ordering of members in JSON objects is not significant;
therefore, the following operation objects are equivalent:
{ "op": "add", "path": "/a/b/c", "value": "foo" }
{ "path": "/a/b/c", "op": "add", "value": "foo" }
{ "value": "foo", "path": "/a/b/c", "op": "add" }
Operations are applied to the data structures represented by a JSON
document, i.e., after any unescaping (see [RFC4627], Section 2.5)
takes place.
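As an illustration of these rules, the sketch below decodes a patch document with Go's `encoding/json` and checks the "op" and "path" members. The `Operation` type and `parsePatch` function are hypothetical names. Unknown members are ignored by the decoder, which matches the rule above; duplicate "op" members are not detected by this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Operation is one RFC6902 operation object. Path is a pointer so that a
// missing member can be told apart from "" (the document root).
type Operation struct {
	Op    string          `json:"op"`
	Path  *string         `json:"path"`
	From  *string         `json:"from,omitempty"`
	Value json.RawMessage `json:"value,omitempty"`
}

var validOps = map[string]bool{
	"add": true, "remove": true, "replace": true,
	"move": true, "copy": true, "test": true,
}

// parsePatch decodes a JSON array of operations and rejects operations with
// an unknown "op" or without a "path".
func parsePatch(data []byte) ([]Operation, error) {
	var ops []Operation
	if err := json.Unmarshal(data, &ops); err != nil {
		return nil, err
	}
	for i, op := range ops {
		if !validOps[op.Op] {
			return nil, fmt.Errorf("operation %d: invalid op %q", i, op.Op)
		}
		if op.Path == nil {
			return nil, fmt.Errorf("operation %d: missing path", i)
		}
	}
	return ops, nil
}

func main() {
	// The three equivalent operation objects from the RFC text.
	doc := `[
		{ "op": "add", "path": "/a/b/c", "value": "foo" },
		{ "path": "/a/b/c", "op": "add", "value": "foo" },
		{ "value": "foo", "path": "/a/b/c", "op": "add" }
	]`
	ops, err := parsePatch([]byte(doc))
	fmt.Println(len(ops), err)
	for _, op := range ops {
		fmt.Println(op.Op, *op.Path, string(op.Value))
	}
}
```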
## add
The "add" operation performs one of the following functions,
depending upon what the target location references:
o If the target location specifies an array index, a new value is
inserted into the array at the specified index.
o If the target location specifies an object member that does not
already exist, a new member is added to the object.
o If the target location specifies an object member that does exist,
that member's value is replaced.
The operation object MUST contain a "value" member whose content
specifies the value to be added.
For example:
{ "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] }
When the operation is applied, the target location MUST reference one
of:
o The root of the target document - whereupon the specified value
becomes the entire content of the target document.
o A member to add to an existing object - whereupon the supplied
value is added to that object at the indicated location. If the
member already exists, it is replaced by the specified value.
o An element to add to an existing array - whereupon the supplied
value is added to the array at the indicated location. Any
elements at or above the specified index are shifted one position
to the right. The specified index MUST NOT be greater than the
number of elements in the array. If the "-" character is used to
index the end of the array (see [RFC6901]), this has the effect of
appending the value to the array.
Because this operation is designed to add to existing objects and
arrays, its target location will often not exist. Although the
pointer's error handling algorithm will thus be invoked, this
specification defines the error handling behavior for "add" pointers
to ignore that error and add the value as specified.
However, the object itself or an array containing it does need to
exist, and it remains an error for that not to be the case. For
example, an "add" with a target location of "/a/b" starting with this
document:
{ "a": { "foo": 1 } }
is not an error, because "a" exists, and "b" will be added to its
value. It is an error in this document:
{ "q": { "bar": 2 } }
because "a" does not exist.
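The parent-must-exist rule can be demonstrated with a deliberately small sketch that handles only object members. `splitPointer` and `addMember` are hypothetical helpers; array indices, the "-" token, and replacing the document root are left out, so this is not a complete "add" implementation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitPointer splits an RFC6901 JSON Pointer into unescaped reference tokens.
func splitPointer(p string) ([]string, error) {
	if p == "" {
		return nil, nil
	}
	if !strings.HasPrefix(p, "/") {
		return nil, errors.New("pointer must be empty or start with '/'")
	}
	tokens := strings.Split(p[1:], "/")
	for i, t := range tokens {
		// RFC6901 order: "~1" becomes "/" first, then "~0" becomes "~".
		t = strings.ReplaceAll(t, "~1", "/")
		tokens[i] = strings.ReplaceAll(t, "~0", "~")
	}
	return tokens, nil
}

// addMember adds or replaces an object member at path.
// Every object on the way to the target must already exist.
func addMember(doc map[string]interface{}, path string, value interface{}) error {
	tokens, err := splitPointer(path)
	if err != nil {
		return err
	}
	if len(tokens) == 0 {
		return errors.New("root replacement is not handled in this sketch")
	}
	cur := doc
	for _, t := range tokens[:len(tokens)-1] {
		next, ok := cur[t].(map[string]interface{})
		if !ok {
			return fmt.Errorf("parent %q does not exist or is not an object", t)
		}
		cur = next
	}
	cur[tokens[len(tokens)-1]] = value
	return nil
}

func main() {
	ok := map[string]interface{}{"a": map[string]interface{}{"foo": 1}}
	fmt.Println(addMember(ok, "/a/b", "x"), ok)

	bad := map[string]interface{}{"q": map[string]interface{}{"bar": 2}}
	fmt.Println(addMember(bad, "/a/b", "x"))
}
```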
## remove
The "remove" operation removes the value at the target location.
The target location MUST exist for the operation to be successful.
For example:
{ "op": "remove", "path": "/a/b/c" }
If removing an element from an array, any elements above the
specified index are shifted one position to the left.
## replace
The "replace" operation replaces the value at the target location
with a new value. The operation object MUST contain a "value" member
whose content specifies the replacement value.
The target location MUST exist for the operation to be successful.
For example:
{ "op": "replace", "path": "/a/b/c", "value": 42 }
This operation is functionally identical to a "remove" operation for
a value, followed immediately by an "add" operation at the same
location with the replacement value.
## move
The "move" operation removes the value at a specified location and
adds it to the target location.
The operation object MUST contain a "from" member, which is a string
containing a JSON Pointer value that references the location in the
target document to move the value from.
The "from" location MUST exist for the operation to be successful.
For example:
{ "op": "move", "from": "/a/b/c", "path": "/a/b/d" }
This operation is functionally identical to a "remove" operation on
the "from" location, followed immediately by an "add" operation at
the target location with the value that was just removed.
The "from" location MUST NOT be a proper prefix of the "path"
location; i.e., a location cannot be moved into one of its children.
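The prefix restriction can be checked on the raw pointer strings before a "move" is applied. `isProperPrefix` is a hypothetical helper name; the comparison includes the "/" separator so that "/a/b" is not mistaken for a prefix of "/a/bc".

```go
package main

import (
	"fmt"
	"strings"
)

// isProperPrefix reports whether the JSON Pointer from is a proper prefix of
// path, i.e. whether path points somewhere inside the value at from.
func isProperPrefix(from, path string) bool {
	return strings.HasPrefix(path, from+"/")
}

func main() {
	fmt.Println(isProperPrefix("/a/b", "/a/b/c")) // moving a value into its own child
	fmt.Println(isProperPrefix("/a/b", "/a/bc"))  // sibling with a similar name
	fmt.Println(isProperPrefix("/a/b", "/a/b"))   // same location, not a proper prefix
}
```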
## copy
The "copy" operation copies the value at a specified location to the
target location.
The operation object MUST contain a "from" member, which is a string
containing a JSON Pointer value that references the location in the
target document to copy the value from.
The "from" location MUST exist for the operation to be successful.
For example:
{ "op": "copy", "from": "/a/b/c", "path": "/a/b/e" }
This operation is functionally identical to an "add" operation at the
target location using the value specified in the "from" member.
## test
I think we don't care about this one
On the server side with an incoming event:
Grab the latest event
Assign the event a sequence number that is incremented from the latest
Create its EventID (generate a uuid-v4)
Assign it a Timestamp
Compute the hash from the dump of the current event PLUS the previous event's hash
And only then apply the patch
For create events that is insert objects
For delete events that is mark objects as deleted
For update events get the object, apply the diff and save the object

4
go.mod

@@ -8,10 +8,12 @@ require (
git.site.quack-lab.dev/dave/cylogger v1.4.0
github.com/google/uuid v1.6.0
github.com/pocketbase/pocketbase v0.30.0
github.com/stretchr/testify v1.4.0
)
require (
github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 // indirect
github.com/davecgh/go-spew v1.1.0 // indirect
github.com/disintegration/imaging v1.6.2 // indirect
github.com/domodwyer/mailyak/v3 v3.6.2 // indirect
github.com/dustin/go-humanize v1.0.1 // indirect
@@ -26,6 +28,7 @@ require (
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
github.com/ncruces/go-strftime v0.1.9 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/pocketbase/dbx v1.11.0 // indirect
github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
github.com/spf13/cast v1.9.2 // indirect
@@ -41,6 +44,7 @@ require (
golang.org/x/sys v0.35.0 // indirect
golang.org/x/text v0.28.0 // indirect
golang.org/x/tools v0.36.0 // indirect
gopkg.in/yaml.v2 v2.2.2 // indirect
modernc.org/libc v1.66.3 // indirect
modernc.org/mathutil v1.7.1 // indirect
modernc.org/memory v1.11.0 // indirect

1
go.sum

@@ -102,6 +102,7 @@ golang.org/x/tools v0.36.0 h1:kWS0uv/zsvHEle1LbV5LE8QujrxB3wfQyxHfhOk0Qkg=
golang.org/x/tools v0.36.0/go.mod h1:WBDiHKJK8YgLHlcQPYQzNCkUxUypCaa5ZegCVutKm+s=
google.golang.org/appengine v1.6.5 h1:tycE03LOZYQNhDpS27tcQdAzLCVMaj7QT2SXxebnpCM=
google.golang.org/appengine v1.6.5/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v2 v2.2.2 h1:ZCJp+EgiOT7lHqUV2J862kp8Qj64Jo6az82+3Td9dZw=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=


@@ -3,13 +3,22 @@ package main
import "time"
type Event struct {
Seq int
Type string
Hash string
ItemID string
EventID string
Data map[string]interface{}
Timestamp time.Time
// Server-generated sequence number of the event, i.e. when it was applied
Seq int `json:"seq"`
// Type of the event - create, update, delete, defined by the client
Type string `json:"type"`
// Hash of the event - server generated, guarantees the event was processed
Hash string `json:"hash"`
// ItemID of the item that is to be manipulated, defined by the client
ItemID string `json:"item_id"`
// EventID of the event - server generated, guarantees the event was processed
EventID string `json:"event_id"`
// Collection of the item that is to be manipulated, defined by the client
Collection string `json:"collection"`
// Data that is to be used for manipulation; for create events that's the full objects and for update events that's the diff
Data map[string]interface{} `json:"data"`
// Timestamp of the event - server generated, when the event was processed
Timestamp time.Time `json:"timestamp"`
}
type SLItem struct {