An Introduction to Flow
If you’re a brand new Flow user, you’re in the right place! We’re going to walk through the basics of Flow by building a shopping cart backend.
Your First Collection
To start with, we’re going to define a Flow collection that holds data about each user. We’ll have this collection accept user JSON documents via the REST API, and we’ll materialize the data in a Postgres table to make it available to our marketing team. Our devcontainer comes with a Postgres instance that’s started automatically, so all of this should “just work” in that environment.
Flow collections are declared in a YAML file, like so:
collections:
  - name: examples/shopping/users
    key: [/id]
    schema: user.schema.yaml
Note that the schema is defined in a separate file. This is a common pattern because it allows your schemas to be reused and composed. The actual schema is defined as:
description: "A user who may buy things from our site"
type: object
properties:
  id: { type: integer }
  name: { type: string }
  email:
    type: string
    format: email
required: [id, name, email]
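To make the schema's constraints concrete, here is a minimal Python sketch that checks a document against the same rules: the document must be an object, id must be an integer, name and email must be strings, and all three properties are required. This is a hand-rolled illustration, not Flow's validator. The function name check_user is hypothetical, and the format: email rule is skipped for brevity.

```python
def check_user(doc):
    """Return a list of problems found in doc; an empty list means it passes."""
    if not isinstance(doc, dict):
        return ["document must be an object"]
    problems = []
    # required: [id, name, email]
    for field in ("id", "name", "email"):
        if field not in doc:
            problems.append(f"missing required property: {field}")
    # id: { type: integer }; bool is rejected because Python treats it as an int subclass
    if "id" in doc and (not isinstance(doc["id"], int) or isinstance(doc["id"], bool)):
        problems.append("id must be an integer")
    # name and email: { type: string }
    for field in ("name", "email"):
        if field in doc and not isinstance(doc[field], str):
            problems.append(f"{field} must be a string")
    return problems


print(check_user({"id": 6, "name": "Donkey Kong", "email": "bigguy@dk.com"}))
print(check_user({"id": "6", "name": "Donkey Kong"}))
```

The first call prints an empty list. The second reports the missing email and the non-integer id.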
We can apply our collection to a local Flow instance by running:
$ flowctl build && flowctl develop
Now that it’s applied, we’ll leave that terminal running and open a new one to simulate some users being added.
curl -H 'Content-Type: application/json' -d @- 'http://localhost:8081/ingest' <<EOF
{
  "examples/shopping/users": [
    {
      "id": 6,
      "name": "Donkey Kong",
      "email": "bigguy@dk.com"
    },
    {
      "id": 7,
      "name": "Echo",
      "email": "explorer@ocean.net"
    },
    {
      "id": 8,
      "name": "Gordon Freeman",
      "email": "mfreeman@apeture.com"
    }
  ]
}
EOF
This prints some JSON describing how the new data was written, which we’ll come back to later. Let’s check out our data in Postgres:
$ psql 'postgresql://flow:flow@localhost:5432/flow?sslmode=disable' -c "select id, email, name from shopping_users;"
 id |         email         |      name
----+-----------------------+----------------
  6 | bigguy@dk.com         | Donkey Kong
  7 | explorer@ocean.net    | Echo
  8 | mfreeman@apeture.com  | Gordon Freeman
(3 rows)
As new users are added to the collection, they will continue to appear here. One of our users wants to update their email address, though. This is done by ingesting a new document with the same id.
curl -H 'Content-Type: application/json' -d @- 'http://localhost:8081/ingest' <<EOF
{
  "examples/shopping/users": [
    {
      "id": 8,
      "name": "Gordon Freeman",
      "email": "gordo@retiredlife.org"
    }
  ]
}
EOF
If we re-run the Postgres query, we’ll see that the row for Gordon Freeman has been updated.
Since we declared the collection key as [/id], Flow automatically combines the new document with the previous version for that key. In this case, the most recent document for each id is materialized. Flow also lets you control how documents are combined, per collection, using reduction annotations. The users collection simply uses the default reduction strategy, lastWriteWins.
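As a rough mental model of this behavior, the following Python sketch keeps one document per key and lets each newly ingested document replace the previous one. It is an illustration only, not Flow's implementation. The names reduce_last_write_wins and key_field are hypothetical, and the key pointer /id is simplified to a top-level field lookup.

```python
def reduce_last_write_wins(documents, key_field="id"):
    """Combine documents in ingestion order, keeping the latest one per key."""
    state = {}
    for doc in documents:
        # A later document with the same key replaces the earlier one entirely.
        state[doc[key_field]] = doc
    return state


ingested = [
    {"id": 7, "name": "Echo", "email": "explorer@ocean.net"},
    {"id": 8, "name": "Gordon Freeman", "email": "mfreeman@apeture.com"},
    {"id": 8, "name": "Gordon Freeman", "email": "gordo@retiredlife.org"},
]
users = reduce_last_write_wins(ingested)
print(users[8]["email"])  # gordo@retiredlife.org
```

Only two entries remain after reduction, one per distinct id, which mirrors the single updated row seen in Postgres.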
Writing Tests
Before we go, let’s add some tests that verify the reduction logic in our users collection. The tests section allows us to ingest documents and verify the fully reduced results automatically. Most examples from this point on will use tests instead of shell scripts for ingesting documents and verifying expected results.
tests:
  "A users email is updated":
    - ingest:
        collection: examples/shopping/users
        documents:
          - { id: 1, name: "Jane", email: "janesoldemail@email.test" }
          - { id: 2, name: "Jill", email: "jill@upahill.test" }
    - ingest:
        collection: examples/shopping/users
        documents:
          - id: 1
            name: Jane
            email: jane82@email.test
    - verify:
        collection: examples/shopping/users
        documents:
          - id: 1
            name: Jane
            email: jane82@email.test
          - id: 2
            name: Jill
            email: jill@upahill.test
Each test is a sequence of ingest and verify steps, which are executed in the order written. In this test, we first ingest documents for the users Jane and Jill. The second ingest step provides a new email address for Jane. The verify step includes both documents, and will fail if any of the properties do not match.
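The way these steps play out can be pictured with a short Python sketch. It is a simplification under stated assumptions, not how flowctl runs tests: the helper name run_steps is hypothetical, documents are reduced with last-write-wins on id, and a verify step compares its expected documents against the reduced documents in key order.

```python
def run_steps(steps):
    """Execute ingest and verify steps in order; return a list of failure messages."""
    state = {}      # reduced documents, keyed by id
    failures = []
    for kind, documents in steps:
        if kind == "ingest":
            for doc in documents:
                state[doc["id"]] = doc  # last write wins
        elif kind == "verify":
            actual = [state[key] for key in sorted(state)]
            # Any differing property makes the whole comparison fail.
            if actual != documents:
                failures.append(f"expected {documents}, got {actual}")
    return failures


steps = [
    ("ingest", [
        {"id": 1, "name": "Jane", "email": "janesoldemail@email.test"},
        {"id": 2, "name": "Jill", "email": "jill@upahill.test"},
    ]),
    ("ingest", [{"id": 1, "name": "Jane", "email": "jane82@email.test"}]),
    ("verify", [
        {"id": 1, "name": "Jane", "email": "jane82@email.test"},
        {"id": 2, "name": "Jill", "email": "jill@upahill.test"},
    ]),
]
print(run_steps(steps))  # [] means every verify step matched
```

If the verify step still expected Jane's old address, the returned list would contain one failure message instead of being empty.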
We can run the tests using:
$ flowctl build && flowctl test
Next Steps
Now that our users collection is working end-to-end, here are some good topics to check out next:
Learn the basics of CSV ingestion by building the Products collection
Explore reduction annotations by building the Shopping Cart collection