DB migration library for beam, targeting Postgres.
This package offers an alternative to beam-migrate
and can be used to migrate a database between different versions of a Haskell schema. It doesn't depend on beam-migrate
if not transitively (beam-postgres
depends on it, for example).
beam-automigrate
Automatic migrations for beam databases!
Table of Contents
- Table of Contents
- Getting started (User guide/reference)
- Current design (10_000ft overview)
- Contributors
Getting started (User guide/reference)
This example is a literate haskell source file. You can run it interactively with the following command:
$ cabal repl readme
If you're using nix, you can enter a shell with the appropriate dependencies with the following command:
$ nix-shell
From that nix-shell, you can run cabal repl readme
.
To run the example code, run main
inside of the cabal repl.
For a more examples, refer to the examples
folder.
Deriving an AnnotatedDatabaseSettings
Deriving an AnnotatedDatabaseSettings
for a Haskell database type is a matter of calling defaultAnnotatedDbSettings
. For example, given:
> {-# LANGUAGE DataKinds #-}
> {-# LANGUAGE DeriveGeneric #-}
> {-# LANGUAGE DeriveAnyClass #-}
> {-# LANGUAGE OverloadedStrings #-}
> {-# LANGUAGE TypeApplications #-}
> {-# LANGUAGE TypeFamilies #-}
> import Prelude hiding ((.))
> import Control.Category ((.))
> import Data.Proxy (Proxy(..))
> import Database.Beam.Postgres
> import Database.Beam.Schema
> import Database.Beam (val_)
> import qualified Database.Beam.AutoMigrate as BA
> import Database.Beam.AutoMigrate.TestUtils
> import Database.PostgreSQL.Simple as Pg
> import GHC.Generics
> import Data.Text
> import System.Environment (getArgs)
>
> data CitiesT f = City
> { ctCity :: Columnar f Text
> , ctLocation :: Columnar f Text
> , ctCapital :: Columnar f Bool
> }
> deriving (Generic, Beamable)
>
> data WeatherT f = Weather
> { wtId :: Columnar f Int
> , wtCity :: PrimaryKey CitiesT f
> , wtTempLo :: Columnar f Int
> , wtTempHi :: Columnar f Int
> }
> deriving (Generic, Beamable)
>
> data ForecastDB f = ForecastDB
> { dbCities :: f (TableEntity CitiesT)
> , dbWeathers :: f (TableEntity WeatherT)
> }
> deriving (Generic, Database be)
>
> instance Table CitiesT where
> data PrimaryKey CitiesT f = CityID (Columnar f Text)
> deriving (Generic, Beamable)
> primaryKey = CityID . ctCity
>
> instance Table WeatherT where
> data PrimaryKey WeatherT f = WeatherID (Columnar f Int)
> deriving (Generic, Beamable)
> primaryKey = WeatherID . wtId
Then calling defaultAnnotatedDbSettings
will yield:
> forecastDB :: BA.AnnotatedDatabaseSettings Postgres ForecastDB
> forecastDB = BA.defaultAnnotatedDbSettings defaultDbSettings
Where defaultDbSettings
is the classic function from beam-core
.
Overriding the defaults of an AnnotatedDatabaseSettings
It is likely that the end user would like to attach extra meta-information to an AnnotatedDatabaseSettings
. For example, the user might want to specify which is the default value for a nullable field, or simple express things like uniqueness constraints or foreign keys (when the inference algorithm cannot proceed due to ambiguity). To do this, we can piggyback on the familiar API from beam-core
. For example, we can do something like this:
> annotatedDB :: BA.AnnotatedDatabaseSettings Postgres ForecastDB
> annotatedDB = BA.defaultAnnotatedDbSettings defaultDbSettings `withDbModification` dbModification
> { dbCities =
> BA.annotateTableFields tableModification { ctCapital = BA.defaultsTo $ val_ False }
> <> BA.uniqueConstraintOn [BA.U ctCity, BA.U ctLocation]
> , dbWeathers = BA.annotateTableFields tableModification <>
> BA.foreignKeyOnPk (dbCities defaultDbSettings) wtCity BA.Cascade BA.Restrict
> }
This is where we deviate from beam-migrate. Beam-migrate forces you to specify "annotations" for each and every field of each and every column, making everything a bit verbose. Here we decided instead to use the other API that beam-core
offers, that is typically used to modify the field names for a standard DatabaseSettings
, but it turns out its types are general enough to be used for an AnnotatedDatabaseSettings
, just by adding some extra functions (the ones prefixed with BA
, the rest is the standard beam-core
API).
This allows us to still override defaults when we need to without all the extra boilerplate. This API and these combinators simply manipulates under the hood the particular TableFieldSchema
associated to each column, so that the information contained within can be "spliced back" when we generate a Schema
out of an AnnotatedDatabaseSettings
.
Deriving a Schema
Once we have an AnnotatedDatabaseSettings
, the next step is to generate a Schema
. This can be done simply by calling fromAnnotatedDbSettings
, like so:
> hsSchema :: BA.Schema
> hsSchema = BA.fromAnnotatedDbSettings annotatedDB (Proxy @'[])
This will generate something like this:
Schema
{ schemaTables = fromList
[
( TableName { tableName = "cities" }
, Table
{ tableConstraints = fromList
[ PrimaryKey "cities_pkey"
( fromList
[ ColumnName { columnName = "city" } ]
)
]
, tableColumns = fromList
[
( ColumnName { columnName = "city" }
, Column
{ columnType = SqlStdType ( DataTypeChar True Nothing Nothing )
, columnConstraints = fromList [ NotNull ]
}
)
,
( ColumnName { columnName = "location" }
, Column
{ columnType = SqlStdType ( DataTypeChar True Nothing Nothing )
, columnConstraints = fromList [ NotNull ]
}
)
]
}
)
,
( TableName { tableName = "weathers" }
, Table
{ tableConstraints = fromList
[ PrimaryKey "weathers_pkey"
( fromList
[ ColumnName { columnName = "id" } ]
)
, ForeignKey "weathers_city__city_fkey"
( TableName { tableName = "cities" } )
( fromList
[
( ColumnName { columnName = "city__city" }
, ColumnName { columnName = "city" }
)
]
) NoAction NoAction
]
, tableColumns = fromList
[
( ColumnName { columnName = "city__city" }
, Column
{ columnType = SqlStdType ( DataTypeChar True Nothing Nothing )
, columnConstraints = fromList [ NotNull ]
}
)
,
( ColumnName { columnName = "id" }
, Column
{ columnType = SqlStdType DataTypeInteger
, columnConstraints = fromList [ NotNull ]
}
)
,
( ColumnName { columnName = "temp_hi" }
, Column
{ columnType = SqlStdType DataTypeInteger
, columnConstraints = fromList [ NotNull ]
}
)
,
( ColumnName { columnName = "temp_lo" }
, Column
{ columnType = SqlStdType DataTypeInteger
, columnConstraints = fromList [ NotNull ]
}
)
]
}
)
]
, schemaEnumerations = fromList []
}
Notable things to notice:
- Foreign keys have been automatically inferred;
- Each column is mapped to a sensible SQL datatype;
- Non
Maybe/Nullable
types have aNotNull
constraint added;
Generating an automatic migration
Once a Schema
has been generated, in order to automatically migrate your schema, it is sufficient to write something like this:
>
> exampleShowMigration :: Connection -> IO ()
> exampleShowMigration conn = runBeamPostgres conn $
> BA.printMigration $ BA.migrate conn hsSchema
>
> exampleAutoMigration :: Connection -> IO ()
> exampleAutoMigration conn =
> BA.tryRunMigrationsWithEditUpdate annotatedDB conn
>
> main :: IO ()
> main = do
> args <- getArgs
> let (getLine', connMethod) = case args of
> -- The "ci" argument allows the readme to be run and tested in a headless
> -- environment (e.g., by a continuous integration server)
> ["ci"] -> (return "y", ConnMethod_Direct $ defaultConnectInfo { connectDatabase = "readme" })
> _ -> (getLine, ConnMethod_Gargoyle "readme-db")
> withConnection connMethod $ \conn -> Pg.withTransaction conn$ do
> putStrLn "----------------------------------------------------"
> putStrLn "MIGRATION PLAN (if migration needed):"
> putStrLn "----------------------------------------------------"
> exampleShowMigration conn
> putStrLn "----------------------------------------------------"
> putStrLn "MIGRATE?"
> putStrLn "----------------------------------------------------"
> putStrLn "Would you like to run the migration on the database in the folder \"readme-db\" (will be created if it doesn't exist)? (y/n)"
> response <- getLine'
> case response of
> "y" -> exampleAutoMigration conn
> "Y" -> exampleAutoMigration conn
> _ -> putStrLn "Exiting"
>
The exampleAutoMigration
function will try to generate another Schema
, this time from the Postgres database and the two Schema
s will be "diffed together" in order to compute the list of edits necessary to migrate from the DB to the Haskell database. To begin with, we can call exampleShowMigration
to "preview" the full SQL command that will be run:
Ok, 12 modules loaded.
-> import Database.Beam.Migrate.Example.ForeignKeys
-> exampleShowMigration
CREATE TABLE "cities" (city VARCHAR NOT NULL, location VARCHAR NOT NULL);
CREATE TABLE "weathers" (city__city VARCHAR NOT NULL, id INT NOT NULL, temp_hi INT NOT NULL, temp_lo INT NOT NULL);
ALTER TABLE "cities" ADD CONSTRAINT "cities_pkey" PRIMARY KEY (city);
ALTER TABLE "weathers" ADD CONSTRAINT "weathers_pkey" PRIMARY KEY (id);
ALTER TABLE "weathers" ADD CONSTRAINT "weathers_city__city_fkey" FOREIGN KEY (city__city) REFERENCES "cities"(city);
Once we are satisfied with visual inspection, we can run it via exampleAutoMigration
. If all is well, we can now check how the DB has been migrated. If we try to call exampleShowMigration
again, no output should be shown, because the DB is now up-to-date with the Haskell types.
Current design (10_000ft overview)
Beam itself provides a DatabaseSettings
type. A value of this type is typically derived generically from the Haskell datatypes representing the database schema, but can be amended. The primary purpose of DatabaseSettings
is to provide a mapping between Haskell names for tables and columns and the corresponding DB-side names.
On top of this, we provide an AnnotatedDatabaseSettings
type. This is similar in spirit to the CheckedDatabaseSettings
provided by the original beam-migrate
, but a bit simpler. On top of DatabaseSettings
, the AnnotatedDatabaseSettings
contain additional information, in particular constraints that tables and columns must satisfy. Once again, a value of this type can be derived generically, but the information can be amended.
Both DatabaseSettings
and AnnotatedDatabaseSettings
follow the structure of the Haskell datatypes comprising the schema, and are therefore quite strongly typed.
From an AnnotatedDatabaseSettings
value, we can internally derive a Schema
. This is a straightforward representation of a DB schema without any type-level magic.
We can similarly generate a Schema
value for the DB schema currently stored in the database. We can then diff the two Schema
s to get a Diff
which determines a list of Edit
s. Such edits can be applied in a particular (prioritised) order to migrate the DB schema to the Haskell schema.
There are two important stages in the library, the first one being when we call defaultAnnotatedDbSettings
and the second one when we call fromAnnotatedDbSettings
. The first function converts from a DatabaseSettings
into an AnnotatedDatabaseSettings
whereas the latter convert from an AnnotatedDatabaseSettings
into a Schema
. Both uses generic-derivation but in a different way and for different purposes.
Deriving information from a DatabaseSettings
During this phase we "zip" all the tables together and we essentially convert each DatabaseEntity
into an AnnotatedDatabaseEntity
. The latter is ever so slightly similar to the former but crucially it embeds extra information. One way to see this is that exactly as a DatabaseEntity
carry around a TableSettings
which carries meta-information on the naming of each particular column for each table, an AnnotatedDatabaseEntity
carries what's called a TableSchema
, which is defined as:
-- | A table schema.
type TableSchema tbl =
tbl (TableFieldSchema tbl)
-- | A schema for a field within a given table
data TableFieldSchema (tbl :: (* -> *) -> *) ty where
TableFieldSchema
::
{ tableFieldName :: Text
, tableFieldSchema :: FieldSchema ty }
-> TableFieldSchema tbl ty
data FieldSchema ty where
FieldSchema :: ColumnType
-> Set ColumnConstraint
-> FieldSchema ty
Looking at this, the similarity with a TableSettings
is quite obvious:
type TableSettings tbl = tbl (TableField tbl)
Here is where the second generic-derivation algorithm comes in, and it "maps" each TableField
with a new TableFieldSchema
, which is initialised with "stock" default values for the ColumnType
and Set ColumnConstraint
. There values are automatically inferred thanks to the HasDefaultSqlDataType
and HasSchemaConstraints
typeclasses defined over at Database.Beam.Migrate.Compat
. This gives us a concrete anchor point for the user to further annotate the database and override each individual table & column with extra information.
This is described later on in the "Overriding the defaults of an AnnotatedDatabaseSettings
" section.
Deriving information from an AnnotatedDatabaseSettings
This is the phase where we traverse the generic representation of an AnnotatedDatabaseSettings
in order to infer the Schema
. It is during this phase that we try to discover Foreign keys.
Foreign key discovery can fail statically. If foreign key discovery fails, one should have the possibility to override AnnotatedDatabaseSettings
before running the transformation to Schema
to manually provide the necessary hints.
What is implemented
- [x] Support for JSON, JSONB and Range types;
- [x] Support for automatically inferring FKs (if unambiguous);
- [x] Support for annotating tables and fields via the standard, familiar
beam-core
API (e.g. add a table/column constraint); - [x] Support for running a migration to mutate the database;
Shortcomings and limitations
- [ ] Deriving a particular instance using
deriving via
could hamperSchema
discovery. For example let's imagine we have:
data Foo = Bar | Baz deriving HasDefaultSqlDataType via (DbEnum Foo)
data MyTable f = MyTable {
myTableFoo :: Columnar f Foo
}
This won't correctly infer Foo
is a DbEnum
, at the moment, as this information is derived directly from the types of each individual columns.
- [ ] There is no support yet for specifying FKs in case the discovery algorithm fails due to ambiguity;
- [ ] There is no support yet for manual migrations;
- [ ] There is no support/design for "composable databases";
- [ ] Some parts of the library are Pg-specific.
Contributors
This library was originally written for Obsidian Systems by Alfredo Di Napoli and Andres Löh of Well-Typed. Other contributors include Dan Bornside, Sean Chalmers, Ryan Trinkle, and Ali Abrar of Obsidian Systems.