The goal of HaskellDB is to have a type safe embedding of database queries and
operations within Haskell. Instead of sending plain SQL strings to a database,
queries should be expressed with normal Haskell functions.
Haskell's type system is used to check the queries at compile time.
Instead of getting an error at runtime that a field name doesn't exist,
a type error is given at compile time that points to the
location where the name is misspelled. Since queries are first-class
values, the can be stored in data structures, passed as arguments or take
typed parameters.
The rest of this page will give some examples of queries and other operations
with HaskellDB. The sample database
pubs
is used for all examples. The examples
are all included in the module sample.hs
shipped with HaskellDB.
Allthough I compare mostly with SQL, HaskellDB can be used with other, even
non-relational databases.
- Show the names of all authors.
A possible
SQL query can be written as:
SELECT X.au_fname, X.au_lname FROM Authors AS X
In HaskellDB, the query is expressed as:
names = do{ x <- table authors
; project (au_fname = x!au_fname
,au_lname = x!au_lname )
}
First, the relation (=database) authors
is opened and bound
to variable x
. Second, a projection is taken of the
attributes (=fields,columns) au_fname
and au_lname
.
The do
notation is used to conveniently bind relations to names.
The (!)
operator is used to select attributes from a relation.
This function is typechecked by the normal Haskell type checker. The type of
names
is:
names :: Query (Rel (au_fname :: Expr String
,au_lname :: Expr String))
The type expresses that names
is a query that returns a relation with
two attributes: au_fname
and au_lname
. Both contain values of
the domain (=type) String
. The type of names
prevents
us for example to project the city
attribute:
wrong = do{ x <- names; project (city = x!city) }
When this expression is loaded, the type checker complains:
Type checking
ERROR "SAMPLE.HS" (line 63): Type error in application
*** Expression : x ! city
*** Term : x
*** Type : Rel (au_fname::Expr [Char]
, au_lname::Expr [Char])
*** Does not match : Rel (city::Expr a
, au_fname::Expr [Char]
, au_lname::Expr [Char] | b)
*** Because : rows are not compatible
The produced error isn't very user friendly. However, using the
wrong attribute always produces exactly this error. Freely translated it means
that the relation x
used on line 63
doesn't have the
attribute city
Using the Haskell type system to mimic the type system of the relational
algebra gives the advantage of having a type safe system with the minimal
amount of work. On the downside, the general nature of the Haskell type system
leads to quite complex error messages. A second disadvantage is that some
constraints just can't be expressed within the current type system in Haskell
(for example, a specific join can be typed but not a general join operator).
The example can be run using the following
test
function for executing a
query:
import Trex --record library
import HaskellDB --basic library
import Pubs --the 'pubs' database
test :: (Row r -> String) -> Query (Rel r) -> IO ()
test showrow q = connect $ \pubs ->
do{ rows <- query pubs q
; putStr (unlines (map showrow rows))
}
showrow
is a function that shows the results of the query. While
the (!)
operator is used to select attributes in a query, the
(!.)
operator is used to select the values of the result.
We can show the names of the authors with:
showNames r = r!.au_fname ++ " " ++ r!.au_lname
The connect
part of the test
function is a driver dependent connection function. If you are
using the ADO driver provided with the Win32 version, the recommended way of
connecting is:
import Ado --the ADO driver
test q = do adoRun $
adoConnect (adoDSN "pubs") $ \pubs ->
do ...
This method uses Data Source Names to connect to an ODBC compliant database.
See the faq for information about this driver.
We can now run the example query:
Sample> test showNames names
Abraham Bennet
Reginald Blotchet-Halls
Cheryl Carson
Michel DeFrance
Innes del Castillo
Ann Dull
...
- Find all names of authors living in Oakland
Here we have to put a restriction on the query. In SQL we would write:
SELECT X.FirstName, X.LastName
FROM Authors AS X
WHERE X.City = 'OakLand'
The equivalent query in HaskellDB is:
oaklands = do{ x <- table authors
; restrict (x!city .==. constant "Oakland")
; project (au_fname = x!au_fname
,au_lname = x!au_lname)
}
All conventional relational comparison operations are available in haskell. The
operators are wrapped in two dots to distinguish them from the normal comparison
operators. Constants are declared using constant
. Again the type
system ensures that we can not use expressions like:
restrict (x!city .==. constant 2)
This results in:
Type checking
ERROR "SAMPLE.HS"
*** (line 66): Unresolved top-level overloading
*** Binding : oaklands
*** Outstanding context : Num [Char]
Again, not a friendly message, but a lot better than a strange runtime error
at the user site! This message tells you that a string ([Char]
)
constant at line 66
doesn't support numeric operations
(Num [Char]
).
Since queries are first-class values, it is easy to parametrise this query on the
city name:
incity name = do{ x <- table authors
; restrict (x!city .==. constant name)
; project (au_fname = x!au_fname
,au_lname = x!au_lname)
}
To retrieve all the people living in Oakland or Palo Alto, you could write:
oaklandPalo = union (incity "Oakland")
(incity "Palo Alto")
- Show all authors who live in the same city
This query is best solved using a cartesian product.
samecity = do{ x <- table authors
; y <- table authors
; restrict (x!au_id .<>. y!au_id)
; restrict (x!city .==. y!city)
; project (au_fname = x!au_fname
,au_lname = x!au_lname
,city = x!city)
}
The first restriction excludes the author itself as living in the same city,
whereas the second restriction excludes people living in other cities. Since we
work with relations, a keyword like DISTINCT in SQL is unnecessary.
This query is an example of an equi-join; Two author tables
are joined on equality of the city
attribute.
Note that by simply typing the name of the query function
at the Hugs command prompt,
the generated SQL expression will be shown
Sample> sameCity
SELECT DISTINCT au_fname1 as au_fname ,
au_lname1 as au_lname ,
city1 as city
FROM (SELECT DISTINCT au_id as au_id2 ,city as city2
FROM authors as T1
) as T1 ,
(SELECT DISTINCT au_fname as au_fname1 ,
au_id as au_id1 ,
au_lname as au_lname1 ,
city as city1
FROM authors as T1
) as T2
WHERE (au_id1 <> au_id2) AND (city1 = city2)
- List all authors with their books.
We will join three tables to answer this query:
authorTitles
= do{ x <- table authors
; y <- table titleauthor
; z <- table titles
; restrict (x!au_id .==. y!au_id)
; restrict (y!title_id .==. z!title_id)
; project (au_fname = x!au_fname
,au_lname = x!au_lname
,title = z!title)
}
- List the number of books an author has written
This is the first case of using an aggregate function. The aggregate
function count
counts the number of occurrences in a relation.
countTitles = do{ x <- authorTitles
; project (au_fname = x!au_fname
,au_lname = x!au_lname
,title = count x title)
}
We re-use the title
attribute to contain the count of titles.
Since TREX currently lacks first-class labels, it is not possible to use
a new label without defining one. Look in the Pubs
module
to see how you can define your own labels.
To count the number of authors that have written books with other people, but
not their husbands or family, we can write:
together = do{ x <- authorTitles
; y <- authorTitles
; restrict (x!au_lname .<>. y!au_lname)
; restrict (x!title .==. y!title)
; project (title = count x title)
}
exercise: write this query in SQL.
- List the top 25% of authors with the books they have written
sorted on name.
The functions top
and topPercent
are used to return the top
n
or n
percent rows. The order
function orders
according to a list of attributes, either in ascending (asc
) or
descending (desc
) order.
The resulting query is:
authorTitles2 = do{ x <- authorTitles
; order [asc x au_lname
,asc x au_fname]
; topPercent 25
; return x
}
The query simply returns x
since no projection is necessary.
- Write a list of author names, first and last name separated
with a comma, to a file
While a relation is within the Query
monad, it is of type
Rel r
.
When the relation is retrieved, it is represented in Haskell as a (lazy)
list of Row r
values. An attribute value can be selected using
the (!.)
operator. The query is written as:
saveNames
= doQuery write names
where
write xs = writeFile "names.txt"
(unlines (map showNames xs))
where doQuery
is defined as:
doQuery action q
= do adoRun $
adoConnect (adoDSN "pubs") $ \pubs ->
do xs <- query pubs q
action xs
Of course, it gets more interesting when operations like this are composed with
other combinator libraries like Erik Meijer's CGI library or John Hughes Pretty Printer.
The following example uses both libraries to export all the names of the
authors to a HTML file (which could be CGI response). (note: this example only works
with the HaskellCGI package installed).
import HTML (prose)
import HTMLWizard (page,h)
import qualified HTMLWizard (table)
import PrettyHTML ()
...
saveNamesHTML
= doQuery writeHTML names
where
writeHTML = writeFile "names.html" . show . toHTML
toHTML xs = page "Author Names" []
[ h 2 "HaskellDB generated table"
, toTable xs
]
toTable xs = HTMLWizard.table
(show (length xs) ++ " names")
["FirstName", "LastName"]
(map toRow xs)
toRow x = map toElem [x!.au_fname, x!.au_lname]
toElem x = [prose x]