HaskellDB: Examples

HaskellDB

Examples

About
Examples
Documentation
Download
Pubs database
Questions

The goal of HaskellDB is to have a type safe embedding of database queries and operations within Haskell. Instead of sending plain SQL strings to a database, queries should be expressed with normal Haskell functions. Haskell's type system is used to check the queries at compile time. Instead of getting an error at runtime that a field name doesn't exist, a type error is given at compile time that points to the location where the name is misspelled. Since queries are first-class values, the can be stored in data structures, passed as arguments or take typed parameters.

The rest of this page will give some examples of queries and other operations with HaskellDB. The sample database pubs is used for all examples. The examples are all included in the module sample.hs shipped with HaskellDB. Allthough I compare mostly with SQL, HaskellDB can be used with other, even non-relational databases.

Show the names of all authors.
A possible SQL query can be written as:
```
SELECT X.au_fname, X.au_lname FROM Authors AS X
```
In HaskellDB, the query is expressed as:
```
names  = do{ x <- table authors
           ; project (au_fname = x!au_fname
           	     ,au_lname = x!au_lname )
           }
```
First, the relation (=database) authors is opened and bound to variable x. Second, a projection is taken of the attributes (=fields,columns) au_fname and au_lname. The do notation is used to conveniently bind relations to names. The (!) operator is used to select attributes from a relation.
This function is typechecked by the normal Haskell type checker. The type of names is:
```
names :: Query (Rel (au_fname :: Expr String
                    ,au_lname :: Expr String))
```
The type expresses that names is a query that returns a relation with two attributes: au_fname and au_lname. Both contain values of the domain (=type) String. The type of names prevents us for example to project the city attribute:
```
wrong = do{ x <- names; project (city = x!city) }
```
When this expression is loaded, the type checker complains:
```
Type checking
ERROR "SAMPLE.HS" (line 63): Type error in application
*** Expression     : x ! city
*** Term           : x
*** Type           : Rel (au_fname::Expr [Char]
                         , au_lname::Expr [Char])
*** Does not match : Rel (city::Expr a
			 , au_fname::Expr [Char]
			 , au_lname::Expr [Char] | b)
*** Because        : rows are not compatible
```
The produced error isn't very user friendly. However, using the wrong attribute always produces exactly this error. Freely translated it means that the relation x used on line 63 doesn't have the attribute city
Using the Haskell type system to mimic the type system of the relational algebra gives the advantage of having a type safe system with the minimal amount of work. On the downside, the general nature of the Haskell type system leads to quite complex error messages. A second disadvantage is that some constraints just can't be expressed within the current type system in Haskell (for example, a specific join can be typed but not a general join operator).
The example can be run using the following test function for executing a query:
```
import Trex		--record library
import HaskellDB        --basic library
import Pubs             --the 'pubs' database

test :: (Row r -> String) -> Query (Rel r) -> IO ()
test showrow q  = connect $ \pubs ->
                  do{ rows <- query pubs q
                    ; putStr (unlines (map showrow rows))
                    }
```
showrow is a function that shows the results of the query. While the (!) operator is used to select attributes in a query, the (!.) operator is used to select the values of the result. We can show the names of the authors with:
```
showNames r   = r!.au_fname ++ " " ++ r!.au_lname
```
The connect part of the test function is a driver dependent connection function. If you are using the ADO driver provided with the Win32 version, the recommended way of connecting is:
```
import Ado              --the ADO driver

test q  = do adoRun $
             adoConnect (adoDSN "pubs") $ \pubs ->
             do ...
```
This method uses Data Source Names to connect to an ODBC compliant database. See the faq for information about this driver.
We can now run the example query:
```
Sample> test showNames names

Abraham Bennet
Reginald Blotchet-Halls
Cheryl Carson
Michel DeFrance
Innes del Castillo
Ann Dull
...
```
Find all names of authors living in Oakland
Here we have to put a restriction on the query. In SQL we would write:
```
SELECT X.FirstName, X.LastName
FROM   Authors AS X
WHERE  X.City = 'OakLand'
```
The equivalent query in HaskellDB is:
```
oaklands = do{ x <- table authors
             ; restrict (x!city .==. constant "Oakland")
             ; project (au_fname = x!au_fname
             	       ,au_lname = x!au_lname)
             }
```
All conventional relational comparison operations are available in haskell. The operators are wrapped in two dots to distinguish them from the normal comparison operators. Constants are declared using constant. Again the type system ensures that we can not use expressions like:
```
   restrict (x!city .==. constant 2)
```
This results in:
```
Type checking
ERROR "SAMPLE.HS"
*** (line 66): Unresolved top-level overloading
*** Binding             : oaklands
*** Outstanding context : Num [Char]
```
Again, not a friendly message, but a lot better than a strange runtime error at the user site! This message tells you that a string ([Char]) constant at line 66 doesn't support numeric operations (Num [Char]).
Since queries are first-class values, it is easy to parametrise this query on the city name:
```
incity name  = do{ x <- table authors
                 ; restrict (x!city .==. constant name)
                 ; project  (au_fname = x!au_fname
             	            ,au_lname = x!au_lname)
                 }
```
To retrieve all the people living in Oakland or Palo Alto, you could write:
```
oaklandPalo  = union (incity "Oakland")
                     (incity "Palo Alto")
```

Show all authors who live in the same city

This query is best solved using a cartesian product.

samecity = do{ x <- table authors
             ; y <- table authors
             ; restrict (x!au_id .<>. y!au_id)
             ; restrict (x!city .==. y!city)
             ; project  (au_fname = x!au_fname
             	        ,au_lname = x!au_lname
             	        ,city	  = x!city)
             }

The first restriction excludes the author itself as living in the same city, whereas the second restriction excludes people living in other cities. Since we work with relations, a keyword like DISTINCT in SQL is unnecessary. This query is an example of an equi-join; Two author tables are joined on equality of the city attribute.

Note that by simply typing the name of the query function at the Hugs command prompt, the generated SQL expression will be shown

Sample> sameCity
SELECT DISTINCT  au_fname1 as au_fname ,
                 au_lname1 as au_lname ,
                 city1 as city
FROM  (SELECT DISTINCT  au_id as au_id2 ,city as city2
       FROM  authors as T1
       ) as T1 ,
      (SELECT DISTINCT  au_fname as au_fname1 ,
                        au_id as au_id1 ,
                        au_lname as au_lname1 ,
                        city as city1
       FROM  authors as T1
       ) as T2
WHERE (au_id1 <> au_id2) AND (city1 = city2)

List all authors with their books.

We will join three tables to answer this query:

authorTitles
   = do{ x <- table authors
       ; y <- table titleauthor
       ; z <- table titles
       ; restrict (x!au_id .==. y!au_id)
       ; restrict (y!title_id  .==. z!title_id)
       ; project  (au_fname = x!au_fname
                  ,au_lname = x!au_lname
                  ,title    = z!title)
       }

List the number of books an author has written
This is the first case of using an aggregate function. The aggregate function count counts the number of occurrences in a relation.
```
countTitles  = do{ x <- authorTitles
                 ; project (au_fname = x!au_fname
                  	   ,au_lname = x!au_lname
                           ,title    = count x title)
                 }
```
We re-use the title attribute to contain the count of titles. Since TREX currently lacks first-class labels, it is not possible to use a new label without defining one. Look in the Pubs module to see how you can define your own labels.
To count the number of authors that have written books with other people, but not their husbands or family, we can write:
```
together  = do{ x <- authorTitles
              ; y <- authorTitles
              ; restrict (x!au_lname .<>. y!au_lname)
              ; restrict (x!title .==. y!title)
              ; project  (title = count x title) 
              }
```
exercise: write this query in SQL.
List the top 25% of authors with the books they have written sorted on name.
The functions top and topPercent are used to return the top n or n percent rows. The order function orders according to a list of attributes, either in ascending (asc) or descending (desc) order. The resulting query is:
```
authorTitles2  = do{ x <- authorTitles
                   ; order [asc x au_lname
                           ,asc x au_fname]
                   ; topPercent 25
                   ; return x
                   }
```
The query simply returns x since no projection is necessary.

Write a list of author names, first and last name separated with a comma, to a file

While a relation is within the Query monad, it is of type Rel r. When the relation is retrieved, it is represented in Haskell as a (lazy) list of Row r values. An attribute value can be selected using the (!.) operator. The query is written as:

saveNames
        = doQuery write names   
        where
          write xs  = writeFile "names.txt"
                        (unlines (map showNames xs))

where doQuery is defined as:

doQuery action q
        = do adoRun $
             adoConnect (adoDSN "pubs") $ \pubs ->
             do xs <- query pubs q
                action xs

Of course, it gets more interesting when operations like this are composed with other combinator libraries like Erik Meijer's CGI library or John Hughes Pretty Printer. The following example uses both libraries to export all the names of the authors to a HTML file (which could be CGI response). (note: this example only works with the HaskellCGI package installed).

import HTML                 (prose)
import HTMLWizard           (page,h)
import qualified HTMLWizard (table)
import PrettyHTML           ()

...

saveNamesHTML
  = doQuery writeHTML names
  where
    writeHTML     = writeFile "names.html" . show . toHTML
    toHTML xs     = page "Author Names" []
                          [ h 2 "HaskellDB generated table"
                          , toTable xs
                          ]
    toTable xs    = HTMLWizard.table 
    			  (show (length xs) ++ " names")
                          ["FirstName", "LastName"]
                          (map toRow xs)
                          
    toRow x       = map toElem [x!.au_fname, x!.au_lname]
    toElem x      = [prose x]