Document server

Preliminaries

We will build an application that stores and retrieves documents. It consists of an access library/module, several servers, and several clients.

The document store is an SQLite relational database, which is kept in a single file.

The document store will contain the following information:

  1. The documents table:
    1. id
    2. tags
    3. description
    4. body -- The document (body) itself.
  2. The config table:
    1. latest_id

There is a sample database containing a few documents (records) in ./Data/documents01.sqlite.

You can create a new, empty database by running create_doc_database.py.
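
The contents of create_doc_database.py are not shown here, so the following is only a rough sketch of what such a script might do. The column types, the function name create_doc_database, and the starting value 0 for latest_id are all assumptions.

```python
import sqlite3


def create_doc_database(path):
    """Create empty documents and config tables (assumed schema)."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            'create table if not exists documents ('
            'id integer primary key, tags text, description text, body text)')
        conn.execute(
            'create table if not exists config (latest_id integer)')
        # Assumption: an empty store starts counting from 0.
        if conn.execute('select count(*) from config').fetchone()[0] == 0:
            conn.execute('insert into config values (0)')
    return conn


# ':memory:' keeps the example from touching the file system.
conn = create_doc_database(':memory:')
print(conn.execute('select latest_id from config').fetchone())
```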

Here are some SQL queries that you will find useful:

# Pass values as parameters ("?" placeholders) instead of using str.format():
# formatted SQL breaks on quotes inside the text and allows SQL injection.
# Run a parameterized query with cursor.execute(sql, params).
#
# Get and update config (latest_id).
sql = 'select latest_id from config'
sql, params = 'update config set latest_id = ?', (new_id,)
#
# Get id, tags, and description for all documents.
sql = 'select id, tags, description from documents'
#
# Get all fields for a specific document by ID.
sql, params = 'select * from documents where id = ?', (doc_id,)
#
# Insert a new document into the database.
sql, params = ('insert into documents values (?, ?, ?, ?)',
    (new_id, tags, description, body))
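
As a small illustration of how these queries fit together, the sketch below adds one document and reads it back using the standard sqlite3 module. The schema is an assumption made only so the example runs on its own, and the sample values are invented. It binds values with "?" placeholders so that quotes in the body cannot break the SQL.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Assumed schema, created here only so that the example is self-contained.
conn.execute('create table documents '
             '(id integer, tags text, description text, body text)')
conn.execute('create table config (latest_id integer)')
conn.execute('insert into config values (0)')

# Reserve the next id, insert the document, then record the new latest_id.
new_id = conn.execute('select latest_id from config').fetchone()[0] + 1
body = 'She said "hello" and it\'s fine.'  # quotes are safe with parameters
conn.execute('insert into documents values (?, ?, ?, ?)',
             (new_id, 'demo quotes', 'Quoting test', body))
conn.execute('update config set latest_id = ?', (new_id,))
conn.commit()

row = conn.execute('select * from documents where id = ?',
                   (new_id,)).fetchone()
print(row)
```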

More information about SQLite: http://www.sqlite.org/docs.html

Exercises

Now, do each of the following:

  1. Implement a module/library to access and add records to a document repository stored in an SQLite database file. Support the following API:

    • list() -- Provide a list of the ids, tags, and descriptions for all the documents/records in the database.
    • get(id) -- Retrieve a record by ID.
    • search(tag) -- Return the ids, tags, and descriptions for all records whose tags field contains tag.
    • add(tags, description, body) -- Add a new record to the database.

    The file create_doc_database.py can be used to create and initialize a new empty SQLite document database file.
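
One possible shape for such a module is sketched below. The class name DocStore is hypothetical, the schema in the usage lines is assumed, and search() does a plain case-sensitive substring match on the tags field, which is only one reading of "contains".

```python
import sqlite3


class DocStore:
    """Minimal access layer for the documents and config tables."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)

    def list(self):
        """Return (id, tags, description) for every document."""
        return self.conn.execute(
            'select id, tags, description from documents').fetchall()

    def get(self, doc_id):
        """Return the full record for doc_id, or None if it is missing."""
        return self.conn.execute(
            'select * from documents where id = ?', (doc_id,)).fetchone()

    def search(self, tag):
        """Return (id, tags, description) where the tags field contains tag."""
        return self.conn.execute(
            'select id, tags, description from documents '
            'where instr(tags, ?) > 0', (tag,)).fetchall()

    def add(self, tags, description, body):
        """Insert a record under the next free id and return that id."""
        with self.conn:  # one transaction: insert plus latest_id update
            latest = self.conn.execute(
                'select latest_id from config').fetchone()[0]
            new_id = latest + 1
            self.conn.execute(
                'insert into documents values (?, ?, ?, ?)',
                (new_id, tags, description, body))
            self.conn.execute(
                'update config set latest_id = ?', (new_id,))
        return new_id


store = DocStore(':memory:')
# Assumed schema; a real store would come from create_doc_database.py.
store.conn.executescript(
    'create table documents '
    '(id integer, tags text, description text, body text);'
    'create table config (latest_id integer);'
    'insert into config values (0);')
first_id = store.add('python zmq', 'ZeroMQ notes', 'Body text')
print(store.search('zmq'))
```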

  2. Implement a RESTful document server. Build your server on ZeroMQ.

    Your document server should provide these capabilities:

    • Return a list of all the documents in the database. Return these fields: ID, tags, description.
    • Search for documents by tag and return a set of document IDs (and possibly their descriptions).
    • Retrieve a document by ID.
    • Add a new document given tags, description, and the body (contents) of a file to store in the database.

    You can find templates for the ZeroMQ parts in Templates/hwserver.py and Templates/hwclient.py
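
The templates are not reproduced here, so the sketch below is only one way the server might be laid out, assuming a REQ/REP socket pair and a JSON message format such as {"op": "list"}. The message format, the port 5555, and the names handle_request, serve, and FakeStore are all assumptions. Keeping the dispatch logic in a plain function lets you test it without a running socket.

```python
import json


def handle_request(store, raw):
    """Decode one JSON request, call the store, and encode a JSON reply."""
    try:
        msg = json.loads(raw)
        op = msg.get('op')
        if op == 'list':
            result = store.list()
        elif op == 'search':
            result = store.search(msg['tag'])
        elif op == 'get':
            result = store.get(msg['id'])
        elif op == 'add':
            result = store.add(msg['tags'], msg['description'], msg['body'])
        else:
            raise ValueError('unknown op: {!r}'.format(op))
        reply = {'ok': True, 'result': result}
    except (ValueError, KeyError, AttributeError) as exc:
        # Malformed JSON, missing fields, and unknown operations end up here.
        reply = {'ok': False, 'error': str(exc)}
    return json.dumps(reply).encode('utf-8')


def serve(store, endpoint='tcp://*:5555'):
    """Answer requests forever on a REP socket (requires pyzmq)."""
    import zmq  # imported here so handle_request works without pyzmq
    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind(endpoint)
    while True:
        socket.send(handle_request(store, socket.recv()))


class FakeStore:
    """Stand-in for the access library so the sketch runs on its own."""

    def list(self):
        return [[1, 'python zmq', 'ZeroMQ notes']]


print(handle_request(FakeStore(), b'{"op": "list"}'))
```

To run a real server, pass an instance of your access library to serve() in place of FakeStore.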

  3. Implement a ZeroMQ client for your document server. The client should be able to perform these functions:

    1. Retrieve a list of all the documents in the database.
    2. Search for documents, given a tag.
    3. Retrieve a document, given its ID.
    4. Add a new document, given the tags, description, and name of a file to be stored.
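
A matching client could look roughly like the sketch below. It assumes a server that listens on a REP socket at tcp://localhost:5555 and understands JSON messages with an "op" field. That wire format, the endpoint, and the function names are illustrative assumptions, not something the exercise prescribes.

```python
import json


def make_request(op, **fields):
    """Encode a request in the (assumed) JSON wire format."""
    msg = {'op': op}
    msg.update(fields)
    return json.dumps(msg).encode('utf-8')


def add_request(tags, description, filename):
    """Build an 'add' request whose body is the content of filename."""
    with open(filename, encoding='utf-8') as infile:
        return make_request('add', tags=tags, description=description,
                            body=infile.read())


def call(raw_request, endpoint='tcp://localhost:5555'):
    """Send one request over a REQ socket and return the decoded reply."""
    import zmq  # requires pyzmq
    context = zmq.Context()
    socket = context.socket(zmq.REQ)
    socket.connect(endpoint)
    try:
        socket.send(raw_request)
        return json.loads(socket.recv())
    finally:
        socket.close()
        context.term()


print(make_request('search', tag='zmq'))
```

Note that call() blocks until a reply arrives, so it will hang if no server is running. A production client would probably poll with a timeout.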

  4. Implement a command line shell as a client for the document store. Use the cmd module from the standard Python library. Your command line shell can either (1) access your library/module directly or (2) use your ZeroMQ client.

    In your command line shell, implement the same commands: list, search, get, and add.

    Optional task -- Use one of the Python command line parsers (getopt or argparse) to parse the command line passed into this program. Implement support for options "-v", "--verbose", "-h", and "--help".

  5. Implement a Web application server that provides access to the documents in our document store. Build your Web application with Pyramid.

    Your Web application server should support these operations (URLs):

    /help -- Show this help.
    /list -- List all documents.
    /search/{tag} -- Search for and show documents by tag.
    /get/{id} -- Get and show document by ID.
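
As a rough idea of how these URLs could map onto Pyramid routes and views, consider the sketch below. It configures the application imperatively instead of relying on the generated scaffold, returns JSON instead of rendered pages, and passes the store to the views through a settings key named doc_store. All three choices, along with the function names, are assumptions.

```python
from types import SimpleNamespace

HELP = {
    '/help': 'Show this help.',
    '/list': 'List all documents.',
    '/search/{tag}': 'Search for and show documents by tag.',
    '/get/{id}': 'Get and show document by ID.',
}


def _store(request):
    # Assumption: the access object is stashed in the settings as 'doc_store'.
    return request.registry.settings['doc_store']


def help_view(request):
    return HELP


def list_view(request):
    return {'documents': _store(request).list()}


def search_view(request):
    return {'documents': _store(request).search(request.matchdict['tag'])}


def get_view(request):
    # The route pattern below only matches digits, so int() is safe here.
    # A missing document is returned as None (JSON null) in this sketch.
    return {'document': _store(request).get(int(request.matchdict['id']))}


def make_app(store):
    """Build a WSGI application that serves the four URLs as JSON."""
    from pyramid.config import Configurator  # requires Pyramid
    config = Configurator(settings={'doc_store': store})
    routes = [('help', '/help', help_view),
              ('list', '/list', list_view),
              ('search', '/search/{tag}', search_view),
              ('get', r'/get/{id:\d+}', get_view)]
    for name, pattern, view in routes:
        config.add_route(name, pattern)
        config.add_view(view, route_name=name, renderer='json')
    return config.make_wsgi_app()


class FakeStore:
    """Stand-in for the access library so the views can be tried directly."""

    def search(self, tag):
        return [(1, 'python zmq', 'ZeroMQ notes')]


# A fake request is enough to exercise a view without starting a server.
request = SimpleNamespace(
    registry=SimpleNamespace(settings={'doc_store': FakeStore()}),
    matchdict={'tag': 'zmq'})
print(search_view(request))
```

The object returned by make_app() is an ordinary WSGI application, so it can be served by any WSGI server.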
    

    You can create a starter Pyramid application with the following commands (pcreate ships with Pyramid 1.x; Pyramid 2.0 removed it in favor of cookiecutter templates):

    $ pcreate --help
    $ pcreate --list
    $ pcreate --scaffold=starter my_doc_server
    $ cd my_doc_server
    $ python setup.py develop
    

    Now start the server with the following:

    $ pserve development.ini
    

    Then visit http://localhost:6543 in your Web browser to check that your application is running.