================================================
h5serv -- The HDF server -- Notes and examples
================================================

:author: Dave Kuhlman
:contact: dkuhlman (at) davekuhlman (dot) org
:address: http://www.davekuhlman.org

:revision: 1.0a
:date: |date|

.. |date| date:: %B %d, %Y

:Copyright: Copyright (c) 2015 Dave Kuhlman.  All Rights Reserved.
    This software is subject to the provisions of the MIT License
    http://www.opensource.org/licenses/mit-license.php.

:Abstract: This document provides hints, guidance, and sample
    code for access to an ``h5serv`` server.

.. sectnum::

.. contents::


More info
===========

``h5serv``, the HDF Server, serves information about and data from
HDF5 data files.

- The home Web page: https://www.hdfgroup.org/projects/hdfserver/

- A blog article about ``h5serv``:
  https://hdfgroup.org/wp/2015/04/hdf5-for-the-web-hdf-server/

- For information about the REST (Representational state transfer)
  software architectural style, see:
  https://en.wikipedia.org/wiki/Representational_state_transfer


Start up
==========

I installed ``h5serv`` under the Anaconda Python distribution from
Continuum.  See this for more information:
https://store.continuum.io/cshop/anaconda/.

Instructions on installing ``h5serv`` under Anaconda and setting up
your environment are included with the ``h5serv`` distribution.  See
file ``../docs/Installation/ServerSetup.rst`` in the ``h5serv``
distribution.

Installation -- Do this under Linux::

    $ conda create -n h5serv python=2.7 h5py twisted tornado requests pytz

Set up your environment -- Depending on where you have installed
Anaconda, do something like the following::

    $ source ~/a1/Python/Anaconda/Anaconda01/envs/h5serv/bin/activate h5serv

If and when you need to deactivate this environment, use::

    $ source deactivate

Server startup -- Go to the ``server`` sub-directory in your
``h5serv`` installation, and run ``app.py``.
For example::

    $ cd ~/a1/Python/Anaconda/H5serv/Git/h5serv/server
    $ python app.py


Examples
==========

cURL
------

The ``curl`` command line tool is an easy way to make REST requests
to an ``h5serv`` server.  Some examples::

    $ curl -X GET -H "host:testdata04.hdfgroup.org" http://crow:5000

Here is a ``bash`` shell script that makes several requests (I've
added ``echo`` at the end of each command so that a new line is
added)::

    #!/bin/bash

    # get info about a database hdf5 file.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000 ; echo

    # get the IDs of the datasets in the file.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets ; echo

    # get info about one specific dataset.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89 ; echo

    # get the data values from a specific dataset.
    curl -X GET -H "host: testdata04.hdfgroup.org" http://crow:5000/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value ; echo


Python
--------

You will need to install the ``requests`` package.  You can find it
here: https://pypi.python.org/pypi/requests.

For this testing, I used the Anaconda distribution of Python, which,
I believe, includes ``requests`` by default.  You can learn about
Anaconda here: https://store.continuum.io/cshop/anaconda/.
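
The ``curl`` requests in the previous section all follow one pattern: a base URL for the server, a path that names a resource, and a ``host`` header which, as I understand it, names the data file to be served. Here is a minimal sketch of that pattern in Python. The helper names ``build_url`` and ``build_headers`` are hypothetical (they are not part of ``h5serv`` or ``requests``), and the code only builds strings; it does not contact a server.

```python
# Hypothetical helpers, not part of h5serv or requests.  They only
# build the URL and the headers for a request; nothing is sent.


def build_url(base, *parts):
    """Join a base URL and path components with single slashes."""
    pieces = [base.rstrip('/')]
    pieces.extend(str(part).strip('/') for part in parts)
    return '/'.join(pieces)


def build_headers(domain):
    """Return the ``host`` header used in the curl examples."""
    return {'host': domain}


if __name__ == '__main__':
    base = 'http://crow:5000'
    dataset_id = 'f416d15c-2114-11e5-81d4-0019dbe2bd89'
    print(build_url(base, 'datasets'))
    print(build_url(base, 'datasets', dataset_id, 'value'))
    print(build_headers('testdata04.hdfgroup.org'))
```

The results could then be passed to ``requests.get(url, headers=headers)``.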
Using IPython::

    In [1]: import requests
    In [2]: req = 'http://crow:5000/'
    In [3]: hdrs = {'host': 'testdata04.hdfgroup.org'}
    In [4]: rsp = requests.get(req, headers=hdrs)
    In [5]: rsp
    Out[5]: <Response [200]>
    In [6]: print rsp.text
    {"lastModified": "2015-07-02T23:49:18.303330Z", "hrefs":
    [{"href": "http://testdata04.hdfgroup.org/", "rel": "self"},
    {"href": "http://testdata04.hdfgroup.org/datasets", "rel": "database"},
    {"href": "http://testdata04.hdfgroup.org/groups", "rel": "groupbase"},
    {"href": "http://testdata04.hdfgroup.org/datatypes", "rel": "typebase"},
    {"href": "http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89", "rel": "root"}],
    "root": "f416d152-2114-11e5-81d4-0019dbe2bd89",
    "created": "2015-07-02T23:49:18.303330Z"}
    In [7]: print rsp.json()
    {u'lastModified': u'2015-07-02T23:49:18.303330Z', u'hrefs':
    [{u'href': u'http://testdata04.hdfgroup.org/', u'rel': u'self'},
    {u'href': u'http://testdata04.hdfgroup.org/datasets', u'rel': u'database'},
    {u'href': u'http://testdata04.hdfgroup.org/groups', u'rel': u'groupbase'},
    {u'href': u'http://testdata04.hdfgroup.org/datatypes', u'rel': u'typebase'},
    {u'href': u'http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89', u'rel': u'root'}],
    u'root': u'f416d152-2114-11e5-81d4-0019dbe2bd89',
    u'created': u'2015-07-02T23:49:18.303330Z'}
    In [8]: req = 'http://crow:5000/groups'
    In [9]: rsp = requests.get(req, headers=hdrs)
    In [10]: rsp
    Out[10]: <Response [200]>
    In [11]: print rsp.json()
    {u'hrefs': [{u'href': u'http://testdata04.hdfgroup.org/groups', u'rel': u'self'},
    {u'href': u'http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89', u'rel': u'root'},
    {u'href': u'http://testdata04.hdfgroup.org/', u'rel': u'home'}],
    u'groups': [u'f416d155-2114-11e5-81d4-0019dbe2bd89',
    u'f416d158-2114-11e5-81d4-0019dbe2bd89',
    u'f416d15b-2114-11e5-81d4-0019dbe2bd89']}

And here is a Python script containing examples of several requests
like those above::

    #!/usr/bin/env python

    import requests


    def test():
        # Info about the data file itself.
        rsp = requests.get(
            'http://crow:5000',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.text
        print rsp.json()
        # The IDs of the groups in the file, then one specific group.
        rsp = requests.get(
            'http://crow:5000/groups',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/groups/f416d155-2114-11e5-81d4-0019dbe2bd89',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        # The IDs of the datasets in the file, then one specific dataset.
        rsp = requests.get(
            'http://crow:5000/datasets',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        rsp = requests.get(
            'http://crow:5000/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        # The data values from that dataset.
        rsp = requests.get(
            'http://crow:5000/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89/value',
            headers={'host': 'testdata04.hdfgroup.org'})
        print rsp.json()
        value = rsp.json()['value']
        print 'value: {}'.format(value)
        return rsp.json()


    def main():
        test()


    if __name__ == '__main__':
        main()

And the following is a Python script that makes similar requests to
the previous one, but that attempts to hide some of the repetition
and messiness in a class::

    #!/usr/bin/env python

    import requests


    class H5servRequest(object):
        def __init__(self, host, machine, port):
            self.host = host
            self.machine = machine
            self.port = port
            # The base URL, for example "http://crow:5000".
            self.location = "{}:{}".format(machine, port)

        def get(self, path):
            # Send a GET request for path and return the decoded JSON.
            rsp = requests.get(
                self.location + path,
                headers={'host': self.host})
            return rsp.json()


    def test():
        req = H5servRequest(
            'testdata04.hdfgroup.org',
            'http://crow',
            5000)
        data = req.get('')
        print '-----\n{}'.format(data)
        data = req.get('/groups')
        print '-----\n{}'.format(data)
        data = req.get('/datasets')
        print '-----\n{}'.format(data)
        data = req.get('/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89')
        print '-----\n{}'.format(data)
        data = req.get('/datasets/f416d154-2114-11e5-81d4-0019dbe2bd89/value')
        print '-----\n{}'.format(data)


    def main():
        test()


    if __name__ == '__main__':
        main()

Notes:

- The class ``H5servRequest`` captures and reuses the host, the machine (or node to which we
  make our requests), and the port.

- Each call to ``H5servRequest.get`` uses the ``requests`` module to
  send the request, then returns the JSON payload.


JavaScript/Node.js
--------------------

Here is a similar example written in ``Node.js``::

    #!/usr/bin/env node

    var http = require('http');
    var log = console.log;

    function do_request(path, cb) {
        var opt = {};
        opt.hostname = 'crow';
        opt.port = 5000;
        opt.method = 'GET';
        opt.headers = {host: 'testdata04.hdfgroup.org'};
        opt.path = path;
        log('opt: ' + JSON.stringify(opt));
        var req = http.request(opt, function (response) {
            var body = '';
            response.setEncoding('utf8');
            // A response may arrive in several chunks; collect them all.
            response.on('data', function (chunk) {
                body += chunk;
            });
            // Handle the body once, after the whole response has arrived.
            response.on('end', function () {
                log('-----\nbody: ' + body);
                if (cb !== null) {
                    cb(body);
                }
            });
        });
        req.on('error', function(e) {
            log('request error: ' + e.message);
        });
        req.end();
    }

    function test() {
        do_request('/', null);
        do_request('/groups', null);
        do_request('/datasets', null);
        do_request('/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', null);
        do_request(
            '/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value',
            function(data) {
                var content, values;
                content = JSON.parse(data);
                values = content.value;
                log('-----\nvalues: ' + values);
            });
    }

    test();

The HTTP requests in the above example are asynchronous, and,
therefore, the results may not come out in the same order as our
calls to ``do_request``.
Here is an example that uses a recursive loop to execute these
operations in a serial order::

    #!/usr/bin/env node

    var http = require('http');
    var log = console.log;

    // Each entry is a path and an optional callback for the response body.
    var args = [
        ['/', null],
        ['/groups', null],
        ['/datasets', null],
        ['/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89', null],
        ['/datasets/f416d15c-2114-11e5-81d4-0019dbe2bd89/value', function(data) {
            var content, values;
            content = JSON.parse(data);
            values = content.value;
            log('-----\nvalues: ' + values);
        }],
    ];

    function do_request(args, idx) {
        if (idx < args.length) {
            var path = args[idx][0],
                cb = args[idx][1],
                opt = {};
            opt.hostname = 'crow';
            opt.port = 5000;
            opt.method = 'GET';
            opt.headers = {host: 'testdata04.hdfgroup.org'};
            opt.path = path;
            log('opt: ' + JSON.stringify(opt));
            var req = http.request(opt, function (response) {
                var body = '';
                response.setEncoding('utf8');
                // A response may arrive in several chunks; collect them all.
                response.on('data', function (chunk) {
                    body += chunk;
                });
                response.on('end', function () {
                    log('-----\nbody: ' + body);
                    if (cb !== null) {
                        cb(body);
                    }
                    // Start the next request only after this one is complete.
                    do_request(args, idx + 1);
                });
            });
            req.on('error', function(e) {
                log('request error: ' + e.message);
            });
            req.end();
        }
    }

    function test() {
        do_request(args, 0);
    }

    test();

Notice that, in the example above, we do not call ``do_request``
recursively until the ``end`` callback registered with
``response.on`` has been called, that is, until the current response
has been received in full.

.. vim:ft=rst:
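
Returning to the JSON responses shown in the Python section: each one carries an ``hrefs`` list whose entries pair an ``href`` with a ``rel`` name. Here is a minimal sketch that indexes those links by ``rel``. The function name ``links_by_rel`` is hypothetical, and the sample data is a copy of the ``/groups`` response shown earlier, rewritten as JSON text; nothing here contacts a server.

```python
import json

# Copy of the response to GET /groups shown in the Python section.
SAMPLE = '''
{"hrefs": [
    {"href": "http://testdata04.hdfgroup.org/groups", "rel": "self"},
    {"href": "http://testdata04.hdfgroup.org/groups/f416d152-2114-11e5-81d4-0019dbe2bd89", "rel": "root"},
    {"href": "http://testdata04.hdfgroup.org/", "rel": "home"}],
 "groups": ["f416d155-2114-11e5-81d4-0019dbe2bd89",
            "f416d158-2114-11e5-81d4-0019dbe2bd89",
            "f416d15b-2114-11e5-81d4-0019dbe2bd89"]}
'''


def links_by_rel(doc):
    """Map each link's "rel" name to its "href" (hypothetical helper)."""
    return dict((link['rel'], link['href']) for link in doc.get('hrefs', []))


if __name__ == '__main__':
    doc = json.loads(SAMPLE)
    links = links_by_rel(doc)
    print(links['root'])
    print(len(doc['groups']))
```

Note that the ``href`` values name the data file's domain rather than the machine ``crow``, so to follow one against the server used in this document you would presumably keep its path and send it to ``http://crow:5000`` with the same ``host`` header.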