{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# DataFlow API walkthrough\n", "Suhas Somnath
\n", "4/6/2022
\n", "Oak Ridge National Laboratory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Prepare to use DataFlow's API:\n", "\n", "1. Install the ``ordflow`` Python package from PyPI via:\n", "\n", "``pip install ordflow``\n", "\n", "2. Generate an API Key from DataFlow's web interface\n", "\n", "**Note**: API Keys are not reusable across DataFlow servers (e.g., facility-local servers and the central server at https://dataflow.ornl.gov). You will need an API Key issued by the specific instance of DataFlow you are communicating with." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "api_key = \"eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo1LCJjcmVhdGVkX2F0IjoiMjAyMi1wNS0wMlQwOTo1ODoxMi0wNDowMCIsImV4cCI6MTY4Mjk4NTYwMH0.jYqV0YNn1dO_8bdQGvVY5MFqfX_xR1DxRKNZANuemuU\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. Encrypt password(s) necessary to activate Globus endpoints securely\n", "\n", "Here, the two Globus endpoints (DataFlow server and destination) use the same authentication (ORNL's XCAMS).\n", "\n", "**Note**: You will need to get your passwords encrypted by the specific deployment of DataFlow (central / facility-local) that you intend to use." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "enc_pwd = \"V5yYQFuavTo83XQ9BFA04azG--5LiXo6OOA3cFPqhm--Hg3wpLrSO0wIswtbFdsz1A==\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. Import the ``API`` class from the ``ordflow`` package." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from ordflow import API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instantiate the API object with your personal API Key:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. 
Instantiate the API" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using server at: https://dataflow.ornl.gov/api/v1 as default\n" ] } ], "source": [ "api = API(api_key)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Check default settings\n", "Pay attention primarily to the ``globus.destination_endpoint`` setting, since it is the only setting that can be changed or that has any significant effect." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'globus': {'destination_endpoint': '57230a10-7ba2-11e7-8c3b-22000b9923ef'},\n", " 'transport': {'protocol': 'globus'}}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.settings_get()\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Update a default setting\n", "\n", "Here, we will switch the destination endpoint to ``olcf#dtn`` for illustration purposes:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'globus': {'destination_endpoint': 'ef1a9560-7ca1-11e5-992c-22000b96db58'},\n", " 'transport': {'protocol': 'globus'}}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.settings_set(\"globus.destination_endpoint\", \n", " \"ef1a9560-7ca1-11e5-992c-22000b96db58\")\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Switching the destination endpoint back to ``cades#CADES-OR``, which is the default:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'globus': {'destination_endpoint': '57230a10-7ba2-11e7-8c3b-22000b9923ef'},\n", " 'transport': {'protocol': 'globus'}}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = 
api.settings_set(\"globus.destination_endpoint\", \n", " \"57230a10-7ba2-11e7-8c3b-22000b9923ef\")\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. List and view registered instruments\n", "\n", "Contact a DataFlow server administrator to add an instrument for you." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'id': 2,\n", " 'name': 'Asylum Research Cypher West',\n", " 'description': 'AR Cypher located in building 8610 in room JG 55. This instrument is capable of Band Excitation and General-mode based measurements in addition to common advanced AFM measurements.',\n", " 'instrument_type': None}]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.instrument_list()\n", "response" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 2,\n", " 'name': 'Asylum Research Cypher West',\n", " 'description': 'AR Cypher located in building 8610 in room JG 55. This instrument is capable of Band Excitation and General-mode based measurements in addition to common advanced AFM measurements.',\n", " 'instrument_type': None}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.instrument_info(2)\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Check to see if Globus endpoints are active:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'source_activation': {'code': 'AutoActivationFailed'},\n", " 'destination_activation': {'code': 'AutoActivationFailed'}}" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.globus_endpoints_active(\"57230a10-7ba2-11e7-8c3b-22000b9923ef\")\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. 
Activate one or both endpoints as necessary:\n", "Because the destination wasn't already activated, we can activate that specific endpoint. \n", "\n", "**Note**: An encrypted password is being used in place of the conventional password for safety reasons. " ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'status': 'ok'}" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.globus_endpoints_activate(\"syz\", \n", " enc_pwd, \n", " encrypted=True, \n", " endpoint=\"destination\")\n", "response" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'source_activation': {'code': 'AutoActivated.CachedCredential'},\n", " 'destination_activation': {'code': 'AlreadyActivated'}}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.globus_endpoints_active()\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7. 
Create a measurement Dataset\n", "This creates a directory at the destination Globus Endpoint:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 12,\n", " 'name': 'My new dataset with nested metadata',\n", " 'creator': {'id': 5, 'name': 'Suhas Somnath'},\n", " 'dataset_files': [],\n", " 'instrument': None,\n", " 'metadata_field_values': [{'id': 13,\n", " 'field_value': 'PZT',\n", " 'field_name': 'Sample',\n", " 'metadata_field': None},\n", " {'id': 14,\n", " 'field_value': 'Asylum Research',\n", " 'field_name': 'Microscope-Vendor',\n", " 'metadata_field': None},\n", " {'id': 15,\n", " 'field_value': 'MFP3D',\n", " 'field_name': 'Microscope-Model',\n", " 'metadata_field': None},\n", " {'id': 16,\n", " 'field_value': '373',\n", " 'field_name': 'Temperature',\n", " 'metadata_field': None}]}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.dataset_create(\"My new dataset with nested metadata\",\n", " metadata={\"Sample\": \"PZT\", \n", " \"Microscope\": {\n", " \"Vendor\": \"Asylum Research\",\n", " \"Model\": \"MFP3D\"\n", " },\n", " \"Temperature\": 373\n", " }\n", " )\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Getting the dataset ID programmatically to use later on:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "12" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset_id = response['id']\n", "dataset_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8. 
Upload data file(s) to Dataset" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "using Globus since other file transfer adapters have not been implemented\n" ] }, { "data": { "text/plain": [ "{'id': 9,\n", " 'name': 'AFM_Topography.PNG',\n", " 'file_length': 162,\n", " 'file_type': '',\n", " 'created_at': '2022-05-02 15:07:04 UTC',\n", " 'relative_path': '',\n", " 'is_directory': False}" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.file_upload(\"./AFM_Topography.PNG\", dataset_id)\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Upload another data file to the same dataset:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "using Globus since other file transfer adapters have not been implemented\n" ] }, { "data": { "text/plain": [ "{'id': 10,\n", " 'name': 'measurement_configuration.txt',\n", " 'file_length': 162,\n", " 'file_type': '',\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': 'foo/bar',\n", " 'is_directory': False}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.file_upload(\"./measurement_configuration.txt\", dataset_id, relative_path=\"foo/bar\")\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9. 
Search Dataset:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'total': 1,\n", " 'has_more': False,\n", " 'results': [{'id': 12,\n", " 'created_at': '2022-05-02T15:03:49Z',\n", " 'name': 'My new dataset with nested metadata',\n", " 'dataset_files': [{'id': 9,\n", " 'name': 'AFM_Topography.PNG',\n", " 'file_length': 162,\n", " 'file_type': '',\n", " 'created_at': '2022-05-02 15:07:04 UTC',\n", " 'relative_path': '',\n", " 'is_directory': False},\n", " {'id': 10,\n", " 'name': 'measurement_configuration.txt',\n", " 'file_length': 162,\n", " 'file_type': '',\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': 'foo/bar',\n", " 'is_directory': False},\n", " {'id': 11,\n", " 'name': 'foo',\n", " 'file_length': None,\n", " 'file_type': None,\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': '',\n", " 'is_directory': True},\n", " {'id': 12,\n", " 'name': 'bar',\n", " 'file_length': None,\n", " 'file_type': None,\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': 'foo',\n", " 'is_directory': True}],\n", " 'metadata_field_values': [{'id': 13,\n", " 'field_value': 'PZT',\n", " 'field_name': 'Sample',\n", " 'metadata_field': None},\n", " {'id': 14,\n", " 'field_value': 'Asylum Research',\n", " 'field_name': 'Microscope-Vendor',\n", " 'metadata_field': None},\n", " {'id': 15,\n", " 'field_value': 'MFP3D',\n", " 'field_name': 'Microscope-Model',\n", " 'metadata_field': None},\n", " {'id': 16,\n", " 'field_value': '373',\n", " 'field_name': 'Temperature',\n", " 'metadata_field': None}]}]}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = api.dataset_search(\"nested\")\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Parsing the response to get the dataset of interest for us:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": 
[ "12" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dset_id = response['results'][0]['id']\n", "dset_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 10. View this Dataset:\n", "This view shows both the files and metadata contained in a dataset:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 12,\n", " 'name': 'My new dataset with nested metadata',\n", " 'creator': {'id': 5, 'name': 'Suhas Somnath'},\n", " 'dataset_files': [{'id': 9,\n", " 'name': 'AFM_Topography.PNG',\n", " 'file_length': 162,\n", " 'file_type': '',\n", " 'created_at': '2022-05-02 15:07:04 UTC',\n", " 'relative_path': '',\n", " 'is_directory': False},\n", " {'id': 10,\n", " 'name': 'measurement_configuration.txt',\n", " 'file_length': 162,\n", " 'file_type': '',\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': 'foo/bar',\n", " 'is_directory': False},\n", " {'id': 11,\n", " 'name': 'foo',\n", " 'file_length': None,\n", " 'file_type': None,\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': '',\n", " 'is_directory': True},\n", " {'id': 12,\n", " 'name': 'bar',\n", " 'file_length': None,\n", " 'file_type': None,\n", " 'created_at': '2022-05-02 15:07:08 UTC',\n", " 'relative_path': 'foo',\n", " 'is_directory': True}],\n", " 'instrument': None,\n", " 'metadata_field_values': [{'id': 13,\n", " 'field_value': 'PZT',\n", " 'field_name': 'Sample',\n", " 'metadata_field': None},\n", " {'id': 14,\n", " 'field_value': 'Asylum Research',\n", " 'field_name': 'Microscope-Vendor',\n", " 'metadata_field': None},\n", " {'id': 15,\n", " 'field_value': 'MFP3D',\n", " 'field_name': 'Microscope-Model',\n", " 'metadata_field': None},\n", " {'id': 16,\n", " 'field_value': '373',\n", " 'field_name': 'Temperature',\n", " 'metadata_field': None}]}" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response 
= api.dataset_info(dset_id)\n", "response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 11. View files uploaded via DataFlow:\n", "We're not using DataFlow here but just viewing the destination file system.\n", "\n", "Datasets are sorted by date:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls -hlt ~/dataflow/untitled_instrument/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There may be more than one dataset per day. Here we only have one:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Viewing the root directory of the dataset we just created:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will very soon be able to specify root-level metadata that will be stored in ``metadata.json``.\n", "\n", "We can also see the nested directories ``foo/bar``, where we uploaded the second file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/foo/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at the innermost directory - ``bar``:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/foo/bar" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { 
"codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 4 }