{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# DataFlow API walkthrough\n",
"Suhas Somnath<br>\n",
"4/6/2022<br>\n",
"Oak Ridge National Laboratory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0. Prepare to use DataFlow's API:\n",
"\n",
"1. Install the ``ordflow`` Python package from PyPI via:\n",
"\n",
"``pip install ordflow``\n",
"\n",
"2. Generate an API Key from DataFlow's web interface\n",
"\n",
"**Note**: API Keys are not reusable across DataFlow servers (e.g. a facility-local server and the central server at https://dataflow.ornl.gov). You will need a separate API Key for each instance of DataFlow you communicate with."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"api_key = \"eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo1LCJjcmVhdGVkX2F0IjoiMjAyMi1wNS0wMlQwOTo1ODoxMi0wNDowMCIsImV4cCI6MTY4Mjk4NTYwMH0.jYqV0YNn1dO_8bdQGvVY5MFqfX_xR1DxRKNZANuemuU\""
]
},
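{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hard-coding the key, as above, stores it in every copy of the notebook. A minimal sketch of one alternative, assuming the key has been exported as an environment variable beforehand. The variable name ``DATAFLOW_API_KEY`` and the helper ``load_api_key`` are hypothetical and are not part of ``ordflow``:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"\n",
"def load_api_key(var_name='DATAFLOW_API_KEY'):\n",
"    # Hypothetical helper: read the key from the environment, not from the notebook\n",
"    key = os.environ.get(var_name)\n",
"    if not key:\n",
"        raise RuntimeError('Set the ' + var_name + ' environment variable first')\n",
"    return key\n",
"\n",
"\n",
"# Uncomment once the (assumed) environment variable has been set:\n",
"# api_key = load_api_key()"
]
},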
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3. Encrypt password(s) necessary to activate Globus endpoints securely\n",
"\n",
"Here, the two Globus endpoints (DataFlow server and destination) use the same authentication (ORNL's XCAMS).\n",
"\n",
"**Note**: Your passwords must be encrypted by the specific deployment of DataFlow (central / facility-local) that you intend to use."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"enc_pwd = \"V5yYQFuavTo83XQ9BFA04azG--5LiXo6OOA3cFPqhm--Hg3wpLrSO0wIswtbFdsz1A==\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"4. Import the ``API`` class from the ``ordflow`` package."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from ordflow import API"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Instantiate the API\n",
"\n",
"Instantiate the API object with your personal API Key:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using server at: https://dataflow.ornl.gov/api/v1 as default\n"
]
}
],
"source": [
"api = API(api_key)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Check default settings\n",
"Pay particular attention to the ``globus.destination_endpoint`` setting, since it is the only setting that can be changed and has any significant effect."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'globus': {'destination_endpoint': '57230a10-7ba2-11e7-8c3b-22000b9923ef'},\n",
" 'transport': {'protocol': 'globus'}}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.settings_get()\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Update a default setting\n",
"\n",
"Here, we will switch the destination endpoint to ``olcf#dtn`` for illustration purposes"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'globus': {'destination_endpoint': 'ef1a9560-7ca1-11e5-992c-22000b96db58'},\n",
" 'transport': {'protocol': 'globus'}}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.settings_set(\"globus.destination_endpoint\", \n",
" \"ef1a9560-7ca1-11e5-992c-22000b96db58\")\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Switch the destination endpoint back to ``cades#CADES-OR``, which is the default:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'globus': {'destination_endpoint': '57230a10-7ba2-11e7-8c3b-22000b9923ef'},\n",
" 'transport': {'protocol': 'globus'}}"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.settings_set(\"globus.destination_endpoint\", \n",
" \"57230a10-7ba2-11e7-8c3b-22000b9923ef\")\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. List and view registered instruments\n",
"\n",
"Contact a DataFlow server administrator to add an instrument for you."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'id': 2,\n",
" 'name': 'Asylum Research Cypher West',\n",
" 'description': 'AR Cypher located in building 8610 in room JG 55. This instrument is capable of Band Excitation and General-mode based measurements in addition to common advanced AFM measurements.',\n",
" 'instrument_type': None}]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.instrument_list()\n",
"response"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'id': 2,\n",
" 'name': 'Asylum Research Cypher West',\n",
" 'description': 'AR Cypher located in building 8610 in room JG 55. This instrument is capable of Band Excitation and General-mode based measurements in addition to common advanced AFM measurements.',\n",
" 'instrument_type': None}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.instrument_info(2)\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Check to see if Globus endpoints are active:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'source_activation': {'code': 'AutoActivationFailed'},\n",
" 'destination_activation': {'code': 'AutoActivationFailed'}}"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.globus_endpoints_active(\"57230a10-7ba2-11e7-8c3b-22000b9923ef\")\n",
"response"
]
},
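{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``code`` values in the response indicate which endpoints still need credentials. A minimal sketch for picking them out, assuming that ``AutoActivationFailed`` (the code seen above) is the value that calls for manual activation. The helper ``endpoints_needing_activation`` is hypothetical and not part of ``ordflow``, and the sample dictionary is copied from the output above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def endpoints_needing_activation(status):\n",
"    # status: dictionary shaped like the response of globus_endpoints_active()\n",
"    # Assumption: only 'AutoActivationFailed' calls for manual activation\n",
"    pending = []\n",
"    for key, info in status.items():\n",
"        if info.get('code') == 'AutoActivationFailed':\n",
"            # Strip the suffix: 'source_activation' becomes 'source'\n",
"            pending.append(key.replace('_activation', ''))\n",
"    return pending\n",
"\n",
"\n",
"sample_status = {'source_activation': {'code': 'AutoActivationFailed'},\n",
"                 'destination_activation': {'code': 'AutoActivationFailed'}}\n",
"endpoints_needing_activation(sample_status)"
]
},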
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Activate one or both endpoints as necessary:\n",
"Because the destination endpoint was not already active, we activate that specific endpoint.\n",
"\n",
"**Note**: An encrypted password is used in place of the plain-text password for safety reasons."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'status': 'ok'}"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.globus_endpoints_activate(\"syz\", \n",
" enc_pwd, \n",
" encrypted=True, \n",
" endpoint=\"destination\")\n",
"response"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'source_activation': {'code': 'AutoActivated.CachedCredential'},\n",
" 'destination_activation': {'code': 'AlreadyActivated'}}"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.globus_endpoints_active()\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Create a measurement Dataset\n",
"This creates a directory at the destination Globus Endpoint:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'id': 12,\n",
" 'name': 'My new dataset with nested metadata',\n",
" 'creator': {'id': 5, 'name': 'Suhas Somnath'},\n",
" 'dataset_files': [],\n",
" 'instrument': None,\n",
" 'metadata_field_values': [{'id': 13,\n",
" 'field_value': 'PZT',\n",
" 'field_name': 'Sample',\n",
" 'metadata_field': None},\n",
" {'id': 14,\n",
" 'field_value': 'Asylum Research',\n",
" 'field_name': 'Microscope-Vendor',\n",
" 'metadata_field': None},\n",
" {'id': 15,\n",
" 'field_value': 'MFP3D',\n",
" 'field_name': 'Microscope-Model',\n",
" 'metadata_field': None},\n",
" {'id': 16,\n",
" 'field_value': '373',\n",
" 'field_name': 'Temperature',\n",
" 'metadata_field': None}]}"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.dataset_create(\"My new dataset with nested metadata\",\n",
" metadata={\"Sample\": \"PZT\", \n",
" \"Microscope\": {\n",
" \"Vendor\": \"Asylum Research\",\n",
" \"Model\": \"MFP3D\"\n",
" },\n",
" \"Temperature\": 373\n",
" }\n",
" )\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Getting the dataset ID programmatically to use later on:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dataset_id = response['id']\n",
"dataset_id"
]
},
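{
"cell_type": "markdown",
"metadata": {},
"source": [
"The response from ``dataset_create`` shows the nested ``Microscope`` dictionary stored as flat fields named ``Microscope-Vendor`` and ``Microscope-Model``, and the integer temperature returned as the string ``'373'``. A minimal sketch that reproduces this naming locally, assuming nested keys are joined with a hyphen and values are converted to strings. The ``flatten_metadata`` helper is hypothetical, and the server-side logic may differ:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def flatten_metadata(metadata, parent=''):\n",
"    # Hypothetical helper: recursively flatten nested dictionaries\n",
"    flat = {}\n",
"    for key, value in metadata.items():\n",
"        # Prefix the key with its parent's name, e.g. 'Microscope-Vendor'\n",
"        name = parent + '-' + key if parent else key\n",
"        if isinstance(value, dict):\n",
"            flat.update(flatten_metadata(value, name))\n",
"        else:\n",
"            # Assumption: values are stored as strings, as in the response above\n",
"            flat[name] = str(value)\n",
"    return flat\n",
"\n",
"\n",
"flatten_metadata({'Sample': 'PZT',\n",
"                  'Microscope': {'Vendor': 'Asylum Research', 'Model': 'MFP3D'},\n",
"                  'Temperature': 373})"
]
},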
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. Upload data file(s) to Dataset"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"using Globus since other file transfer adapters have not been implemented\n"
]
},
{
"data": {
"text/plain": [
"{'id': 9,\n",
" 'name': 'AFM_Topography.PNG',\n",
" 'file_length': 162,\n",
" 'file_type': '',\n",
" 'created_at': '2022-05-02 15:07:04 UTC',\n",
" 'relative_path': '',\n",
" 'is_directory': False}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.file_upload(\"./AFM_Topography.PNG\", dataset_id)\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Upload another data file to the same dataset:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"using Globus since other file transfer adapters have not been implemented\n"
]
},
{
"data": {
"text/plain": [
"{'id': 10,\n",
" 'name': 'measurement_configuration.txt',\n",
" 'file_length': 162,\n",
" 'file_type': '',\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': 'foo/bar',\n",
" 'is_directory': False}"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.file_upload(\"./measurement_configuration.txt\", dataset_id, relative_path=\"foo/bar\")\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. Search Dataset:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'total': 1,\n",
" 'has_more': False,\n",
" 'results': [{'id': 12,\n",
" 'created_at': '2022-05-02T15:03:49Z',\n",
" 'name': 'My new dataset with nested metadata',\n",
" 'dataset_files': [{'id': 9,\n",
" 'name': 'AFM_Topography.PNG',\n",
" 'file_length': 162,\n",
" 'file_type': '',\n",
" 'created_at': '2022-05-02 15:07:04 UTC',\n",
" 'relative_path': '',\n",
" 'is_directory': False},\n",
" {'id': 10,\n",
" 'name': 'measurement_configuration.txt',\n",
" 'file_length': 162,\n",
" 'file_type': '',\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': 'foo/bar',\n",
" 'is_directory': False},\n",
" {'id': 11,\n",
" 'name': 'foo',\n",
" 'file_length': None,\n",
" 'file_type': None,\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': '',\n",
" 'is_directory': True},\n",
" {'id': 12,\n",
" 'name': 'bar',\n",
" 'file_length': None,\n",
" 'file_type': None,\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': 'foo',\n",
" 'is_directory': True}],\n",
" 'metadata_field_values': [{'id': 13,\n",
" 'field_value': 'PZT',\n",
" 'field_name': 'Sample',\n",
" 'metadata_field': None},\n",
" {'id': 14,\n",
" 'field_value': 'Asylum Research',\n",
" 'field_name': 'Microscope-Vendor',\n",
" 'metadata_field': None},\n",
" {'id': 15,\n",
" 'field_value': 'MFP3D',\n",
" 'field_name': 'Microscope-Model',\n",
" 'metadata_field': None},\n",
" {'id': 16,\n",
" 'field_value': '373',\n",
" 'field_name': 'Temperature',\n",
" 'metadata_field': None}]}]}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.dataset_search(\"nested\")\n",
"response"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parse the response to get the ID of the dataset of interest:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"12"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dset_id = response['results'][0]['id']\n",
"dset_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10. View this Dataset:\n",
"This view shows both the files and metadata contained in a dataset:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'id': 12,\n",
" 'name': 'My new dataset with nested metadata',\n",
" 'creator': {'id': 5, 'name': 'Suhas Somnath'},\n",
" 'dataset_files': [{'id': 9,\n",
" 'name': 'AFM_Topography.PNG',\n",
" 'file_length': 162,\n",
" 'file_type': '',\n",
" 'created_at': '2022-05-02 15:07:04 UTC',\n",
" 'relative_path': '',\n",
" 'is_directory': False},\n",
" {'id': 10,\n",
" 'name': 'measurement_configuration.txt',\n",
" 'file_length': 162,\n",
" 'file_type': '',\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': 'foo/bar',\n",
" 'is_directory': False},\n",
" {'id': 11,\n",
" 'name': 'foo',\n",
" 'file_length': None,\n",
" 'file_type': None,\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': '',\n",
" 'is_directory': True},\n",
" {'id': 12,\n",
" 'name': 'bar',\n",
" 'file_length': None,\n",
" 'file_type': None,\n",
" 'created_at': '2022-05-02 15:07:08 UTC',\n",
" 'relative_path': 'foo',\n",
" 'is_directory': True}],\n",
" 'instrument': None,\n",
" 'metadata_field_values': [{'id': 13,\n",
" 'field_value': 'PZT',\n",
" 'field_name': 'Sample',\n",
" 'metadata_field': None},\n",
" {'id': 14,\n",
" 'field_value': 'Asylum Research',\n",
" 'field_name': 'Microscope-Vendor',\n",
" 'metadata_field': None},\n",
" {'id': 15,\n",
" 'field_value': 'MFP3D',\n",
" 'field_name': 'Microscope-Model',\n",
" 'metadata_field': None},\n",
" {'id': 16,\n",
" 'field_value': '373',\n",
" 'field_name': 'Temperature',\n",
" 'metadata_field': None}]}"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"response = api.dataset_info(dset_id)\n",
"response"
]
},
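{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``dataset_files`` list mixes files with the directories that hold them. A minimal sketch that rebuilds each file's path within the dataset, assuming ``relative_path`` and ``name`` combine as they appear in the response above. The ``list_file_paths`` helper is hypothetical and not part of ``ordflow``; it only reads the ``response`` dictionary from the previous cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import posixpath\n",
"\n",
"\n",
"def list_file_paths(dataset):\n",
"    # dataset: dictionary shaped like the response of dataset_info()\n",
"    paths = []\n",
"    for entry in dataset.get('dataset_files', []):\n",
"        if entry['is_directory']:\n",
"            # Skip directory entries such as 'foo' and 'bar'\n",
"            continue\n",
"        # An empty relative_path means the file sits at the dataset root\n",
"        paths.append(posixpath.join(entry['relative_path'], entry['name']))\n",
"    return paths\n",
"\n",
"\n",
"list_file_paths(response)"
]
},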
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 11. View files uploaded via DataFlow:\n",
"We're not using DataFlow here but just viewing the destination file system.\n",
"\n",
"Datasets are sorted by date:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -hlt ~/dataflow/untitled_instrument/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There may be more than one dataset per day. Here we only have one"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Viewing the root directory of the dataset we just created:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will very soon be able to specify root-level metadata that will be stored in ``metadata.json``.\n",
"\n",
"We can also see the nested directories: ``foo/bar`` where we uploaded the second file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/foo/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the innermost directory, ``bar``:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!ls -hlt ~/dataflow/untitled_instrument/2022-04-06/135750_atomic_force_microscopy_scan_of_pzt/foo/bar"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}