Arranger simplifies GraphQL queries over Elasticsearch indices with its front-end library of reusable search components. The primary configurable components for this guide are the left-hand search facets and the central data table seen below.
All configurations for these components are made through four configuration files: base.json, extended.json, table.json, and facets.json. We will cover each in the following sections.
An index in Elasticsearch is a collection of related documents with similar characteristics.
Elasticvue offers a convenient and user-friendly interface for managing and exploring your Elasticsearch data. With elasticvue, you can:
Once elasticvue is installed, connect it to Elasticsearch at http://localhost:9200 with the username elastic and password myelasticpassword. Then, from the elasticvue dashboard's top navigation, select search.
This page displays all indexed Elasticsearch documents created by Maestro from published Song analyses and used by Arranger. Clicking any of the _index rows will give you a direct view of the JSON documents that populate the index.
Being able to easily view the JSON documents within your Elasticsearch instance is helpful when writing your Arranger configuration files.
The base.json file contains only two fields, documentType and index:
{
  "documentType": "file",
  "index": "overture-quickstart-index"
}
The index field specifies the name of the Elasticsearch index, in this example overture-quickstart-index.
The documentType field informs Arranger of the mapping type being used by Maestro, either analysis-centric or file-centric.
For more information on index mappings and index centricity, see our administration guide covering index mappings.
The extended.json configuration file defines all the fields and display names you wish to populate your front-end portal with. Below, we have provided a simplified list taken from our QuickStart extended.json configuration:
{
  "extended": [
    {
      "displayName": "Object Id",
      "fieldName": "object_id"
    },
    {
      "displayName": "Analysis Id",
      "fieldName": "analysis.analysis_id"
    },
    {
      "displayName": "Treatment Duration (Days)",
      "fieldName": "analysis.donor.primaryDiagnosis.treatment.treatmentDuration"
    }
  ]
}
The displayName field defines how you want your fields displayed on the front-end UI when used within the search facets and/or the table.
The fieldName values are written as represented within your Elasticsearch documents, with nested elements separated by a . character. For example, analysis_id is nested within the analysis object, making the appropriate fieldName analysis.analysis_id.
Looking at the treatmentDuration field, we can see it is nested more deeply than the other fields outlined above. The same rules apply, however: by working backwards and adding a . for each nested element, we end up with analysis.donor.primaryDiagnosis.treatment.treatmentDuration.
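The dotted-path rule can be illustrated with a short Python sketch. It is not part of Arranger; the helper name dotted_field_names and the sample document (including its values) are hypothetical, shaped only to match the fieldName examples in this section. Arrays of objects are not handled here.

```python
from typing import Any, Dict, Iterator


def dotted_field_names(doc: Dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield the dotted path of every leaf field in a nested document."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects, extending the path with a "."
            yield from dotted_field_names(value, path)
        else:
            yield path


# Hypothetical document: the structure mirrors the fieldName examples,
# the values are placeholders.
sample_doc = {
    "object_id": "example-object-id",
    "analysis": {
        "analysis_id": "example-analysis-id",
        "donor": {
            "primaryDiagnosis": {"treatment": {"treatmentDuration": 30}}
        },
    },
}

field_names = list(dotted_field_names(sample_doc))
for name in field_names:
    print(name)
```

Each printed path is the kind of value you would place in a fieldName entry of extended.json.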
The table.json file configures the columns displayed in the data table. These configurations specify which fields are shown, their visibility, and their sortability.
{
  "table": {
    "columns": [
      {
        "canChangeShow": true,
        "fieldName": "object_id",
        "show": false,
        "sortable": true
      },
      {
        "canChangeShow": true,
        "fieldName": "analysis.analysis_id",
        "show": false,
        "sortable": true
      },
      {
        "canChangeShow": true,
        "fieldName": "analysis.collaborator.name",
        "jsonPath": "$.analysis.collaborator.hits.edges[*].node.name",
        "query": "analysis { collaborator { hits { edges { node { name } } } } }",
        "show": true,
        "sortable": true
      }
    ]
  }
}
canChangeShow is a boolean indicating whether the user can toggle the visibility of the column. Set this to true if you want users to have the option to show or hide this column using the columns dropdown menu. Set it to false if the visibility of this column should remain fixed.
fieldName is the same fieldName described above; these values are written as represented within your Elasticsearch documents.
show is a boolean indicating whether the column is visible by default. Set this to true if you want the column to be visible when the table is first loaded. Set it to false if you want the column to be hidden by default.
sortable is a boolean indicating whether the column can be sorted. Set this to true if you want users to be able to sort the table by this column. Set it to false if sorting should not be allowed for this column.
The jsonPath field specifies the JSON path used to extract nested data from Elasticsearch documents. This field defines the path to data nested within arrays.
For example, suppose we have an Elasticsearch document structured like this:
{
  "analysis": {
    "collaborator": [
      {
        "contactEmail": "susannorton@micr.ca",
        "name": "MICR"
      }
    ]
  }
}
If we want to extract the name field from the collaborator array within the analysis object, our jsonPath for this field would be:
$.analysis.collaborator.hits.edges[*].node.name
$. designates the root of our Elasticsearch documents.
analysis.collaborator is the key for our desired nested object within the root.
hits.edges[*].node specifies that we're accessing an array ([*] translates to "all elements" in the array).
name specifies the desired field we want to extract from our Elasticsearch documents.
The query field defines the GraphQL query needed to retrieve the nested data.
This follows a similar structure to our JSON path but is written in GraphQL syntax:
{
  analysis {
    collaborator {
      hits {
        edges {
          node {
            name
          }
        }
      }
    }
  }
}
When flattened, this matches the configuration shown in our example above:
"analysis { collaborator { hits { edges { node { name } } } } }",
If you want to gain hands-on experience making these queries and exploring GraphQL, we recommend running our Quickstart and accessing the Arranger GraphQL server at http://localhost:5050/graphql. If you prefer the most up-to-date GraphQL Playground UI, you can access it from http://localhost:5050/graphql/hellogql (appending any string to the URL will take you there).
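To make the relationship between the jsonPath and query fields concrete, here is a minimal Python sketch. It is a hand-rolled illustration, not Arranger code and not a full JSONPath implementation: it only understands the "$." root marker, dotted keys, and the "[*]" wildcard. The helper names selection_from_path and extract are hypothetical, and the shape of response_row is an assumption inferred from the jsonPath in the table.json example (the collaborator array exposed as hits.edges[*].node).

```python
from typing import Any, List


def selection_from_path(json_path: str) -> str:
    """Turn a jsonPath into the single-line GraphQL selection it mirrors."""
    # "$." and "[*]" have no GraphQL counterpart, so they are dropped.
    stripped = json_path[2:] if json_path.startswith("$.") else json_path
    segments = stripped.replace("[*]", "").split(".")
    # Build from the innermost field outwards.
    selection = segments[-1]
    for segment in reversed(segments[:-1]):
        selection = f"{segment} {{ {selection} }}"
    return selection


def extract(data: Any, json_path: str) -> List[Any]:
    """Collect every value matched by a "$."-rooted path with "[*]" wildcards."""
    if not json_path.startswith("$."):
        raise ValueError("path must start with '$.'")
    current = [data]
    for segment in json_path[2:].split("."):
        wildcard = segment.endswith("[*]")
        key = segment[:-3] if wildcard else segment
        # Follow the key on every node reached so far.
        current = [node[key] for node in current]
        if wildcard:
            # "[*]" fans out to every element of each array.
            current = [item for array in current for item in array]
    return current


JSON_PATH = "$.analysis.collaborator.hits.edges[*].node.name"

# Assumed response shape for one table row (hypothetical, inferred from JSON_PATH).
response_row = {
    "analysis": {
        "collaborator": {
            "hits": {"edges": [{"node": {"name": "MICR"}}]}
        }
    }
}

query = selection_from_path(JSON_PATH)
names = extract(response_row, JSON_PATH)
print(query)
print(names)
```

Under these assumptions, the generated selection matches the query string used in the table.json example, and the extracted list contains the collaborator name.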
The facets.json file defines how aggregations (also known as facets in Elasticsearch) are configured for data exploration and filtering.
{
  "facets": {
    "aggregations": [
      {
        "active": true,
        "fieldName": "file_type",
        "show": true
      },
      {
        "active": true,
        "fieldName": "analysis__collaborator__name",
        "show": true
      }
    ]
  }
}
active indicates whether this aggregation is active or enabled (true).
fieldName is the field used for aggregation. This means Elasticsearch will aggregate data based on the different values found in the defined field. For the file_type field, this translates into a facet with options to filter by three file types: VCF, BAM, and CRAM.
show indicates whether to display this aggregation in the user interface (true).
One caveat of the facets.json file is the notation used for fieldName values. Here we use double underscores (__) rather than . for nested elements, for example analysis__collaborator__name instead of analysis.collaborator.name.
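Because extended.json and table.json use dotted names while facets.json uses double underscores, a small conversion helper can reduce copy-paste mistakes. The sketch below is an illustration only: the function names to_facet_field_name and build_facets are hypothetical, and it assumes every listed field should be both active and shown.

```python
import json
from typing import Any, Dict, List


def to_facet_field_name(field_name: str) -> str:
    """Convert dotted notation to the double-underscore notation of facets.json."""
    return field_name.replace(".", "__")


def build_facets(field_names: List[str]) -> Dict[str, Any]:
    """Build a facets.json-style structure with every aggregation enabled."""
    return {
        "facets": {
            "aggregations": [
                {
                    "active": True,
                    "fieldName": to_facet_field_name(name),
                    "show": True,
                }
                for name in field_names
            ]
        }
    }


facets = build_facets(["file_type", "analysis.collaborator.name"])
print(json.dumps(facets, indent=2))
```

Fields without nesting, such as file_type, pass through unchanged.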