Setting up your data portal involves configuring and running Elasticsearch, Maestro, Arranger and Stage. Below are the steps and breakdowns to ensure a smooth setup process.
docker run -d --name elasticsearch \
  -p 9200:9200 \
  -e discovery.type=single-node \
  -e cluster.name=workflow.elasticsearch \
  -e ES_JAVA_OPTS="-Xms512m -Xmx2048m" \
  -e ELASTIC_PASSWORD=myelasticpassword \
  -e xpack.security.enabled=true \
  -e MANAGE_INDEX_TEMPLATES=true \
  -e NETWORK_HOST=http://localhost:9200 \
  docker.elastic.co/elasticsearch/elasticsearch:7.17.1
-p 9200:9200 maps port 9200 of the host to port 9200 of the container.
-e discovery.type=single-node configures Elasticsearch to run in single-node mode. This bypasses cluster discovery and formation, so Elasticsearch starts as a standalone node, which is ideal for development, testing, or small-scale deployments where clustering is not necessary.
-e cluster.name=workflow.elasticsearch names the Elasticsearch cluster. This is good practice in case you choose to run multiple clusters or nodes in the future.
-e ES_JAVA_OPTS="-Xms512m -Xmx2048m" sets the initial and maximum heap size for the Java Virtual Machine (JVM) running Elasticsearch. -Xms512m sets the initial heap size to 512 MB and -Xmx2048m sets the maximum heap size to 2048 MB (2 GB). Properly setting these values ensures that Elasticsearch has enough memory to handle its operations efficiently, but not so much that it starves other processes on the host machine.
-e xpack.security.enabled=true activates security features such as authentication, authorization, encryption, and audit logging.
-e MANAGE_INDEX_TEMPLATES=true ensures Elasticsearch manages index templates. When true, the system expects to manage the index templates as part of its operations. In the next step we will set up the default configuration for new indices.
-e ELASTIC_PASSWORD=myelasticpassword sets the password for the elastic user.
Place the quickstart_index_template.json file in a folder named elasticsearchConfigs. This file specifies settings, mappings, and configurations that will be applied automatically to new indices that match the template's pattern.
If you'd like to learn more about creating an index mapping for your own data, see our administration guide on configuring the index mapping.
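Elasticsearch can take a little while to accept requests after the container starts, so it may help to wait for it before sending the requests that follow. The sketch below is one way to do that; the retry helper and its argument order are hypothetical names introduced here, not part of Overture or Elasticsearch.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_count=1
  while ! "$@"; do
    if [ "$retry_count" -ge "$retry_attempts" ]; then
      echo "gave up after $retry_count attempts: $*" >&2
      return 1
    fi
    retry_count=$((retry_count + 1))
    sleep "$retry_delay"
  done
}

# Example (assumed values): poll the cluster health endpoint every 2 seconds,
# up to 30 times, using the quickstart credentials.
# retry 30 2 curl -fsS -u elastic:myelasticpassword http://localhost:9200/_cluster/health
```

The attempt count and delay are arbitrary; adjust them to how long the container takes to start on your machine.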
Update Elasticsearch with your index template using the following curl command:
curl -u elastic:myelasticpassword -X PUT 'http://localhost:9200/_template/index_template' \
  -H 'Content-Type: application/json' \
  -d @./elasticsearchConfigs/quickstart_index_template.json
Create a new index in Elasticsearch using the following curl command:
curl -u elastic:myelasticpassword -X PUT 'http://localhost:9200/overture-quickstart-index'
If successful, you should be able to view the new index in your browser at http://localhost:9200/overture-quickstart-index with the username elastic and password myelasticpassword.
Any index whose name starts with overture- will use the mapping of the index template we provided earlier. This is defined on line two of our quickstart_index_template.
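To make the matching rule concrete, here is a small sketch that checks names against a wildcard pattern. It assumes the template's pattern is overture-* (inferred from the sentence above, not read from the file) and uses shell globbing as a stand-in for Elasticsearch's own wildcard matching, which may differ in detail. The matches_pattern helper is a hypothetical name.

```shell
# matches_pattern NAME PATTERN
# Succeeds when NAME matches the shell glob PATTERN.
matches_pattern() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Compare a few index names against the assumed template pattern.
for name in overture-quickstart-index overture-test file_centric; do
  if matches_pattern "$name" 'overture-*'; then
    echo "$name: template applies"
  else
    echo "$name: template does not apply"
  fi
done
```

Under that assumption, the first two names pick up the template's mapping and file_centric does not.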
Create a file named .env.maestro with the following content:
# ==============================
# Maestro Environment Variables
# ==============================

# Maestro Variables
MAESTRO_FAILURELOG_ENABLED=true
MAESTRO_FAILURELOG_DIR=app/logs/maestro
MAESTRO_LOGGING_LEVEL_ROOT=INFO
MAESTRO_NOTIFICATIONS_SLACK_ENABLED=false

# Song Variables
MAESTRO_REPOSITORIES_0_CODE=song.overture
MAESTRO_REPOSITORIES_0_URL=http://song:8080
MAESTRO_REPOSITORIES_0_NAME=Overture
MAESTRO_REPOSITORIES_0_ORGANIZATION=Overture
MAESTRO_REPOSITORIES_0_COUNTRY=CA

# Elasticsearch Variables
MAESTRO_ELASTICSEARCH_CLUSTER_NODES=http://elasticsearch:9200
MAESTRO_ELASTICSEARCH_CLIENT_BASICAUTH_USER=elastic
MAESTRO_ELASTICSEARCH_CLIENT_BASICAUTH_PASSWORD=myelasticpassword
MAESTRO_ELASTICSEARCH_CLIENT_TRUSTSELFSIGNCERT=true
MAESTRO_ELASTICSEARCH_INDEXES_ANALYSISCENTRIC_ENABLED=false
MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_ENABLED=true
MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_NAME=overture-quickstart-index
MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_ALIAS=file_centric
MAESTRO_ELASTICSEARCH_CLIENT_BASICAUTH_ENABLED=true
MANAGEMENT_HEALTH_ELASTICSEARCH_ENABLED=false

# Spring Variables
SPRING_MVC_ASYNC_REQUESTTIMEOUT=-1
SPRINGDOC_SWAGGERUI_PATH=/swagger-api

# Kafka Variables
SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS=kafka:9092
SPRING_CLOUD_STREAM_BINDINGS_SONGINPUT_DESTINATION=song-analysis
MAESTRO_FAILURELOG_ENABLED enables or disables failure logging. When set to true, Maestro logs any failures that occur, which is useful for debugging and monitoring.
MAESTRO_FAILURELOG_DIR sets the directory path where failure logs are stored. The value should be app/logs/maestro or another path of your choosing.
MAESTRO_LOGGING_LEVEL_ROOT sets the root logging level for Maestro. The value can be INFO, DEBUG or WARN. It determines the level of detail included in logs, where INFO is standard and DEBUG provides more detailed information.
MAESTRO_NOTIFICATIONS_SLACK_ENABLED enables or disables Slack notifications. When set to true, Maestro can send notifications to a Slack channel.
MAESTRO_REPOSITORIES_0_CODE sets the code identifier for the repository. The value here is song.overture, a unique identifier used within Maestro to reference the repository.
MAESTRO_REPOSITORIES_0_URL is the URL of the metadata repository. The value is http://song:8080, the endpoint where Maestro connects to the Song repository.
MAESTRO_REPOSITORIES_0_NAME defines the display name for the repository. The value is Overture, a human-readable name used in logs and interfaces.
MAESTRO_REPOSITORIES_0_ORGANIZATION defines the name of the organization that owns the repository.
MAESTRO_REPOSITORIES_0_COUNTRY defines the country code for the repository's location. The value is CA (Canada).
MAESTRO_ELASTICSEARCH_INDEXES_ANALYSISCENTRIC_ENABLED is set to false, telling Maestro that analysis-centric indices are not to be expected.
MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_ENABLED is set to true, telling Maestro that file-centric indices are to be expected.
MAESTRO_ELASTICSEARCH_CLIENT_BASICAUTH_ENABLED enables basic authentication for the Elasticsearch client.
MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_NAME is the name of the file-centric Elasticsearch index. The value is overture-quickstart-index, aligned with our previously created index.
MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_ALIAS is the alias for the file-centric Elasticsearch index. The value is file_centric.
MAESTRO_ELASTICSEARCH_CLUSTER_NODES points to the address of the Elasticsearch cluster node(s). The value is http://elasticsearch:9200, the Elasticsearch node that Maestro will interact with.
MAESTRO_ELASTICSEARCH_CLIENT_BASICAUTH_USER and MAESTRO_ELASTICSEARCH_CLIENT_BASICAUTH_PASSWORD are the username and password for Elasticsearch.
MANAGEMENT_HEALTH_ELASTICSEARCH_ENABLED enables or disables Elasticsearch health checks. The value can be false (disabled) or true (enabled).
MANAGEMENT_SECURITY_ENABLED enables or disables security management. The value can be false (disabled) or true (enabled).
SPRING_MVC_ASYNC_REQUESTTIMEOUT is -1 (no timeout). This setting determines how long asynchronous requests are allowed to run before timing out.
SPRINGDOC_SWAGGERUI_PATH is /swagger-api, the URL path where the Swagger UI can be accessed (localhost:11235/swagger-api).
SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS defines the address of the Kafka broker(s). The value is set to kafka:9092, the Kafka instance we set up earlier.
SPRING_CLOUD_STREAM_BINDINGS_SONGINPUT_DESTINATION is the destination topic for the Song input binding. The value is song-analysis, the Kafka topic we configured earlier.
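Before starting Maestro, it can be useful to confirm that .env.maestro actually defines the variables described above, since a missing line is easy to overlook. The following is a minimal sketch; missing_env_keys is a hypothetical helper, and the keys passed to it are only a sample of those in the file.

```shell
# missing_env_keys FILE KEY [KEY...]
# Prints each KEY that has no "KEY=" assignment in FILE; fails if any is missing.
missing_env_keys() {
  mek_file=$1
  shift
  mek_status=0
  for mek_key in "$@"; do
    if ! grep -q "^${mek_key}=" "$mek_file"; then
      echo "$mek_key"
      mek_status=1
    fi
  done
  return "$mek_status"
}

# Example: check a few of the Elasticsearch and Kafka settings.
# missing_env_keys .env.maestro \
#   MAESTRO_ELASTICSEARCH_CLUSTER_NODES \
#   MAESTRO_ELASTICSEARCH_INDEXES_FILECENTRIC_NAME \
#   SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS
```

The check only looks for the presence of each assignment; it does not validate the values.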
Run Maestro using the --env-file option:
docker run --env-file .env.maestro \
  --name maestro \
  --platform linux/amd64 \
  -p 11235:11235 \
  ghcr.io/overture-stack/maestro:4.3.0
Create a file named .env.arranger with the following content:
# ==============================
# Arranger Environment Variables
# ==============================

# Arranger Variables
ENABLE_LOGS=false

# Elasticsearch Variables
ES_HOST=http://elasticsearch:9200
ES_USER=elastic
ES_PASS=myelasticpassword

# Stage Variables
REACT_APP_BASE_URL=http://stage:3000
REACT_APP_ARRANGER_ADMIN_ROOT=http://arranger-server:5050/graphql
Create a folder named arrangerConfigs and place the following configuration files within it: base.json, extended.json, facets.json, matchbox.json and table.json.
Run Arranger using your .env.arranger file:
docker run --env-file .env.arranger \
  --name arranger-server \
  -p 5050:5050 \
  -v ./arrangerConfigs/base.json:/app/modules/server/configs/base.json \
  -v ./arrangerConfigs/extended.json:/app/modules/server/configs/extended.json \
  -v ./arrangerConfigs/facets.json:/app/modules/server/configs/facets.json \
  -v ./arrangerConfigs/matchbox.json:/app/modules/server/configs/matchbox.json \
  -v ./arrangerConfigs/table.json:/app/modules/server/configs/table.json \
  ghcr.io/overture-stack/arranger-server:3.0.0-beta.33
Make sure the ./arrangerConfigs/ path matches the actual location of your Arranger-Server configuration files. If it does not, update your command or folder structure accordingly.
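One way to confirm the paths is to check that each configuration file exists before running the container. When the host side of a -v mount is missing, Docker typically creates it as an empty directory, which can leave Arranger with a directory where it expects a JSON file. The missing_files helper below is a hypothetical name used for illustration.

```shell
# missing_files DIR FILE [FILE...]
# Prints each FILE that is not a regular file inside DIR; fails if any is missing.
missing_files() {
  mf_dir=$1
  shift
  mf_status=0
  for mf_name in "$@"; do
    if [ ! -f "$mf_dir/$mf_name" ]; then
      echo "$mf_dir/$mf_name"
      mf_status=1
    fi
  done
  return "$mf_status"
}

# Example: verify the five files mounted by the Arranger command.
# missing_files ./arrangerConfigs base.json extended.json facets.json matchbox.json table.json
```

Any path the function prints is one to fix before starting Arranger.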
ES_HOST is the URL of your Elasticsearch instance.
ES_USER and ES_PASS are the credentials for accessing Elasticsearch.
REACT_APP_BASE_URL is the base URL for your front-end application, in this case Stage, which we will set up next.
REACT_APP_ARRANGER_ADMIN_ROOT is the URL for the Arranger GraphQL endpoint.
-p 5050:5050 maps port 5050 of the host to port 5050 of the container.
-v ./arrangerConfigs/...:/app/modules/server/configs/... mounts configuration files into the container.
base.json contains the base configuration for the Arranger server.
extended.json contains all possible fields input into Arranger.
facets.json defines the facets found within the facet panel of the data exploration page in Stage.
table.json defines the formatting of the tables found on the data exploration page in Stage.
matchbox.json contains matchbox configuration settings.
If you want to learn more about configuring Arranger, see our administration guide on customizing the search portal.
Create a file named .env.stage with the following content:
# ==============================
# Stage Environment Variables
# ==============================

# Stage Variables
NEXTAUTH_URL=http://localhost:3000/api/auth
NEXT_PUBLIC_LAB_NAME=Overture QuickStart Portal
NEXT_PUBLIC_ADMIN_EMAIL=contact@overture.bio
NEXT_PUBLIC_DEBUG=true
NEXT_PUBLIC_SHOW_MOBILE_WARNING=true

# Keycloak Variables
NEXT_PUBLIC_AUTH_PROVIDER=keycloak
ACCESSTOKEN_ENCRYPTION_SECRET=super_secret
SESSION_ENCRYPTION_SECRET=this_is_a_super_secret_secret
NEXT_PUBLIC_KEYCLOAK_HOST=http://keycloak:8080
NEXT_PUBLIC_KEYCLOAK_REALM=myrealm
NEXT_PUBLIC_KEYCLOAK_CLIENT_ID=webclient
KEYCLOAK_CLIENT_SECRET=ikksyrYaKX07acf4hpGrpKWcUGaFkEdM
NEXT_PUBLIC_KEYCLOAK_PERMISSION_AUDIENCE=dms

# Score Variables
NEXT_PUBLIC_SCORE_API_URL=http://score:8087

# Arranger Variables
NEXT_PUBLIC_ARRANGER_DOCUMENT_TYPE=file
NEXT_PUBLIC_ARRANGER_INDEX=file_centric
NEXT_PUBLIC_ARRANGER_API_URL=http://arranger-server:5050
NEXT_PUBLIC_ARRANGER_MANIFEST_COLUMNS=repositories.code, object_id, analysis.analysis_id, study_id, file_type, file.name, file.size, file.md5sum, file.index_file.object_id, donors.donor_id, donors.specimens.samples.sample_id
NEXTAUTH_URL specifies the base URL for NextAuth.js, which handles authentication in Next.js applications. This setting is used to configure the authentication flow, including where to redirect users after successful authentication.
NEXT_PUBLIC_LAB_NAME is the name that will be displayed in the top left of the portal interface. Feel free to get creative here.
NEXT_PUBLIC_ADMIN_EMAIL is the email address of the administrator or support contact. This setting updates the help link found by default in the footer navigation of the portal interface.
NEXT_PUBLIC_AUTH_PROVIDER specifies the authentication provider, in this case Keycloak.
ACCESSTOKEN_ENCRYPTION_SECRET defines the secret used to encrypt access tokens, preventing easy decoding of intercepted tokens.
SESSION_ENCRYPTION_SECRET specifies the secret used to encrypt session cookies, protecting sensitive information stored in the cookie from unauthorized access.
NEXT_PUBLIC_KEYCLOAK_HOST specifies the URL where the Keycloak server is hosted (http://keycloak:8080 in the file above), while NEXT_PUBLIC_KEYCLOAK_REALM defines the realm in Keycloak that contains the users and roles for our application.
NEXT_PUBLIC_KEYCLOAK_CLIENT_ID and the client secret KEYCLOAK_CLIENT_SECRET are assigned to the application by Keycloak, linking the application to its configuration within Keycloak.
NEXT_PUBLIC_KEYCLOAK_PERMISSION_AUDIENCE specifies the audience for the permission claims in the access token, restricting the scope of access granted to the token.
NEXT_PUBLIC_SCORE_API_URL is the URL of the Score API, which the application uses to communicate with the Score service.
NEXT_PUBLIC_ARRANGER_DOCUMENT_TYPE specifies the document type. Indices can be either file-centric or analysis (participant) centric, and this variable states which of the two is in use.
NEXT_PUBLIC_ARRANGER_INDEX defines the index used by the Arranger service.
NEXT_PUBLIC_ARRANGER_API_URL is the URL of the Arranger GraphQL API. By default, Arranger's API is mapped to port 5050.
NEXT_PUBLIC_ARRANGER_MANIFEST_COLUMNS lists the columns to be included in the manifest generated for download with Score.
Run Stage using your .env.stage file:
docker run --env-file .env.stage \
  --name stage \
  -p 3000:3000 \
  ghcr.io/overture-stack/stage:3ede4e2
The front-end portal will now be available in your browser at localhost:3000
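The two encryption secrets in .env.stage look like placeholder values suited to a local trial. For anything beyond that, random values are a safer choice. The sketch below generates one from /dev/urandom; random_hex is a hypothetical helper, and the 32-byte length is an arbitrary assumption, so check Stage's documentation for any length requirement.

```shell
# random_hex BYTES
# Prints BYTES random bytes from /dev/urandom as lowercase hex (2 characters per byte).
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Example: print replacement lines to paste into .env.stage.
# echo "ACCESSTOKEN_ENCRYPTION_SECRET=$(random_hex 32)"
# echo "SESSION_ENCRYPTION_SECRET=$(random_hex 32)"
```

After changing either value, the Stage container needs to be re-created for the new environment to take effect.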
Now that our platform is set up, we need to generate an API key to enable secure communication between Song and Score.
API keys are brokered by Keycloak and accessible when logged in to the Stage UI at localhost:3000/login.
1. Log in through the Stage UI by selecting Login from the top right.
Default credentials were pre-configured when we imported our Users.json file into Keycloak. The default admin account credentials are username admin and password admin123.
2. Generate a new API token by selecting Profile and Token from your user dropdown menu at the top right of the Stage UI and selecting Generate New Token.
3. Update the SCORE_ACCESSTOKEN variable within your .env.song file. Once updated, remove the existing Song container and re-run Song with your updated .env.song:
docker run -d \
  --name song \
  --platform linux/amd64 \
  -p 8080:8080 \
  --env-file .env.song \
  ghcr.io/overture-stack/song-server:438c2c42
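Step 3 can also be scripted. The sketch below replaces (or adds) a single assignment in an env file without relying on sed, so tokens containing characters such as / or & are written verbatim. set_env_var is a hypothetical helper, and the token shown in the example is a placeholder.

```shell
# set_env_var FILE KEY VALUE
# Ensures FILE contains exactly one "KEY=VALUE" line, replacing any existing one.
set_env_var() {
  sev_file=$1
  sev_key=$2
  sev_value=$3
  sev_tmp=$(mktemp)
  # Copy every line except an existing assignment of the key.
  # grep exits non-zero when nothing is copied or the file is missing; that is fine here.
  grep -v "^${sev_key}=" "$sev_file" > "$sev_tmp" 2>/dev/null || true
  printf '%s=%s\n' "$sev_key" "$sev_value" >> "$sev_tmp"
  # Write back in place so the file keeps its original permissions.
  cat "$sev_tmp" > "$sev_file"
  rm -f "$sev_tmp"
}

# Example with a placeholder token (paste the one generated in Stage):
# set_env_var .env.song SCORE_ACCESSTOKEN "<your-token>"
# docker rm -f song   # remove the existing Song container before re-running it
```

Note that the updated assignment is appended to the end of the file rather than kept at its original position.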
Now that you have the end-to-end portal set up, we recommend you check out our administration guide on updating the data model.