Submitted data consists of data files (e.g. sequencing reads or VCFs), as well as any associated file metadata (data that describes the data file). Data is submitted to Song & Score using the Song and Score CLIs (Command Line Clients). The Song and Score clients are used in conjunction to upload raw data files while maintaining file metadata and provenance, which is tracked through Song metadata analysis objects.
Running the song-client docker image
You must supply environment variables for the CLIENT_STUDY_ID, the CLIENT_SERVER_URL and your CLIENT_ACCESS_TOKEN. The access token is supplied from Ego or your profile page within Stage.
docker run -d -it --name song-client \
  -e CLIENT_ACCESS_TOKEN=${token} \
  -e CLIENT_STUDY_ID=ABC123 \
  -e CLIENT_SERVER_URL=https://<INSERT-URL> \
  --network="host" \
  --mount type=bind,source="$(pwd)",target=/output \
  ghcr.io/overture-stack/song-client:latest
Running the score-client docker image
You will be required to supply environment variables for the STORAGE_URL, the METADATA_URL and your CLIENT_ACCESS_TOKEN.
docker run -d -it --name score-client \
  -e CLIENT_ACCESS_TOKEN=${token} \
  -e STORAGE_URL=http://<INSERT-URL> \
  -e METADATA_URL=http://<INSERT-URL> \
  --network="host" \
  --mount type=bind,source="$(pwd)",target=/output \
  ghcr.io/overture-stack/score:latest
First, a metadata payload must be prepared. The payload must conform to an analysis_type registered as a schema. For help with creating or updating schemas, please see the Dynamic Schemas documentation.
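Before submitting, it can save a round trip to check locally that the payload file is at least well-formed JSON. The sketch below is a minimal pre-check, not a replacement for Song's own schema validation. The key names in REQUIRED_KEYS and the function name check_payload are assumptions made for illustration; adjust them to match the schema registered for your analysis type.

```python
import json

# Assumed top-level keys (illustrative only); confirm them against the
# schema registered for your analysis type.
REQUIRED_KEYS = ("studyId", "analysisType", "files")


def check_payload(text):
    """Return a list of problems found in a payload string (empty if none)."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    # Report every assumed key that is absent, in a stable order.
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
```

An empty list only means the file passed this local check; Song still performs the authoritative validation when the payload is submitted.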
Once you have formatted the payload correctly, use the song-client submit command to upload the payload.
docker exec song-client sh -c "sing submit -f /output/example-payload.json"
If your payload is not formatted correctly, you will receive an error message detailing what is wrong. Please fix any errors and resubmit. If your payload is formatted correctly, you will get an analysisId in response:
{"analysisId": "a4142a01-1274-45b4-942a-01127465b422","status": "OK"}
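If you script the submission, the analysisId can be pulled out of this response rather than copied by hand. The following is a minimal sketch that assumes the command prints only the JSON document shown above; the function name extract_analysis_id is hypothetical.

```python
import json


def extract_analysis_id(response_text):
    """Return the analysisId from a submit response, or raise ValueError."""
    response = json.loads(response_text)
    # Treat anything other than the "OK" status shown above as a failure.
    if response.get("status") != "OK":
        raise ValueError(f"submission not accepted: {response}")
    return response["analysisId"]
```

For the response above, the function returns a4142a01-1274-45b4-942a-01127465b422, which is the value passed to the manifest and publish steps.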
At this point, since the payload data has successfully been submitted and accepted by Song, it is now referred to as an analysis. By default, all newly created analyses are set to an UNPUBLISHED state.
For more information on analysis states (published, unpublished and suppressed), see our page on analysis management.
Use the returned analysisId to generate a manifest for file upload. This manifest will be used by the score-client in the next step.
The manifest establishes a link between the analysis ID that has been submitted and the data files on your local system that are being uploaded.
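Using the song-client manifest command, define the analysis ID with the -a parameter, the directory containing the files being submitted with the -d parameter, and the output path for the manifest with the -f parameter. Note: the -f value is a file path, not a directory path.
Here is an example of a manifest command: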
docker exec song-client sh -c "sing manifest -a a4142a01-1274-45b4-942a-01127465b422 -f /some/output/dir/manifest.txt -d /submitting/file/directory"
Here is the expected response:
Wrote manifest file 'manifest.txt' for analysisId 'a4142a01-1274-45b4-942a-01127465b422'
The manifest.txt file will be written to the defined output file path. If the output directory does not exist, it will be created automatically.
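When the manifest step is driven from a script, the three parameters can be assembled programmatically. The sketch below builds the same docker exec invocation as the example above as an argument list suitable for subprocess.run. The function name manifest_command and the default container name are illustrative assumptions; only the -a, -f and -d parameters come from this guide.

```python
import shlex


def manifest_command(analysis_id, manifest_path, input_dir, container="song-client"):
    """Build the docker exec argument list for the manifest step."""
    # Quote each value so paths containing spaces survive the inner `sh -c`.
    inner = " ".join([
        "sing", "manifest",
        "-a", shlex.quote(analysis_id),
        "-f", shlex.quote(manifest_path),  # a file path, not a directory
        "-d", shlex.quote(input_dir),      # directory holding the data files
    ])
    return ["docker", "exec", container, "sh", "-c", inner]
```

Passing the result to subprocess.run as a list avoids a second layer of shell quoting on the host.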
Upload all the files associated with the analysis using the score-client upload command:
docker exec score-client sh -c "score-client upload --manifest manifest.txt"
Once the file(s) upload successfully, you will receive an Upload completed message.
If an upload gets stuck, you can reinitiate it using the --force option.
docker exec score-client sh -c "score-client upload --manifest manifest.txt --force"
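In an automated workflow, the retry can be scripted. The sketch below tries a normal upload first and falls back to --force once. It assumes a failed attempt exits with a non-zero status; an upload that hangs without exiting would need a timeout, which is not handled here. The function name upload_with_retry and the injectable run parameter are hypothetical conveniences for illustration.

```python
import shlex
import subprocess


def upload_with_retry(manifest="manifest.txt", container="score-client",
                      run=subprocess.run):
    """Upload via score-client; retry once with --force if the first try fails.

    Returns the inner command that succeeded.
    """
    base = f"score-client upload --manifest {shlex.quote(manifest)}"
    for inner in (base, base + " --force"):
        result = run(["docker", "exec", container, "sh", "-c", inner])
        # Assumption: a non-zero exit code signals a failed upload.
        if result.returncode == 0:
            return inner
    raise RuntimeError("upload failed even with --force")
```

Keeping the forced retry to a single attempt avoids looping indefinitely when the failure has another cause, such as an expired access token.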
For more information on Score, please see the Score documentation page.
The final step to submitting molecular data is to set the state of an analysis to PUBLISHED. A published analysis signals to the data administrators that this data is ready to be processed by downstream services.
docker exec song-client sh -c "sing publish -a a4142a01-1274-45b4-942a-01127465b422"
Here is the expected response:
AnalysisId a4142a01-1274-45b4-942a-01127465b422 successfully published
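Putting the four steps together, the whole submission can be sketched as one script: submit, manifest, upload, publish, stopping at the first failure. This is an illustration under several assumptions: both containers are already running under the names used in this guide, the submit command prints only the JSON response, and the manifest path is visible at the same location in both containers (for example under the shared /output mount). The helper names sh and submit_and_publish are hypothetical.

```python
import json
import shlex
import subprocess


def sh(container, inner, run=subprocess.run):
    """Run one command inside a container and return its stdout.

    check=True makes subprocess.run raise on a non-zero exit status,
    so later steps never run after a failed one.
    """
    result = run(["docker", "exec", container, "sh", "-c", inner],
                 capture_output=True, text=True, check=True)
    return result.stdout


def submit_and_publish(payload, manifest, input_dir, run=subprocess.run):
    """Submit a payload, upload its files, and publish the analysis."""
    payload, manifest, input_dir = map(shlex.quote, (payload, manifest, input_dir))
    # 1. Submit the metadata payload and read back the analysisId.
    response = json.loads(sh("song-client", f"sing submit -f {payload}", run))
    analysis_id = response["analysisId"]
    # 2. Generate the manifest linking the analysis to the local files.
    sh("song-client", f"sing manifest -a {analysis_id} -f {manifest} -d {input_dir}", run)
    # 3. Upload the files listed in the manifest.
    sh("score-client", f"score-client upload --manifest {manifest}", run)
    # 4. Publish the analysis.
    sh("song-client", f"sing publish -a {analysis_id}", run)
    return analysis_id
```

A call such as submit_and_publish("/output/example-payload.json", "/output/manifest.txt", "/output") would return the analysisId of the newly published analysis.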
A published analysis will now be searchable in Song. In the next section, we will outline how to search for data in Song.