This example data set was generated by training PBSIM3 (https://github.com/yukiteruono/pbsim3, v3.0.5) on real reads from one of our runs and then simulating Lassa virus (LASV) reads mixed with some host and bacterial reads. It can be used to test that ViMOP works as expected. The resulting report should look like this: https://opr.bnitm.de/example_data/report_example_data.html

The process was as follows:
1. Run ViMOP on a real analysis run on one of our field samples
2. Extract all the reads that were used to generate the found target virus consensus sequences
3. Align those reads back to the consensus sequences with minimap2 and count the substitutions, insertions and deletions to obtain the difference ratio of the real data. This gave a difference ratio (substitution:insertion:deletion) of 41:23:36
4. Create a sample profile with PBSIM3 using the extracted reads and the target virus consensus sequences from step 2 as sample input and genome reference respectively. The created sample profile can be downloaded from: https://opr.bnitm.de/example_data/sample_profile_demo.tar.gz
5. Use the created sample profile to generate artificial LASV, MS2, Corynebacterium and human reads from reference sequences (NCBI accession numbers are listed in the table below) with the following command (IPython/Jupyter syntax, where "!" runs a shell command):

for organism in ['lasv_l', 'lasv_s', 'ms2', 'corynebacterium', 'human']:
  prefix = f'simulated_reads/{organism}'
  ref = f'references/{organism}.fasta'

  ! pbsim --strategy wgs \
        --method sample \
        --depth 1000 \
        --prefix "$prefix" \
        --genome "$ref" \
        --sample-profile-id "demo1" \
        --difference-ratio 41:23:36
6. From the simulated reads, subsample a fixed number of reads per reference and mix them into the example data set. The read counts used for this demo data set are listed in the table below.

Ref	Accession		Number of subsampled reads
LASV_L	GU979513.1		1000
LASV_S	GU830839.1		1000
MS2	LC710218.1		800
HUMAN	GCA_000001405.29	2000
CORYNE	LN831026.1		200
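
The difference ratio from step 3 can be derived from the CIGAR strings of the alignments. The following is a minimal sketch, not the script that was used for this data set. It assumes minimap2 was run with `--eqx` so that mismatches appear as `X` in the CIGAR string, and it counts affected bases rather than events. The helper name `difference_ratio` is hypothetical.

```python
import re

# Matches one CIGAR operation, e.g. "12=" or "3I".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def difference_ratio(cigars):
    """Return (substitution, insertion, deletion) percentages.

    Assumes extended CIGAR strings (minimap2 --eqx), where "X" marks
    mismatched bases, "I" inserted bases and "D" deleted bases.
    """
    counts = {"X": 0, "I": 0, "D": 0}
    for cigar in cigars:
        for length, op in CIGAR_OP.findall(cigar):
            if op in counts:
                counts[op] += int(length)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no substitutions, insertions or deletions found")
    # Each share is rounded separately, so the three values may not sum to exactly 100.
    return tuple(round(100 * counts[op] / total) for op in ("X", "I", "D"))


# Two toy alignments: 2 mismatched, 2 inserted and 1 deleted base in total.
ratio = difference_ratio(["10=1X5=2I3=1D", "4=1X"])
print(":".join(str(value) for value in ratio))  # prints 40:40:20
```

The colon-separated result has the same substitution:insertion:deletion order as the value passed to `--difference-ratio` in step 5.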

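The subsampling in step 6 could be done along the following lines. This is a sketch under assumptions, not the procedure actually used: it assumes uncompressed FASTQ with exactly four lines per record, uses reservoir sampling so that large read sets are not loaded into memory at once, and the function names `read_fastq` and `subsample` are hypothetical.

```python
import io
import random


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) tuples from a FASTQ handle."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # an empty header line means end of file
            return
        yield tuple(record)


def subsample(records, n, seed=0):
    """Pick n records uniformly at random without replacement (reservoir sampling)."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    reservoir = []
    for index, record in enumerate(records):
        if index < n:
            reservoir.append(record)
        else:
            # Replace a kept record with probability n / (index + 1).
            slot = rng.randint(0, index)
            if slot < n:
                reservoir[slot] = record
    if len(reservoir) < n:
        raise ValueError(f"requested {n} reads but only {len(reservoir)} available")
    return reservoir


# Toy input with 10 reads; real input would be an open FASTQ file.
demo = io.StringIO("".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(10)))
picked = subsample(read_fastq(demo), 3, seed=42)
print(len(picked))  # prints 3
```

Calling `subsample` once per reference with the read counts from the table, and concatenating the results, would give a mixed data set of the same composition.
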
Running ViMOP on this example data should take between 5 and 10 minutes, depending on the load on your system.