FOMC Service Report

16S rRNA Gene V1V3 Amplicon Sequencing

Version V1.50

Version History

The Forsyth Institute, Cambridge, MA, USA
June 03, 2025

Project ID: 20250529_HAKU_6


I. Project Summary

Project 20250529_HAKU_6 services include NGS sequencing of V1V3-region amplicons of the 16S rRNA gene from the submitted samples. First and foremost, please download this report and the raw sequence data using the download links provided below. These links will expire after 60 days, and we cannot guarantee the availability of your data after that.

Full bioinformatics analysis service was requested. We provide many analyses, starting with raw sequence quality and noise filtering, paired-read merging, and chimera filtering, using the DADA2 denoising algorithm and pipeline.

We also provide many downstream analyses such as taxonomy assignment, alpha and beta diversity analyses, and differential abundance analysis.

For taxonomy assignment, the most informative output is the taxonomy bar plots. We provide interactive bar plots that show the relative abundance of microbes at the taxonomy level of your choice (from phylum to species).

If you specify which groups of samples you want to compare for differential abundance, we provide both ANCOM and LEfSe differential abundance analyses.

 

II. Workflow Checklist

1. Sample Received
2. Sample Quality Evaluated
3. Sample Prepared for Sequencing
4. Next-Gen Sequencing
5. Sequence Quality Check
6. Absolute Abundance
7. Report and Raw Sequence Data Available for Download
8. Bioinformatics Analysis - Reads Processing (DADA2 Quality Trimming, Denoising, Paired Reads Merging)
9. Bioinformatics Analysis - Reads Taxonomy Assignment
10. Bioinformatics Analysis - Alpha Diversity Analysis
11. Bioinformatics Analysis - Beta Diversity Analysis
12. Bioinformatics Analysis - Differential Abundance Analysis
13. Bioinformatics Analysis - Heatmap Profile
14. Bioinformatics Analysis - Network Association
 

III. NGS Sequencing

The samples were processed and analyzed with the ZymoBIOMICS® Service: Targeted Metagenomic Sequencing (Zymo Research, Irvine, CA).

DNA Extraction: If DNA extraction was performed, the following DNA extraction kit was used according to the manufacturer’s instructions:

ZymoBIOMICS®-96 MagBead DNA Kit (Zymo Research, Irvine, CA)
N/A (DNA Extraction Not Performed)
Elution Volume: 50µL
Additional Notes: NA

Targeted Library Preparation: The DNA samples were prepared for targeted sequencing with the Quick-16S™ NGS Library Prep Kit (Zymo Research, Irvine, CA). These primers were custom designed by Zymo Research to provide the best coverage of the 16S gene while maintaining high sensitivity. The primer sets used in this project are marked below:

Quick-16S™ Primer Set V1-V2 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V1-V3 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V3-V4 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V4 (Zymo Research, Irvine, CA)
Quick-16S™ Primer Set V6-V8 (Zymo Research, Irvine, CA)
Additional Notes: NA

The sequencing library was prepared using an innovative library preparation process in which PCR reactions were performed in real-time PCR machines to control cycles and therefore limit PCR chimera formation. The final PCR products were quantified with qPCR fluorescence readings and pooled together based on equal molarity. The final pooled library was cleaned up with the Select-a-Size DNA Clean & Concentrator™ (Zymo Research, Irvine, CA), then quantified with TapeStation® (Agilent Technologies, Santa Clara, CA) and Qubit® (Thermo Fisher Scientific, Waltham, WA).

Control Samples: The ZymoBIOMICS® Microbial Community Standard (Zymo Research, Irvine, CA) was used as a positive control for each DNA extraction, if performed. The ZymoBIOMICS® Microbial Community DNA Standard (Zymo Research, Irvine, CA) was used as a positive control for each targeted library preparation. Negative controls (i.e. blank extraction control, blank library preparation control) were included to assess the level of bioburden carried by the wet-lab process.

Sequencing: The final library was sequenced on Illumina® NextSeq 2000™ with a P1 reagent kit (600 cycles) (Illumina, San Diego, CA). The sequencing was performed with 25% PhiX spike-in.

Absolute Abundance Quantification*: A quantitative real-time PCR was set up with a standard curve. The standard curve was made with plasmid DNA containing one copy of the 16S gene and one copy of the fungal ITS2 region prepared in 10-fold serial dilutions. The primers used were the same as those used in Targeted Library Preparation. The equation generated by the plasmid DNA standard curve was used to calculate the number of gene copies in the reaction for each sample. The PCR input volume (2 µl) was used to calculate the number of gene copies per microliter in each DNA sample.
The number of genome copies per microliter DNA sample was calculated by dividing the gene copy number by an assumed number of gene copies per genome. The value used for 16S copies per genome is 4. The value used for ITS copies per genome is 200. The amount of DNA per microliter DNA sample was calculated using an assumed genome size of 4.64 × 10⁶ bp, the genome size of Escherichia coli, for 16S samples, or an assumed genome size of 1.20 × 10⁷ bp, the genome size of Saccharomyces cerevisiae, for ITS samples. This calculation is shown below:

Calculated Total DNA = Calculated Total Genome Copies × Assumed Genome Size (4.64 × 10⁶ bp) ×
Average Molecular Weight of a DNA bp (660 g/mole/bp) ÷ Avogadro’s Number (6.022 × 10²³/mole)
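
As a worked illustration of the calculation above, the following Python sketch converts a qPCR gene-copy count into genome copies and DNA mass for a 16S sample. The constants are taken from the text; the function names and the example input of 2,000,000 gene copies per reaction are hypothetical and are not part of the FOMC or Zymo Research pipeline.

```python
AVOGADRO = 6.022e23            # molecules per mole
BP_MOLAR_MASS = 660.0          # g/mole per DNA base pair
COPIES_PER_GENOME_16S = 4      # assumed 16S copies per genome (from the text)
GENOME_SIZE_16S_BP = 4.64e6    # assumed genome size, E. coli (from the text)
PCR_INPUT_VOLUME_UL = 2.0      # PCR input volume in microliters (from the text)


def gene_copies_per_ul(copies_in_reaction, input_volume_ul=PCR_INPUT_VOLUME_UL):
    """Gene copies per microliter of the DNA sample."""
    return copies_in_reaction / input_volume_ul


def genome_copies(gene_copies, copies_per_genome=COPIES_PER_GENOME_16S):
    """Genome copies, assuming a fixed number of gene copies per genome."""
    return gene_copies / copies_per_genome


def total_dna_grams(genomes, genome_size_bp=GENOME_SIZE_16S_BP):
    """DNA mass in grams: genomes x genome size x 660 g/mole/bp / Avogadro."""
    return genomes * genome_size_bp * BP_MOLAR_MASS / AVOGADRO


# Hypothetical example: the standard curve reports 2,000,000 16S copies
# in one qPCR reaction.
copies_per_ul = gene_copies_per_ul(2_000_000)
genomes_per_ul = genome_copies(copies_per_ul)
dna_ng_per_ul = total_dna_grams(genomes_per_ul) * 1e9  # grams to nanograms

print(f"{genomes_per_ul:.0f} genome copies/uL, {dna_ng_per_ul:.3f} ng DNA/uL")
```

With these hypothetical inputs the sketch gives 250,000 genome copies and roughly 1.27 ng of DNA per microliter. For ITS samples, the same functions would be called with 200 copies per genome and a genome size of 1.20 × 10⁷ bp.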


* Absolute Abundance Quantification is only available for 16S and ITS analyses.

The absolute abundance standard curve data can be viewed in Excel here:

The absolute abundance standard curve is shown below:

Absolute Abundance Standard Curve

 

IV. Complete Report Download

The complete report of your project, including all links in this report, can be downloaded by clicking the link provided below. The downloaded file is a compressed ZIP file. Once it is unzipped, open the file “REPORT.html” (it may be shown only as "REPORT" on your computer) by double-clicking it. Your default web browser will open it and you will see the exact content of this report.

Please download and save the file to your computer storage device. The download link will expire 60 days after you receive this report.

Complete report download link:

To view the report, please follow these steps:

1. Download the .zip file from the report link above.
2. Extract all the contents of the downloaded .zip file to your desktop.
3. Open the extracted folder and find the "REPORT.html" file (it may be shown only as "REPORT").
4. Open the REPORT.html file by double-clicking it. Your default browser will open the top page of the complete report. Within the report, there are links to view all the analyses performed for the project.

 

V. Raw Sequence Data Download

The raw NGS sequence data is available for download with the link provided below. The data is a compressed ZIP file and can be unzipped into individual sequence files. Since this is paired-end sequencing, each of your samples is represented by two sequence files: one for Read 1, with the file extension “*_R1.fastq.gz”, and another for Read 2, with the file extension “*_R2.fastq.gz”. The files are in FASTQ format and are compressed. FASTQ is a text-based format for storing both a biological sequence and its corresponding quality scores. Most sequence analysis software will be able to open them. The Sample IDs associated with the R1 and R2 fastq files are listed in the table below:

Sample ID | Original Sample ID | Read 1 File Name | Read 2 File Name
F19244.S10 |  | zr19244_10V1V3_R1.fastq.gz | zr19244_10V1V3_R2.fastq.gz
F19244.S11 |  | zr19244_11V1V3_R1.fastq.gz | zr19244_11V1V3_R2.fastq.gz
F19244.S12 |  | zr19244_12V1V3_R1.fastq.gz | zr19244_12V1V3_R2.fastq.gz
F19244.S13 |  | zr19244_13V1V3_R1.fastq.gz | zr19244_13V1V3_R2.fastq.gz
F19244.S14 |  | zr19244_14V1V3_R1.fastq.gz | zr19244_14V1V3_R2.fastq.gz
F19244.S15 |  | zr19244_15V1V3_R1.fastq.gz | zr19244_15V1V3_R2.fastq.gz
F19244.S16 |  | zr19244_16V1V3_R1.fastq.gz | zr19244_16V1V3_R2.fastq.gz
F19244.S17 |  | zr19244_17V1V3_R1.fastq.gz | zr19244_17V1V3_R2.fastq.gz
F19244.S18 |  | zr19244_18V1V3_R1.fastq.gz | zr19244_18V1V3_R2.fastq.gz
F19244.S19 |  | zr19244_19V1V3_R1.fastq.gz | zr19244_19V1V3_R2.fastq.gz
F19244.S01 |  | zr19244_1V1V3_R1.fastq.gz | zr19244_1V1V3_R2.fastq.gz
F19244.S20 |  | zr19244_20V1V3_R1.fastq.gz | zr19244_20V1V3_R2.fastq.gz
F19244.S21 |  | zr19244_21V1V3_R1.fastq.gz | zr19244_21V1V3_R2.fastq.gz
F19244.S22 |  | zr19244_22V1V3_R1.fastq.gz | zr19244_22V1V3_R2.fastq.gz
F19244.S23 |  | zr19244_23V1V3_R1.fastq.gz | zr19244_23V1V3_R2.fastq.gz
F19244.S24 |  | zr19244_24V1V3_R1.fastq.gz | zr19244_24V1V3_R2.fastq.gz
F19244.S25 |  | zr19244_25V1V3_R1.fastq.gz | zr19244_25V1V3_R2.fastq.gz
F19244.S26 |  | zr19244_26V1V3_R1.fastq.gz | zr19244_26V1V3_R2.fastq.gz
F19244.S27 |  | zr19244_27V1V3_R1.fastq.gz | zr19244_27V1V3_R2.fastq.gz
F19244.S28 |  | zr19244_28V1V3_R1.fastq.gz | zr19244_28V1V3_R2.fastq.gz
F19244.S29 |  | zr19244_29V1V3_R1.fastq.gz | zr19244_29V1V3_R2.fastq.gz
F19244.S02 |  | zr19244_2V1V3_R1.fastq.gz | zr19244_2V1V3_R2.fastq.gz
F19244.S30 |  | zr19244_30V1V3_R1.fastq.gz | zr19244_30V1V3_R2.fastq.gz
F19244.S31 |  | zr19244_31V1V3_R1.fastq.gz | zr19244_31V1V3_R2.fastq.gz
F19244.S32 |  | zr19244_32V1V3_R1.fastq.gz | zr19244_32V1V3_R2.fastq.gz
F19244.S33 |  | zr19244_33V1V3_R1.fastq.gz | zr19244_33V1V3_R2.fastq.gz
F19244.S34 |  | zr19244_34V1V3_R1.fastq.gz | zr19244_34V1V3_R2.fastq.gz
F19244.S35 |  | zr19244_35V1V3_R1.fastq.gz | zr19244_35V1V3_R2.fastq.gz
F19244.S36 |  | zr19244_36V1V3_R1.fastq.gz | zr19244_36V1V3_R2.fastq.gz
F19244.S37 |  | zr19244_37V1V3_R1.fastq.gz | zr19244_37V1V3_R2.fastq.gz
F19244.S38 |  | zr19244_38V1V3_R1.fastq.gz | zr19244_38V1V3_R2.fastq.gz
F19244.S39 |  | zr19244_39V1V3_R1.fastq.gz | zr19244_39V1V3_R2.fastq.gz
F19244.S03 |  | zr19244_3V1V3_R1.fastq.gz | zr19244_3V1V3_R2.fastq.gz
F19244.S40 |  | zr19244_40V1V3_R1.fastq.gz | zr19244_40V1V3_R2.fastq.gz
F19244.S41 |  | zr19244_41V1V3_R1.fastq.gz | zr19244_41V1V3_R2.fastq.gz
F19244.S42 |  | zr19244_42V1V3_R1.fastq.gz | zr19244_42V1V3_R2.fastq.gz
F19244.S43 |  | zr19244_43V1V3_R1.fastq.gz | zr19244_43V1V3_R2.fastq.gz
F19244.S44 |  | zr19244_44V1V3_R1.fastq.gz | zr19244_44V1V3_R2.fastq.gz
F19244.S45 |  | zr19244_45V1V3_R1.fastq.gz | zr19244_45V1V3_R2.fastq.gz
F19244.S46 |  | zr19244_46V1V3_R1.fastq.gz | zr19244_46V1V3_R2.fastq.gz
F19244.S47 |  | zr19244_47V1V3_R1.fastq.gz | zr19244_47V1V3_R2.fastq.gz
F19244.S48 |  | zr19244_48V1V3_R1.fastq.gz | zr19244_48V1V3_R2.fastq.gz
F19244.S04 |  | zr19244_4V1V3_R1.fastq.gz | zr19244_4V1V3_R2.fastq.gz
F19244.S05 |  | zr19244_5V1V3_R1.fastq.gz | zr19244_5V1V3_R2.fastq.gz
F19244.S06 |  | zr19244_6V1V3_R1.fastq.gz | zr19244_6V1V3_R2.fastq.gz
F19244.S07 |  | zr19244_7V1V3_R1.fastq.gz | zr19244_7V1V3_R2.fastq.gz
F19244.S08 |  | zr19244_8V1V3_R1.fastq.gz | zr19244_8V1V3_R2.fastq.gz
F19244.S09 |  | zr19244_9V1V3_R1.fastq.gz | zr19244_9V1V3_R2.fastq.gz
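
After unzipping, it can be useful to confirm that every sample has both read files. The Python sketch below groups file names by the R1/R2 naming convention described above and reports any sample with a missing mate. The function name `pair_fastq_files` and the short example list are hypothetical; in practice the list would come from the unzipped folder.

```python
import re
from collections import defaultdict

# Matches names such as "zr19244_10V1V3_R1.fastq.gz".
READ_FILE_RE = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz$")


def pair_fastq_files(filenames):
    """Group file names into (R1, R2) pairs keyed by the shared prefix.

    Returns (complete, incomplete): a dict of prefix -> (R1, R2) and a
    sorted list of prefixes for which one of the two files is missing.
    """
    found = defaultdict(dict)
    for name in filenames:
        match = READ_FILE_RE.match(name)
        if match is None:
            continue  # not a read file, ignore it
        found[match.group("sample")]["R" + match.group("read")] = name

    complete = {
        sample: (reads["R1"], reads["R2"])
        for sample, reads in found.items()
        if "R1" in reads and "R2" in reads
    }
    incomplete = sorted(s for s, reads in found.items() if len(reads) < 2)
    return complete, incomplete


# Hypothetical listing in which the R2 file of one sample is missing.
example_files = [
    "zr19244_10V1V3_R1.fastq.gz",
    "zr19244_10V1V3_R2.fastq.gz",
    "zr19244_11V1V3_R1.fastq.gz",
]
complete, incomplete = pair_fastq_files(example_files)
print("paired:", sorted(complete))
print("missing a mate:", incomplete)
```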

Please download and save the file to your computer storage device. The download link will expire 60 days after you receive this report.

Raw sequence data download link:

 

VI. Analysis - DADA2 Read Processing

What is DADA2?

DADA2 is a software package that models and corrects Illumina-sequenced amplicon errors [1]. DADA2 infers sample sequences exactly, without coarse-graining into OTUs, and resolves differences of as little as one nucleotide. DADA2 identified more real variants and output fewer spurious sequences than other methods.

DADA2’s advantage is that it uses more of the data. The DADA2 error model incorporates quality information, which is ignored by all other methods after filtering. The DADA2 error model incorporates quantitative abundances, whereas most other methods use abundance ranks if they use abundance at all. The DADA2 error model identifies the differences between sequences, e.g., A->C, whereas other methods merely count the mismatches. DADA2 can parameterize its error model from the data itself, rather than relying on previous datasets that may or may not reflect the PCR and sequencing protocols used in your study.

The DADA2 software package is available as an R package at: https://benjjneb.github.io/dada2/index.html

References

  1. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods. 2016 Jul;13(7):581-3. doi: 10.1038/nmeth.3869. Epub 2016 May 23. PMID: 27214047; PMCID: PMC4927377.

Analysis Procedures:

The DADA2 pipeline includes several tools for read quality control, including quality filtering, trimming, denoising, pair merging and chimera filtering. Below are the major processing steps of DADA2:

Step 1. Read trimming based on sequence quality. The quality of Illumina NGS reads often decreases toward the end of the reads. DADA2 allows the poor-quality read ends to be trimmed off in order to improve error model building and pair-merging performance.

Step 2. Learn the Error Rates The DADA2 algorithm makes use of a parametric error model (err) and every amplicon dataset has a different set of error rates. The learnErrors method learns this error model from the data, by alternating estimation of the error rates and inference of sample composition until they converge on a jointly consistent solution. As in many machine-learning problems, the algorithm must begin with an initial guess, for which the maximum possible error rates in this data are used (the error rates if only the most abundant sequence is correct and all the rest are errors).

Step 3. Infer amplicon sequence variants (ASVs) based on the error model built in the previous step. This step is also called sequence "denoising". The outcome of this step is a list of ASVs that are the equivalent of oligonucleotides.

Step 4. Merge paired reads. If the sequencing products are read pairs, DADA2 will merge the R1 and R2 ASVs into single sequences. Merging is performed by aligning the denoised forward reads with the reverse-complement of the corresponding denoised reverse reads, and then constructing the merged “contig” sequences. By default, merged sequences are only output if the forward and reverse reads overlap by at least 12 bases, and are identical to each other in the overlap region (but these conditions can be changed via function arguments).
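
The merging rule in Step 4 can be sketched in a few lines of Python. This is a simplified illustration, not DADA2's implementation: DADA2 aligns the reads, whereas the sketch only looks for the longest exact match between the end of the forward read and the start of the reverse-complemented reverse read. The function names and the 40-base example amplicon are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=12):
    """Merge a denoised read pair, or return None if it cannot be merged.

    The pair is merged only if the end of the forward read and the start
    of the reverse-complemented reverse read are identical over at least
    min_overlap bases. The longest such overlap is used.
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == rc[:overlap]:
            return forward + rc[overlap:]
    return None


# Hypothetical 40-base amplicon read from both ends.
amplicon = "ACGTTGCAAGGCTTAACCGGATCGATTACGCTAGGCTACA"
forward_read = amplicon[:28]                       # first 28 bases
reverse_read = reverse_complement(amplicon[12:])   # last 28 bases, other strand

merged = merge_pair(forward_read, reverse_read)    # reads overlap by 16 bases
print(merged == amplicon)

# With only a 4-base overlap the pair is rejected.
too_short = merge_pair(amplicon[:22], reverse_complement(amplicon[18:]))
print(too_short)
```

In this example the first pair overlaps by 16 bases and is merged back into the full amplicon, while the second pair overlaps by only 4 bases and is discarded.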

Step 5. Remove chimeras. The core dada method corrects substitution and indel errors, but chimeras remain. Fortunately, the accuracy of sequence variants after denoising makes identifying chimeric ASVs simpler than when dealing with fuzzy OTUs. Chimeric sequences are identified if they can be exactly reconstructed by combining a left-segment and a right-segment from two more abundant “parent” sequences. The frequency of chimeric sequences varies substantially from dataset to dataset, and depends on factors including experimental procedures and sample complexity.
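
The chimera rule in Step 5 can be illustrated with a small Python sketch. It assumes that all sequences start and end at the same amplicon positions, so a plain prefix/suffix comparison is enough; DADA2 itself works on aligned sequences. The function names and the toy sequences with their read counts are hypothetical.

```python
def is_bimera(seq, parents):
    """True if seq is a left part of one parent joined to a right part of another."""
    for i, left_parent in enumerate(parents):
        for j, right_parent in enumerate(parents):
            if i == j:
                continue  # the two segments must come from different parents
            for cut in range(1, len(seq)):
                if left_parent.startswith(seq[:cut]) and right_parent.endswith(seq[cut:]):
                    return True
    return False


def flag_chimeras(abundance):
    """Return the set of sequences that two more abundant sequences can rebuild.

    abundance maps each ASV sequence to its read count.
    """
    chimeras = set()
    for seq, count in abundance.items():
        parents = [p for p, n in abundance.items() if n > count]
        if is_bimera(seq, parents):
            chimeras.add(seq)
    return chimeras


# Toy data: the third sequence is the left half of the first joined to
# the right half of the second, at a much lower abundance.
counts = {
    "AAAAAAAACCCCCCCC": 100,
    "GGGGGGGGTTTTTTTT": 80,
    "AAAAAAAATTTTTTTT": 5,
    "ACGTACGTACGTACGT": 3,
}
print(flag_chimeras(counts))
```

Only sequences with strictly higher read counts are considered as parents, which mirrors the "more abundant parent" condition in the text.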

Results

1. Read Quality Plots. NGS sequence analysis starts with visualizing the quality of the sequencing. Below are the quality plots of the first sample for the R1 and R2 reads separately. In gray-scale is a heat map of the frequency of each quality score at each base position. The mean quality score at each position is shown by the green line, and the quartiles of the quality score distribution by the orange lines. The forward reads are usually of better quality. It is a common practice to trim the last few nucleotides to avoid less well-controlled errors that can arise there. The trimming affects the downstream steps, including error model building, merging and chimera calling. FOMC uses an empirical approach that tests many combinations of trim lengths in order to obtain the best final amplicon sequence variants (ASVs); see the next section, “Optimal trim length for ASVs”.

Quality plots for all samples:

2. Optimal trim length for ASVs. The final number of merged and chimera-filtered ASVs depends on the quality filtering (and hence trimming) at the very beginning of the DADA2 pipeline. To achieve the highest number of ASVs, an empirical approach was used:

  1. Create a random subset of each sample consisting of 5,000 R1 and 5,000 R2 (to reduce computation time)
  2. Trim 10 bases at a time from the ends of both R1 and R2 up to 50 bases
  3. For each combination of trimmed lengths (e.g., 300x300, 300x290, 290x290, etc.), the trimmed reads are run through the entire DADA2 pipeline to obtain chimera-filtered merged ASVs
  4. The combination with the highest percentage of input reads becoming final ASVs is selected for the complete set of data

Below is the result of such operation, showing ASV percentages of total reads for all trimming combinations (1st Column = R1 lengths in bases; 1st Row = R2 lengths in bases):

R1/R2 | 281 | 271 | 261 | 251 | 241 | 231
321 | 81.76% | 81.93% | 82.16% | 82.45% | 82.46% | 75.85%
311 | 81.72% | 81.98% | 82.22% | 82.16% | 76.01% | 57.02%
301 | 81.80% | 82.07% | 81.93% | 75.68% | 57.11% | 34.02%
291 | 81.84% | 81.70% | 75.31% | 56.71% | 34.07% | 23.14%
281 | 81.58% | 75.16% | 56.54% | 33.76% | 23.04% | 9.80%
271 | 75.19% | 56.67% | 33.74% | 22.88% | 9.76% | 3.66%

Based on the above result, the trim length combination of R1 = 321 bases and R2 = 241 bases (82.46% in the table above) was chosen for generating the final ASVs for all sequences. This combination yielded the highest percentage of input reads retained as merged, non-chimeric ASVs and was used for downstream analyses, if requested.
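
The selection in step 4 of the procedure amounts to taking the maximum over the grid of percentages. A minimal Python sketch is given below, with the values copied from the table above; the variable and function names are hypothetical.

```python
# ASV yield (% of input reads) for each trim combination, from the table above.
R2_LENGTHS = [281, 271, 261, 251, 241, 231]
ASV_PERCENT = {
    321: [81.76, 81.93, 82.16, 82.45, 82.46, 75.85],
    311: [81.72, 81.98, 82.22, 82.16, 76.01, 57.02],
    301: [81.80, 82.07, 81.93, 75.68, 57.11, 34.02],
    291: [81.84, 81.70, 75.31, 56.71, 34.07, 23.14],
    281: [81.58, 75.16, 56.54, 33.76, 23.04, 9.80],
    271: [75.19, 56.67, 33.74, 22.88, 9.76, 3.66],
}


def best_trim_combination(table, r2_lengths):
    """Return (r1_length, r2_length, percent) for the highest ASV yield."""
    candidates = [
        (r1, r2, pct)
        for r1, row in table.items()
        for r2, pct in zip(r2_lengths, row)
    ]
    return max(candidates, key=lambda item: item[2])


best = best_trim_combination(ASV_PERCENT, R2_LENGTHS)
print(best)
```

For this table the function returns R1 = 321 and R2 = 241 at 82.46%, the combination reported above.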

3. Error plots from learning the error rates. After DADA2 builds the error model for the data set, it is always worthwhile, as a sanity check if nothing else, to visualize the estimated error rates. The error rates for each possible transition (A→C, A→G, …) are shown below. Points are the observed error rates for each consensus quality score. The black line shows the estimated error rates after convergence of the machine-learning algorithm. The red line shows the error rates expected under the nominal definition of the Q-score. The ideal result is that the estimated error rates (black line) fit the observed rates (points) well and that the error rates drop with increasing quality, as expected.

Forward Read R1 Error Plot


Reverse Read R2 Error Plot

PDF versions of these plots are available here:
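
For reference, the nominal definition of the Q-score behind the red line is Q = -10 × log10(p), where p is the probability that a base call is wrong. The short Python sketch below evaluates this relationship for a few quality scores; the function name is hypothetical.

```python
def nominal_error_rate(q_score):
    """Error probability implied by a Phred quality score: p = 10^(-Q/10)."""
    return 10 ** (-q_score / 10)


for q in (10, 20, 30, 40):
    print(f"Q{q}: expected error rate {nominal_error_rate(q):.4%}")
```

A score of Q20 corresponds to one expected error in 100 base calls and Q30 to one in 1,000, which is why the red line falls steadily as the quality score increases.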

 

4. DADA2 Result Summary. The table below summarizes the DADA2 analysis, tracking the paired read counts of each sample through all steps of the DADA2 denoising process, including end-trimming (filtered), denoising (denoisedF, denoisedR), pair merging (merged) and chimera removal (nonchim).

Sample ID | input | filtered | denoisedF | denoisedR | merged | nonchim
F19244.S01 | 82,973 | 82,972 | 82,195 | 81,530 | 75,906 | 71,896
F19244.S02 | 120,857 | 120,856 | 119,717 | 119,426 | 108,057 | 103,162
F19244.S03 | 103,275 | 103,273 | 102,246 | 102,133 | 96,032 | 91,317
F19244.S04 | 81,012 | 81,011 | 79,932 | 79,849 | 74,828 | 71,263
F19244.S05 | 87,639 | 87,637 | 86,936 | 86,826 | 81,488 | 77,128
F19244.S06 | 87,539 | 87,537 | 86,871 | 86,524 | 82,202 | 73,270
F19244.S07 | 97,033 | 97,032 | 96,293 | 96,014 | 91,414 | 82,808
F19244.S08 | 87,960 | 87,959 | 87,461 | 87,339 | 83,138 | 70,491
F19244.S09 | 66,123 | 66,122 | 65,558 | 65,452 | 61,428 | 49,131
F19244.S10 | 65,193 | 65,193 | 64,839 | 64,845 | 62,916 | 46,184
F19244.S11 | 79,144 | 79,142 | 78,413 | 78,507 | 73,732 | 64,908
F19244.S12 | 88,006 | 88,005 | 87,426 | 87,421 | 83,129 | 71,162
F19244.S13 | 100,012 | 100,009 | 98,831 | 98,732 | 92,941 | 89,852
F19244.S14 | 69,994 | 69,994 | 69,171 | 68,595 | 63,353 | 59,028
F19244.S15 | 91,117 | 91,115 | 90,078 | 89,913 | 82,913 | 76,346
F19244.S16 | 84,811 | 84,811 | 84,168 | 84,215 | 80,046 | 66,417
F19244.S17 | 79,047 | 79,045 | 78,152 | 77,990 | 72,559 | 69,661
F19244.S18 | 72,552 | 72,552 | 71,675 | 71,382 | 66,151 | 63,025
F19244.S19 | 68,839 | 68,838 | 68,193 | 67,711 | 62,257 | 55,576
F19244.S20 | 89,444 | 89,443 | 88,852 | 88,705 | 84,298 | 76,102
F19244.S21 | 125,663 | 125,662 | 124,710 | 124,297 | 118,027 | 112,555
F19244.S22 | 107,703 | 107,702 | 107,031 | 106,981 | 102,953 | 95,353
F19244.S23 | 106,631 | 106,630 | 105,620 | 105,695 | 100,605 | 87,755
F19244.S24 | 89,197 | 89,196 | 88,485 | 88,361 | 83,465 | 74,642
F19244.S25 | 64,072 | 64,072 | 63,283 | 63,350 | 59,147 | 55,314
F19244.S26 | 86,349 | 86,346 | 85,488 | 85,337 | 80,520 | 74,164
F19244.S27 | 78,945 | 78,944 | 78,128 | 77,888 | 71,822 | 63,542
F19244.S28 | 103,483 | 103,482 | 102,911 | 102,905 | 99,143 | 84,926
F19244.S29 | 101,730 | 101,730 | 100,633 | 100,207 | 93,247 | 88,796
F19244.S30 | 112,372 | 112,371 | 111,461 | 111,311 | 105,686 | 102,226
F19244.S31 | 78,860 | 78,859 | 77,721 | 77,153 | 70,156 | 65,991
F19244.S32 | 97,783 | 97,782 | 96,565 | 96,209 | 89,527 | 85,641
F19244.S33 | 75,190 | 75,190 | 74,594 | 74,113 | 68,330 | 63,762
F19244.S34 | 74,828 | 74,828 | 74,164 | 74,034 | 69,452 | 60,257
F19244.S35 | 70,343 | 70,341 | 69,780 | 69,574 | 64,972 | 55,266
F19244.S36 | 80,436 | 80,436 | 80,085 | 79,807 | 76,200 | 71,173
F19244.S37 | 77,094 | 77,093 | 75,988 | 75,671 | 68,781 | 62,882
F19244.S38 | 77,419 | 77,418 | 76,300 | 76,000 | 69,278 | 61,327
F19244.S39 | 82,486 | 82,486 | 81,808 | 81,309 | 75,150 | 71,226
F19244.S40 | 78,902 | 78,902 | 78,260 | 77,985 | 73,324 | 66,646
F19244.S41 | 101,625 | 101,625 | 100,683 | 100,585 | 94,056 | 88,433
F19244.S42 | 73,130 | 73,129 | 72,597 | 72,377 | 67,038 | 62,727
F19244.S43 | 88,898 | 88,894 | 87,962 | 87,821 | 82,998 | 80,522
F19244.S44 | 87,682 | 87,679 | 86,531 | 86,147 | 79,957 | 76,318
F19244.S45 | 108,654 | 108,652 | 107,653 | 107,068 | 99,764 | 91,145
F19244.S46 | 107,807 | 107,806 | 107,087 | 106,929 | 101,503 | 92,138
F19244.S47 | 82,578 | 82,576 | 81,580 | 81,142 | 74,309 | 69,576
F19244.S48 | 75,853 | 75,852 | 74,919 | 74,614 | 69,213 | 64,892
Total | 4,198,283 | 4,198,229 | 4,159,034 | 4,147,979 | 3,887,411 | 3,557,922
Percentage of input | 100.00% | 100.00% | 99.07% | 98.80% | 92.60% | 84.75%

This table can be downloaded as an Excel table below:
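
The percentages in the summary table are each step's total read count divided by the total input read count. The Python sketch below reproduces them from the totals reported in the table; the variable and function names are hypothetical.

```python
# Total paired read counts across all samples, from the summary table.
TOTAL_READS = {
    "input": 4_198_283,
    "filtered": 4_198_229,
    "denoisedF": 4_159_034,
    "denoisedR": 4_147_979,
    "merged": 3_887_411,
    "nonchim": 3_557_922,
}


def percent_retained(totals, baseline="input"):
    """Percentage of baseline reads remaining after each step (2 decimals)."""
    base = totals[baseline]
    return {step: round(100 * count / base, 2) for step, count in totals.items()}


retained = percent_retained(TOTAL_READS)
for step, pct in retained.items():
    print(f"{step:<10}{pct:7.2f}%")
```

The same function can be applied to the counts of a single sample to see where that sample loses most of its reads, for example at the merging or the chimera-removal step.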

 

5. DADA2 Amplicon Sequence Variants (ASVs). A total of 7002 unique merged and chimera-free ASV sequences were identified. Their read counts for each sample are available in the "ASV Read Count Table", with rows for the ASV sequences and columns for the samples. This read count table can be used to compare microbial profiles among samples, and the sequences provided in the table can be used for taxonomy assignment.

 

The table can be downloaded from this link:

 
 

Sample Meta Information

Download Sample Meta Information
#SampleIDSampleNameGroupGROUP_AGEDRPART_SEXGROUP_SHEDEVERSHEDHIV_STATUSPriortestSEQHIV_SheddingGender_SheddingAge_HIVAge_SheddingAge_HIV_Shedding_Age
HAKU.02057.01HAKU.02057.01ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.02057.04HAKU.02057.04CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.04092.05HAKU.04092.05CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.04287.03HAKU.04287.03CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.05094.01HAKU.05094.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.05791.01HAKU.05791.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.06266.01HAKU.06266.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.06449.01HAKU.06449.01ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.06449.04HAKU.06449.04CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.06601.03HAKU.06601.03CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.06641.01HAKU.06641.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.06912.03HAKU.06912.03ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.06978.01HAKU.06978.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.07031.01HAKU.07031.01ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.07102.01HAKU.07102.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.07104.01HAKU.07104.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.07991.04HAKU.07991.04CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.08146.01HAKU.08146.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.10105.01HAKU.10105.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.10302.01HAKU.10302.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.12128.01HAKU.12128.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.12368.01HAKU.12368.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.12368.03HAKU.12368.03CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.12653.01HAKU.12653.01ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.13032.01HAKU.13032.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.13051.01HAKU.13051.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.13110.01HAKU.13110.01ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.13254.01HAKU.13254.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.13254.02HAKU.13254.02ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.13254.03HAKU.13254.03ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.14131.01HAKU.14131.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.14131.03HAKU.14131.03CHILD-NON-SHEDDERCHILDMaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERMale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.14257.05HAKU.14257.05CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.14257.06HAKU.14257.06CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.15133.01HAKU.15133.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.15133.03HAKU.15133.03CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.15133.05HAKU.15133.05CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.15133.08HAKU.15133.08ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.16063.02HAKU.16063.02ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.16255.01HAKU.16255.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.16255.02HAKU.16255.02ADULT-SHEDDERADULTMaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERMale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.16255.04HAKU.16255.04ADULT-SHEDDERADULTMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.17022.01HAKU.17022.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERFemale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.17022.02HAKU.17022.02CHILD-SHEDDERCHILDMaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERMale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.17043.01HAKU.17043.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.17043.03HAKU.17043.03CHILD-SHEDDERCHILDFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERCHILD negativeCHILD SHEDDERCHILD HIV-Neg SHEDDER
HAKU.17164.01HAKU.17164.01ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.17164.06HAKU.17164.06ADULT-SHEDDERADULTFemaleSHEDDERyespositive2SEQ041HIV-Pos SHEDDERFemale SHEDDERADULT positiveADULT SHEDDERADULT HIV-Pos SHEDDER
HAKU.17251.01HAKU.17251.01ADULT-NON-SHEDDERADULTFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERADULT negativeADULT NON-SHEDDERADULT HIV-Neg NON-SHEDDER
HAKU.17282.03HAKU.17282.03CHILD-NON-SHEDDERCHILDFemaleNON-SHEDDERnonegative1SEQ039HIV-Neg NON-SHEDDERFemale NON-SHEDDERCHILD negativeCHILD NON-SHEDDERCHILD HIV-Neg NON-SHEDDER
HAKU.17364.01HAKU.17364.01ADULT-NON-SHEDDERADULTMaleNON-SHEDDERnopositive2SEQ041HIV-Pos NON-SHEDDERMale NON-SHEDDERADULT positiveADULT NON-SHEDDERADULT HIV-Pos NON-SHEDDER
HAKU.17575.01HAKU.17575.01ADULT-SHEDDERADULTFemaleSHEDDERyesnegative1SEQ039HIV-Neg SHEDDERFemale SHEDDERADULT negativeADULT SHEDDERADULT HIV-Neg SHEDDER
HAKU.17575.08  HAKU.17575.08  ADULT-NON-SHEDDER  ADULT  Male    NON-SHEDDER  no   negative  1  SEQ039  HIV-Neg NON-SHEDDER  Male NON-SHEDDER  ADULT negative  ADULT NON-SHEDDER  ADULT HIV-Neg NON-SHEDDER
HAKU.18078.01  HAKU.18078.01  ADULT-NON-SHEDDER  ADULT  Male    NON-SHEDDER  no   negative  1  SEQ039  HIV-Neg NON-SHEDDER  Male NON-SHEDDER  ADULT negative  ADULT NON-SHEDDER  ADULT HIV-Neg NON-SHEDDER
HAKU.18305.01  HAKU.18305.01  ADULT-NON-SHEDDER  ADULT  Female  NON-SHEDDER  no   positive  2  SEQ041  HIV-Pos NON-SHEDDER  Female NON-SHEDDER  ADULT positive  ADULT NON-SHEDDER  ADULT HIV-Pos NON-SHEDDER
HAKU.19122.01  HAKU.19122.01  ADULT-SHEDDER      ADULT  Female  SHEDDER      yes  positive  2  SEQ041  HIV-Pos SHEDDER  Female SHEDDER  ADULT positive  ADULT SHEDDER  ADULT HIV-Pos SHEDDER
HAKU.19356.04  HAKU.19356.04  CHILD-SHEDDER      CHILD  Male    SHEDDER      yes  negative  1  SEQ039  HIV-Neg SHEDDER  Male SHEDDER  CHILD negative  CHILD SHEDDER  CHILD HIV-Neg SHEDDER
HAKU.19356.05  HAKU.19356.05  ADULT-SHEDDER      ADULT  Male    SHEDDER      yes  negative  1  SEQ039  HIV-Neg SHEDDER  Male SHEDDER  ADULT negative  ADULT SHEDDER  ADULT HIV-Neg SHEDDER
HAKU.21294.01  HAKU.21294.01  ADULT-SHEDDER      ADULT  Female  SHEDDER      yes  positive  2  SEQ041  HIV-Pos SHEDDER  Female SHEDDER  ADULT positive  ADULT SHEDDER  ADULT HIV-Pos SHEDDER
HAKU.21309.01  HAKU.21309.01  ADULT-SHEDDER      ADULT  Male    SHEDDER      yes  negative  1  SEQ039  HIV-Neg SHEDDER  Male SHEDDER  ADULT negative  ADULT SHEDDER  ADULT HIV-Neg SHEDDER
HAKU.22022.01  HAKU.22022.01  ADULT-NON-SHEDDER  ADULT  Male    NON-SHEDDER  no   positive  2  SEQ041  HIV-Pos NON-SHEDDER  Male NON-SHEDDER  ADULT positive  ADULT NON-SHEDDER  ADULT HIV-Pos NON-SHEDDER
HAKU.22420.02  HAKU.22420.02  ADULT-NON-SHEDDER  ADULT  Female  NON-SHEDDER  no   negative  1  SEQ039  HIV-Neg NON-SHEDDER  Female NON-SHEDDER  ADULT negative  ADULT NON-SHEDDER  ADULT HIV-Neg NON-SHEDDER
HAKU.22968.02  HAKU.22968.02  ADULT-SHEDDER      ADULT  Female  SHEDDER      yes  negative  1  SEQ039  HIV-Neg SHEDDER  Female SHEDDER  ADULT negative  ADULT SHEDDER  ADULT HIV-Neg SHEDDER
HAKU.22968.06  HAKU.22968.06  CHILD-SHEDDER      CHILD  Male    SHEDDER      yes  negative  1  SEQ039  HIV-Neg SHEDDER  Male SHEDDER  CHILD negative  CHILD SHEDDER  CHILD HIV-Neg SHEDDER
HAKU.23313.04  HAKU.23313.04  CHILD-NON-SHEDDER  CHILD  Male    NON-SHEDDER  no   negative  1  SEQ039  HIV-Neg NON-SHEDDER  Male NON-SHEDDER  CHILD negative  CHILD NON-SHEDDER  CHILD HIV-Neg NON-SHEDDER
HAKU.24962.03  HAKU.24962.03  CHILD-SHEDDER      CHILD  Male    SHEDDER      yes  negative  1  SEQ039  HIV-Neg SHEDDER  Male SHEDDER  CHILD negative  CHILD SHEDDER  CHILD HIV-Neg SHEDDER
HAKU.25095.01  HAKU.25095.01  ADULT-NON-SHEDDER  ADULT  Male    NON-SHEDDER  no   negative  1  SEQ039  HIV-Neg NON-SHEDDER  Male NON-SHEDDER  ADULT negative  ADULT NON-SHEDDER  ADULT HIV-Neg NON-SHEDDER
HAKU.25284.02  HAKU.25284.02  ADULT-NON-SHEDDER  ADULT  Female  NON-SHEDDER  no   positive  2  SEQ041  HIV-Pos NON-SHEDDER  Female NON-SHEDDER  ADULT positive  ADULT NON-SHEDDER  ADULT HIV-Pos NON-SHEDDER
HAKU.25705.02  HAKU.25705.02  ADULT-SHEDDER      ADULT  Female  SHEDDER      yes  positive  2  SEQ041  HIV-Pos SHEDDER  Female SHEDDER  ADULT positive  ADULT SHEDDER  ADULT HIV-Pos SHEDDER
 
 

ASV Read Counts by Samples

Sample ID      Read Count
HAKU.04287.03  1
HAKU.06449.01  352
HAKU.19356.05  25,795
HAKU.06912.03  28,475
HAKU.17282.03  28,730
HAKU.06601.03  28,949
HAKU.15133.08  30,535
HAKU.07104.01  30,920
HAKU.14257.06  36,669
HAKU.17043.03  38,148
HAKU.17575.08  40,616
HAKU.12368.03  40,676
HAKU.06449.04  41,135
HAKU.07991.04  41,660
HAKU.23313.04  41,706
HAKU.12653.01  42,335
HAKU.15133.01  42,545
HAKU.17043.01  42,710
HAKU.15133.03  42,967
HAKU.22968.02  43,087
HAKU.19356.04  43,706
HAKU.22968.06  43,874
HAKU.12368.01  44,138
HAKU.14131.03  44,184
HAKU.17575.01  44,291
HAKU.25284.02  44,885
HAKU.10105.01  44,984
HAKU.16255.04  45,918
HAKU.14257.05  46,123
HAKU.10302.01  46,218
HAKU.16063.02  47,633
HAKU.13254.03  48,235
HAKU.02057.04  48,322
HAKU.14131.01  48,431
HAKU.22420.02  48,649
HAKU.17164.06  48,690
HAKU.13254.01  48,972
HAKU.21309.01  49,067
HAKU.15133.05  49,692
HAKU.17022.02  50,011
HAKU.25095.01  51,219
HAKU.04092.05  51,685
HAKU.18078.01  51,693
HAKU.24962.03  51,720
HAKU.21294.01  51,796
HAKU.17251.01  52,051
HAKU.13254.02  52,495
HAKU.17164.01  53,230
HAKU.13032.01  53,578
HAKU.17364.01  56,377
HAKU.16255.01  57,409
HAKU.22022.01  59,817
HAKU.12128.01  62,172
HAKU.25705.02  63,606
HAKU.06978.01  63,770
HAKU.18305.01  64,705
HAKU.08146.01  65,095
HAKU.16255.02  67,126
HAKU.13051.01  67,808
HAKU.19122.01  70,404
HAKU.05094.01  70,625
HAKU.06266.01  71,430
HAKU.06641.01  71,848
HAKU.17022.01  76,274
HAKU.07102.01  79,326
HAKU.02057.01  80,201
HAKU.13110.01  83,060
HAKU.07031.01  87,315
HAKU.05791.01  96,451
 
 
 

VII. Analysis - Read Taxonomy Assignment

Read Taxonomy Assignment - Methods

 

The closed-reference taxonomy assignment of the ASV sequences using BLASTN is based on the algorithm published by Al-Hebshi et al. (2015) [2].

The species-level, open-reference 16S rRNA NGS reads taxonomy assignment pipeline

Version 20210310a
 
 

1. Raw sequence reads in FASTA format were BLASTN-searched against a combined set of 16S rRNA reference sequences, the FOMC 16S rRNA Reference Sequences version 20221029 (https://microbiome.forsyth.org/ftp/refseq/). This set consists of the HOMD (version 15.22, http://www.homd.org/index.php?name=seqDownload&file&type=R ), the Mouse Oral Microbiome Database (MOMD version 5.1, https://momd.org/ftp/16S_rRNA_refseq/MOMD_16S_rRNA_RefSeq/V5.1/), and the NCBI 16S rRNA reference sequence set (https://ftp.ncbi.nlm.nih.gov/blast/db/16S_ribosomal_RNA.tar.gz). These sequences were screened and combined to remove short sequences (<1000 nt), chimeras, duplicated sequences and sub-sequences, as well as sequences with poor taxonomy annotation (e.g., without species information). This process resulted in 1,015 full-length 16S rRNA sequences from HOMD V15.22, 356 from MOMD V5.1, and 22,126 from NCBI, a total of 23,497 sequences. Altogether, these sequences represent a total of 17,035 oral and non-oral microbial species.

The NCBI BLASTN version 2.7.1+ (Zhang et al., 2000) [3] was used with the default parameters. Reads with ≥ 98% sequence identity to the matched reference and ≥ 90% alignment length (i.e., ≥ 90% of the read length was aligned to the reference and used to calculate the percent sequence identity) were classified based on the taxonomy of the reference sequence with the highest sequence identity. If a read matched reference sequences representing more than one species with equal percent identity and alignment length, it was subjected to chimera checking with the USEARCH program version v8.1.1861 (Edgar 2010) [4]. Non-chimeric reads with multi-species best hits were considered valid and were assigned a unique species notation (e.g., spp) denoting unresolvable multiple species.

2. Unassigned reads (i.e., reads with < 98% identity or < 90% alignment length) were pooled together, and reads < 200 bases were removed. The remaining reads were subjected to de novo operational taxonomic unit (OTU) calling and chimera checking using the USEARCH program version v8.1.1861 (Edgar 2010) [4]. The de novo OTU calling and chimera checking were done using 98% as the sequence identity cutoff, i.e., the species-level OTU. This step produced species-level de novo clustered OTUs with 98% identity. Representative reads from each of the OTUs/species were then BLASTN-searched against the same reference sequence set again to determine the closest species for these potential novel species. These potential novel species were pooled together with the reads that were assigned at the species level in the previous step for downstream analyses.
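
The identity and alignment-length cutoffs described in steps 1 and 2 can be illustrated with a short Python sketch. This is not the pipeline's actual code: the `Hit` record, the `classify_read` function, and the `multispecies:` label are hypothetical names chosen for illustration, and ranking ties by identity and then by alignment length is an assumption. The chimera check applied to multi-species best hits and the de novo OTU step are omitted.

```python
from dataclasses import dataclass

MIN_IDENTITY = 98.0  # percent identity cutoff stated in the text
MIN_COVERAGE = 90.0  # percent of the read length that must be aligned


@dataclass(frozen=True)
class Hit:
    """One BLASTN hit for a read (hypothetical, simplified record)."""
    species: str
    pct_identity: float
    align_len: int


def classify_read(read_len: int, hits: list[Hit]) -> str:
    """Return a species, a multi-species label, or 'unassigned'."""
    # Keep only hits that pass both cutoffs.
    passing = [
        h for h in hits
        if h.pct_identity >= MIN_IDENTITY
        and 100.0 * h.align_len / read_len >= MIN_COVERAGE
    ]
    if not passing:
        return "unassigned"  # such reads go on to de novo OTU calling

    # Best hit: highest identity, then longest alignment (assumed tie-break).
    best = max((h.pct_identity, h.align_len) for h in passing)
    top = sorted({h.species for h in passing
                  if (h.pct_identity, h.align_len) == best})
    if len(top) == 1:
        return top[0]
    # Equal best hits to several species: unresolvable at species level.
    return "multispecies:" + "|".join(top)
```

For example, a 500-base read with a single hit at 99% identity over 480 aligned bases would be assigned to that hit's species, while the same hit over only 440 aligned bases (88% of the read) would be left unassigned.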

References:

  [2] Al-Hebshi NN, Nasher AT, Idris AM, Chen T. Robust species taxonomy assignment algorithm for 16S rRNA NGS reads: application to oral carcinoma samples. J Oral Microbiol. 2015 Sep 29;7:28934. doi: 10.3402/jom.v7.28934. PMID: 26426306; PMCID: PMC4590409.
  [3] Zhang Z, Schwartz S, Wagner L, Miller W. A greedy algorithm for aligning DNA sequences. J Comput Biol. 2000 Feb-Apr;7(1-2):203-14. doi: 10.1089/10665270050081478. PMID: 10890397.
  [4] Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010 Oct 1;26(19):2460-1. doi: 10.1093/bioinformatics/btq461. Epub 2010 Aug 12. PMID: 20709691.

3. Designations used in the taxonomy:

    	1) Taxonomy levels are indicated by these prefixes:
    	
    	   k__: domain/kingdom
    	   p__: phylum
    	   c__: class
    	   o__: order
    	   f__: family
    	   g__: genus  
    	   s__: species
    	
    	   Example: 
    	
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Blautia;s__faecis
    		
    	2) Unique level identified – known species:
    	   
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__hominis
    	
    	   The above example shows some reads match to a single species (all levels are unique)
    	
    	3) Non-unique level identified – known species:
    
    	   k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__multispecies_spp123_3
    	   
           In the above example, “s__multispecies_spp123_3” indicates that certain reads match equally to 3 species of the
           genus Roseburia; “spp123” is a temporarily assigned species ID.
        
           k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__multigenus;s__multispecies_spp234_5
           
           The above example indicates that certain reads match equally to 5 different species belonging to multiple genera;
           “spp234” is a temporarily assigned species ID.
    	
    	4) Unique level identified – unknown species, potential novel species:
    	   
           k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__hominis_nov_97%
           
           The above example indicates that some reads have no match to any of the reference sequences with 
           sequence identity ≥ 98% and alignment length coverage ≥ 90%. However, this group 
           of reads (actually the representative read from a de novo OTU) has 97% identity to 
           Roseburia hominis; thus it is a potential novel species, closest to Roseburia hominis 
           (but not the same species).
    	
    	5) Multiple level identified – unknown species, potential novel species:
           k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__Roseburia;s__multispecies_sppn123_3_nov_96%
        
           The above example indicates that some reads have no match to any of the reference sequences 
           with sequence identity ≥ 98% and alignment length coverage ≥ 90%. 
           However, this group of reads (actually the representative read from a de novo OTU) 
           has 96% identity equally to 3 species in Roseburia. Thus there is no single 
           closest species; instead, this group of reads matches equally to multiple species at 96%. 
           Since these reads passed the chimera check, they represent a potential novel species. “sppn123” is a 
           temporary ID for this potential novel species. 
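
The designations above follow a regular pattern, so a lineage string can be taken apart programmatically. The following Python sketch is one possible way to do this. The function names `parse_taxonomy` and `describe_species` and the regular expressions are assumptions inferred from the examples above, not part of the FOMC pipeline.

```python
import re

# Prefix-to-level mapping as listed under designation 1).
PREFIXES = {"k": "kingdom", "p": "phylum", "c": "class", "o": "order",
            "f": "family", "g": "genus", "s": "species"}

# Patterns inferred from the example strings (assumed, not an official grammar).
_NOVEL = re.compile(r"^(?P<base>.+)_nov_(?P<pct>\d+(?:\.\d+)?)%$")
_MULTI = re.compile(r"^multispecies_sppn?\d+_(?P<n>\d+)$")


def parse_taxonomy(lineage: str) -> dict:
    """Split 'k__...;p__...;...;s__...' into a level -> name dictionary."""
    levels = {}
    for field in lineage.split(";"):
        prefix, _, name = field.strip().partition("__")
        levels[PREFIXES[prefix]] = name.strip()
    return levels


def describe_species(label: str) -> dict:
    """Interpret a species label: single/multi-species, known/novel."""
    info = {"novel": False, "identity": None, "n_species": 1}
    novel = _NOVEL.match(label)
    if novel:
        # Potential novel species: keep the closest-hit identity.
        info["novel"] = True
        info["identity"] = float(novel.group("pct"))
        label = novel.group("base")
    multi = _MULTI.match(label)
    if multi:
        # Trailing number is the count of equally matched species.
        info["n_species"] = int(multi.group("n"))
    info["name"] = label
    return info
```

Applied to the species label `multispecies_sppn123_3_nov_96%`, `describe_species` would report a potential novel species with 96% identity to 3 equally matched species.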
    

 
4. The taxonomy assignment algorithm is illustrated in the flowchart below:
 
 
 
 

Read Taxonomy Assignment - Result Summary *

Code  Category                                         MPC=0% (>=1 read)  MPC=0.01% (>=334 reads)
A     Total reads                                      3,488,320          3,488,320
B     Total assigned reads                             3,341,992          3,341,992
C     Assigned reads in species with read count < MPC  0                  28,686
D     Assigned reads in samples with read count < 500  321                5
E     Total samples                                    69                 69
F     Samples with reads >= 500                        67                 67
G     Samples with reads < 500                         2                  2
H     Total assigned reads used for analysis (B-C-D)   3,341,671          3,313,301
I     Reads assigned to single species                 1,424,282          1,404,539
J     Reads assigned to multiple species               1,917,389          1,908,762
K     Reads assigned to novel species                  0                  0
L     Total number of species                          543                277
M     Number of single species                         372                193
N     Number of multi-species                          171                84
O     Number of novel species                          0                  0
P     Total unassigned reads                           146,328            146,328
Q     Chimeric reads                                   0                  0
R     Reads without BLASTN hits                        62                 62
S     Others: short, low quality, singletons, etc.     146,266            146,266
A=B+P=C+D+H+Q+R+S
E=F+G
B=C+D+H
H=I+J+K
L=M+N+O
P=Q+R+S
* MPC = minimal percent read count per species (as a percentage of all assigned reads); species with a read count < MPC were removed.
* Samples with reads < 500 were removed from downstream analyses.
* The assignment result from MPC=0.01% was used in the downstream analyses.
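
The accounting identities listed under the table can be checked directly against the reported values. The Python sketch below does this for both MPC columns, using the numbers from the summary table. The names `summary`, `identities_hold`, and `mpc_threshold` are illustrative, and rounding the threshold down is an assumption that happens to reproduce the 334-read cutoff shown in the column header.

```python
import math

# Values from the result summary table, keyed by row code.
summary = {
    "MPC=0%": dict(A=3488320, B=3341992, C=0, D=321, E=69, F=67, G=2,
                   H=3341671, I=1424282, J=1917389, K=0, L=543, M=372,
                   N=171, O=0, P=146328, Q=0, R=62, S=146266),
    "MPC=0.01%": dict(A=3488320, B=3341992, C=28686, D=5, E=69, F=67, G=2,
                      H=3313301, I=1404539, J=1908762, K=0, L=277, M=193,
                      N=84, O=0, P=146328, Q=0, R=62, S=146266),
}


def identities_hold(r: dict) -> bool:
    """Check the identities listed under the summary table."""
    return (
        r["A"] == r["B"] + r["P"]
        and r["A"] == r["C"] + r["D"] + r["H"] + r["Q"] + r["R"] + r["S"]
        and r["E"] == r["F"] + r["G"]
        and r["B"] == r["C"] + r["D"] + r["H"]
        and r["H"] == r["I"] + r["J"] + r["K"]
        and r["L"] == r["M"] + r["N"] + r["O"]
        and r["P"] == r["Q"] + r["R"] + r["S"]
    )


def mpc_threshold(total_assigned: int, mpc_percent: float) -> int:
    """Minimum read count per species for a given MPC (rounded down, assumed)."""
    return math.floor(total_assigned * mpc_percent / 100.0)


for name, column in summary.items():
    print(name, "consistent:", identities_hold(column))
print("MPC=0.01% threshold:", mpc_threshold(summary["MPC=0%"]["B"], 0.01))
```

Both columns satisfy all of the identities, and 0.01% of the 3,341,992 assigned reads is about 334.2 reads, which matches the ">=334 reads" label.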
 
 
 

Read Taxonomy Assignment - ASV Species-Level Read Counts Table

This table shows the read counts for each sample (columns) and each species identified based on the ASV sequences. The downstream analyses were based on this table.
SPID  Taxonomy  HAKU.02057.01  HAKU.02057.04  HAKU.04092.05  HAKU.04287.03  HAKU.05094.01  HAKU.05791.01  HAKU.06266.01  HAKU.06449.01  HAKU.06449.04  HAKU.06601.03  HAKU.06641.01  HAKU.06912.03  HAKU.06978.01  HAKU.07031.01  HAKU.07102.01  HAKU.07104.01  HAKU.07991.04  HAKU.08146.01  HAKU.10105.01  HAKU.10302.01  HAKU.12128.01  HAKU.12368.01  HAKU.12368.03  HAKU.12653.01  HAKU.13032.01  HAKU.13051.01  HAKU.13110.01  HAKU.13254.01  HAKU.13254.02  HAKU.13254.03  HAKU.14131.01  HAKU.14131.03  HAKU.14257.05  HAKU.14257.06  HAKU.15133.01  HAKU.15133.03  HAKU.15133.05  HAKU.15133.08  HAKU.16063.02  HAKU.16255.01  HAKU.16255.02  HAKU.16255.04  HAKU.17022.01  HAKU.17022.02  HAKU.17043.01  HAKU.17043.03  HAKU.17164.01  HAKU.17164.06  HAKU.17251.01  HAKU.17282.03  HAKU.17364.01  HAKU.17575.01  HAKU.17575.08  HAKU.18078.01  HAKU.18305.01  HAKU.19122.01  HAKU.19356.04  HAKU.19356.05  HAKU.21294.01  HAKU.21309.01  HAKU.22022.01  HAKU.22420.02  HAKU.22968.02  HAKU.22968.06  HAKU.23313.04  HAKU.24962.03  HAKU.25095.01  HAKU.25284.02  HAKU.25705.02
SP1Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;denticola2501000023060340242700158115902024901460632057056012901521813192744353010701343171226028004601327620923048443153
SP100;;;;;Peptostreptococcaceae_[XI][G-7];[Eubacterium]_yurii_subsps._yurii_&_margaretiae000000106000005500000500160204010500008405400000010066980240000660200025101101817730000190316366
SP101Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Peptoanaerobacter;[Eubacterium] yurii12020240100012616230402164350611600521000022004120612901406070239016013233100001161228
SP106Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;sp. HMT780242611525301892614128057600414425724729381481475696352524620800113176934242547596486031587042628012717693822100998961742402114961066039223400887493121333365490250
SP112Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;nucleatum_subsp._polymorphum31140352510011001447158297711280141010005014725011003012992898012011174920430810000191301201904
SP12Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;sp. HMT064633579010437030212227361205160291131132601711231212425603560226109721312790295124782741451228341046133812222370332489150574422988262197158115459522763367122
SP120Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus;taiwanensis5832925200369064951984325916631390020012418833322681970471443321032934610912175028278450139622707802431628000187232922
SP123Bacteria;Firmicutes;Tissierellia;Tissierellales;Peptoniphilaceae;Peptoniphilaceae_[G-1];bacterium HMT113000000000000000400000001714210000100000003000007000000002900008330000002511810
SP129Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;noxia002000117000015200012000000140000012753221200340010014812000570091360008700700303730102365
SP137Bacteria;Firmicutes;Bacilli;Bacillales;Gemellaceae;Gemella;haemolysans006000500005550002141210040045221144020060000404501000034199034000147823907680004718540004024828372
SP139Bacteria;Firmicutes;Bacilli;Bacillales;Gemellaceae;Gemella;sanguinis04470444091024761487031503371713233006030515011600715013513011131122113012013343434100777389657
SP14Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;downii3013608801322519501443543884638155212111992471134671332271339363761384213569736155163219699298300661691562089761555845950602403950243648422181481693208319929697251
SP140Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;hofstadii359681203626380253016054318730115434681474604758451842241227552960574330761289217255910152963061104119257854968154516542
SP153Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnoanaerobaculum;saburreum163505014663001540191062086140112345490171511230045755621253931058093808981108201315780164131558232602911
SP157Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Erwiniaceae;Rosenbergiella;australiborealis00600120260031600019319101020193000114000915403001107061123011029054000533300018054567
SP189Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;paraphrohaemolyticus000011822030214318011203001297060014480040751222362017800070202903540059495178117127632
SP19Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;oralis_subsp._dentisani_clade_39811262024441701141321400603003419413020540219000623300010010920890617400205540500009172208
SP190Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;sputorum21121790186716204500332222961485025223731514721315827127205291930518438207325551521682985211631136096315240426575264115827164257400207422523431552051407417174593971837
SP191Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;tannerae0204301715397097133323210510703961865928150147113139652070183268797598013712327066445302103912386227304639068358039254082229501819373912117122
SP192Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT90936000120210004210200113067020120100001307010833200300015920105081712020371620
SP195Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;aphrophilus68695040141670795538329426452610166011112927271415535199396828892413612713620837751414142712874786010511072118502630114581940538346
SP196Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;paraphrophilus1091711603022908398242120308154013642102919432454291601518404542997363015520202016331273851518122051172623841583512916218231115862331118477
SP197Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales;Burkholderiaceae;Lautropia;mirabilis302411432296050816651644136920943525116100620422486412755141544221911822116867822074817021941344627550117339441668564723004274473202399217163405263219648401449103913915045193815984484315285011917261303173322621521582175
SP20Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;sp. HMT0611808160692054369575031561028342587196677352812814223815115929546144473209717512444718058343413947094522652465203913579373569802662861522468021673081841018649148188109155780108747115214
SP220Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Bacillus;paramobilis0014600400025107400305931401013970001390001210266003794449474033000015105485011900027534200063132110
SP221Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Bacillus;paranthracis38137016630360158054979373121642275662610901374213241920621402875101480310185664012658151342564317611077262390107150549122719
SP224Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Bacillus;mycoides76169240211514301964241846597217122722724121285072491354195233862420340410100216571792361352138341483814710274489469051101
SP225Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Bacillus;gaemokensis5354209023061230656691291106417335616152127166200127182728215541135230460342107396504604351688538709102085360
SP23Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Mogibacterium;vescum6311102025033114358270427742500235973580400282391512019269714912800410113354014412700871801328
SP24Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Mogibacterium;neglectum064000001011008210020210171280082171101507230001331510003439008809010848004912154
SP248Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;sp. HMT4734148349340490295611202910235698327146419001281183462851022133117165012628503398977476078273100164520756224791214480131729611512104497355172922546086592824911310888223134331
SP26Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Mogibacterium;diversum2241712050010368764136108134165095223921344184529613305823982254169517220339668153620725557460043163358314159314691155647987121348123045352319815349827816886468036059038361109114862094437812125619616564
SP263Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;salivae119204309049721046100572824518142131197375109533831065757851021110268296113800281854967551912129506491102411667812141295723706170431552821177511292121050816619517788338
SP266Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;sp. HMT93013957010211292106983003136466475088210327182863369086331526541261810264168810051219217618322889104682468203645142052322430251848702794163403
SP27Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;histicola422492989010233177090263034021209564415146782401197136610981710401344224210091914455368692013242019049413200312131259102199686432218125917318922411977729186012115437881152442275
SP272Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia;fergusonii00280001702200269400047544312650340163010050117202028021201217066501838381517013550131526136604530730115261003020216064
SP273Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia;marmotae400016108308013012005501012000008111000641310000003508031919000840204012000100021620000351100
SP276Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Shigella;sonnei0030001300008000009300000120190441010000000020087709000146020100000414600000165414
SP277Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;gingivalis00000064000003124006841516702590256039703005530002800047500118384104764053000016194099200112461150400301991568705
SP278Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Veillonellaceae_[G-1];bacterium HMT155502810011000111166020000060486210200001800702510021177100080270400149310002610162
SP28Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT2154281732080505113805139796240371119770942364165186858458259120299530226537601576142317443461185186203671974722291513051442716481181911990272277741147964168401012751557980
SP29Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT309268805350010021003384100101299010940441243036040013183390149320053900112122600306212410010
SP296Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Intrasporangiaceae;Knoellia;remsis3390002500135035426911011240400175200605241419817502021640363106501123333274364121
SP297Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;denticola49840303366640011212120361063901310358102275076536446191250391444109053722918594345333492052900440432018662391012554432111524185
SP298Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT94292614370016521104810020140014446011001000601500020603010001301112110800864290240012052001922102441741
SP300Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Rothia;aeria14718270893051004252571328665319102711198082821735561592409106108160320566050941138331841329131101950263612101217482414829521584827432783930125
SP301Bacteria;Absconditabacteria_(SR1);Absconditabacteria_(SR1)_[C-1];Absconditabacteria_(SR1)_[O-1];Absconditabacteria_(SR1)_[F-1];Absconditabacteria_(SR1)_[G-1];bacterium HMT874070037061032110824011670244001912202067176100346120961534076011681103112761502541047113100213182212
SP303Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;tobetsuensis19222301852269021328010655224451504232530271012071726234612650832926646126393336672479914993213071067621131081251441865771118919411210646102141388
SP304Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;rogosae5725780038195536058921954107149105108111950141933521224446242631513764836185846376342423166843563544310725648932231975142232546073118944062415571034354387142
SP315Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Schaalia;lingnae52390143448000298619125173514771517452806115116162342398110090124511501329014183710539501045660647012170629311440027198
SP316Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnospiraceae_[G-8];bacterium HMT500005000500116921001858200035121002400000540011020244911201801263015001990107412008002243843
SP318Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;atypica309430143170234653001751101412111164000386352106122070151503846134192992221502211016391416196202221
SP320Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Streptobacillus;felis00000017204000050000037000040463000000000000000002571310000017033010750000176000005232580
SP322Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;nanceiensis3517139014112172370272617733267127136223149524136017358363937431566176456003915364831510279771519725477742420648533711949524133596301736259142
SP323Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Weeksellaceae;Riemerella;sp. HMT3221311110584679010123337469751910934651868134142104994615393129831246242111985374321019179941451915617677651816264596091033021514
SP324Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;sp. HMT05615451106023672320103039310131181643028844156381415423128538127103109114444469558018643030231113825910613820463186894631453171212523537627753216170
SP325Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;himalayensis322101008152707872112370314001598300212120876151060700575220584506000162560235213110014
SP333Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT313312711004280109411814271011062714022918428018483152156500241031551813295462140122231731120733314630022524345751
SP334Bacteria;Tenericutes;Mollicutes;Mycoplasmatales;Mycoplasmataceae;Mycoplasma;faucium0044003507006721001010278000065200008201223414001859011368200000112700104020637000606172200
SP336Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;oralis946290117089749837016351418710889911073091101951163548121470012918412616155140161338263217315394012816784710771941033543322551310458618311017224557913884951551776420118
SP339Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;johnsonii260942501235960359812742824764842515141527833328172252470012122992163530261612017105181111105104327143600152142471657816535928219878
SP341Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT52626066913019999901296444454436313171571055050379571378305232172821412178904315440151122216121510141215531201724731231326914637075151
SP342;;;;;Peptostreptococcaceae_[XI][G-7];bacterium HMT08166443844200210180042371108015262537524958419072020284193132001329149746228112114256022114128268513514377327722124016549111693914200019092215525
SP351Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;caprae3035305084912201908226683438571375618310829223182533320752267619962390201064178321291537031271326101862592211281914149916872332474697164701510105619407149
SP353Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;aureus15930421130011324307841511220915054519163918100215889554165976527200121060050001810872231906001028003292
SP354Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;petrasii398336154501553525160018152170966413813257119513823782182217003522407994625687565691080696105838705719643942035891949947240134010823486652774217155731194336925402563373625515317525845424217931027596291378429
SP358Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;simiae140000000000009030412020530190031000000000005014221033007873502700311181000010322
SP359Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;saccharolyticus150022218900025137909061550291231340541315001200216240109414493832018211210010018002230
SP361Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus;warneri180501300318106868462105493532247435850256213115219191241763101365721151356147221501303922122172552111545201612810310506165
SP37Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sputigena210004043050149527801804105343230110128914002010162301007113192924748030616118325103148950007117016275
SP38Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Alloprevotella;sp. HMT308714177011984548081314775787116554078832655423432618139157626939051234013572222024205123218821306013141241962303421713
SP380Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;lactarius89860312550130752831410011655397031660139072140346599129412969187068051548551417162713105017590178428301174379536144513365671509364318022089258311115341273287841052283641343373145198180731263274477165290134215271070205039328185295612891855
SP384Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Brevibacteriaceae;Brevibacterium;celere35039000000000000010000108321500000200000275701330010111600000001800000055000111135470
SP385Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Brevibacteriaceae;Brevibacterium;iodinum4074857298072597780120192200385807169033604380147054751144758717295759337103050401291715858512509215109506810609601762932825301585
SP388Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Brevibacteriaceae;Brevibacterium;ammoniilyticum01540002001002504001022307025001501020072008447002100371020102200144400107006246
SP397Bacteria;Actinobacteria;Actinomycetia;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;dentium078011255802359123233917702408245029011511211012383931760166391744461652233161700613245615411111581102
SP398Bacteria;Actinobacteria;Actinomycetia;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;moukalabense2531156590127234106811220913919622310015627124813241517452402050923126301318716579129816926912049411843503257436228242152974244832078105
SP399Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;sp. HMT0180000482200017400132022031203000311136641000261311402000021000030001101502700109162301000
SP40Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Veillonellaceae_[G-1];bacterium HMT14800900770901093908142431030250675655051133012013510170727315254136191221086538101290442517830000101175249
SP400Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sp. HMT33231106130812300313030400027250031902329153220102901210304137002700263018791302241498
SP402Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Peptoanaerobacter;stomatis01000031000207200051381330210214225030696000017190081009666911004373168235025040461320000021842291
SP411Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;sp. HMT2850057000000000000002400000000011900000016230003700491500000016505900000076000140822050
SP430Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Kocuria;assamensis333100114630023373517249541054282342192391921220306538338373172932730464546341720356373273014482060116637111232117523
SP431Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;noxia65100111658018275621898121489404423823681194320017295725840107598389718112082551162891671858645141202031267147719
SP432Bacteria;Firmicutes;Bacilli;Lactobacillales;Carnobacteriaceae;Granulicatella;adiacens290352106509405467570221163408434689389113114469728181751192645688854241211711513144153689412822555219779936727161190145458285116913289214712831267911395164273849068420261710762115428361252176836479341282483801168629
SP433Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multisaccharivorax2040001000008021151490001440000690202603025101800011805062000300008000034031027420000385
SP440Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;junii116130000000900155212025575115110294480620320211501800012166313559151034704471213104110143711293402258319
SP441Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;guillouiae178634023154404528575880346591117193449156271638008734171216156612213479616311837185012783283406006902113111152
SP444Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;gyllenbergii205201204001801023224401096691300041450000141057004127115100250455951100341613591440035142460100222336
SP445Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;vivianii1020150091000128120000405922724026140116187020001143201211192200121102901000094005050
SP448Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;proteolyticus144174307333112071272251381428332749202833618619601284616713258281892346345369812917271994212291248022101212224153419147110333040
SP455Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Lancefieldella;parvula134331020381254030274114328139917048230263476171745354409916982591912881175063041384181918899633305444311512127147464671831215055696290281363591888336413169270
SP459Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sp. HMT3249317700321481010048659061232552126201645348691874426205225150148613129221050258198645744410104108622025129818680399010118
SP46Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT2511435300343829503512735213810291918521264131414625111462917134826546313815380106181769282141416301081245202198918745201023212236111723292102
SP47Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT25011947905421590381127281124656189181018254260123289455012262121910266341410355113610292693821563774262640731515717393455322771510943
SP483Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Lancefieldella;rimae047130004100104718600015010312010124510011210202025450005002614024619012153720281860180
SP485Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Peptidiphaga;gingivicola158420378202655252733051711240275376308311400028991516401500115273258800260111010128349843171825139127
SP496Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;maculosa34178160485406271197641911615011131110308521540385161315101441912234727282091529016115941041627642621
SP497Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT317428860152610145511538132828105012935131673141962163515391043521271311422141515104816017342045105696102112153214131
SP498Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;shahii13140041650801283480121215900103500026096002124213072132101860051021198120349000124018205681
SP499Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;parainfluenzae525123842450727336442456083914037128199425252542436187026827814414189161221842769849938203671644161470219334871151051329259126911106201814412041177535555852428225791101225064033695903311915939529044761304120812291933152718895409809226451233441658523785
SP50Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus;salivarius19309000226000784522790014139151126130846109713835194007213421880123190756531012313006164036322693926366662317192012108257
SP500Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;pallens17867223034975168401935189112871831231002508212610180558115713095113467856470051088541440816222721108542514911583152429625233352334204722193841040117169257185
SP502Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sp. HMT864005300000002262552003161660133840000140440000122028410400145162413310118127223170000124928
SP507Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Veillonellaceae_[G-1];bacterium HMT132821400178416002129220698481523561672210770164490257104761851105002043305017319498969230411356333004724241135755531330313610926680639321330
SP509Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;sp. HMT472016004912430355359291715090021317130925001149007006312401004500520016030170100005114
SP510Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;sinus10850400041859803820701947535435121144542115123183331496511349161038945452679012132352313648287306928351171542110471173887971431426332311335801
SP511Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Peptostreptococcaceae_[G-4];bacterium HMT36900000010000010180001632000042060000100001311007001533700010111143097015722500000104635
SP512Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT39211342206314630523140141324571312221550941412815153948461601402863183714699911251646212719947152220446415359901719761021498
SP514Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;downei6161800411600267012522176221310131171082938301839135153014112281211165321511005462499310249016450313
SP515Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;dentirousetti356314952605681475905216081721611350323382106904211794130489715924302680663147834290100410330215332297344730272813721546393318218622781906272210374355227910230441310169421136014521446819128209201017995229229917769042301961
SP518Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Limosilactobacillus;mucosae1860000200072007005000110010035051100001006010130006000900004370007900005530010318
SP519Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;macedonicus2337548807123574011111124252719203132011219115463421040321611696063482590173123540841012586125151221610410863191098349141938248
SP522Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;sp. HMT9170013400130000450081209900500300882000600456111181940000103660167026400000178500120417123120000
SP523Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multiformis00186000830000033000000550025901370301032005057021261700259234134429300211328034781124211310081050
SP526Bacteria;Firmicutes;Bacilli;Lactobacillales;Enterococcaceae;Enterococcus;saccharolyticus01300000900038000023201071150000060000092001030001170190000381217300000052000611870
SP527Bacteria;Firmicutes;Bacilli;Lactobacillales;Enterococcaceae;Enterococcus;gallinarum11010212730002161117233022901000051024240000200070241002400069101301016901402501000063
SP53Bacteria;Bacteroidetes;Flavobacteriia;Flavobacteriales;Flavobacteriaceae;Capnocytophaga;sp. HMT901807000120000004930005000662700011011100107600400018010000034429030030025003017171
SP538Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Priestia;flexa00950327600011097021231005721340004310100230170050007065000018168260000041000101020000020
SP540Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Priestia;qingshengii44710006001404120251320111015160002703214345001016032131624701057010206210035113113
SP541Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;sp. HMT1020000000000002500021000017111801230100000000420401801400001115400011218800000704
SP542Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;actinomycetemcomitans0924500000747008900002300000000000000045953388514300409800310900100059000032000000000027
SP547Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;sp. HMT1369220001130050011600050001741234000006111201113329000113547000000700770002100043
SP55Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Coriobacteriaceae;Fannyhessea;sp. HMT41600000000000001000000000002000200000020000010001000000000000000000059000108
SP551Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;fusca2652032100800600084021801014000000305109600117807130946001191860100400000900430
SP558Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Stomatobaculum;longum45560410052201340742148330485155624814117314523817165134831286827914805571159211763948751850167464810367132371604216461903345505115128862121037
SP559Bacteria;Actinobacteria;Actinomycetia;Corynebacteriales;Corynebacteriaceae;Corynebacterium;xerosis6494146301455074850621153190701566032818958171115425044642942821422681078408738222271891054132404117310643644251898122426514221630163114819052705143448122881671509494628871255
SP56Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT4171072412418503677127201024625918901517433335585820721167709703758749351404129212552922536258643601214731271910910225287056978121149371401365280665122282254941854837164620714417516
SP561Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Peptostreptococcaceae_[G-6];[Eubacterium]_nodatum00100010000030000001480000300000001000011001200103511141002001019031000168510000090167108
SP568Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;cremoricolorata15111025601417427401832396234913043951193414383172314412912777935916643494132162943210487384153134359940261466939241361703011028223213515483312587928174
SP569Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;putida104442097323702644021243731082423515713331313027305552364410260523394910426870212134126112032865923133982122676191401015576139185121933
SP570Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;straminea1025201029023024271453613093213100254514000010202121040101521642439900532390751410502501311022794
SP573Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;cuatrocienegasensis23945015481414044080463937146622133531610527931058271187273831271071193171852913052493369122336807064832303851323305067304103702942195222110
SP577Bacteria;Firmicutes;Clostridia;Eubacteriales;Ruminococcaceae;Ruminococcaceae_[G-1];bacterium HMT0756813410805154038561461694935495819321548616818713112118078825515397120171121848283124152924246817733903915913143115127261731335914
SP578Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;maltophilum073000300002152500002430001341531313220300110011008340334205410025616802355297300005083
SP585Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Butyrivibrio;sp. HMT4551401401513170003301653224753160410014165112478505001160043804020200104202111045150800163010130
SP603Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Megasphaera;micronuciformis1786601037815147604621216925219722543141084996113622326526934767111731378121882485082271381023222081721217231109215867668292406913911444806250316170
SP608Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Enterobacter;bugandensis0000000011000010400140230000026000500000013000010000000009142001070018010048600
SP625Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Kluyvera;intermedia5992802803204811411250000370232510553653031871164495905013017627221203810329171410551594601140
SP63Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;buccalis3960012139400125013501371222132111011294409134211049338921243874185271573221286540268261840111116812132615418371161610324
SP635Bacteria;Synergistetes;Synergistia;Synergistales;Synergistaceae;Fretibacterium;sp. HMT36000200410501540000204000010114312013210000090181301510902006103047002048000306465
SP636Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Peptidiphaga;sp. HMT183182700118402130118225951510100045502112012015152348100000560416001000173119036700107001522
SP657Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;goodfellowii262473488029853810902865967829074411121992471377319382264062959817356513646783219023252871921331114413819262344717112810519174193110585149107816065342813434099115
SP658Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Dialister;propionicifaciens158247000270012004001120000441110400200121354001010102139030001001001300260730009
SP667Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Klebsiella;grimontii331861111069354615370156155192286269137616834634110387492333906681343943014421714581319512117077632045947344096014522016010216523061377112289421468371544571122918722424614441051985510349710
SP669Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Arthrobacter;livingstonensis3011670499821903911264124803522331920483673501054321125302516493103031311306631827532843108268161120819100013147121591213343841763069132
SP688Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Arthrobacter;oryzae3023001180170137900003706034401301200222002010104043019571619255718562241770049015175476473704232127232500030513397700
SP690Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Lancefieldella;rimae2022150311330285948810080200083179122131662046516121422031131717523091369310925232321801028667353422611127701202043457335835
SP692Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Pluralibacter;gergoviae0100010025000000054012000201150400301047100001821200100040000000700001000300
SP693Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Citrobacter;sedlakii17025000760240316351790419382053270695718701451260301776402303004472331207401104425014503878382005361231163140
SP701Bacteria;Firmicutes;Clostridia;Eubacteriales;Ruminococcaceae;Ruminococcaceae_[G-2];bacterium HMT085521035501209230121891316735141041561218915890301173110522136917934274092145610229251324714301192153804020154218131311295323311
SP712Bacteria;Actinobacteria;Actinomycetia;Corynebacteriales;Corynebacteriaceae;Corynebacterium;matruchotii148119861033947703911539348350199343134141391284126431315815155504128132952433452172421882358733838434110242872298199241352471528412921244298
SP731Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Peptostreptococcaceae_[G-9];[Eubacterium]_brachy4210250091030562861502164201221241021018214020207110210636813416211401252202879020125120307351682615100054052154147
SP732Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales;Comamonadaceae;Ottowia;sp. HMT894000011330270128385060121030026000480000001400030480000210001001942400090007450
SP734Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Kocuria;flava0020011209006134300061330102207100120300001300314132513840067217014403244550055038444
SP752Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;sp. HMT256001000003000000000300001000000000020002000000784130000064117000000240010042320
SP753Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;catoniae000000100001611000017000001011000000030000081004824090158190850122209300004371
SP760Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;graevenitzii691420561715805091547949696086319911944364672171310224149812921400005542641728235035113373226861378024359816171572238853198116170214314657
SP77Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Catonella;sp. HMT1640030001070005000010490001101900003000035001050271901160002431260650046180000013520
SP776Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT2251542559416030652097013542510411812036739311096584023739232102211590351730334262217683839172630414215264200218319415019111439692616491057
SP78Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;sp. HMT248848113400080000052308053827420005251180490071630110050000027896170002260135172027073200128000258411000
SP780Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Brevibacteriaceae;Brevibacterium;oceani0027009800019232000019481127058418502108023034041600853006923180060400242207187039891111400007928
SP787Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Morococcus;cerebrosus000000001200706004131320101901001930000041100001001162301800432018003013300400251863
SP788Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;flava2171706129530332947428855111782671413391341711055201783420416191331342424105713498098010313813181391803823167
SP789Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;mucosa593000010201070050005514081114341000600100201500010142600000001300190000211502013
SP79Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;sp. HMT49875430844101242050197050224866431357451831841205975426999141685156303593374166761416227940193723553611503102519905391222284651139213
SP801Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Micrococcus;endophyticus1411218000650112139281692672961115151918240350545110152133505945240164220861190204011731648230110851980124
SP803Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Micrococcus;antarcticus53903110100011680181100342009336900006000003553050263001010000015459042050136000000000372269144406760200
SP805Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Micrococcus;yunnanensis039100430000135161115002302441743757002101120028117303061611302474161160130551731631162602300104114250041220
SP811Bacteria;Proteobacteria;Epsilonproteobacteria;Campylobacterales;Campylobacteraceae;Campylobacter;concisus70128145017413986083183138117483220453812144361471091359275160194314820765641921510652459716512193223261083401012012091411559216441017822933323183252149057
SP812Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;parvum11408114704023503812104791641139082844171425921230083603913803703251562011221235101533700224119059006930
SP822Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Clostridioides;difficile1600018308000165121324401640200006102401310000000201515011000580019000141000702200000071
SP823Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Oribacterium;sp. HMT078208611001620551541051140032421284937205112101121119131230121503113110217260201055500128702537921402509
SP827Bacteria;Bacteroidetes;Bacteroidetes_[C-1];Bacteroidetes_[O-1];Bacteroidetes_[F-1];Bacteroidetes_[G-5];bacterium HMT50700800000000000000000080123000000001001640000006178000000147000000032000006700
SP845Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Schaalia;sp. HMT17212853400022555270015088524650137423493031841834436460756495944125531170140054947301744071541056210620176166271113212141312250626302718751831816633014171550
SP85Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Kingella;sp. HMT459081601720582413336301224702605380005103018017810811160170215311301223013721015120394671354
SP871Bacteria;Firmicutes;Bacilli;Lactobacillales;Leuconostocaceae;Leuconostoc;holzapfelii152363069020157019088720374475486117849238611013833397199928125647101300379111234231217525832281139818329123102711474782312181322542968137257901017
SP875Bacteria;Firmicutes;Clostridia;Eubacteriales;Eubacteriaceae;Pseudoramibacter;alactolyticus660000100000003200100600114900000050000000104320221310390100410424100002400677
SP88Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;artemidis00700070300016000222001901120000000030000001216803200000003712047000300004038130
SP90Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;lecithinolyticum0020001201900512510001820002205014400015041031100212222911133012300261160123040364006422001485
SP91Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;parasanguinis_clade_41100300000012007000080900001600000110137015110154180346201700910640130000001200900038100
SP93Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;parasanguinis0236073570353700150231411100441410003823013127014901080010419780132401830421806236019102853713170
SP94Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;sp. HMT05745703754128023854316211681271721312131322900213684314859378320710197449411210311228112247269141013722113883612183354820497
SPP1multidomain;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp1_31330012402510131116001158001502209200481000608190171100700183965326366084952158002631682102
SPP10Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp10_2222108403237079241420490111111221651214140805146752011461921119313332542421328104951221113904244160148161533139334582814204735
SPP105Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;multispecies_spp105_31361391960682139695064137256529252333777112932360395518863144325571820111137051041120792132916993492064305570937162229778118355731931771161617387636516654961580112660114105914287250925640562701324981103338
SPP106Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;Atopobium;rimae2011250210180224946000200031791180164696516101401013171751509261116112117600162011226112543310424562335
SPP107Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp107_2018880025110119509833020000619016008042014017003270141113022069310026150193301370272174
SPP109Bacteria;Firmicutes;Bacilli;Lactobacillales;multifamily;multigenus;multispecies_spp109_24233002113004185146283021162309209901341280915132813876910221014067551013122156013218189318037262796412143651444487
SPP11Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp11_26803861210222823301851055287132325126432942481217129241198299192106211404067751623694981112360911409826354749894481608410136342432208743712236228852779124463012961
SPP111Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp111_3824020000001380150020002061007000000163160304700102120192600010200730402004530110018
SPP112Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp112_6321221140154262380101719228230376139100015802522771616186919130451094251318394281963602098239351315061231502527443551878117377111515832052432133137
SPP118Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp118_3444925082819010943983202282868611570623174397840143245522882001611218818135453874682615014413023242913149
SPP119Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Fusobacteriaceae;Fusobacterium;multispecies_spp119_2135585220441934202332112828160023114091714620725119186158994734417145152520791652531296616420243113187712173471413103905655010626233201211361758241738380315212
SPP121Bacteria;Firmicutes;Clostridia;Eubacteriales;Lachnospiraceae;Lachnoanaerobaculum;multispecies_spp121_238952001743784018841925122236545727332674538691022551310761144102182703241638188542286423846189286928338464178692546415310152602563721045316871301415
SPP13Bacteria;Firmicutes;Bacilli;Bacillales;Gemellaceae;Gemella;multispecies_spp13_443056105370236839645960354392916758152074743968781589204653863901176659688117721458385829241350230925621540202136926754555754130506887619326093702196490049941787259238122134658219493810113293952485638575101280612922385378445176313027184937334615816225
SPP131Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp131_415185802360725637960992623281364321433896229321625116923570314381826158319911514741041279173552121339109185818362005581857814386502164196622694572979217749069210134111765141814183518049977214292491435917
SPP132Bacteria;Fusobacteria;Fusobacteriia;Fusobacteriales;Leptotrichiaceae;Leptotrichia;multispecies_spp132_213720011062017219072356006911020793119051516714241621921931263561951614140432482102853826016812784732311
SPP133Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;multigenus;multispecies_spp133_27101420201788380611023525424562041364839125096326354171117850730171108118722822887202032841906209341001938612348017761011786914790572471057230245828157248812516
SPP135Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Moraxella;multispecies_spp135_40000160000000000000100000000020249518000000020870300000410082533000014000000005100
SPP141Bacteria;Proteobacteria;Epsilonproteobacteria;Campylobacterales;Campylobacteraceae;Campylobacter;multispecies_spp141_228284309274024228217476631349196060581751017512956615132673012192520112011071784481935018528234114228193872501810151381486439237168253109
SPP147Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp147_244199135016362771346033586091535752452314743947985803414202387928018070911035579962751184168151581351945628715086261841195978131109274174244126154818164036418351649204912661858131743957822841
SPP148Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp148_3521612805371425026001107274147137015310039915113305570004396153421860021841690740224009699147103664046339840491131723253946500
SPP149Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp149_27127036010023207285121220150321001120586010268132363510272104010029040160033023140220051206
SPP150Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Haemophilus;multispecies_spp150_5803332740313203556101106509194927931391208327218111809413168970556362934831827836171596157813310772253105309059838087103884061029621117755172146812615143133743942269572458360848758048444365488783143280146422599
SPP151Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp151_411310210278014811825322651203298580002011003301143101471620490300116806340082592014008132920
SPP152Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp152_27421620017191610425637298334916279714348480216123385717273219178153999633319382829347893114820557013581550946460037110252377152
SPP153Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;multigenus;multispecies_spp153_6612307420021801084900015000452041101001140188551302316752160036000700139005126577600
SPP155Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;multigenus;multispecies_spp155_20010001200010936001210702052431001094022015008426261813013000223700720013060001033124116
SPP156Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;multigenus;multispecies_spp156_528693056252021271011228101406588315818121311926739952734587912757493328649685512192912105621300484490266514456246281681342536551521686007507322493887711751504083348734313630212281129038431278495361895267080732196673131323230059314091223
SPP159Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp159_31916518040911556149006704162028175956168144406402458913445123442910512091365796470672819482327901031179267334272619951257465657743280432813528313713525130108263917972780120108466114363756773153205841042
SPP165Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Kingella;multispecies_spp165_206101503028032280060702003011223012100030040224041202023540231823314021073174196010
SPP17Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus;multispecies_spp17_3149000000000000000000011000000317000000000001200000005000000000207000057800000
SPP170Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp170_7346282103007985511070332309177912999725527672735262954364833532810477305110345656651064607326435870310461362534455912431136124187615153932779378615322208693032226102161432278989656731196918150537914431
SPP171Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp171_20270092210001203287000413604530061310010191409301168111120501030063411410122731920081430
SPP22Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp22_393210102220165112414010139144499366441006938108236563461431019165341204104133217331035301740578915637521013581267109793190255217972996735522132443414924485241938143216621645570136922223384271388453410591866711142
SPP25Bacteria;Firmicutes;Negativicutes;Selenomonadales;Selenomonadaceae;Selenomonas;multispecies_spp25_95731906865018221518064732204531981864766322586101125457323103415016243181081731262642126915182183941214713273050
SPP26Bacteria;Proteobacteria;Gammaproteobacteria;multiorder;multifamily;multigenus;multispecies_spp26_3292778013409600265104001271502533358722015024516423146918922031470152141600101110134521501129007813928400037693541949911359114110653116019
SPP27Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;multispecies_spp27_2010010000540507000250050104160009104002161016030153004658820017822246180634650204513305231410
SPP28Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp28_386165300100012861007010040882915801900001231481030700030520818503101200006100243
SPP29Bacteria;Proteobacteria;Betaproteobacteria;Burkholderiales;Burkholderiaceae;Lautropia;multispecies_spp29_299708013360139110176710012029391182740083181601454153720086370121218834641507723760223032199575128517967213378
SPP3Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp3_613412471025220013432027823461492916024162118140158861912396107701271526364427161724021913554910471118102755102346471286651
SPP31Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp31_23725062103120929903842640615261654841050226498172323026311147782711011638186291552102254705712688741011239823268921497154114745111127940319583336152068
SPP32Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp32_71728560102728000290445625208904505053384523452210102405831281421284121115610005201442010010612160500
SPP34Bacteria;Firmicutes;multiclass;multiorder;multifamily;multigenus;multispecies_spp34_43320000450120187254323100880872785508136795502233892214033244180111331016323504916261214662011655902690227233320302139323320
SPP35Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptostreptococcaceae;Mogibacterium;multispecies_spp35_423716190152481011350975512224213067163173819082192607141581012162310317112118781214072202041013150905161101251571311163761532421815557445527
SPP36Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria;multispecies_spp36_444961810580389557431735047024658852623965495244749381403355167853123875263764102834041795751323582175860322956514221178252315942034280668464337173765223844802388314824432005491517660431270225464219206563741326015264236112219841163120425967960267
SPP37Bacteria;Spirochaetes;Spirochaetia;Spirochaetales;Treponemataceae;Treponema;multispecies_spp37_20010001000011000016100001401101005042003003422042409010010434603800283470000026288
SPP4Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp4_430356676026479286047156691481111209496212334124541578316560397135244489467755210112197491371174282561127131521051264829328946810030136552128959799143127401643074374378665564
SPP41Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp41_294026402957210000091016011223737042335770150862561341164748050911014243595540376249118514802825292582324058752569589
SPP42Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp42_635114410316280131006962433429281913261239117302744203193792160152217341411594221219819502459101508424127141493534042245723
SPP43Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp43_70000219016000723402660000038600007017450000000000092640261100060201000102900000110
SPP44Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp44_31331264123055414617400636488342891159390458753351782657051831278872142414915177757464254267174239303010431886281395628193292893894967259823355954899121610153214015817818829157727041112
SPP45Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;multigenus;multispecies_spp45_70513000001330000005001307141200011011161451051611090090111261020708990041309031159672100
SPP46Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp46_43521000000510130057400001590000130100004700255902052010510121270350014212000311005
SPP47Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp47_2352100021030101900011061106013014000030420163009525343481077362022063990233000983001601555912
SPP5Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp5_1815290342879560993321809150400100467636848762078256136662311610105751023844700533631968379481224580449479110211321176355596562740531317123468102619476771005962203147767669751331019864116732462662863216890660444136841745515298565156051318383015163693390541230787009392239313160111286808767720410
SPP51Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp51_2450000100050612000023530302120631900001060411161012100303039020057200004810
SPP52Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp52_30000001000912750000247100000000612000000006100584700012091527006000012500000000
SPP57Bacteria;Firmicutes;Negativicutes;Veillonellales;Veillonellaceae;Veillonella;multispecies_spp57_329825284025449038460587712053825431175620111236594422804105411206359869601270143243322140185737182436826644108108126650910677914039711252721312309855729819385321874333256131646785713852332349854156
SPP59Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp59_436245470881502506111027924718698938581427020509260185670831041261428803264077349111616138091112322371765212238333
SPP6Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp6_31169120463000194232811634212582938193279819410917101852261023490134615383651324172220105121414136025913104821226891874136651230
SPP60Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp60_262369695021813405071250382874958927061195285214133141902877608122254047046876341302081893392391419668299691037448290111825217331702657221006752241014793847
SPP61Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Tannerellaceae;Tannerella;multispecies_spp61_220415214036354012461315158212243728432132154637801561645112705310219115114841172412634641281117014011431210191612152746192406
SPP62Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp62_2309430143170234653001751101412111164000386352106122070151503846134192992221502211016391416196202221
SPP63Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp63_21035937458805512274216860213222229922233619581173230146047916911210762439729610614519694259130619925791107497210669217021265832792163606174389313328778381511153861709152177245423731326581205866911203403413806967668801031
SPP64Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp64_21412160005004400630110103111021712212094100104311004228248065110180018279060560271100170427133
SPP65Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp65_2145978063436202015447293182031012421195025675109238180832143326314013399398102888136906451246077035210591114110390180653822
SPP66Bacteria;Synergistetes;Synergistia;Synergistales;Synergistaceae;Fretibacterium;multispecies_spp66_4006001200000100002110000330170000100002400360155270102001300330238001622910000127852
SPP67Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp67_212130550174376005440749731870157137403014665270526935041151309138295085127291934280019342813615610107133210106
SPP68Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;multispecies_spp68_47520733029280171282222174321313611113009091079530310628154385410054141124290316100226226822634111136413731
SPP69Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Actinomyces;multispecies_spp69_672629029005482800862802257843395854851035627479168601485112713511613032687910921165625536736980472344895875112243171827789185711754117196174861654337150
SPP7Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp7_3359016316300381150197228226587287309130221318180173037948114960343232642204110019086125172983148116212132719340919854297121364302464533316510015295116322112254
SPP71multidomain;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp71_200100008000400000007100060910040303001200013000125091910116394001912018200071050100
SPP75multidomain;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp75_2140000000000009030412020530190031000000000005014221033007873502700311181000010322
SPP77Bacteria;Firmicutes;Clostridia;Eubacteriales;Peptococcaceae;Peptococcus;multispecies_spp77_252431501150110515242521213122030196101110022134002435932171120472128230014049003102182414
SPP79Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp79_313121002048210253716940016111960607512397643603110038234551010529239413211902101382118436267714100528171110
SPP80Bacteria;Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Streptococcus;multispecies_spp80_1035320415802444531707282113161345257241201821262216258725111814758741075261211968624748344664931775454475134111612161804002191218369365280292158210589185170551191266108171268
SPP83Bacteria;Actinobacteria;Actinomycetia;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;multispecies_spp83_291000000001800271000010005950460157300000100950400073090020000011000019231600288637000715
SPP88Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Porphyromonadaceae;Porphyromonas;multispecies_spp88_2001201226701000763174200114846375060031293892453412580212073419332823439309347533142396864831321815841521052201101664743006718641110728
SPP89Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;multispecies_spp89_2546810001511101900021622410241211120000420132123112300931312521120328793220180554025018115522608
SPP91Bacteria;Proteobacteria;Gammaproteobacteria;Pasteurellales;Pasteurellaceae;Aggregatibacter;multispecies_spp91_22910020222390212108141333235336464327195202301216138530378526101633666231093437371116310371320239103838341321900168125053363132098
SPP92Bacteria;Actinobacteria;Actinomycetia;Actinomycetales;Actinomycetaceae;Schaalia;multispecies_spp92_3164418990107731700814104256308772728215944223334913171635119351216123458911214664010216531312411141804232326571941436721274100932218196854197104013023227464
SPP93Bacteria;Actinobacteria;Actinomycetia;Micrococcales;Micrococcaceae;Kocuria;multispecies_spp93_20210000072000000100000000000000012660001000000000000000000029000000012000
SPP96Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp96_24710001801112185010040420227646001603160126360033180161541016975321537101113108105014721081521
SPP97Bacteria;multiphylum;multiclass;multiorder;multifamily;multigenus;multispecies_spp97_31601507692031901217719054066271106500114014280000011930039336500123607402104302271811119
SPP98Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Atopobiaceae;multigenus;multispecies_spp98_2134331020381254030274114328139917048230263476171745354409916982591912881175063041384181918899633305444311512127147464671831215055696290281363591888336413169270
 
 
Download OTU Tables at Different Taxonomy Levels
Phylum | Count* | Relative** | CLR***
Class | Count* | Relative** | CLR***
Order | Count* | Relative** | CLR***
Family | Count* | Relative** | CLR***
Genus | Count* | Relative** | CLR***
Species | Count* | Relative** | CLR***
* Read count
** Relative abundance (count/total sample count)
*** Centered log ratio transformed abundance
 
The species listed in the table have full taxonomy and a dynamically assigned species ID specific to this report. When some reads match the reference sequences of more than one species equally well (i.e., with the same percent identity and alignment coverage), they cannot be assigned to a particular species. Instead, they are assigned to multiple species with the species notation "s__multispecies_spp2_2". In this notation, spp2 is the dynamic ID assigned to these reads that hit multiple sequences, and the "_2" at the end means there are two species in spp2.

You can look up which species are included in each multi-species assignment in the table below:
 
 
 
 
Another type of notation is "s__multispecies_sppn2_2", in which the "n" in sppn2 means it is a potential novel species, because all the reads in this species have < 98% identity to any of the reference sequences. These reads were grouped together by de novo OTU clustering at a 98% identity cutoff, and a representative sequence was then chosen for a BLASTN search against the reference database to find the closest match (which will still be < 98%). This representative sequence also matched equally well to more than one species, hence the "spp" in the label.
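The two label notations described above can be split into their parts programmatically. The following is a minimal sketch, assuming labels follow exactly the "multispecies_spp<ID>_<count>" or "multispecies_sppn<ID>_<count>" pattern, with an optional "s__" prefix; the function name and the returned field names are hypothetical and not part of the report pipeline.

```python
import re

# Assumed label pattern, based only on the two examples given in the text:
#   s__multispecies_spp2_2   (reads matching several known species equally)
#   s__multispecies_sppn2_2  (potential novel species, < 98% identity)
LABEL_RE = re.compile(r"^(?:s__)?multispecies_spp(n?)(\d+)_(\d+)$")


def parse_multispecies_label(label):
    """Return the parts of a multi-species label, or None if it does not match."""
    match = LABEL_RE.match(label)
    if match is None:
        return None
    novel_flag, dynamic_number, species_count = match.groups()
    return {
        # Dynamic ID as written in the report, e.g. "spp2" or "sppn2"
        "dynamic_id": "spp" + novel_flag + dynamic_number,
        # True when the "n" marker for a potential novel species is present
        "potential_novel": novel_flag == "n",
        # Number of species the reads matched equally well
        "n_species": int(species_count),
    }


if __name__ == "__main__":
    print(parse_multispecies_label("s__multispecies_spp2_2"))
    print(parse_multispecies_label("s__multispecies_sppn2_2"))
```

Labels that do not follow this pattern, such as a regular single-species name, return None, so the helper can be used to filter the species column of a table.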
 
 

Taxonomy Bar Plots for All Samples

 
 

Taxonomy Bar Plots for Individual Comparison Groups

 
 
Comparison No. | Comparison Name | Families | Genera | Species
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 

VIII. Analysis - Alpha Diversity

 

In ecology, alpha diversity (α-diversity) is the mean species diversity in sites or habitats at a local scale. The term was introduced by R. H. Whittaker[5][6] together with the terms beta diversity (β-diversity) and gamma diversity (γ-diversity). Whittaker's idea was that the total species diversity in a landscape (gamma diversity) is determined by two different things, the mean species diversity in sites or habitats at a more local scale (alpha diversity) and the differentiation among those habitats (beta diversity).

 

References:

  1. Whittaker, R. H. (1960) Vegetation of the Siskiyou Mountains, Oregon and California. Ecological Monographs, 30, 279–338. doi:10.2307/1943563
  2. Whittaker, R. H. (1972). Evolution and Measurement of Species Diversity. Taxon, 21, 213-251. doi:10.2307/1218190

 

Alpha Diversity Analysis by Rarefaction

Diversity measures are affected by the sampling depth. Rarefaction is a technique to assess species richness from the results of sampling. Rarefaction allows the calculation of species richness for a given number of individual samples, based on the construction of so-called rarefaction curves. This curve is a plot of the number of species as a function of the number of samples. Rarefaction curves generally grow rapidly at first, as the most common species are found, but the curves plateau as only the rarest species remain to be sampled [7].
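The shape of a rarefaction curve can be illustrated with a short calculation. The sketch below uses the analytical (hypergeometric) rarefaction formula, treating each read as one sampled individual; this is an assumption made for illustration, since the report does not state how its curves were computed, and the function names are hypothetical.

```python
from math import comb


def expected_richness(counts, depth):
    """Expected number of species seen when `depth` reads are drawn
    without replacement from a sample with the given per-species counts."""
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must be between 0 and the total read count")
    all_draws = comb(total, depth)
    # For each species, 1 minus the probability that none of its reads is drawn
    return sum(1 - comb(total - c, depth) / all_draws for c in counts if c > 0)


def rarefaction_curve(counts, step=1):
    """List of (depth, expected richness) points from 0 up to the full depth."""
    total = sum(counts)
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)
    return [(d, expected_richness(counts, d)) for d in depths]


if __name__ == "__main__":
    # Hypothetical sample: two common species and three rare ones
    for depth, richness in rarefaction_curve([50, 30, 3, 2, 1], step=20):
        print(depth, round(richness, 3))
```

With a few dominant species and several rare ones, the expected richness rises quickly at low depth and flattens as only the rare species remain to be detected, which is the plateau described above.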


References:

  1. Willis AD. Rarefaction, Alpha Diversity, and Statistics. Front Microbiol. 2019 Oct 23;10:2407. doi: 10.3389/fmicb.2019.02407. PMID: 31708888; PMCID: PMC6819366.

 
 
 

Boxplot of Alpha-diversity Indices

The two main factors taken into account when measuring diversity are richness and evenness. Richness is a measure of the number of different kinds of organisms present in a particular area. Evenness compares the similarity of the population size of each of the species present. There are many different ways to measure richness and evenness. These measurements are called "estimators" or "indices". Below are box plots of three commonly used indices, showing the values for all the samples (dots) and for each group (boxes).
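The three indices reported in this section (observed features, Shannon, and Simpson) can be computed from a single sample's count vector as sketched below. The logarithm base for Shannon (base 2 here) and the "1 minus sum of squared proportions" form of Simpson are assumptions chosen for illustration; the report does not state which variants were used.

```python
from math import log


def observed_features(counts):
    """Richness: number of species with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts, base=2.0):
    """Shannon index, sensitive to both richness and evenness.
    The log base is an assumption; other tools may use the natural log."""
    total = sum(counts)
    return -sum((c / total) * log(c / total, base) for c in counts if c > 0)


def simpson(counts):
    """Simpson index in the form 1 - sum(p_i^2): the probability that
    two randomly drawn reads belong to different species."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    even = [25, 25, 25, 25]    # four species, perfectly even
    uneven = [97, 1, 1, 1]     # four species, one dominant
    for sample in (even, uneven):
        print(observed_features(sample), shannon(sample), simpson(sample))
```

Both hypothetical samples have the same richness (four species), but the uneven one scores lower on Shannon and Simpson, which is why richness and evenness are reported separately.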

 
Alpha Diversity Box Plots for All Groups
 
 
 
 
 
 
 
Alpha Diversity Box Plots for Individual Comparisons at Species level
 
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | View in PDF | View in SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | View in PDF | View in SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | View in PDF | View in SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 9 | ADULT negative vs ADULT positive | View in PDF | View in SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | View in PDF | View in SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | View in PDF | View in SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | View in PDF | View in SVG
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | View in PDF | View in SVG
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | View in PDF | View in SVG
 
 
 
 

Group Significance of Alpha-diversity Indices

To test whether alpha diversity differs significantly among the comparison groups, we use the Kruskal-Wallis H test provided by the "alpha-group-significance" function in the QIIME 2 "diversity" package. The Kruskal-Wallis H test is the non-parametric alternative to the one-way ANOVA. Non-parametric means that the test does not assume your data come from a particular distribution. The H test is used when the assumptions for ANOVA are not met (such as the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points. The H test determines whether the medians of two or more groups are different.
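To make the "ANOVA on ranks" idea concrete, the following is a minimal pure-Python sketch of the H statistic with the standard tie correction and a chi-square approximation for the p-value. It is an illustration only, not the QIIME 2 implementation; the function names are hypothetical, and the chi-square approximation is known to be rough for very small groups.

```python
from collections import Counter
from math import erfc, exp, gamma, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of numbers."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in Counter(pooled).values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution (integer df)."""
    if x <= 0:
        return 1.0
    q, k = (exp(-x / 2), 2) if df % 2 == 0 else (erfc(sqrt(x / 2)), 1)
    while k < df:
        q += (x / 2) ** (k / 2) * exp(-x / 2) / gamma(k / 2 + 1)
        k += 2
    return q


if __name__ == "__main__":
    # Hypothetical Shannon values for two groups of samples
    groups = [[2.1, 2.4, 2.2, 2.6], [3.0, 3.3, 2.9, 3.5]]
    h = kruskal_wallis_h(groups)
    print(h, chi2_sf(h, df=len(groups) - 1))
```

The p-value is taken from a chi-square distribution with (number of groups - 1) degrees of freedom, so the same function covers both the two-group and the multi-group comparisons listed below.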

Below are the Kruskal Wallis H test results for each comparison based on three different alpha diversity measures: 1) Observed species (features), 2) Shannon index, and 3) Simpson index.

 
 
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 9 | ADULT negative vs ADULT positive | Observed Features | Shannon Index | Simpson Index
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | Observed Features | Shannon Index | Simpson Index
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | Observed Features | Shannon Index | Simpson Index
 
 

IX. Analysis - Beta Diversity

 

NMDS and PCoA Plots

Beta diversity compares the similarity (or dissimilarity) of microbial profiles between different groups of samples. There are many different similarity/dissimilarity metrics [8]. In general, they can be quantitative (using sequence abundance, e.g., Bray-Curtis or weighted UniFrac) or binary (considering only the presence or absence of sequences, e.g., binary Jaccard or unweighted UniFrac). They can also be phylogeny-based (e.g., UniFrac metrics) or not (e.g., Bray-Curtis).

For microbiome studies, species profiles of samples can be compared with the Bray-Curtis dissimilarity, which is based on the count data type. The pair-wise Bray-Curtis dissimilarity matrix of all samples can then be subject to either multi-dimensional scaling (MDS, also known as PCoA) or non-metric MDS (NMDS).
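As an illustration of the pair-wise matrix described above, the sketch below computes the Bray-Curtis dissimilarity between count profiles. The sample data and function names are hypothetical; in the actual analysis this matrix would be computed over all species in all samples.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count profiles of equal length.
    0 means identical profiles; 1 means no shared species."""
    if len(x) != len(y):
        raise ValueError("profiles must list the same species in the same order")
    denominator = sum(x) + sum(y)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator


def pairwise_matrix(samples, metric):
    """Symmetric matrix of pair-wise dissimilarities between samples."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = metric(samples[i], samples[j])
    return matrix


if __name__ == "__main__":
    # Hypothetical read counts for three samples over four species
    samples = [[6, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5]]
    for row in pairwise_matrix(samples, bray_curtis):
        print(row)
```

The resulting matrix is the input that an ordination method such as PCoA or NMDS would take.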

MDS/PCoA is a scaling or ordination method that starts with a matrix of similarities or dissimilarities between a set of samples and aims to produce a low-dimensional graphical plot of the data in such a way that distances between points in the plot are close to original dissimilarities.

NMDS is similar to MDS; however, it does not use the dissimilarity values directly. Instead, it converts them into ranks and uses these ranks in the calculation.

In our beta diversity analysis, the Bray-Curtis dissimilarity matrix was first calculated and then plotted by PCoA and NMDS separately. Below are the beta diversity results for all groups together:

References:

  1. Plantinga, AM, Wu, MC (2021). Beta Diversity and Distance-Based Analysis of Microbiome Data. In: Datta, S., Guha, S. (eds) Statistical Analysis of Microbiome Data. Frontiers in Probability and the Statistical Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-73351-3_5

 
 
NMDS and PCoA Plots for All Groups
 
 
 
 
 

The above PCoA and NMDS plots are based on count data. The count data can also be transformed into centered log ratios (CLR) for each species. CLR data are no longer count data and cannot be used in the Bray-Curtis dissimilarity calculation. Instead, CLR-transformed samples can be compared using Euclidean distance. When CLR data are compared by Euclidean distance, the distance is also called the Aitchison distance.
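The CLR transformation and the resulting Aitchison distance can be sketched as follows. Because the logarithm of zero is undefined, this sketch adds a pseudocount to every count before transforming; the pseudocount value and the function names are assumptions for illustration, as the report does not state how zero counts were handled.

```python
from math import log, sqrt


def clr(counts, pseudocount=1.0):
    """Centered log ratio: log of each value minus the mean of the logs,
    which equals the log of each value divided by the geometric mean."""
    logs = [log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(x, y, pseudocount=1.0):
    """Euclidean distance between the CLR-transformed profiles."""
    cx, cy = clr(x, pseudocount), clr(y, pseudocount)
    return sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))


if __name__ == "__main__":
    sample_a = [120, 30, 0, 50]   # hypothetical read counts
    sample_b = [60, 90, 10, 40]
    print(clr(sample_a))
    print(aitchison_distance(sample_a, sample_b))
```

CLR values within a sample always sum to zero, and without a pseudocount the distance is unchanged when one sample is sequenced twice as deeply, which is the property that makes it suitable for compositional data.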

Below are the NMDS and PCoA plots of the Aitchison distances of the samples:

 
 
 
 
 
 
 
NMDS and PCoA Plots for Individual Comparisons at Species level
 
 
Comparison No. | Comparison Name | NMDS (Bray-Curtis) | NMDS (CLR Euclidean) | PCoA (Bray-Curtis) | PCoA (CLR Euclidean)
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG | PDF, SVG
 
 
 
 
 
 

Interactive 3D PCoA Plots - Bray-Curtis Dissimilarity

 
 
 

Interactive 3D PCoA Plots - Euclidean Distance

 
 
 

Interactive 3D PCoA Plots - Correlation Coefficients

 
 
 

Group Significance of Beta-diversity Indices

To test whether the between-group dissimilarities are significantly greater than the within-group dissimilarities, the "beta-group-significance" function provided in the QIIME 2 "diversity" package was used with PERMANOVA (permutational multivariate analysis of variance) as the group significance testing method.

Three beta diversity metrics were used: 1) Bray–Curtis dissimilarity, 2) correlation coefficient matrix, and 3) Aitchison distance (Euclidean distance between CLR-transformed compositions).
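The logic of PERMANOVA can be sketched in a few lines: a pseudo-F statistic compares between-group to within-group variation computed directly from the distance matrix, and its significance is judged by repeatedly shuffling the group labels. The code below is a simplified illustration under that description, not the QIIME 2 implementation; the function names, the default of 999 permutations, and the fixed random seed are assumptions.

```python
import random


def pseudo_f(dist, labels):
    """Pseudo-F statistic from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pair-wise squared distances
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dist[i][j] ** 2 for p, i in enumerate(idx) for j in idx[p + 1:]
        ) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical 1-D positions; distances are absolute differences
    points = [0, 1, 2, 10, 11, 12]
    dist = [[abs(a - b) for b in points] for a in points]
    print(permanova(dist, ["A", "A", "A", "B", "B", "B"]))
```

With very small groups only a limited number of distinct label arrangements exist, so the smallest achievable p-value is bounded regardless of how well separated the groups are.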

 
 
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 9 | ADULT negative vs ADULT positive | Bray–Curtis | Correlation | Aitchison
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | Bray–Curtis | Correlation | Aitchison
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | Bray–Curtis | Correlation | Aitchison
 
 
 

X. Analysis - Differential Abundance

16S rRNA next generation sequencing (NGS) generates a fixed number of reads that reflect the proportion of different species in a sample, i.e., the relative abundance of species, instead of the absolute abundance. In mathematics, measurements involving probabilities, proportions, percentages, and ppm can all be thought of as compositional data. This makes the microbiome read count data “compositional” (Gloor et al., 2017). In general, compositional data represent parts of a whole which only carry relative information [9].

The compositional nature of microbiome data becomes a problem when comparing two groups of samples to identify “differentially abundant” species. Even if a species has the same absolute abundance in two conditions, its relative abundance (e.g., percent abundance) can differ between the conditions if the abundances of other species change greatly. This can lead to incorrect conclusions about the differential abundance of microbial species in the samples.
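A small worked example makes this effect visible. In the hypothetical numbers below, species A has the same absolute abundance in both conditions, and only species B changes; the species names and counts are invented for illustration.

```python
def relative_abundance(counts):
    """Convert absolute counts per species into fractions of the sample total."""
    total = sum(counts.values())
    return {species: value / total for species, value in counts.items()}


# Hypothetical absolute abundances: species A is unchanged between conditions
condition_1 = {"species_A": 100, "species_B": 100}
condition_2 = {"species_A": 100, "species_B": 900}

if __name__ == "__main__":
    print(relative_abundance(condition_1))  # species_A makes up 50% of the sample
    print(relative_abundance(condition_2))  # species_A makes up 10% of the sample
```

Judged by relative abundance alone, species A appears to drop five-fold, although its absolute abundance did not change at all.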

When studying differential abundance (DA), the currently preferred approach is to transform the read count data into log-ratio data. The ratios are calculated between the read count of each species in a sample and a “reference” count (e.g., the mean read count of the sample). The log-ratio data allow the detection of DA species without being affected by the percentage bias mentioned above.

In this report, a compositional DA analysis tool, “ANCOM” (analysis of composition of microbiomes), was used [10]. ANCOM transforms the count data into log-ratios and is thus more suitable for comparing the composition of microbiomes in two or more populations. ANCOM generates a table of features with W-statistics and an indication of whether the null hypothesis is rejected. The W-statistic for a feature is the number of other features against which that feature tests as significantly different. Hence, the higher the “W”, the stronger the evidence that a feature/species is differentially abundant.

References:

  1. Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome Datasets Are Compositional: And This Is Not Optional. Front Microbiol. 2017 Nov 15;8:2224. doi: 10.3389/fmicb.2017.02224. PMID: 29187837; PMCID: PMC5695134.
  2. Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microb Ecol Health Dis. 2015 May 29;26:27663. doi: 10.3402/mehd.v26.27663. PMID: 26028277; PMCID: PMC4450248.
 
 

ANCOM Differential Abundance Analysis

 
ANCOM Results for Individual Comparisons
Comparison No. | Comparison Name
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 9 | ADULT negative vs ADULT positive
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER
 
 

ANCOM-BC2 Differential Abundance Analysis

 

Starting with version V1.2, we include the results of ANCOM-BC (Analysis of Compositions of Microbiomes with Bias Correction) (Lin and Peddada 2020) [11]. ANCOM-BC is an updated version of "ANCOM" that:
(a) provides a statistically valid test with appropriate p-values,
(b) provides confidence intervals for differential abundance of each taxon,
(c) controls the False Discovery Rate (FDR),
(d) maintains adequate power, and
(e) is computationally simple to implement.

The bias correction (BC) addresses a challenging problem of the bias introduced by differences in the sampling fractions across samples. This bias has been a major hurdle in performing DA analysis of microbiome data. ANCOM-BC estimates the unknown sampling fractions and corrects the bias induced by their differences among samples. The absolute abundance data are modeled using a linear regression framework.

Starting with version V1.43, ANCOM-BC2 is used instead of ANCOM-BC, so that multiple pairwise directional tests can be performed (if there are more than two groups in a comparison). When performing pairwise directional tests, the mixed directional false discovery rate (mdFDR) is taken into account. The mdFDR is the combination of the false discovery rate due to multiple testing, multiple pairwise comparisons, and directional tests within each pairwise comparison. The mdFDR is adopted from Guo, Sarkar, and Peddada 2010 [12] and Grandhi, Guo, and Peddada 2016 [13]. For a more detailed explanation and additional features of ANCOM-BC2, please see the author's documentation.

References:

  1. Lin H, Peddada SD. Analysis of compositions of microbiomes with bias correction. Nat Commun. 2020 Jul 14;11(1):3514. doi: 10.1038/s41467-020-17041-7. PMID: 32665548; PMCID: PMC7360769.
  2. Guo W, Sarkar SK, Peddada SD. Controlling false discoveries in multidimensional directional decisions, with applications to gene expression data on ordered categories. Biometrics. 2010 Jun;66(2):485-92. doi: 10.1111/j.1541-0420.2009.01292.x. Epub 2009 Jul 23. PMID: 19645703; PMCID: PMC2895927.
  3. Grandhi A, Guo W, Peddada SD. A multiple testing procedure for multi-dimensional pairwise comparisons with application to gene expression studies. BMC Bioinformatics. 2016 Feb 25;17:104. doi: 10.1186/s12859-016-0937-5. PMID: 26917217; PMCID: PMC4768411.
 
 
ANCOM-BC2 Results for Individual Comparisons
 
Comparison No. | Comparison Name
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 9 | ADULT negative vs ADULT positive
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER
 
 
 
 
 

LEfSe - Linear Discriminant Analysis Effect Size

LEfSe (Linear Discriminant Analysis Effect Size) is an alternative method to find "organisms, genes, or pathways that consistently explain the differences between two or more microbial communities" (Segata et al., 2011) [14]. Specifically, LEfSe uses the rank-based Kruskal-Wallis (KW) sum-rank test to detect features with significant differential (relative) abundance with respect to the class of interest. Because the test is rank-based rather than proportion-based, the differentially abundant species identified among the comparison groups are less biased than those identified from percent abundance.

Reference:

  1. Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, Huttenhower C. Metagenomic biomarker discovery and explanation. Genome Biol. 2011 Jun 24;12(6):R60. doi: 10.1186/gb-2011-12-6-r60. PMID: 21702898; PMCID: PMC3218848.
 
CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
 
 
 
 
 
 
 
LEfSe Results for All Comparisons
 
Comparison No. | Comparison Name
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 9 | ADULT negative vs ADULT positive
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER
 
 

XI. Analysis - Heatmap Profile

 

Species vs Sample Abundance Heatmap for All Samples

 
 
 

Heatmaps for Individual Comparisons

 
A) Two-way clustering - clustered on both columns (Samples) and rows (organism)
Comparison No. | Comparison Name | Family Level | Genus Level | Species Level
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 
B) One-way clustering - clustered on rows (organism) only
Comparison No. | Comparison Name | Family Level | Genus Level | Species Level
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 
C) No clustering
Comparison No. | Comparison Name | Family Level | Genus Level | Species Level
Comparison 1 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 2 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 3 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 4 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 5 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 6 | CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 7 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 8 | CHILD HIV-Neg SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 9 | ADULT negative vs ADULT positive | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 10 | ADULT NON-SHEDDER vs ADULT SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 11 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 12 | ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 13 | ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER vs ADULT HIV-Pos SHEDDER vs ADULT HIV-Pos NON-SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 14 | CHILD HIV-Neg SHEDDER vs CHILD HIV-Neg NON-SHEDDER vs ADULT HIV-Neg NON-SHEDDER vs ADULT HIV-Neg SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
Comparison 15 | CHILD NON-SHEDDER vs CHILD SHEDDER | PDF, SVG | PDF, SVG | PDF, SVG
 
 

XII. Analysis - Network Association

Network correlation analysis is commonly used to examine co-occurrence or co-exclusion between microbial species across samples. However, microbiome count data are compositional. If counts are normalized to the total number of counts in the sample, the resulting values are no longer independent, and traditional statistical metrics (e.g., correlation) for detecting species-species relationships can produce spurious results. In addition, sequencing-based studies typically measure hundreds of OTUs (species) on few samples; thus, inference of OTU-OTU association networks is severely under-powered. Here we use SPIEC-EASI (SParse InversE Covariance Estimation for Ecological Association Inference), a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets that addresses both of these issues (Kurtz et al., 2015) [15]. SPIEC-EASI combines data transformations developed for compositional data analysis with a graphical model inference framework that assumes the underlying ecological association network is sparse. SPIEC-EASI provides two algorithms for network inference: 1) Meinshausen-Bühlmann's neighborhood selection (MB method) and 2) inverse covariance selection (GLASSO method, i.e., graphical least absolute shrinkage and selection operator). In addition to these two methods, we provide the results of a third method, SparCC (Sparse Correlations for Compositional Data) (Friedman & Alm, 2012) [16], which is also a method for inferring correlations from compositional data. SparCC estimates the linear Pearson correlations between the log-transformed components. The SPIEC-EASI approach is fundamentally distinct from SparCC, which essentially estimates pairwise correlations.

References:

  1. Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ, Bonneau RA. Sparse and compositionally robust inference of microbial ecological networks. PLoS Comput Biol. 2015 May 7;11(5):e1004226. doi: 10.1371/journal.pcbi.1004226. PMID: 25950956; PMCID: PMC4423992.
  2. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687. doi: 10.1371/journal.pcbi.1002687. Epub 2012 Sep 20. PMID: 23028285; PMCID: PMC3447976.
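
As an illustration of the compositional data transformations mentioned above, the following minimal Python sketch shows the centered log-ratio (CLR) transform and the variance of a log-ratio between two taxa, the quantity on which SparCC is built. It is not the code used to produce this report: the function names, the example counts, and the pseudocount of 1 are assumptions made for illustration, and the SPIEC-EASI R package may handle these details differently.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio (CLR) transform of one sample's taxon counts.

    The pseudocount (assumed to be 1 here) avoids log(0) for absent taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def log_ratio_variance(x, y):
    """Sample variance of log(x_i / y_i) across samples for two taxa.

    Both inputs must be strictly positive and of equal length.
    """
    ratios = [math.log(a / b) for a, b in zip(x, y)]
    mean = sum(ratios) / len(ratios)
    return sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1)


# Hypothetical counts for four taxa in one sample.
sample = [120, 30, 0, 850]
transformed = clr(sample)

# CLR values are centered: they sum to zero up to rounding error.
print(abs(sum(transformed)) < 1e-9)

# Without a pseudocount, CLR does not depend on sequencing depth:
# doubling every count leaves the transformed values unchanged.
shallow = clr([10, 20, 70], pseudocount=0.0)
deep = clr([20, 40, 140], pseudocount=0.0)
print(all(abs(a - b) < 1e-9 for a, b in zip(shallow, deep)))
```

Because log-ratios cancel the per-sample total, two taxa whose abundances stay in a fixed proportion across samples have a log-ratio variance of zero regardless of how deeply each sample was sequenced.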
 

SPIEC-EASI Network Inference by Neighborhood Selection (MB Method)

Association Network Inference by SparCC

XIII. Disclaimer

The results of this analysis are for research purposes only. They are not intended to diagnose, treat, cure, or prevent any disease. Forsyth and FOMC are not responsible for any use of the information provided in this report outside the research area.

 

Copyright FOMC 2025