Prabhjot Sanghera, François Belzile, Waldiodio Seck and Pierre Dutilleul*
This study was motivated by the need to select 30 soybean lines out of 137 for a sophisticated 3-D phenotyping analysis of Root System Architecture (RSA), a procedure that does not allow all the lines to be included and replicated. A representative subset of 30 lines was obtained by performing four cluster analyses and comparing the results of two of them in particular. These two cluster analyses used, respectively, the data for 12 RSA-related traits previously collected in 2-D on three replicates of the 137 soybean lines, and the first six principal components, which represent 95% of the total dispersion after data standardization in a preliminary Principal Component Analysis (PCA). In both analyses, 16 soybean lines were the closest to the centroid of their respective cluster. Fourteen more lines were common to both analyses and lay within a pre-set threshold distance of their centroid without being the closest. The final selection of 30 excludes two soybean lines that were the second member selected from their cluster and includes instead two lines that are the closest and second closest to their respective centroid in the cluster analysis after PCA on standardized data, but are not well represented in the other cluster analysis. In conclusion, the 93.3% overlap between the two sets of results indicates a robust clustering structure in 2-D RSA phenotyping of soybean. Our statistical approaches and procedures can be applied in biological frameworks other than plant phenotyping.
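The selection logic summarized above (standardize, reduce by PCA to the components covering 95% of the dispersion, cluster, then keep the line closest to each centroid) can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not name the clustering algorithm, the number of clusters, or the distance threshold, so the sketch assumes a plain k-means, a placeholder `n_clusters = 16`, Euclidean distances, and synthetic data standing in for the 137 x 12 table of trait values (assumed here to be averaged over replicates). All function and variable names are hypothetical.

```python
import numpy as np

def standardize(X):
    """Center each trait and scale it to unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca_scores(Z, var_target=0.95):
    """PCA via SVD on standardized data; keep the fewest components
    whose cumulative share of the total variance reaches var_target."""
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    k = min(k, len(s))
    return U[:, :k] * s[:k], explained[:k]

def kmeans(X, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's k-means (an assumption; the source does not name
    the clustering method). Returns labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(n_clusters)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Recompute labels so they match the returned centroids.
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1), centroids

def closest_per_cluster(X, labels, centroids):
    """Index of the observation nearest to each non-empty cluster centroid."""
    picks = []
    for j, c in enumerate(centroids):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        d = np.linalg.norm(X[members] - c, axis=1)
        picks.append(int(members[d.argmin()]))
    return picks

# Synthetic stand-in for the real data: 137 lines x 12 traits.
rng = np.random.default_rng(42)
traits = rng.normal(size=(137, 12))

Z = standardize(traits)
scores, explained = pca_scores(Z, var_target=0.95)

n_clusters = 16  # placeholder value, not taken from the study
labels_traits, cent_traits = kmeans(Z, n_clusters)
labels_pca, cent_pca = kmeans(scores, n_clusters)

picks_traits = closest_per_cluster(Z, labels_traits, cent_traits)
picks_pca = closest_per_cluster(scores, labels_pca, cent_pca)

# Share of the PCA-based picks that the trait-based analysis also selects.
shared = set(picks_traits) & set(picks_pca)
overlap = len(shared) / len(picks_pca)
print(f"components kept: {scores.shape[1]}, overlap: {overlap:.1%}")
```

With real data, the second step described in the abstract would extend `closest_per_cluster` to also accept lines whose distance to the centroid falls below a chosen threshold. The reported 93.3% overlap is consistent with 28 of the 30 selected lines being shared between the two analyses.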
Journal of Biometrics & Biostatistics received 3254 citations as per Google Scholar report