diff --git a/labs/lab_8.md b/labs/lab_8.md
index c9fc64096..37131dbd5 100644
--- a/labs/lab_8.md
+++ b/labs/lab_8.md
@@ -148,6 +148,148 @@ Loglikelihood: -712272.711731
 Writing output files.
 ````
 
+It has now produced two output files: chinook.g5.m2.c5.d5.20miss.subsample.nchrs.1.Q and 
+chinook.g5.m2.c5.d5.20miss.subsample.nchrs.1.P. The Q file includes the ancestry of each
+sample for each group. In this case, there's only one group, but this will be more useful with
+higher K values. The P file estimates allele frequencies for each SNP in each population, which 
+isn't useful for us now. 
+
+Ultimately for this type of algorith, we want to try multiple K values and find which one is best. To
+do that we use a cross- validation method. In this method, they mask some genotype values, and try
+to predict what they are, based on the groupings. Lower cross-validation error means the model
+is a better fit. Lets run this for K from 1 to 10. 
+
+```bash
+for K in `seq 10`
+  do
+  admixture --cv chinook.g5.m2.c5.d5.20miss.subsample.nchrs.bed $K -j3 | tee log${K}.out;
+done
+```
+The "tee" command is allowing us to save the output that is output to the screen into a file. 
+We need those files to compare the CV error for each. Lets look at all the CV error scores.
+
+```bash
+grep -h CV log*.out
+```
+
+```output
+CV error (K=10): 1.00561
+CV error (K=1): 0.46015
+CV error (K=2): 0.47496
+CV error (K=3): 0.53477
+CV error (K=4): 0.60377
+CV error (K=5): 0.66514
+CV error (K=6): 0.72932
+CV error (K=7): 0.79938
+CV error (K=8): 0.86884
+CV error (K=9): 0.94260
+```
+
+From this we can see that the lowest CV is with with K=1, and K=2 is only slightly worse. This
+suggests that there isn't much population grouping in our dataset. It's always good to look at
+how your groups are distributed amongst your samples and see if it makes sense biologically. To
+do that, we have plot the Q values for each K. We're going to use R for this, so open the R 
+studio server window on indri.
+
+```R
+library(ggplot2)
+library(dplyr)
+library(readr)
+library(tidyr)
+
+#Get sample names in order from the .fam 
+sample_names <- read_table("lab_8/vcf/chinook.g5.m2.c5.d5.20miss.subsample.nchrs.fam",col_names = F) %>%
+  select(X1) %>%
+  rename(sample.id=X1)
+all_data <- tibble()
+
+#Loop for each K value
+for (k in 1:10){
+  #Read the Q file 
+  data <- read_delim(paste0("lab_8/vcf/chinook.g5.m2.c5.d5.20miss.subsample.nchrs.",k,".Q"),
+                     col_names = paste0("Q",seq(1:k)),
+                     delim=" ")
+  #Add in sample names
+  data$sample.id <- sample_names$sample.id
+  #Add in the K value
+  data$k <- k
+  
+  #Convert the wide table to a long table, which is easier to plot
+  data %>% gather(Q, value, -sample.id,-k) -> data
+  #Bind together all the outputs of each K value
+  all_data <- rbind(all_data,data)
+}
+```
+Before we do the plotting, lets get some information about the samples. Thankfully,
+all the information is in the sample name. e.g. Otsh_Chilcotin_2018_10.merged.dedup.bam.
+-Otsh <- species abbreviation
+-Chilcotin <- population
+-2018 <- sampling year
+-10 <- sample number.
+
+We probably will want to plot by year and population, so lets extract that into our table.
+
+```R
+all_data <- all_data %>%
+  separate(sample.id, c("species","pop","year","spacer"), "_", remove=F ) %>%
+  separate(spacer, c("sample_n"),remove=T)
+```
+And now, lets plot for K=2
+```R
+all_data %>%
+  filter(k == 2) %>%
+  ggplot(.,aes(x=fct_reorder(sample.id,pop),y=value,fill=factor(Q))) + 
+  geom_bar(stat="identity",position="stack") +
+  theme(axis.text.x = element_text(angle = 60, hjust = 1))
+```
+![](../figs/lab_8.1.png)
+
+This looks interesting! We're sorting the x axis by the population label, so 
+samples are ordered by population. We can see that the first half samples are mostly group 1
+and the second half are mostly group 2. This is evidence that those populations reflect
+actual genetically separated populations because admixture is rediscovering the groupings,
+without being given the labels. We can also explore what the groupings look like at higher levels
+of K.
+![](../figs/lab_8.2.png)
+
+## Questions
+- Make a nice version of the plot for K=2. Update the axis labels, colors, theme, title and sample names.
+- Make a version of the plot with K=2, where samples are ordered by their ancestry proportion.
+- Try running admixture with K=10 again and see if you get identical scores. What does this mean?
+
+Next, lets do a PCA on the samples to see how they are related. We're going to use plink for this, 
+so go back into the terminal.
+
+```bash
+plink --vcf chinook.g5.m2.c5.d5.20miss.subsample.nchrs.vcf --out chinook.g5.m2.c5.d5.20miss.subsample.nchrs --pca --allow-extra-chr --double-id --autosome-num 95
+```
+This outputs a .eigenvec and a .eigenval file. The .eigenvec file has the scores for the
+first 20 Principal Components, which we want to plot. We have to load this
+into R to plot it.
+
+```R
+pca_data <- read_table("lab_8/vcf/chinook.g5.m2.c5.d5.20miss.subsample.nchrs.eigenvec",
+         col_names = c("sample.id","spacer",paste0("PC",1:20)))
+```
+In this case, the data file does not have a header, so we have to tell it what the columns are named.
+The command "paste0("PC",1:20)", makes the names PC1, PC2 ... PC20. 
+
+Now we want to make a scatter plot of these scores.
+```R
+pca_data %>%
+  ggplot(.,aes(x=PC1,y=PC2)) +
+  geom_point()
+```
+![](../figs/lab_8.3.png)
+
+## Questions
+- Make a version of this plot, with population as point color, and year as point shape.
+- Add ellipses around the population groups to show their range of values.
+
+Lastly, we want to calculate FST between our populations. 
+
+
+