# Molecular dating (from Stephen Nylinder's BEAST course, Manaus 2016)

Set up a standardized analysis model for the Concat.nex data. Identify the clades (A, B, H) and (D, F) in the taxa panel. Set the substitution model to HKY+G with estimated base frequencies. Set an uninformative prior on the root (uniform, 10-100), then add an uncorrelated lognormal relaxed clock with an estimated rate. Define a standard birth-death (B/D) tree prior with the default birth and death distributions. MCMC: 2M generations, sampling every 2000 generations.
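
As a quick sanity check on the MCMC settings above, the following sketch counts how many states end up in the log file. It assumes that state 0 is logged as well and that 10% of the states are discarded as burn-in; both are assumptions for illustration, not part of the exercise, and the function name is hypothetical.

```python
def logged_states(chain_length: int, log_every: int, burnin_fraction: float = 0.10):
    """Return (states written to the log, states kept after burn-in)."""
    # +1 assumes the initial state (state 0) is written to the log too
    total = chain_length // log_every + 1
    # Assumed burn-in: drop the first 10% of logged states
    discarded = int(total * burnin_fraction)
    return total, total - discarded


total, kept = logged_states(2_000_000, 2_000)
print(f"logged: {total}, kept after burn-in: {kept}")
```

If the effective sample sizes turn out low with this many samples, a longer chain is usually a better fix than denser sampling.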

- (a) For the root height, set a uniform distribution with a lower bound of 10 Ma (the minimum fossil age) and a hard upper bound of 100 Ma (corresponding to the fossil age plus the oldest known age of the group the organisms belong to). If you get a -Inf error, just try executing the file again. Summarize the trees in TreeAnnotator and check the node heights and their uncertainties across the tree (the "subjective Bayesian" approach). What does the posterior distribution on the root look like? Why?
- (b) Same as (a), but change the root prior to an exponential distribution with offset 10.0 and mean 24.4 (95% prior interval of roughly 10-100). This represents a declining probability of older ages. Compare the root height and all node uncertainties with (a). Are they larger or smaller?
- (c) Same again, but this time change the root prior to a normal distribution with mean 10.0 and SD 40.2, and truncate the distribution between lower: 10.0 and upper: 100. Look at the shape of the distribution and compare it with (a) and (b). Run the analysis and compare the node uncertainties with (a) and (b).
- (d) If (a), (b) and (c) are different ways of modeling the uncertainty of a clade to which a fossil belongs, which one feels the most "realistic" to you?
- (e) Change the root prior back to (a). Add an informative prior on the clock rate (exponential, mean = 0.0001). Run and compare the traces with (a). What happens to the posterior for the clock rate (ucld.mean)? What happens to the root age distribution? Explain the difference.
- (f) Use the same settings as in (a), but change the clock model to an uncorrelated exponential relaxed clock. Run the analysis and compare the log files with (a). Look at the root height, the clade ages, and the clock rates. Explain the effect. Remember that you only changed the clock model; the priors on the clock rates are uninformative in both (a) and (f).
- (g) Same as (f), but change the clock model to a random local clock. What are the effects? What about computation time?
- (h) For clade (D, F), the origin of the clade can be correlated with a geological event: the split between two separate landmasses. The beginning of the split is dated to approximately 3 Ma, and the landmasses were fully separated at 1 Ma. By 0.1 Ma the landmasses were far apart. Where do you place this calibration (stem or crown?), and how would you model the uncertainty of the event? Based on the provided knowledge only, what is your prediction for the age of the root and its level of uncertainty?
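
To see how the three root priors from the first exercises differ before running anything, their quantiles can be computed directly. The sketch below is only an illustration: it assumes the exponential is parameterized by the mean of the part above the offset (so the quantile is offset - mean * ln(1 - p)) and that the truncated normal is simply renormalized between its bounds. The function names are hypothetical.

```python
import math
from statistics import NormalDist

LOWER, UPPER = 10.0, 100.0
PROBS = (0.025, 0.5, 0.975)


def uniform_quantile(p, lo=LOWER, hi=UPPER):
    """Quantile of the uniform(10, 100) root prior."""
    return lo + p * (hi - lo)


def offset_exponential_quantile(p, offset=10.0, mean=24.4):
    """Quantile of the exponential root prior, shifted by the offset.

    Assumes `mean` refers to the distribution before the offset is added.
    """
    return offset - mean * math.log(1.0 - p)


def truncated_normal_quantile(p, mu=10.0, sd=40.2, lo=LOWER, hi=UPPER):
    """Quantile of the normal root prior, renormalized on [lo, hi]."""
    dist = NormalDist(mu, sd)
    c_lo, c_hi = dist.cdf(lo), dist.cdf(hi)
    # Map p into the slice of probability mass that survives truncation
    return dist.inv_cdf(c_lo + p * (c_hi - c_lo))


priors = {
    "uniform": uniform_quantile,
    "offset exponential": offset_exponential_quantile,
    "truncated normal": truncated_normal_quantile,
}

for name, quantile in priors.items():
    values = ", ".join(f"{quantile(p):6.1f}" for p in PROBS)
    print(f"{name:>20}: {values}  (2.5%, median, 97.5%)")
```

Under these assumptions the 97.5% quantile of the offset exponential lands at about 100, which matches the interval quoted in the exercise; the medians of the three priors are nevertheless quite different.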

P.S. Did you run any of the analyses on empty data (sampling from the prior only) to check that you got your priors back?
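
One way to make that check concrete is to summarize the root height samples from a prior-only run and set them against the prior you specified. The sketch below uses simulated draws as a stand-in for the root height column of a real log file, so every name and number in it is an assumption for illustration; it also assumes the offset exponential parameterization used above (mean 24.4 before adding the offset of 10).

```python
import math
import random


def summarize(samples):
    """Return the mean and an approximate central 95% interval of the samples."""
    ordered = sorted(samples)
    n = len(ordered)

    def quantile(p):
        # Simple empirical quantile; clamps the index to the last element
        return ordered[min(n - 1, int(p * n))]

    return {
        "mean": sum(ordered) / n,
        "lower": quantile(0.025),
        "upper": quantile(0.975),
    }


# Stand-in for the root height column of a prior-only log file (hypothetical data)
rng = random.Random(1)
fake_root_heights = [10.0 + rng.expovariate(1.0 / 24.4) for _ in range(20_000)]

observed = summarize(fake_root_heights)
expected = {
    "mean": 10.0 + 24.4,
    "lower": 10.0 - 24.4 * math.log(1.0 - 0.025),
    "upper": 10.0 - 24.4 * math.log(1.0 - 0.975),
}

for key in ("mean", "lower", "upper"):
    print(f"{key:>5}: sampled {observed[key]:6.1f}   specified {expected[key]:6.1f}")
```

In a real prior-only run the two columns need not agree: the tree prior and any other calibrations can interact with the root prior, and spotting that mismatch is the point of the check.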