Sunday, 25 December 2016

It's been a while...

Since starting my PhD back in October 2013, I've managed to keep up a steady stream of posts. The last four months of my PhD did not make for great blog posts, however. I wrote chapters, handed them to supervisors, waited, and then made corrections and sent them back. The process was steady and uneventful. Having said that, there were events in my life that sometimes made writing and motivation difficult. I had to move house, a friend died, and unpleasant events from the past reared their ugly heads. Of course these things will happen when you least need them to, so you have to be prepared for them by planning and being organised. Happily, I handed in my thesis on the 31st of October.

A few tips:
  • Give your supervisors as much time as possible to read stuff, as they have other commitments aside from you.
  • After sending your supervisors things to read, regularly remind them politely to read your stuff. Don't leave it till the last minute and then blame them or get all huffy and disrespectful.
  • Do the corrections one at a time, as they come back, on your working copy. You can occasionally (not too often) send around a revision. If you wait until you have all comments back before applying them, you will delay your submission date!
  • Let your supervisors know if there is any part in particular that you would like feedback on.
  • A completed thesis is a big file - don't compile chapters until you really have to!
  • Warn the printers that you will be printing your thesis, so that they can advise you if they are expecting a big workload on that day.
I took the advice to start writing early, and have been writing regularly since the start of my PhD. That includes writing for fun, writing blog posts, and writing science communication articles. I think that has helped me to get into the right mindset, and not procrastinate too much or be afraid to write.

This blog has been helpful in consolidating my thoughts, and logging my progress. Many of the posts now seem silly: I was so worried about aspects of my progress, not making enough progress, procrastinating too much. None of that mattered in the end. Of course, my thesis hasn't been marked yet, and I haven't defended it, so the outcome is still uncertain. But I managed to produce within 3 years and 1 month a thesis that my supervisors are happy with - despite the fact that I procrastinated, travelled, had issues, had a life, had hobbies and interests. In fact, I will dare to suggest that taking time out to do healthy human things assisted the process of writing. I have really enjoyed doing my PhD. My only regret is that I did not collaborate more on projects with other researchers; one must set up quite a few collaborations in order to increase the likelihood that some of them work.

I will add to this blog when I have defended my thesis, but until then, adios!

Saturday, 9 July 2016

The 20th International Congress of Arachnology

Four months before I started my PhD, I went to my first ever conference: the 19th International Congress of Arachnology (ICA) in Kenting, Taiwan. I presented a talk, and connected with a bunch of fellow idiopid researchers to share ideas. That conference had a massive influence on the start of my academic career. It also made me realise what a truly wonderful bunch of people the arachnologists are, and what an exciting world I was getting into.

Now, four months before I plan on submitting my thesis, I have just had an exhilarating week at the 20th Congress, in Golden, Colorado. It was the largest assembly of arachnologists the world has ever known, and there were some inspiring presentations and discussions. My head is still reeling from it all. It was perfect timing for me; I now have some ideas about research fields and grants to apply for, and also some pointers on how to improve my thesis. Although right now I have my head buried firmly in the writing up process (hence my silence for the last couple of months), it was well worth coming. Life doesn't end when a thesis is handed in! There is plenty to do to build my career. Conferences and networking are fuel for a baby scientist's growth, especially in a close-knit community such as the arachnologist community.

I won't bore you with a day-by-day text wall, but here are a few cool things that I picked up from the human and arachnid components of the conference.

1. My usual strategy of orbiting the crowd and then diving in to hit a weak spot has, for the last few conferences, been made more successful by befriending someone similar to me on the first day. One looks a little less predatory and a little more social if one has a friend or two.

2. Having said that, I am always surprised that there are plenty of people who are fine with being on their own, and they don't seem like losers or unpopular people. It is fine at conferences to be by yourself sometimes, as long as you don't do it too much (when the conference is over, you don't want to have missed opportunities).

3. The best time to network is at mealtimes, when you find a space, sit down and munch while listening to people talking about their research. They are pretty much a captive audience and you can get to know them pretty well.

4. Different countries have totally different research dynamics and cultures. Each has its merits and its disadvantages.

5. Amblypygi (an order of arachnids that look terrifying at first but are harmless) are underappreciated powerhouses of sensory perception, neural processing, and cuteness.

6. Jumping spiders have eyes at the end of long tubes, and in translucent species you can see the tubes move around as the spider scans its surroundings. They only have clear colour vision on a small part of their retina, so they need the ability to move their eyes and focus on different parts of their environment.

7. Sometimes (often, maybe), technology progresses faster than our ability to use or understand it and its outputs.

8. Even analyses using rudimentary techniques, or those which produce inconclusive results, are informative and of some use, so long as their limitations are taken into consideration.

Anyway, I need to get all my thoughts together and get back into thesis mode. Conferences really take it out of a person!

Tuesday, 12 April 2016

Results!

I have been cruising around in my warm fuzzy cocoon of writing (which is so nice and entirely predictable, although pretty slow going) and making trees (which are improvements upon my already pretty finished phylogenies, so I can't go wrong). Until a few weeks ago, I was gently (but increasingly obviously) avoiding any other kind of analysis. There are big scary computer programs that I have to fight with, and I feel so tired after grappling with BEAST. But also I was afraid that anything I did would just retrieve the information that my data were inadequate and I'd need to start again. Before analysis, I don't know whether my results are useful or useless. They happily exist on my computer as big files filling up my hard drive and making me feel accomplished. But a PhD is not assessed by how much hard drive space your data takes up. A PhD is assessed by the defence or viva and a thesis, which has to be written, and cannot be written without results.


Analysing my results determines the usefulness of my data...opening the box determines the fate of the cat.
Luckily, I managed to get over this elephant in my own personal room. I started gently, downloading and exploring the programs that I had to use, and running through tutorials using the examples given. Then I tried to make my data look like the examples. Then I ran my data through the program and expected failure. The first few times, there were hiccups - usually a grammatical error, such as a space in the wrong place. After correcting them, I got some pretty cool and exciting results (well, I find them exciting, and perhaps one or two other people in the world would too, after I've explained to them at some length why they are exciting), and managed to make figures out of them. I was pleasantly surprised that my data meant something.

Now data analysis is addictive, and I keep trying to improve on previous runs of analyses and tests. I do not believe that data analysis can ever be completed; only abandoned.

If you're in the above situation, I would encourage you to face your fear and take the plunge so you can get on with your life, and also to save every version of your results and keep it somewhere safe on your computer, cos it's awkward when it gets deleted or overwritten by mistake.
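
For the "save every version" part, here is a minimal sketch of one way to do it in Python. The function name `archive_copy` and the file and folder names in the usage note are made up for illustration; the idea is simply to copy each result file into an archive folder under a timestamped name, so that a later run can never overwrite an earlier one.

```python
import shutil
import time
from pathlib import Path


def archive_copy(result_file, archive_dir):
    """Copy result_file into archive_dir under a timestamped name.

    An existing archived copy is never overwritten: if the timestamped
    name is already taken, a numeric suffix is added.
    """
    source = Path(result_file)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = archive / f"{source.stem}_{stamp}{source.suffix}"

    # Two copies made within the same second would otherwise collide.
    counter = 1
    while target.exists():
        target = archive / f"{source.stem}_{stamp}_{counter}{source.suffix}"
        counter += 1

    shutil.copy2(source, target)  # copy2 also keeps the modification time
    return target
```

Calling something like `archive_copy("co1_run.trees", "results_archive")` after each run would leave a growing pile of dated copies. It is not a substitute for a proper backup on another disk, but it does guard against the "overwritten by mistake" problem.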

Wednesday, 23 March 2016

Clean lines and organisation

I haven't posted here for a couple of months, mostly because the main job of this blog was to keep me writing consistently even when I had no thesis writing to do. Now, I am writing every day anyway, so I don't need to write for the sake of writing. However, this blog has been so useful to me in terms of tracking my progress and realising how insignificant most seemingly catastrophic events have turned out to be, that I cannot abandon it. I would definitely recommend that any new PhD student write a blog, even if nobody ever reads it.

Every PhD blog has at least one post about advice for new students. New students get a LOT of advice. Much of it is useful, but much of it is also useless. Each student and project is unique, and they need to figure most stuff out on their own. The generic advice often given by universities (write a little every day, find a good work-life balance, talk to your supervisors etc) usually holds true and is good to stick to.

However, there are two things that I would add to the university spiel. Firstly, get some bloody exercise. The bodies of PhD students become stagnant, decrepit, and downright unhealthy from three years of sitting at a desk indoors and prioritising study over everything else. Exercise helps you to avoid health issues that can stem from this lifestyle. It helps you to integrate yourself back into society if and when that time should come. But your body also houses your brain, which needs looking after if it is to remain sharp and useful. Yes, study is the most important thing in your life right now, but you can't study without taking care of the things that help you study, like your brain. Taking time out to exercise makes you feel good, helps you sleep well, and forces your brain to divert its attention away from how best to compoverise the distribution of the arcbenders in R. It forces you to focus on yourself.

The second important thing that I have learned is about organisation. I have a diary, a to-do list, and a three-month plan in addition to the mandatory PhD timetable that I constructed for my proposal (and the slightly less naive one that I made for my 18-month report). To start with, I figured each task would take a small amount of time and I could progress in a modular way. However, over the time I have been here, things have tended to not work or to blend into each other, and it all took a lot of time. To start with, this really panicked me and I thought I was not working hard enough. But that is a self-centred view and one that entirely disregards the nature of science and the universe. It is all new stuff that I am doing, and some of it will go wrong. I would encourage new students to be as organised as they can possibly be, and more organised than they have ever been in their lives - you need to know what you have done, what is left to do, and each little step leading up to completion. You need to know exactly what the next steps are. Every PhD is composed of data collection, data analysis, and writing; these steps tend to bleed into each other and that is ok. In fact, it is vital for the PhD to work - when you start analysis you will realise there is a bit more data to collect. You also need to be writing the entire time, not just after you've done the analysis. So although you need to be extremely organised, you also need to be aware that stuff goes wrong and just because you say you will have a completed phylogeny in two weeks does not mean that it will be so. Everything takes a lot longer than you expect.

Right, on to finishing the calibration of my phylogeny (something I meant to complete about a year ago). Wish me luck ;)

Friday, 18 December 2015

Lessons learned in BEAST

At the moment, I am correcting an introduction to one of my chapters, analysing some ecology data, and running BEAST analyses in the hope that I will get nice, convergent trees. If you do lots of things at once, you do them slowly, especially if you are staying with your parents for a bit. But I have finally managed to sort out my CO1 tree so that the analysis converges and I don't get negative branch lengths (which make it look as though trapdoor spiders are evolving backwards - try removing partitioning or using a random local clock to remedy it). Now onto CytB.

I have kept a diary of hints and tricks about BEAST that I have learned along the way. It is now illustrated with many and varied trace plots.


Running BEAST takes a long time, and you don't want a power cut during the process (though sometimes putting your computer to sleep pauses the run). To start with, I run only 10,000,000 generations and try to make that work. You can tell that an analysis has worked if three things hold. The trace plot for the posterior looks like a fuzzy caterpillar that has been straightened out (i.e. it isn't moving in any general direction other than left to right); the tighter the fuzziness, the better. The ESS values (effective sample size) are >200 for every parameter. And the tree looks like a phylogenetic tree, rather than a straight line, an invisible tree, or a weeping willow.
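
To give a feel for what an ESS value measures, here is a rough sketch in Python. It is not the calculation that the trace viewer performs (I don't know its exact algorithm); it is a simple textbook-style estimate that divides the number of samples by a factor that grows with how strongly each sample resembles the ones just before it. The function name `effective_sample_size` is my own.

```python
def effective_sample_size(samples):
    """Rough ESS estimate: n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum stops at the first lag where the autocorrelation is no longer
    positive. This is a simple illustrative estimator, not the one used
    by any particular program.
    """
    n = len(samples)
    if n < 3:
        return float(n)

    mean = sum(samples) / n
    centred = [x - mean for x in samples]
    variance = sum(c * c for c in centred) / n
    if variance == 0:
        return float(n)

    rho_sum = 0.0
    for lag in range(1, n):
        # Autocovariance at this lag, normalised by n.
        cov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0:
            break
        rho_sum += rho

    return n / (1 + 2 * rho_sum)
```

A trace that wanders slowly in one direction (a caterpillar that is crawling uphill) gives a small ESS even if it contains thousands of samples, because neighbouring samples are nearly copies of each other. The loop is quadratic in the number of samples, so this sketch is only suitable for short traces.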

The first thing to do to a set of sequences that have been aligned (and put in the correct reading frame, if needed) is to test for appropriate substitution models using jModelTest. There are many different model testers, but jModelTest is very good, versatile and thorough. I have investigated BEAST 2 (which can model test using Bayesian algorithms rather than maximum likelihood), but ultimately came back to jModelTest because of its simplicity and suitability for answering my particular questions. Then, one takes one's sequence alignment and tests for partitioning (if the alignment has no gaps) using PartitionFinder (which is dead easy to use as long as you can work Java, which in my experience likes to play up a lot). If an alignment requires partitioning, it means that one or more of the three codon positions evolves under a different model or at a different rate from the others. You can edit your alignment in a NEXUS file using the charset command to specify partitioning for BEAUti.
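
As an illustration of the charset step, here is a small Python sketch that writes out a NEXUS sets block with one charset per codon position. It assumes the alignment starts at the first codon position (i.e. it is already in the correct reading frame); the function name and the charset names such as `CO1_pos1` are invented for the example, and you should check the output against your own file before pasting it in.

```python
def codon_charsets(alignment_length, gene="CO1"):
    """Return a NEXUS sets block with one charset per codon position.

    Assumes site 1 of the alignment is the first position of a codon.
    The "\\3" suffix in NEXUS means "every third site in this range".
    """
    if alignment_length < 3:
        raise ValueError("alignment must contain at least one full codon")

    lines = ["begin sets;"]
    for position in (1, 2, 3):
        lines.append(
            f"    charset {gene}_pos{position} = {position}-{alignment_length}\\3;"
        )
    lines.append("end;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: a 658-site alignment (the length here is just an illustration).
    print(codon_charsets(658))
```

The printed block can be pasted at the end of the NEXUS alignment file, after the data block.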

The program BEAST is only for analysis. You set up the XML file in BEAUti to analyse in BEAST. Of course, PartitionFinder will tell you that your dataset evolves under different models from those that jModelTest suggests, because it is testing each of the three codon positions separately. I have racked my brains over how to deal with this, because jModelTest acts as though all the sites evolve under the same model, so it averages the models out. There are lots of solutions to try: select the model output by jModelTest for all partitions; select the models output by PartitionFinder for their respective partitions; or select one of each. You can always try all of these options, but the difference is usually slight because the models are usually very similar and BEAST is very clever. I find that selecting the models output by PartitionFinder works best first, but if there are important parameters estimated by jModelTest, you can put those into BEAUti under the Priors tab (but don't at first; you may not need to).

When first running the analysis, set only the model that you are using (under "sites"). Do not input any of the values for parameters estimated by jModelTest. You can limit the obscurity of the model by restricting the number of models that jModelTest tests for to 40 (select 5 under "number of substitution schemes"). Then you will only get models that can be implemented easily in BEAST. This is lazy though - you should really test more thoroughly and edit the BEAST XML file. It doesn't make a huge difference, however, and it takes people like me a very very long time to work out what needs changing. It's easy enough when you work out how, but then BEAST runs for a few generations and crashes and you can't work out where you made a mistake.
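
In case it is not obvious where the 40 comes from, here is the arithmetic as a tiny Python sketch. It assumes (my understanding, so treat it as an assumption) that each substitution scheme is tested with and without unequal base frequencies, with and without invariant sites, and with and without gamma rate variation, so each of those three options doubles the number of candidate models. The function name is made up.

```python
def candidate_model_count(substitution_schemes,
                          base_frequencies=True,
                          invariant_sites=True,
                          gamma_rates=True):
    """Number of candidate models, assuming each enabled option doubles the set."""
    count = substitution_schemes
    for option_enabled in (base_frequencies, invariant_sites, gamma_rates):
        if option_enabled:
            count *= 2
    return count


if __name__ == "__main__":
    print(candidate_model_count(5))  # 5 schemes x 2 x 2 x 2
```

Under that assumption, 5 schemes with all three options switched on gives 5 x 2 x 2 x 2 = 40 models.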

When first running the analysis, set the clock to lognormal relaxed.

Set the MCMC chain to run for 10,000,000 generations.

If you have an outgroup, enforce it under the Taxa tab.

Then run it in BEAST!
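
If you find yourself re-running the same setup with a longer chain, you do not have to go back through every tab. Here is a hedged Python sketch that changes the chain length directly in the XML file. It assumes that your XML stores the number of generations in an attribute called `chainLength` (that is what I see in my own files; open yours in a text editor and check first). The function name and file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_chain_length(xml_in, xml_out, generations):
    """Write a copy of xml_in with every chainLength attribute set to generations.

    Assumes the chain length is stored in an attribute named "chainLength".
    Note that ElementTree discards XML comments when it rewrites the file.
    Returns the number of attributes that were changed.
    """
    tree = ET.parse(xml_in)
    changed = 0
    for element in tree.iter():
        if "chainLength" in element.attrib:
            element.set("chainLength", str(generations))
            changed += 1

    if changed == 0:
        raise ValueError("no element with a chainLength attribute was found")

    tree.write(xml_out, encoding="utf-8", xml_declaration=True)
    return changed
```

For example, `set_chain_length("co1.xml", "co1_long.xml", 50000000)` would leave the original file untouched and write a second file with the longer chain. Writing to a new file rather than editing in place fits the rule of changing one thing at a time and keeping every version.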

When the run is complete, check the trace file. First check the ucld.stdev parameter - if the mean of the estimates is more than one, then a strict clock is not appropriate (in other words, continue with the relaxed clock for now, or try a random local clock). If the mean is <1, try a strict clock. If you need to try another clock, do it now and don't change anything else. See what difference (if any) it makes to your ESS.

The ESSs are good and the posterior caterpillar is heading in the right direction, but it appears to be melting in parts which isn't ideal.
This ucld.stdev mean is <1, so I tried random local and strict clocks (neither of which improved the ESS, so I stuck with a lognormal relaxed clock - for now).
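
The ucld.stdev check can also be done without opening the trace viewer. Below is a minimal Python sketch that reads a log file and averages one column after throwing away the start of the run as burn-in. It assumes the log is tab-delimited, that comment lines start with `#`, and that the column is literally called `ucld.stdev`; the 10% burn-in is also my assumption, not a rule. The function name `column_mean` is made up.

```python
import csv


def column_mean(log_path, column="ucld.stdev", burnin_fraction=0.1):
    """Mean of one column of a tab-delimited log file, after discarding burn-in.

    Lines starting with "#" and blank lines are skipped. The first
    burnin_fraction of the samples is dropped before averaging.
    """
    with open(log_path, newline="") as handle:
        rows = (line for line in handle
                if line.strip() and not line.startswith("#"))
        reader = csv.DictReader(rows, delimiter="\t")
        values = [float(row[column]) for row in reader]

    if not values:
        raise ValueError("no samples found in the log file")

    kept = values[int(len(values) * burnin_fraction):]
    return sum(kept) / len(kept)
```

With the rule of thumb described above, a result above 1 would point towards keeping the relaxed clock, and a result below 1 would suggest that a strict clock is worth a try. A mean on its own hides a lot, so it is still worth looking at the trace plot itself.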


If you still have low ESS, change the offending priors to normal distributions centred on the value estimated by jModelTest. Run it again. Change only one thing each time you run the analysis. Checking or unchecking the "Estimate" box next to where you set the clock also makes a huge difference to ESS.

This is a summary of what I have learned so far, in the last few weeks, about BEAST. My learning is a work in progress, so the above is hardly a substitute for advice from people who know what they are talking about. Try it at your own risk! It is a tricky program and help is not easy to find unless you can interpret computer speak (a useful skill in evolutionary biology). But it is very clever and very useful. Also, while an analysis is running, it's like waiting for Father Christmas to come.

If you're reading this because you need some tips with BEAST, message me if you want me to clarify anything (I tried to keep the post short!). If you're reading this for fun...well, surely not.

Good luck, merry Christmas, and I will see you in the new year!

Wednesday, 18 November 2015

The Watch

I thought it said in every tick:
I am so sick, so sick, so sick;
O Death, come quick, come quick, come quick.
-The Watch, by Frances Cornford.
Time is ticking. By October 2016, my three years will be up. By February 2017, my visa and stipend will run out. I can sustain myself with part-time work if my stipend runs out, but to renew my visa I need to prove that I have money in the bank. Which I will not.

The last months of a PhD student's study can be hell, and it's usually harder for international students than it is for locals. Firstly, you have to take what data and results you have (usually a lot less than you were expecting), analyse it and write it up before going through cycle after cycle of edits and criticism. Secondly, your stipend runs out and you are expected to live off feral pigeons and non-toxic packaging material. You also have to pay university fees. Thirdly, you have to contemplate your extremely uncertain future (being qualified no longer means you will get a good job). Fourthly, you will be leaving your life and friends behind soon. Fifthly, you face deportation if your visa runs out before you're finished.

Nobody really tells you this when you're starting your PhD, but it doesn't take long to work it out.

I have every intention of finishing my thesis by October 2016. I'm also applying to become a resident, because I like New Zealand a lot and I'd love to secure a postdoc here (an unrealistic goal, but one worth shooting at). I love doing my PhD and haven't yet run into the exhausting stress that I am told will come. If this project was set to run for twenty years, I'd be so happy, and I'd probably want it to run for another twenty years after that because there is so much left to find out. But I only have till next year. I'm going to re-evaluate my timetable for the next 11 months so that I can fit everything in. This is what I have left to do:

  • Data collection
    • I recently got more funding (thanks to the Brian Mason Trust), so can do one more massive pitfall trapping effort in the winter to collect more males. They will be useful for my DNA versus morphology in species delimitation research. I also have four males from Quail Island to sequence.
    • I must visit Otago Museum to look at their extensive Cantuaria collection, and compare morphology with the specimens that I have collected.
    • Investigating ArcGIS layers to add more data about where Cantuaria populations are found.
  • Data analysis
    • I'm still fiddling with BEAST to get decent trees, but it's getting much easier.
    • I've started using SplitsTree to make phylogenetic networks and see whether there's any hybridisation going on.
    • Applying a molecular clock.
    • I haven't yet looked at GenGIS to start phylogeography research.
    • Generalised linear mixed model in R to analyse habitat selection data.
    • Delimit and describe species.
  • Writing
    • I need to get up to date with all my chapters!
    • When I have done all of the above, I can finish writing it up.
    • Then I need to send it to my friends to edit out stupid mistakes.
    • Then I need to send it to my supervisors to edit out mistakes.
    • Then it gets read by my assessors.
There are other important things that I need to be getting on with, too; I am also looking for (and nurturing) postdoc opportunities, and training falconers and falcons so that I can make some money and New Zealanders can learn the difference between a falcon (endangered) and a harrier hawk (not endangered in the slightest). I want people to care a bit more about the wildlife in this country, rather than just appear to care about it.

At some point, I also want to write a book on BEAST for beginners, written in English rather than computer language, but if you're reading this and want to steal my idea, please do - just make sure it's written for beginners and not people with a degree in advanced computer physics or whatever. I believe in a future world where first-year PhD students can understand what they are doing when they plug numbers into BEAST, and not be told "Of course you are getting a misconbobulated flatuole constant. You forgot to destatify the gayn trigger" or other such stuff which makes no sense to those of us with a biology background.

So I have a fair bit to do, which won't realistically get started in earnest before January. But I'll do what I can until then!

Tuesday, 6 October 2015

Two years in: funding and data analysis

I'm two years into my PhD, which started officially on October the 1st 2013. This time last year, I had completed the following:

  • Completed my proposal and seminar
  • Collected female specimens from throughout NZ
  • Been handed some males from the public, and pinpointed good places to set pitfall traps
  • 2 conferences with presentations
  • Begun sequencing for phylogeny
  • Found someone to help me with genetic and ecology fieldwork early next year.

I was a bit sad that I hadn't completed my objectives, and that real life got in the way of me devising my perfect routine. Over the past year, I got used to this. Everything takes longer than I think, and my initial objectives were rather unrealistic. Money, friends, and hobbies are all things that get in the way of doing my PhD, but they are also things that keep me sane, which is quite important. So instead of trying to cut them out of my life, which I had been trying to do and failing miserably, I have decided to live with them and take them into consideration when planning things. That has been much kinder to my blood pressure. Every day, I prioritise my PhD above everything else, but after I have done a bit of work I can do the other things that are screaming for my attention. I'm on track, I think, so it seems to be working. I have, however, become really bad at time management and answering my emails, because I have periods during the day when I want to work on my PhD and the rest of the world be damned. Plus I have discovered that one can "flag" one's emails to prioritise dealing with them. I can do the flagging part fine; it's the dealing with them that I usually forget.

My project went over its funding allocation for this year, which has meant that I had to stop lab work. It was good in a way, because I needed to stop anyway. It was getting to the point where I was trying crazy and superstitious ways to squeeze sequences out of extracts that can probably be sequenced using some method somewhere, but it wasn't worth trying every possible combination of every parameter and ingredient to find it. After completing my last 96 sequencing attempts (as usual, most didn't work), I helped out on a field trip with some undergrads. While out there, we caught some male trapdoor spiders which would be really good to sequence. I still have them. I really hope I get enough money to sequence them before October next year. They are from an island and would be a really interesting piece to add to the puzzle. I completed my environmental data collection too (I think). So the only data I have left to collect is morphological.

Now I am doing DNA data analysis, which involves downloading programs which don't work, and trying to get them to work. I just cracked one yesterday, and went to use one that I have used since my honours project and know really well, but it needs downloading again, and it won't install, and it requires Java, and Java won't install. This used to really stress me out but now I feel weirdly zen about it all because it's familiar. The feeling when something finally works is incredible. I think when this PhD is over I'd like to write a book on basic molecular techniques and analysis for people like me. I get this feeling that everyone instinctively knows how to work these programs except for me, and then someone comes up to me and asks how to do something basic and I realise it isn't just me.

All I have to do is phylogenetic and niche modelling data analysis, then finish my thesis. A year's work, easily, hopefully?

Anyway, here's to the next and final year. Cheers!