Does coronavirus 2019-nCoV contain genetic "inserts" from HIV?

Copyright © 2020 — Creative Commons Attribution-NonCommercial 4.0 International License

There's a story going around that the new coronavirus must be genetically engineered because it contains stretches of genetic material identical to those found in HIV.

This is a complicated topic, and I'm going to try for a simple explanation (with the caveat that I have no expertise in this field).

A preliminary draft of a paper (a preprint) was made available online on Jan 31, 2020:

Uncanny similarity of unique inserts in the 2019-nCoV spike protein to HIV-1 gp120 and Gag

The gist was that the genetic sequence of the new coronavirus contains fairly short "inserts" that resemble short sections of some unrelated viruses, including HIV.

There are several very important things to note:

  1. The paper has not been peer-reviewed, and it has not been accepted for publication anywhere.
  2. Once it became available online, it immediately received very harsh informal peer review in the comments.
  3. By the next day, the authors withdrew the article, pending a complete re-write (which is not yet available).
  4. The authors never claimed that the similarity to HIV was evidence of genetic engineering.

Here is the statement posted by the authors when they withdrew the paper:

This is a preliminary study. Considering the grave situation, it was shared in BioRxiv as soon as possible to have creative discussion on the fast evolution of SARS-like corona viruses. It was not our intention to feed into the conspiracy theories and no such claims are made here. While we appreciate the criticisms and comments provided by scientific colleagues at BioRxiv forum and elsewhere, the story has been differently interpreted and shared by social media and news platforms. We have positively received all criticisms and comments. To avoid further misinterpretation and confusions world-over, we have decided to withdraw the current version of the preprint and will get back with a revised version after reanalysis, addressing the comments and concerns. Thank you to all who contributed in this open-review process.

Here is a good example of the technical criticism of the paper, from one of the comments:

...the so called unique insertions in the Wuhan isolates compared to the SARS isolate are cherry picked. ... All of the insertions sites coincides with positions variable across homologs, which make sense in that these positions are important for host interactions. This is not "uncanny", it's simply how selection works. As for the so called "identity" with HIV gag proteins, again, as pointed out by others, is spurious. Both HIV and coronaviruses are RNA virus and are hypermutable. The fact that positions important for host-virus interactions, i.e. where the new insertions were found, can be variable in the new infectious Wuhan isolate is expected and there is no evidence suggests that this is a result of human manipulation. ...

Okay, that's a bit technical. So here's my layman's understanding of the controversy:

It's like checking somebody's words to see if they are plagiarized from another author.

Imagine you're writing a plagiarism-detection program. You give it lots and lots of existing texts for comparison. Then you take the text you're testing, and let your program search for phrases in common with the database of existing texts.

The problem comes in interpreting the results. If you find the phrase "Wherefore art thou Romeo?", you know for certain that it was taken from Shakespeare. But suppose you find "Murder cannot be hid long" or "Go with me to the king" or "I will fetch my gold". Your simple program will say there's an uncanny similarity with Shakespeare, but anybody could have independently come up with those same phrases. There are only so many ways in English to say "I will fetch my gold", so the coincidence is not suspicious at all.

But to see that, you have to go beyond blind statistical analysis and ask how likely each match is to turn up by chance. This, according to the critics, is what the authors of the paper did not do.
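For anyone who wants to see the plagiarism analogy in action, here is a minimal sketch in Python. It is only an illustration of the analogy, not of the method used in the paper or by its critics. The function names, the tiny "database" of Shakespeare lines, and the test sentence are all made up for this example.

```python
def ngrams(text, n):
    """Return the set of all n-word phrases in text (lowercased, punctuation stripped)."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(candidate, corpus, n):
    """Naive detector: every n-word phrase that appears in both texts."""
    return ngrams(candidate, n) & ngrams(corpus, n)


# Tiny stand-in for the database of existing texts (hypothetical sample).
corpus = (
    "O Romeo, Romeo! wherefore art thou Romeo? "
    "Murder cannot be hid long. "
    "Go with me to the king. "
    "I will fetch my gold."
)

# An ordinary modern sentence, written without copying anything.
candidate = "Wait here and I will fetch my coat, then go with me to the station."

short_hits = shared_phrases(candidate, corpus, 3)  # three-word phrases
long_hits = shared_phrases(candidate, corpus, 6)   # six-word phrases

print(sorted(short_hits))
print(sorted(long_hits))
```

With three-word phrases the naive detector reports several "matches" with Shakespeare, even though the sentence is just everyday English. With six-word phrases it reports none. The program itself cannot tell which matches are meaningful; that judgment depends on how long the match is and how common the phrase is to begin with.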

Even if the authors are correct, and the "uncanny" genetic bits did come from HIV, that would not prove genetic engineering. Viruses make a habit of exchanging genetic material with whatever is available, especially other similar viruses. If a person were infected with both a coronavirus and HIV, such a transfer could conceivably occur naturally.

Anyway, as I said, this is complicated stuff. Maybe the authors uncovered something interesting, maybe they tricked themselves into seeing a pattern that isn't there. We'll have to wait for some educated minds to figure it out.

One thing is certain: at the moment, this paper does NOT provide evidence that the new coronavirus was man-made.