What is the appropriate informed consent for Reference Materials from the Genome in a Bottle consortium?

In the Reference Material (RM) Selection and Design working group, there was considerable discussion about selecting genomes with consents that are appropriate for use as RMs from the consortium.

Two primary issues were raised:
(1) The risk of re-identification of the individual through use of genomic and other information will be higher for a genome chosen as a national RM than for genomes used in population genetics or other research.
(2) How extensive should the consent for commercialization be (e.g., for commercial use, commercial redistribution of derived products, etc.)?

Because of the extensive data available for the HapMap/CEPH/1000 Genomes sample NA12878 and her pedigree, much discussion has revolved around whether her consent (most recently for the HapMap Project) is appropriate for a NIST RM. Personal Genome Project (PGP) samples have also been proposed as attractive genomes because of their broad open consent for re-identification and commercialization, and because other materials such as iPSCs and tissues are available for some of them.

Since the meeting, we've had a number of discussions by email and in person, which we've included below. In this forum, we hope to make this a transparent, public discussion to get input from all interested parties so that we can make the best decision in consultation with our NIST IRB, legal staff, and others.

Email discussion so far, with most recent emails at the top:


Dear Colleagues --
Thank you all for your input and discussion regarding the propriety of the consent for NA12878 to be used as the first NIST Whole Human Genome Reference Material. After discussing the HapMap consent with Jean McEwen and Lisa Brooks at NHGRI, we think that it is probably best to use PGP samples for NIST Reference Materials. The primary concerns with NA12878 involve (1) the high profile of this as the first NIST genomic Reference Material, leading to a greater risk of re-identification than originally anticipated by the consent and (2) the lack of consent for commercial redistribution and other possible uses in the future, such as creation of induced pluripotent stem cells.
Fortunately, we can still learn much from analyses of the existing data for NA12878, and will certainly apply these lessons to the NIST Reference Materials. We are still really interested in this discussion and in your input. If there is consensus that we should move forward with PGP samples, we hope to select at least the first one or two trios from the PGP project, and will start the process to gain IRB approval here at NIST.
Please feel free to respond directly to this mail, to cc others as appropriate, or contact us directly if you have concerns or other opinions.
Best regards,
Marc Salit


On Aug 25, 2012, at 8:32 AM, george church wrote:
I agree. A transparent public record (as you mentioned at the Aug 16-17 meeting) sounds like a good idea.


On Aug 24, 2012, at 5:00 PM, Salit, Marc L. wrote:
Ditto Linda Beth's thanks -- this is a good analysis that will help us ask the right questions and make the right decisions.

Any further thoughts from anyone?

We'll keep this email list apprised of anything we learn from further discussion at NIST -- and unless I hear objection to it, we'll plan to make this email chain public on the very-soon to be spun-up genomeinabottle.org consortium site, so everyone knows what's going on. I think this conversation should be transparent and open for discussion on that site.

Best regards -
Marc Salit


On Aug 24, 2012, at 9:15 AM, Schilling, Linda Beth wrote:
Thanks so much for all the additional insights. This will be very helpful.

Linda Beth
Linda Beth Schilling
Senior Coordinator and Policy Advisor for Human & Animal Subjects Research at NIST
Office of Special Programs
Laboratory Programs Office
National Institute of Standards and Technology


From: george church
Sent: Friday, August 24, 2012 7:12 AM
Subject: Re: NA12878 consent?

Below are some comments for NIST legal folks to consider, including some from Dan Vorhaus http://www.genomicslawreport.com/index.php/author/dvorhaus/

The three issues for 12878 are: 1) consent for non-research commercial use, 2) explicit consent for re-identification, and 3) removing samples (not just data) after withdrawal of the HapMap participant (or child, in case of death of participant, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2106154). As was said by several people at the Aug 16-17 meeting, this is a higher-profile project and less controllable than previous research use, and the general sentiment was to see this as a new project and keep it from getting off on the wrong foot for an expedient that will probably seem tiny soon, since the technology is changing so swiftly. So why not reconsent 12878 (or children, if deceased) to specifically address these three issues? If it is hard to get reconsent, then that is another red flag.

LB: "The possibility of accumulated genetic information being used eventually to identify people is very general for genomic research, but does not prevent use of samples, by NIH policy."

GC: Yes, somewhat general, but not totally, since some protocols do squarely address re-identification (and enable frequent recontact). Also, recent NIH policy (e.g. for dbGaP samples) aims at requiring researchers to promise to keep samples away from people who might re-identify and/or identify high penetrance traits. This is hard enough with research use, but will be even harder for the proposed much more widespread use.

LB: The one possible sticky point is “The Repository does not let anyone sell material from samples or cell lines.” This was meant to prevent secondary distribution by companies, not necessarily to prevent a government agency from distributing a standard.

GC: I agree that this is sticky. What "was meant" is not aligned with what is said. "Anyone" includes governments and their employees. Even if we stick to spirit rather than letter of the law, and even if we can guarantee that all DNA recipients refrain from redistribution, nevertheless the non-research uses (for example as part of diagnostic clinics leading to abortions) may not meet expectations of the HapMap participant.

HapMap Consent Provisions & DBV Commentary

"It also will not include any information that could identify who the individual people or families are." (pg 1)

[Note the explicit and absolute promise of de-identification, although this is qualified by the disclaimer on pg 2.]

"Because the database will be public, people who do identity testing, such as for paternity testing or law enforcement, may also use the samples, the database, and the HapMap, to do general research. However, it will be very hard for anyone to learn anything about you personally from any of this research because none of the samples, the database, or the HapMap will include your name or any other information that could identify you or your family." (pg 2)

[The de-identification language is, as we have discussed, problematic. It suggests that any re-identification is very unlikely, and does not discuss any of the potential consequences should such identification occur. I am not entirely familiar with how these samples are being proposed to be used but, presumably, if one or more is being used as a "national reference standard genome" then the risk of re-identification increases simply due to frequency of use and the potential for greater interest in breaking anonymity. Note also that the language here is all framed around "research" uses.]

"The Repository will send the cell lines to researchers around the world to create the HapMap and to use in many future genetic studies as described in this form. The researchers will have to follow all U.S. and international laws and guidelines that apply to research. All studies using the cell lines from the Repository will have to be approved by the Institutional Review Board (IRB) of the Repository." (pg 2)

[As above, the discussed uses are all research in nature, with no mention of anything other than an IRB-approved research study.]

"The Repository does not let anyone sell material from samples or cell lines. However, information from genetics research sometimes helps companies make products to diagnose or treat diseases. If information from your family’s cell lines leads to making a product, it would probably contribute only in a very small way. Also, because the cell lines will not have names on them, neither the researchers nor anyone at the Repository would know if your samples were even used. So you will not get any additional payment for having your sample used in this project." (pg 2-3)

[This appears to me to be a fairly explicit ban on the direct sale of materials, including cell lines. What it does not prohibit is secondary commercial uses, for instance the commercialization of a diagnostic or therapeutic emerging out of a research laboratory where the underlying research was performed using the HapMap consented sample. I think that, per Lisa's email, "commercial research" is really only allowed in the sense that it is disclosed as an unavoidable byproduct of primary scientific research performed in non-commercial settings. I do think that is a distinction.

To help see the distinction more clearly, it might be useful to compare the HapMap's language to the explicit authorization for third party commercial use provided by the PGP's consent, Section 8.4, which reads in relevant part: "However, information and materials that you provide, including DNA sequence data and cell lines derived from your tissue samples or specimens, may be made available to third parties for research, patient care, commercial or other purposes, and these third parties may commercially profit from the data or other information that you contribute to the PGP."

On balance, I think the language in this section, as well as the focus elsewhere in the document on exclusively research uses, suggests that a participant's expectation would almost certainly be that his or her samples and resultant cell lines would not be made directly available for commercial research, even if they might incidentally or indirectly further commercial objectives.]

"How will you protect my privacy? We will protect your privacy carefully, just as we have always done in the past. The only people who will know your name or any other personal identifying information will be the clinic coordinator, the physician, and the principal investigator for the Utah Genetic Reference Project at the University of Utah. We will not give this information to anybody else. While the University of Utah will keep your new, signed consent form, nobody else will see it. The sample stored at the Repository and used for the HapMap will not have your name on it. Although it will have a code number, nobody except us will know the name of the person the code number is linked to. So nobody at the Repository or who studies your sample will know that it came from you." (3)

[In the section devoted specifically to privacy, no mention is made of the possibility that privacy might be compromised, either intentionally or accidentally, furthering the assessment that, despite the earlier and arguably insufficient disclaimer, the participant would be led to believe that there are no meaningful privacy or related concerns associated with participation.]

"What are the risks of having my sample used for this project? If your family’s samples are used, lots of genetic information from your samples will be put in the database, and lots of people will be able to look at it for any purpose. However, there are only a couple of ways anybody could trace the information back to you. One is if they thought your information might be in the database, got another sample from you, did many tests on that sample, and then compared the genetic information from those tests with the information in the database. The other is if somebody compared the information in the database with genetic information known to be from you that was in another database and figured out who you were. The risk of either of these things happening is very small, but it may grow in the future."

[This section at least acknowledges that re-identification is possible, but asserts that its risk is exceedingly small. It also acknowledges that this risk profile may change. I understand that NIH's current policy is to permit research premised upon de-identification, even when there is a risk of re-identification. However, as mentioned above, additional consideration may need to be given to this issue if this or similar samples might be used in a manner that creates greater visibility and, potentially, risk of re-identification than was perhaps initially contemplated.]

/HapMap Consent Provisions & DBV Commentary


On Aug 23, 2012, at 4:38 PM, Salit, Marc L. wrote:

Thanks Mark -

We'll hold off on celebrating for now -- we haven't put anything through our legal folks for approval yet. The HapMap reconsent looks reasonably appropriate; there is at least one concern I've heard upon a more careful read. We're not certain that we understand all the implications of the terms of withdrawal in that document (see attached page 4 "Can I change my mind…" and checkbox 2 on the consent signature page). We'll be looking at that, and the history of the original and re-consents for this genome, when we meet with our colleagues at NHGRI.

We hope to know more in the next week or so.

Best regards,


On Aug 23, 2012, at 2:28 PM, Mark Depristo wrote:

Hi Marc and Justin,

Thank you for the update. Your conclusion is very reassuring.




On Wed, Aug 22, 2012 at 12:05 PM, Salit, Marc L. wrote:

Hi Lisa --

Thanks for the information and the invite to chat with you and Jean -- we'll take you up on that. We've spoken with Linda Beth Schilling, who's championing the IRB process for biologicals here at NIST, to give her an update on the questions that arose at last week's meeting, and we're now comfortable with the HapMap consent for NA12878, which we plan to use as our pilot reference material.

At this point, the PGP samples are very attractive for the balance of the reference material portfolio, as there is a broad consent for commercial use, and primary tissues and iPSCs are available for some of them.

We'd like to follow up with Jean and you to learn more of the history of the re-consenting of the HapMap samples from the CEPH/NIGMS collection (this mostly so we can answer questions at NIST), and to review any ELSI considerations you might see for our reference material project.

Best regards,
Marc and Justin

Marc Salit, Ph.D.
Leader, Multiplexed Biomolecular Science Group
Materials Measurement Laboratory
100 Bureau Drive, Stop 8313
Gaithersburg, MD 20899-8313


On Aug 21, 2012, at 2:15 PM, Brooks, Lisa (NIH/NHGRI) [E] wrote:

The CEU consent form is at

All the CEU samples were reconsented specifically to be in the HapMap project and future projects, and to have their data released publicly on the web. The cell lines may be used for RNA and protein studies. Commercial research and forensic research are allowed.

The form says that “However, it will be very hard for anyone to learn anything about you personally from any of this research because none of the samples, the database, or the HapMap will include your name or any other information that could identify you or your family.”
The possibility of accumulated genetic information being used eventually to identify people is very general for genomic research, but does not prevent use of samples, by NIH policy.

The one possible sticky point is “The Repository does not let anyone sell material from samples or cell lines.” This was meant to prevent secondary distribution by companies, not necessarily to prevent a government agency from distributing a standard.

Jean McEwen, the NHGRI ELSI expert who oversaw the HapMap consent process, is out this week but will return on July 27. We should talk with her when she returns.

Best regards, Lisa.


From: Mark Depristo
Sent: Tuesday, August 21, 2012 12:45 PM
Subject: NA12878 consent?

Hi all,

At the Genomes in a Bottle meeting last week, George Church (CC'd) suggested that NA12878 might not be consented for commercial activities. This seems in direct opposition to her being included in HapMap and 1000 Genomes, as well as her cell lines being sold by Coriell. Is her cell line restricted to research purposes only? If so, this is a great surprise to me. It's critical to resolve this issue, as there's some discussion that NA12878 could not be used as a national reference standard genome because of it.


Mark A. DePristo, Ph.D.
Associate Director, Medical and Population Genetics Analysis
Broad Institute of MIT and Harvard